BLOG

Welcome! The purpose of this website is to give me, Andrew Powell, a place to share the electronic and software-based projects I work on! I should point out, though, that I'm slowly starting to rely more and more on my recently created HACKADAY.IO page, essentially a social media site for hobbyists / makers! I also have a YouTube channel; I probably post there more than anything.

35. Temple BCI Hackathon and SREC / lwIP Nexys 4 Project

posted Oct 3, 2016, 3:32 AM by Andrew Powell   [ updated Oct 3, 2016, 3:33 AM ]

Two new things I need to briefly mention in this blog post. First of all, three other fellow graduate students and I participated in the Temple BCI Hackathon last week. We couldn't be happier with how well we performed, managing to place first out of seven teams! The link to the HACKADAY.IO project page can be found here.


Secondly, I finished another project that will later play a larger role in a bigger, collaborative project. I will start discussing the details of the collaborative effort when more of it is finished, so I won't go into much detail here other than showing the demonstration video. This project's repository can be found at this link.




34. Scatter Gather DMA and Box Muller Transformation

posted Sep 20, 2016, 8:40 AM by Andrew Powell

Just completed another small project. As I've mentioned in an earlier post, I'm starting to post most of my projects on HACKADAY.IO for its community, so this project is no exception. I'll embed the video here, but please check out the project log on HACKADAY.IO for more information on the project itself.


33. Real-Time Visual Equalizer ( V2 )

posted Sep 14, 2016, 8:25 AM by Andrew Powell

This is a redo of an older project I was terribly unsatisfied with! Since I already explained the project's theory and implementation at length on its HACKADAY.IO page, I won't say much here! Enjoy the demonstration! :D


32. Linux on Zynq / Real-Time Video Processing

posted Sep 1, 2016, 7:41 AM by Andrew Powell

Recently completed another project with the Zybo! The video can be seen below, and the GitHub repository can be found here! Since I really enjoyed working on this project, I will describe the RTL design and some of the software here! By the way, after this post, I will start to post these small blogs on my HACKADAY.IO since a lot of other hobbyists / makers are building a community there.


RTL Design. The design from the perspective of the Vivado IP Integrator can be viewed below! I really enjoy the graphical approach to creating a top module; I find it all too easy to make a mistake when creating a regular top module in either Verilog or VHDL.  

RTL design

So, in every project I do, I try to learn something new. In the case of this project, one of the new things I needed to learn was the AXI VDMA. In general, a DMA is a component within a computer system that allows a peripheral device to access memory without having to go through a host device. The host device is typically a more general-purpose processor that runs the system's higher-level, application software. DMAs are particularly useful when the host device needs to spend its precious clock cycles on time-critical ( i.e. real-time ) tasks and can't afford to lose time on costly context switching. Prior to this project, most of my experience with DMAs was with the AXI DMA ( which performs transfers between memory and a device ) and the AXI Central DMA ( CDMA ) ( which performs memory-to-memory transfers ).
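
For anyone unfamiliar with the AXI DMA, here's a minimal bare-metal sketch of a memory-to-device transfer, assuming the Xilinx standalone XAxiDma driver in simple / direct register mode; the device ID macro, buffer, and length are placeholders for this example rather than anything taken from the project.

    // Minimal sketch: bare-metal memory-to-device transfer with the AXI DMA
    // ( Xilinx standalone XAxiDma driver, simple / direct register mode ).
    // The device ID macro and buffer below are placeholders for this example.
    #include "xaxidma.h"
    #include "xil_cache.h"
    #include "xparameters.h"
    #include "xstatus.h"

    #define TX_LEN 1024

    static XAxiDma axi_dma;
    static u8 tx_buffer[TX_LEN] __attribute__((aligned(64)));

    int dma_send_buffer()
    {
        // Look up the configuration generated for the design and initialize the driver.
        XAxiDma_Config *cfg = XAxiDma_LookupConfig(XPAR_AXIDMA_0_DEVICE_ID);
        if (cfg == NULL || XAxiDma_CfgInitialize(&axi_dma, cfg) != XST_SUCCESS)
            return XST_FAILURE;

        // Make sure the data is actually in DRAM before the DMA reads it.
        Xil_DCacheFlushRange((UINTPTR)tx_buffer, TX_LEN);

        // Kick off the transfer from memory to the AXI4-Stream device...
        if (XAxiDma_SimpleTransfer(&axi_dma, (UINTPTR)tx_buffer, TX_LEN,
                                   XAXIDMA_DMA_TO_DEVICE) != XST_SUCCESS)
            return XST_FAILURE;

        // ...and poll until the DMA is done ( an interrupt could be used instead ).
        while (XAxiDma_Busy(&axi_dma, XAXIDMA_DMA_TO_DEVICE))
            ;

        return XST_SUCCESS;
    }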

Of course, the VDMA is nothing but a specialized DMA for transferring video frames! Similar to the AXI DMA, main memory is accessed through AXI4-Full interfaces, whereas access to a device is done through the simpler AXI4-Stream interface. The major difference is that synchronization on each new video frame is done with either one of the user-defined ( tuser ) signals in the AXI4-Stream interface or an fsync signal on the VDMA itself, while the last ( tlast ) signal in the AXI4-Stream interface signifies the end of a video line. But a lot of the lower-level details are in fact handled by the VDMA and the Xilinx HLS Video library. I should point out the only core I created myself is the HLS Video core.

A closer look at how the HLS Video core is connected to a VDMA core. The AXI memory-map to stream ( M_AXI_MM2S ) and stream to memory-map ( M_AXI_S2MM ) interfaces connect to main memory through the PS.

Despite the fact the RTL design utilizes a significant amount of resources, one of my biggest regrets with the project is that I don't do as much video processing in the FPGA as I originally wanted. Specifically, I only perform the resizing of 320x240 video frames to 640x480 video frames in the HLS Video core. The other image processing algorithms ( i.e. Sobel, Harris, and Canny ) are computed with OpenCV image processing functions in software. In my future project with my HD camera, I definitely intend to focus a bit more on hardware acceleration with the FPGA!
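
For a rough idea of what the resize looks like on the HLS side, below is an illustrative sketch built on the hls_video.h library; the function name, pragmas, and types are my assumptions for this post, not the project's actual source.

    // Illustrative sketch of an HLS core that upscales a 320x240 AXI4-Stream
    // frame to 640x480 ( not the project's actual source ).
    #include "hls_video.h"
    #include "ap_axi_sdata.h"

    typedef ap_axiu<32, 1, 1, 1> pixel_t;  // 4-channel 8-bit pixel
    typedef hls::stream<pixel_t> axis_t;

    void hls_video_resize(axis_t &src_axis, axis_t &dst_axis)
    {
    #pragma HLS INTERFACE axis      port=src_axis
    #pragma HLS INTERFACE axis      port=dst_axis
    #pragma HLS INTERFACE s_axilite port=return
    #pragma HLS DATAFLOW

        hls::Mat<240, 320, HLS_8UC4> src(240, 320);
        hls::Mat<480, 640, HLS_8UC4> dst(480, 640);

        // tuser marks the start of a frame and tlast the end of a line; both
        // are handled internally by these library calls.
        hls::AXIvideo2Mat(src_axis, src);
        hls::Resize(src, dst);
        hls::Mat2AXIvideo(dst, dst_axis);
    }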

There are two VDMAs implemented in this project. The Digilent AXI Display Core, which drives the VGA interface, acquires video frames with one VDMA, and the HLS Video core depends on the second. Both VDMAs are configured in free-running / parking modes of operation; however, the VDMA connected to the AXI Display Core only needs to read.

Software. I of course needed to learn how to configure the USB controller ( and its PHY ) as a host with EHCI. Luckily, Xilinx has a wiki where they demonstrate how to properly edit the device tree and enable the right drivers for the kernel ( it's almost nice that I'm doing this type of project so late ). Not mentioned in the wiki, though, was the fact that the USB libraries need to be included in the root file system. And, in order to run the USB utilities, those need to be enabled in the root file system as well. Apart from changing the device tree, all of these configurations are done with the petalinux-config tool, of course!

Another issue I ran into was how to take advantage of interrupts from a user application without having to create a separate Linux driver module for each core. Plus, I especially wanted to use the Standalone drivers, even for the GPIO. Previously, my go-to driver for the AXI GPIO core in Linux had been the SysFs driver, but I didn't like the fact that I would have to constantly access files in order to utilize a few GPIO signals. Instead, I took full advantage of the generic userspace I/O ( UIO ) driver, which not only lets me import Standalone drivers but also makes accessing interrupts super easy! I still have plans on implementing a Linux driver module ( just a little goal of mine ), but I want to explore HLS a bit more in depth first.
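
To give an idea of why UIO makes interrupts so easy, here's a minimal sketch of the generic UIO pattern, assuming the usual uio_pdrv_genirq setup; the /dev/uio0 node and map size are placeholders for whatever the device tree actually assigns.

    // Minimal sketch of the generic UIO pattern: mmap the core's registers and
    // block on its interrupt. /dev/uio0 and the map size are placeholders here.
    #include <cstdint>
    #include <cstdio>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main()
    {
        int fd = open("/dev/uio0", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        // Map the core's register space into the application's address space so
        // a Standalone driver can poke it directly.
        const size_t map_size = 0x10000;
        volatile uint32_t *regs = static_cast<volatile uint32_t *>(
            mmap(nullptr, map_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
        if (regs == MAP_FAILED) { perror("mmap"); return 1; }

        while (true) {
            // Re-enable the interrupt in the UIO driver ( write a nonzero value ).
            uint32_t enable = 1;
            write(fd, &enable, sizeof(enable));

            // Block until the core raises its interrupt; the value read back is
            // the total interrupt count so far.
            uint32_t count;
            if (read(fd, &count, sizeof(count)) == sizeof(count))
                printf("interrupt %u, register 0 = 0x%08x\n", count, regs[0]);
        }
    }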

So, here's a small outline of how the software is structured. I'm going to avoid including a ton of code snippets since I find they're not as appealing as a graphical representation, such as the screenshots taken from the IP Integrator.
  1. Perform the necessary memory maps from virtual to specified physical memory! The Standalone drivers and the drivers I write are intended to run closer to the hardware; in other words, the virtual addresses granted by the Linux kernel to user applications are no good! ( A rough sketch covering this step and steps 4 and 7 appears after this list. )
  2. Configure the camera and set the resolution to 320x240! Programming this step is simple, once you ensure the kernel has the right drivers and libraries! And, by simple, I am referring to the OpenCV VideoCapture class.
  3. Configure the display! Because I am using the Digilent AXI Display core, I only needed to make a few modifications to their driver. In fact, I needed to make similar modifications to the other Standalone drivers since they all depend on physical addresses, not the kernel's virtual addresses. ( In an effort not to repeat myself, I won't mention this again for the other drivers, but know that it's implied. )
  4. Associate OpenCV with the video frame buffers. This step involves associating the video frame buffers --- which are placed outside the memory space of the kernel --- with OpenCV Mat objects. This step is crucial since I want to depend on the functionality of OpenCV instead of re-implementing all the filters! More on this detail in the main loop of the application!
  5. Configure GPIO driver for user I/O. Nothing too fancy here, other than I am once again using the Standalone driver. A separate thread is launched in order to avoid polling the GPIO core for new input; the thread is written such that it waits on the GPIO's interrupt.
  6. Configure the HLS Video core. This step not only involves setting up the HLS Video core, but also the respective VDMA. A nice feature of the HLS tool is that a software driver is automatically generated for both Linux ( using the generic UIO driver ) and Standalone... but I didn't like how the Linux / UIO driver was structured, so I ended up doing what I did for the other cores that have Standalone drivers!
  7. Run the main loop! This step is of course composed of multiple steps! Before getting to what those steps are, I want to point out that frames are buffered to ensure the visuals shown on the display appear to change smoothly; without frame buffering, you can actually see the changes occurring at a line and pixel level. Thus, existing outside the memory space of the kernel are a process frame and a display frame, both of which have a resolution of 640x480 and 4-channel 8-bit pixels.
    1. Display the frame for which the processing is finished. Pretty self-explanatory. The details of this operation are abstracted away by the Digilent driver. Since the source code is freely available, you can see that changing the frame involves configuring the VDMA to park at a selected frame. Not sure whether the term "park" is specific to the AXI VDMA core or general to other DMAs specialized for video, but it basically means the VDMA is configured to continuously read over a single frame.
    2. Perform the processing on a separate frame. After a frame is captured from the web cam, a filtering algorithm can be performed. The slide switches on the Zybo board enable the filtering operations: Gray, Sobel, Harris, and Canny. To be honest, the focus of the project wasn't necessarily on the filters, mainly because they're OpenCV function calls. The important sub-steps are to copy the filtered frame into an available frame buffer that exists outside the memory space of the kernel, and then trigger the VDMA core associated with the HLS Video core to perform the resizing.
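
Although I said I'd keep the snippets to a minimum, a very rough sketch might help tie steps 1, 4, and 7 together. The physical address, frame geometry, and helper functions below are placeholders ( and an OpenCV 3-style API is assumed ), so treat this as an outline rather than the project's actual code --- the repository has the real thing.

    // Very rough sketch of steps 1, 4, and 7: map a physical frame buffer
    // through /dev/mem, wrap it in a cv::Mat, and run the capture / filter /
    // display loop. PROC_FRAME_PHYS and the helper functions are placeholders.
    #include <opencv2/opencv.hpp>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static const off_t  PROC_FRAME_PHYS  = 0x1F000000;      // placeholder physical address
    static const size_t PROC_FRAME_BYTES = 320 * 240 * 4;   // 4-channel 8-bit pixels

    // Placeholders for what the ported Standalone drivers do in the real project.
    static void start_hls_resize()            {}                // kick the HLS core + its VDMA
    static void park_display_frame(int frame) { (void)frame; }  // Digilent display driver
    static int  read_slide_switches()         { return 0; }     // AXI GPIO via UIO

    int main()
    {
        // Step 1: map the physical frame buffer ( which lives outside the
        // kernel's memory space ) into this application's virtual address space.
        int mem_fd = open("/dev/mem", O_RDWR | O_SYNC);
        void *buf = mmap(nullptr, PROC_FRAME_BYTES, PROT_READ | PROT_WRITE,
                         MAP_SHARED, mem_fd, PROC_FRAME_PHYS);
        if (buf == MAP_FAILED) return 1;

        // Step 4: let OpenCV treat that raw buffer as a 320x240, 4-channel image.
        cv::Mat proc_frame(240, 320, CV_8UC4, buf);

        // Step 2: capture 320x240 frames from the USB web cam.
        cv::VideoCapture cap(0);
        cap.set(cv::CAP_PROP_FRAME_WIDTH, 320);
        cap.set(cv::CAP_PROP_FRAME_HEIGHT, 240);

        cv::Mat raw, gray, filtered;
        while (cap.read(raw)) {
            // Step 7.2: pick a filter with the slide switches ( plain OpenCV
            // calls; Harris is left out of this sketch ).
            cv::cvtColor(raw, gray, cv::COLOR_BGR2GRAY);
            switch (read_slide_switches()) {
            case 1:  cv::Sobel(gray, filtered, CV_8U, 1, 0); break;
            case 2:  cv::Canny(gray, filtered, 50, 150);     break;
            default: filtered = gray;                        break;
            }

            // Copy the result into the mapped frame buffer and have the HLS
            // core resize it to 640x480 for the display...
            cv::cvtColor(filtered, proc_frame, cv::COLOR_GRAY2BGRA);
            start_hls_resize();

            // Step 7.1: ...then park the display VDMA on the frame that's ready.
            park_display_frame(0);
        }

        munmap(buf, PROC_FRAME_BYTES);
        close(mem_fd);
        return 0;
    }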
That's it! A lot of text went into explaining the main points of the project, so I would suggest looking at the source code in the GitHub repository ( see the link at the beginning of this post ).

What's next? Well, I am disappointed with two aspects of this project. 1 ) I didn't do as much video processing in the FPGA as I originally planned. I could have done more, but I also planned to develop the embedded design such that different filters could be selected. As far as I can see, you can't easily multiplex different video streams within a single HLS core. In retrospect, I should have learned how to utilize the AXI4-Stream Switch, but I completely overlooked its existence. 2 ) I've had an HD camera for a while, but never bothered to get the cable to connect it to the Zybo board's HDMI port. And, I've seen a video on YouTube of someone streaming video from the HD camera to the Zybo! If I can do the same, I can avoid having to reduce the resolution so much!

The HD camera I'll hopefully be able to use! 

Moreover, this project only showcases real-time video processing, but doesn't do anything with the extracted features ( i.e. the filtered images ). So, in the future, I plan to conceive a scenario and build an embedded design that will force me to not only perform video processing in real time, but also perform pattern recognition / decision making!

But before another video capture project, I want to return to doing another audio / video-oriented project. Specifically, I have already started remaking ( and improving ) a project I did a few months ago on the Nexys 4 DDR: Visual Equalizer! 

31. Linux on Zynq / VGA and OpenCV

posted Aug 17, 2016, 3:04 PM by Andrew Powell

Finished my latest project on using the VGA interface from a bunch of user applications running in PetaLinux. In the past, I had developed my own VGA core with a hard-coded 640 x 480 resolution, a resolution which didn't seem to work with many monitors. For the latest project, I instead opted to use the Digilent AXI Display Core and the Xilinx AXI VDMA, with which the resolution can be configured at run-time and the HDMI interface can easily be used in future projects. I couldn't find a Linux driver for the VDMA, and I didn't realize I could have used the AXI Thin Film Transistor ( TFT ) Controller, so I ported the bare-metal drivers for both the AXI VDMA and the Display Core to the Linux applications. In a later project, I will have to try the AXI TFT Controller since, from my limited research, it appears there's a Linux driver module I should be able to enable and run Qt applications with.

Anyways, the project contains a few user applications. The first video demonstrates the first two, including a ported version of the Digilent example and a separate application that I wrote. The second video showcases the application using OpenCV to decode JPEG and PNG images. Enjoy! Here's the repository.


30. Linux on Zynq / Piano Project

posted Aug 4, 2016, 7:46 AM by Andrew Powell

Completed the next project on learning Linux for the Zynq! In short, the Zybo behaves like a piano! The video basically covers everything covered in this post, but I felt as though I didn't do a good job of explaining the goals and challenges of the project. The repository for the project can be found here under "zybo_petalinux_piano".



So, the original goals --- and the expected challenges --- of the project were to 1 ) figure out how a user application running in Linux can access the memory map of a core running in programmable logic and 2 ) learn how to configure and use the Linux SPIdev driver. However, I had some unexpected challenges dealing with 3 ) making sure the kernel loaded properly from flash.

1 ) Accessing devices from user-space
Accessing physical addresses from a bare-metal application is, well, pretty straightforward since virtual memory isn't really a thing there. You just go ahead and access whatever physical address you need. From a user-space application, whose memory space doesn't directly map to physical addresses, there's a necessary extra step in which the right block of physical addresses must be mapped to a block of virtual addresses. This step is obviously something a device driver would do in order to free the user from having to worry about such a low-level detail ( and, not to mention, protect the memory space ). However, I'm going to leave writing a Linux driver --- which will likely consist of finding another driver and using it as a template --- for another project.
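
Concretely, that extra step boils down to an mmap of /dev/mem, roughly like the sketch below; the base address is just a placeholder for wherever the core actually sits in the Zynq's memory map.

    // Minimal sketch of mapping a PL core's registers from user space through
    // /dev/mem. GPIO_BASE_PHYS is a placeholder for the core's actual base address.
    #include <cstdint>
    #include <cstdio>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static const off_t  GPIO_BASE_PHYS = 0x41200000;  // placeholder base address
    static const size_t MAP_SIZE       = 0x10000;

    int main()
    {
        int fd = open("/dev/mem", O_RDWR | O_SYNC);
        if (fd < 0) { perror("open /dev/mem"); return 1; }

        // Map one page-aligned block of physical addresses into virtual memory.
        volatile uint32_t *regs = static_cast<volatile uint32_t *>(
            mmap(nullptr, MAP_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
                 GPIO_BASE_PHYS));
        if (regs == MAP_FAILED) { perror("mmap"); return 1; }

        // From here on, register accesses look just like they would bare-metal,
        // only through the returned virtual pointer.
        printf("register 0 reads 0x%08x\n", regs[0]);

        munmap((void *)regs, MAP_SIZE);
        close(fd);
        return 0;
    }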

2 ) Using SPIdev
Unlike I2Cdev, SPIdev surprisingly took a bit more effort to get working. Not sure why or how, but the configurations related to I2Cdev are completely automated when configuring the PetaLinux project with the hardware description. Specifically, I didn't need to add any lines to the device tree, nor did I have to manually configure the project to include the right drivers in u-boot and the kernel. I simply wrote the application and it worked like a dream! Alas, I needed to do everything for SPIdev that I didn't have to do for I2Cdev. It wasn't a big deal, especially considering it forced me to learn a bit more about how to configure the device tree, u-boot, and the kernel.
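
For reference, once the device tree and the kernel options are in place, talking to SPIdev from a user application looks roughly like the following sketch; the device node, SPI mode, and clock speed are placeholders rather than the project's actual settings.

    // Rough sketch of a spidev transfer from user space; the device node, SPI
    // mode, and clock speed below are placeholders.
    #include <cstdint>
    #include <cstdio>
    #include <cstring>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/spi/spidev.h>

    int main()
    {
        int fd = open("/dev/spidev1.0", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        // Configure the bus: SPI mode 0, 8 bits per word, 1 MHz clock.
        uint8_t  mode  = SPI_MODE_0;
        uint8_t  bits  = 8;
        uint32_t speed = 1000000;
        ioctl(fd, SPI_IOC_WR_MODE, &mode);
        ioctl(fd, SPI_IOC_WR_BITS_PER_WORD, &bits);
        ioctl(fd, SPI_IOC_WR_MAX_SPEED_HZ, &speed);

        // Full-duplex transfer: send two bytes, receive two bytes back.
        uint8_t tx[2] = { 0x01, 0x80 };
        uint8_t rx[2] = { 0 };
        struct spi_ioc_transfer xfer;
        memset(&xfer, 0, sizeof(xfer));
        xfer.tx_buf        = reinterpret_cast<uintptr_t>(tx);
        xfer.rx_buf        = reinterpret_cast<uintptr_t>(rx);
        xfer.len           = sizeof(tx);
        xfer.speed_hz      = speed;
        xfer.bits_per_word = bits;

        if (ioctl(fd, SPI_IOC_MESSAGE(1), &xfer) < 0) { perror("SPI_IOC_MESSAGE"); return 1; }
        printf("received 0x%02x 0x%02x\n", rx[0], rx[1]);

        close(fd);
        return 0;
    }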

3 ) Loading kernel from flash
What I found surprising was the difficulty in getting u-boot to load the kernel correctly. I ran into the issue once I got the infamous "Wrong Image for bootm command" message from u-boot --- infamous because many other people have run into similar errors that result in the same message. From what I gathered from a bunch of forum posts and blogs, the message is typically the consequence of u-boot trying to load the kernel image from the wrong place in memory. More accurately, the kernel might actually be located in the wrong place, but of course u-boot doesn't know that.

What I didn't realize was that the command "petalinux-package --boot --fsbl <FSBL> --fpga <BITSTREAM> --u-boot" doesn't automatically include the kernel when it generates the boot file. The "--kernel" argument needs to be added at the end of the command, else the generated boot file will have a size far less than that of the image itself. I had totally thought the inclusion of the kernel was implied, mostly because UG1144 gave me that impression initially. Upon closer inspection, though, I now realize the instructions on creating a boot image state that it "contains first stage boot loader image, FPGA bitstream, and u-boot". So, the misuse of the "petalinux-package --boot" command was totally on me.

However, even though the size of the boot file started to make sense, u-boot still couldn't load the kernel. The culprit this time was that the "petalinux-package --boot" command doesn't actually put the kernel at the correct offset on its own. In fact, I'm 99% certain it places all the data files directly next to each other, even though the kernel is supposed to be located in its own partition. I mean, was it wrong to assume the packaging command would automatically know to do this, considering the project was already configured for specific partitions? Anyways, I had thought the fix would be to simply add the "--offset" argument with the correct offset to the packaging command, but u-boot still couldn't load the image!

So, I dropped "petalinux-package --boot" altogether and instead ran bootgen from the SDK. Works like a charm.


29. Sumo Project / Running the Tracking and Border Algorithm on Prototype 2

posted Jul 28, 2016, 12:27 PM by Andrew Powell   [ updated Aug 1, 2016, 7:09 AM ]

Not going to let this be a long post. Basically, Prototype 2 can now officially do what Prototype 1 did. Instead of having the robot run at a slow speed, I raised the speed of the motors so it looks a bit cooler. ( update ) The first iteration of the Border Algorithm is implemented, as well! Here's the robot in action:




28. Linux on Zynq / New Series!

posted Jul 25, 2016, 4:55 AM by Andrew Powell   [ updated Jul 25, 2016, 4:56 AM ]

Finally got around to my new series on learning the basics of PetaLinux on Zynq! I've already posted a few videos demonstrating my progress, posted those projects to a repository, and, at some point, I'll create a project page on my HACKADAY.IO! 

Video on learning the I2C Device driver

The rough plan thus far is to learn the way I learn best: by completing projects that force me to learn. In other words, I'll be completing embedded designs on the Digilent Zybo that will force me to learn Linux drivers --- specifically GPIO, I2C, SPI, DMA, and USB --- and different ways of interacting with the Linux kernel from a host machine --- specifically TFTP and NFS. Along the way, I'll most likely end up learning other useful tools, as well.
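
As a taste of what the I2C portion involves, the i2c-dev interface shown in the video above boils down to a pattern roughly like this sketch; the bus number, slave address, and register are placeholders.

    // Rough sketch of the Linux i2c-dev pattern; the bus number, 7-bit slave
    // address, and register below are placeholders.
    #include <cstdint>
    #include <cstdio>
    #include <fcntl.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/i2c-dev.h>

    int main()
    {
        // Each I2C bus enabled in the device tree shows up as /dev/i2c-N.
        int fd = open("/dev/i2c-0", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        // Select which slave on the bus subsequent reads / writes will talk to.
        const int slave_addr = 0x48;
        if (ioctl(fd, I2C_SLAVE, slave_addr) < 0) { perror("I2C_SLAVE"); return 1; }

        // Typical register read: write the register address, then read the value.
        uint8_t reg = 0x00;
        uint8_t value = 0;
        if (write(fd, &reg, 1) != 1 || read(fd, &value, 1) != 1) {
            perror("transfer");
            return 1;
        }
        printf("register 0x%02x reads 0x%02x\n", reg, value);

        close(fd);
        return 0;
    }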

Even though I have used Ubuntu many times, I can't say I know how Linux operating systems work at a fundamental level. And, after finishing the first several projects in this series, I'm starting to see how useful it is to have fundamental knowledge of Linux. So, in addition to completing several projects leading up to a final project, I'm going to follow an online guide that describes Linux to beginners. As I've said, I am by no means new to Linux. However, I find it's extremely useful for embedded development to brush up on even some basic knowledge, considering all Linux operating systems are --- well --- Linux operating systems.

Video on learning the GPIO SysFs driver

My goal for the final project, hopefully by the end of the summer, is to develop an embedded system on the Zybo that can perform image processing on a live video feed taken from a USB web cam. Not entirely sure how feasible this feat is. But, from a project someone else did, I found out you can get the source code of Linux drivers if you know where to look. So, in theory, if I can find the source of my web cam's driver, I can compile it under PetaLinux! Haven't tried yet since this project is so far down the line. But, I believe!

27. Sumo Project / Prototype 2's Electronic System

posted Jul 22, 2016, 1:41 PM by Andrew Powell   [ updated Jul 22, 2016, 1:41 PM ]

Just finished soldering and testing the components for Prototype 2! Pretty excited, especially since I don't have to continue soldering ( that is, assuming nothing breaks )! The next steps are to complete the Border Algorithm, the set of rules that determine what to do if the robot tries to cross the border and when the robot will return to moving forward. The Border and Tracking Algorithms will then have to be integrated and optimized, especially the Tracking Algorithm. Part of the optimization will include implementing a calibration mode so that the sensors report consistent values for the same distances.
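
Just to give a flavor of what such a rule set might look like, here's a purely illustrative sketch; the sensors, states, and timing are made up for this post and are not the robot's actual firmware.

    // Purely illustrative sketch of a border-handling rule set for a sumo
    // robot; the sensor readings, states, and timing below are made up.
    #include <cstdint>

    enum class BorderState { FORWARD, BACK_UP, TURN_AWAY };

    struct BorderRules {
        BorderState state = BorderState::FORWARD;
        uint32_t    timer = 0;

        // Called every control tick with the two line-sensor readings; returns
        // the state the drive code should act on.
        BorderState update(bool left_on_border, bool right_on_border, uint32_t threshold_ticks) {
            switch (state) {
            case BorderState::FORWARD:
                // Rule 1: if either sensor sees the border, stop tracking and back up.
                if (left_on_border || right_on_border) {
                    state = BorderState::BACK_UP;
                    timer = 0;
                }
                break;
            case BorderState::BACK_UP:
                // Rule 2: back up for a fixed number of ticks, then turn away.
                if (++timer >= threshold_ticks) {
                    state = BorderState::TURN_AWAY;
                    timer = 0;
                }
                break;
            case BorderState::TURN_AWAY:
                // Rule 3: after turning away from the border, return to moving
                // forward and hand control back to the Tracking Algorithm.
                if (++timer >= threshold_ticks)
                    state = BorderState::FORWARD;
                break;
            }
            return state;
        }
    };

    int main()
    {
        BorderRules rules;
        // Example tick: the right sensor sees the border, so the robot stops
        // tracking and starts backing up.
        rules.update(false, true, 50);
        return 0;
    }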

Anyways, check out the latest video!


26. Sumo Project / It lives!

posted Jul 6, 2016, 7:14 PM by Andrew Powell   [ updated Jul 11, 2016, 5:16 PM ]

So yeah, Prototype 1 can officially execute the control algorithm explained in the last post! Check it out!

First video of the Tracking Algorithm. The robot only turns in the direction of an object.

Updated Tracking Algorithm. The robot actually tries to follow another object. 

I don't think I explained this yet, but the plan is to create three prototypes by the end of the summer. Prototype 1 will focus on the tracking algorithm and on ensuring all the components operate as intended. The robot will start looking pretty with Prototype 2: my mechanical teammate, Dmytri, will have a newly designed chassis ready, and I aim to have the majority of the electronics soldered down onto a prototyping board. Optimizing the tracking algorithm's parameters will also be an important focus for Prototype 2.

Finally, I'm especially excited for Prototype 3. There will likely be an updated design for the chassis and motors. But, one of my biggest objectives will be to create a PCB! 
