Tuesday, November 25, 2014

AXI interfacing with Chisel

I recently started using Chisel, a hardware construction language from UC Berkeley implemented as a Scala DSL. Although it still has some rough edges, it's definitely usable and I really like how it can map the same description into Verilog or a cycle-accurate C++ simulation.

I'm currently working on hardware accelerators for irregular applications (like graph traversal and sparse matrix operations - oh wait, we can do one with the other). Although simulation goes a long way for testing and evaluation, I prefer FPGA implementations (even though they take so much more time and effort, the end product actually is a computer). I found that Chisel also helps out quite a bit with productivity there. Along the way, I made a little something that someone else might find useful: a collection of AXI4 interface definitions and simple peripherals in Chisel. Here's the GitHub repo:

https://github.com/maltanar/axi-in-chisel

I started working on this because I wanted more hands-on experience with Chisel and a faster way of making my hardware prototypes work with the AXI interfaces found on e.g. Xilinx FPGAs and programmable SoCs. There are 50+ signals/ports on a full-blown AXI interface, which can be daunting. However, these can be organized into a handful of address, data and response channels, each based on the decoupled (ready/valid) abstraction, which fits nicely with Chisel's Decoupled-style interfaces and custom types.
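To make the ready/valid idea concrete, here is a minimal behavioural sketch in Python (not Chisel, and not code from the repository). The `Decoupled` class and `simulate` helper are hypothetical names made up for illustration; the only facts taken from the AXI4 spec are the five channel names and their directions. A transfer happens only in a cycle where both `valid` and `ready` are high.

```python
from dataclasses import dataclass

# The five AXI4 channels and who drives the payload on each one.
AXI_CHANNELS = {
    "AW": "write address (master -> slave)",
    "W": "write data (master -> slave)",
    "B": "write response (slave -> master)",
    "AR": "read address (master -> slave)",
    "R": "read data (slave -> master)",
}


@dataclass
class Decoupled:
    """Hypothetical software model of one ready/valid channel."""
    valid: bool = False   # driven by the producer
    ready: bool = False   # driven by the consumer
    bits: object = None   # payload, meaningful only while valid is high

    def fire(self) -> bool:
        # A transfer completes only when both sides agree in the same cycle.
        return self.valid and self.ready


def simulate(producer_data, consumer_ready_pattern):
    """Step one channel cycle by cycle and return (cycle, data) transfers."""
    channel = Decoupled()
    pending = list(producer_data)
    transfers = []
    for cycle, ready in enumerate(consumer_ready_pattern):
        # The producer holds valid (and the same data) until it is accepted.
        channel.valid = bool(pending)
        channel.bits = pending[0] if pending else None
        channel.ready = ready
        if channel.fire():
            transfers.append((cycle, pending.pop(0)))
    return transfers
```

For instance, `simulate([10, 20, 30], [True, False, True, True, True])` stalls for one cycle when the consumer drops `ready`, then resumes with the same data word.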

It is worth mentioning that the code here targets the Verilog backend and hardware synthesis only, since there are no testbenches. The generated Verilog should be straightforward to use with the Xilinx IP packager. The peripherals aren't extensively tested, but they performed as expected on a ZedBoard (pushed through Vivado for synthesis).

Right now the repository is a haphazard collection of Chisel source files, with varying degrees of commenting in each:

SimpleReg - a translation of the AXI Lite slave template (register file) generated by Vivado

SumAccel - read 3 consecutive words from address 0x10000000 using an AXI Lite master and sum them (the result can be read through the AXI Lite slave interface)

HPSumAccel - read a specified number of words from a specified address in large bursts and sum them. Note that the number of words should be a multiple of the burst length (set to 512 32-bit words by default), since this doesn't handle breaking a transfer into smaller bursts. The result and total elapsed cycles can be read through the slave interface.
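The multiple-of-burst-length restriction on HPSumAccel is easy to check on the host side before starting the accelerator. The following is only a sketch of that arithmetic; `plan_bursts` and `pad_to_burst` are hypothetical helper names and are not part of the repository. The default of 512 words of 4 bytes each comes from the description above.

```python
def plan_bursts(base_addr, num_words, burst_words=512, bytes_per_word=4):
    """Return the start address of each burst for a transfer.

    Raises ValueError if num_words is not a whole number of bursts,
    mirroring the restriction described for HPSumAccel.
    """
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    if num_words % burst_words != 0:
        raise ValueError(
            f"{num_words} words is not a multiple of the burst length "
            f"({burst_words} words)"
        )
    burst_bytes = burst_words * bytes_per_word
    return [base_addr + i * burst_bytes for i in range(num_words // burst_words)]


def pad_to_burst(num_words, burst_words=512):
    """Round a word count up to the next multiple of the burst length."""
    return -(-num_words // burst_words) * burst_words
```

With the defaults, each burst covers 2048 bytes, so 1024 words starting at 0x10000000 become two bursts at 0x10000000 and 0x10000800. A buffer of 1000 words would have to be padded to 1024 words (for example with zeros, which do not change the sum).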

I realize I could have used DMA IP cores to do most of this, but keeping the interfacing in Chisel has some benefits like tighter integration with the peripherals. Plus it's been a good learning experience :)

Saturday, May 3, 2014

A Programming Utility for the Avnet Spartan-6 Evaluation Kit

While playing around with FPGA-based sparse matrix-vector accelerators on the Avnet Spartan-6 LX16 Evaluation Kit, I realized I needed a utility for uploading binary files to the board for testing. The board supports programming over a USB serial connection, but Avnet provides only a Windows utility for this purpose, and it doesn't have the option to select and send binary files. I was also getting rather sick of having to switch to a Windows virtual machine just to run the programming utility, not to mention this beautiful user interface with jelly buttons and everything:


So I decided to make a small utility that would run on Linux, with simple text-based rx/tx as well as support for FPGA bitfile configuration and sending binary files. The "hardest" part was deciphering the text-based protocol for talking to the on-board PSoC, which is responsible for manipulating the necessary FPGA pins and pushing in the new configuration file. Since the PSoC firmware documentation was nowhere to be found on Avnet's download pages, I ended up using a serial port sniffer to see what the protocol looked like, which was something like this:

(lines prefixed with >> are TX from the progutil to the board)
>> get_config
ack
>> get_ver
3.0.2 ack
>> load_config 1
ack
>> drive_prog 0
ack
>> drive_mode 7
ack
>> spi_mode 1
ack
>> drive_prog 1
ack
>> read_init
ack ack
>> drive_mode 8
ack
>> fpga_rst 1
ack
>> ss_program
ack
>> 
ack
>> read_init
ack ack
>> read_done
ack ack
>> spi_mode 0
ack
>> load_config 0
ack
>> fpga_rst 0
ack
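The trace above suggests a simple pattern: send one text command, then read back a reply that ends in `ack`, optionally preceded by a value (as in `3.0.2 ack`). Here is a minimal Python sketch of that pattern. It is not the code from the utility, and several details are assumptions that the trace does not show: the `\r\n` line terminator, the ASCII encoding, and the idea that the last token is always the acknowledgement. `parse_response`, `send_command` and `FakePort` are hypothetical names; `port` can be any object with `write` and `readline`, such as an opened serial port.

```python
def parse_response(line):
    """Split a reply such as '3.0.2 ack' into (payload, acked).

    Assumption: the final whitespace-separated token is the acknowledgement
    and anything before it is a returned value.
    """
    tokens = line.split()
    acked = bool(tokens) and tokens[-1] == "ack"
    payload = " ".join(tokens[:-1]) if acked else " ".join(tokens)
    return payload, acked


def send_command(port, command):
    """Send one command and return the payload of its acknowledged reply."""
    # Assumption: commands are ASCII text terminated by CR LF.
    port.write((command + "\r\n").encode("ascii"))
    reply = port.readline().decode("ascii", errors="replace")
    payload, acked = parse_response(reply)
    if not acked:
        raise IOError(f"no ack for {command!r}: got {reply!r}")
    return payload


class FakePort:
    """Stand-in for a serial port that replays canned replies, for testing."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.sent = []

    def write(self, data):
        self.sent.append(data)

    def readline(self):
        return self.replies.pop(0) if self.replies else b""
```

For example, `send_command(FakePort([b"3.0.2 ack\r\n"]), "get_ver")` returns `"3.0.2"`. How the bitstream itself is framed after `ss_program` is not visible in the trace, so it is left out of this sketch.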

It took a little bit of trial-and-error and some reading of documentation on QThreads (for data rx-tx without blocking the user interface), and the result ended up looking like the following:



I cannot claim huge improvements on the user interface front, but at least it works on Linux and supports the minimal feature set needed to be useful. I'm planning to add support for power measurements, and the code could certainly use a lot of cleanup, but we'll see how much time I have for that. If you are interested, the sources are available at:

https://github.com/maltanar/s6lxek-progutil

I eventually managed to Google my way to the firmware documentation, which I've also put on GitHub in case someone needs it at some point.