2022-04-19 – Part 1: Establishing communication
If I’m going to find the problem with EightThirtyTwo that’s preventing interrupts from working, I’m going to need some way of observing what’s going on. The CPU works in GHDL simulation, works on Altera/Intel chips, and works on Xilinx chips, so there must be something I’m doing which the Open Source toolchain doesn’t like. (Or I may have stumbled upon an actual bug…)
There are basically three problems to solve here:
- Capturing the state of internal signals
- Transporting those signals to the host computer
- Displaying them in a meaningful and readable format.
The question is really whether I should spend time and effort solving them in a general way, to be useful in future, or whether I should just duct-tape something together to solve this issue and move on. I’ll be honest, I’m going to do the latter, because I don’t yet have a clear or complete enough overview of the problem space to do the former – but that may change once I’ve learned some lessons from building “one to throw away”.
So firstly, capturing internal signals:
In order to be useful for debugging, the system must be able to capture data from a potentially large number of signals on successive clocks – this implies wide data, and lots of it – almost certainly more than we can hope to stream in realtime, so we need somewhere to put it. For this we’re going to need a FIFO queue, and it needs to be – at least potentially – both wide and deep. For debugging EightThirtyTwo I can probably set up triggers manually on a hard-wired basis – merely capturing the Program Counter for a couple of thousand cycles immediately after the interrupt signal goes high will give me plenty of information. But in the longer term it would be nice to be able to set triggers from the computer.
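To make the capture idea concrete, here is a minimal software model of a hard-wired trigger feeding a fixed-depth FIFO. It is only a sketch in Python, not the VHDL in the actual design: the class name `TriggeredCapture`, the depth of 8 and the simulated Program Counter values are all made up for illustration.

```python
from collections import deque


class TriggeredCapture:
    """Software model of a trigger-armed capture FIFO (hypothetical names)."""

    def __init__(self, depth):
        self.depth = depth        # number of samples the FIFO can hold
        self.fifo = deque()
        self.triggered = False

    def clock(self, trigger, sample):
        """Call once per simulated clock cycle."""
        if trigger:
            self.triggered = True
        # Once triggered, store one sample per clock until the FIFO is full.
        if self.triggered and len(self.fifo) < self.depth:
            self.fifo.append(sample)

    def drain(self):
        """Return the samples in capture order and re-arm the trigger."""
        out = list(self.fifo)
        self.fifo.clear()
        self.triggered = False
        return out


# Simulate 20 cycles: the (made-up) Program Counter advances by 4 each
# cycle and the interrupt line goes high on cycle 5.
cap = TriggeredCapture(depth=8)
for cycle in range(20):
    pc = 0x100 + 4 * cycle
    cap.clock(trigger=(cycle == 5), sample=pc)

captured = cap.drain()
print([hex(v) for v in captured])
```

In this model the capture starts on the same cycle the trigger is seen and stops when the FIFO is full, so the depth directly sets how many cycles after the interrupt can be observed.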
Secondly, transporting data to and from the host computer:
A couple of posts ago I wrote about my adventures talking to designs via the Xilinx Platform Cable, and the Tcl extension I ended up writing to provide the functionality I wanted. That’s basically the same functionality I need here: communicating between the FPGA design and the host computer. JTAG is the traditional method, and it makes sense to follow suit.
The DAPLink-compatible interface built into the IceSugar-Pro is not supported by xc3sprog and thus can’t be used with my Tcl extension – and adding support, while certainly possible, isn’t on my immediate agenda. The DAPLink is supported by OpenOCD, however, and since OpenOCD configuration files are in fact Tcl scripts I’m right at home in that environment. (Regrettably, though, as far as I can see there’s no Tk support.)
So the first thing is to figure out what JTAG facilities are available in the FPGA. The DAPLink on the IceSugar-Pro has the ability to select between two JTAG channels – the one used to configure the FPGA, and a second one which is connected to GPIOs on the FPGA. We could use this secondary channel if we implemented a full JTAG state machine, and we’d then have complete freedom to define as many Instructions as we wished, and as many Data Registers as we liked of whatever size we liked. For now, though, I’m going to stick to the first channel, and use the two user-defined registers offered by the ECP5 part – namely ER1 and ER2. These are selected by shifting the 8-bit values 0x32 and 0x38, respectively, into the JTAG TAP’s Instruction Register. Having done that, we can shift arbitrary data into and out of one of two arbitrarily-sized Data Registers, in much the same way as with the Xilinx BSCANE or Intel JTAG primitives. The primitive we need to use for the ECP5 is called “JTAGG” – and Tom Verbeure has some example code and a good description on GitHub.
There are a couple of subtleties to take care of, namely that two key signals are registered, and are thus delayed by one clock compared with the control signals. Nonetheless, I now have a test design running on the IceSugar-Pro which I think will serve well as a foundation for future debugging.
As I previously mentioned, to be useful we need to be able to capture wide data, and this implies a wide shift register for the JTAG circuitry. This isn’t a given, though, since most FPGAs are capable of creating RAM blocks with differing port sizes, making it possible to split wide words into narrower chunks. If register width ends up being a problem this is something we can explore.
Tcl is ideally suited to dealing with wide data, however, because wherever possible it treats everything as a string – so 32-bit or 64-bit integer limits are no obstacle. Nonetheless, I want to be able to shift data in and out of the JTAG subsystem in smaller chunks. (If we’re capturing half a dozen different signals, it might be convenient – though admittedly slower – to fetch them one at a time while writing them to, for example, a .vcd file.)
My test/demo project uses a 256 bit wide register (and FIFO queue), and tests both reading and writing the register in smaller pieces: a complete shift must still be 256 bits, or whatever the register’s width happens to be – but this can be a single 256-bit shift, 256 individual 1-bit shifts, or anything in between.
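The equivalence between one big shift and many small ones can be checked with a small software model. This is a sketch in Python rather than the Tcl the demo actually uses, and it assumes the register shifts LSB-first, with new bits entering at the top and old bits leaving at the bottom. The function names `shift_chunk` and `shift_in_chunks` are made up for illustration.

```python
WIDTH = 256  # register width used by the test design


def shift_chunk(register, data_in, nbits, width=WIDTH):
    """Shift nbits through an LSB-first shift register model.

    Returns (new_register, data_out): data_out holds the nbits that left
    the low end, while data_in enters at the high end.
    """
    mask = (1 << nbits) - 1
    data_out = register & mask
    register = (register >> nbits) | ((data_in & mask) << (width - nbits))
    return register, data_out


def shift_in_chunks(register, value, chunk_sizes, width=WIDTH):
    """Write `value` and read back the old contents, one chunk at a time."""
    if sum(chunk_sizes) != width:
        raise ValueError("a complete shift must cover the whole register")
    captured = 0
    offset = 0
    for nbits in chunk_sizes:
        piece = (value >> offset) & ((1 << nbits) - 1)
        register, out = shift_chunk(register, piece, nbits, width)
        captured |= out << offset   # reassemble the outgoing bits
        offset += nbits
    return register, captured


old = int("0123456789abcdef" * 4, 16)   # arbitrary 256-bit test patterns
new = int("fedcba9876543210" * 4, 16)

# One 256-bit shift, 256 one-bit shifts, and two mixed chunkings.
results = [
    shift_in_chunks(old, new, sizes)
    for sizes in ([256], [1] * 256, [64] * 4, [8, 16, 32, 200])
]
print(all(r == (new, old) for r in results))
```

Python integers, like Tcl's, are not limited to 32 or 64 bits, so the 256-bit values need no special handling here.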
Finally, displaying the results:
I’ve already hinted at one possibility here, by mentioning .vcd files. Creating such files and leaving the actual presentation to GTKWave or suchlike may be the simplest solution. As previously mentioned, the Tcl subset used by OpenOCD doesn’t appear to allow the use of Tk – however it is capable of acting as a web server, which makes for some very interesting possibilities.
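As a rough idea of how little is needed for the .vcd route, here is a minimal sketch of a writer for captured samples. It is Python rather than OpenOCD's Tcl, and the function name `write_vcd`, the signal names `irq` and `pc`, and the sample values are all invented for illustration. It emits only the bare minimum of the format, so a real viewer may want more header information.

```python
import os
import tempfile


def write_vcd(path, signals, samples, timescale="1 ns"):
    """Write captured samples as a minimal VCD file.

    signals: list of (name, width) tuples.
    samples: list of tuples, one value per signal for each clock.
    """
    # Single printable-ASCII identifier per signal (enough for up to 94).
    ids = [chr(33 + i) for i in range(len(signals))]
    with open(path, "w") as f:
        f.write(f"$timescale {timescale} $end\n")
        f.write("$scope module capture $end\n")
        for (name, width), ident in zip(signals, ids):
            f.write(f"$var wire {width} {ident} {name} $end\n")
        f.write("$upscope $end\n$enddefinitions $end\n")
        previous = [None] * len(signals)
        for t, row in enumerate(samples):
            changes = []
            for i, ((_, width), value) in enumerate(zip(signals, row)):
                if value != previous[i]:        # VCD records changes only
                    if width == 1:
                        changes.append(f"{value}{ids[i]}")
                    else:
                        changes.append(f"b{value:b} {ids[i]}")
                    previous[i] = value
            if changes:
                f.write(f"#{t}\n" + "\n".join(changes) + "\n")


# Hypothetical capture: an interrupt line and a 32-bit Program Counter.
signals = [("irq", 1), ("pc", 32)]
samples = [(0, 0x100), (1, 0x104), (1, 0x108)]
path = os.path.join(tempfile.mkdtemp(), "capture.vcd")
write_vcd(path, signals, samples)
with open(path) as f:
    text = f.read()
print(text)
```

Each timestamp here is simply the sample index, so one time unit corresponds to one captured clock.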
I’ve created a repo on GitHub to contain my IceSugar-Pro related experiments. I’m assuming that you’ll be using the oss-cad-suite, so if you want to try out the JTAG demo, activate the oss-cad-suite environment, enter the jtag directory and type “make build” to compile, “make config” to configure the FPGA, and “make run” to execute the example Tcl script under OpenOCD.