When I was debugging the ZPUFlex CPU core, I found myself using the ever-useful SignalTap to trace what was going on inside the CPU. One technique I wanted to use was to follow the program flow, and compare it against a simulated run through the program, thus spotting CPU bugs where the two diverged. To do this I needed a ZPU simulator.
There are two already in existence that I know of. The first is the one in the official ZPU git repo, written in Java and requiring GDB to work. There is, however, no pre-built GDB binary for the ZPU, so getting this up and running will be a challenge.
The second pre-existing simulator is the ZPUino one. This has some x86 assembly language in it, so can’t easily be built on a 64-bit Linux install, and is also tightly coupled to the ZPUino way of doing things.
Therefore, since I’d never written a CPU simulator before, I set about writing my own, and it’s now complete enough that it can run some of the demo apps from the ZPUDemos repo. The simulator can be found here in source form: https://github.com/robinsonb5/ZPUSim
Let’s start in the time-honoured tradition with a Hello World example, and run the Apps/HelloWorld example from the ZPUDemos repository.
The ZPUFlex core has a number of options which significantly change its operation. The simulator also has a few options which need to be set to match the CPU core:
The base address of the Stack RAM can be set in the core by way of VHDL generics, and it’s critical that this matches between the core and simulator. In the simulator this is done with the -o or -offsetstack option. The ZPUDemos/RS232Bootstrap and ZPUDemos/SDBootstrap projects map the stack to 0x40000000, which means stack addresses have bit 30 set, so we specify -o30.
If your code runs from a combined ROM / StackRAM and you’ve remapped the stack, then use the -b option. This sets the initial program counter to the start of Stack RAM, rather than zero.
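The arithmetic behind these two options can be sketched in a few lines of Python. This is only an illustration of the description above, not code taken from the simulator; the function names stack_base and initial_pc are made up for the example.

```python
def stack_base(offset_bit: int) -> int:
    """Base address of Stack RAM when it is remapped to the given address bit."""
    return 1 << offset_bit


def initial_pc(offset_bit: int, boot_from_stack: bool) -> int:
    """Start address: the Stack RAM base if -b is given, otherwise zero."""
    return stack_base(offset_bit) if boot_from_stack else 0


# -o30 corresponds to a stack mapped at 0x40000000.
print(hex(stack_base(30)))          # 0x40000000
print(hex(initial_pc(30, False)))   # 0x0
print(hex(initial_pc(30, True)))    # 0x40000000

# A stack pointer such as 0x40007ff8 has bit 30 set.
print(bool(0x40007FF8 & stack_base(30)))  # True
```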
So let’s simulate a few instructions from ZPUDemos/Apps/HelloWorld, using the -s parameter to set the number of steps to simulate:
> /path/to/zpusim -s10 -o30 Hello.bin
PC: 0 Op:b SP: 40007ff8, 40007ff8 nop 0 0 ffffffff ffffffff ffffffff ffffffff
PC: 1 Op:b SP: 40007ff8, 40007ff8 nop 0 0 ffffffff ffffffff ffffffff ffffffff
PC: 2 Op:b SP: 40007ff8, 40007ff8 nop 0 0 ffffffff ffffffff ffffffff ffffffff
PC: 3 Op:88 SP: 40007ff8, 40007ff4 im 8 8 0 0 ffffffff ffffffff ffffffff
PC: 4 Op:e5 SP: 40007ff4, 40007ff4 im (cont) 465 465 0 0 ffffffff ffffffff ffffffff
PC: 5 Op:4 SP: 40007ff4, 40007ff8 poppc 0 0 ffffffff ffffffff ffffffff ffffffff
PC: 465 Op:88 SP: 40007ff8, 40007ff4 im 8 8 0 0 ffffffff ffffffff ffffffff
PC: 466 Op:da SP: 40007ff4, 40007ff4 im (cont) 45a 45a 0 0 ffffffff ffffffff ffffffff
PC: 467 Op:b SP: 40007ff4, 40007ff4 nop 45a 0 0 ffffffff ffffffff ffffffff
PC: 468 Op:8c SP: 40007ff4, 40007ff0 im c c 45a 0 0 ffffffff ffffffff
PC: 469 Op:ab SP: 40007ff0, 40007ff0 im (cont) 62b 62b 45a 0 0 ffffffff ffffffff
PC: 46a Op:4 SP: 40007ff0, 40007ff4 poppc 45a 0 0 ffffffff ffffffff ffffffff
PC: 62b Op:fd SP: 40007ff4, 40007ff0 im 7d fffffffd 45a 0 0 ffffffff ffffffff
PC: 62c Op:3d SP: 40007ff0, 40007ff0 pushspadd 40007fe4 45a 0 0 ffffffff ffffffff
PC: 62d Op:d SP: 40007ff0, 40007fe4 popsp 0 0 0 40007fe4 45a 0
PC: 62e Op:91 SP: 40007fe4, 40007fe0 im 11 11 0 0 0 40007fe4 45a
PC: 62f Op:d8 SP: 40007fe0, 40007fe0 im (cont) 8d8 8d8 0 0 0 40007fe4 45a
PC: 630 Op:51 SP: 40007fe0, 40007fe4 storesp 1 8d8 0 0 40007fe4 45a 0
PC: 631 Op:fd SP: 40007fe4, 40007fe0 im 7d fffffffd 8d8 0 0 40007fe4 45a
PC: 632 Op:9f SP: 40007fe0, 40007fe0 im (cont) fffffe9f fffffe9f 8d8 0 0 40007fe4 45a
PC: 633 Op:3f SP: 40007fe0, 40007fdc emulate 1f 634 fffffe9f 8d8 0 0 40007fe4
Each line shows the current program counter, the opcode, the stack pointer before and after the instruction, the instruction itself, then the top six stack entries after the instruction has taken effect.
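If you want to post-process a trace, that layout is regular enough to pull apart with a short script. The following is a minimal Python sketch that assumes every trace line looks like the ones shown here (hex fields, exactly six stack entries at the end); the TraceLine class and parse_trace_line function are names invented for this example and are not part of the simulator.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Matches "PC: <hex> Op:<hex> SP: <hex>, <hex> <rest of line>".
TRACE_RE = re.compile(
    r"^PC:\s*([0-9a-f]+)\s+Op:\s*([0-9a-f]+)\s+"
    r"SP:\s*([0-9a-f]+),\s*([0-9a-f]+)\s+(.+)$"
)


@dataclass
class TraceLine:
    pc: int
    opcode: int
    sp_before: int
    sp_after: int
    instruction: str   # mnemonic plus any operand text, e.g. "im (cont) 465"
    stack: List[int]   # top six stack entries after the instruction


def parse_trace_line(line: str) -> Optional[TraceLine]:
    """Parse one trace line; return None for anything that is not a trace line."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None
    pc, opcode, sp_before, sp_after = (int(g, 16) for g in m.groups()[:4])
    tokens = m.group(5).split()
    if len(tokens) < 7:  # need at least a mnemonic plus six stack entries
        return None
    try:
        stack = [int(t, 16) for t in tokens[-6:]]
    except ValueError:
        return None
    return TraceLine(pc, opcode, sp_before, sp_after, " ".join(tokens[:-6]), stack)


line = "PC: 4 Op:e5 SP: 40007ff4, 40007ff4 im (cont) 465 465 0 0 ffffffff ffffffff ffffffff"
entry = parse_trace_line(line)
print(hex(entry.pc), hex(entry.opcode), entry.instruction)  # 0x4 0xe5 im (cont) 465
```

Lines that don’t fit the pattern, such as status messages mixed into the trace, simply come back as None, so they can be skipped.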
If we remove the -s parameter, we tell the simulator to run indefinitely, which results in the following output:
...
PC: 5c9 Op:d SP: 40007f98, 40007f94 popsp 40007f98 40007f94 5b0 48 0 0
PC: 5ca Op:73 SP: 40007f94, 40007f90 loadsp 3 48 40007f98 40007f94 5b0 48 0
PC: 5cb Op:52 SP: 40007f90, 40007f94 storesp 2 40007f98 48 5b0 48 0 0
PC: 5cc Op:c0 SP: 40007f94, 40007f90 im 40 ffffffc0 40007f98 48 5b0 48 0
Reading from UART
The puts() routine in the firmware reads from the UART to check that the TX is available; the simulator currently waits for user input every time the register is read, so we need to supply some data. The easiest way to do this is just to add “</dev/zero” to the command line:
> /path/to/zpusim -o30 Hello.bin </dev/zero
Now the simulator spews massive amounts of output to the terminal, which isn’t all that useful if we’re not trying to trace through a specific routine. It’s possible to separate the program trace and the program’s output, however; characters written to the UART are sent to stdout, while everything else goes to stderr – so if we add “2>/dev/null” to the command line, the output is reduced to something more manageable:
> /path/to/zpusim -o30 Hello.bin </dev/zero 2>/dev/null
Hello, world!
0 2 5 8 10 13 16 18 21 24
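Going the other way, the trace on stderr can be saved to a file instead of discarded, which gives the simulated program flow needed for the comparison against hardware that prompted all this. Here is a rough Python sketch of that comparison. It assumes the program counters captured from the real core have already been exported into a plain list of integers by some other means (the SignalTap side is not covered here), and the helper names pcs_from_trace and first_divergence are invented for the example.

```python
import re
from typing import Iterable, List, Optional, Tuple

# Picks the hex program counter off the front of a trace line.
PC_RE = re.compile(r"^PC:\s*([0-9a-f]+)\b")


def pcs_from_trace(lines: Iterable[str]) -> List[int]:
    """Collect the PC of every trace line, ignoring any other output."""
    pcs = []
    for line in lines:
        m = PC_RE.match(line.strip())
        if m:
            pcs.append(int(m.group(1), 16))
    return pcs


def first_divergence(
    simulated: List[int], captured: List[int]
) -> Optional[Tuple[int, int, int]]:
    """Return (index, simulated PC, captured PC) at the first mismatch.

    Only the overlapping part of the two sequences is compared; None means
    no mismatch was found within it.
    """
    for i, (s, c) in enumerate(zip(simulated, captured)):
        if s != c:
            return i, s, c
    return None


trace = [
    "PC: 3 Op:88 SP: 40007ff8, 40007ff4 im 8 8 0 0 ffffffff ffffffff ffffffff",
    "PC: 4 Op:e5 SP: 40007ff4, 40007ff4 im (cont) 465 465 0 0 ffffffff ffffffff ffffffff",
    "PC: 5 Op:4 SP: 40007ff4, 40007ff8 poppc 0 0 ffffffff ffffffff ffffffff ffffffff",
    "PC: 465 Op:88 SP: 40007ff8, 40007ff4 im 8 8 0 0 ffffffff ffffffff ffffffff",
]
hardware = [0x3, 0x4, 0x5, 0x466]  # made-up capture with a wrong jump target

print(first_divergence(pcs_from_trace(trace), hardware))  # (3, 1125, 1126)
```

Whether the hardware capture contains one PC per trace line depends on how the signals are sampled, so the two sequences may need aligning before a comparison like this means anything.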