The EightThirtyTwo ISA – Part 9 – 2019-10-28
Having managed to get the CPU working well enough to run a Hello World program, I’ve since persuaded it to run the LZ4 decompression program I posted a week or two back, too – and was pleasantly surprised to find it running happily at 133MHz on my DE2 board.
I’ve been pondering how best to handle interrupts on this CPU, and realised that I’ve painted myself into a bit of a corner by using the tmp register as a link register when branching. Without adding some kind of temporary storage register or stack logic especially for the task, I can’t change the program flow externally – such as in response to an interrupt – without losing the contents of the tmp register.
Even if I add such a storage register or stack operation, I also have no way to undo that process at the end of an interrupt handler without adding an extra instruction. I have a little encoding space left for zero-operand instructions, but I’d like to avoid using it if I can, since it will increase the logic size somewhat. I’d also prefer not to add any kind of stack operation because (a) it will enforce the use of a particular register as the stack pointer, which is currently completely optional – and (b) I’ll still lose the contents of tmp when restoring the program counter unless I add some kind of “rts” instruction.
The solution I’ve chosen for now trades potential response time for simplicity and lack of extra logic for interrupt handling:
The problem to be solved is that the tmp register is trashed by servicing an interrupt, so I decided to sidestep it entirely by only allowing interrupts when the next instruction is about to write to tmp. In practice this means that the pipeline will intercept the first of a run of li instructions, or an mt instruction. In principle ld, ldinc or ldbinc could also be intercepted but I haven’t yet tried that. On return from the interrupt routine, the intercepted instruction will run, writing a new value to tmp, so the loss of its old contents won’t matter.
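As a rough illustration of the rule above, the following Python sketch marks which positions in an instruction stream would count as safe interrupt points. It is a toy model rather than the CPU's actual pipeline logic: the function name, the list-of-mnemonics program representation and the extra_safe parameter are all assumptions made for the example. It treats only the first li of a run as safe, presumably because later li instructions in a run build on the value already in tmp, and it ignores details such as conditional execution.

```python
def interruptible_points(program, extra_safe=()):
    """Return indices of instructions at which an interrupt could be taken.

    program    -- list of mnemonics, e.g. ["li", "li", "add"] (hypothetical format)
    extra_safe -- further mnemonics to treat as safe, e.g. ("ld", "ldinc", "ldbinc")
    """
    safe = {"mt"} | set(extra_safe)   # mt always overwrites tmp
    points = []
    previous = None
    for index, mnemonic in enumerate(program):
        if mnemonic == "li":
            # Only the first li of a run is treated as a safe point.
            if previous != "li":
                points.append(index)
        elif mnemonic in safe:
            points.append(index)
        previous = mnemonic
    return points


demo = ["li", "li", "add", "mt", "exg", "li"]
print(interruptible_points(demo))  # [0, 3, 5]
```

Passing extra_safe=("ld", "ldinc", "ldbinc") models the as-yet-untried extension to the load instructions.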
Intercepted instructions are replaced with an operation that places the program counter (with CPU status bits encoded into the upper four bits) on both inputs of the ALU and sets the ALU operation to xor. The first output writes the resulting zero to r7, and the second output writes the passed-through value to tmp.
Thus, on an interrupt we jump to location zero, with the zero flag set. This minimises the extra logic required to support interrupts, and requires no extra value-saving registers, stack logic, or even an interrupt vector; the initial entry point and interrupt vector are shared as location zero, with the zero flag set on interrupt, and clear on powerup.
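The effect of the substituted operation can be modelled in a few lines of Python. This is only a sketch under stated assumptions: registers are taken to be 32 bits wide with the program counter occupying the lower 28 bits, and the function name, the dictionary of registers and the "zero" key are invented for the illustration.

```python
MASK32 = 0xFFFFFFFF  # assumed register width of 32 bits


def enter_interrupt(regs, pc, status):
    """Model the operation substituted for an intercepted instruction."""
    # Program counter with the four status bits packed into the top nibble.
    value = ((status & 0xF) << 28) | (pc & 0x0FFFFFFF)
    # Both ALU inputs carry the same value, so xor always yields zero.
    result = (value ^ value) & MASK32
    regs["r7"] = result            # first output: jump to location zero
    regs["tmp"] = value            # second output: return address and status
    regs["zero"] = (result == 0)   # zero flag marks this as an interrupt
    return regs


regs = enter_interrupt({"r7": 0x120, "tmp": 0xDEAD, "zero": False},
                       pc=0x120, status=0b0101)
print(hex(regs["r7"]), hex(regs["tmp"]), regs["zero"])  # 0x0 0x50000120 True
```

Because the xor result is zero whatever the program counter holds, the same operation both performs the jump to location zero and sets the flag that lets the code there tell an interrupt from a power-on.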
Some rather delicate register dancing is required when saving and restoring the return address, however: because on return we need to execute the instruction we intercepted to trigger the interrupt, we must decrement the return address by one before jumping to it. The startup and interrupt handling code looks like this:
vector: // Startup code and interrupt vector. On interrupt the zero flag will be set.
    cond NEQ
        li IMW1(PCREL(entry-1))
        li IMW0(PCREL(entry))
        add r7
interrupt: // We fall through to here if this is an interrupt rather than a power-on event.
    exg r6 // Swap the stack pointer with the return address
    stmpdec r0 // Save r0 to the stack
    stmpdec r6 // Save the return address to the stack
    stmpdec r1 // Save any other scratch registers to the stack
    mr r6 // Return the stack pointer to r6.

    // Service the interrupt here...

    ldinc r6 // Restore r1 and any other scratch registers - but save one...
    mr r1
    ldinc r6 // Move the return address to temp
    mr r0 // and thence to r0
    li IMW0(-1)
    add r0 // Decrement return address
    ldinc r6 // Move saved value from r0 to tmp.
    exg r0 // Swap tmp and r0, restoring r0 and placing the adjust return address in tmp
    mr r7 // Jump to adjusted return address.
entry:
    // Main program entry...
If this looks convoluted or verbose… well… perhaps it is – but consider that it’s only 18 bytes!
I have an equivalent to my earlier ZPU Interrupt demo running, which simply alternates between printing the words “tick” and “tock” in response to a timer interrupt. The only slight disappointment is that my interception logic is part of the critical path, and thus brings the fmax of the demo program down to about 129MHz on the DE2.