Since my last post I’ve been continuing to work on the ZPU small core, adding hardware implementations of the optional instructions while trying to keep the core as small as possible.
Certain instructions are closely related enough that once one is implemented the others come almost for free. Sub, Eq, Neq, Lessthan and Lessthanorequal come under that category.
The simplest way to compare two values is to subtract one from the other, and see whether the result is positive, negative or zero, so all five instructions need the result of a subtraction: (sp+1)-(sp).
Eq and Neq could be implemented as a direct comparison, but if the result of the subtraction is available anyway, we can compare that result against zero, which is theoretically more efficient. (There’s a good chance that even if we did a bitwise comparison, the synthesis tool would spot that the subtraction result’s available and use it anyway – the tools are very smart these days!)
Thus we perform the subtraction on every rising clock edge, like so:
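    if COMPARISON_SUB=true then
        comparison_sub_result <= unsigned(memBRead(wordSize-1)&memBRead)
                               - unsigned(memARead(wordSize-1)&memARead);
    end if;
We make comparison_sub_result 33 bits wide rather than 32 because the difference of two signed 32-bit values doesn't always fit in 32 bits. Sign-extending both operands by one bit before subtracting means bit 32 of the result is always the true sign of the difference, which is what the signed comparisons need. (Unsigned comparisons would need the 33rd bit too, but with the operands zero-extended rather than sign-extended.)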
Since we also need to know whether this result is zero to implement four of the instructions, we'll make a comparison_eq signal and assign it like so. This is done outside a clock edge, so it's combinatorial:
    if COMPARISON_SUB=true and comparison_sub_result='0'&X"00000000" then
        comparison_eq<='1';
    else
        comparison_eq<='0';
    end if;
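To see the two pieces in isolation, here's a minimal stand-alone sketch of the shared subtractor and its zero flag. It isn't lifted from the core: the entity name compare_sub and the ports op_sp, op_sp1, diff and is_zero are made up for illustration, and it uses numeric_std's resize on signed values for the sign extension instead of the concatenation shown above, which should give the same 33-bit result.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical stand-alone version of the shared subtractor.
-- op_sp and op_sp1 stand in for memARead and memBRead in the core.
entity compare_sub is
  generic (wordSize : natural := 32);
  port (
    clk     : in  std_logic;
    op_sp   : in  std_logic_vector(wordSize-1 downto 0);  -- value at (sp)
    op_sp1  : in  std_logic_vector(wordSize-1 downto 0);  -- value at (sp+1)
    diff    : out signed(wordSize downto 0);              -- (sp+1)-(sp), one bit wider
    is_zero : out std_logic
  );
end entity;

architecture rtl of compare_sub is
  signal diff_r : signed(wordSize downto 0) := (others => '0');
begin
  -- Registered subtraction: both operands are sign-extended by one bit,
  -- so bit wordSize of the result is always the true sign of the difference.
  process(clk)
  begin
    if rising_edge(clk) then
      diff_r <= resize(signed(op_sp1), wordSize+1)
              - resize(signed(op_sp), wordSize+1);
    end if;
  end process;

  -- Combinational zero detect on the registered result.
  diff    <= diff_r;
  is_zero <= '1' when diff_r = 0 else '0';
end architecture;
```

Because the subtraction is registered and the zero detect is not, is_zero becomes valid in the same cycle as diff, one clock after the operands are presented.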
The implementation of the instructions is surprisingly straightforward. Sub is the simplest:
    when State_Sub =>
        memAAddr <= sp;
        memAWriteEnable <= '1';
        memAWrite <= comparison_sub_result(wordSize-1 downto 0);
        state <= State_Fetch;
Eq and Neq are also fairly straightforward. The two instructions are complementary: Eq pushes 1 onto the stack if the two operands are equal, and 0 otherwise, while Neq reverses this, pushing 0 if the operands are equal and 1 otherwise. We could implement the two instructions separately, but it's also possible to implement them together, like so:
    when State_EqNeq =>
        memAAddr <= sp;
        memAWriteEnable <= '1';
        memAWrite <= (others =>'0');
        memAWrite(0) <= comparison_eq xor opcode(4);
        state <= State_Fetch;
The opcode for Eq is 46 ("101110"), and the opcode for Neq is 48 ("110000"), so to reverse the sense of the test for "Neq" we can use a simple exclusive-or against opcode bit 4.
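The exclusive-or trick is easy to check by hand, but a tiny simulation-only sketch makes the four cases explicit. The entity name eqneq_check and the constants OP_EQ and OP_NEQ are hypothetical; the opcode values are simply the ones quoted above.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

entity eqneq_check is
end entity;

architecture sim of eqneq_check is
  -- Opcode values as quoted in the text (assumed, not read from the core)
  constant OP_EQ  : std_logic_vector(5 downto 0) := "101110";  -- 46
  constant OP_NEQ : std_logic_vector(5 downto 0) := "110000";  -- 48
begin
  process
    variable eq : std_logic;  -- plays the role of comparison_eq
  begin
    -- Operands equal: Eq should push 1, Neq should push 0
    eq := '1';
    assert (eq xor OP_EQ(4))  = '1' report "Eq, operands equal"  severity failure;
    assert (eq xor OP_NEQ(4)) = '0' report "Neq, operands equal" severity failure;

    -- Operands differ: Eq should push 0, Neq should push 1
    eq := '0';
    assert (eq xor OP_EQ(4))  = '0' report "Eq, operands differ"  severity failure;
    assert (eq xor OP_NEQ(4)) = '1' report "Neq, operands differ" severity failure;

    report "Eq/Neq xor check passed";
    wait;
  end process;
end architecture;
```

Bit 4 is 0 for Eq and 1 for Neq, so the xor passes comparison_eq through unchanged for Eq and inverts it for Neq.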
Lessthan and Lessthanorequal are a bit trickier, because they perform signed comparison.
Unsigned comparison is simple - we just have to subtract one operand from the other and check whether the subtraction underflowed, which shows up as a borrow out of the top bit. For signed comparison we can do much the same, but we need one extra bit's headroom in the subtraction result so that its highest bit is always the true sign of the difference. Taking op1 as the value at (sp) and op2 as the value at (sp+1), that highest bit is zero if op1 is less than or equal to op2, so we need to invert it. We also need to take care of the difference between lessthan and lessthanorequal: for lessthan (opcode bit 0 clear) the result must also be forced to 0 when the operands are equal. The code for signed comparison ends up looking like this:
    when State_Comparison =>
        memAAddr <= sp;
        memAWriteEnable <= '1';
        memAWrite <= (others => '0');
        memAWrite(0) <= not (comparison_sub_result(wordSize) xor (not opCode(0) and comparison_eq));
        state <= State_Fetch;
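As a sanity check on that expression, here's a small self-checking simulation sketch that compares it against numeric_std's own signed "<" and "<=" for a handful of corner cases. The entity comparison_check and the helper cmp_bit are hypothetical names, not part of the core. It assumes that the value at (sp) is the left-hand operand of the comparison, and that opcode bit 0 is '0' for lessthan and '1' for lessthanorequal, as the expression above implies.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity comparison_check is
end entity;

architecture sim of comparison_check is
  constant wordSize : natural := 32;

  -- Hypothetical helper mirroring the State_Comparison expression.
  -- top = value at (sp), nxt = value at (sp+1), op0 = opcode bit 0.
  function cmp_bit(top, nxt : signed(wordSize-1 downto 0); op0 : std_logic)
    return std_logic is
    variable diff : signed(wordSize downto 0);
    variable eq   : std_logic;
  begin
    -- 33-bit subtraction of sign-extended operands: (sp+1)-(sp)
    diff := resize(nxt, wordSize+1) - resize(top, wordSize+1);
    if diff = 0 then
      eq := '1';
    else
      eq := '0';
    end if;
    return not (diff(wordSize) xor (not op0 and eq));
  end function;
begin
  process
    type int_array is array (natural range <>) of integer;
    -- Corner cases near both ends of the signed 32-bit range
    constant samples : int_array := (-2147483647, -1, 0, 1, 2147483647);
    variable a, b : signed(wordSize-1 downto 0);
  begin
    for i in samples'range loop
      for j in samples'range loop
        a := to_signed(samples(i), wordSize);
        b := to_signed(samples(j), wordSize);
        -- op0 = '0' selects lessthan, op0 = '1' selects lessthanorequal
        assert (cmp_bit(a, b, '0') = '1') = (a < b)
          report "lessthan mismatch" severity failure;
        assert (cmp_bit(a, b, '1') = '1') = (a <= b)
          report "lessthanorequal mismatch" severity failure;
      end loop;
    end loop;
    report "signed comparison check passed";
    wait;
  end process;
end architecture;
```

The pairs that mix the most negative and most positive samples are the ones that would go wrong with only a 32-bit result, since their difference overflows 32 bits.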
So how does the core perform with these changes?
My current test case, which writes to the framebuffer in halfwords, performs like this:
- Emulated sub, eq, lessthan, etc., and emulated eqbranch/neqbranch: 0.58 fps (579 logic elements)
- Hardware sub, eq, lessthan, etc., and hardware eqbranch/neqbranch: 1.55 fps (745 logic elements)
Full source for the project can be found on github for anyone that's interested.