Nailing Down the Instruction Set

The EightThirtyTwo ISA – Part 3 – 2019-08-12

My initial plan for the EightThirtyTwo ISA was to define eight general-purpose registers and use a ninth “temp” register as an accumulator. Immediate values would be loaded into it, and it would also receive the result of any arithmetic operation. The best way to test the concept is, of course, to write some actual code. So let’s see what a simple loop looks like with this design:

  li 99 // (needs two li instructions since 99 doesn't fit within a signed 6-bit value)
  mr r0
.bottles_of_beer:
  // take one down, pass it around
  li 1
  sub r0
  mr r0
  cond NEQ // Disable execution if we've reached zero
  li .bottles_of_beer
  mr r7 // if not, do another iteration
  cond EX // return to normal execution

That’s not too bad – weighing in at 10 bytes – way better than MIPS, but 68K beats us easily thanks to its dbf instruction. Let’s try something more involved…
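
To sanity-check the “needs two li instructions” remark, here is a minimal Python sketch of how a constant might be split across chained li instructions. It assumes that the first li sign-extends its 6-bit immediate and that each following li shifts temp left by six bits and ORs in the next six bits; that chaining rule is my working assumption here, and the helper names (to_signed32, li_chunks, run_li) exist purely for illustration.

```python
def to_signed32(value):
    """Interpret value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def li_chunks(value):
    """Split a 32-bit constant into the fewest 6-bit immediates (assumed scheme)."""
    v = to_signed32(value)
    n = 1
    # Grow the chain until v fits in a signed field of 6*n bits.
    while not (-(1 << (6 * n - 1)) <= v < (1 << (6 * n - 1))):
        n += 1
    # Most significant chunk first, six bits at a time.
    return [(v >> (6 * i)) & 0x3F for i in reversed(range(n))]


def run_li(chunks):
    """Replay a chain of li instructions and return the resulting temp value."""
    # First li: sign-extend the 6-bit immediate.
    temp = chunks[0] - 64 if chunks[0] & 0x20 else chunks[0]
    # Each further li: shift left by six bits and OR in the new immediate.
    for chunk in chunks[1:]:
        temp = (temp << 6) | chunk
    return temp & 0xFFFFFFFF


for constant in (1, 99):
    chunks = li_chunks(constant)
    print(hex(constant), "->", len(chunks), "li instruction(s)", chunks)
```

Under these assumptions 1 fits in a single li, while 99 splits into the chunks 1 and 35, which matches the two-instruction cost noted in the comment.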

The first thing I’m going to want to do with this ISA, if I succeed in turning it into an actual CPU core, is to build a SOC similar to the one used in the ZPU Demos project. This SOC defines a UART at address 0xffffffc0. This register has ‘tx ready’ and ‘rx ready’ flags as well as the actual data. Writing to this register looks something like this:

  li 0xffffffc0 // UART register (needs two li instructions)
  mr r0
  li msg
  mr r1
.loop:
  ld r0 // read UART status
  mr r2
  li 0x100 // tx ready? (needs two li instructions)
  and r2
  cond EQ
  li .loop
  mr r7
  cond EX

  ld r1 // For simplicity, ignore byte extraction for the moment.
  cond NEQ
  st r0
  li 1
  add r1
  mr r1
  li .loop
  mr r7
  cond EX
.done: // 23 instructions / bytes

We ignored byte extraction in the above code snippet, but that’s clearly something that we need. It would look something like this:

  ld r0 // read 32 bits, assuming that the SOC will ignore bits 0 and 1 of the address.
  mr r1 // 32-bit word
  li 3
  and r0 // mask off all but lowest two bits
  mr r0
  li 3
  exg r0
  sub r0 // subtract from 3: the number of byte positions we need to shift by
  mr r0
  li 3
  shl r0 // multiply by 8
  lsr r1 // shift word by that many bits
  li 255
  and r1 // 14 instructions

That’s quite a lot of code, and since byte manipulation is a common operation, this suggests that adding byte load and store instructions would be worthwhile if I can possibly find room for them. Because bytes are rarely accessed in isolation, I’ll implement both the byte load and the byte store with post-increment, which covers the most common use case.
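
For comparison, the calculation that the 14-instruction sequence is aiming for fits in a few lines of Python. This is only a sketch: it assumes big-endian byte order (which is what the “subtract from 3” step implies), and extract_byte is a hypothetical helper name, not something from the ISA.

```python
def extract_byte(word, address):
    """Pick one byte out of a 32-bit word, assuming big-endian byte order."""
    lane = address & 3        # lowest two address bits select the byte
    shift = (3 - lane) << 3   # bytes to skip, multiplied by 8 to get bits
    return (word >> shift) & 0xFF


word = 0x11223344
for address in range(4):
    print(address, hex(extract_byte(word, address)))
```

A dedicated byte load would fold the mask, subtract, shift and final AND into a single instruction.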

What I’m also noticing is that every time I use an arithmetic or logic instruction I’m immediately writing the result to a register, because I can’t build an immediate value with ‘li’ without overwriting what’s in the accumulator. The only way to avoid this is to pre-generate immediates and write them to registers in advance. In this and some other experimental code snippets, code density and ease of programming seem to be slightly improved if the result goes directly to the nominated register instead of to the temp register. (There is one important exception to this, which I will cover in a later part.)

For this reason, I have decided that the result of arithmetic and logic instructions will now go to the register nominated in the instruction, rather than to the temp register.
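
To make the difference concrete, here is a toy Python model of the decrement step from the bottles-of-beer loop under both conventions. It is a sketch under stated assumptions: the operand order (register minus temp) is inferred from the snippet, li is treated as a single step regardless of the constant’s size, and the step and run helpers are purely illustrative.

```python
def step(state, op, arg, result_to_register):
    """Execute one instruction of a tiny subset (li, mr, add, sub)."""
    regs = state["regs"]
    if op == "li":
        state["temp"] = arg & 0xFFFFFFFF      # simplified: one step per constant
    elif op == "mr":
        regs[arg] = state["temp"]             # move temp to register
    elif op in ("add", "sub"):
        if op == "add":
            result = regs[arg] + state["temp"]
        else:
            result = regs[arg] - state["temp"]  # assumed order: register - temp
        result &= 0xFFFFFFFF
        if result_to_register:
            regs[arg] = result                # revised convention
        else:
            state["temp"] = result            # original accumulator convention
    else:
        raise ValueError("unsupported instruction: " + op)


def run(program, result_to_register, r0=99):
    """Run a program with r0 preloaded and return the final machine state."""
    state = {"temp": 0, "regs": [0] * 8}
    state["regs"][0] = r0
    for op, arg in program:
        step(state, op, arg, result_to_register)
    return state


# Decrement r0 by one under each convention.
accumulator_style = [("li", 1), ("sub", 0), ("mr", 0)]
register_style = [("li", 1), ("sub", 0)]

print(run(accumulator_style, False)["regs"][0], len(accumulator_style))
print(run(register_style, True)["regs"][0], len(register_style))
```

Both versions leave 98 in r0, but the second saves the trailing mr and leaves the constant 1 in temp, ready for reuse.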

The semi-final instruction set, which I think is more or less settled now (unless I hit major implementation problems!), looks like this:

  • cond – enable conditional execution; a 3-bit operand specifies the condition, one of EQual, NotEQual, StrictlyLessThan, LessThanOrEqual, StrictlyGreaterThan, GreaterThanOrEqual, EX(ecute) or N(o)EX(ecute)
  • mt – move register to temp
  • mr – move temp to register
  • exg – exchange register with temp
  • ldi – load indexed (nominated register + r5) [may be removed – I haven’t tried using it yet]
  • st – store to memory pointed to by rn
  • ld – load from memory pointed to by rn
  • sti – store indexed (nominated register + r5) [may be removed – I haven’t tried using it yet]
  • add
  • cmp
  • ldinc – load and post-increment (for stack)
  • sub
  • stdec – store and post-decrement (for stack)
  • and
  • or
  • xor
  • addt – add and transfer the register’s old contents to temp. (Used for PC-relative branches, but could also have uses in indexing.)
  • shl – shift left
  • asr – arithmetic shift right
  • lsr – logical shift right
  • ror – rotate right
  • rorc – rotate right through carry [may be removed – I haven’t tried using it yet]
  • stbinc – store byte with post-increment
  • ldbinc – load byte with post-increment
  • li – load immediate
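
As an illustration of why the byte instructions carry a post-increment, here is a small Python sketch of how ldbinc and stbinc might behave when copying a zero-terminated string. The semantics shown (increment the nominated register by one after each access, stbinc stores the low byte of temp) are my assumptions from the descriptions in the list, and the function names simply mirror the mnemonics.

```python
def ldbinc(memory, regs, rn):
    """Load the byte at regs[rn], then post-increment regs[rn] (assumed semantics)."""
    value = memory[regs[rn]]
    regs[rn] = (regs[rn] + 1) & 0xFFFFFFFF
    return value


def stbinc(memory, regs, rn, temp):
    """Store the low byte of temp at regs[rn], then post-increment regs[rn]."""
    memory[regs[rn]] = temp & 0xFF
    regs[rn] = (regs[rn] + 1) & 0xFFFFFFFF


def copy_string(memory, src, dst):
    """Copy bytes from src to dst until a zero byte is read (terminator not copied)."""
    regs = [0] * 8
    regs[0], regs[1] = src, dst
    while True:
        temp = ldbinc(memory, regs, 0)
        if temp == 0:
            break
        stbinc(memory, regs, 1, temp)
    return regs


memory = bytearray(32)
memory[0:6] = b"hello\x00"
regs = copy_string(memory, 0, 16)
print(bytes(memory[16:21]), regs[0], regs[1])
```

Each pass through the loop needs only one load and one store, with no separate add to advance either pointer.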
