How to avoid writing an assembler

The EightThirtyTwo ISA – Part 4 – 2019-08-14

In order to experiment properly with the instruction set I need some way of assembling and running programs. Luckily I wrote a ZPU simulator some time ago, and this is easily re-purposed for the EightThirtyTwo instruction set. The simulator is part of the EightThirtyTwo project on GitHub.

In the longer term, to be useful this ISA will need a fully-functional toolchain, which means writing a backend for GCC – or perhaps LLVM, or maybe even lcc, vbcc, or some other C compiler. That’s going to be a rabbit hole in its own right, however, and for now we just want a quick and easy way of turning an assembly listing into a binary file of bytes ready for execution.

The gcc toolchain is perfectly capable of creating binary files of bytes, pretty much independently of the CPU (endian-issues aside) – we can simply create an assembly language source file full of .byte directives.

Even better, if we name our assembly file with the extension .S instead of .s, gcc runs it through the C Preprocessor first. This means we can simply create a header file containing a bunch of #defines for our instructions, like so:

#define cond .byte 0x00 +
#define mr .byte 0x08 +
#define sub .byte 0x10 +
#define cmp .byte 0x18 +
#define st .byte 0x20 +
#define stdec .byte 0x28 +
#define stx .byte 0x30 +
#define stbinc .byte 0x38 +
......

We then define our registers and condition codes:

#define r0 0
#define r1 1
#define r2 2
#define r3 3
#define r4 4
#define r5 5
#define r6 6
#define r7 7

#define NEX 0
#define SGT 1
#define EQ 2
#define GE 3
#define SLT 4
#define NEQ 5
#define LE 6
#define EX 7

With those definitions in place we can assemble an EightThirtyTwo program using a bog-standard gcc install – the target shouldn’t matter! Very basic programs look like this:

#include "assembler.pp"

start:
li 0x11
mr r2
li 0x7
add r2
cond NEX // Stop simulation

We can build this using the following commands:

gcc -c test.S
objcopy -Obinary test.o test.bin

Because we have to cascade multiple li instructions to load values that don’t fit within a single instruction word, we define some macros to do the necessary shifting and masking, which I’ve called IMWn(x), where n ranges from 5 to 0.

This is where we hit an interesting issue. Because we’re creating an object file and never linking it, any labels we define in our source file are never resolved to final addresses. As long as we’re dealing with absolute values there’s no problem, but if we attempt to shift the address of a branch target, we get an error message: “Error: invalid operands (.text and ABS sections)”. To resolve this, we simply subtract the address of our start label (which will be zero), and now gcc’s assembler’s happy because it’s dealing with an absolute number, not a yet-to-be-finalised relocatable symbol.

I made a couple more tweaks to the ISA, the most notable of which is to the cond instruction; the conditional execution mode will be cancelled by any instruction that writes to r7 (the Program Counter) – whether or not it’s executed. This means that we can often omit the cond EX instruction that was previously needed to return to unconditional execution. With that minor tweak in mind, we can finally look at some proper code – so here’s Hello World for the EightThirtyTwo ISA:

#include "assembler.pp"

start:
li IMW0(message-start)
mr r1
li IMW1(0xffffffc0) // UART register
li IMW0(0xffffffc0)
mr r0
_loop:
li IMW1(0x100) // UART TxReady flag
li IMW0(0x100)
mr r2
ld r0
and r2
cond EQ
li IMW0(PCREL(_loop))
add r7

ldbinc r1
cond NEQ
st r0
li IMW0(PCREL(_loop))
add r7

cond NEX // Terminate simulation

message:
.ascii "Hello, world!"
.byte 0

That comes to 19 bytes of code (one byte per instruction) and 14 bytes of data, for 33 bytes in total.
