The EightThirtyTwo ISA – Part 19 – 2020-03-25
One of my main goals when starting the EightThirtyTwo project was to minimise the amount of block RAM I needed to devote to firmware; indeed, this was even more of a goal than was raw speed.
So how does the architecture measure up to that goal?
The EightThirtyTwo ISA – Part 19 – 2020-03-08
In my experiments a few days ago I noticed an odd problem with the EightThirtyTwo toolchain – namely that the following construct worked just fine:
char string[]="Hello, world!";
but the following didn’t:
char *string="Hello, world!";
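To make the contrast concrete, here is a minimal sketch in standard C, assuming the first form is the array declaration `char string[]`. The names `array_form`, `pointer_form` and `forms_hold_same_text` are illustrative and not taken from the toolchain's test code. The array form stores the characters in the object itself, whereas the pointer form stores only an address, which the toolchain must resolve to wherever the literal ends up; that difference is a plausible place for a problem to hide, though it is not necessarily the cause in this case.

```c
#include <string.h>

/* Array form: the 13 characters plus the terminating NUL live in the
   object itself, so no address needs to be resolved at build time. */
char array_form[] = "Hello, world!";

/* Pointer form: the object holds only an address, which the assembler
   or linker has to fill in with the location of the string literal.
   (const is added here because writing to a literal is undefined.) */
const char *pointer_form = "Hello, world!";

/* Illustrative helper: both forms should read back the same text. */
int forms_hold_same_text(void)
{
    return strcmp(array_form, pointer_form) == 0;
}
```

On a working toolchain `sizeof array_form` is the length of the text plus one, while `sizeof pointer_form` is simply the size of a pointer.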
The EightThirtyTwo ISA – Part 18 – 2020-02-19
Having added partial results forwarding I figured I’d better check that I hadn’t broken dual-thread mode in the process. (Spoiler alert: I had!)
The EightThirtyTwo ISA – Part 17 – 2020-02-16
I was looking today at ways of improving the throughput of the EightThirtyTwo CPU. The design as it stands is very simple, and doesn’t make any attempt to perform result forwarding or instruction fusing. These are both strategies for improving the performance of certain constructs, and I wasn’t sure which of these two techniques I should use.
In brief, without either mechanism implemented, when the CPU encounters code such as:
it has to wait until the first instruction has finished writing to the tmp register before moving its new contents into the pipeline, and only then finally writing it to r0.
The EightThirtyTwo ISA – Part 16 – 2020-02-08
In my last post I touched briefly on the 832a assembler which I wrote as the first part of my solution to improving the code density of compiled C code.
An assembler that takes a single source file and spits out a ready-to-run binary file is not particularly difficult to write, but it’s not particularly useful either – in order to be useful we need to be able to link together multiple code modules.
Thus, 832a now has a cousin: 832l, the linker.
The EightThirtyTwo ISA – Part 15 – 2020-02-06
I’ve joked a few times in this series about being too lazy to write an assembler – but it would be more true to say that the stop-gap solution I was using was adequate, so my time was better spent on the more enjoyable aspects of the project. I am now feeling the limitations of using the GNU assembler to produce a bytestream for a target it knows nothing about, and to improve either the performance or code density of the vbcc backend’s output any further, I need to address the problem I’ve had so far with cross-module references…
The EightThirtyTwo ISA – Part 2 – 2019-08-10
In part 1 I talked about why I wanted to design my own microprocessor, and how I’d settled upon an architecture using eight-bit instruction words and eight addressable thirty-two bit registers.
Because our instruction words won’t have room to encode two registers in them, each instruction will only take one operand – so we need a way of supplying a second operand to instructions such as add, xor, load immediate, etc. For this I’ve defined a ninth register, which I’ll call temp. This register isn’t addressable via the instruction word – it’s simply used implicitly wherever it’s needed. One of the key decisions to make will be whether the result of arithmetic instructions will go to the nominated register or the temp register.
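The one-operand model can be sketched as a tiny register-file simulation in C. This is only an illustration under stated assumptions: the struct, the function names (`op_li`, `op_mr`, `op_add_to_reg`, `op_add_to_temp`) and their exact semantics are hypothetical, not the instruction set as finally defined. It shows both candidate destinations for an arithmetic result, since the choice is still open at this point.

```c
#include <stdint.h>

/* Eight addressable 32-bit registers plus the implicit temp register. */
typedef struct {
    uint32_t r[8];
    uint32_t temp;
} Cpu;

/* Hypothetical "load immediate": the value lands in temp, because the
   instruction word has no room to name a destination as well. */
void op_li(Cpu *c, uint32_t imm) { c->temp = imm; }

/* Hypothetical "move to register": copy temp into the nominated register. */
void op_mr(Cpu *c, int n) { c->r[n] = c->temp; }

/* Option A: the result goes to the nominated register. */
void op_add_to_reg(Cpu *c, int n) { c->r[n] += c->temp; }

/* Option B: the result goes to temp, leaving the register untouched. */
void op_add_to_temp(Cpu *c, int n) { c->temp += c->r[n]; }
```

With option A a running total can accumulate in a register across several adds; with option B the result is immediately available as the implicit operand of the next instruction. Which one yields denser code depends on the programs being compiled.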
It’s not entirely clear without giving the matter some thought (and ideally writing some test programs) exactly which instructions we need to implement, so initially I’ll list all the possible instructions that come to mind, and then figure out what’s mandatory, what’s optional, and what’s most beneficial in terms of keeping code size down, in the hope that we can hit our target of 25 instructions.
The EightThirtyTwo ISA – Part 1 – 2019-08-09
In 2019 there are any number of off-the-shelf CPU cores which can be used in FPGA projects, some imitating long-established CPUs such as x86, MC68000, Z80, MIPS, ARM, etc – and some more specifically targeted at the FPGA space, such as NIOS, Microblaze, ZPU, Moxie and suchlike. So why on earth would I consider creating a brand new one from scratch?
As always, because I wanted to learn something, and because even though there are so many existing options, there are still applications for which none of them is ideal.
My design goal is to create a CPU that’s reasonably small – not much bigger than ZPUFlex, so somewhere in the region of 1500 logic elements – while being somewhat faster and offering better code density. Ideally I want something that can supplant the ZPU for control module applications and allow me to reduce the amount of block RAM I have to devote to the code.