The EightThirtyTwo ISA – Part 9 – 2019-10-28
Having managed to get the CPU working well enough to run a Hello World program, I’ve since persuaded it to run the LZ4 decompression program I posted a week or two back, too – and was pleasantly surprised to find it running happily at 133MHz on my DE2 board.
I’ve been pondering how best to handle interrupts on this CPU, and realised that I’ve painted myself into a bit of a corner by using the tmp register as a link register when branching; this means that without adding some kind of temporary storage register or stack logic especially for the task, I can’t change the program flow externally – such as in response to an interrupt – without losing the contents of the tmp register.
The EightThirtyTwo ISA – Part 8 – 2019-10-20
Over the last few weeks I’ve been working on a VHDL implementation of the EightThirtyTwo ISA, using the excellent combination of GHDL and GtkWave to develop, debug, rip up, start over and ultimately produce something capable of sending “Hello, World!” to a UART.
The EightThirtyTwo ISA – Part 7 – 2019-09-26
Last time I said I would show some real-life code for this new ISA. I’ve already shown some very simple “Hello World” code, and I’m now in the slow, careful process of implementing a CPU to run this – and exploring the wonderful world of HDL simulation in the process! But in the meantime, by far the best way to get a feel for how well an instruction set works is to write some actual code for it, and see which issues and limitations you bump into.
The EightThirtyTwo ISA – Part 6 – 2019-09-15
In my baby-steps towards a working VBCC backend I now have something complete enough at least to send the text “Hello world!” to a UART. There’s a long way to go yet before it (a) works and (b) produces code that’s even remotely efficient, but it’s a start. In the process I’ve found and addressed some shortcomings in the ISA, and made a few tweaks to the instruction set and encoding.
The EightThirtyTwo ISA – Part 5 – 2019-08-24
Before proceeding very much further with the EightThirtyTwo design I wanted to have some way of generating code for it from a higher-level language, just to get a better feel for which instructions are useful for compilers, which will be pretty much unused and where I might find any gaps that make the generated code unnecessarily clumsy. The most obvious route to this goal is to find a C compiler that can be easily retargeted to a new architecture. (Anyone who’s attempted this before is probably laughing now at my use of the word ‘easily’!)
The EightThirtyTwo ISA – Part 4 – 2019-08-14
In order to experiment properly with the instruction set I need some way of assembling and running programs. Luckily I wrote a ZPU simulator some time ago, and this is easily re-purposed for the EightThirtyTwo instruction set. The simulator is part of the EightThirtyTwo project on GitHub.
In the longer term, to be useful this ISA will need a fully-functional toolchain, which means writing a backend for GCC – or perhaps LLVM, or maybe even lcc, vbcc, or some other C compiler. That’s going to be a rabbit hole in its own right, however, and for now we just want a quick and easy way of turning an assembly listing into a binary file of bytes ready for execution.
The EightThirtyTwo ISA – Part 3 – 2019-08-12
My initial plan for the EightThirtyTwo ISA was to define eight general purpose registers, and use the ninth “temp” register as an accumulator. Immediate values would be assigned here, as would the result of any arithmetic operations. The best way to test the concept is, of course, to write some actual code. So let’s see what simple looping looks like with this concept:
li 99 // (needs two li instructions since 99 doesn't fit within a signed 6-bit value)
// take one down, pass it around
cond NEQ // Disable execution if we've reached zero
mr r7 // if not, do another iteration
cond EX // return to normal execution
That’s not too bad – weighing in at 10 bytes – way better than MIPS, but 68K beats us easily thanks to its dbf instruction. Let’s try something more involved…
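The remark about needing two li instructions for 99 can be illustrated with a small sketch. It assumes, purely for illustration, that the first li sign-extends its 6-bit immediate and that each following li shifts the previous value left by six bits and ORs in its own immediate; the excerpt above doesn't spell out that mechanism, and the function names are hypothetical.

```python
def li_chunks(value):
    """Split a signed integer into 6-bit chunks, most significant first.

    Assumption: one li carries a signed 6-bit immediate (-32..31), and
    each extra li contributes six more low-order bits.
    """
    n = 1
    # Find the smallest chunk count whose combined signed range holds value.
    while not (-(1 << (6 * n - 1)) <= value < (1 << (6 * n - 1))):
        n += 1
    return [(value >> (6 * i)) & 0x3F for i in reversed(range(n))]


def apply_li(chunks):
    """Replay the chunks: sign-extend the first, then shift in the rest."""
    first = chunks[0]
    tmp = first - 64 if first & 0x20 else first
    for chunk in chunks[1:]:
        tmp = (tmp << 6) | chunk
    return tmp


if __name__ == "__main__":
    print(li_chunks(99))            # two chunks, so two li instructions
    print(apply_li(li_chunks(99)))  # reconstructs the original value
```

Under that assumption, anything from -32 to 31 costs a single byte, while 99 needs a second li, matching the comment in the listing.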
2019-08-10 – Part 2 – Choosing the Instruction Set
In part 1 I talked about why I wanted to design my own microprocessor, and how I’d settled upon an architecture using eight-bit instruction words and eight addressable thirty-two bit registers.
Because our instruction words won’t have room to encode two registers in them, each instruction will only take one operand – so we need a way of supplying a second operand to instructions such as add, xor, load immediate, etc. For this I’ve defined a ninth register, which I’ll call temp. This register isn’t addressable via the instruction word – it’s simply used implicitly wherever it’s needed. One of the key decisions to make will be whether the result of arithmetic instructions will go to the nominated register or the temp register.
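To make the space constraint concrete, here is a minimal sketch of how an eight-bit instruction word might be split. The field layout (upper five bits for the opcode, lower three bits for one of the eight registers) and the function names are assumptions for illustration only, not the actual EightThirtyTwo encoding.

```python
REG_BITS = 3                      # eight addressable registers need 3 bits
REG_MASK = (1 << REG_BITS) - 1    # 0b111


def encode(opcode, reg):
    """Pack a hypothetical 5-bit opcode and a 3-bit register into one byte."""
    if not 0 <= opcode < 32:
        raise ValueError("opcode must fit in 5 bits")
    if not 0 <= reg < 8:
        raise ValueError("register must be r0..r7")
    return (opcode << REG_BITS) | reg


def decode(byte):
    """Unpack one instruction byte into (opcode, register)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("instruction words are eight bits wide")
    return byte >> REG_BITS, byte & REG_MASK


if __name__ == "__main__":
    word = encode(5, 7)
    print(f"{word:08b}", decode(word))
```

With three bits taken by the register field there is no room left for a second register, which is why the other operand has to come from the implicit temp register.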
It’s not entirely clear without giving the matter some thought (and ideally writing some test programs) exactly which instructions we need to implement, so initially I’ll list all the possible instructions that come to mind, and then figure out what’s mandatory, what’s optional, and what’s most beneficial in terms of keeping code size down, in the hope that we can hit our target of 25 instructions.
2019-08-09 – Part 1 – The Pipe Dream
In 2019 there are any number of off-the-shelf CPU cores which can be used in FPGA projects, some imitating long-established CPUs such as x86, MC68000, Z80, MIPS, ARM, etc – and some more specifically targeted at the FPGA space, such as NIOS, Microblaze, ZPU, Moxie and suchlike. So why on earth would I consider creating a brand new one from scratch?
As always, because I wanted to learn something, and because even though there are so many existing options, there are still applications for which none of them is ideal.
My design goal is to create a CPU that’s reasonably small – not much bigger than ZPUFlex, so somewhere in the region of 1500 logic elements – while being somewhat faster and offering better code density. Ideally I want something that can supplant the ZPU for control module applications and allow me to reduce the amount of block RAM I have to devote to the code.
A few years ago I ported three of the PACE arcade cores – namely Pacman, Pengo and Moon Patrol – to the Turbo Chameleon cartridge. Since then the original pacedev.net site seems to have gone down, but the source repo is now on GitHub. Over the last few days, in odd moments, I’ve cloned that repo and added support for the new Chameleon V2 hardware – and now have the same three arcade cores working on the new hardware.
The cores can be downloaded here. Please note, these are straight ports of the cores to V2 hardware – do not attempt to flash them to an original Chameleon.
All three cores can be played using a PS/2 keyboard or a CDTV pad.
Coins can be added using the leftmost button on the Chameleon itself, the Run/Stop key on the C64, the 5 key on a PS/2 keyboard, or the Power button on a CDTV pad.
The Start button is mapped to the middle button on the Chameleon, the cursor up/down key on the C64, the 1 key on a PS/2 keyboard, or the Play button on a CDTV pad.
All three games can be played with a joystick, too – however Moon Patrol requires a second button.