Assembly language is all very well…

The EightThirtyTwo ISA – Part 5 – 2019-08-24

Before proceeding very much further with the EightThirtyTwo design I wanted to have some way of generating code for it from a higher-level language, just to get a better feel for which instructions are useful for compilers, which will be pretty much unused and where I might find any gaps that make the generated code unnecessarily clumsy. The most obvious route to this goal is to find a C compiler that can be easily retargeted to a new architecture. (Anyone who’s attempted this before is probably laughing now at my use of the word ‘easily’!)

My preconceptions when starting were that I had two options, GCC and LLVM (this turned out to be very far from the truth) – and that the latter would be the easier to work with. While in an ideal world I would have GCC support, I know from having worked with the ZPU toolchain that GCC is a rapidly moving target, and if an architecture’s support isn’t kept up to date it tends to suffer bitrot and become difficult to build as the ecosystem evolves. So for now I’ve discounted GCC.

I haven’t yet explored LLVM at all – but this may well end up being the long-term solution. For now I wanted something simpler, however. It turns out there are in fact a number of other C compilers out there – some that I’d never heard of, some that I know of old – and there may well be more that I haven’t yet found. I am genuinely surprised by just how many options there are!

  • PCC – the Portable C Compiler. (BSD licensed). This goes back to before I was born, and is still under active development now. It’s specifically designed to be retargetable to different architectures, but I haven’t yet figured out how to find valid --target parameters – for mips and m68k anything other than <arch>-netbsdelf seems to give an error. At this stage any friction makes me move on to examine the next option – but this is one I may well revisit.
  • TCC – the Tiny C Compiler. (GNU LGPL). A very small, lightweight and fast compiler – mainly aimed at x86 but does have other backends, and in theory should do what I need, but the backend interface looks like it has a steep learning curve.
  • SDCC – The Small Device C Compiler. Targeted at 8-bit CPUs and microcontrollers, but may be applicable here. The documentation seems to be incomplete though.
  • TACK – The Amsterdam Compiler Kit. Another compiler with backends for multiple architectures. This compiler was originally a commercial product, part of MINIX, subsequently open-sourced. The current maintainer writes that “it is very easy to port to a new architecture… but the generated code is terrible.” So perhaps not the best option for now. However, the reason for this assessment is that it doesn’t make good use of plentiful registers – which we don’t have – so it might still be worth exploring.
  • 8CC – an MIT-licensed one-man project to build a C11-compliant compiler. Currently only targets x86-64, but could no doubt be used as a starting point for targeting other architectures.
  • DICE – Dillon’s Integrated C Environment (Own license – may be used to produce commercial code, but the compiler itself, its source and derived works may not be sold.) This is a compiler I remember from my Amiga days, and used again when I became involved in the Minimig project. Targets m68k only, but could be used as a starting point for a new architecture if the code generator were replaced.
  • LCC – the Little C Compiler. (Unusual license, free for personal use). Again, written specifically with multiple architectures in mind, and the subject of Fraser and Hanson’s book A Retargetable C Compiler: Design and Implementation. The current version’s code generator is different from that described in the book, however – and I discounted this one after (a) reading comments on the web about the quality of its code generation, and (b) stumbling across David Given’s attempt to create a Z-machine backend – he says: “the lcc back end interface is pretty naff and is only partially documented, so I never finished anything.” He did in fact finish something, but for a different compiler, and on the strength of this, for my initial experiments I chose:
  • VBCC – Volker Barthelmann’s C Compiler. (Unusual license, free for personal use, permission required for commercial use. Source can be distributed but only in unmodified form.) In his blog post about creating the Z-machine backend, David says vbcc “…is really neat, by the way; it’s got excellent global optimisations, an easy-to-understand back end interface, is small and incredibly fast.” Sounds good to me!

I have used VBCC in the past, since it’s a widely used and popular compiler when targeting the Amiga – but I hadn’t previously explored its code-generation facilities. It’s incredibly easy to get started – no need to edit Makefiles or headers – simply copy an existing backend, type make TARGET=<newtarget> and you’ll have a new compiler. There’s even a “generic risc” backend that serves as an ideal starting point. The documentation is also good.
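
For anyone wanting to try the same thing, the copy-a-backend step might look roughly like the sketch below. It rests on assumptions rather than anything stated above: that backends live in machines/<name>/ under the vbcc source root, that the generic RISC backend is the directory named "generic", and that the make goal is bin/vbcc<target>. The helper name clone_backend and the target name 832 are made up for illustration, so check the layout of your own source tree before relying on any of it.

```shell
# Sketch only: clone an existing vbcc backend as the starting point for a new one.
# Assumption: backends live in machines/<name>/ under the vbcc source root.
clone_backend() {
    vbcc_root=$1   # path to the unpacked vbcc source tree
    template=$2    # existing backend to copy, e.g. "generic"
    newtarget=$3   # name for the new backend (hypothetical, e.g. "832")

    if [ ! -d "$vbcc_root/machines/$template" ]; then
        echo "no such backend: $template" >&2
        return 1
    fi
    if [ -e "$vbcc_root/machines/$newtarget" ]; then
        echo "backend already exists: $newtarget" >&2
        return 1
    fi
    cp -R "$vbcc_root/machines/$template" "$vbcc_root/machines/$newtarget"

    # Print the build command rather than running it; the exact make goal
    # (bin/vbcc<target>) is an assumption about the vbcc Makefile.
    echo "cd $vbcc_root && make TARGET=$newtarget bin/vbcc$newtarget"
}
```

Calling clone_backend ~/src/vbcc generic 832 would copy the template directory and print the make command to run next. It deliberately stops short of running the build, since the build may ask questions interactively.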

[I must also give an honourable mention to ArchC at this point – if time allows I will definitely explore this further. It appears to be a framework for ISA research, with tools that will automatically build simulators, binutils-based assemblers and clang-based compilers from CPU models. Very interesting stuff, well worth exploring, but too deep a rabbit-hole for the immediate present.]
