In my last post I mentioned that I had to employ some ugly hacks in the boot firmware for my ZPU project, to make sure certain structures ended up in SDRAM rather than the initial Boot ROM.
To illustrate the problem let’s look at a minimal test program:
short inconvenience;

int main(int argc,char **argv)
{
    inconvenience=0x0123;
    return(0);
}
This little program declares a 16-bit global variable, and then writes to it. The assembly output produced by
zpu-elf-gcc -Os -S bsstest.c
is as follows:
    .file "bsstest.c"
.text
    .globl main
    .type main, @function
main:
    im 291
    nop
    im inconvenience
    storeh
    im 0
    nop
    im _memreg+0
    store
    poppc
    .size main, .-main
    .comm inconvenience,2,4
    .ident "GCC: (GNU) 3.4.2"
Note the storeh instruction halfway down: that’s the source of my problem. I’ve implemented storeh in hardware for SDRAM, but not for the BlockRAM that holds the boot code, and I’d really like to avoid doing so, because a 16-bit write to a 32-bit-wide RAM is going to be messy and eat up logic elements. The boot code is also rather on the large side, so it would be nice to avoid storing uninitialised data in there at all.
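To give a feel for why a 16-bit write into a word-wide memory is awkward, here is a rough software model of the read-modify-write it implies. This is only a sketch: the function name store_half is made up for illustration, it assumes big-endian byte order (as on the ZPU) and a halfword-aligned address, and it is not how the hardware in this project is actually written.

```c
#include <stdint.h>

/* Hypothetical model of a 16-bit store into a memory that can only be
   read and written as whole 32-bit words. Assumes big-endian byte order
   and a halfword-aligned byte address. */
void store_half(uint32_t *mem, uint32_t byte_addr, uint16_t value)
{
    uint32_t word  = mem[byte_addr >> 2];          /* read the whole word      */
    unsigned shift = (byte_addr & 2u) ? 0u : 16u;  /* upper half comes first   */

    word &= ~((uint32_t)0xFFFFu << shift);         /* clear the target half    */
    word |= (uint32_t)value << shift;              /* merge in the new value   */
    mem[byte_addr >> 2] = word;                    /* write the word back      */
}
```

A plain 32-bit store needs none of the masking or shifting, which is why making the variable a full word sidesteps the issue.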
The quick solution is simply to make the variable an int (32 bits) instead of a short (16 bits), but that doesn’t help if what’s declared is a structure with a critical layout, as is the case for the FAT filesystem structures that the boot code must deal with. The workaround I used in my last post was to declare a pointer to a struct rather than the struct itself, and to initialise that pointer to some arbitrary location in SDRAM. That works just fine when there are just one or two such structs, but it’s ugly, and the chances of bugs creeping in as structs collide increase with the program’s complexity, so a better solution is needed.
The solution to this problem is to employ a linker script, which we supply to gcc with the -T option.
Before diving into the linker script, let’s take a quick look at my ZPUTest project’s address decoding:
case mem_addr(31 downto 16) is
    when X"0000" => -- Boot BlockRAM
        ...
    when X"FFFE" => -- VGA controller
        ...
    when X"FFFF" => -- Peripherals
        ...
    when others => -- SDRAM access
        ...
So the memory map looks like this:
- 0x0000 – 0xFFFF – Boot code in Block RAM, aliased to fill the 64k region.
- 0xFFFE0000 – 0xFFFEFFFF – VGA controller registers
- 0xFFFF0000 – 0xFFFFFFFF – Peripheral registers
- All other addresses – SDRAM (Actual range depends on the target board’s SDRAM chips).
- (The stack is also memory-mapped within my variant of the ZPU core itself, to 0x40000000)
So our goal with the linker script is to ensure anything we want to end up in SDRAM has an address of 0x10000 or above (and below the register regions that start at 0xFFFE0000).
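The decoding above can be mirrored in a few lines of C, which is handy for checking which region a given address falls into. This is just an illustrative sketch: the enum and function names are invented here, and the stack mapping at 0x40000000 is left out because it is handled inside the core rather than by this decoder.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum region { REGION_BOOT, REGION_VGA, REGION_PERIPHERALS, REGION_SDRAM };

/* Mirrors the VHDL case statement: decode on address bits 31..16. */
enum region decode_region(uint32_t addr)
{
    switch (addr >> 16) {
    case 0x0000: return REGION_BOOT;         /* Boot BlockRAM        */
    case 0xFFFE: return REGION_VGA;          /* VGA controller       */
    case 0xFFFF: return REGION_PERIPHERALS;  /* Peripheral registers */
    default:     return REGION_SDRAM;        /* everything else      */
    }
}
```

For example, decode_region(0x0000FFFF) is still the Boot BlockRAM, while decode_region(0x00010000) is the first address that reaches SDRAM.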
We’ll start the linker script by defining some memory regions:
MEMORY
{
    BOOT (rx)  : ORIGIN = 0x00000000, LENGTH = 0x00002000 /* 8192 bytes */
    SDRAM (rw) : ORIGIN = 0x007ff000, LENGTH = 0x00001000 /* 4k at top end of DE1's SDRAM */
    STACK (rw) : ORIGIN = 0x40000000, LENGTH = 0x00000400 /* 1024 bytes */
}
The BOOT region is the BlockRAM at the beginning of the memory map. At the time of writing it’s 8KB (16 M4Ks) in size, but I’m hoping to get that down to 4KB in time. One nice effect of defining memory regions like this is that we get an error at link time if the code doesn’t fit.
The SDRAM section is set here to point to a small region at the end of the DE1 board’s 8 megabytes of SDRAM. It doesn’t really matter where this falls, but since the whole point of this project was a minimal core for loading files into RAM from SD Card, it needs to be somewhere that won’t be overwritten when we load files!
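As a quick sanity check on those numbers, the region bounds can be compared against the address decoding in C. This is only a sketch: the struct and constant names are invented for illustration, and it assumes the DE1's 8 megabytes of SDRAM end at CPU address 0x00800000.

```c
#include <stdint.h>

/* Hypothetical helper type; values copied from the MEMORY block. */
struct mem_region { uint32_t origin; uint32_t length; };

const struct mem_region BOOT  = { 0x00000000u, 0x00002000u };
const struct mem_region SDRAM = { 0x007ff000u, 0x00001000u };

/* Assumption: 8 megabytes of SDRAM, ending at 0x00800000. */
#define DE1_SDRAM_BYTES (8u * 1024u * 1024u)

/* First address past the end of a region. */
uint32_t region_end(struct mem_region r)
{
    return r.origin + r.length;
}
```

Under those assumptions region_end(SDRAM) lands exactly on the 8 megabyte boundary, and the BOOT region stays well inside the first 64k that the decoder routes to BlockRAM.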
SECTIONS
{
    /* first section is .fixed_vectors which is used for startup code */
    . = 0x0000000;
    .fixed_vectors :
    {
        *(.fixed_vectors)
    }>BOOT
The .fixed_vectors section comes from the crt0.s startup code, and must be the first section in the BOOT region. It contains the initial jump instruction, a few memory-based registers for GCC (which isn’t happy building code for CPUs that don’t have any traditional registers at all) and the emulation code for instructions that aren’t implemented in hardware.
    /* Remaining code sections */
    . = ALIGN(4);
    .text :
    {
        *(.text) /* remaining code */
    } >BOOT

    /* .rodata section which is used for read-only data (constants) */
    . = ALIGN(4);
    .rodata :
    {
        *(.rodata)
    } >BOOT
The above adds the rest of the program code and any constants and strings to the Boot ROM.
    /* .data section which is used for initialized data */
    . = ALIGN(4);
    .data :
    {
        _data = . ;
        *(.data)
        SORT(CONSTRUCTORS)
        . = ALIGN(4);
    } >BOOT

    . = ALIGN(4);
    _romend = . ;
Initialised data is the only problem we haven’t addressed: it will still end up in the Boot ROM. This means that declaring a global variable “short val;” should now be fine, but “short val=0x1234;” will still cause problems.
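To make the distinction concrete, here is a small C sketch of the two cases, together with one possible way around the second: leave the variable uninitialised and assign its value at run time. The function name init_globals is hypothetical and not part of the boot firmware.

```c
/* No initialiser: this lands in .bss (or COMMON), so the linker
   script places it in SDRAM. */
short val;

/* With an initialiser it would land in .data instead, which this
   linker script still places in the Boot ROM region:

   short val = 0x1234;
*/

/* Hypothetical workaround: assign the initial value at run time,
   once, before the variable is first used. */
void init_globals(void)
{
    val = 0x1234;
}
```

Calling init_globals() early in main() gives the same starting value without needing any initialised data in the image.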
The _romend symbol isn’t needed, but it’s convenient to include because we can use it to keep an eye on code size.
    /* .bss section which is used for uninitialized data */
    .bss :
    {
        __bss_start = . ;
        __bss_start__ = . ;
        *(.bss)
        *(COMMON)
    } >SDRAM
    __bss_end__ = . ;
}
Finally, the BSS section, which will now end up in SDRAM. This also has the nice side-effect of reducing the space needed for the Boot ROM, since we no longer need to allow room for the BSS data.
Again, the __bss_start__ and __bss_end__ symbols aren’t strictly necessary, but they allow us to keep tabs on locations and sizes.
I do this with an extra target in the makefile, like so:
%.rpt: %.elf
	$(DUMP) -x $< | grep _romend # End of Boot ROM
	$(DUMP) -x $< | grep __bss_start__ # Start of BSS data in SDRAM
	$(DUMP) -x $< | grep __bss_end__ # End of BSS data in SDRAM
Every build finishes by printing this to the shell:
zpu-elf-objdump -x boot.elf | grep _romend # End of Boot ROM
000018ec g *ABS* 00000000 _romend
zpu-elf-objdump -x boot.elf | grep __bss_start__ # Start of BSS data in SDRAM
007ff000 g .bss 00000000 __bss_start__
zpu-elf-objdump -x boot.elf | grep __bss_end__ # End of BSS data in SDRAM
007ff2f8 g *ABS* 00000000 __bss_end__
The Boot ROM currently ends at 0x18ec, while the BSS data occupies SDRAM between 0x7ff000 and 0x7ff2f8.
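Turning those addresses into byte counts is simple arithmetic, sketched here in C. The function and macro names are made up for illustration; the lengths come from the MEMORY block in the linker script.

```c
#include <stdint.h>

#define BOOT_LENGTH  0x00002000u  /* 8192 bytes, from the MEMORY block */
#define SDRAM_LENGTH 0x00001000u  /* 4096 bytes, from the MEMORY block */

/* BOOT starts at address 0, so _romend is also the ROM usage in bytes. */
uint32_t rom_used(uint32_t romend)
{
    return romend;
}

/* BSS usage is the distance between __bss_start__ and __bss_end__. */
uint32_t bss_used(uint32_t bss_start, uint32_t bss_end)
{
    return bss_end - bss_start;
}
```

With the figures from the report, rom_used(0x18ec) is 6380 bytes of the 8192 available, and bss_used(0x7ff000, 0x7ff2f8) is 760 bytes of the 4096-byte SDRAM region.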