Part 3 – Hello World!
This time round I’ve added the On-screen Display component, and the firmware verifies that it’s working correctly by way of the archetypal “Hello World!” message!
I’ve also added project files for the MIST board, and will add support for a Xilinx-based board in the near future.
The source tree to accompany this part is tagged in the git repo as Step2.
The OSD component itself provides a few hardware registers that can be accessed from software, along with a 512-byte character buffer.
The VHDL interface looks like this:
entity OnScreenDisplay is
port(
    reset_n : in std_logic;
    clk : in std_logic;
    -- Video
    hsync_n : in std_logic; -- Sync inputs from the main core, used to time the
    vsync_n : in std_logic; -- window and pixel signals and position the OSD.
    enabled : out std_logic;
    pixel : out std_logic;
    window : out std_logic;
    -- Registers
    addr : in std_logic_vector(8 downto 0);
    data_in : in std_logic_vector(15 downto 0);
    data_out : out std_logic_vector(15 downto 0);
    reg_wr : in std_logic;
    char_wr : in std_logic;
    char_q : out std_logic_vector(7 downto 0)
);
end entity;
The readable registers are implemented using simple combinational logic and will thus respond within a single clock, so we don’t bother with any kind of req / ack mechanism here, in the interests of keeping things simple.
Address and data from the CPU are placed on addr and data_in. Bringing reg_wr high triggers a write to a register, and bringing char_wr high triggers a write to the character RAM.
data_out and char_q will output data from registers and character RAM, respectively, based on the addr input. This is a constant connection – no req signal is needed. If reading from the registers triggered some kind of action then we’d need a more complete req/ack mechanism here, but since reads are completely passive we don’t need to worry about it in this case.
The registers I’ve defined (in CtrlModule/Firmware/osd.h) look like this:
#define OSDBASE 0xFFFFFB00
#define HW_OSD(x) *(volatile unsigned int *)(OSDBASE+x)

#define REG_OSD_XPOS 0
#define REG_OSD_YPOS 4
#define REG_OSD_PIXELCLOCK 8
#define REG_OSD_HFRAME 12
#define REG_OSD_VFRAME 16
#define REG_OSD_ENABLE 20
- REG_OSD_XPOS and REG_OSD_YPOS are writable registers which determine the position of the top left corner of the display. The position is specified in pixel clocks (horizontal) or rows (vertical) from the respective sync pulse.
- REG_OSD_PIXELCLOCK is a writable register which the firmware uses to specify how many system clocks elapse per pixel clock.
- REG_OSD_HFRAME and REG_OSD_VFRAME are readable registers. The OSD module counts the number of system clocks (horizontal) or rows (vertical) that elapse during the positive and negative portions of the sync pulse. The duration of the low pulse is stored in bits 15 downto 8 and the high pulse in 7 downto 0.
(Since high accuracy isn’t needed here, to make best use of the data range the vertical counts are divided by 8, making headroom for higher resolution. Similarly, the horizontal counts are based on system clocks rather than pixel clocks, so they’re divided by 64 to keep them well within range.)
- Finally, REG_OSD_ENABLE – a writable register with three bits defined: Bit 0 enables or disables the OSD, while bits 1 and 2 invert the horizontal and vertical sync polarity, respectively.
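To make the bit layouts above concrete, here is a minimal sketch in C of how firmware might pack a value for REG_OSD_ENABLE and unpack the two frame registers. The helper and macro names are hypothetical (they are not taken from osd.h), and the functions work on plain integers so they can be tried on a host machine without the hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Bit assignments for REG_OSD_ENABLE as described in the text.
   These macro names are illustrative, not from osd.h. */
#define OSD_ENABLE_BIT   (1u << 0) /* bit 0: OSD on/off */
#define OSD_HSYNC_INVERT (1u << 1) /* bit 1: invert hsync polarity */
#define OSD_VSYNC_INVERT (1u << 2) /* bit 2: invert vsync polarity */

/* Build a value suitable for writing to REG_OSD_ENABLE. */
static uint32_t osd_enable_value(int enable, int invert_hsync, int invert_vsync)
{
    uint32_t v = 0;
    if (enable)       v |= OSD_ENABLE_BIT;
    if (invert_hsync) v |= OSD_HSYNC_INVERT;
    if (invert_vsync) v |= OSD_VSYNC_INVERT;
    return v;
}

/* REG_OSD_HFRAME / REG_OSD_VFRAME: low-pulse duration in bits 15..8. */
static unsigned osd_frame_low(uint32_t reg)  { return (reg >> 8) & 0xFFu; }

/* High-pulse duration in bits 7..0. */
static unsigned osd_frame_high(uint32_t reg) { return reg & 0xFFu; }

/* Undo the scaling: horizontal counts are system clocks divided by 64. */
static unsigned osd_hframe_to_clocks(unsigned count)
{
    assert(count <= 0xFFu); /* each field is only 8 bits wide */
    return count * 64u;
}

/* Vertical counts are rows divided by 8. */
static unsigned osd_vframe_to_rows(unsigned count)
{
    assert(count <= 0xFFu);
    return count * 8u;
}
```

On the real hardware the packed value would be written with something like HW_OSD(REG_OSD_ENABLE) = osd_enable_value(1, 0, 0), and the frame registers read back through the same macro. Since the counts are divided down before being stored, the unscaled results are only accurate to within 64 clocks or 8 rows.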
The character generator itself is very simple. The OSD maintains pixel counters which start when the display reaches the position specified in REG_OSD_[X|Y]POS. In order to supply a border to the window, these counters start from -4 and go up to 4 pixels past the actual OSD size.
To generate the actual character pixels, we have the aforementioned 512 byte Character RAM, and we also have a Character ROM which is a 1-bit wide ROM built by the Makefile in CtrlModule/CharROM.
The Character RAM is a dual-port RAM which has one read-only port for the character generator, and one read/write port for the CPU. The character RAM’s read address is generated like so:
charram_rdaddr <= std_logic_vector(ypixelpos(6 downto 3))&std_logic_vector(xpixelpos(7 downto 3));
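The address expression above takes four bits of row and five bits of column, which implies a grid of 16 rows by 32 columns of 8x8 characters (16 x 32 = 512 bytes). The following C sketch models that mapping and shows how a string could be placed in a buffer laid out the same way. The function osd_put_string and the OSD_COLS / OSD_ROWS names are hypothetical, and the buffer here is an ordinary array rather than the memory-mapped character RAM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define OSD_COLS 32 /* 5 column bits: xpixelpos(7 downto 3) */
#define OSD_ROWS 16 /* 4 row bits:    ypixelpos(6 downto 3) */

/* Software model of charram_rdaddr:
   ypixelpos(6 downto 3) & xpixelpos(7 downto 3) */
static unsigned charram_rdaddr(unsigned xpixelpos, unsigned ypixelpos)
{
    unsigned row = (ypixelpos >> 3) & 0xFu;
    unsigned col = (xpixelpos >> 3) & 0x1Fu;
    return (row << 5) | col;
}

/* Hypothetical helper: copy a string into a 512-byte character buffer
   starting at (row, col), clipping at the end of the row.
   Returns the number of characters written. */
static size_t osd_put_string(uint8_t *buf, unsigned row, unsigned col,
                             const char *s)
{
    assert(row < OSD_ROWS && col < OSD_COLS);
    size_t n = 0;
    while (s[n] != '\0' && col + n < OSD_COLS) {
        buf[row * OSD_COLS + col + n] = (uint8_t)s[n];
        ++n;
    }
    return n;
}
```

With this layout, the character at row r, column c sits at offset r * 32 + c, which is exactly the address the character generator produces for any pixel inside that character cell.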
Then the 7-bit ASCII value read from the character RAM is used to generate an address within the character ROM, like so:
charrom: entity Work.CharROM_ROM
generic map (
    addrbits => 13
)
port map (
    clock => clk,
    address => char(6 downto 0)&std_logic_vector(ypixelpos(2 downto 0))&std_logic_vector(xpixelpos(2 downto 0)),
    q => charpixel
);
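The 13 address bits break down as 7 bits of character code, 3 bits of row within the glyph and 3 bits of column within the glyph, so each glyph occupies 64 consecutive one-bit entries. As a sketch, the same concatenation can be written in C like this (the function name charrom_address is hypothetical):

```c
#include <assert.h>

#define CHARROM_ADDRBITS 13 /* matches addrbits => 13 in the generic map */

/* Software model of the character ROM address:
   char(6 downto 0) & ypixelpos(2 downto 0) & xpixelpos(2 downto 0) */
static unsigned charrom_address(unsigned ch, unsigned xpixelpos,
                                unsigned ypixelpos)
{
    unsigned addr = ((ch & 0x7Fu) << 6)       /* which glyph (bit 7 dropped) */
                  | ((ypixelpos & 7u) << 3)   /* row within the glyph */
                  | (xpixelpos & 7u);         /* column within the glyph */
    assert(addr < (1u << CHARROM_ADDRBITS));
    return addr;
}
```

Because only the low three bits of each pixel counter are used, the address repeats every 8 pixels in both directions, while the character RAM address from the previous step selects which glyph is shown in each cell.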
The only problem with this is that the character generator will output data all the time, even during the OSD window borders, so we mask it off using some enable signals which are generated like so:
-- Enable vactive for ypixel positions between 0 and 127, inclusive.
vactive<='1' when ypixelpos(11 downto 7)="00000" else '0';
-- Enable hactive for xpixel positions between 0 and 255, inclusive.
hactive<='1' when xpixelpos(11 downto 8)="0000" else '0';
pixel <=charpixel and hactive and vactive;
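The trick here is that checking the upper bits for zero rejects both the positions past the end of the text area and the negative positions in the leading border, since a negative two's complement value has its upper bits set. Here is a small C model of that masking, assuming the pixel counters are 12-bit signed values (the VHDL slices suggest at least 12 bits, but the exact width is an assumption):

```c
#include <assert.h>

/* Wrap a signed position into 12 bits, the way an assumed 12-bit
   two's complement counter would hold it. */
static unsigned to_12bit(int pos)
{
    assert(pos >= -2048 && pos <= 2047);
    return (unsigned)pos & 0xFFFu;
}

/* hactive: xpixelpos(11 downto 8) = "0000", i.e. 0..255 */
static int hactive(int xpixelpos)
{
    return (to_12bit(xpixelpos) >> 8) == 0u;
}

/* vactive: ypixelpos(11 downto 7) = "00000", i.e. 0..127 */
static int vactive(int ypixelpos)
{
    return (to_12bit(ypixelpos) >> 7) == 0u;
}

/* pixel <= charpixel and hactive and vactive */
static int osd_pixel(int charpixel, int xpixelpos, int ypixelpos)
{
    return charpixel && hactive(xpixelpos) && vactive(ypixelpos);
}
```

For example, a counter value of -4 is 0xFFC in 12 bits, so its top four bits are all ones and hactive is 0 for the whole of the left-hand border.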
All that remains is to connect the OSD within the CtrlModule component, and merge its output with the video display in the host project.
The control module itself now needs some extra signals:
In the entity definition…
-- Video signals for OSD
vga_hsync : in std_logic;
vga_vsync : in std_logic;
osd_window : out std_logic;
osd_pixel : out std_logic;
…and in the architecture header
-- OSD related signals
signal osd_wr : std_logic;
signal osd_charwr : std_logic;
signal osd_char_q : std_logic_vector(7 downto 0);
signal osd_data : std_logic_vector(15 downto 0);
The OSD is instantiated like this:
-- OSD
myosd : entity work.OnScreenDisplay
port map(
    reset_n => reset_n,
    clk => clk,
    -- Video
    hsync_n => vga_hsync,
    vsync_n => vga_vsync,
    pixel => osd_pixel,
    window => osd_window,
    -- Registers
    addr => mem_addr(8 downto 0), -- low 9 bits of address
    data_in => mem_write(15 downto 0),
    data_out => osd_data(15 downto 0),
    reg_wr => osd_wr, -- Trigger a write to the control registers
    char_wr => osd_charwr, -- Trigger a write to the character RAM
    char_q => osd_char_q -- Data from the character RAM
);
In RTL/virtual_toplevel.vhd we now need to merge the new osd_window and osd_pixel signals with the host core’s video. This isn’t difficult; rather than feeding the host core’s video signals directly to the toplevel, we route them through an OSD_Overlay component, which takes care of merging the signals. (It’s also capable of applying a dimmed-scanline effect, which is a nice bonus for very little extra logic.)
The inner logic of the OSD overlay looks like this:
if osd_window_in='1' then
    red_out<=osd_pixel_in&osd_pixel_in&red_in(7 downto 2);
    green_out<=osd_pixel_in&osd_pixel_in&green_in(7 downto 2);
    blue_out<=osd_pixel_in&'1'&blue_in(7 downto 2);
elsif scanline='1' and scanline_ena='1' then
    red_out<='0'&red_in(7 downto 1);
    green_out<='0'&green_in(7 downto 1);
    blue_out<='0'&blue_in(7 downto 1);
else
    red_out<=red_in;
    green_out<=green_in;
    blue_out<=blue_in;
end if;
When the OSD window is inactive, the RGB values are passed through either unaltered, or dimmed according to the scanline generating logic. When the OSD window is active, the RGB values are shifted right two places, dimming the image, and the top two bits of each colour are replaced. Red and Green's top two bits and Blue's single topmost bit are replaced with the OSD pixel signal, while Blue bit 6 is always set to 1, giving the blue-tinted OSD background colour.
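To see what this does to actual colour values, here is a C model of the same three-way choice, operating on 8-bit channels. The rgb_t type and the osd_overlay function are names made up for this sketch; the bit manipulation follows the VHDL above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } rgb_t; /* hypothetical helper type */

/* Software model of the overlay logic for one pixel. */
static rgb_t osd_overlay(rgb_t in, int window, int pixel,
                         int scanline, int scanline_ena)
{
    rgb_t out = in; /* default: pass through unaltered */
    if (window) {
        unsigned p = pixel ? 1u : 0u;
        /* Shift the host video right by two and replace the top two bits. */
        out.r = (uint8_t)((p << 7) | (p << 6) | (in.r >> 2));
        out.g = (uint8_t)((p << 7) | (p << 6) | (in.g >> 2));
        out.b = (uint8_t)((p << 7) | 0x40u    | (in.b >> 2));
        assert(out.b & 0x40u); /* blue bit 6 is always set inside the window */
    } else if (scanline && scanline_ena) {
        /* Dimmed scanline: halve each channel. */
        out.r = (uint8_t)(in.r >> 1);
        out.g = (uint8_t)(in.g >> 1);
        out.b = (uint8_t)(in.b >> 1);
    }
    return out;
}
```

Over a black host image, for instance, a background pixel inside the window comes out as (0x00, 0x00, 0x40), a dark blue, while a lit character pixel comes out as (0xC0, 0xC0, 0xC0), a light grey that brightens a little where the host image underneath is brighter.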
Now all that remains is giving the ZPU access to the new OSD hardware.
In the previous part, I mentioned that I was decoding the address space in 256-byte chunks, and commented that it probably looked strange when there was only one register to decode. This scheme will now make more sense: we have the existing register block at 0xFFFFFF00, and I'm now going to add the OSD's control registers at 0xFFFFFB00, and the character RAM at 0xFFFFFC00 and 0xFFFFFD00. This is very simple, and the entire decoding block now looks like this:
process(clk)
begin
    if reset_n='0' then
    elsif rising_edge(clk) then
        mem_busy<='1';
        osd_charwr<='0';
        osd_wr<='0';

        -- Write from CPU?
        if mem_writeEnable='1' then
            case mem_addr(maxAddrBit)&mem_addr(10 downto 8) is
                when X"B" => -- OSD controller at 0xFFFFFB00
                    osd_wr<='1';
                    mem_busy<='0';
                when X"C" => -- OSD character RAM at 0xFFFFFC00 & 0xFFFFFD00
                    osd_charwr<='1';
                    mem_busy<='0';
                when X"D" => -- OSD character RAM at 0xFFFFFC00 & 0xFFFFFD00
                    osd_charwr<='1';
                    mem_busy<='0';
                when X"F" => -- Peripherals at 0xFFFFFF00
                    case mem_addr(7 downto 0) is
                        when X"FC" => -- Host SW
                            mem_busy<='0';
                            dipswitches<=mem_write(15 downto 0);
                        when others =>
                            mem_busy<='0';
                            null;
                    end case;
                when others =>
                    mem_busy<='0';
            end case;

        -- Read from CPU?
        elsif mem_readEnable='1' then
            case mem_addr(maxAddrBit)&mem_addr(10 downto 8) is
                when X"B" => -- OSD registers
                    mem_read(31 downto 16)<=(others => '0');
                    mem_read(15 downto 0)<=osd_data;
                    mem_busy<='0';
                when X"C" => -- OSD character RAM at 0xFFFFFC00 & 0xFFFFFD00
                    mem_read(31 downto 8)<=(others => 'X');
                    mem_read(7 downto 0)<=osd_char_q;
                    mem_busy<='0';
                when X"D" => -- OSD character RAM at 0xFFFFFC00 & 0xFFFFFD00
                    mem_read(31 downto 8)<=(others => 'X');
                    mem_read(7 downto 0)<=osd_char_q;
                    mem_busy<='0';
                when X"F" => -- Peripherals
                    case mem_addr(7 downto 0) is
                        -- We don't have any readable registers yet.
                        when others =>
                            mem_busy<='0';
                            null;
                    end case;
                when others => -- SDRAM
                    mem_busy<='0';
            end case;
        end if;
    end if; -- rising_edge(clk)
end process;
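The case expression forms a 4-bit key from the top address bit and bits 10 to 8, which is why the blocks show up as X"B", X"C", X"D" and X"F". As a sanity check, the same decode can be modelled in C. This is only a sketch: it assumes maxAddrBit is 31, which isn't stated in this part, and the enum and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ADDR_BIT 31 /* assumption: the value of maxAddrBit is not given here */

typedef enum {
    TARGET_OSD_REGS,    /* X"B": 0xFFFFFB00 */
    TARGET_OSD_CHARRAM, /* X"C" / X"D": 0xFFFFFC00 and 0xFFFFFD00 */
    TARGET_PERIPHERALS, /* X"F": 0xFFFFFF00 */
    TARGET_OTHER        /* everything else */
} target_t;

/* Model of mem_addr(maxAddrBit) & mem_addr(10 downto 8). */
static unsigned decode_key(uint32_t addr)
{
    unsigned key = (unsigned)(((addr >> MAX_ADDR_BIT) & 1u) << 3)
                 | (unsigned)((addr >> 8) & 7u);
    assert(key <= 0xFu);
    return key;
}

/* Map an address to the block the case statement would select. */
static target_t decode_target(uint32_t addr)
{
    switch (decode_key(addr)) {
    case 0xB: return TARGET_OSD_REGS;
    case 0xC:
    case 0xD: return TARGET_OSD_CHARRAM;
    case 0xF: return TARGET_PERIPHERALS;
    default:  return TARGET_OTHER;
    }
}
```

One thing the model makes visible is that only four address bits take part in the decode, so under these assumptions an address such as 0xFFFFF300 selects the OSD registers just as 0xFFFFFB00 does. That is harmless as long as the firmware only uses the documented addresses.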