Writing a new SDRAM controller – part 4 – 2021-08-06
The SDRAM controller I’ve described so far in this series is basically performing well – not quite as well as the one it’s intended to replace – but there’s still one more trick we can use to squeeze out a little more performance…
One of the complications of using SDRAM over, say, SRAM, is that the contents of memory have to be refreshed periodically, otherwise the charge in the memory cells leaks away over time and memory contents are lost. To avoid this, every row of every bank must be refreshed at least once every 64ms. (I’m not sure what’s so special about 64ms – but it seems to be the standard refresh interval for SDRAMs of all types except those intended for automotive applications, which typically have a refresh interval of 16ms. I’m not sure whether this is due to the increased reliability requirements where a failure could be dangerous, or whether it’s due to running in an environment with more heat, or a bit of both.)
The “easy” way to handle refresh is simply to issue Autorefresh commands to the chip. All banks must be idle, since the chip will refresh one row within each of the four banks whenever this command is issued. It maintains an internal counter, so we don’t have to think about it in any more depth than just making sure we refresh often enough.
But how often is enough? The SDRAM chips of interest here typically have 8,192 rows, so we need 8,192 refreshes every 64ms. That’s 128,000 refreshes per second, or one refresh every 7.8125µs. If we’re running our SDRAM controller at 128MHz that’s one refresh every 1000 cycles; at 96MHz it’s one every 750 cycles. That’s actually quite a lot, when you consider that a refresh typically takes between 7 and 9 cycles depending on clock speed. Since the entire chip is effectively offline during the refresh cycle, that’s about 1% of the bandwidth lost to refresh.
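The arithmetic above is easy to check with a few lines of Python. This is only a worked calculation: the function name is made up, and the default of 8 cycles per refresh is an assumption (the midpoint of the 7 to 9 quoted above).

```python
def refresh_budget(clock_hz, rows=8192, retention_ms=64, refresh_cycles=8):
    """Return (cycles between refreshes, fraction of bandwidth lost)."""
    # Every row must be refreshed once per retention period.
    refreshes_per_second = rows * 1000 / retention_ms
    cycles_between = clock_hz / refreshes_per_second
    # The whole chip is offline for the duration of each Autorefresh.
    overhead = refresh_cycles / cycles_between
    return cycles_between, overhead


for mhz in (128, 96):
    interval, overhead = refresh_budget(mhz * 1_000_000)
    print(f"{mhz} MHz: one refresh every {interval:.0f} cycles, "
          f"{overhead:.2%} of bandwidth")
```

With those assumptions the loss works out at 0.8% at 128MHz and a little over 1% at 96MHz.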
Is this a problem in practice? It depends. For any given subsystem within a core there’s likely to be some downtime into which you can insert refreshes without impacting performance – the blanking periods of a video system are a prime example. Further, it’s not vital to perform a refresh on a fixed schedule of exactly 7.8125µs: as long as 8,192 of them have taken place over the course of 64ms it doesn’t matter if they’ve been clumped together in batches with longer gaps in between. (DDR RAM is stricter in this regard.) However, finding downtime for all subsystems simultaneously can be challenging. In the PC Engine core in particular, I wasn’t able to find sufficient refresh slots that would impact neither the video nor the ROM latency – so another approach was needed.
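One way to picture clumped refreshes is as a “refresh debt”: debt accrues at the average rate and is paid off whenever there’s a free slot. The sketch below is a toy software model of that idea, not the logic of the actual controller. The class name, the 750-cycle interval (the 96MHz figure from above), the debt limit and the scanline timing are all assumptions for illustration, and it ignores the several cycles each refresh really occupies.

```python
class RefreshDebt:
    """Toy model: one refresh falls due every `interval` cycles."""

    def __init__(self, interval=750, max_debt=8):
        self.interval = interval
        self.max_debt = max_debt  # hypothetical limit before a refresh is forced
        self.timer = 0
        self.debt = 0

    def tick(self, slot_free):
        """Advance one cycle; return True if a refresh is issued this cycle."""
        self.timer += 1
        if self.timer == self.interval:
            self.timer = 0
            self.debt += 1
        # Pay off debt when the slot is idle, or when we can't wait any longer.
        if self.debt > 0 and (slot_free or self.debt >= self.max_debt):
            self.debt -= 1
            return True
        return False


# Hypothetical video timing: 6000 cycles per line, the last 1000 are blanking.
model = RefreshDebt()
issued = 0
for cycle in range(10 * 6000):
    in_blanking = (cycle % 6000) >= 5000
    issued += model.tick(in_blanking)
print(issued, "refreshes issued,", model.debt, "still owed")
```

In this made-up timing every refresh lands in a blanking period, and the debt never reaches the limit at which one would have to be forced into the active part of the line.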
A normal access to a row within an SDRAM chip causes the contents of the memory cells to be extracted, then immediately re-written, and it’s this operation which the Autorefresh command triggers internally. As I said, it performs this operation on all four banks concurrently – but it’s perfectly possible to perform the same operation manually, simply by activating and then closing each row in turn. (For simplicity, I actually perform a dummy read with Autoprecharge, and ignore the result – since then I can use the same command slots as for regular reads and writes. I could also close the row using an explicit Precharge command – however this would have to happen a few cycles later than regular Reads and Writes, which would make scheduling and interleaving more complicated.)
There are two downsides: firstly, we can only address one bank at a time, so if we choose to refresh manually we’ll be issuing four times as many refresh commands in total – and secondly we have to maintain our own row counter for each bank, to make sure we visit every row in turn.
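The bookkeeping this implies might look something like the following. It’s a software model only, since the real controller is hardware, and the class name, method names and the four-bank, 8,192-row geometry are assumptions for illustration.

```python
class ManualRefresh:
    """Software model of per-bank refresh row counters (illustrative only)."""

    def __init__(self, banks=4, rows=8192):
        self.rows = rows
        self.next_row = [0] * banks  # one independent counter per bank

    def refresh(self, bank):
        """Pick the next row to refresh in `bank` and advance its counter."""
        row = self.next_row[bank]
        self.next_row[bank] = (row + 1) % self.rows  # wrap after the last row
        # The controller would now activate this row and issue a dummy
        # read with autoprecharge, ignoring the data that comes back.
        return bank, row

    def commands_per_interval(self):
        """Refresh commands needed per 64ms when every bank is done manually."""
        return len(self.next_row) * self.rows


refresher = ManualRefresh()
print(refresher.refresh(2))               # first row of bank 2
print(refresher.commands_per_interval())  # four times the Autorefresh count
```

Because each bank has its own counter, the banks can be refreshed at different times and different rates without losing track of which row is due next.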
The payoff is that we don’t have to take the whole chip offline to refresh – we can refresh one bank while another is being used as normal. Thus, for banks containing video data we can do multiple refreshes during blanking intervals, leaving the bank clear for uninterrupted access during a scanline, while refreshing a ROM more regularly in the gaps between normal accesses.
There’s one other advantage, too (though I can’t help feeling that treating the chip this way is “impolite”!): If we know that we’re only using, say, a quarter of the rows within a bank, we can simply leave the other three quarters unrefreshed. They’ll lose their contents, of course, but since we’re not using them, we don’t care.
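To put a rough number on that saving, here’s a small calculation of how often a single bank needs a manual refresh, depending on how many of its rows are actually in use. The function name and the 96MHz clock are assumptions for illustration; the 64ms period and the 8,192 rows come from earlier in the post.

```python
def cycles_between_refreshes(clock_hz, rows_in_use, retention_ms=64):
    """Average gap, in cycles, between refreshes of one bank."""
    cycles_per_interval = clock_hz * retention_ms // 1000
    # Only the rows we care about need to be visited once per interval.
    return cycles_per_interval // rows_in_use


clock = 96_000_000  # assumed controller clock
print(cycles_between_refreshes(clock, 8192))  # whole bank in use
print(cycles_between_refreshes(clock, 2048))  # only a quarter of the rows in use
```

At 96MHz a fully used bank needs a refresh every 750 cycles on average, while a bank with only a quarter of its rows in use needs one every 3000 cycles.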
If you’re curious to see the “finished” SDRAM controller described in this series, it can be found here as part of the source code of the PC Engine core’s port to the Turbo Chameleon 64.