Home

Brief Description

This DDR3 controller was originally designed for the 10-Gigabit Ethernet Project<sup>[4]</sup>, targeting an 8-lane x8 DDR3 module running at 800 MHz DDR, but it is now being developed into a more general DDR3 memory controller that supports multiple FPGA boards. It is a 4:1 memory controller with configurable timing parameters and mode registers, so it can be configured for any DDR3 memory device. The user interface is basic Wishbone.

This memory controller is optimized to maintain a high data throughput and continuous sequential burst operations. The controller handles the reset sequence, refresh sequence, mode register configuration, bank status tracking, timing delay tracking, command issuing, and the PHY's calibration. The PHY's calibration handles the bitslip training, read-DQ/DQS alignment via MPR (read calibration), write-DQ/DQS alignment via write leveling (write calibration), and also an optional comprehensive read/write test.

The optional comprehensive read/write tests make it easier to test the memory controller without needing an external CPU. These tests include burst access, random access, and alternating read-write access tests. Calibration ends only if no error is found in these tests, after which the user can start accessing the wishbone interface.

This design is formally verified and simulated using the Micron DDR3 model.

Getting Started

The recommended way to instantiate this IP is to use the top module rtl/ddr3_top.v; a template for instantiation is also included in that file. Integrating this DDR3 memory controller IP takes three steps: instantiate the design, create the constraint file, then edit the localparams.

:heavy_check_mark: Instantiate Design

The first thing to edit are the top-level parameters:

| Parameter | Function |
| --- | --- |
| CONTROLLER_CLK_PERIOD | Clock period of the controller interface in picoseconds. Tested values range from 12_000 ps (83.33 MHz) to 10_000 ps (100 MHz). |
| DDR3_CLK_PERIOD | Clock period of the DDR3 RAM device in picoseconds, which must be 1/4 of the CONTROLLER_CLK_PERIOD. Tested values range from 3_000 ps (333.33 MHz) to 2_500 ps (400 MHz). |
| ROW_BITS | Width of row address. Use chapter 2.11 DDR3 SDRAM Addressing from JEDEC DDR3 doc (page 15) as a guide. Possible values range from 12 to 16. |
| COL_BITS | Width of column address. Use chapter 2.11 DDR3 SDRAM Addressing from JEDEC DDR3 doc (page 15) as a guide. Possible values range from 10 to 12. |
| BA_BITS | Width of bank address. Use chapter 2.11 DDR3 SDRAM Addressing from JEDEC DDR3 doc (page 15) as a guide. Usual value is 3. |
| BYTE_LANES | Number of bytes based on width of DQ. <sup>[1]</sup> |
| AUX_WIDTH | Width of auxiliary line. Value must be >= 4. <sup>[2]</sup> |
| WB2_ADDR_BITS | Width of 2nd wishbone address bus for debugging (only relevant if SECOND_WISHBONE = 1). |
| WB2_DATA_BITS | Width of 2nd wishbone data bus for debugging (only relevant if SECOND_WISHBONE = 1). |
| MICRON_SIM | Set to 1 if used in Micron DDR3 model to shorten power-on sequence, otherwise 0. |
| ODELAY_SUPPORTED | Set to 1 if ODELAYE2 primitive is supported by the FPGA, otherwise 0. <sup>[3]</sup> |
| SECOND_WISHBONE | Set to 1 if 2nd wishbone for debugging is needed, otherwise 0. |

After the parameters, connect the ports of the top module to your design. Below are the ports for clocks and reset:

| Port | Function |
| --- | --- |
| i_controller_clk | Clock of the controller interface with period of CONTROLLER_CLK_PERIOD. |
| i_ddr3_clk | Clock of the DDR3 interface with period of DDR3_CLK_PERIOD. |
| i_ref_clk | Reference clock for IDELAYCTRL primitive with frequency of 200 MHz. |
| i_ddr3_clk_90 | Clock required only if ODELAY_SUPPORTED = 0, otherwise can be left unconnected. Has a period of DDR3_CLK_PERIOD with 90° phase shift. |
| i_rst_n | Active-low synchronous reset for the entire DDR3 controller and PHY. |

It is recommended to generate all these clocks from a single PLL or clock-generator.
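
The clock relationships described above can be sanity-checked before configuring the PLL. The following is a minimal Python sketch, not part of this repository: the function names are hypothetical, and it assumes the period parameters are integers in picoseconds, so that CONTROLLER_CLK_PERIOD must divide evenly by 4. It also derives the time offset of a 90° phase shift, which is simply a quarter of the DDR3 clock period.

```python
def period_ps_to_mhz(period_ps: int) -> float:
    """Convert a clock period in picoseconds to a frequency in MHz."""
    return 1e6 / period_ps


def derive_clocks(controller_clk_period_ps: int) -> dict:
    """Derive DDR3 clock figures for a 4:1 controller (hypothetical helper)."""
    # Assumption: periods are integer picoseconds, so the 4:1 ratio must be exact.
    if controller_clk_period_ps % 4 != 0:
        raise ValueError("CONTROLLER_CLK_PERIOD must be divisible by 4")
    ddr3_clk_period_ps = controller_clk_period_ps // 4
    return {
        "CONTROLLER_CLK_PERIOD": controller_clk_period_ps,
        "DDR3_CLK_PERIOD": ddr3_clk_period_ps,
        "controller_mhz": period_ps_to_mhz(controller_clk_period_ps),
        "ddr3_mhz": period_ps_to_mhz(ddr3_clk_period_ps),
        # A 90-degree phase shift is a quarter of the DDR3 clock period.
        "clk_90_offset_ps": ddr3_clk_period_ps / 4,
    }


if __name__ == "__main__":
    # The two ends of the tested range from the parameter table.
    for period in (12_000, 10_000):
        print(derive_clocks(period))
```

For CONTROLLER_CLK_PERIOD = 10_000 this gives DDR3_CLK_PERIOD = 2_500 (400 MHz), matching the tested values listed in the parameter table.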


Next are the main wishbone ports:

| Port | Function |
| --- | --- |
| i_wb_cyc | Indicates if a bus cycle is active. A high value (1) signifies normal operation, while a low value (0) signals the cancellation of all ongoing transactions. |
| i_wb_stb | Strobe or transfer request signal. It is asserted (set to 1) to request a data transfer. |
| i_wb_we | Write-enable signal. A high value (1) indicates a write operation, and a low value (0) indicates a read operation. |
| i_wb_addr | Address bus. Used to specify the address for the current read or write operation. Formatted as {row, bank, column}. |
| i_wb_data | Data bus for write operations. In a 4:1 controller, the data width is 8 times the number of DDR3 data pins (8 x DQ_BITS x LANES). |
| i_wb_sel | Byte select for write operations. Indicates which bytes of the data bus are to be overwritten by the write operation. |
| o_wb_stall | Indicates that the controller is busy (1) and cannot accept any new requests. |
| o_wb_ack | Acknowledgement signal. Indicates that a read or write request has been completed. |
| o_wb_data | Data bus for read operations. Similar to i_wb_data, the data width for a 4:1 controller is 8 times the number of DDR3 data pins (8 x DQ_BITS x LANES). |

Below are the auxiliary ports associated with the main wishbone. These are not required for normal operation, but are intended for AXI-interface compatibility, which is not yet available:

| Port | Function |
| --- | --- |
| i_aux | Request ID line with width of AUX_WIDTH. The Request ID is retrieved simultaneously with the strobe request. |
| o_aux | Request ID line with width of AUX_WIDTH. The Request ID is sent back concurrently with the acknowledgement signal. |

After the main wishbone ports are the second-wishbone ports. This interface is only for debugging purposes and is normally not needed, so it can be left unconnected by setting SECOND_WISHBONE = 0. The ports of the second wishbone are much the same as those of the main wishbone.

Next are the DDR3 I/O ports. These are connected directly to the top-level pins of your design, so the port names must match those in your constraint file. You do not need to understand what each DDR3 I/O port does, but if you are curious, each DDR3 I/O pin is described in 2.10 Pinout Description from JEDEC DDR3 doc (page 13).

Last are the debug ports. These are connected to relevant registers containing information on the current state of the controller. Trace each o_debug_* inside ddr3_controller.v to change which registers are monitored.
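
When wiring up the main wishbone data ports, the bus widths follow from BYTE_LANES. The following is a minimal Python sketch with hypothetical names. It assumes 8 DQ bits per byte lane and the "8 times the DDR3 pins" rule stated for i_wb_data and o_wb_data. The i_wb_sel width of one bit per data byte is an assumption based on the byte-select description, not a figure taken from the RTL.

```python
DQ_BITS_PER_LANE = 8       # one byte lane carries 8 DQ pins
WORDS_PER_CONTROLLER_CLK = 8  # the "8 times the DDR3 pins" rule for a 4:1 controller


def wishbone_widths(byte_lanes: int) -> dict:
    """Return the main wishbone bus widths for a given BYTE_LANES (hypothetical helper)."""
    if byte_lanes < 1:
        raise ValueError("BYTE_LANES must be at least 1")
    dq_pins = DQ_BITS_PER_LANE * byte_lanes
    data_bits = WORDS_PER_CONTROLLER_CLK * dq_pins
    return {
        "dq_pins": dq_pins,
        "wb_data_bits": data_bits,
        # Assumption: one select bit per byte of the data bus.
        "wb_sel_bits": data_bits // 8,
    }


if __name__ == "__main__":
    print(wishbone_widths(2))  # x16 device, e.g. BYTE_LANES = 2
    print(wishbone_widths(8))  # 64-bit module, BYTE_LANES = 8
```

With BYTE_LANES = 2 the data buses are 128 bits wide, and with BYTE_LANES = 8 they are 512 bits wide.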

:heavy_check_mark: Create Constraint File

:heavy_check_mark: Edit Localparams

The verilog file rtl/ddr3_controller.v contains the timing parameters that need to be configured by the user to match the DDR3 device. Base the timing values on Chapter 13 Electrical Characteristics and AC Timing from JEDEC DDR3 doc (page 169). The default values in the verilog file should generally work for DDR3-800.

Note:

[1]: For x16 DDR3 like in Arty S7, use BYTE_LANES of 2. If the memory configuration is a SO-DIMM with 8 DDR3 RAM modules, each being x8 to form a total of 64 bits of data, then BYTE_LANES would be 8.
[2]: The auxiliary line is intended for AXI-interface compatibility but is also utilized in the reset sequence, which is the origin of the minimum required width of 4.
[3]: ODELAYE2 is supported if the DDR3 device is connected to an HP (High-Performance) bank of the FPGA. HR (High-Range) banks do not support ODELAYE2, according to UG471 7 Series FPGAs SelectIO Resources User Guide (page 134).
[4]: This is the open-sourced 10Gb Ethernet Project.


Lint and Formal Verification

The easiest way to compile, lint, and formally verify the design is to run ./run_compile.sh in the top-level directory. This will first run Verilator lint.

Next is compilation with Yosys, which will show warnings such as:

Warning: Replacing memory ... with list of registers.

Disregard this kind of warning, as it only means that small memory elements in the design are converted into a series of register elements.

After Yosys compilation comes Icarus Verilog compilation. This should not show any warnings or errors, but it will display the Test Functions section, which verifies that the verilog functions return the correct values, and the Controller Parameters section, which verifies that the top-level parameters are set properly. Delay values for some timing parameters are also shown.

Last is the SymbiYosys formal verification, which runs both the single-configuration and multiple-configuration sby files. A summary is shown at the end where all tasks passed:

image

Simulation

For simulation, the DDR3 SDRAM Verilog Model from Micron is used. Import all simulation files under ./testbench to Vivado. ddr3_dimm_micron_sim.sv is the top-level module, which instantiates both the DDR3 memory controller and the Micron DDR3 model. This module issues read and write requests to the controller via the wishbone bus, then checks that the data returned by read requests matches the data written. Both sequential and random accesses are tested.

Currently, there are 2 general options for running the simulation, selected by a define directive in the ddr3_dimm_micron_sim.sv file: TWO_LANES_x8 and EIGHT_LANES_x8. TWO_LANES_x8 simulates an Arty-S7 FPGA board, which has an x16 DDR3, while EIGHT_LANES_x8 simulates 8 lanes of x8 DDR3 module. Make sure to change the organization via a define directive in ddr3.sv (TWO_LANES_x8 must use define x8 while EIGHT_LANES_x8 must use define x16).

After configuring, run simulation. The ddr3_dimm_micron_sim_behav.wcfg contains the waveform. Shown below are the clocks:
image

As shown below, command_used shows the command issued at a specific time. During reads, dqs should toggle and dq should have a valid value; otherwise they must be in high-impedance Z. Precharge and activate also happen between reads when the row addresses are different.
image

Part of the internal test is to alternate consecutive writes and reads, as shown below. The data written must match the data read. dqs should also toggle along with the data written and read.
image

There are counters for the number of correct and wrong read data during the internal read/write test: correct_read_data and wrong_read_data. As shown below, the wrong_read_data must remain zero while correct_read_data must increment until it reaches the maximum (3499 on this example).
image

The simulation also reports the status of the simulation. For example, the report below:

[10000 ps] RD @ (0, 840) -> [10000 ps] RD @ (0, 848) -> [10000 ps] RD @ (0, 856) -> [10000 ps] RD @ (0, 864) -> [10000 ps] RD @ (0, 872) ->

The format is [time_delay] command @ (bank, address), so [10000 ps] RD @ (0, 840) means a 10000 ps delay before a read command to bank 0 and address 840. Notice that each read command is 10000 ps (10 ns) apart from the previous one. Since the controller clock here is 100 MHz (10 ns clock period), this shows that one read is issued every controller clock with no interruptions between sequential read commands, resulting in very high throughput.
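
The back-to-back check described above can be automated on a captured report line. The following is a minimal Python sketch, not part of the testbench: the function names are hypothetical, and the regular expression assumes every entry follows the [time_delay] command @ (bank, address) format exactly as in the sample.

```python
import re

# One entry looks like: [10000 ps] RD @ (0, 840)
ENTRY = re.compile(r"\[(\d+) ps\]\s+(\w+)\s+@\s+\((\d+),\s*(\d+)\)")


def parse_report(line: str) -> list:
    """Return (delay_ps, command, bank, address) tuples found in a report line."""
    return [
        (int(delay), cmd, int(bank), int(addr))
        for delay, cmd, bank, addr in ENTRY.findall(line)
    ]


def is_back_to_back(entries: list, controller_clk_period_ps: int) -> bool:
    """True if every command was issued exactly one controller clock after the previous one."""
    return all(delay == controller_clk_period_ps for delay, _, _, _ in entries)


if __name__ == "__main__":
    sample = (
        "[10000 ps] RD @ (0, 840) -> [10000 ps] RD @ (0, 848) -> "
        "[10000 ps] RD @ (0, 856) -> [10000 ps] RD @ (0, 864) -> "
        "[10000 ps] RD @ (0, 872) ->"
    )
    entries = parse_report(sample)
    print(entries)
    print(is_back_to_back(entries, 10_000))
```

Any entry with a delay larger than one controller clock period would point to a stall, for example a precharge and activate between accesses to different rows.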

A short report is also shown in each test section:

DONE TEST 1: LAST ROW
Number of Operations: 2304
Time Started: 363390 ns
Time Done: 387980 ns
Average Rate: 10 ns/request

This report comes after a burst write followed by a burst read. It shows that there were 2304 write and read operations, and that the average time per request is 10 ns (1 controller clock period at 100 MHz). The average rate is optimal since this is a burst write and read. But for random writes and reads:

DONE TEST 2: RANDOM
Number of Operations: 2304
Time Started: 387980 ns
Time Done: 497660 ns
Average Rate: 47 ns/request

Notice how the average rate increased to 47 ns/request. Random access requires occasional precharge and activate commands, which take time and thus lengthen each read or write access. A summary is shown at the very end of the report:

TEST CALIBRATION
[-]: write_test_address_counter = 5000
[-]: read_test_address_counter = 2000
[-]: correct_read_data = 3499
[-]: wrong_read_data = 0

------- SUMMARY -------
Number of Writes = 4608
Number of Reads = 4608
Number of Success = 4604
Number of Fails = 4
Number of Injected Errors = 4

The values under TEST CALIBRATION are the results of the internal read/write test that runs as part of the internal calibration. These are the same counters shown on the waveform earlier, where wrong_read_data should be zero. Under SUMMARY is the report from the external read/write test, where the top-level simulation file ddr3_dimm_micron_sim.sv sends read/write requests to the DDR3 controller via the wishbone bus. Notice that the number of fails (4) matches the number of injected errors (4), as expected.
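
The figures in the test reports above can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch with hypothetical function names. Two points are inferred from the sample numbers rather than taken from the testbench source: the average rate appears to be truncated to a whole number of nanoseconds, and successes plus fails add up to the number of reads.

```python
def average_rate_ns(time_started_ns: int, time_done_ns: int, operations: int) -> int:
    """Average time per request, truncated to whole nanoseconds (inferred behaviour)."""
    return (time_done_ns - time_started_ns) // operations


def summary_is_consistent(reads: int, success: int, fails: int, injected_errors: int) -> bool:
    """Check that every read is accounted for and every fail is an injected error."""
    return success + fails == reads and fails == injected_errors


if __name__ == "__main__":
    # TEST 1 (burst) and TEST 2 (random) from the sample reports
    print(average_rate_ns(363390, 387980, 2304))
    print(average_rate_ns(387980, 497660, 2304))
    # SUMMARY section from the sample report
    print(summary_is_consistent(reads=4608, success=4604, fails=4, injected_errors=4))
```

A run where the number of fails exceeds the number of injected errors would indicate a genuine data mismatch rather than a deliberately injected one.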

Demo Projects

Other Open-Sourced DDR3 Controllers

(soon...)

Developer Documentation

There is no developer documentation yet. In the meantime, included here are the notes I compiled during an intensive study of DDR3 before starting this project.

Acknowledgement

This project is funded through NGI0 Entrust, a fund established by NLnet with financial support from the European Commission's Next Generation Internet program. Learn more at the NLnet project page.

<img src="https://nlnet.nl/logo/banner.png" alt="NLnet foundation logo" width="20%" /> <img src="https://nlnet.nl/image/logos/NGI0_tag.svg" alt="NGI Zero Logo" width="20%" />