RiscyOO: RISC-V Out-of-Order Processors

This repository contains the RiscyOO processor written in Bluespec System Verilog (BSV). RiscyOO implements the RISC-V 64-bit instruction set with the IMAFD extensions, i.e., RV64G. It is an out-of-order superscalar cache-coherent multiprocessor which can boot multicore Linux on AWS F1 FPGA. The design document for the processor can be found at https://github.com/csail-csg/RiscyOO_design_doc.

Getting Started on a Local Ubuntu Machine

How to get started with this repository (should work on both Ubuntu 14.04 and 16.04):

Simulation on a Local Ubuntu Machine

Boot Linux on AWS F1 FPGA

Simulation is too slow to boot Linux, so we boot Linux on FPGA. Note that we cross-compile benchmark programs to RISC-V and build Linux images on our local machines instead of on AWS. On AWS, we only compile/synthesize the processor and run it on FPGA.

On AWS, we compile (and synthesize) the design on a C4 (e.g., c4.4xlarge) machine which runs the FPGA Developer AMI (our AMI version is 1.5.0) provided by AWS. After compilation, we run the design on FPGA using an F1 (e.g., f1.2xlarge) machine. As a result, this repo should be cloned to a place shared by C4 and F1. We are using Amazon EFS to share files between C4 and F1 machines.

In general, we build the hardware part of the design on C4, while we build the software part and run the design on F1. (This is mainly because C4 and F1 use different operating systems.) Therefore, the RISC-V tools should be built on the F1 machine. In fact, only tools/riscv-fesvr needs to be built for compiling and running the design.

Setup C4

Most of the setup steps in the Getting Started on a Local Ubuntu Machine section are not needed on C4. Here are the steps to set up C4:

Compilation (and synthesis) of hardware on C4

Setup F1

Most of the setup steps in the Getting Started on a Local Ubuntu Machine section are not needed on F1. Here are the steps to set up F1:

Run on the FPGA of F1

Other build configurations

$RISCY_HOME/procs/RV64G_OOO/Makefile contains several options to configure the build. For example, the makefile can be invoked in the following way to build for C4:

$ cd $RISCY_HOME/procs/RV64G_OOO
$ make gen.awsf1 CORE_NUM=$N DTC_PATH=/usr/bin/dtc CORE_SIZE=<TINY/SMALL/MEDIUM/...> CACHE_SIZE=<LARGE/MC_2MB/...> TSO_MM=<true/false> STORE_PREFETCH=<true/false> CHECK_DEADLOCK=<true/false> USER_CLK_PERIOD=<clock period in ns>

Below are the explanations for these options. Note that these options can also be applied when building for simulation (i.e., for make build.verilator).

As an example, when we build the 4-core TSO multiprocessor on AWS, we invoke the makefile in the following way:

$ cd $RISCY_HOME/procs/RV64G_OOO
$ make gen.awsf1 CORE_NUM=4 DTC_PATH=/usr/bin/dtc CORE_SIZE=TINY CACHE_SIZE=MC_2MB TSO_MM=true STORE_PREFETCH=true USER_CLK_PERIOD=32

Since four OOO cores make the FPGA quite congested, we use the smallest core and cache configurations (TINY and MC_2MB, respectively). We also increase the clock period to 32 ns (a 31.25 MHz clock).
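Because the same options apply to both make gen.awsf1 and make build.verilator, it can help to assemble the command line in one place and review it before launching a long build. The following is a minimal sketch only: riscy_make_cmd is a hypothetical helper that is not part of this repository, it covers just a few of the options shown above, and it does not check whether a given combination of values actually builds.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): assemble the make command
# line for a RiscyOO build and print it, so it can be reviewed before running.
# Usage: riscy_make_cmd <target> <core_num> <core_size> <cache_size> <tso_mm>
riscy_make_cmd() {
    target=$1 core_num=$2 core_size=$3 cache_size=$4 tso_mm=$5

    # Only the two make targets named in this README are accepted.
    case "$target" in
        gen.awsf1|build.verilator) ;;
        *) echo "unknown target: $target" >&2; return 1 ;;
    esac

    # TSO_MM is documented as a true/false switch.
    case "$tso_mm" in
        true|false) ;;
        *) echo "TSO_MM must be true or false" >&2; return 1 ;;
    esac

    # Print the command instead of executing it (dry run).
    echo "make $target CORE_NUM=$core_num CORE_SIZE=$core_size CACHE_SIZE=$cache_size TSO_MM=$tso_mm"
}

# Print (do not run) a single-core simulation build command.
riscy_make_cmd build.verilator 1 TINY LARGE false
```

The printed line is meant to be run from $RISCY_HOME/procs/RV64G_OOO once it looks right. The single-core TINY/LARGE combination is an illustrative choice of values taken from the option lists above, not a tested configuration.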

Performance Counter

To collect performance data, we have deployed many performance counters in the processor design, and these counters can be queried by host software (see $RISCY_HOME/procs/cpp/PerfStats.h). In addition, we added two custom user-level CSRs: the stats CSR (address 0x801) and the terminate CSR (address 0x800). The stats CSR controls whether performance counters are incremented, and a change made to the stats CSR by one core is propagated to all other cores within a few cycles. Any write to the terminate CSR by any core shuts down the processor and sends a message to the host software, which then queries all the performance counters. This is preferable to using ctrl-c to exit Linux, because ctrl-c kills everything and the performance counters are not dumped.

$RISCY_HOME/riscv_custom/riscv_cumstom.h contains C macros to set these two CSRs, and $RISCY_HOME/riscv_custom/terminate contains a simple C program that shuts down the processor using the terminate CSR. This program is always included in the Linux image built using our script (see the Getting Started on a Local Ubuntu Machine section).

Known Issues

Directory Structure

Here we list some important directories:

Obsolete directories: $RISCY_HOME/procs/RV64G_MultiCycle and $RISCY_HOME/procs/RV64G_InOrder contain multi-cycle and in-order processors derived from the OOO processor, respectively. However, both processors are no longer maintained, so please ignore these two folders.

VC707 FPGA

We have not tested the VC707 FPGA for a while, so we recommend using AWS FPGAs.

It is also possible to run the design on a VC707 FPGA, but the VC707 FPGA can only hold 1 core. We connect a VC707 FPGA to an Ubuntu machine through PCIe. The Ubuntu machine should be set up following the steps in the Getting Started on a Local Ubuntu Machine section.

Communication over the PCIe link may not be very stable, because the version of Connectal ($RISCY_HOME/connectal) in this repo is not the latest. These issues have been fixed in the latest version of Connectal, but we have not yet had a chance to upgrade. Therefore, we suggest using AWS if possible.

Publications