2019-07-15 14:53:31 -07:00
|
|
|
# Corundum Readme
|
|
|
|
|
|
|
|
GitHub repository: https://github.com/ucsdsysnet/corundum
|
|
|
|
|
|
|
|
## Introduction
|
|
|
|
|
2019-08-08 12:47:19 -07:00
|
|
|
Corundum is an open-source, high-performance FPGA-based NIC. Features include
|
2019-12-31 22:02:10 -08:00
|
|
|
a high performance datapath, 10G/25G/100G Ethernet, PCI express gen 3, a
|
|
|
|
custom, high performance, tightly-integrated PCIe DMA engine, many (1000+)
|
|
|
|
transmit, receive, completion, and event queues, MSI interrupts, multiple
|
|
|
|
interfaces, multiple ports per interface, per-port transmit scheduling
|
|
|
|
including high precision TDMA, flow hashing, RSS, checksum offloading, and
|
|
|
|
native IEEE 1588 PTP timestamping. A Linux driver is included that integrates
|
|
|
|
with the Linux networking stack. Development and debugging is facilitated by
|
|
|
|
an extensive simulation framework that covers the entire system from a
|
|
|
|
simulation model of the driver and PCI express interface on one side to the
|
|
|
|
Ethernet interfaces on the other side.
|
2019-08-08 12:47:19 -07:00
|
|
|
|
|
|
|
Corundum has several unique architectural features. First, transmit, receive,
|
|
|
|
completion, and event queue states are stored efficiently in block RAM or
|
|
|
|
ultra RAM, enabling support for thousands of individually-controllable
|
|
|
|
queues. These queues are associated with interfaces, and each interface can
|
|
|
|
have multiple ports, each with its own independent scheduler. This enables
|
|
|
|
extremely fine-grained control over packet transmission. Coupled with PTP time
|
2019-08-22 00:57:17 -07:00
|
|
|
synchronization, this enables high precision TDMA.
|
2019-08-08 12:47:19 -07:00
|
|
|
|
2020-03-28 00:49:03 -07:00
|
|
|
Corundum currently supports Xilinx Virtex 7, Ultrascale, and Ultrascale+ series
|
2019-12-06 14:56:54 -08:00
|
|
|
devices. Designs are included for the following FPGA boards:
|
2019-08-08 12:47:19 -07:00
|
|
|
|
2020-03-28 00:49:03 -07:00
|
|
|
* Alpha Data ADM-PCIE-9V3 (Xilinx Virtex Ultrascale+ XCVU3P)
|
2019-08-08 12:47:19 -07:00
|
|
|
* Exablaze ExaNIC X10 (Xilinx Kintex Ultrascale XCKU035)
|
2020-03-28 00:49:03 -07:00
|
|
|
* Exablaze ExaNIC X25 (Xilinx Kintex Ultrascale+ XCKU3P)
|
|
|
|
* NetFPGA SUME (Xilinx Virtex 7 XC7V690T)
|
2019-08-08 12:47:19 -07:00
|
|
|
* Xilinx VCU108 (Xilinx Virtex Ultrascale XCVU095)
|
2020-03-28 00:49:03 -07:00
|
|
|
* Xilinx VCU118 (Xilinx Virtex Ultrascale+ XCVU9P)
|
|
|
|
* Xilinx VCU1525 (Xilinx Virtex Ultrascale+ XCVU9P)
|
2019-07-15 14:53:31 -07:00
|
|
|
|
2019-12-31 22:02:10 -08:00
|
|
|
For operation at 10G and 25G, Corundum uses the open source 10G/25G MAC and
|
|
|
|
PHY modules from the verilog-ethernet repository, no extra licenses are
|
|
|
|
required. However, it is possible to use other MAC and/or PHY modules.
|
|
|
|
Operation at 100G currently requires using the Xilinx CMAC core with RS-FEC
|
|
|
|
enabled, which is covered by the free CMAC license on Xilinx Ultrascale+ parts.
|
|
|
|
|
2019-07-15 14:53:31 -07:00
|
|
|
## Documentation
|
|
|
|
|
|
|
|
### Modules
|
|
|
|
|
2019-12-31 22:02:10 -08:00
|
|
|
#### cmac_pad module
|
|
|
|
|
|
|
|
Frame pad module for 512 bit 100G CMAC TX interface. Zero pads transmit
|
|
|
|
frames to minimum 64 bytes.
|
|
|
|
|
2019-10-19 00:47:00 -07:00
|
|
|
#### cpl_op_mux module
|
|
|
|
|
|
|
|
Completion operation multiplexer module. Merges completion write operations
|
|
|
|
from different sources to enable sharing a single cpl_write module instance.
|
|
|
|
|
2019-07-20 00:56:21 -07:00
|
|
|
#### cpl_queue_manager module
|
2019-07-15 14:53:31 -07:00
|
|
|
|
2019-07-20 00:56:21 -07:00
|
|
|
Completion queue manager module. Stores device to host queue state in block
|
|
|
|
RAM or ultra RAM.
|
|
|
|
|
2019-10-19 00:47:00 -07:00
|
|
|
#### cpl_write module
|
2019-07-20 00:56:21 -07:00
|
|
|
|
2019-10-19 00:47:00 -07:00
|
|
|
Completion write module. Responsible for writing completion and event entries
|
|
|
|
into host memory.
|
|
|
|
|
|
|
|
#### desc_fetch module
|
2019-07-20 00:56:21 -07:00
|
|
|
|
2019-10-19 00:47:00 -07:00
|
|
|
Descriptor fetch module. Responsible for reading descriptors from host memory.
|
2019-07-20 00:56:21 -07:00
|
|
|
|
2019-10-19 00:47:00 -07:00
|
|
|
#### desc_op_mux module
|
|
|
|
|
|
|
|
Descriptor operation multiplexer module. Merges descriptor fetch operations
|
|
|
|
from different sources to enable sharing a single cpl_write module instance.
|
|
|
|
|
|
|
|
#### event_mux module
|
|
|
|
|
|
|
|
Event mux module. Enables multiple event sources to feed the same event queue.
|
2019-07-20 00:56:21 -07:00
|
|
|
|
|
|
|
#### interface module
|
|
|
|
|
|
|
|
Interface module. Contains the event queues, interface queues, and ports.
|
|
|
|
|
|
|
|
#### port module
|
|
|
|
|
|
|
|
Port module. Contains the transmit and receive engines
|
|
|
|
|
|
|
|
#### queue_manager module
|
|
|
|
|
|
|
|
Queue manager module. Stores host to device queue state in block RAM or ultra
|
|
|
|
RAM.
|
|
|
|
|
|
|
|
#### rx_checksum module
|
|
|
|
|
|
|
|
Receive checksum computation module. Computes 16 bit checksum of Ethernet
|
|
|
|
frame payload to aid in IP checksum offloading.
|
|
|
|
|
|
|
|
#### rx_engine module
|
|
|
|
|
|
|
|
Receive engine. Manages receive descriptor dequeue and fetch via DMA, packet
|
|
|
|
reception, data writeback via DMA, and completion enqueue and writeback via
|
|
|
|
DMA. Handles PTP timestamps for inclusion in completion records.
|
|
|
|
|
2019-12-06 14:56:54 -08:00
|
|
|
#### rx_hash module
|
|
|
|
|
|
|
|
Receive hash computation module. Extracts IP addresses and ports from packet
|
|
|
|
headers and computes 32 bit Toeplitz flow hash.
|
|
|
|
|
2019-07-20 00:56:21 -07:00
|
|
|
#### tdma_ber_ch module
|
|
|
|
|
|
|
|
TDMA bit error ratio test channel module. Controls PRBS logic in Ethernet PHY
|
|
|
|
and accumulates bit errors. Can be configured to bin error counts by TDMA
|
|
|
|
timeslot.
|
|
|
|
|
|
|
|
#### tdma_ber module
|
|
|
|
|
|
|
|
TDMA bit error ratio test module. Wrapper for a tdma_scheduler and multiple
|
|
|
|
instances of tdma_ber_ch.
|
|
|
|
|
|
|
|
#### tdma_scheduler module
|
|
|
|
|
|
|
|
TDMA scheduler module. Generates TDMA timeslot index and timing signals from
|
|
|
|
PTP time.
|
|
|
|
|
2019-08-22 00:57:17 -07:00
|
|
|
#### tx_checksum module
|
|
|
|
|
|
|
|
Transmit checksum computation and insertion module. Computes 16 bit checksum
|
|
|
|
of frame data with specified start offset, then inserts computed checksum at
|
|
|
|
the specified position.
|
|
|
|
|
2019-07-20 00:56:21 -07:00
|
|
|
#### tx_engine module
|
|
|
|
|
|
|
|
Transmit engine. Manages receive descriptor dequeue and fetch via DMA, packet
|
|
|
|
data fetch via DMA, packet transmission, and completion enqueue and writeback
|
|
|
|
via DMA. Handles PTP timestamps for inclusion in completion records.
|
|
|
|
|
2019-11-05 22:13:26 -08:00
|
|
|
#### tx_scheduler_ctrl_tdma module
|
|
|
|
|
|
|
|
TDMA transmit scheduler control module. Controls queues in a transmit
|
|
|
|
scheduler based on PTP time, via a tdma_scheduler instance.
|
|
|
|
|
2019-07-20 00:56:21 -07:00
|
|
|
#### tx_scheduler_rr module
|
|
|
|
|
|
|
|
Round-robin transmit scheduler. Determines which queues from which to send
|
|
|
|
packets.
|
|
|
|
|
2019-07-15 14:53:31 -07:00
|
|
|
### Source Files
|
|
|
|
|
2019-12-31 22:02:10 -08:00
|
|
|
cmac_pad.v : Pad frames to 64 bytes for CMAC TX
|
2019-10-19 00:47:00 -07:00
|
|
|
cpl_op_mux.v : Completion operation mux
|
2019-07-20 00:56:21 -07:00
|
|
|
cpl_queue_manager.v : Completion queue manager
|
2019-10-19 00:47:00 -07:00
|
|
|
cpl_write.v : Completion write module
|
|
|
|
desc_fetch.v : Descriptor fetch module
|
|
|
|
desc_op_mux.v : Descriptor operation mux
|
2019-07-20 00:56:21 -07:00
|
|
|
event_mux.v : Event mux
|
|
|
|
event_queue.v : Event queue
|
|
|
|
interface.v : Interface
|
|
|
|
port.v : Port
|
|
|
|
queue_manager.v : Queue manager
|
|
|
|
rx_checksum.v : Receive checksum offload
|
|
|
|
rx_engine.v : Receive engine
|
2019-12-06 14:56:54 -08:00
|
|
|
rx_hash.v : Receive hashing module
|
2019-07-20 00:56:21 -07:00
|
|
|
tdma_ber_ch.v : TDMA BER channel
|
|
|
|
tdma_ber.v : TDMA BER
|
|
|
|
tdma_scheduler.v : TDMA scheduler
|
2019-08-22 00:57:17 -07:00
|
|
|
tx_checksum.v : Transmit checksum offload
|
2019-07-20 00:56:21 -07:00
|
|
|
tx_engine.v : Transmit engine
|
2019-11-05 22:13:26 -08:00
|
|
|
tx_scheduler_ctrl_tdma.v : TDMA transmit scheduler controller
|
2019-07-20 00:56:21 -07:00
|
|
|
tx_scheduler_rr.v : Round robin transmit scheduler
|
2019-07-15 14:53:31 -07:00
|
|
|
|
|
|
|
## Testing
|
|
|
|
|
|
|
|
Running the included testbenches requires MyHDL and Icarus Verilog. Make sure
|
|
|
|
that myhdl.vpi is installed properly for cosimulation to work correctly. The
|
|
|
|
testbenches can be run with a Python test runner like nose or py.test, or the
|
|
|
|
individual test scripts can be run with python directly.
|
|
|
|
|
|
|
|
### Testbench Files
|
|
|
|
|
2019-07-20 00:56:21 -07:00
|
|
|
tb/axi.py : MyHDL AXI4 master and memory BFM
|
|
|
|
tb/axil.py : MyHDL AXI4 lite master and memory BFM
|
2019-07-15 14:53:31 -07:00
|
|
|
tb/axis_ep.py : MyHDL AXI Stream endpoints
|
2019-07-20 00:56:21 -07:00
|
|
|
tb/eth_ep.py : MyHDL Ethernet frame endpoints
|
|
|
|
tb/ip_ep.py : MyHDL IP frame endpoints
|
|
|
|
tb/mqnic.py : MyHDL mqnic driver model
|
2019-07-15 14:53:31 -07:00
|
|
|
tb/pcie.py : MyHDL PCI Express BFM
|
|
|
|
tb/pcie_us.py : MyHDL Xilinx Ultrascale PCIe core model
|
2020-03-28 00:49:03 -07:00
|
|
|
tb/pcie_usp.py : MyHDL Xilinx Ultrascale+ PCIe core model
|
2019-07-20 00:56:21 -07:00
|
|
|
tb/ptp.py : MyHDL PTP clock model
|
|
|
|
tb/udp_ep.py : MyHDL UDP frame endpoints
|
|
|
|
tb/xgmii_ep.py : MyHDL XGMII endpoints
|
2019-08-08 12:47:19 -07:00
|
|
|
|
2020-03-26 11:54:48 -07:00
|
|
|
## Publications
|
|
|
|
|
|
|
|
- A. Forencich, A. C. Snoeren, G. Porter, G. Papen, *Corundum: An Open-Source 100G NIC,* in FCCM'20
|
|
|
|
|
|
|
|
## Citation
|
|
|
|
If you use Corundum in your project please cite one of the following papers
|
|
|
|
and/or link to the github project:
|
|
|
|
|
|
|
|
```
|
|
|
|
@inproceedings{forencich2020fccm,
|
|
|
|
author = {Alex Forencich and Alex C. Snoeren and George Porter and George Papen},
|
|
|
|
title = {Corundum: An Open-Source {100G} {NIC}},
|
|
|
|
booktitle = {28th IEEE International Symposium on Field-Programmable Custom Computing Machines},
|
|
|
|
year = {2020},
|
|
|
|
}
|
|
|
|
```
|
|
|
|
|
2019-08-08 12:47:19 -07:00
|
|
|
## Dependencies
|
|
|
|
|
|
|
|
Corundum internally uses the following libraries:
|
|
|
|
|
|
|
|
* https://github.com/alexforencich/verilog-axi
|
|
|
|
* https://github.com/alexforencich/verilog-axis
|
|
|
|
* https://github.com/alexforencich/verilog-ethernet
|
|
|
|
* https://github.com/alexforencich/verilog-pcie
|
|
|
|
* https://github.com/solemnwarning/timespec
|
|
|
|
|