# Corundum Readme

[![Build Status](https://github.com/corundum/corundum/workflows/Regression%20Tests/badge.svg?branch=master)](https://github.com/corundum/corundum/actions/)

GitHub repository: https://github.com/corundum/corundum

Documentation: https://docs.corundum.io/

GitHub wiki: https://github.com/corundum/corundum/wiki

Google group: https://groups.google.com/d/forum/corundum-nic

Zulip: https://corundum.zulipchat.com/

## Introduction

Corundum is an open-source, high-performance FPGA-based NIC and platform for in-network compute. Features include a high-performance datapath, 10G/25G/100G Ethernet, PCI Express Gen 3, a custom, high-performance, tightly-integrated PCIe DMA engine, many (1000+) transmit, receive, completion, and event queues, scatter/gather DMA, MSI interrupts, multiple interfaces, multiple ports per interface, per-port transmit scheduling including high-precision TDMA, flow hashing, RSS, checksum offloading, and native IEEE 1588 PTP timestamping. A Linux driver is included that integrates with the Linux networking stack. Development and debugging are facilitated by an extensive simulation framework that covers the entire system, from a simulation model of the driver and PCI Express interface on one side to the Ethernet interfaces on the other.

Corundum has several unique architectural features. First, transmit, receive, completion, and event queue states are stored efficiently in block RAM or ultra RAM, enabling support for thousands of individually-controllable queues. These queues are associated with interfaces, and each interface can have multiple ports, each with its own independent scheduler. This enables extremely fine-grained control over packet transmission. Coupled with PTP time synchronization, this enables high-precision TDMA.

Corundum also provides an application section for implementing custom logic. The application section has a dedicated PCIe BAR for control and a number of interfaces that provide access to the core datapath and DMA infrastructure.

Corundum currently supports devices from both Xilinx and Intel, on boards from several different manufacturers. Designs are included for the following FPGA boards:

* Alpha Data ADM-PCIE-9V3 (Xilinx Virtex UltraScale+ XCVU3P)
* Dini Group DNPCIe_40G_KU_LL_2QSFP (Xilinx Kintex UltraScale XCKU040)
* Cisco Nexus K35-S (Xilinx Kintex UltraScale XCKU035)
* Cisco Nexus K3P-S (Xilinx Kintex UltraScale+ XCKU3P)
* Cisco Nexus K3P-Q (Xilinx Kintex UltraScale+ XCKU3P)
* Silicom fb2CG@KU15P (Xilinx Kintex UltraScale+ XCKU15P)
* NetFPGA SUME (Xilinx Virtex 7 XC7V690T)
* BittWare 250-SoC (Xilinx Zynq UltraScale+ XCZU19EG)
* BittWare XUSP3S (Xilinx Virtex UltraScale XCVU095)
* BittWare XUP-P3R (Xilinx Virtex UltraScale+ XCVU9P)
* Intel Stratix 10 MX dev kit (Intel Stratix 10 MX 2100)
* Intel Stratix 10 DX dev kit (Intel Stratix 10 DX 2800)
* Intel Agilex F dev kit (Intel Agilex F 014)
* Terasic DE10-Agilex (Intel Agilex F 014)
* Xilinx Alveo U50 (Xilinx Virtex UltraScale+ XCU50)
* Xilinx Alveo U200 (Xilinx Virtex UltraScale+ XCU200)
* Xilinx Alveo U250 (Xilinx Virtex UltraScale+ XCU250)
* Xilinx Alveo U280 (Xilinx Virtex UltraScale+ XCU280)
* Xilinx Kria KR260 (Xilinx Zynq UltraScale+ XCK26)
* Xilinx VCU108 (Xilinx Virtex UltraScale XCVU095)
* Xilinx VCU118 (Xilinx Virtex UltraScale+ XCVU9P)
* Xilinx VCU1525 (Xilinx Virtex UltraScale+ XCVU9P)
* Xilinx ZCU102 (Xilinx Zynq UltraScale+ XCZU9EG)
* Xilinx ZCU106 (Xilinx Zynq UltraScale+ XCZU7EV)

For operation at 10G and 25G, Corundum uses the open-source 10G/25G MAC and PHY modules from the verilog-ethernet repository; no extra licenses are required. However, it is possible to use other MAC and/or PHY modules.

Operation at 100G on Xilinx UltraScale+ devices currently requires using the Xilinx CMAC core with RS-FEC enabled, which is covered by the free CMAC license.

## Documentation

For detailed documentation, see https://docs.corundum.io/

### Block Diagram

![Corundum block diagram](docs/source/diagrams/svg/corundum_block.svg)

Block diagram of the Corundum NIC. PCIe HIP: PCIe hard IP core; AXIL M: AXI lite master; DMA IF: DMA interface; AXI M: AXI master; PHC: PTP hardware clock; TXQ: transmit queue manager; TXCQ: transmit completion queue manager; RXQ: receive queue manager; RXCQ: receive completion queue manager; EQ: event queue manager; MAC + PHY: Ethernet media access controller (MAC) and physical interface layer (PHY).

### Modules

#### `cmac_pad` module

Frame pad module for the 512-bit 100G CMAC TX interface. Zero-pads transmit frames to a minimum of 64 bytes.
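
The padding behavior can be pictured with a minimal Python sketch. This is a software model only, not the RTL: the function name `pad_frame` is hypothetical, and whether the 64-byte minimum counts the FCS is not stated here, so the length is left as a parameter.

```python
MIN_FRAME_LEN = 64  # minimum length quoted in the module description


def pad_frame(frame: bytes, min_len: int = MIN_FRAME_LEN) -> bytes:
    """Zero-pad a frame up to min_len bytes; longer frames pass through unchanged."""
    if len(frame) >= min_len:
        return frame
    # bytes(n) yields n zero bytes, which are appended after the frame data.
    return frame + bytes(min_len - len(frame))
```
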

#### `cpl_op_mux` module

Completion operation multiplexer module. Merges completion write operations from different sources to enable sharing a single `cpl_write` module instance.

#### `cpl_queue_manager` module

Completion queue manager module. Stores device-to-host queue state in block RAM or ultra RAM.

#### `cpl_write` module

Completion write module. Responsible for enqueuing completion and event records into the completion queue managers and writing records into host memory via DMA.

#### `desc_fetch` module

Descriptor fetch module. Responsible for dequeuing descriptors from the queue managers and reading descriptors from host memory via DMA.

#### `desc_op_mux` module

Descriptor operation multiplexer module. Merges descriptor fetch operations from different sources to enable sharing a single `desc_fetch` module instance.

#### `event_mux` module

Event mux module. Enables multiple event sources to feed the same event queue.

#### `mqnic_core` module

Core module. Contains the interfaces, asynchronous FIFOs, PTP subsystem, statistics collection subsystem, and application block.

#### `mqnic_core_pcie` module

Core module for a PCIe host interface. Wraps `mqnic_core` along with generic PCIe interface components, including DMA engine and AXI lite masters.

#### `mqnic_core_pcie_us` module

Core module for a PCIe host interface on Xilinx 7-series, UltraScale, and UltraScale+. Wraps `mqnic_core_pcie` along with FPGA-specific interface logic.

#### `mqnic_interface` module

Interface module. Contains the event queues, interface queues, and ports.

#### `mqnic_port` module

Port module. Contains the transmit and receive datapath components, including transmit and receive engines and checksum and hash offloading.

#### `mqnic_ptp` module

PTP subsystem. Contains one `mqnic_ptp_clock` instance and a parametrizable number of `mqnic_ptp_perout` instances.

#### `mqnic_ptp_clock` module

PTP clock module. Contains an instance of `ptp_clock` with a register interface.

#### `mqnic_ptp_perout` module

PTP period output module. Contains an instance of `ptp_perout` with a register interface.

#### `mqnic_tx_scheduler_block_rr` module

Transmit scheduler block with round-robin transmit scheduler and register interface.

#### `mqnic_tx_scheduler_block_rr_tdma` module

Transmit scheduler block with round-robin transmit scheduler, TDMA scheduler, TDMA scheduler controller, and register interface.

#### `queue_manager` module

Queue manager module. Stores host-to-device queue state in block RAM or ultra RAM.

#### `rx_checksum` module

Receive checksum computation module. Computes a 16-bit checksum of the Ethernet frame payload to aid in IP checksum offloading.
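
As a rough software model of this kind of computation, the sketch below sums a frame's payload with 16-bit ones' complement arithmetic in the style of RFC 1071. It rests on two assumptions that are not confirmed here: that the hardware uses this Internet-checksum style sum, and that the payload begins after a 14-byte untagged Ethernet header. The function names are hypothetical.

```python
ETH_HEADER_LEN = 14  # assumption: untagged header (dst MAC, src MAC, EtherType)


def ones_complement_sum16(data: bytes) -> int:
    """Return the 16-bit ones' complement sum of data (RFC 1071 style, not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a trailing zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def rx_payload_checksum(frame: bytes) -> int:
    """Checksum everything after the Ethernet header of a received frame."""
    return ones_complement_sum16(frame[ETH_HEADER_LEN:])
```
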

#### `rx_engine` module

Receive engine. Manages receive datapath operations including descriptor dequeue and fetch via DMA, packet reception, data writeback via DMA, and completion enqueue and writeback via DMA. Handles PTP timestamps for inclusion in completion records.

#### `rx_hash` module

Receive hash computation module. Extracts IP addresses and ports from packet headers and computes a 32-bit Toeplitz flow hash.
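
For readers unfamiliar with Toeplitz hashing, here is a minimal Python sketch of the general technique. It assumes the widely used default key from Microsoft's RSS documentation and the conventional field order (source address, destination address, source port, destination port); the key and field order used by the hardware are not stated here, and the function names are hypothetical.

```python
import socket
import struct

# Assumption: commonly used default RSS key, not necessarily the hardware's key.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)


def toeplitz_hash(data: bytes, key: bytes = RSS_KEY) -> int:
    """Compute a 32-bit Toeplitz hash of data using key."""
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least 4 bytes longer than the input")
    key_int = int.from_bytes(key, "big")
    key_bits = len(key) * 8
    result = 0
    for i, byte in enumerate(data):
        for bit in range(8):
            if byte & (0x80 >> bit):
                # XOR in the 32-bit key window starting at this input bit position.
                shift = key_bits - 32 - (i * 8 + bit)
                result ^= (key_int >> shift) & 0xFFFFFFFF
    return result


def flow_hash_ipv4(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Hash an IPv4 4-tuple in the conventional RSS field order."""
    data = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
    data += struct.pack("!HH", src_port, dst_port)
    return toeplitz_hash(data)
```

Because every set input bit simply XORs in a shifted window of the key, the hash is linear over XOR, which is what makes it cheap to compute in parallel in hardware.
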

#### `stats_collect` module

Statistics collector module. Parametrizable number of increment inputs, single AXI stream output for accumulated counts.

#### `stats_counter` module

Statistics counter module. Receives increments over AXI stream and accumulates them in block RAM, which is accessible via AXI lite.

#### `stats_dma_if_pcie` module

Collects DMA-related statistics for `dma_if_pcie` module, including operation latency.

#### `stats_dma_latency` module

DMA latency measurement module.

#### `stats_pcie_if` module

Collects TLP-level statistics for the generic PCIe interface.

#### `stats_pcie_tlp` module

Extracts TLP-level statistics for the generic PCIe interface (single channel).

#### `tdma_ber_ch` module

TDMA bit error ratio (BER) test channel module. Controls PRBS logic in Ethernet PHY and accumulates bit errors. Can be configured to bin error counts by TDMA timeslot.

#### `tdma_ber` module

TDMA bit error ratio (BER) test module. Wrapper for a `tdma_scheduler` and multiple instances of `tdma_ber_ch`.

#### `tdma_scheduler` module

TDMA scheduler module. Generates TDMA timeslot index and timing signals from PTP time.
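
One way to picture the mapping from PTP time to a timeslot index is the Python sketch below. The schedule parameters (start time, schedule period, timeslot period, active period), their names, and the nanosecond units are all assumptions made for illustration; the module's actual register set is not described here, and hardware would more plausibly track these boundaries with counters than with modulo arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TdmaSchedule:
    # All field names and units below are hypothetical.
    start_ns: int            # PTP time at which the first schedule period begins
    schedule_period_ns: int  # length of one full schedule
    timeslot_period_ns: int  # length of one timeslot
    active_period_ns: int    # portion of each timeslot during which it is active


def tdma_state(ptp_time_ns: int, sched: TdmaSchedule):
    """Return (timeslot_index, active) for a PTP time, or None before the start."""
    if ptp_time_ns < sched.start_ns:
        return None
    # Position within the current schedule period.
    offset = (ptp_time_ns - sched.start_ns) % sched.schedule_period_ns
    # Split that position into a slot number and an offset inside the slot.
    index, within_slot = divmod(offset, sched.timeslot_period_ns)
    return index, within_slot < sched.active_period_ns
```

For example, with a 400 ns schedule of 100 ns timeslots that are each active for 80 ns, a time 185 ns after the start falls in timeslot 1, outside its active window.
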

#### `tx_checksum` module

Transmit checksum computation and insertion module. Computes a 16-bit checksum of frame data with the specified start offset, then inserts the computed checksum at the specified position.

#### `tx_engine` module

Transmit engine. Manages transmit datapath operations including descriptor dequeue and fetch via DMA, packet data fetch via DMA, packet transmission, and completion enqueue and writeback via DMA. Handles PTP timestamps for inclusion in completion records.

#### `tx_scheduler_ctrl_tdma` module

TDMA transmit scheduler control module. Controls queues in a transmit scheduler based on PTP time, via a `tdma_scheduler` instance.

#### `tx_scheduler_rr` module

Round-robin transmit scheduler. Determines which queues to send packets from.
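
The general idea of round-robin scheduling across many queues can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of the RTL: the class and method names are hypothetical, and it sends exactly one packet per turn.

```python
from collections import deque


class RoundRobinScheduler:
    """Toy round-robin scheduler over queue indices (names are hypothetical)."""

    def __init__(self):
        self._ready = deque()  # active queue indices in service order
        self._pending = {}     # queue index -> number of packets waiting

    def enqueue(self, queue: int, packets: int = 1) -> None:
        """Record new packets on a queue; activate the queue if it was idle."""
        if queue not in self._pending:
            self._ready.append(queue)
            self._pending[queue] = 0
        self._pending[queue] += packets

    def next_queue(self):
        """Pick the next queue to send one packet from, or None if all are idle."""
        if not self._ready:
            return None
        queue = self._ready.popleft()
        self._pending[queue] -= 1
        if self._pending[queue] > 0:
            self._ready.append(queue)  # still has work: rejoin at the back
        else:
            del self._pending[queue]
        return queue
```

Keeping only active queues in the service list means idle queues cost nothing per scheduling decision, which matters when thousands of queues exist but only a few are busy.
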

### Source Files

    cmac_pad.v                         : Pad frames to 64 bytes for CMAC TX
    cpl_op_mux.v                       : Completion operation mux
    cpl_queue_manager.v                : Completion queue manager
    cpl_write.v                        : Completion write module
    desc_fetch.v                       : Descriptor fetch module
    desc_op_mux.v                      : Descriptor operation mux
    event_mux.v                        : Event mux
    event_queue.v                      : Event queue
    mqnic_core.v                       : Core logic
    mqnic_core_pcie.v                  : Core logic for PCIe
    mqnic_core_pcie_us.v               : Core logic for PCIe (UltraScale)
    mqnic_interface.v                  : Interface
    mqnic_port.v                       : Port
    mqnic_ptp.v                        : PTP subsystem
    mqnic_ptp_clock.v                  : PTP clock wrapper
    mqnic_ptp_perout.v                 : PTP period output wrapper
    mqnic_tx_scheduler_block_rr.v      : Scheduler block (round-robin)
    mqnic_tx_scheduler_block_rr_tdma.v : Scheduler block (round-robin TDMA)
    queue_manager.v                    : Queue manager
    rx_checksum.v                      : Receive checksum offload
    rx_engine.v                        : Receive engine
    rx_hash.v                          : Receive hashing module
    stats_collect.v                    : Statistics collector
    stats_counter.v                    : Statistics counter
    stats_dma_if_pcie.v                : DMA interface statistics
    stats_dma_latency.v                : DMA latency measurement
    stats_pcie_if.v                    : PCIe interface statistics
    stats_pcie_tlp.v                   : PCIe TLP statistics
    tdma_ber_ch.v                      : TDMA BER channel
    tdma_ber.v                         : TDMA BER
    tdma_scheduler.v                   : TDMA scheduler
    tx_checksum.v                      : Transmit checksum offload
    tx_engine.v                        : Transmit engine
    tx_scheduler_ctrl_tdma.v           : TDMA transmit scheduler controller
    tx_scheduler_rr.v                  : Round robin transmit scheduler

## Testing

Running the included testbenches requires [cocotb](https://github.com/cocotb/cocotb), [cocotbext-axi](https://github.com/alexforencich/cocotbext-axi), [cocotbext-eth](https://github.com/alexforencich/cocotbext-eth), [cocotbext-pcie](https://github.com/alexforencich/cocotbext-pcie), [scapy](https://scapy.net/), and [Icarus Verilog](http://iverilog.icarus.com/). The testbenches can be run with pytest directly (requires [cocotb-test](https://github.com/themperek/cocotb-test)), pytest via tox, or via cocotb makefiles.

## Publications

- A. Forencich, A. C. Snoeren, G. Porter, G. Papen, *Corundum: An Open-Source 100-Gbps NIC,* in FCCM'20. ([FCCM Paper](https://www.cse.ucsd.edu/~snoeren/papers/corundum-fccm20.pdf), [FCCM Presentation](https://www.fccm.org/past/2020/forums/topic/corundum-an-open-source-100-gbps-nic/))

- J. A. Forencich, *System-Level Considerations for Optical Switching in Data Center Networks*. ([Thesis](https://escholarship.org/uc/item/3mc9070t))

## Citation

If you use Corundum in your project, please cite one of the following papers
and/or link to the project on GitHub:

```
@inproceedings{forencich2020fccm,
    author = {Alex Forencich and Alex C. Snoeren and George Porter and George Papen},
    title = {Corundum: An Open-Source {100-Gbps} {NIC}},
    booktitle = {28th IEEE International Symposium on Field-Programmable Custom Computing Machines},
    year = {2020},
}

@phdthesis{forencich2020thesis,
    author = {John Alexander Forencich},
    title = {System-Level Considerations for Optical Switching in Data Center Networks},
    school = {UC San Diego},
    year = {2020},
    url = {https://escholarship.org/uc/item/3mc9070t},
}
```

## Dependencies

Corundum internally uses the following libraries:

* https://github.com/alexforencich/verilog-axi
* https://github.com/alexforencich/verilog-axis
* https://github.com/alexforencich/verilog-ethernet
* https://github.com/alexforencich/verilog-pcie
* https://github.com/solemnwarning/timespec