2015-07-14 09:43:51 -07:00
\documentclass { refrep}
\settextfraction { .9}
\usepackage { extramarks}
\usepackage { amsmath}
\usepackage { amsthm}
\usepackage { amsfonts}
\usepackage { tikz}
\usepackage [plain] { algorithm}
\usepackage { algpseudocode}
\usepackage { listings}
\usepackage { datetime}
\usepackage { xcolor}
\usepackage { float}
\usepackage { graphicx}
\usepackage [export] { adjustbox}
\usepackage { caption}
\usepackage { contour}
\makeatletter
\newcount \dirtree @lvl
\newcount \dirtree @plvl
\newcount \dirtree @clvl
\def \dirtree @growth{
\ifnum \tikznumberofcurrentchild =1\relax
\global \advance \dirtree @plvl by 1
\expandafter \xdef \csname dirtree@p@\the \dirtree @plvl\endcsname { \the \dirtree @lvl}
\fi
\global \advance \dirtree @lvl by 1\relax
\dirtree @clvl=\dirtree @lvl
\advance \dirtree @clvl by -\csname dirtree@p@\the \dirtree @plvl\endcsname
\pgf @xa=.25cm\relax
\pgf @ya=-.5cm\relax
\pgf @ya=\dirtree @clvl\pgf @ya
\pgftransformshift { \pgfqpoint { \the \pgf @xa} { \the \pgf @ya} } %
\ifnum \tikznumberofcurrentchild =\tikznumberofchildren
\global \advance \dirtree @plvl by -1
\fi
}
\tikzset {
dirtree/.style={
growth function=\dirtree @growth,
every node/.style={ anchor=north} ,
every child node/.style={ anchor=west} ,
edge from parent path={ (\tikzparentnode \tikzparentanchor ) |- (\tikzchildnode \tikzchildanchor )}
}
}
\makeatother
\graphicspath { { images/vc707/} { images/vc709/} { images/qsys/} { images/ipc/} { images/perf/} { images/} }
\newcommand { \figurewidth } { 350px}
\newcommand { \VivadoVer } { 2014.4}
\newcommand { \QuartusVer } { 14.1}
\newcommand { \RIFFAVer } { 2.2.0}
\newcommand { \Directory } [1]{ \textit { #1} }
\newcommand { \EnvVariable } [1]{ \textbf { #1} }
\newcommand { \TermCmd } [1]{ \$ \texttt { #1} }
\newcommand { \TODO } [1]{ \textbf { \color { red} { #1} } }
\newcommand { \Xilinx } [1]{ { \color { red} { #1} } }
\newcommand { \Altera } [1]{ { \color { blue} { #1} } }
\newcommand { \ConfigSetting } [1]{ \textbf { #1} }
\newcommand { \VCSevenOhNineExampleDesignsLong } { VC709\_ Gen1x8If64 (PCIe Gen1, 8 lanes, 64-bit CHNL interface), VC709\_ Gen2x8If128 (PCIe Gen2, 8 lanes, 128-bit CHNL interface), VC709\_ Gen3x4If128 (PCIe Gen3, 8 lanes, 128-bit CHNL interface)}
\newcommand { \VCSevenOhNineExampleDesignsShort } { VC709\_ Gen1x8If64, VC709\_ Gen2x8If128, VC709\_ Gen3x4If128}
\newcommand { \RIFFAParameter } [1]{ \textit { \textbf { #1} } }
\title { RIFFA \RIFFAVer ~ Documentation}
\author { Dustin Richmond, Matt Jacobsen}
\date { \today }
\begin { document}
\maketitle
\pagebreak
\tableofcontents
\chapter { Introduction: RIFFA}
\section { What is RIFFA}
RIFFA (Reusable Integration Framework for FPGA Accelerators) is a simple
framework for communicating data from a host CPU to a FPGA via a PCI Express
bus. The framework requires a PCIe enabled workstation and a FPGA on a board
with a PCIe connector. RIFFA supports Windows and Linux, Altera and Xilinx, with
bindings in C/C++, Python, MATLAB and Java.
On the software side there are two main functions: data send and data
receive. These functions are exposed via user libraries in C/C++, Python,
MATLAB, and Java. The driver supports multiple FPGAs (up to 5) per system. The
software bindings work on Linux and Windows operating systems. Users can
communicate with FPGA IP cores by writing only a few lines of code.
On the hardware side, users access an interface with independent transmit and
receive signals. The signals provide transaction handshaking and a first word
fall through FIFO interface for reading/writing data to the host. No knowledge
of bus addresses, buffer sizes, or PCIe packet formats is required. Simply send
data on a FIFO interface and receive data on a FIFO interface. RIFFA does not
rely on a PCIe Bridge and therefore is not subject to the limitations of a
bridge implementation. Instead, RIFFA works directly with the PCIe Endpoint and
can run fast enough to saturate the PCIe link.
RIFFA communicates data using direct memory access (DMA) transfers and interrupt
signaling. This achieves high bandwidth over the PCIe link. In our tests we are
able to saturate (or near saturate) the link in all our tests. The RIFFA distribution
contains examples and guides for setting up designs on several standard development
boards.
\begin { figure} [H]
\includegraphics [width=300px,center] { bw_ sg_ 2_ 1_ maxs.png}
\caption { Graph of Bandwidth vs Transfer Size}
\label { Fig:RIFFA:Performance}
\end { figure}
RIFFA \RIFFAVer ~is significantly more efficient than its predecesor RIFFA
1.0. RIFFA \RIFFAVer ~is able to saturate the PCIe link for nearly all link
configurations supported. Figure~\ref { Fig:RIFFA:Performance} shows the
performance of designs using the 32 bit, 64 bit, and 128 bit interfaces. The
colored bands show the bandwidth region between the theoretical maximum and the
maximum achievable. PCIe Gen 1 and 2 use 8 bit / 10 bit encoding which limits
the maximum achievable bandwidth to 80\% of the theoretical. Our experiments
show that RIFFA can achieve 80\% of the theoretical bandwidth in nearly all
cases. The 128 bit interface achieves 76\% of the theoretical maximum.
If you are using RIFFA on a new platform not listed above let us know and we’ ll
help you out!
%\pagebreak
\section { Licensing}
2015-07-15 15:53:19 -07:00
2016-02-09 15:23:37 -08:00
Copyright (c) 2016, The Regents of the University of California All rights
2015-07-15 15:53:19 -07:00
reserved.
Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:
\begin { itemize}
\item Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
\item Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following
disclaimer in the documentation and/or other materials provided
with the distribution.
\item Neither the name of The Regents of the University of California
nor the names of its contributors may be used to endorse or
promote products derived from this software without specific
prior written permission.
\end { itemize}
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS''
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL REGENTS OF THE UNIVERSITY OF CALIFORNIA BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
2015-07-14 09:43:51 -07:00
\pagebreak
\chapter { Getting Started}
\section { Development Board Support in RIFFA \RIFFAVer }
RIFFA \RIFFAVer ~supports:
\begin { itemize}
\item The VC707, ZC706 and similar boards with the Xilinx IP Core 7-Series
Integrated Block for PCI Express. Example designs for the VC707 and ZC706
boards are provided, and contain this core. The current distribution supports
all 64-bit interfaces for these devices, with 128-bit support coming soon
after the initial release. (Support for the 128-bit interface is in RIFFA 2.1,
but is temporarily mising due to changes)
\item The VC709 board and similar boards with the Xilinx IP Gen3 Integrated
Block for PCI Express. Example designs for the VC709 are provided, and contain
this core. The current distribution supports all 64-bit and 128-bit AXI
interfaces. 256-bit (PCIe Gen3 x8) support is planned for a later date.
\item The DE5-Net board and similar boards with the Stratix V, Cyclone V, and
Arria V, Hard IP for PCI express (Avalon Streaming Interface). Example designs
for the DE5-net board are provided, and contain the Stratix V version of this
core. The current distribution supports all 64-bit and 128-bit Avalon
Streaming interfaces.
\item The DE4 and similar boards with the IP Compiler for PCI Express Core,
supporting Stratix IV, Cyclone IV and Arria II devices. Example designs for
the DE4 board are provided. The current distribution supports all 64-bit and
128-bit Avalon Streaming interfaces.
\end { itemize}
\section { Understanding this User Guide}
In this user guide, we use the following conventions:
\begin { center}
\begin { tabular} { | l | l |}
\hline
Object & Example \\ \hline
Directories and Paths & \Directory { RIFFA \RIFFAVer /source/fpga/riffa} \\ \hline
Xilinx Specific Content & \Xilinx { vc709} \\ \hline
Altera Specific Content & \Altera { de5} \\ \hline
Configuration Setting & \ConfigSetting { Number of Lanes} \\ \hline
Terminal Command, Code Snippet & \TermCmd { echo ``Hello World''} \\ \hline
RIFFA Parameter & \RIFFAParameter { C\_ NUM\_ CHNL} \\ \hline
\end { tabular}
\end { center}
\section { Decoding What's Provided}
\label { Sec:Intro:Decoding}
Fig~\ref { Fig:RIFFA:DirStructure} shows the directory hierarchy of RIFFA. This
instruction manual uses this directory tree when specifying all directory paths.
The \Directory { RIFFA \RIFFAVer /source/fpga/} contains a directory for each board
we have tested for the current distribution:\Altera { de5, de4} , \Xilinx { VC709,
VC707, ZC706} . Each board directory has several example project directories
(e.g. \Altera { DE5Gen1x8If64} and \Xilinx { VC709\_ Gen1x8If64} ). Each example
project directory has 5 sub-directories:
\begin { itemize}
\item \Directory { prj/} contains all of the project files (\Altera { .qsf,.qpf} , \Xilinx { .xpr} ).
\item \Directory { ip/} contains all of the ip files (\Altera { .qsys} , \Xilinx { .xci} ) generated for the project, when permitted by licensing agreements.
\item \Directory { bit/} contains the example programming file for the corresponding FPGA example design. \Altera { Quartus} and \Xilinx { Vivado} do not modify this programming (\Altera { .sof} , \Xilinx { .bit} ).
\item \Directory { constr/} contains the user constraint files (\Altera { .sdc} , \Xilinx { .xdc} ).
\item \Directory { hdl/} contains any example-project specific Verilog files, such as the project top level file.
\end { itemize}
\begin { figure} [H]
\begin { tikzpicture} [dirtree]
\node { RIFFA \RIFFAVer }
child { node { Documentation}
child { node { RIFFA User Guide.pdf (This Document)} }
}
child { node { install}
child { node { windows}
child{ node { setup.exe} }
child{ node { setup\_ debug.exe} }
}
child { node { linux}
child{ node { README.txt} }
}
}
child { node { source}
child { node { c\_ c++} }
child { node { java} }
child { node { matlab} }
child { node { python} }
child { node { driver}
child { node { linux}
child{ node { makefile} }
child{ node { ... (Other source files)} }
}
child { node { windows}
child{ node { ... (Other source files \& directories)} }
}
}
child { node{ fpga}
child { node{ riffa}
child { node{ (RIFFA Source)} }
}
child { node{ \Altera { de5} }
child { node{ \Altera { DE5Gen1x8If64} }
child{ node{ prj} }
child{ node{ bit} }
child{ node{ ip} }
child{ node{ constr} }
child{ node{ hdl} }
}
child { node{ ... (Other example projects)} }
}
child { node{ \Altera { de4} }
child { node{ \Altera { DE4Gen1x8If64} } }
child { node{ ...} }
}
child { node{ \Xilinx { vc709} }
child { node{ \Xilinx { VC709\_ Gen1x8If64} } }
child { node{ ...} }
}
child { node{ \Xilinx { zc706} }
child { node{ \Xilinx { ZC706\_ Gen1x8If64} } }
child { node{ ...} }
}
child { node{ \Xilinx { vc707} }
child { node{ \Xilinx { VC707\_ Gen1x8If64} } }
child { node{ ...} }
}
}
} ;
\end { tikzpicture}
\caption { Directory hierarchy of the RIFFA \RIFFAVer ~distribution} \label { Fig:RIFFA:DirStructure}
\end { figure}
\section { Release Notes}
\subsection { Version 2.2.0}
\begin { itemize}
\item Added: Support for the new Gen3 Integrated Block for PCIe Express, and the VC709 Development board.
\item Added: ZC706 Example Designs
\item Changed: Xilinx example project packaging. All Xilinx Virtex 7 projects are now click-to-compile, and come with instantiated IP.
\item Re-wrote and refactored: Various parts of the TX and RX engines to maximize code reuse between different vendors and PCIe endpoint implementations
\item Fixed: A bug in the Linux Driver that prevented compilation on older kernels
\item Fixed: A bug in the Windows Driver that prevented repeated small transfers.
\end { itemize}
\subsection { Version 2.1.0}
\begin { itemize}
\item Added reorder\_ queue and updated many rx/tx engine and channel modules that
use it.
\item Added parameters for number of tags to use and max payload length for sizing
RAM for reorder\_ queue.
\item Fixed: Bug in the riffa\_ driver.c, too few circular buffer elements.
\item Fixed: Bug in the riffa\_ driver.c, bad order in which interrupt vector bits
were processed. Can cause deadlock in heavy use situations.
\item Fixed: Bug in the tx\_ port\_ writer.v, maxlen did not start with a value of 1.
Can cause deadlock behavior on second transfer.
\item Fixed: Bug in the rx\_ port\_ reader.v, added delay to allow FIFO flush to
propagate.
\item Fixed: Bug in rx\_ port\_ xxx.v, changed to use FWFT FIFO instead of existing
logic that could cause CHNL\_ RX\_ DATA\_ VALID to drop for a cycle after CHNL\_ RX
dropped even when there is still data in the FIFO. Can cause premature
transmission termination.
\item Changed rx\_ port\_ channel\_ gate.v to use FWFT FIFO.
\item Removed unused signal from rx\_ port\_ requester\_ mux.v.
\item Fixed: Typo/bug that would attempt to change state within
tx\_ port\_ monitor\_ xxx.v.
\item Added flow control for receive credits to avoid over driving upstream transactions
(applies to Altera devices).
\end { itemize}
\subsection { Version 2.0.2}
\begin { itemize}
\item Fixed: Bug in Windows and Linux drivers that could report data sent/received
before such data was confirmed.
\item Fixed: Updated common functions to avoid assigning input values.
\item Fixed: FIFO overflow error causing data corruption in tx\_ engine\_ upper and
breaking the Xilinx Endpoint.
\item Fixed: Missing default cases in rx\_ port\_ reader, sg\_ list\_ requester,
tx\_ engine\_ upper, and tx\_ port\_ writer.
\item Fixed: Bug in tx\_ engine\_ lower\_ 128 corrupting s\_ axis\_ tx\_ tkeep, causing Xilinx
PCIe endpoint core to shut down.
\item Fixed: Bug in tx\_ engine\_ upper\_ 128 causing incomplete TX data timeouts.
\item Changed rx\_ engine to not block on nonposted TLPs. They're added to a FIFO and
serviced in order.
\item Reset rx\_ port FIFOs before a receive transaction to avoid data corruption from
replayed TLPs.
\end { itemize}
\subsection { Version 2.0.1}
\begin { itemize}
\item RIFFA 2.0.1 is a general release. This means we've tested it in a number of
ways. Please let us know if you encounter a bug.
\item Neither the HDL nor the drivers from RIFFA 2.0.1 are backwards compatible with
the components of any previous release of RIFFA.
\item RIFFA 2.0.1 consumes more resources than 2.0 beta. This is because 2.0.1 was
rewritten to support scatter gather DMA, higher bandwidth, and appreciably
more signal registering. The additional registering was included to help meet
timing constraints.
\item The Windows driver is supported on Windows 7 32/64. Other Windows versions
can be supported. The driver simply needs to be built for that target.
\item Debugging on Windows is difficult because there exists no system log file.
Driver log messages are visible only to an attached kernel debugger. So to see
any messages you'll need the Windows Development Kit debugger (WinDbg) or a
small utility called DbgView. DbgView is a standalone kernel debug viewer that
\item http://technet.microsoft.com/ens/sysinternals/bb896647.aspx
Run DbgView with administrator privileges and be sure to enable the following
capture options: Capture Kernel, Capture Events, and Capture Verbose Kernel
Output.
\item The Linux driver is supported on kernel version 2.6.27+.
\item The Java bindings make use of a native library (in order to connect Java JNI
to the native library). Libraries for Linux and Windows for both 32/64 bit
platforms have been compiled and included in the riffa.jar.
\item Removed the CHNL\_ RX\_ ERR signal from the channel interface. Error handling now
ends the transaction gracefully. Errors can be easily detected by comparing
the number of words received to the CHNL\_ RX\_ LEN amount. An error will cause
CHNL\_ RX will go low prematurely and not provide the advertised amount of data.
\item Fixed: Bug in sg\_ list\_ requester which could cause an unbounded TLP request.
\item Fixed: Bug in tx\_ port\_ buffer\_ 128 which could stall the TX transaction.
\end { itemize}
\section { Errata}
While we have extensively tested the current distribution, we are human and
cannot eliminate all bugs in our distribution. As a general rule of thumb, if
you find yourself delving into the RIFFA code, you have gone too far. Contact us
if you need additional assistance!
See the following notes for issues we are currently tracking:
\subsection { Windows}
\subsection { Linux}
No open issues
\subsection { Altera}
\textbf { Issue 1: Inexplicable DE4 behavior} We are seeing inexplicable behavior
on the DE4 boards. In particular, this affects both upstream and downstream data
transfers on the Gen2 128-bit interface. In the channel tester, this problem
manifests as an incorrect number of words recieved, and incorrrect data sent.
\textbf { Issue 2: DE4 Designs intermittently fail timing} Particularly on the
128-bit interface. Working to fix.
\textbf { Issue 3: No support for the 256-bit, Gen3x8 Interface} Coming soon...
\subsection { Xilinx (Classic)}
\textbf { Issue 1: Missing example designs for ML605} There is no disadvantage to
using RIFFA 2.1.0 until we return support in a future distribution.
\textbf { Issue 2: Missing example design for Spartan 6 LXT Development board} The
32-bit interface support has been removed from RIFFA 2.2 and may be added back
in the future. Please use RIFFA 2.1 in the meantime
\subsection { Xilinx (Ultrascale)}
\textbf { Issue 1: No support for the 256-bit, Gen3x8 Interface} Coming soon...
\pagebreak
\chapter { Installing the RIFFA driver}
\label { Sec:RIFFA:Installation}
\section { Linux}
To install the RIFFA driver in linux, you must build it against your installed
version of the Linux kernel. RIFFA \RIFFAVer ~comes with a makefile that will
install the necessar linux kernel headers and the driver. This makefile will
also build and install the C/C++ native library. To install RIFFA \RIFFAVer ~in
linux, follow these instructions:
\begin { enumerate}
\item Open a terminal in linux and navigate to the \Directory { RIFFA \RIFFAVer /source/driver/linux} directory.
\item Ensure you have the kernel headers installed, run:
\TermCmd { sudo make setup}
This will attempt to install the kernel headers using your system's package
manager. You can skip this step if you've already installed the kernel headers.
\item Compile the driver and C/C++ library:
\TermCmd { make}
or
\TermCmd { make debug}
Using make debug will compile in code to output debug messages to the system log
at runtime. These messages are useful when developing your design. However they
pollute your system log and incur some overhead. So you may want to install the
non-debug version after you've completed development.
\item Install the driver and library:
\TermCmd { sudo make install}
The system will be configured to load the driver at boot time. The C/C++ library
will be installed in the default library path. The header files will be placed
in the default include path. You will need to reboot after you've installed for
the driver to be (re)loaded.
\item If the driver is installed and there is a RIFFA \RIFFAVer ~configured FPGA when the computer boots, the
driver will detect it. Output in the system log will provide additional
information.
\item The C/C++ code must include the riffa.h header. An example inclusion is
shown in Listing~\ref { Listing:RIFFA:Include}
\item When compiling (using GCC/G++, etc.) you must link with the RIFFA
libraries using the -lriffa flag. For example, when compiling test.c from
Listing~\ref { Listing:RIFFA:Include} :
\TermCmd { gcc -g -c -lriffa -o test.o test.c}
\item Bindings for other languages can be installed by following the README
files in their respective directories (See Figure~\ref { Fig:RIFFA:DirStructure}
\end { enumerate}
\section { Windows}
Currently only Windows 7 (32/64) is supported by RIFFA \RIFFAVer . In the
\Directory { RIFFA \RIFFAVer /install/windows/} subdirectory use the
provided setup.exe program to install the RIFFA driver and native C/C++
library. You can verify that RIFFA \RIFFAVer ~installed correctly by checking
the installation directory in Program Files. After installation, you'll be able
to install the bindings for other languages.
The setup\_ dbg.exe installer installs a driver with additional debugging output.
You can install the setup\_ dbg.exe version and then later use setup.exe to
install the non-debug output version.
\begin { lstlisting} [basicstyle=\footnotesize \ttfamily ,language=C,
commentstyle=\color { red} ,label=Listing:RIFFA:Include,
caption=Inclusion of the RIFFA header files in a user application,frame=single]
#include <stdio.h>
#include <stdlib.h>
#include <riffa.h>
#define BUF_ SIZE (1*1024*1024)
unsigned int buf[BUF_ SIZE];
int main(int argc, char* argv[]) {
fpga_ t * fpga;
int fid = 0; // FPGA id
int channel = 0; // FPGA channel
fpga = fpga_ open(fid);
fpga_ send(fpga, channel, (void *)buf, BUF_ SIZE, 0, 1, 0);
fpga_ recv(fpga, channel, (void *)buf, BUF_ SIZE, 0);
fpga_ close(fpga);
return 0;
}
\end { lstlisting}
\chapter { Compiling and using the Xilinx Example Designs}
Vivado \VivadoVer ~was used in all example designs and documentation included in
this distribution. We highly recommend using \VivadoVer ~and all newer versions
of the software, since we have encountered bugs in previous versions of the
Vivado (e.g. 2014.2) software. This guide assumes that the end-user has already
configured their board for PCI Express operation. See the VC709 User Guide
\footnote { http://www.xilinx.com/support/documentation/boards\_ and\_ kits/vc709/ug887-vc709-eval-board-v7-fpga.pdf} ,
VC707 User Guide
\footnote { http://www.xilinx.com/support/documentation/boards\_ and\_ kits/vc707/ug848-VC707-getting-started-guide.pdf}
or ZC706 User Guide
\footnote { http://www.xilinx.com/support/documentation/boards\_ and\_ kits/zc706/ug954-zc706-eval-board-xc7z045-ap-soc.pdf} .
While we have not tested all of the current-generation Xilinx development
boards, we are confident that they can be supported with minimal
modifications. For more information about supporting new boards, see the
sections \ref { Sec:7SeriesIntegrated:Generating} and
\ref { Sec:Gen3Integrated:Generating} . These sections cover the settings used
in the RIFFA example design IP.
The easiset way to use RIFFA is to start with one of the example designs
included in the distribution. Sections \ref { Sec:7SeriesIntegrated:ExampleDesign}
and \ref { Sec:Gen3Integrated:ExampleDesign} describe how to use and compile these
designs for the VC707 and VC709 boards respectively. These example designs are
ready to compile out of the box, and require no user IP configuration and
generation. The designs also include pre-compiled bit-files in the
\Directory { bit} directory of the example project. For advanced users, we also
describe how we generated the PCIe IP in sections
\ref { Sec:7SeriesIntegrated:Generating} and
\ref { Sec:Gen3Integrated:Generating} .
\section { Classic - 7 Series Integrated Block for PCI Express - (VC707, ZC706 and older)}
This is a step by step guide for using RIFFA \RIFFAVer ~on a Xilinx FPGA with the
7 Series Integrated Block for PCI Express Core. This core is supported on the
ZC706, and VC707 development boards, using the 64-bit and 128-bit AXI
interfaces.
\subsection { VC707 and ZC706 Example Designs}
\label { Sec:7SeriesIntegrated:ExampleDesign}
There is one VC707 example design and two ZC706 example designs in the RIFFA
\RIFFAVer ~distribution. The VC707 example design folders are in \Directory { RIFFA
\RIFFAVer /source/fpga/vc707} and the ZC706 example design folders are in
\Directory { RIFFA \RIFFAVer /source/fpga/zc706} .
\begin { enumerate}
\item Open Vivado to get the introductory screen shown in
Figure~\ref { Fig:Vivado:WelcomeScreen} .
\item Click 'Open an Existing Project' and navigate to your RIFFA
\RIFFAVer ~directory.
\item In the RIFFA \RIFFAVer ~distribution, open \Directory { RIFFA
\RIFFAVer /source/fpga/xilinx/vc707/} or \Directory { RIFFA
\RIFFAVer /source/fpga/zc706} and choose from one of the existing example
design directories for your board. In the example design directory, locate the
\Directory { prj} folder and open it. Select the .xpr file and click open. This
will open the example project, as shown in
Figure~\ref { Fig:7SeriesIntegrated:ExampleDesign:ProjectOpened} .
\item This project was compiled in Vivado \VivadoVer . The bit file generated can
be used to test the FPGA system. If you are using a newer version of Vivado,
recompile the example design or use the programming file provided.
\begin { itemize}
\item IP Settings are now packaged as part of the example designs! Users no
longer need to generate IP.
\item To recompile the example design, click the generate bitstream button in
the top left corner as shown in
Figure~\ref { Fig:7SeriesIntegrated:ExampleDesign:ProjectOpened} .
\item Recompiling your design will generate a new bitfile in the Xilinx
project. The bit file in the \Directory { bit} will not be changed.
\end { itemize}
\item To program the FPGA, click 'Open Hardware Manager'. New bit files
(generated by Vivado) will appear in Vivado's internal directory. An example
bit file is provided in the example design's \Directory { bit} . Load the
bitstream to your VC707 or ZC706 board and restart your computer.
\begin { itemize}
\item Before programming your FPGA, you should install the RIFFA driver. See
Section~\ref { Sec:RIFFA:Installation}
\end { itemize}
\item The example design uses the chnl\_ tester (shown in
Figure~\ref { Fig:Vivado:ExampleDesign:chnl_ tester} , which works with
the example software in the \Directory { source/\{ C\_ C++,Java,python,matlab\} }
directories. Replace the chnl\_ tester instantiation with any user logic,
matching the RIFFA interface.
\item Recompile the design and program the FPGA Device. Changing the
\RIFFAParameter { C\_ NUM\_ CHNL} will change the number of independent channel
interfaces
\end { enumerate}
\begin { figure}
\includegraphics [width=400px,center] { VivadoWelcomeScreen.png}
\caption { Welcome Screen for Vivado \VivadoVer }
\label { Fig:Vivado:WelcomeScreen}
\end { figure}
\begin { figure}
\includegraphics [width=400px,center] { 7SeriesIntegratedOpenProject.png}
\caption { Project Splash Screen for 7Series Integrated Block for PCI Express Projects}
\label { Fig:7SeriesIntegrated:ExampleDesign:ProjectOpened}
\end { figure}
\begin { figure}
\includegraphics [width=400px,trim=500 200 200 250, clip=true,center] { VivadoChnlTesterInstantiation.png}
\caption { Project Splash Screen for 7Series Integrated Block for PCI Express Projects}
\label { Fig:Vivado:ExampleDesign:chnl_ tester}
\end { figure}
\subsection { Generating the 7 Series Integrated Block for PCI Express}
\label { Sec:7SeriesIntegrated:Generating}
The following steps are not required for general users. See the instructions
above for how to compile RIFFA.
Alternatively, it is possible to generate the PCIe Endpoint with different
settings than those provided in the example design. Modifying the RIFFA
parameters \RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} ,
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES} and \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} ,
change certain settings in the IP Core. The \RIFFAParameter { C\_ NUM\_ LANES} is a
parameter in the top level file of each example project. How these parameters
relate to IP core settings is highlighted in the following figures.
If the goal is to generate a RIFFA design completely from scratch, each board
directory comes with a RIFFA wrapper verilog file and instantiates a
vendor-specific translation layer. It is highly recommended to re-use these
files RIFFA wrapper when creating designs from scratch.
To generate the PCIe IP select the 7 Series Integrated Block for PCI Express
after selecting the IP Catalog shown in
Figure~\ref { Fig:7SeriesIntegrated:ExampleDesign:ProjectOpened} . This will open
the IP Customization window as shown in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabBasic}
\begin { figure} [H]
\includegraphics [width=\figurewidth,center] { 7SeriesIntegratedTabBasic.png}
\caption { Basic settings tab.}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabBasic}
\end { figure}
First, select \ConfigSetting { Mode} to \ConfigSetting { ADVANCED} from the drop
down menu. This will cause more tabs to appear in the bar. The following tabs
are not used during customization: Link Registers, Power Mangement,
Ext. Capabilities, Ext. Capabilities 2, TL Settings and DL/PL Settings.
In Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabBasic} , we
have set the \ConfigSetting { Xilinx Development Board} to \ConfigSetting { VC707} ,
selected the PCIe Gen1 rate \ConfigSetting { 2.5 GT/s} , and a \ConfigSetting { Lane
Width} of \ConfigSetting { 8} (\RIFFAParameter { C\_ NUM\_ LANES} = 8). We have
chosen to set the \ConfigSetting { AXI Interface Width} to \ConfigSetting { 64-bits}
(\RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} = 64). The choice of Link Rate, Lanes, and
Interface Width will allow different AXI Interface Frequencies to be
selected. The RIFFA core will run at this clock frequency, but the user logic
can run at whatever frequency it desires.
Optional: Set the Component Name of the PCI Express block, and the IP
Location. In our example projects, we use the name template
PCIeGen\textbf { W} x\textbf { Y} If\textbf { Z} where \textbf { W} is the PCI Express
Version (\ConfigSetting { Link Speed} in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabBasic} ),
\textbf { Y} is the lane width, and \textbf { Z} is the AXI interface width. The IP
location is the \Directory { ip} directory in the example project.
\begin { figure} [H]
\includegraphics [width=350px,center] { 7SeriesIntegratedTabIDs.png}
\caption { PCI Express ID Tab.}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabIDs}
\end { figure}
The tab in Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabIDs}
is optional. Setting the Device ID may assist in identifying different FPGAs in
a multi-FPGA system. The other options, specifically the Vendor ID, must remain
the same.
\begin { figure} [H]
\includegraphics [width=350px,center] { 7SeriesIntegratedTabBARs.png}
\caption { PCI Express Base Address Register (BAR) ID Tab.}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabBARs}
\end { figure}
The tab in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabBARs} must be
configured so that BAR0 is \ConfigSetting { enabled} (checked). Set the
\ConfigSetting { Type} to \ConfigSetting { Memory} , and \ConfigSetting { Unit}
\ConfigSetting { Kilobyte} , and \ConfigSetting { Size Value} to \ConfigSetting { 1}
from the dropdown menus. If these values are not set correctly the RIFFA driver
will not recognize the FPGA device.
\begin { figure} [H]
\includegraphics [width=350px,center] { 7SeriesIntegratedTabCapabilities.png}
\caption { PCI Express Capabilities Tab.}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabCapabilities}
\end { figure}
In this tab select the boxes \ConfigSetting { Buffering Optimized for Bus
Mastering Applications} and \ConfigSetting { Extended Tag Field} . If the
\ConfigSetting { Extended Tag Field} is selected
\RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} = 8, otherwise
\RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} = 5. Select the \ConfigSetting { Maximum
Payload Size} from the dropdown menu. Use this to set the RIFFA
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES} parameter.
Note: Maximum Payload sizes are typically set by the BIOS, and 256 bytes seems
to be standard. RIFFA will default to the minimum of
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ SIZE} and the setting in your BIOS. Unless your
BIOS is modified, or can support substantially larger packets, there will be no
performance benefit to increasing the payload size. Increasing the maximum
payload size will increase the resources consumed.
% \begin{figure}[H]
% \includegraphics[width=350px,center]{7SeriesIntegratedTabLinkRegisters.png}
% \label{Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabLinkRegisters}
% \end{figure}
\begin { figure} [H]
\includegraphics [width=350px,center] { 7SeriesIntegratedTabInterrupts.png}
\caption { PCI Express Interrupts Tab.}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabInterrupts}
\end { figure}
In the Interrupts Tab shown in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabInterrupts}
\ConfigSetting { clear} the checkbox for \ConfigSetting { Enable INTx} (To disable
INTx). The remaining options should match those shown in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabInterrupts}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{7SeriesIntegratedTabPowerManagement.png}
% \label{Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabPowerManagement}
% \caption{PCI Express Power Manement Tab.}\\
% \end{figure}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{7SeriesIntegratedTabExtCapabilities.png}
% \label{Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabExtCapabilities}
% \end{figure}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{7SeriesIntegratedTabExtCapabilities2.png}
% \label{Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabExtCapabilities2}
% \end{figure}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{7SeriesIntegratedTabTLSettings.png}
% \label{Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabTLSettings}
% \end{figure}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{7SeriesIntegratedTabDLPLSettings.png}
% \label{Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabDLPLSettings}
% \end{figure}
\begin { figure} [H]
\includegraphics [width=350px,center] { 7SeriesIntegratedTabSharedLogic.png}
\caption { Shared Logic Tab}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabSharedLogic}
\end { figure}
In the Shared Logic Tab shown in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabSharedLogic}
\ConfigSetting { clear} all of the checkboxes shown. These settings will not affect the core
generated, but will affect the example designs generated by Vivado. As a result,
the Example Design will mirror the RIFFA example design provided.
\begin { figure} [H]
\includegraphics [width=350px,center] { 7SeriesIntegratedTabCoreInterfaceParameters.png}
\caption { Core Interface Parameters Tab}
\label { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabInterfaceParameters}
\end { figure}
Finally, in the Interface Parameters tab, match the checkboxes shown in
Figure~\ref { Fig:7SeriesIntegrated:Generating:7SeriesIntegratedTabInterfaceParameters} . These
options simplify the interface to the generated core
\subsection { Creating Constraints files for the VC707 Development Board}
\label { Sec:7SeriesIntegrated:Generating:Constraints:VC707}
When generating a design for the VC707 board, the following constraints will
correctly constrain the clocks. When using a different board, read the user
guide for appropriate pin placment, or copy the constraints from the PCIe
Endpoint Example Design. The remaining constraints are contained the generated PCIe IP.
\begin { lstlisting} [basicstyle=\footnotesize \ttfamily ,language=tcl,
commentstyle=\color { red} ,label=Listing:7SeriesIntegrated:Generating:Constraints:VC707,
caption=\Xilinx { .xdc} constraints for the VC707 board,frame=single]
set_ property PACKAGE_ PIN AV35 [get_ ports PCIE_ RESET_ N]
set_ property IOSTANDARD LVCMOS18 [get_ ports PCIE_ RESET_ N]
set_ property PULLUP true [get_ ports PCIE_ RESET_ N]
# The following constraints are BOARD SPECIFIC. This is for the VC707
set_ property LOC IBUFDS_ GTE2_ X1Y5 [get_ cells refclk_ ibuf]
create_ clock -period 10.000 -name pcie_ refclk [get_ pins refclk_ ibuf/O]
set_ false_ path -from [get_ ports PCIE_ RESET_ N]
\end { lstlisting}
\subsection { Creating Constraints files for the ZC706 Development Board}
\label { Sec:7SeriesIntegrated:Generating:Constraints:ZC706}
When generating a design for the ZC706 board, the following constraints will
correctly constrain the clocks. When using a different board, read the user
guide for appropriate pin placment, or copy the constraints from the PCIe
Endpoint Example Design. The remaining constraints are contained the generated PCIe IP.
\begin { lstlisting} [basicstyle=\footnotesize \ttfamily ,language=tcl,
commentstyle=\color { red} ,label=Listing:7SeriesIntegrated:Generating:Constraints:ZC706,
caption=\Xilinx { .xdc} constraints for the ZC706 board,frame=single]
set_ property IOSTANDARD LVCMOS15 [get_ ports PCIE_ RESET_ N]
set_ property PACKAGE_ PIN AK23 [get_ ports PCIE_ RESET_ N]
set_ property PULLUP true [get_ ports PCIE_ RESET_ N]
# The following constraints are BOARD SPECIFIC. This is for the ZC706
set_ property LOC IBUFDS_ GTE2_ X0Y6 [get_ cells refclk_ ibuf]
create_ clock -period 10.000 -name pcie_ refclk [get_ pins refclk_ ibuf/O]
set_ false_ path -from [get_ ports PCIE_ RESET_ N]
\end { lstlisting}
\pagebreak
\section { Ultrascale - Gen3 Integrated Block for PCI Express - (VC709 and newer)}
This is a step by step guide for building a RIFFA \RIFFAVer ~reference design for
Xilinx FPGA's compatible with the Gen3 Integrated Block for PCI Express. In
RIFFA \RIFFAVer ~there are three example designs for the VC709 board in the
\Directory { RIFFA \RIFFAVer /source/fpga/vc709} directory:
\VCSevenOhNineExampleDesignsLong . To use one of these example designs, follow
the instructions below.
\subsection { VC709 Example Designs}
\label { Sec:Gen3Integrated:ExampleDesign}
\begin { enumerate}
\item Open Vivado to get the introductory screen shown in
Figure~\ref { Fig:Vivado:WelcomeScreen} .
\item Click 'Open an Existing Project' and navigate to your RIFFA
\RIFFAVer ~directory.
\item In the RIFFA \RIFFAVer ~distribution, open \Directory { RIFFA
\RIFFAVer /source/fpga/xilinx/vc709/} and choose from one of the existing
example design directories for your board. In the example design directory,
locate the \Directory { prj} folder and open it. Select the .xpr file and click
open. This will open the example project, as shown in
Figure~\ref { Fig:Gen3Integrated:ExampleDesign:ProjectOpened} .
\item This project was compiled in Vivado \VivadoVer . The bit file generated can
be used to test the FPGA system. If you are using a newer version of Vivado,
recompile the example design or use the programming file provided.
\begin { itemize}
\item IP Settings are now packaged as part of the example designs! Users no
longer need to generate IP.
\item To recompile the example design, click the generate bitstream button in
the top left corner as shown in
Figure~\ref { Fig:Gen3Integrated:ExampleDesign:ProjectOpened} .
\item Recompiling your design will generate a new bitfile in the Xilinx
project. The bit file in the \Directory { bit} will not be changed.
\end { itemize}
\item To program the FPGA, click 'Open Hardware Manager'. New bit files
(generated by Vivado) will appear in the Vivado generated directories. An
example bit file is provided in the example design's \Directory { bit} . Load the
bitstream to your VC709 board and restart your computer.
\begin { itemize}
\item Before programming your FPGA, you should install the RIFFA driver. See
Section~\ref { Sec:RIFFA:Installation}
\end { itemize}
\item The example design uses the chnl\_ tester (shown in
Figure~\ref { Fig:Vivado:ExampleDesign:chnl_ tester} , which works with
the example software in the \Directory { source/\{ C\_ C++,Java,python,matlab\} }
directories. Replace the chnl\_ tester instantiation with any user logic,
matching the RIFFA interface.
\item Recompile the design and program the FPGA Device. Changing the
\RIFFAParameter { C\_ NUM\_ CHNL} will change the number of independent channel
interfaces
\end { enumerate}
\begin { figure}
\includegraphics [width=300px,center] { Gen3IntegratedOpenProject.png}
\caption { Project Splash Screen for Gen3 Integrated Block for PCI Express Projects}
\label { Fig:Gen3Integrated:ExampleDesign:ProjectOpened}
\end { figure}
\subsection { Generating the Gen3 Integrated Block for PCI Express}
\label { Sec:Gen3Integrated:Generating}
The following steps are not required for general users. See the instructions
above for how to compile RIFFA.
Alternatively, it is possible to generate the PCIe Endpoint with different
settings than those provided in the example design. Changing the endpoint
settings is required when changing the parameters
\RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} , \RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES}
and \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} . The \RIFFAParameter { C\_ NUM\_ LANES} is a
parameter in the top level file of each example project. How these parameters
relate to IP core settings is highlighted in the following figures.
If the goal is to generate a RIFFA design completely from scratch, each board
directory comes with a RIFFA wrapper verilog file and instantiates a
vendor-specific translation layer. It is highly recommended to re-use these
files RIFFA wrapper when creating designs from scratch.
To generate the PCIe IP select the 7 Series Integrated Block for PCI Express
after selecting the IP Catalog shown in
Figure~\ref { Fig:Gen3Integrated:ExampleDesign:ProjectOpened} . This will open the
IP Customization window as shown in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabBasic}
\begin { figure} [H]
\includegraphics [width=\figurewidth,center] { Gen3IntegratedTabBasic.png}
\caption { Basic settings tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabBasic}
\end { figure}
First, select ``ADVANCED'' from the drop down menu. This will cause more tabs to
appear in the bar. The following tabs are not used during customization: MSIx Cap (Capabilities),
Extd. Capabilities 1, and Extd Capabilites 2.
In this example, we have set the \ConfigSetting { Xilinx Development Board} to
\ConfigSetting { VC709} , and selected the PCIe Gen1 rate of \ConfigSetting { 2.5
GT/s} , and a Lane Width of 8 (\RIFFAParameter { C\_ NUM\_ LANES} = 8). We have
chosen to set the \ConfigSetting { AXI Interface Width} to \ConfigSetting { 64-bits}
(\RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} = 64). Finally Clear the
\ConfigSetting { Disable Client Tag} and \ConfigSetting { PCIe DRP Ports} boxes. The
choice of \ConfigSetting { Link Rate} , \ConfigSetting { Lanes} , and
\ConfigSetting { Interface Width} will allow different AXI Interface Frequencies
to be selected. The RIFFA core will run at this clock frequency, but the user
logic can run at whatever frequency it desires.
Optional: Set the Component Name of the PCI Express block, and the IP
Location. In our example projects, we use the name template
PCIeGen\textbf { W} x\textbf { Y} If\textbf { Z} where \textbf { W} is the PCI Express
Version (\ConfigSetting { Link Speed} in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabBasic} ),
\textbf { Y} is the lane width, and \textbf { Z} is the AXI interface width. The IP
location is the \Directory { ip} directory in the example project.
Note: For RIFFA \RIFFAVer ~the 256-bit interface is not supported, however the
128-bit interface is. This means PCIe Gen2 with 8 lanes, and PCIe Gen3 with 4
lanes are both supported.
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabCapabilities.png}
\caption { PCI Express Capabilities Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabCapabilities}
\end { figure}
In the Capabilities tab shown in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabCapabilities} check
the \ConfigSetting { Extended Tag Field} box. If the \ConfigSetting { Extended Tag
Field} is selected \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} = 8, otherwise
\RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} = 5. Set the \ConfigSetting { PFO Max Payload
Size} from the dropdown menu; Use this to set the RIFFA
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES} parameter.
Note: Maximum Payload sizes are typically set by the BIOS, and 256 bytes seems
to be standard. RIFFA will default to the minimum of
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ SIZE} and the setting in your BIOS. Unless your
BIOS is modified, or can support substantially larger packets, there will be no
performance benefit to increasing the payload size. Increasing the maximum
payload size will increase the resources consumed.
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabPF0Ids.png}
\caption { PCI Express IDs Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabPF0Ids}
\end { figure}
The tab in Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabPF0Ids}
is optional. Setting the Device ID may assist in identifying different FPGAs in
a multi-FPGA system. The other options, specifically the Vendor ID, must remain
the same.
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabPF0Bar.png}
\caption { PCI Express Base Address Registers (BAR) Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabPF0Bar}
\end { figure}
The tab in Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabPF0Bar}
must be configured so that BAR0 is \ConfigSetting { enabled} . Select type
\ConfigSetting { Memory} , and Unit \ConfigSetting { Kilobyte} , and
\ConfigSetting { Size Value} \ConfigSetting { 1} from the dropdown menus. If these
values are not set correctly the RIFFA driver will not recognize the FPGA
device.
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabLegacyMSICap.png}
\caption { PCI Express Legacy and MSI Interrupts Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabLegacyMSICap}
\end { figure}
In the Legacy/MSI Capabilites tab shown in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabLegacyMSICap} , select
\ConfigSetting { None} in the \ConfigSetting { PFO Interrupt Pin Dropdown} menu and
set the \ConfigSetting { PFO Multiple Message Capable} dropdown menu to
\ConfigSetting { 1 Vector}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{Gen3IntegratedTabMSIxCap.png}
% \label{Fig:Gen3Integrated:Generating:Gen3IntegratedTabMSIxCap}
% \end{figure}
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabPowerManagement.png}
\caption { PCI Express Power Management Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabPowerManagement}
\end { figure}
In the Power Management tab, shown in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabPowerManagement} ,
ensure that the \ConfigSetting { Performance Level} is set to
\ConfigSetting { Extreme} .
% \begin{figure}[H]
% \includegraphics[width=350px,center]{Gen3IntegratedTabExtCapabilities1.png}
% \label{Fig:Gen3Integrated:Generating:Gen3IntegratedTabExtCapabilities1}
% \end{figure}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{Gen3IntegratedTabExtCapabilities2.png}
% \label{Fig:Gen3Integrated:Generating:Gen3IntegratedTabExtCapabilities2}
% \end{figure}
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabSharedLogic.png}
\caption { PCI Express Shared Logic Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabSharedLogic}
\end { figure}
In the Shared Logic Tab shown in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabSharedLogic}
\ConfigSetting { clear} all of the checkboxes shown. These settings will not
affect the core generated, but will affect the example designs generated by the
Vivado, and make the Vivado example design mirror the RIFFA Example design.
\begin { figure} [H]
\includegraphics [width=350px,center] { Gen3IntegratedTabCoreInterfaceParameters.png}
\caption { PCI Express Core Interface Parameters Tab.}
\label { Fig:Gen3Integrated:Generating:Gen3IntegratedTabCoreInterfaceParameters}
\end { figure}
Finally, in the Interface Parameters tab, match the checkboxes shown in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabCoreInterfaceParameters} . These
options simplify the interface to the generated core
\subsection { Creating Constraints files for the VC709 Development Board}
\label { Sec:Gen3Integrated:Generating:Constraints}
When generating a design for the VC709 board, the following constraints will
correctly constrain the clocks. When using a different board, read the user
guide for appropriate pin placment, or copy the constraints from the PCIe
Endpoint Example Design.
\begin { lstlisting} [basicstyle=\footnotesize \ttfamily ,language=tcl,
commentstyle=\color { red} ,label=Listing:7SeriesIntegrated:Generating:Constraints:VC709,
caption=\Xilinx { .xdc} constraints for the VC709 board,frame=single]
create_ clock -period 10.000 -name pcie_ refclk [get_ pins refclk_ ibuf/O]
set_ false_ path -from [get_ ports PCIE_ RESET_ N]
# The following constraints are BOARD SPECIFIC. This is for the VC709
set_ property LOC IBUFDS_ GTE2_ X1Y11 [get_ cells refclk_ ibuf]
set_ property PACKAGE_ PIN AV35 [get_ ports PCIE_ RESET_ N]
set_ property IOSTANDARD LVCMOS18 [get_ ports PCIE_ RESET_ N]
set_ property PULLUP true [get_ ports PCIE_ RESET_ N]
\end { lstlisting}
\pagebreak
\chapter { Compiling and using the Altera Example Designs}
\label { Chap:Altera}
This section describes how to use RIFFA \RIFFAVer ~ with Quartus \QuartusVer . The
example projects included in this distribution target Terasic DE5Net and DE4
boards. We are confident that RIFFA will work on all currently supported Altera
devices using the Hard IP for PCI Express (Cyclone V, Arria V and Stratix V)
devices, as well as all devices using IP Compiler for PCI Express (Stratix IV
and prior). For device support in Quartus \QuartusVer
see \footnote { http://dl.altera.com/devices/}
The FPGA families that we have successfully tested RIFFA \RIFFAVer ~are:
\begin { itemize}
\item Stratix V (DE5-Net)
\item Stratix IV (DE4)
%\item Cyclone IV (DE2i-150)
\end { itemize}
There are three options for starting a new RIFFA project:
\begin { itemize}
\item For first-time users with a DE5 board, we recommend the archived projects
provided in the \Directory { RIFFA \RIFFAVer /source/fpga/de5\_ qsys} directory. Follow the
instructions in Section~\ref { Sec:Altera:QsysMegawizard:Qsys}
\item Intermediate and advanced users, or users with a DE4 board, we have
provided projects without instantiated IP. For DE5 boards, follow the
instructions in Section~\ref { Sec:Altera:QsysMegawizard:Megawizard} . For DE4 boards, follow
the instructions in Section~\ref { Sec:Altera:IPCompiler}
\item For advanced users, or users wishing to support a new board, we provide
full instructions for creating a top level and generating IP. Follow the
instructions in Section~\ref { Sec:Altera:QsysMegawizard}
\end { itemize}
\section { Example Designs with Qsys and MegaWizard (Stratix V, Cyclone V and newer)}
\label { Sec:Altera:QsysMegawizard}
\subsection { Qsys (Stratix V and newer)}
\label { Sec:Altera:QsysMegawizard:Qsys}
For first-time users with the DE5-Net board, copy one of the archived projects
(.qar files) available in the \Directory { de5\_ qsys} directory.
\begin { enumerate}
\item Open Quartus to get the introductory screen shown in
Figure~\ref { Fig:Quartus:WelcomeScreen} .
\item Click 'Open an Existing Project' and navigate to your RIFFA
\RIFFAVer ~directory.
\item In the RIFFA \RIFFAVer ~distribution, open \Directory { RIFFA
\RIFFAVer /source/fpga/de4/} and choose from one of the existing example design
directories for your board. In the example design directory, locate the
\Directory { prj} folder and open it. Select the .qpf file and click open. This
will open the example project, as shown in
Figure~\ref { Fig:Quartus:ExampleDesign:ProjectOpened} .
\item This project was compiled in Quartus \QuartusVer . The bit file generated
can be used to test the FPGA system. If you are using a newer version of
Quartus, recompile the example design or use the programming file provided.
\begin { itemize}
\item To recompile the example design, click the compile button in the top
left corner as shown in
Figure~\ref { Fig:Quartus:ExampleDesign:ProjectOpened} .
\item Recompiling your design will generate a new bitfile in the
\Directory { prj} directory. The bit file in the \Directory { bit} will not be
changed.
\end { itemize}
\item To program the FPGA, click 'Open Programmer'. New bit files (generated by
Quartus) will appear in the \Directory { prj/output\_ files/} directory. An example
bit file is provided in the example design's \Directory { bit} directory.
\begin { itemize}
\item Before programming your FPGA, you should install the RIFFA driver. See
Section~\ref { Sec:RIFFA:Installation}
\end { itemize}
\item The example design uses the chnl\_ tester (shown in
Figure~\ref { Fig:Quartus:ExampleDesign:chnl_ tester} , which works with
the example software in the \Directory { source/\{ C\_ C++,Java,python,matlab\} }
directories. Replace the chnl\_ tester instantiation with any user logic,
matching the RIFFA interface.
\item Recompile the design and program the FPGA Device. Changing the
\RIFFAParameter { C\_ NUM\_ CHNL} will change the number of independent channel
interfaces
\end { enumerate}
\begin { figure}
\includegraphics [width=300px,center] { QuartusWelcomeScreen.png}
\caption { Welcome Screen for Quartus \QuartusVer }
\label { Fig:Quartus:WelcomeScreen}
\end { figure}
\begin { figure}
\includegraphics [width=300px,center] { IPCompileOpenProject.png}
\caption { Project Splash Screen for Quartus Projects}
\label { Fig:Quartus:ExampleDesign:ProjectOpened}
\end { figure}
\begin { figure}
\includegraphics [width=300px,trim=400 200 300 200, clip=true,center] { QuartusChnlTesterInstantiation.png}
\caption { chnl\_ tester instantiation in the top level file}
\label { Fig:Quartus:ExampleDesign:chnl_ tester}
\end { figure}
\subsection { Generating IP using MegaWizard (Stratix V, Cyclone V and newer)}
\label { Sec:Altera:QsysMegawizard:Megawizard}
In some cases, it may be necessary to generate the PCIe Endpoint IP. For
intermediate users, there are project example projects inside of the
\Directory { de5} directory without instantiated IP (This is done to avoid
licensing problems). For the DE5, the project directories are: DE5Gen1x8If64,
DE5Gen2x8If128, DE5Gen3x4If128.
Modifying the RIFFA parameters \RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} ,
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES} and \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS}
require changing certain settings in the IP core file. The paramter
\RIFFAParameter { C\_ NUM\_ LANES} is located in the top level file of each example
project. How these parameters relate to IP core settings is highlighted in the
following figures.
For advanced users whose goal is to generate a RIFFA design completely from
scratch, we provide instructions for generating the timing constraints and other
low level details. Each board directory contains with a RIFFA wrapper verilog
file and instantiates a vendor-specific translation layer. It is highly
recommended to re-use these files RIFFA wrapper when creating designs from
scratch. Users should also use the constraints file (\Altera { .sdc} ) in the board
directory, and in the \Directory { constr/} , or read the User Guide provided with
each board and the instructions for generating constratints in
Section~\ref { Sec:Altera:QsysMegawizard:Constraints} .
As stated in Section~\ref { Sec:Intro:Decoding} , each project directory contains five folders.
\begin { itemize}
\item The \Directory { prj/} directory contains the project \Altera { .qpf} and \Altera { .qsf} file.
\item The \Directory { hdl/} contains the top level file, e.g. DE5Gen2x8If128.v, which instantiates the skeleton IP and the RIFFA Core.
\item The \Directory { ip/} directory is empty but will contain Altera IP generated by Quartus
in the following guide.
\item The \Directory { constr/} directory contains project-specific timing constraint files.
\item Finally the \Directory { bit/} directory contains the project .sof, or bit
file that we have tested. This bitfile will not be overwritten by subsequent Quartus compilations.
\end { itemize}
Note: The bitfile in the bit directory is not modified by recompilation in
Quartus. Quartus will generate a new bitfile (.sof) in the \Directory { prj/}
directory for the DE5Net board.
\begin { figure} [H]
\includegraphics [width=350px,center] { QsysPCIExpSystemTrim.png}
\caption { Qsys Diagram depicting the connections between the three Altera IP blocks.}
\label { Fig:Altera:QsysMegawizard:Megawizard:QsysPCIExpSystemTrim}
\end { figure}
Altera designs require additional IP to drive the PCIe Core Transcievers. For the
DE5, these blocks are the Transciever Reconfiguration Controller and the
Reconfiguration Driver. When creating a new top level design, these blocks must be
connected together with the PCIe Endpoint as shown in
Figure~\ref { Fig:Altera:QsysMegawizard:Megawizard:QsysPCIExpSystemTrim} .
First, we will generate the PCIe Endpoint. Click on the Avalon Streaming
Interface for PCI Express in the Quartus IP Catalog.
Figure~\ref { Fig:Altera:IPCompiler:PCIeSystemSettingsTab} .
Optional: Set the Component Name of the PCI Express block, and the IP
Location. In our example projects, we typically use the name
PCIeGen\textbf { W} x\textbf { Y} If\textbf { Z} where \textbf { W} is the PCI Express
Version (\ConfigSetting { Link Speed} in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabBasic} ), \textbf { Y}
is the lane width, and \textbf { Z} is the Avalon interface width. The IP location
is the \Directory { ip/} directory in the example project.
\begin { figure} [H]
\includegraphics [width=350px,center] { PCIExpressEndpoint1Trim.png}
\caption { PCI Express Endpoint Configuration Menu}
\label { Fig:Altera:QsysMegawizard:Megawizard:PCIExpressEndpoint1Trim}
\end { figure}
In Figure~\ref { Fig:Altera:QsysMegawizard:Megawizard:PCIExpressEndpoint1Trim} ,
select the \ConfigSetting { Number of Lanes} , which corresponds to the top level
parameter \RIFFAParameter { C\_ NUM\_ LANES} , \ConfigSetting { Lane Rate} , and
\ConfigSetting { PCI Express Base Specification} version from the dropdown menus
(Choose the highest possible base specification version). Select an
\ConfigSetting { Application Interface Width} ; This corresponds to the
\RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} parameter in RIFFA. Currently the
\ConfigSetting { 64-bit} and \ConfigSetting { 128-bit} interfaces are supported for
all Altera designs. Some widths may not be possible depending on the
\ConfigSetting { Lane Rate} and \ConfigSetting { Number of Lanes} selected.
The choice of \ConfigSetting { Link Rate} , \ConfigSetting { Number of Lanes} , and
\ConfigSetting { Interface Width} will set the frequency for the PCI interface,
which is clocked by the pld\_ clk signal. For the chosen settings, the frequency
should be displayed in the messages bar at the bottom of the configuration menu
(Messages bar not shown). The RIFFA core will run at this clock frequency, but the
user logic can run at whatever frequency it desires.
In the Base Address Registers Section set BAR0's type to \ConfigSetting { 32-bit
non-prefetchable memory} and set the size to \ConfigSetting { 1 KByte - 10 Bits} .
There are no required changes in the Device Identification Registers
Section. However, in a multiple FPGA system, it may be useful to change the
\ConfigSetting { Device ID} to allow identification of different FPGA
platforms. The other options, specifically the \ConfigSetting { Vendor ID} , must
remain the same.
Scroll down to view the final two sections shown in
Figure~\ref { Fig:Altera:QsysMegawizard:Megawizard:PCIExpressEndpoint2Trim} .
\begin { figure} [H]
\includegraphics [width=350px,trim=0 0 0 620, clip=true,center] { PCIExpressEndpoint2Trim.png}
\caption { PCI Express Endpoint Configuration Menu}
\label { Fig:Altera:QsysMegawizard:Megawizard:PCIExpressEndpoint2Trim}
\end { figure}
In the PCI Express/PCI Capabilities menu, set your desired
\ConfigSetting { Maximum Payload Size} , which corresponds to the RIFFA parameter,
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES} and the \ConfigSetting { Number of Tags
Supported} . The log of the \ConfigSetting { Number of Tags Supported} is
the \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} parameter in RIFFA.
In the MSI Tab, make sure that the number of MSI messages requested is equal to
1.
Note: Maximum Payload sizes are typically set by the BIOS, and 256 bytes seems
to be standard. RIFFA will default to the minimum setting
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ SIZE} and the setting in your BIOS. Unless your
BIOS is modified, or can support substantially larger packets, there will be no
performance benefit to increasing the payload size. Increasing the
\ConfigSetting { Maximum Payload Size} will increase the resources consumed.
Finally, record the number of \ConfigSetting { Transciever Reconfiguration
Interfaces} in the messages bar at the bottom of the screen, then close the
PCIe IP Generation Menu. A window may ask if you wish to generate the example
design. This is optional.
\begin { figure} [H]
\includegraphics [width=350px,center] { XCVRMenuTrim.png}
\caption { Transciever Reconfiguration IP Generation Menu}
\label { Fig:Altera:QsysMegawizard:Megawizard:XCVRMenuTrim}
\end { figure}
Next, generate the Transceiver Reconfiguration Controller by opening MegaWizard
and selecting the Transceiver Reconfiguration Controller megafunction.
Set the appropriate number of \ConfigSetting { Transciever Reconfiguration
Interfaces} in the Interface Bundles Menu. In the analog features section,
\ConfigSetting { Enable Analog Controls} and \ConfigSetting { Enable Adaptive
Equalization} block by clicking the appropriate boxes.
Optional: Set the Component Name of the Transciever Reconfiguration Controller, and the IP
Location. In our example projects, we typically use the name
XCVRCtrlGen\ConfigSetting { W} x\ConfigSetting { Y} where \ConfigSetting { W} is the \ConfigSetting { PCI Express Version}
(\ConfigSetting { Link Speed} in
Figure~\ref { Fig:Altera:QsysMegawizard:Megawizard:PCIExpressEndpoint1Trim} ), \ConfigSetting { Y}
is the lane width. The IP location is the \Directory { ip/} directory in the
example project.
\begin { figure} [H]
\includegraphics [width=350px,center] { ReconfigDriverMenu.png}
\caption { Transciever Reconfiguration Driver Menu}
\label { Fig:Altera:QsysMegawizard:Megawizard:ReconfigDriverMenu}
\end { figure}
In Qsys, generate the Transciver Reconfiguration Controller. Select the
appropriate lane rate for your design, and the number of Reconfiguration
Intefaces. These should match the number of reconfiguration interfaces dictated
when generating the PCIe IP in
Figure~\ref { Fig:Altera:QsysMegawizard:Megawizard:PCIExpressEndpoint1Trim} and
the number selected in
Figure~\ref { Fig:Altera:QsysMegawizard:Megawizard:XCVRMenuTrim} .
Optional (in Qsys): Set the Component Name of the Transciever Reconfiguration
Driver, and the IP Location. In our example projects, we typically use the name
XCVRDrvGen\ConfigSetting { W} x\ConfigSetting { Y} where \ConfigSetting { W} is the
\ConfigSetting { PCI Express Version} (\ConfigSetting { Link Speed} in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabBasic} ), and
\textbf { Y} is the lane width. The IP location is the \Directory { ip} directory in
the example project.
If you are using Megawizard to instantiate IP, you must manually instantiate the
Transciver Reconfiguration Driver in the Top Level design. The instantiation
template is shown in
Listing~\ref { Listing:Altera:QsysMegawizard:ManualDriverInst} into your top-level
file. Match the PCIe generation, and chip generation for your project.
\begin { lstlisting} [basicstyle=\footnotesize \ttfamily ,language=Verilog,
commentstyle=\color { red} ,label=Listing:Altera:QsysMegawizard:ManualDriverInst,
caption=Manual (non-qsys) instantiation of Reconfiguration Driver,frame=single]
altpcie_ reconfig_ driver
#(
/* These values should match the values used for the PCIe Endpoint */
.number_ of_ reconfig_ interfaces(10), /*Set This*/
.gen123_ lane_ rate_ mode_ hwtcl(``Gen1 (2.5 Gbps)''), /*Set This*/
.INTENDED_ DEVICE_ FAMILY(``Stratix V'')) /*Set This*/
XCVRDriverGen2x8_ inst
(
/*Ports Here -- Copy from Example Designs*/
);
\end { lstlisting}
\pagebreak
\subsection { Creating Constraints files for MegaWizard and QSys Designs}
\label { Sec:Altera:QsysMegawizard:Constraints}
Advanced users may also want to edit and modify the constraint files. This not
required or recommended for novice users. The example designs in the RIFFA
\RIFFAVer ~distribution contain appropriate constraint files for the example
designs. However if the need arises, these constraints are documented below.
To appropriately constrain your PCIe reference clocks, place the constraints
shown in Listing~\ref { Listing:Altera:QsysMegawizard:Constraints} in your
\Altera { .sdc} file. Modify the names PCIE\_ REFCLK, PCIE\_ TX\_ OUT and
PCIE\_ RX\_ IN to match your design.
\begin { lstlisting} [basicstyle=\footnotesize \ttfamily ,language=tcl,
commentstyle=\color { red} ,label=Listing:Altera:QsysMegawizard:Constraints,
caption=\Altera { .sdc} constraints for Qsys and Megawizard designs,frame=single]
create_ clock -name PCIE_ REFCLK -period 10.000 [get_ ports { PCIE_ REFCLK} ]
derive_ pll_ clocks -create_ base_ clocks
derive_ clock_ uncertainty
\end { lstlisting}
Likewise, copy the constraints in
Listing~\ref { Listing:Altera:QsysMegawizard:QSFSettings} into your .qsf file. Copy
the location assignment commands for each PCIe Pin in your reference design.
\begin { lstlisting} [language=tcl,basicstyle=\footnotesize \ttfamily ,commentstyle=\color { red} ,
label=Listing:Altera:QsysMegawizard:QSFSettings,
caption=\Altera { .qsf} settings for Qsys and Megawizard designs,frame=single]
###################################################################
# PCIE Connections
###################################################################
set_ location_ assignment <PCIE_ REFCLK_ PIN> -to PCIE_ REFCLK
set_ instance_ assignment -name IO_ STANDARD HCSL -to PCIE_ REFCLK
set_ location_ assignment <PCIE_ REFCLK_ PIN(n)> -to ``PCIE_ REFCLK(n)''
set_ instance_ assignment -name IO_ STANDARD HCSL -to ``PCIE_ REFCLK(n)''
set_ location_ assignment <PCIE_ RESET_ N> -to PCIE_ RESET_ N
set_ instance_ assignment -name IO_ STANDARD ``2.5 V'' -to PCIE_ RESET_ N
# For each PCIE Lane (L) set the pin locations from the board user guide!
######################################################################
#PCIE TX_ OUT L
######################################################################
set_ location_ assignment <TX_ LANE[L]_ PIN> -to PCIE_ TX_ OUT[0]
set_ location_ assignment <TX_ LANE[L]_ PIN(n)> -to ``PCIE_ TX_ OUT[0](n)''
###################################################################
#PCIE RX_ IN L
###################################################################
set_ location_ assignment <RX_ LANE[L]_ PIN> -to PCIE_ RX_ IN[L]
set_ location_ assignment <RX_ LANE[L]_ PIN(n)> -to ``PCIE_ RX_ IN[L](n)''
\end { lstlisting}
\pagebreak
\section { IP Compiler for PCI Express (Stratix IV, and older)}
\label { Sec:Altera:IPCompiler}
To avoid licensing problems, we do not package Altera IP for the DE4
board. Manual IP Instantiation is required when using the DE4 Board, and similar
devices using the IP Compiler for PCI Express. Changing the endpoint settings
described here may change the RIFFA para\- meters
\RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} , \RIFFAParameter { C\_ MAX\_ PAYLOAD\_ BYTES}
and \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} . How these parameters relate to IP core
settings is highlighted in the following figures.
There are sever example projects inside of the \Directory { de4} directory folder
without instantiated IP. For the DE4 board, these
projects are DE4Gen1x8If64, DE4Gen2x8If128.% and DE2Gen1x1If64 for the DE2i-150
board.
As stated in Section~\ref { Sec:Intro:Decoding} , each project directory contains
five folders.
\begin { itemize}
\item The \Directory { prj/} directory contains the project \Altera { .qpf} and \Altera { .qsf} file.
\item The \Directory { hdl/} contains the top level file, e.g. DE5Gen2x8If128.v,
which instantiates the skeleton IP and the RIFFA Core.
\item The \Directory { ip/} directory is empty but will contain Altera IP generated by Quartus
in the following guide.
\item The \Directory { constr/} directory contains project-specific timing constraint files.
\item Finally the \Directory { bit/} directory contains the project .sof, or bit
file that we have tested. This bitfile will not be overwritten by subsequent
Quartus compilations.
\end { itemize}
\subsection { Generating IP with IP Compiler for PCI Express (Stratix IV, and older)}
\label { Sec:Altera:IPCompiler:Generating}
Note: The bitfile in the bit directory is not modified by recompilation in
Quartus. Quartus will generate a new bitfile (.sof) in the
\Directory { /output\_ files} directory for the DE4 board.
First, we will generate the PCIe Endpoint. Open the Altera IP Catalog and select
the IP Compiler for PCI Express. This will open the window shown in
Figure~\ref { Fig:Altera:IPCompiler:PCIeSystemSettingsTab} .
Optional: Set the Component Name of the PCI Express block, and the IP
Location. In our example projects, we typically use the name
PCIeGen\ConfigSetting { W} x\ConfigSetting { Y} If\ConfigSetting { Z} where \ConfigSetting { W} is the PCI Express
Version (Link Speed in
Figure~\ref { Fig:Gen3Integrated:Generating:Gen3IntegratedTabBasic} ), \ConfigSetting { Y}
is the lane width, and \ConfigSetting { Z} is the Avalon interface width. The IP location
is the \Directory { ip} directory in the example project.
In this guide, we will skip the Power Mangement tab shown in
Figure~\ref { Fig:Altera:IPCompiler:PCIeSystemSettingsTab} .
\begin { figure} [H]
\includegraphics [width=350px,center] { IPCompilerPCIeTabSystemSettings.png}
\caption { IP Compiler for PCI Express System Settings Tab}
\label { Fig:Altera:IPCompiler:PCIeSystemSettingsTab}
\end { figure}
In the first column of the System Settings Tab, select your \ConfigSetting { Chip
Generation/PHY Type} (\ConfigSetting { Stratix IV GX} for the DE4 board),
\ConfigSetting { Lanes} , and \ConfigSetting { Max Rate} . The \ConfigSetting { Number
of Lanes} is the parameter \RIFFAParameter { C\_ NUM\_ LANES} in the project top
level file. In the second column, select the \ConfigSetting { PCI Express Version}
(2.0, or the highest possible) and set the \ConfigSetting { Test Out Width} to
\ConfigSetting { 0} . In the third column, select the \ConfigSetting { Application Interface
Width} . The \ConfigSetting { Application Interface Width} corresponds to the RIFFA parameter
\RIFFAParameter { C\_ PCI\_ DATA\_ WIDTH} .
The choice of \ConfigSetting { Link Rate} , \ConfigSetting { Lanes} , and
\ConfigSetting { Interface Width} will set the frequency for the PCI interface,
which is clocked by the signal pld\_ clk. For the chosen settings, the frequency
is determined in the Chip User Guide (though, typically it is one of 62.5, 125,
or 256 MHz)
\begin { figure} [H]
\includegraphics [width=350px,center] { IPCompilerPCIeTabPCIRegisters.png}
\caption { IP Compiler for PCI Express Registers Tab}
\label { Fig:Altera:IPCompiler:PCIRegisters}
\end { figure}
PCI Registers Tab, shown in Figure~\ref { Fig:Altera:IPCompiler:PCIRegisters} , set
BAR0's type to ``32-bit non-prefetchable memory''. Set the size to ``1 KByte -
10 Bits''.
There are no required changes in the PCI Registers Tab. However, in a multiple
FPGA system, it may be useful to change the \ConfigSetting { Device ID} to
identify different FPGA platforms. The other options, specifically the
\ConfigSetting { Vendor ID} , must remain the same.
\begin { figure} [H]
\includegraphics [width=350px,center] { IPCompilerPCIeTabCapabilities.png}
\caption { IP Compiler for PCI Express Capabilities Tab}
\label { Fig:Altera:IPCompiler:PCIeCapabilitiesTab}
\end { figure}
Open the Capabilities Tab shown in
Figure~\ref { Fig:Altera:IPCompiler:PCIeCapabilitiesTab} . In the Device
Capabilities box, set the \ConfigSetting { Tags Supported} to
\ConfigSetting { 64} . The log of the maximum number of tags supported is the RIFFA
parameter \RIFFAParameter { C\_ LOG\_ NUM\_ TAGS} parameter in RIFFA. In the MSI
Capabilities box, set the number of \ConfigSetting { MSI Messages Requested} to
\ConfigSetting { 1} . All the remaining settings must stay the same.
\begin { figure} [H]
\includegraphics [width=350px,center] { IPCompilerPCIeTabBufferSetup.png}
\caption { IP Compiler for PCI Express Buffer Setup Tab}
\label { Fig:Altera:IPCompiler:PCIeBufferSetupTab}
\end { figure}
In Figure~\ref { Fig:Altera:IPCompiler:PCIeBufferSetupTab} select the
\ConfigSetting { Maximum Payload Size} from the dropdown menu. Use this to set the
\RIFFAParameter { C\_ MAX\_ PAYLOAD} parameter. Set the numer of \ConfigSetting { Virtual Channels} to
\ConfigSetting { 1} .
Note: \ConfigSetting { Maximum Payload} sizes are typically set by the BIOS, and
256 bytes seems to be standard. RIFFA will default to the minimum of
\RIFFAParameter { C\_ MAX\_ PAYLOAD\_ SIZE} and the setting in your BIOS. Unless your
BIOS is modified, or can support substantially larger packets, there will be no
performance benefit to increasing the payload size. Increasing the
\ConfigSetting { Maximum Payload} size will increase the resources consumed.
Next, we need to generate the PLL for the example design. Select the ALTPLL
megafunction from the Quartus IP Catalog, to open the window shown in
Figure~\ref { Fig:Altera:ALTPLL:TabGeneral} .
\begin { figure} [H]
\includegraphics [width=350px,center] { ALTPLLTabGeneral.png}
\caption { ALTPLL General Settings Tab}
\label { Fig:Altera:ALTPLL:TabGeneral}
\end { figure}
In Figure~\ref { Fig:Altera:ALTPLL:TabGeneral} , select the \ConfigSetting { Speed
Grade} that matches your board (Found in the User Guide and online). Next set
the input clock frequency. The DE4 board provides 50 MHz clock inputs and we use
these for convenience. The remaining settings are unchaged. Click on the
Inputs/Lock tab to move on to Figure~\ref { Fig:Altera:ALTPLL:TabInputs} .
Optional: Set the name of the ALTPLL block. In the example designs we use the
name ALTPLL50I50O125O250O, for 50 MHz Input clock, 50, 125, and 250 MHz output
clocks. The 125 and 50 MHz Clocks are required for the PCIe Endpoint.
\begin { figure} [H]
\includegraphics [width=350px,center] { ALTPLLTabInputs.png}
\caption { ALTPLL Input Settings Tab}
\label { Fig:Altera:ALTPLL:TabInputs}
\end { figure}
Match the settings shown in Figure~\ref { Fig:Altera:ALTPLL:TabInputs} . In the
Output Clocks Section, create a 50 MHz output clock, 125 MHz clock, and 250 MHz
Clock. Click Finish when done.
% \begin{figure}[H]
% \includegraphics[width=350px,center]{ALTPLLTabBandwidth.png}
% \caption{ALTPLL Bandwidth Tab}
% \label{Fig:Altera:ALTPLL:TabBandwidth}
% \end{figure}
% \begin{figure}[H]
% \includegraphics[width=350px,center]{ALTPLLTabSwitchover.png}
% \caption{ALTPLL Switchover Tab}
% \label{Fig:Altera:ALTPLL:TabSwitchover}
% \end{figure}
Finally, we generate the ALTGX\_ RECONFIG Megafunction. Select the
ALTGX\_ RECONFIG megafunction from the Quartus IP Catalog to produce the widown
shown in Figure~\ref { Fig:Altera:ALTGX:TabSettings} .
\begin { figure} [H]
\includegraphics [width=350px,center] { ALTGXReconfigTabSettings.png}
\caption { ALTGX Reconfiguration Settings Tab}
\label { Fig:Altera:ALTGX:TabSettings}
\end { figure}
In the Reconfiguration Tab shown in Figure~\ref { Fig:Altera:ALTGX:TabSettings} , set the \ConfigSetting { Number of Channels} . This should be equal to the number of PCIe Lanes at the Top Level. In the Features Section, \ConfigSetting { Enable Analog Controls} . Match the settings in the remaining windows, shown in Figure~\ref { Fig:Altera:ALTGX:TabInputs} ,Figure~\ref { Fig:Altera:ALTGX:TabChannel} , and Figure~\ref { Fig:Altera:ALTGX:TabError} .
\begin { figure} [H]
\includegraphics [width=350px,center] { ALTGXReconfigTabAnalog.png}
\caption { ALTGX Reconfiguration Analog Settings Tab}
\label { Fig:Altera:ALTGX:TabInputs}
\end { figure}
\begin { figure} [H]
\includegraphics [width=350px,center] { ALTGXReconfigTabChannel.png}
\caption { ALTGX Reconfiguration Channel Tab}
\label { Fig:Altera:ALTGX:TabChannel}
\end { figure}
\begin { figure} [H]
\includegraphics [width=350px,center] { ALTGXReconfigTabError.png}
\caption { ALTGX Reconfiguration Error Tab}
\label { Fig:Altera:ALTGX:TabError}
\end { figure}
\pagebreak
\subsection { Creating Constraints files for IP Compiler Designs}
Advanced users may also want to edit and modify the constraint files. This not
required or recommended for novice users. The example designs in the RIFFA
\RIFFAVer ~distribution contain appropriate constraint files for the example
designs. However if the need arises, we demonstrate the constraints we used
below.
If the goal is to generate a RIFFA design completely from scratch, each board
directory comes with a RIFFA wrapper verilog file and instantiates a
vendor-specific translation layer. It is highly recommended to re-use these
files RIFFA wrapper when creating designs from scratch. Users should also use
the constraints file (\Altera { .sdc} ) in the board directory, and in the
\Directory { constr/} , or read the User Guide provided with each board.
To appropriately constrain your PCIe reference clocks, place the constraints
shown in Listing~\ref { Listing:Altera:IPCompiler:Constraints} in your
\Altera { .sdc} file. Modify the name of the osc\_ 50MHz and PCIE\_ REFCLK ports to
match your design
\begin { lstlisting} [language=tcl,basicstyle=\footnotesize \ttfamily ,commentstyle=\color { red} ,
label=Listing:Altera:IPCompiler:Constraints,
caption=\Altera { .sdc} constraints for Qsys and Megawizard designs,frame=single]
create_ clock -name PCIE_ REFCLK -period 10.000 [get_ ports { PCIE_ REFCLK} ]
create_ clock -name osc_ 50MHz -period 20.000 [get_ ports { OSC_ BANK3D_ 50MHZ} ]
derive_ pll_ clocks -create_ base_ clocks
derive_ clock_ uncertainty
# 50 MHZ PLL Clock
create_ generated_ clock -name clk50 -source [get_ ports { OSC_ 50_ BANK2} ] \
[get_ nets { *|altpll_ component|auto_ generated|wire_ pll1_ clk[0]} ]
# 125 MHZ PLL Clock
create_ generated_ clock -name clk125 -multiply_ by 5 -divide_ by 2 -source \
[get_ ports { OSC_ 50_ BANK2} ] \
[get_ nets { *|altpll_ component|auto_ generated|wire_ pll1_ clk[1]} ]
# 250 MHZ PLL Clock
create_ generated_ clock -name clk250 -multiply_ by 5 \
-source [get_ ports { OSC_ 50_ BANK2} ] [get_ nets \
{ *|altpll_ component|auto_ generated|wire_ pll1_ clk[2]} ]
\end { lstlisting}
Likewise, copy the constraints in
Listing~\ref { Listing:Altera:IPCompiler:QSFSettings} into your \Altera { .qsf}
file. Copy the location assignment commands for each PCIe Pin in your reference
design.
\begin { lstlisting} [language=tcl,basicstyle=\footnotesize \ttfamily ,
commentstyle=\color { red} ,label=Listing:Altera:IPCompiler:QSFSettings,
caption=\Altera { .qsf} settings for IP Compiler Designs,frame=single]
###################################################################
# PCIE Connections
###################################################################
set_ location_ assignment <PCIE_ REFCLK_ PIN> -to PCIE_ REFCLK
set_ instance_ assignment -name IO_ STANDARD HCSL -to PCIE_ REFCLK
set_ location_ assignment <PCIE_ REFCLK_ PIN(n)> -to ``PCIE_ REFCLK(n)''
set_ instance_ assignment -name IO_ STANDARD HCSL -to ``PCIE_ REFCLK(n)''
set_ location_ assignment <PCIE_ RESET_ N> -to PCIE_ RESET_ N
set_ instance_ assignment -name IO_ STANDARD ``2.5 V'' -to PCIE_ RESET_ N
# For each PCIE Lane (L) set the pin locations from the board user guide!
######################################################################
#PCIE TX_ OUT L
######################################################################
2015-07-15 10:26:04 -07:00
set_ location_ assignment <TX_ LANE[L]_ PIN> -to PCIE_ TX_ OUT[L]
set_ location_ assignment <TX_ LANE[L]_ PIN(n)> -to ``PCIE_ TX_ OUT[L](n)''
2015-07-14 09:43:51 -07:00
###################################################################
#PCIE RX_ IN L
###################################################################
set_ location_ assignment <RX_ LANE[L]_ PIN> -to PCIE_ RX_ IN[L]
set_ location_ assignment <RX_ LANE[L]_ PIN(n)> -to ``PCIE_ RX_ IN[L](n)''
\end { lstlisting}
\pagebreak
\chapter { Developer Documentation}
This chapter describes RIFFA \RIFFAVer ~at a level of detail that is useful for
RIFFA developers. Users of RIFFA should not read this section until they are
comfortable developing for RIFFA or have experience with PCIe and DMA
concepts.
\section { Architecture Description}
\begin { figure} [H]
\includegraphics [width=350px,center] { RIFFA.pdf}
\caption { High level RIFFA Diagram}
\label { Fig:Developer:HighLevel:Full}
\end { figure}
\begin { itemize}
\item \textbf { \color { red} { IP Interfaces} } The Vendor IP interfaces provied
low-level access to the PCIe bus. Each vendor provides a set of signals for
communicating over PCIe. Xilinx FPGAs without PCIe Gen3 support provide an
interface very similar to Altera FPGAs. We call this the ``Classic
Interface''. Newer Xilinx devices with PCIe Gen3 support have completely
different non-compatible interfaces (CC, CQ, RC, RQ instead of RX and TX). We
call this the ``Xilinx Ultrascale Interface''.
\textit { Files: \Xilinx { *.xci} , \Altera { *.qsys} (And others generated by vendor
tools)}
\item \textbf { \color { orange} { Translation Layer} } The Translation Layer provides
a set of vendor-independent interfaces and signal names.
There is one translation layer for each interface. The ``Classic Translation
Layer'' provides a set of interfaces (RX, TX, Interrupt, and Configuration)
and vendor independent signal names to higher layers. There is very little
logic in these layers, and there should be no timing-critical logic here.
The ``Ultrascale Translation Layer'' operates on the ultrascale
interface. Similar to the classic translation layer, it contains very little
logic. It provides the interfaces: RX Completion, RX Request, TX
Completion, TX Request, Interrupt, and Configuration.
\textit { Files: translation\_ altera.v, translation\_ xilinx.v,
txc\_ engine\_ ultrascale.v,\\ txr\_ engine\_ ultrascale.v}
\item \textbf { \color { yellow} { Formatting Engine Layer} } The Formatting Engine
Layer is responsible for formatting requests and completions into
packets. This layer provides four interfaces: RX Completion (RXC) for
receiving completions (responses to memory read requests), RX Request (RXR)
for receiving memory read and write requests, TX Completion (TXC) for
transmitting completions (reponses to memory read requests), and TX Request
(TXR) for transmitting read and write requests.
The engine layer abstracts vendor specific features, such as Xilinx's
Classic-Interface Big-Endian requirement and Altera's Quad-word
Alignment. The C\_ VENDOR parameter for the engine layer switches between
Xilinx, Altera, and Ultrascale logic to produce TLPs (Classic Interface) and
AXI Descriptors (Ultrascale Interface).
The RX path of the engine layer has packet parsers for TLPs and AXI
Descriptors. These are parameterized by width, as of RIFFA 2.2. The TX Path of
the engine layer has packet formatters for TLPs and AXI Descriptors.
As alluded to in the Translation Layer, the Classic IP Cores provide only two
transmit interfaces (RX, and TX), while the Xilinx Ultrascale IP Core handles
RX Demultiplexing and multiplexing internally and provides four interfaces
(RXC, RXR, TXC, and TXR). For this reason, the multiplexing/FIFO logic used in
the Classic interfaces are not necessary for the Xilinx interface.
After the Engine-Layer, higher layers should be vendor agnostic, if not bus
agnostic. The exception will be sideband signals signals. (How much of this
ideal can be achieved remains to be seen)
Note: The engine layer currently uses word-aligned addresses, and byte-enable
signals to specify sub-word addresses. In the future, all addresses will be
byte-aligned and word enables will be handled in the formatting logic.
\textit { Files: engine\_ layer.v, schedules.vh, rx\_ engine\_ classic.v, rxc\_ engine\_ classic.v,\\
rxr\_ engine\_ classic.v, tx\_ engine\_ classic.v, txc\_ engine\_ classic.v,\\
txr\_ engine\_ classic.v, rx\_ engine\_ ultrascale.v, rxc\_ engine\_ ultrascale.v,\\
rxr\_ engine\_ ultrascale.v, tx\_ engine\_ ultrascale.v, txc\_ engine\_ ultrascale.v, \\
txr\_ engine\_ ultrascale.v}
\item \textbf { \color { green} { Scatter Gather (SG) DMA Layer} } The Scatter Gather
DMA Layer handles reading from and writing to scatter gather lists and
providing the addresses found in these lists to the data-request logic in the
Data Abstraction layer. In RIFFA, each channel has its own SG DMA list logic.
The Completion Merge/Reorder buffer handles out-of-order completions. In the
PCIe specification, a memory request can be serviced by multiple smaller
completions (the responses must remain in order). Completions from different
memory requests can be returned in any order. The reorder buffer releases data
when all of the responses to a memory request have been received.
Memory read and write requests to the host are multiplexed by the TX Request
Mux. These are serviced fairly in round robin order.
The Scatter Gather List Readers issue read requests to read data from the
Scatter Gather List (SGL) created by the driver. This list contains the
address and length of pages containing data to transmit. When an SGL has been
exhausted, an interrupt is raised and the SGL is refilled or the transaction
is comlete.
Each element in the SGL 128-bit triple: { 32'b0, 32'b Length of Data in 32-bit
words, 64'b Address of Page} . The addresses in this list are provided to the
DMA Data Read Engine in the Data Abstraction layer. Since the SGL must be a
single continuous stream of 128-bit elements regardless of the size of the
interface, gaps and mis-alignments due to packet formatting are removed using
the Data Packer, which receives its data from the reorder buffer.
The location of the SGL in host memory is written to the BAR Memory space.
The BAR Memory space is partitioned among the channels. Only the host can
issue read and write requests to this memory space. Since the memory space is
partitioned, the RX Request interface and TX Completion interface do not have
demultiplexing or multiplexing logic.
A more through treatment of the SG DMA Layer can be found in Sec.~\ref { Sec:Developer:Arch:SGDMA} .
\textit { Files: reorder\_ queue*.v, sg\_ list\_ reader\_ * .v, sg\_ list\_ requester.v \\
fifo\_ packer\_ * .v, registers.v, tx\_ multiplexer\_ * .v}
\item \textbf { \color { blue} { Data Abstraction / DMA Layer} } The Data Abstraction /
DMA Layer is responsible for making requests to read data from, or write data
to host memory.
The read and write addresses are provided by the Scatter Gather list
readers. Since RIFFA provides a single continuous stream of 32-bit words
regardless of the size of the interface, gaps and mis-alignments due to packet
formatting are removed using the Data Packer, which receives its data from the
reorder buffer. On the TX side, this is not necessary. However a write buffer,
and other transaction tracking logic is necessary for buffering, and removing
non-integral data.
A more through treatment of the Data Abstraction Layer can be found in
Sec.~\ref { Sec:Developer:Arch:DataAbstraction} .
\textit { Files: reorder\_ queue*.v, rx\_ port\_ * .v, rx\_ port\_ reader.v,\\
fifo\_ packer\_ * .v, tx\_ port\_ writer.v tx\_ port\_ buffer\_ * .v tx\_ port\_ monitor\_ * .v}
\item \textbf { \color { purple} { Channel Interface} }
\textit { Files: rx\_ channel\_ gate\_ * .v, tx\_ channel\_ gate\_ * .v}
\contourlength { .2pt}
\item \contour { black} { \textbf { \color { white} { User Logic} } }
\end { itemize}
\subsection { Scatter Gather DMA Layer}
\label { Sec:Developer:Arch:SGDMA}
READS from the SG lists are prioritized
\subsection { Data Abstraction DMA Layer}
\label { Sec:Developer:Arch:DataAbstraction}
\section { Software Description}
\section { FPGA RX Transfer / Host Send}
\begin { center}
\begin { tabular} { | l | l |}
\hline
Parameter & Value \\ \hline
Data Transfer Length & 128 (32-bit words)\\ \hline
Data Transfer Offsfet & 0\\ \hline
Data Transfer Last & 1\\ \hline
Data Transfer Channel & 0\\ \hline
Data Page Address (DMA) & 0x00000000\_ FEED0000\\ \hline
SGL Head Address & 0x00000000\_ BEEF0000\\ \hline
\end { tabular}
\end { center}
\begin { itemize}
\item A user makes an call to fpga\_ send() to transfer 128 32-bit words of data on Channel 0. \item The RIFFA driver writes \{ 32'd128\} to Channel 0's RX Length register, and
\{ 31'd0,1'b1\} to Channel 0's RX OffLast register. This notifies the FPGA that a new
transfer is happening and will raise CHNL\_ RX for the user
application. \textit { Files: rxr\_ engine\_ * .v, registers.v, channel*.v,
rx\_ port.v rx\_ port\_ gate.v, rx\_ port\_ reader.v}
\item The RIFFA driver allocates an SGL with 1 element (4 32-bit words) at
address \\ \{ 64'h0000\_ 0000\_ BEEF\_ 0000\} . The driver fills the list with the
length and address of the user data:\\
\{ 32'd0,32'd128,64'h0000\_ 0000\_ FEED\_ 0000\} .
\item The RIFFA driver communicates the address and length of the SGL by
writing \\ \{ 32'hBEEF0000\} to to Channel 0's RX SGL Address Low register,
\{ 32'd0\} to to Channel 0's RX SGL Address High register, and \{ 32'd4\} to to
Channel 0's RX SGL Length register. Writing the RX SGL Length register
notifies the RX SG Engine that a transfer has started, and the low and high
portions of the 64-bit RX SGL Address are valid. \textit { Files:
rxr\_ engine\_ * .v, registers.v, channel*.v, rx\_ port.v,
sg\_ list\_ requester.v}
\item The SG List Requester on the FPGA issues a read request for 4 32-bit
words of data starting at address 0xBEEF0000. The FPGA also issues an
interrupt. The RIFFA driver reads the Interrupt Status Register of the FPGA
and determines that Channel 0 has finished reading the RX SGL. \textit { Files:
sg\_ list\_ requester.v, rx\_ port\_ requester\_ mux.v, rx\_ port\_ * .v,
channel*.v, tx\_ multiplexer.v, engine\_ layer.v, txr\_ engine\_ * .v,
interrupt.v}
\item The FPGA receieves a completion with 4 32-bit words. After being enqueued
in the reorder buffer, the completion is delivered to Channel 0, and packed
into the SGL RX Fifo. \textit { Files: rxc\_ engine\_ * .v, engine\_ layer.v,
reorder\_ queue*.v, fifo\_ packer\_ * .v}
\item The RX Port Reader removes the SG element from the FIFO, and issues
several read requests to receive all 128 32-bit words. \textit { Files:
rx\_ port\_ reader.v, rx\_ port\_ * .v, channel*.v, tx\_ multiplexer.v,
engine\_ layer.v, txr\_ engine\_ * .v, tx\_ multiplexer.v}
\item The completions return interleaved and are reordered in the reorder
buffer. The reorder buffer releases the completions in order to the fifo
packer, which puts them in the FIFO. The RX Port Channel Gate issues the data
to the user. \textit { Files: rxc\_ engine\_ * .v, engine\_ layer.v, reorder\_ queue*.v,
fifo\_ packer\_ * .v, rx\_ port\_ reader.v, rx\_ port\_ channel\_ gate.v, channel*.v}
\item The FPGA raises an interrupt with the last word of data is put into the
Main Data Fifo. The RIFFA driver reads the Interrupt Status Register of the
FPGA and determines that Channel 0 has finished the RX Transaction. The RIFFA
driver reads the RX Words Read register to determine how many words were read
during the transaction.
\item Control is returned to the user.
\end { itemize}
\section { TX Transfer}
\section { FPGA RX Transfer / Host Send}
\begin { center}
\begin { tabular} { | l | l |}
\hline
Parameter & Value \\ \hline
Data Transfer Length & 128 (32-bit words)\\ \hline
Data Transfer Offsfet & 0\\ \hline
Data Transfer Last & 1\\ \hline
Data Transfer Channel & 0\\ \hline
Data Page Address (DMA) & 0x00000000\_ FEED0000\\ \hline
SGL Head Address & 0x00000000\_ BEEF0000\\ \hline
\end { tabular}
\end { center}
\begin { itemize}
\item A user makes an call to fpga\_ recv() to transfer 128 32-bit words of data
from Channel 0.
\item The RIFFA driver allocates an SGL with 1 element (4 32-bit words) at
address\\ \{ 64'h0000\_ 0000\_ BEEF\_ 0000\} . The driver fills the list with the
length and address of the user data:
\{ 32'd0,32'd128,64'h0000\_ 0000\_ FEED\_ 0000\} .
\item The user application independently raises CHNL\_ TX and starts writing data to\\
CHNL\_ TX\_ DATA. RIFFA core logic reads transaction parameters from
CHNL\_ TX\_ OFF, CHNL\_ TX\_ LAST, and CHNL\_ TX\_ LEN and acknowledges them with
CHNL\_ TX\_ ACK. \\ \textit { Files: tx\_ port\_ channel\_ gate.v}
\item An interrupt is raised by the FPGA. The RIFFA driver reads the Interrupt
Status Register of the FPGA and determines that Channel 0 wishes to start a
new TX Transaction. The driver ISR reads \{ 32'd128\} from Channel 0's TX
Length register, and \{ 31'd0, 1'b1\} from Channel 0's TX OffLast
register. Reading the OffLast register notifies the FPGA that the new
transfer has been accepted. \textit { Files: rxr\_ engine\_ * .v, riffa.v,
registers.v, channel*.v, tx\_ port\_ * .v, tx\_ port\_ writer.v,
tx\_ port\_ monitor\_ * .v, engine\_ layer.v, txc\_ engine\_ * .v}
\item The RIFFA driver communicates the address and length of the SGL by
writing\\ \{ 32'hBEEF\_ 0000\} to to Channel 0's TX SGL Address Low register,
\{ 32'd0\} to to Channel 0's TX SGL Address High register, and \{ 32'd4\} to to
Channel 0's TX SGL Length register. Writing the TX SGL Length register
notifies the TX SG Engine that a transfer has started, and the low and high
portions of the 64-bit TX SGL Address are valid. \textit { Files:
rxr\_ engine\_ * .v, registers.v, channel*.v, rx\_ port.v,
sg\_ list\_ requester.v}
\item The SG List Requester on the FPGA issues a read request for 4 32-bit
words of data starting at address 0xBEEF0000. The FPGA raises an
interrupt. The RIFFA driver reads the Interrupt Status Register of the FPGA
and determines that Channel 0 has finished reading the TX SGL. \textit { Files:
sg\_ list\_ requester.v, rx\_ port\_ requester\_ mux.v, rx\_ port\_ * .v,
channel*.v, tx\_ multiplexer.v, engine\_ layer.v, txr\_ engine\_ * .v,
interrupt.v}
\item The FPGA receieves a completion with 4 32-bit words. After being enqueued
in the reorder buffer, the completion is delivered to Channel 0, and packed
into the SGL TX Fifo. \textit { Files: rxc\_ engine\_ * .v, engine\_ layer.v,
reorder\_ queue*.v, fifo\_ packer\_ * .v}
\item The TX Port Writer removes the SG element from the FIFO, and issues
several write requests to write all 128 32-bit words. \textit { Files:
tx\_ port\_ monitor.v, tx\_ port\_ writer.v, tx\_ port\_ * .v, channel*.v, tx\_ multiplexer.v,
engine\_ layer.v, txr\_ engine\_ * .v, tx\_ multiplexer.v}
\item When the last write transaction has been accepted by the core, the FPGA
raises an interrupt. The RIFFA driver reads the Interrupt Status Register of
the FPGA and determines that Channel 0 has finished writing data. The RIFFA
driver reads the TX Words Written register to determine how many words were
written during the transaction (in case of early termination, or overflow).
\textit { Files: rxr\_ engine\_ * .v, riffa.v, interrupt.v, registers.v, channel*.v,
tx\_ port\_ * .v, tx\_ port\_ writer.v, engine\_ layer.v, txc\_ engine\_ * .v}
\item Control is return to the user because the TX\_ LAST signal was set to 1.
\end { itemize}
%\pagebreak
%\chapter{The RIFFA Architecture}
%\section{Translation Layer}
%\section{Engine Layer}
%\section{DMA Layer}
%\section{RIFFA}
%\section{The DMA Engine Layer}
%\chapter{Version Notes}
%\section{References, Notes and Reading Material}
%\section{Core Diagrams}
\end { document}