Introducing OCuLink support for the Open Source Jetson Orin Baseboard
Published:
Topics: Open hardware, Open machine vision
Many embedded applications require physically distancing the sensors, storage or communication modules from the processing platforms. This may be needed e.g. to maintain adequate cooling, or allow for extra components such as radiation shields. The mechanical setup of complex devices, operating in challenging environments, both outdoors and indoors, can also be a factor in itself.
An interesting solution to handling such separation is OCuLink, short for Optical-Copper Link, a technology that allows connecting PCIe endpoint devices, such as graphics cards, to a processing platform using cables, rather than direct slot connectors. Standardized and maintained by the PCI-SIG group for high-bandwidth connectivity, OCuLink offers a more direct and lower-latency connection compared to e.g. Thunderbolt. It is an excellent solution for hardware which requires its PCIe-based peripherals to be connected at a distance. Examples include servers with multiple NVMe drives that need cooling, and high-speed/high-resolution cameras, especially those connected to hardware that is spread out in a confined, narrow space.
Antmicro has recently designed and released an OCuLink-themed series of open source boards and adapters for our highly popular Jetson Orin Baseboard to support customer cases which use standardized OCuLink cabling as the transmission medium for PCIe-enabled devices. In the sections below, we explore what new capabilities those additions to Antmicro’s open hardware setup bring to the table for developing Orin-powered edge devices.
Complete Jetson Orin Baseboard with OCuLink to PCIe system
PCIe enables the transfer of huge amounts of data with minimal latency, but (normally) only works over very short distances. To transmit PCIe data over a cable, consumer standards like Thunderbolt and USB are often used, although those are higher-level interfaces requiring additional chips for encapsulating and unpacking PCIe data.
OCuLink, on the other hand, was specifically designed with point-to-point PCIe transmission in mind. While this limits its possible applications to PCIe only, it is simpler electrically and adds no encapsulation overhead, which makes it a better fit for many embedded applications than e.g. Thunderbolt.
Below we present a complete open source system showcasing how to connect PCIe devices, such as GPUs, to the Jetson Orin Baseboard using the OCuLink to PCIe Adapter and the Jetson Orin Baseboard OCuLink Expansion Board.
The setup features the following open source hardware:
- Jetson Orin Baseboard - an open source baseboard supporting NVIDIA Jetson Orin Nano and Jetson Orin NX System-on-Modules. You can read our previous blog note dedicated to it, and purchase the Jetson Orin Baseboard on CircuitHub.
- Jetson Orin Baseboard OCuLink Expansion Board - lets you use the 2-lane PCIe Gen 4 interface exposed by the NVIDIA Jetson Orin NX or Nano System-on-Module (SoM). The board exposes two OCuLink connectors, break-routed from the SoM; in the current revision it offers a maximum bandwidth of PCIe 3.0.
- OCuLink to PCIe Adapter - used with the Jetson Orin Baseboard OCuLink Expansion Board, it converts OCuLink back into PCIe, break-routing a regular OCuLink x4 connector into an x16 PCIe edge card slot.
You can explore the Jetson Orin Baseboard OCuLink to PCIe setup in detail in the Complex Devices section of Antmicro’s System Designer, including block diagrams, HBOM and interactive lists of components, highlighted hot areas, data sheets, related devices, beautiful 3D renders, and more. The sample setup features an off-the-shelf NVIDIA Tesla T4 GPU as a PCIe device connected to the Jetson host.
Additionally, to provide OCuLink connectivity to other host devices, we have developed the M.2 to OCuLink Adapter. The board extracts x4 PCIe lanes from an M.2 M-key connector. In the case of Jetson Orin Baseboard, this is the J2 connector, exposing Jetson’s PCIe0 interface.
The PCIe0 interface of the Jetson SoM can work in both endpoint and root complex mode, so this adapter also lets you use the Jetson SoM as a PCIe endpoint device, or build setups with multiple Jetson Orin Baseboards. For example, the Jetson Orin Baseboard can be connected as an accelerator to an FPGA-based system, resulting in a flexible, hybrid processing setup.
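When experimenting with adapters like these, it can be useful to check which link speed and width the host actually negotiated with each device. The sketch below is one possible way to do that on a Linux host, assuming the standard `current_link_speed` and `current_link_width` sysfs attributes are available; the function name `read_pcie_links` is our own and not part of any Antmicro tooling.

```python
from pathlib import Path


def read_pcie_links(sysfs_root="/sys/bus/pci/devices"):
    """Return {device address: (current_link_speed, current_link_width)}.

    Devices that do not expose the link attributes are skipped.
    The default sysfs path is an assumption about a typical Linux host.
    """
    links = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return links
    for dev in sorted(root.iterdir()):
        try:
            speed = (dev / "current_link_speed").read_text().strip()
            width = (dev / "current_link_width").read_text().strip()
        except OSError:
            continue  # attribute missing or not readable for this device
        links[dev.name] = (speed, width)
    return links


if __name__ == "__main__":
    for address, (speed, width) in read_pcie_links().items():
        print(f"{address}: {speed}, x{width}")
```

A link that reports fewer lanes or a lower speed than expected can point to a cabling or adapter issue worth checking before running throughput tests.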
Performance tests
To verify the functionality, performance and signal integrity of the described system, we conducted some real-world tests. We used a Jetson Orin Nano SoM plugged into a Jetson Orin Baseboard with the Jetson Orin Baseboard OCuLink Expansion Board and a 1-meter off-the-shelf OCuLink cable leading to an OCuLink to PCIe Adapter, with various PCIe devices plugged in, in the following test scenarios:
- We verified Ethernet transfers with an off-the-shelf Asus XG-C100C V2 10 GbE network card, arriving at 9.42 Gb/s (TCP/IP stream, measured by iperf3).
- The other test focused on memory transfers to an external GPU. With an NVIDIA Tesla GPU connected via a 2-lane PCIe OCuLink interface, we achieved 3 GB/s throughput in both read and write operations.
- Finally, we tested the setup with a Samsung 980 PRO 1TB M.2 NVMe drive. In this case, an additional PCIe to M.2 adapter was required on the device side. The drive read and write speeds we measured were in the range of 1550-1600 MB/s. Since we suspected that the 2-lane Gen3 PCIe link might be a bottleneck here, we repeated the test with the M.2 to OCuLink Adapter on the Jetson side, replacing the Jetson’s 2-lane PCIe2 interface with the 4-lane PCIe0. With this setup, we achieved ~2400 MB/s read/write speeds on the drive.
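To put the NVMe numbers in context, the raw ceiling of a PCIe link can be estimated from the per-lane signaling rate and the line encoding. The following is a minimal sketch of that arithmetic, not a measurement tool; the function name is our own, and the estimate deliberately ignores packet and protocol overhead, so real transfers will always land below it.

```python
# Per-lane signaling rate (GT/s) and line-encoding efficiency per PCIe generation.
PCIE_GENERATIONS = {
    1: (2.5, 8 / 10),      # 8b/10b encoding
    2: (5.0, 8 / 10),      # 8b/10b encoding
    3: (8.0, 128 / 130),   # 128b/130b encoding
    4: (16.0, 128 / 130),  # 128b/130b encoding
}


def pcie_link_mb_s(generation, lanes):
    """Estimate the one-direction link ceiling in MB/s (10^6 bytes per second).

    Only line encoding is accounted for; TLP/DLLP overhead is ignored.
    """
    rate_gt_s, efficiency = PCIE_GENERATIONS[generation]
    bits_per_second = rate_gt_s * 1e9 * efficiency
    return bits_per_second / 8 / 1e6 * lanes


if __name__ == "__main__":
    for lanes in (2, 4):
        print(f"Gen3 x{lanes}: about {pcie_link_mb_s(3, lanes):.0f} MB/s")
```

Under these assumptions a Gen3 x2 link tops out at roughly 1969 MB/s and a Gen3 x4 link at roughly 3938 MB/s. The 1550-1600 MB/s measured for the drive over two lanes sits close to the x2 ceiling once protocol overhead is considered, which is consistent with the link being the limiting factor in that test.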
Potential applications
Using OCuLink with the Jetson Orin Baseboard, we can integrate not only M.2 NVMe drives, but also ones using other standards like U.2, E3.S or E1.L. Such drives are available in very high capacity options like 120 TB, and have many advantages compared to consumer-grade M.2 drives, typically offering better endurance (TBW/DWPD) and very high performance for specific workloads. They can be optimized for write operations, sequential access, or IOPS. This opens up many new possibilities for applications such as LLMs, data-intensive workloads, and capturing data for edge processing.
Another potential application is high-throughput cameras. The default camera interface of Jetson modules, MIPI CSI-2 D-PHY, offers a maximum bandwidth of 10 Gb/s on 4 lanes. With PCIe, we can go beyond that and support high-resolution, high data rate imagers. A good example is the XIMEA xiB series with a 20 Gb/s data rate. High-speed CSI is also limited to cable lengths of 15-30 cm, while OCuLink can operate with cables as long as 1 m.
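A quick way to judge whether a given imager fits within the CSI-2 budget quoted above is to estimate its raw pixel data rate. The sketch below does this for two hypothetical sensor modes (the resolutions, frame rates and bit depths are illustrative assumptions, not the specifications of any camera named here) and ignores blanking and protocol overhead.

```python
CSI2_DPHY_4LANE_GBIT_S = 10.0  # 4-lane D-PHY figure quoted in the text


def raw_video_gbit_s(width, height, fps, bits_per_pixel):
    """Raw pixel payload in Gb/s, without blanking or protocol overhead."""
    return width * height * fps * bits_per_pixel / 1e9


def fits_csi2(width, height, fps, bits_per_pixel):
    """True if the raw payload stays within the 4-lane CSI-2 D-PHY budget."""
    return raw_video_gbit_s(width, height, fps, bits_per_pixel) <= CSI2_DPHY_4LANE_GBIT_S


if __name__ == "__main__":
    # Hypothetical sensor modes, chosen only for illustration.
    modes = {
        "1920x1080 @ 60 fps, 10-bit": (1920, 1080, 60, 10),
        "4096x3072 @ 100 fps, 10-bit": (4096, 3072, 100, 10),
    }
    for name, mode in modes.items():
        rate = raw_video_gbit_s(*mode)
        verdict = "fits" if fits_csi2(*mode) else "exceeds"
        print(f"{name}: {rate:.2f} Gb/s, {verdict} the 4-lane CSI-2 budget")
```

The first mode needs about 1.24 Gb/s and fits comfortably, while the second needs about 12.58 Gb/s and would already call for a wider interface such as PCIe.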
Here are some examples of high-speed cameras which you can wire up using Antmicro’s OCuLink to PCIe Adapter:
- XIMEA xiX-XL series
- also check the XIMEA PCI Express Camera Zone
- Teledyne PCIe Cameras
- Balluff PCI Express cameras
OCuLink for your next Jetson-based device
With new, data-intensive use cases, applications for an OCuLink-supported Jetson Orin Baseboard are ever expanding. For many of them, such as hyperspectral cameras, only PCIe can provide a stable connection with enough bandwidth to handle the necessary throughput, and OCuLink is a reliable option to transmit that data in mechanically challenging circumstances.
With Antmicro’s expertise in developing end-to-end vision systems, we help customers build high-speed camera and data storage/processing devices that leverage the numerous possibilities enabled by our Jetson Orin Baseboard (and other processing hardware, including AMD Zynq/US+/Versal, Qualcomm Snapdragon, or NXP’s i.MX and Layerscape series) coupled with accessories like the OCuLink to PCIe Adapter.
If you are looking to build a complex embedded system, aiming to integrate OCuLink or other high-speed data links, Antmicro can help you design, build, and test your next system. Just send us an email at contact@antmicro.com to discuss your use case.