author    | Christian Ehrhardt <christian.ehrhardt@canonical.com> | 2017-05-16 14:51:32 +0200
committer | Christian Ehrhardt <christian.ehrhardt@canonical.com> | 2017-05-16 14:51:32 +0200
commit    | fca143f059a0bddd7d47b8dc2df646a891b0eb0f (patch)
tree      | 4bfeadc905c977e45e54a90c42330553b8942e4e /doc/guides/nics
parent    | ce3d555e43e3795b5d9507fcfc76b7a0a92fd0d6 (diff)
Imported Upstream version 17.05
Diffstat (limited to 'doc/guides/nics')
56 files changed, 2938 insertions, 906 deletions
diff --git a/doc/guides/nics/ark.rst b/doc/guides/nics/ark.rst new file mode 100644 index 00000000..a7c2590b --- /dev/null +++ b/doc/guides/nics/ark.rst @@ -0,0 +1,261 @@ +.. BSD LICENSE + + Copyright (c) 2015-2017 Atomic Rules LLC + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Atomic Rules LLC nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +ARK Poll Mode Driver +==================== + +The ARK PMD is a DPDK poll-mode driver for the Atomic Rules Arkville +(ARK) family of devices. + +More information can be found at the `Atomic Rules website +<http://atomicrules.com>`_. + +Overview +-------- + +The Atomic Rules Arkville product is a DPDK- and AXI-compliant product +that marshals packets across a PCIe conduit between host DPDK mbufs and +FPGA AXI streams. + +The philosophy of the ARK PMD, and the spirit of the overall Arkville +product, has been to take the DPDK API/ABI as a fixed specification; +then implement much of the business logic in FPGA RTL circuits. +The approach of *working backwards* from the DPDK API/ABI and having +the GPP host software *dictate*, while the FPGA hardware *copes*, +results in significant performance gains over a naive implementation. + +While this document describes the ARK PMD software, it is helpful to +understand what the FPGA hardware is and is not. The Arkville RTL +component provides a single PCIe Physical Function (PF) supporting +some number of RX/Ingress and TX/Egress Queues. The ARK PMD controls +the Arkville core through a dedicated opaque Core BAR (CBAR). +To allow users full freedom for their own FPGA application IP, +an independent FPGA Application BAR (ABAR) is provided. + +One popular way to imagine Arkville's FPGA hardware aspect is as the +FPGA PCIe-facing side of a so-called Smart NIC. The Arkville core does +not contain any MACs, and is link-speed independent, as well as +agnostic to the number of physical ports the application chooses to +use. The ARK driver exposes the familiar PMD interface to allow packet +movement to and from mbufs across multiple queues. + +However, FPGA RTL applications could contain a universe of added +functionality that an Arkville RTL core does not provide or cannot +anticipate.
To allow for this expectation of user-defined +innovation, the ARK PMD provides a dynamic mechanism of adding +capabilities without having to modify the ARK PMD. + +The ARK PMD is intended to support all instances of the Arkville +RTL Core, regardless of configuration, FPGA vendor, or target +board. While specific capabilities such as the number of physical +hardware queue-pairs are negotiated, the driver is designed to +remain constant over a broad and extendable feature set. + +Intentionally, Arkville by itself DOES NOT provide common NIC +capabilities such as offload or receive-side scaling (RSS). +These capabilities would be viewed as a gate-level "tax" on +Green-box FPGA applications that do not require such functions. +Instead, they can be added as needed with essentially no +overhead to the FPGA Application. + +The ARK PMD also supports optional user extensions through dynamic linking. +The ARK PMD user extensions are a feature of Arkville’s DPDK +net/ark poll mode driver, allowing users to add their +own code to extend the net/ark functionality without +having to make source code changes to the driver. One motivation for +this capability is that while DPDK provides a rich set of functions +to interact with NIC-like capabilities (e.g. MAC addresses and statistics), +the Arkville RTL IP does not include a MAC. Users can supply their +own MAC or custom FPGA applications, which may require control from +the PMD. The user extension is the means of providing control +between the user's FPGA application and the existing DPDK features via +the PMD. + +Device Parameters +----------------- + +The ARK PMD supports device parameters that are used for packet +routing and for internal packet generation and packet checking. This +section describes the supported parameters. These features are +primarily used for diagnostics, testing, and performance verification +under the guidance of an Arkville specialist. The nominal use of +Arkville does not require any configuration using these parameters. + +"Pkt_dir" + +The Packet Director controls connectivity between Arkville's internal +hardware components. The features of the Pkt_dir are only used for +diagnostics and testing; it is not intended for nominal use. The full +set of features is not published at this level. + +Format: +Pkt_dir=0x00110F10 + +"Pkt_gen" + +The packet generator parameter takes a file as its argument. The file +contains configuration parameters that are used internally for regression +testing and are not intended to be published at this level. The +packet generator is an internal Arkville hardware component. + +Format: +Pkt_gen=./config/pg.conf + +"Pkt_chkr" + +The packet checker parameter takes a file as its argument. The file +contains configuration parameters that are used internally for regression +testing and are not intended to be published at this level. The +packet checker is an internal Arkville hardware component. + +Format: +Pkt_chkr=./config/pc.conf + + +Data Path Interface +------------------- + +Ingress RX and Egress TX operation is via the nominal DPDK API. +The driver supports single-port, multi-queue for both RX and TX. + +Refer to ``ark_ethdev.h`` for the list of supported methods to +act upon RX and TX Queues. + +Configuration Information +------------------------- + +**DPDK Configuration Parameters** + + The following configuration options are available for the ARK PMD: + + * **CONFIG_RTE_LIBRTE_ARK_PMD** (default y): Enables or disables inclusion + of the ARK PMD driver in the DPDK compilation.
+ + * **CONFIG_RTE_LIBRTE_ARK_PAD_TX** (default y): When enabled, TX + packets are padded to 60 bytes to support downstream MACs. + + * **CONFIG_RTE_LIBRTE_ARK_DEBUG_RX** (default n): Enables or disables debug + logging and internal checking of RX ingress logic within the ARK PMD driver. + + * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TX** (default n): Enables or disables debug + logging and internal checking of TX egress logic within the ARK PMD driver. + + * **CONFIG_RTE_LIBRTE_ARK_DEBUG_STATS** (default n): Enables or disables debug + logging of detailed packet and performance statistics gathered in + the PMD and FPGA. + + * **CONFIG_RTE_LIBRTE_ARK_DEBUG_TRACE** (default n): Enables or disables debug + logging of detailed PMD events and status. + + +Building DPDK +------------- + +See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for +instructions on how to build DPDK. + +By default the ARK PMD library will be built into the DPDK library. + +For configuring and using UIO and VFIO frameworks, please also refer to :ref:`the +documentation that comes with the DPDK suite <linux_gsg>`. + +Supported ARK RTL PCIe Instances +-------------------------------- + +The ARK PMD supports the following Arkville RTL PCIe instances: + +* ``1d6c:100d`` - AR-ARKA-FX0 [Arkville 32B DPDK Data Mover] +* ``1d6c:100e`` - AR-ARKA-FX1 [Arkville 64B DPDK Data Mover] + +Supported Operating Systems +--------------------------- + +Any Linux distribution fulfilling the conditions described in the ``System Requirements`` +section of :ref:`the DPDK documentation <linux_gsg>`; also refer to the *DPDK +Release Notes*. ARM and PowerPC architectures are not supported at this time. + + +Supported Features +------------------ + +* Dynamic ARK PMD extensions +* Multiple receive and transmit queues +* Jumbo frames up to 9K +* Hardware Statistics + +Unsupported Features +-------------------- + +Features that may be part of, or become part of, the Arkville RTL IP that are +not currently supported or exposed by the ARK PMD include: + +* PCIe SR-IOV Virtual Functions (VFs) +* Arkville's Packet Generator Control and Status +* Arkville's Packet Director Control and Status +* Arkville's Packet Checker Control and Status +* Arkville's Timebase Management + +Pre-Requisites +-------------- + +#. Prepare the system as recommended by the DPDK suite. This includes environment + variables, hugepages configuration, tool-chains and configuration. + +#. Insert the igb_uio kernel module using the command 'modprobe igb_uio'. + +#. Bind the intended ARK device to the igb_uio module. + +At this point the system should be ready to run DPDK applications. Once the +application runs to completion, the ARK PMD can be detached from igb_uio if necessary. + +Usage Example +------------- + +Follow instructions available in the document +:ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to launch +**testpmd** with Atomic Rules ARK devices managed by librte_pmd_ark. + +Example output: + +.. code-block:: console + + [...] + EAL: PCI device 0000:01:00.0 on NUMA socket -1 + EAL: probe driver: 1d6c:100e rte_ark_pmd + EAL: PCI memory mapped at 0x7f9b6c400000 + PMD: eth_ark_dev_init(): Initializing 0:2:0.1 + ARKP PMD CommitID: 378f3a67 + Configuring Port 0 (socket 0) + Port 0: DC:3C:F6:00:00:01 + Checking link statuses... + Port 0 Link Up - speed 100000 Mbps - full-duplex + Done + testpmd> diff --git a/doc/guides/nics/avp.rst b/doc/guides/nics/avp.rst new file mode 100644 index 00000000..1fcba66c --- /dev/null +++ b/doc/guides/nics/avp.rst @@ -0,0 +1,111 @@ +..
BSD LICENSE + Copyright(c) 2017 Wind River Systems, Inc. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Intel Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +AVP Poll Mode Driver +==================== + +The Accelerated Virtual Port (AVP) device is a shared memory based device +only available on `virtualization platforms <http://www.windriver.com/products/titanium-cloud/>`_ +from Wind River Systems. The Wind River Systems virtualization platform +currently uses QEMU/KVM as its hypervisor and as such provides support for all +of the QEMU supported virtual and/or emulated devices (e.g., virtio, e1000, +etc.). The platform offers the virtio device type as the default device when +launching a virtual machine or creating a virtual machine port. The AVP device +is a specialized device available to customers that require increased +throughput and decreased latency to meet the demands of their +performance-focused applications. + +The AVP driver binds to any AVP PCI devices that have been exported by the Wind +River Systems QEMU/KVM hypervisor. As a user of the DPDK driver API, it +supports a subset of the full Ethernet device API to enable the application to +use the standard device configuration functions and packet receive/transmit +functions. + +These devices enable optimized packet throughput by bypassing QEMU and +delivering packets directly to the virtual switch via a shared memory +mechanism. This provides DPDK applications running in virtual machines with +significantly improved throughput and latency over other device types. + +The AVP device implementation is integrated with the QEMU/KVM live-migration +mechanism to allow applications to seamlessly migrate from one hypervisor node +to another with minimal packet loss. + + +Features and Limitations of the AVP PMD +--------------------------------------- + +The AVP PMD driver provides the following functionality.
+ +* Receive and transmit of both simple and chained mbuf packets + +* Chained mbufs may include up to 5 chained segments + +* Up to 8 receive and transmit queues per device + +* Only a single MAC address is supported + +* The MAC address cannot be modified + +* The maximum receive packet length is 9238 bytes + +* VLAN header stripping and inserting + +* Promiscuous mode + +* VM live-migration + +* PCI hotplug insertion and removal + + +Prerequisites +------------- + +The following prerequisites apply: + +* A virtual machine running in a Wind River Systems virtualization + environment and configured with at least one neutron port defined with a + vif-model set to "avp". + + +Launching a VM with an AVP type network attachment +-------------------------------------------------- + +The following example will launch a VM with three network attachments. The +first attachment will have a default vif-model of "virtio". The next two +network attachments will have a vif-model of "avp" and may be used with a DPDK +application which is built to include the AVP PMD driver. + +.. code-block:: console + + nova boot --flavor small --image my-image \ + --nic net-id=${NETWORK1_UUID} \ + --nic net-id=${NETWORK2_UUID},vif-model=avp \ + --nic net-id=${NETWORK3_UUID},vif-model=avp \ + --security-group default my-instance1 diff --git a/doc/guides/nics/bnx2x.rst b/doc/guides/nics/bnx2x.rst index 6d1768a5..fbfc048e 100644 --- a/doc/guides/nics/bnx2x.rst +++ b/doc/guides/nics/bnx2x.rst @@ -96,9 +96,11 @@ Config File Options The following options can be modified in the ``.config`` file. Please note that enabling debugging options may affect system performance. -- ``CONFIG_RTE_LIBRTE_BNX2X_PMD`` (default **y**) +- ``CONFIG_RTE_LIBRTE_BNX2X_PMD`` (default **n**) - Toggle compilation of bnx2x driver. + Toggle compilation of the bnx2x driver. To use the bnx2x PMD, set this config parameter + to 'y'. Also, in order for the firmware binary to load, the user will need the zlib devel + package installed. - ``CONFIG_RTE_LIBRTE_BNX2X_DEBUG`` (default **n**) @@ -123,143 +125,14 @@ enabling debugging options may affect system performance. .. _bnx2x_driver-compilation: -Driver Compilation -~~~~~~~~~~~~~~~~~~ - -BNX2X PMD for Linux x86_64 gcc target, run the following "make" -command:: - - cd <DPDK-source-directory> - make config T=x86_64-native-linuxapp-gcc install - -To compile BNX2X PMD for Linux x86_64 clang target, run the following "make" -command:: - - cd <DPDK-source-directory> - make config T=x86_64-native-linuxapp-clang install - -To compile BNX2X PMD for Linux i686 gcc target, run the following "make" -command:: - - cd <DPDK-source-directory> - make config T=i686-native-linuxapp-gcc install - -To compile BNX2X PMD for Linux i686 gcc target, run the following "make" -command: - -.. code-block:: console - - cd <DPDK-source-directory> - make config T=i686-native-linuxapp-gcc install - -To compile BNX2X PMD for FreeBSD x86_64 clang target, run the following "gmake" -command:: - - cd <DPDK-source-directory> - gmake config T=x86_64-native-bsdapp-clang install - -To compile BNX2X PMD for FreeBSD x86_64 gcc target, run the following "gmake" -command:: - - cd <DPDK-source-directory> - gmake config T=x86_64-native-bsdapp-gcc install -Wl,-rpath=/usr/local/lib/gcc49 CC=gcc49 - -To compile BNX2X PMD for FreeBSD x86_64 gcc target, run the following "gmake" -command: - -.. code-block:: console - - cd <DPDK-source-directory> - gmake config T=x86_64-native-bsdapp-gcc install -Wl,-rpath=/usr/local/lib/gcc49 CC=gcc49 - -Linux ------ - -..
_bnx2x_Linux-installation: - -Linux Installation -~~~~~~~~~~~~~~~~~~ - -Sample Application Notes -~~~~~~~~~~~~~~~~~~~~~~~~ - -This section demonstrates how to launch ``testpmd`` with QLogic 578xx -devices managed by ``librte_pmd_bnx2x`` in Linux operating system. - -#. Request huge pages: - - .. code-block:: console - - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages/nr_hugepages - -#. Load ``igb_uio`` or ``vfio-pci`` driver: - - .. code-block:: console - - insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko - - or - - .. code-block:: console - - modprobe vfio-pci - -#. Bind the QLogic adapters to ``igb_uio`` or ``vfio-pci`` loaded in the - previous step:: - - ./tools/dpdk-devbind.py --bind igb_uio 0000:84:00.0 0000:84:00.1 - - or - - Setup VFIO permissions for regular users and then bind to ``vfio-pci``: - - .. code-block:: console - - sudo chmod a+x /dev/vfio - - sudo chmod 0666 /dev/vfio/* - - ./tools/dpdk-devbind.py --bind vfio-pci 0000:84:00.0 0000:84:00.1 - -#. Start ``testpmd`` with basic parameters: - - .. code-block:: console - - ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xf -n 4 -- -i - - Example output: - - .. code-block:: console +Driver compilation and testing +------------------------------ - [...] - EAL: PCI device 0000:84:00.0 on NUMA socket 1 - EAL: probe driver: 14e4:168e rte_bnx2x_pmd - EAL: PCI memory mapped at 0x7f14f6fe5000 - EAL: PCI memory mapped at 0x7f14f67e5000 - EAL: PCI memory mapped at 0x7f15fbd9b000 - EAL: PCI device 0000:84:00.1 on NUMA socket 1 - EAL: probe driver: 14e4:168e rte_bnx2x_pmd - EAL: PCI memory mapped at 0x7f14f5fe5000 - EAL: PCI memory mapped at 0x7f14f57e5000 - EAL: PCI memory mapped at 0x7f15fbd4f000 - Interactive-mode selected - Configuring Port 0 (socket 0) - PMD: bnx2x_dev_tx_queue_setup(): fp[00] req_bd=512, thresh=512, - usable_bd=1020, total_bd=1024, - tx_pages=4 - PMD: bnx2x_dev_rx_queue_setup(): fp[00] req_bd=128, thresh=0, - usable_bd=510, total_bd=512, - rx_pages=1, cq_pages=8 - PMD: bnx2x_print_adapter_info(): - [...] - Checking link statuses... - Port 0 Link Up - speed 10000 Mbps - full-duplex - Port 1 Link Up - speed 10000 Mbps - full-duplex - Done - testpmd> +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. SR-IOV: Prerequisites and sample Application Notes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +-------------------------------------------------- This section provides instructions to configure SR-IOV with Linux OS. @@ -311,7 +184,6 @@ This section provides instructions to configure SR-IOV with Linux OS. echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs - #. Assign VF MAC address: Assign MAC address to the VF using iproute2 utility. The syntax is: @@ -323,9 +195,45 @@ This section provides instructions to configure SR-IOV with Linux OS. ip link set ens5f0 vf 0 mac 52:54:00:2f:9d:e8 - #. PCI Passthrough: The VF devices may be passed through to the guest VM using virt-manager or virsh etc. bnx2x PMD should be used to bind the VF devices in the guest VM using the instructions outlined in the Application notes below. + +#. Running testpmd: + + Follow instructions available in the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` + to run testpmd. + + Example output: + + .. code-block:: console + + [...] 
+ EAL: PCI device 0000:84:00.0 on NUMA socket 1 + EAL: probe driver: 14e4:168e rte_bnx2x_pmd + EAL: PCI memory mapped at 0x7f14f6fe5000 + EAL: PCI memory mapped at 0x7f14f67e5000 + EAL: PCI memory mapped at 0x7f15fbd9b000 + EAL: PCI device 0000:84:00.1 on NUMA socket 1 + EAL: probe driver: 14e4:168e rte_bnx2x_pmd + EAL: PCI memory mapped at 0x7f14f5fe5000 + EAL: PCI memory mapped at 0x7f14f57e5000 + EAL: PCI memory mapped at 0x7f15fbd4f000 + Interactive-mode selected + Configuring Port 0 (socket 0) + PMD: bnx2x_dev_tx_queue_setup(): fp[00] req_bd=512, thresh=512, + usable_bd=1020, total_bd=1024, + tx_pages=4 + PMD: bnx2x_dev_rx_queue_setup(): fp[00] req_bd=128, thresh=0, + usable_bd=510, total_bd=512, + rx_pages=1, cq_pages=8 + PMD: bnx2x_print_adapter_info(): + [...] + Checking link statuses... + Port 0 Link Up - speed 10000 Mbps - full-duplex + Port 1 Link Up - speed 10000 Mbps - full-duplex + Done + testpmd> diff --git a/doc/guides/nics/bnxt.rst b/doc/guides/nics/bnxt.rst index ad33cd5d..9826b350 100644 --- a/doc/guides/nics/bnxt.rst +++ b/doc/guides/nics/bnxt.rst @@ -32,10 +32,10 @@ BNXT Poll Mode Driver The bnxt poll mode library (**librte_pmd_bnxt**) implements support for: - * **Broadcom NetXtreme-C®/NetXtreme-E® BCM5730X and BCM5740X family of + * **Broadcom NetXtreme-C®/NetXtreme-E® BCM5730X and BCM574XX family of Ethernet Network Controllers** - These adapters support Standards compliant 10/25/50Gbps 30MPPS + These adapters support Standards compliant 10/25/50/100Gbps 30MPPS full-duplex throughput. Information about the NetXtreme family of adapters can be found in the diff --git a/doc/guides/nics/build_and_test.rst b/doc/guides/nics/build_and_test.rst new file mode 100644 index 00000000..2d70af88 --- /dev/null +++ b/doc/guides/nics/build_and_test.rst @@ -0,0 +1,179 @@ +.. BSD LICENSE + Copyright(c) 2017 Cavium, Inc. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Cavium, Inc. nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER(S) OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +.. 
_pmd_build_and_test: + +Compiling and testing a PMD for a NIC +===================================== + +This section demonstrates how to compile and run a Poll Mode Driver (PMD) for +the available Network Interface Cards in DPDK using TestPMD. + +TestPMD is one of the reference applications distributed with the DPDK. Its main +purpose is to forward packets between Ethernet ports on a network interface and +as such is the best way to test a PMD. + +Refer to the :ref:`testpmd application user guide <testpmd_ug>` for detailed +information on how to build and run testpmd. + +Driver Compilation +------------------ + +To compile a PMD for a platform, run make with the appropriate target as shown below. +Use the "make" command on Linux and "gmake" on FreeBSD. This will also build testpmd. + +To check available targets: + +.. code-block:: console + + cd <DPDK-source-directory> + make showconfigs + +Example output: + +.. code-block:: console + + arm-armv7a-linuxapp-gcc + arm64-armv8a-linuxapp-gcc + arm64-dpaa2-linuxapp-gcc + arm64-thunderx-linuxapp-gcc + arm64-xgene1-linuxapp-gcc + i686-native-linuxapp-gcc + i686-native-linuxapp-icc + ppc_64-power8-linuxapp-gcc + x86_64-native-bsdapp-clang + x86_64-native-bsdapp-gcc + x86_64-native-linuxapp-clang + x86_64-native-linuxapp-gcc + x86_64-native-linuxapp-icc + x86_x32-native-linuxapp-gcc + +To compile a PMD for the Linux x86_64 gcc target, run the following "make" command: + +.. code-block:: console + + make install T=x86_64-native-linuxapp-gcc + +Use an ARM (ThunderX, DPAA, X-Gene) or PowerPC target for the respective platform. + +For more information, refer to the :ref:`Getting Started Guide for Linux <linux_gsg>` +or :ref:`Getting Started Guide for FreeBSD <freebsd_gsg>` depending on your platform. + +Running testpmd in Linux +------------------------ + +This section demonstrates how to set up and run ``testpmd`` in Linux. + +#. Mount huge pages: + + .. code-block:: console + + mkdir /mnt/huge + mount -t hugetlbfs nodev /mnt/huge + +#. Request huge pages: + + Hugepage memory should be reserved as per application requirement. Check + hugepage size configured in the system and calculate the number of pages + required. + + To reserve 1024 pages of 2MB: + + .. code-block:: console + + echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages + + .. note:: + + Check ``/proc/meminfo`` to find system hugepage size: + + .. code-block:: console + + grep "Hugepagesize:" /proc/meminfo + + Example output: + + .. code-block:: console + + Hugepagesize: 2048 kB + +#. Load ``igb_uio`` or ``vfio-pci`` driver: + + .. code-block:: console + + modprobe uio + insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko + + or + + .. code-block:: console + + modprobe vfio-pci + +#. Set up VFIO permissions for regular users before binding to ``vfio-pci``: + + .. code-block:: console + + sudo chmod a+x /dev/vfio + + sudo chmod 0666 /dev/vfio/* + +#. Bind the adapters to ``igb_uio`` or ``vfio-pci`` loaded in the previous step: + + .. code-block:: console + + ./usertools/dpdk-devbind.py --bind igb_uio DEVICE1 DEVICE2 ... + + Or set up VFIO permissions for regular users and then bind to ``vfio-pci``: + + .. code-block:: console + + ./usertools/dpdk-devbind.py --bind vfio-pci DEVICE1 DEVICE2 ... + + .. note:: + + DEVICE1, DEVICE2 are specified via PCI "domain:bus:slot.func" syntax or + "bus:slot.func" syntax. + +#. Start ``testpmd`` with basic parameters: + + ..
code-block:: console + + ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i + + Successful execution will show initialization messages from the EAL, the PMD and + the testpmd application. A prompt will be displayed at the end for user commands + as interactive mode (``-i``) is on. + + .. code-block:: console + + testpmd> + + Refer to the :ref:`testpmd runtime functions <testpmd_runtime>` for a list + of available commands. diff --git a/doc/guides/nics/cxgbe.rst b/doc/guides/nics/cxgbe.rst index d8236b08..a205b43f 100644 --- a/doc/guides/nics/cxgbe.rst +++ b/doc/guides/nics/cxgbe.rst @@ -125,24 +125,11 @@ enabling debugging options may affect system performance. .. _driver-compilation: -Driver Compilation -~~~~~~~~~~~~~~~~~~ - -To compile CXGBE PMD for Linux x86_64 gcc target, run the following "make" -command: - -.. code-block:: console - - cd <DPDK-source-directory> - make config T=x86_64-native-linuxapp-gcc install - -To compile CXGBE PMD for FreeBSD x86_64 clang target, run the following "gmake" -command: - -.. code-block:: console +Driver compilation and testing +------------------------------ - cd <DPDK-source-directory> - gmake config T=x86_64-native-bsdapp-clang install +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. Linux ----- @@ -218,13 +205,6 @@ Running testpmd This section demonstrates how to launch **testpmd** with Chelsio T5 devices managed by librte_pmd_cxgbe in Linux operating system. -#. Change to DPDK source directory where the target has been compiled in - section :ref:`driver-compilation`: - - .. code-block:: console - - cd <DPDK-source-directory> - #. Load the kernel module: .. code-block:: console @@ -255,60 +235,16 @@ devices managed by librte_pmd_cxgbe in Linux operating system. modprobe -ar cxgb4 csiostor -#. Request huge pages: - - .. code-block:: console - - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages/nr_hugepages - -#. Mount huge pages: - - .. code-block:: console - - mkdir /mnt/huge - mount -t hugetlbfs nodev /mnt/huge - -#. Load igb_uio or vfio-pci driver: - - .. code-block:: console - - insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko - - or - - .. code-block:: console - - modprobe vfio-pci - -#. Bind the Chelsio T5 adapters to igb_uio or vfio-pci loaded in the previous - step: +#. Running testpmd - .. code-block:: console - - ./tools/dpdk-devbind.py --bind igb_uio 0000:02:00.4 - - or - - Setup VFIO permissions for regular users and then bind to vfio-pci: - - .. code-block:: console - - sudo chmod a+x /dev/vfio - - sudo chmod 0666 /dev/vfio/* - - ./tools/dpdk-devbind.py --bind vfio-pci 0000:02:00.4 + Follow instructions available in the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` + to run testpmd. .. note:: Currently, CXGBE PMD only supports the binding of PF4 for Chelsio T5 NICs (see the binding sketch at the end of this section). -#. Start testpmd with basic parameters: - - .. code-block:: console - - ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xf -n 4 -w 0000:02:00.4 -- -i - Example output: .. code-block:: console @@ -334,10 +270,10 @@ devices managed by librte_pmd_cxgbe in Linux operating system. Done testpmd> -.. note:: + .. note:: - Flow control pause TX/RX is disabled by default and can be enabled via - testpmd. Refer section :ref:`flow-control` for more details. + Flow control pause TX/RX is disabled by default and can be enabled via + testpmd. Refer to section :ref:`flow-control` for more details.
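As a concrete illustration of the PF4-only binding noted above, a minimal sketch using the common binding tool (the PCI address 0000:02:00.4 is an example from this guide; substitute the address lspci reports for function 4 of your adapter):

.. code-block:: console

   # bind only the PF4 physical function of the Chelsio T5 adapter
   ./usertools/dpdk-devbind.py --bind igb_uio 0000:02:00.4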
FreeBSD ------- @@ -509,7 +445,7 @@ devices managed by librte_pmd_cxgbe in FreeBSD operating system. .. code-block:: console - ./x86_64-native-bsdapp-clang/app/testpmd -c 0xf -n 4 -w 0000:02:00.4 -- -i + ./x86_64-native-bsdapp-clang/app/testpmd -l 0-3 -n 4 -w 0000:02:00.4 -- -i Example output: diff --git a/doc/guides/nics/dpaa2.rst b/doc/guides/nics/dpaa2.rst new file mode 100644 index 00000000..1ca27d45 --- /dev/null +++ b/doc/guides/nics/dpaa2.rst @@ -0,0 +1,594 @@ +.. BSD LICENSE + Copyright (C) NXP. 2016. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of NXP nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +DPAA2 Poll Mode Driver +====================== + +The DPAA2 NIC PMD (**librte_pmd_dpaa2**) provides poll mode driver +support for the inbuilt NIC found in the **NXP DPAA2** SoC family. + +More information can be found at `NXP Official Website +<http://www.nxp.com/products/microcontrollers-and-processors/arm-processors/qoriq-arm-processors:QORIQ-ARM>`_. + +NXP DPAA2 (Data Path Acceleration Architecture Gen2) +---------------------------------------------------- + +This section provides an overview of the NXP DPAA2 architecture +and how it is integrated into the DPDK. + +Contents summary + +- DPAA2 overview +- Overview of DPAA2 objects +- DPAA2 driver architecture overview + +.. _dpaa2_overview: + +DPAA2 Overview +~~~~~~~~~~~~~~ + +Reference: `FSL MC BUS in Linux Kernel <https://www.kernel.org/doc/readme/drivers-staging-fsl-mc-README.txt>`_. + +DPAA2 is a hardware architecture designed for high-speed network +packet processing. DPAA2 consists of sophisticated mechanisms for +processing Ethernet packets, queue management, buffer management, +autonomous L2 switching, virtual Ethernet bridging, and accelerator +(e.g. crypto) sharing. + +A DPAA2 hardware component called the Management Complex (or MC) manages the +DPAA2 hardware resources. The MC provides an object-based abstraction for +software drivers to use the DPAA2 hardware. + +The MC uses DPAA2 hardware resources such as queues, buffer pools, and +network ports to create functional objects/devices such as network +interfaces, an L2 switch, or accelerator instances. 
+ +The MC provides memory-mapped I/O command interfaces (MC portals) +which DPAA2 software drivers use to operate on DPAA2 objects. + +The diagram below shows an overview of the DPAA2 resource management +architecture: + +.. code-block:: console + + +--------------------------------------+ + | OS | + | DPAA2 drivers | + | | | + +-----------------------------|--------+ + | + | (create,discover,connect + | config,use,destroy) + | + DPAA2 | + +------------------------| mc portal |-+ + | | | + | +- - - - - - - - - - - - -V- - -+ | + | | | | + | | Management Complex (MC) | | + | | | | + | +- - - - - - - - - - - - - - - -+ | + | | + | Hardware Hardware | + | Resources Objects | + | --------- ------- | + | -queues -DPRC | + | -buffer pools -DPMCP | + | -Eth MACs/ports -DPIO | + | -network interface -DPNI | + | profiles -DPMAC | + | -queue portals -DPBP | + | -MC portals ... | + | ... | + | | + +--------------------------------------+ + +The MC mediates operations such as create, discover, +connect, configuration, and destroy. Fast-path operations +on data, such as packet transmit/receive, are not mediated by +the MC and are done directly using memory mapped regions in +DPIO objects. + +Overview of DPAA2 Objects +~~~~~~~~~~~~~~~~~~~~~~~~~ + +This section provides a brief overview of some key DPAA2 objects. +A simple scenario is described illustrating the objects involved +in creating a network interface. + +DPRC (Datapath Resource Container) + + A DPRC is a container object that holds all the other + types of DPAA2 objects. In the example diagram below there + are 8 objects of 5 types (DPMCP, DPIO, DPBP, DPNI, and DPMAC) + in the container. + +.. code-block:: console + + +---------------------------------------------------------+ + | DPRC | + | | + | +-------+ +-------+ +-------+ +-------+ +-------+ | + | | DPMCP | | DPIO | | DPBP | | DPNI | | DPMAC | | + | +-------+ +-------+ +-------+ +---+---+ +---+---+ | + | | DPMCP | | DPIO | | + | +-------+ +-------+ | + | | DPMCP | | + | +-------+ | + | | + +---------------------------------------------------------+ + +From the point of view of an OS, a DPRC behaves similarly to a plug-and-play +bus, like PCI. DPRC commands can be used to enumerate the contents +of the DPRC and discover the hardware objects present (including mappable +regions and interrupts). + +.. code-block:: console + + DPRC.1 (bus) + | + +--+--------+-------+-------+-------+ + | | | | | + DPMCP.1 DPIO.1 DPBP.1 DPNI.1 DPMAC.1 + DPMCP.2 DPIO.2 + DPMCP.3 + +Hardware objects can be created and destroyed dynamically, providing +the ability to hot plug/unplug objects in and out of the DPRC. + +A DPRC has a mappable MMIO region (an MC portal) that can be used +to send MC commands. It has an interrupt for status events (like +hotplug). + +All objects in a container share the same hardware "isolation context". +This means that with respect to an IOMMU the isolation granularity +is at the DPRC (container) level, not at the individual object +level. + +DPRCs can be defined statically and populated with objects +via a config file passed to the MC when firmware starts +it. There is also a Linux user space tool called "restool" +that can be used to create/destroy containers and objects +dynamically. + +DPAA2 Objects for an Ethernet Network Interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +A typical Ethernet NIC is monolithic-- the NIC device contains TX/RX +queuing mechanisms, configuration mechanisms, buffer management, +physical ports, and interrupts.
DPAA2 uses a more granular approach +utilizing multiple hardware objects. Each object provides specialized +functions. Groups of these objects are used by software to provide +Ethernet network interface functionality. This approach provides +efficient use of finite hardware resources, flexibility, and +performance advantages. + +The diagram below shows the objects needed for a simple +network interface configuration on a system with 2 CPUs. + +.. code-block:: console + + +---+---+ +---+---+ + CPU0 CPU1 + +---+---+ +---+---+ + | | + +---+---+ +---+---+ + DPIO DPIO + +---+---+ +---+---+ + \ / + \ / + \ / + +---+---+ + DPNI --- DPBP,DPMCP + +---+---+ + | + | + +---+---+ + DPMAC + +---+---+ + | + port/PHY + +The objects are described below. For each object a brief description +is provided along with a summary of the kinds of operations the object +supports and a summary of key resources of the object (MMIO regions +and IRQs). + +DPMAC (Datapath Ethernet MAC): represents an Ethernet MAC, a +hardware device that connects to an Ethernet PHY and allows +physical transmission and reception of Ethernet frames. + +- MMIO regions: none +- IRQs: DPNI link change +- commands: set link up/down, link config, get stats, IRQ config, enable, reset + +DPNI (Datapath Network Interface): contains TX/RX queues, +network interface configuration, and RX buffer pool configuration +mechanisms. The TX/RX queues are in memory and are identified by +queue number. + +- MMIO regions: none +- IRQs: link state +- commands: port config, offload config, queue config, parse/classify config, IRQ config, enable, reset + +DPIO (Datapath I/O): provides interfaces to enqueue and dequeue +packets and do hardware buffer pool management operations. The DPAA2 +architecture separates the mechanism to access queues (the DPIO object) +from the queues themselves. The DPIO provides an MMIO interface to +enqueue/dequeue packets. To enqueue something, a descriptor is written +to the DPIO MMIO region, which includes the target queue number. +There will typically be one DPIO assigned to each CPU. This allows all +CPUs to simultaneously perform enqueue/dequeue operations. DPIOs are +expected to be shared by different DPAA2 drivers. + +- MMIO regions: queue operations, buffer management +- IRQs: data availability, congestion notification, buffer pool depletion +- commands: IRQ config, enable, reset + +DPBP (Datapath Buffer Pool): represents a hardware buffer +pool. + +- MMIO regions: none +- IRQs: none +- commands: enable, reset + +DPMCP (Datapath MC Portal): provides an MC command portal. +Used by drivers to send commands to the MC to manage +objects. + +- MMIO regions: MC command portal +- IRQs: command completion +- commands: IRQ config, enable, reset + +Object Connections +~~~~~~~~~~~~~~~~~~ + +Some objects have explicit relationships that must +be configured: + +- DPNI <--> DPMAC +- DPNI <--> DPNI +- DPNI <--> L2-switch-port + +A DPNI must be connected to something such as a DPMAC, +another DPNI, or L2 switch port. The DPNI connection +is made via a DPRC command. + +.. code-block:: console + + +-------+ +-------+ + | DPNI | | DPMAC | + +---+---+ +---+---+ + | | + +==========+ + +- DPNI <--> DPBP + +A network interface requires a 'buffer pool' (DPBP object) which provides +a list of pointers to memory where received Ethernet data is to be copied. +The Ethernet driver configures the DPBPs associated with the network +interface. + +Interrupts +~~~~~~~~~~ + +All interrupts generated by DPAA2 objects are message +interrupts.
At the hardware level, message interrupts +generated by devices will normally have 3 components-- +1) a non-spoofable 'device-id' expressed on the hardware +bus, 2) an address, 3) a data value. + +In the case of DPAA2 devices/objects, all objects in the +same container/DPRC share the same 'device-id'. +For ARM-based SoCs this is the same as the stream ID. + + +DPAA2 DPDK - Poll Mode Driver Overview +-------------------------------------- + +This section provides an overview of the drivers for +DPAA2-- 1) the bus driver and associated "DPAA2 infrastructure" +drivers and 2) functional object drivers (such as Ethernet). + +As described previously, a DPRC is a container that holds the other +types of DPAA2 objects. It is functionally similar to a plug-and-play +bus controller. + +Each object in the DPRC is a Linux "device" and is bound to a driver. +The diagram below shows the dpaa2 drivers involved in a networking +scenario and the objects bound to each driver. + +.. code-block:: console + + + +------------+ + | DPDK DPAA2 | + | PMD | + +------------+ +------------+ + | Ethernet |.......| Mempool | + . . . . . . . . . | (DPNI) | | (DPBP) | + . +---+---+----+ +-----+------+ + . ^ | . + . | |<enqueue, . + . | | dequeue> . + . | | . + . +---+---V----+ . + . . . . . . . . . . .| DPIO driver| . + . . | (DPIO) | . + . . +-----+------+ . + . . | QBMAN | . + . . | Driver | . + +----+------+-------+ +-----+----- | . + | dpaa2 bus | | . + | VFIO fslmc-bus |....................|..................... + | | | + | /bus/fslmc | | + +-------------------+ | + | + ========================== HARDWARE =====|======================= + DPIO + | + DPNI---DPBP + | + DPMAC + | + PHY + =========================================|======================== + + +A brief description of each driver is provided below. + +DPAA2 bus driver +~~~~~~~~~~~~~~~~ + +The DPAA2 bus driver is a rte_bus driver which scans the fsl-mc bus. +Key functions include: + +- Reading the container and setting up the VFIO group +- Scanning and parsing the various MC objects and adding them to + their respective device list. + +Additionally, it also provides the object driver for generic MC objects. + +DPIO driver +~~~~~~~~~~~ + +The DPIO driver is bound to DPIO objects and provides services that allow +other drivers such as the Ethernet driver to enqueue and dequeue data for +their respective objects. +Key services include: + +- Data availability notifications +- Hardware queuing operations (enqueue and dequeue of data) +- Hardware buffer pool management + +To transmit a packet, the Ethernet driver puts data on a queue and +invokes a DPIO API. For receive, the Ethernet driver registers +a data availability notification callback. To dequeue a packet, +a DPIO API is used. + +There is typically one DPIO object per physical CPU for optimum +performance, allowing different CPUs to simultaneously enqueue +and dequeue data. + +The DPIO driver operates on behalf of all DPAA2 drivers +active -- Ethernet, crypto, compression, etc. + +DPBP based Mempool driver +~~~~~~~~~~~~~~~~~~~~~~~~~ + +The DPBP driver is bound to DPBP objects and provides services to +create a hardware-offloaded packet buffer mempool. + +DPAA2 NIC Driver +~~~~~~~~~~~~~~~~ +The Ethernet driver is bound to a DPNI and implements the kernel +interfaces needed to connect the DPAA2 network interface to +the network stack. + +Each DPNI corresponds to a DPDK network interface.
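Since each DPNI surfaces as a DPDK port, it can be helpful to inspect which DPNI objects a container actually holds before launching an application. A hypothetical restool invocation is sketched below (assuming NXP's restool utility is installed and that dprc.2 is the container, as in the testpmd example later in this guide; exact subcommand syntax may vary between restool versions):

.. code-block:: console

   # list the objects held by container dprc.2 (illustrative)
   restool dprc show dprc.2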
+ +Features +^^^^^^^^ + +Features of the DPAA2 PMD are: + +- Multiple queues for TX and RX +- Receive Side Scaling (RSS) +- Packet type information +- Checksum offload +- Promiscuous mode + +Supported DPAA2 SoCs +-------------------- + +- LS2080A/LS2040A +- LS2084A/LS2044A +- LS2088A/LS2048A +- LS1088A/LS1048A + +Prerequisites +------------- + +There are three main pre-requisites for executing the DPAA2 PMD on a DPAA2 +compatible board: + +1. **ARM 64 Tool Chain** + + For example, the `*aarch64* Linaro Toolchain <https://releases.linaro.org/components/toolchain/binaries/4.9-2017.01/aarch64-linux-gnu>`_. + +2. **Linux Kernel** + + It can be obtained from `NXP's Github hosting <https://github.com/qoriq-open-source/linux>`_. + +3. **Root file system** + + Any *aarch64*-supporting filesystem can be used. For example, + Ubuntu 15.10 (Wily) or 16.04 LTS (Xenial) userland which can be obtained + from `here <http://cdimage.ubuntu.com/ubuntu-base/releases/16.04/release/ubuntu-base-16.04.1-base-arm64.tar.gz>`_. + +As an alternative method, the DPAA2 PMD can also be executed using images provided +as part of the SDK from NXP. The SDK includes all the above prerequisites necessary +to bring up a DPAA2 board. + +The following dependencies are not part of DPDK and must be installed +separately: + +- **NXP Linux SDK** + + The NXP Linux software development kit (SDK) includes support for the family + of QorIQ® ARM-Architecture-based system on chip (SoC) processors + and corresponding boards. + + It includes the Linux board support packages (BSPs) for NXP SoCs, + a fully operational tool chain, and kernel and board-specific modules. + + The SDK and related information can be obtained from: `NXP QorIQ SDK <http://www.nxp.com/products/software-and-tools/run-time-software/linux-sdk/linux-sdk-for-qoriq-processors:SDKLINUX>`_. + +- **DPDK Helper Scripts** + + DPAA2-based resources can be configured easily with the help of ready-made scripts + as provided in the DPDK helper repository. + + `DPDK Helper Scripts <https://github.com/qoriq-open-source/dpdk-helper>`_. + +Currently supported by DPDK: + +- NXP SDK **2.0+**. +- MC Firmware version **10.0.0** and higher. +- Supported architectures: **arm64 LE**. + +- Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to set up the basic DPDK environment. + +.. note:: + + Some of the fslmc bus code (MC flib - object library) routines are + dual licensed (BSD & GPLv2). + +Pre-Installation Configuration +------------------------------ + +Config File Options +~~~~~~~~~~~~~~~~~~~ + +The following options can be modified in the ``config`` file. +Please note that enabling debugging options may affect system performance. + +- ``CONFIG_RTE_LIBRTE_FSLMC_BUS`` (default ``n``) + + By default it is enabled only for the defconfig_arm64-dpaa2-* config. + Toggle compilation of the ``librte_bus_fslmc`` driver. + +- ``CONFIG_RTE_LIBRTE_DPAA2_PMD`` (default ``n``) + + By default it is enabled only for the defconfig_arm64-dpaa2-* config. + Toggle compilation of the ``librte_pmd_dpaa2`` driver. + +- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_DRIVER`` (default ``n``) + + Toggle display of generic debugging messages. + +- ``CONFIG_RTE_LIBRTE_DPAA2_USE_PHYS_IOVA`` (default ``y``) + + Toggle to use physical addresses vs. virtual addresses for hardware accelerators. + +- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_INIT`` (default ``n``) + + Toggle display of initialization-related messages.
+ +- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_RX`` (default ``n``) + + Toggle display of receive fast path run-time messages + +- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_TX`` (default ``n``) + + Toggle display of transmit fast path run-time messages + +- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_TX_FREE`` (default ``n``) + + Toggle display of transmit fast path buffer free run-time messages + +Driver compilation and testing +------------------------------ + +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. + +#. Running testpmd: + + Follow instructions available in the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` + to run testpmd. + + Example output: + + .. code-block:: console + + ./arm64-dpaa2-linuxapp-gcc/testpmd -c 0xff -n 1 \ + -- -i --portmask=0x3 --nb-cores=1 --no-flush-rx + + ..... + EAL: Registered [pci] bus. + EAL: Registered [fslmc] bus. + EAL: Detected 8 lcore(s) + EAL: Probing VFIO support... + EAL: VFIO support initialized + ..... + PMD: DPAA2: Processing Container = dprc.2 + EAL: fslmc: DPRC contains = 51 devices + EAL: fslmc: Bus scan completed + ..... + Configuring Port 0 (socket 0) + Port 0: 00:00:00:00:00:01 + Configuring Port 1 (socket 0) + Port 1: 00:00:00:00:00:02 + ..... + Checking link statuses... + Port 0 Link Up - speed 10000 Mbps - full-duplex + Port 1 Link Up - speed 10000 Mbps - full-duplex + Done + testpmd> + +Limitations +----------- + +Platform Requirement +~~~~~~~~~~~~~~~~~~~~ +DPAA2 drivers for DPDK can only work on NXP SoCs as listed in the +``Supported DPAA2 SoCs`` section above. + +Maximum packet length +~~~~~~~~~~~~~~~~~~~~~ + +The DPAA2 SoC family supports a maximum jumbo frame size of 10240 bytes. The value +is fixed and cannot be changed. So, even when the ``rxmode.max_rx_pkt_len`` +member of ``struct rte_eth_conf`` is set to a value lower than 10240, frames +up to 10240 bytes can still reach the host interface. diff --git a/doc/guides/nics/ena.rst b/doc/guides/nics/ena.rst index 073b35ae..d19912e9 100644 --- a/doc/guides/nics/ena.rst +++ b/doc/guides/nics/ena.rst @@ -200,52 +200,23 @@ application runs to completion, the ENA can be detached from igb_uio if necessar Usage example ------------- -This section demonstrates how to launch **testpmd** with Amazon ENA -devices managed by librte_pmd_ena. - -#. Load the kernel modules: - - .. code-block:: console - - modprobe uio - insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko - - .. note:: - - Currently Amazon ENA PMD driver depends on igb_uio user space I/O kernel module - -#. Mount and request huge pages: - - .. code-block:: console - - mount -t hugetlbfs nodev /mnt/hugepages - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - -#. Bind UIO driver to ENA device (using provided by DPDK binding tool): - - .. code-block:: console - - ./tools/dpdk-devbind.py --bind=igb_uio 0000:02:00.1 - -#. Start testpmd with basic parameters: - - .. code-block:: console - - ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xf -n 4 -- -i - - Example output: - - .. code-block:: console - - [...] - EAL: PCI device 0000:02:00.1 on NUMA socket -1 - EAL: probe driver: 1d0f:ec20 rte_ena_pmd - EAL: PCI memory mapped at 0x7f9b6c400000 - PMD: eth_ena_dev_init(): Initializing 0:2:0.1 - Interactive-mode selected - Configuring Port 0 (socket 0) - Port 0: 00:00:00:11:00:01 - Checking link statuses...
- Port 0 Link Up - speed 10000 Mbps - full-duplex - Done - testpmd> +Follow instructions available in the document +:ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to launch +**testpmd** with Amazon ENA devices managed by librte_pmd_ena. + +Example output: + +.. code-block:: console + + [...] + EAL: PCI device 0000:02:00.1 on NUMA socket -1 + EAL: probe driver: 1d0f:ec20 rte_ena_pmd + EAL: PCI memory mapped at 0x7f9b6c400000 + PMD: eth_ena_dev_init(): Initializing 0:2:0.1 + Interactive-mode selected + Configuring Port 0 (socket 0) + Port 0: 00:00:00:11:00:01 + Checking link statuses... + Port 0 Link Up - speed 10000 Mbps - full-duplex + Done + testpmd> diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst index c535b589..89a30158 100644 --- a/doc/guides/nics/enic.rst +++ b/doc/guides/nics/enic.rst @@ -1,5 +1,5 @@ .. BSD LICENSE - Copyright (c) 2015, Cisco Systems, Inc. + Copyright (c) 2017, Cisco Systems, Inc. All rights reserved. Redistribution and use in source and binary forms, with or without @@ -71,9 +71,9 @@ Configuration information - The number of RQs configured in the vNIC should be greater or equal to *twice* the value of the expected nb_rx_q parameter in - the call to rte_eth_dev_configure(). With the addition of rx + the call to rte_eth_dev_configure(). With the addition of Rx scatter, a pair of RQs on the vnic is needed for each receive - queue used by DPDK, even if rx scatter is not being used. + queue used by DPDK, even if Rx scatter is not being used. Having a vNIC with only 1 RQ is not a valid configuration, and will fail with an error message. @@ -99,7 +99,7 @@ Configuration information gives the application the greatest amount of flexibility in its queue configuration. - - *Note*: Since the introduction of rx scatter, for performance + - *Note*: Since the introduction of Rx scatter, for performance reasons, this PMD uses two RQs on the vNIC per receive queue in DPDK. One RQ holds descriptors for the start of a packet the second RQ holds the descriptors for the rest of the fragments of @@ -135,11 +135,86 @@ of the server. With advanced filters, perfect matching of all fields of IPv4, IPv6 headers as well as TCP, UDP and SCTP L4 headers is available through flow director. -Masking of these feilds for partial match is also supported. +Masking of these fields for partial match is also supported. Without advanced filter support, the flow director is limited to IPv4 perfect filtering of the 5-tuple with no masking of fields supported. +SR-IOV mode utilization +----------------------- + +UCS blade servers configured with dynamic vNIC connection policies in UCS +manager are capable of supporting assigned devices on virtual machines (VMs) +through a KVM hypervisor. Assigned devices, also known as 'passthrough' +devices, are SR-IOV virtual functions (VFs) on the host which are exposed +to VM instances. + +The Cisco Virtual Machine Fabric Extender (VM-FEX) gives the VM a dedicated +interface on the Fabric Interconnect (FI). Layer 2 switching is done at +the FI. This may eliminate the requirement for software switching on the +host to route intra-host VM traffic. + +Please refer to `Creating a Dynamic vNIC Connection Policy +<http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/vm_fex/vmware/gui/config_guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide_chapter_010.html#task_433E01651F69464783A68E66DA8A47A5>`_ +for information on configuring SR-IOV Adapter policies using UCS manager. 
+ +Once the policies are in place and the host OS is rebooted, VFs should be +visible on the host, e.g.: + +.. code-block:: console + + # lspci | grep Cisco | grep Ethernet + 0d:00.0 Ethernet controller: Cisco Systems Inc VIC Ethernet NIC (rev a2) + 0d:00.1 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + 0d:00.2 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + 0d:00.3 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + 0d:00.4 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + 0d:00.5 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + 0d:00.6 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + 0d:00.7 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + +Enable Intel IOMMU on the host and install KVM and libvirt. A VM instance should +be created with an assigned device. When using libvirt, this configuration can +be done within the domain (i.e. VM) config file. For example, this entry maps +host VF 0d:00.1 into the VM. + +.. code-block:: console + + <interface type='hostdev' managed='yes'> + <mac address='52:54:00:ac:ff:b6'/> + <source> + <address type='pci' domain='0x0000' bus='0x0d' slot='0x00' function='0x1'/> + </source> + +Alternatively, the configuration can be done in a separate file using the +``network`` keyword. These methods are described in the libvirt documentation for +`Network XML format <https://libvirt.org/formatnetwork.html>`_. + +When the VM instance is started, the ENIC KVM driver will bind the host VF to +vfio, complete provisioning on the FI and bring up the link. + +.. note:: + + It is not possible to use a VF directly from the host because it is not + fully provisioned until the hypervisor brings up the VM that it is assigned + to. + +In the VM instance, the VF will now be visible. E.g., here the VF 00:04.0 is +seen on the VM instance and should be available for binding to DPDK. + +.. code-block:: console + + # lspci | grep Ether + 00:04.0 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2) + +Follow the normal DPDK install procedure, binding the VF to either ``igb_uio`` +or ``vfio`` in non-IOMMU mode; a minimal binding sketch is shown after the +limitations list below. + +Please see :ref:`Limitations <enic_limitations>` for limitations in +the use of SR-IOV. + +.. _enic_limitations: + Limitations ----------- @@ -169,12 +244,31 @@ Limitations - Flow director features are not supported on generation 1 Cisco VIC adapters (M81KR and P81E) -How to build the suite? ------------------------ -The build instructions for the DPDK suite should be followed. By default -the ENIC PMD library will be built into the DPDK library. +- **SR-IOV** + + - KVM hypervisor support only. VMware has not been tested. + - Requires VM-FEX, and so is only available on UCS managed servers connected + to Fabric Interconnects. It is not available on standalone C-Series servers. + - VF devices are not usable directly from the host. They can only be used + as assigned devices on VM instances. + - Currently, unbind of the ENIC kernel mode driver 'enic.ko' on the VM + instance may hang. As a workaround, enic.ko should be blacklisted or removed + from the boot process. + - pci_generic cannot be used as the uio module in the VM. igb_uio or + vfio in non-IOMMU mode can be used. + - The number of RQs in UCSM dynamic vNIC configurations must be at least 2. + - The number of SR-IOV devices is limited to 256. Components on the target system + might limit this number to fewer than 256.
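Tying the SR-IOV notes together, a minimal sketch of preparing the assigned VF inside the VM (the igb_uio module path and the 00:04.0 address are taken from the examples in this guide; adjust both to your system):

.. code-block:: console

   # load igb_uio and bind the VF that lspci showed inside the VM
   modprobe uio
   insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
   ./usertools/dpdk-devbind.py --bind=igb_uio 0000:00:04.0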
+
+How to build the suite
+----------------------
 
-For configuring and using UIO and VFIO frameworks, please refer the
+Refer to the document :ref:`compiling and testing a PMD for a NIC
+<pmd_build_and_test>` for details.
+
+By default the ENIC PMD library will be built into the DPDK library.
+
+For configuring and using UIO and VFIO frameworks, please refer to the
 documentation that comes with the DPDK suite.
 
 Supported Cisco VIC adapters
@@ -196,11 +290,13 @@ ENIC PMD supports all recent generations of Cisco VIC adapters including:
 
 Supported Operating Systems
 ---------------------------
+
 Any Linux distribution fulfilling the conditions described in Dependencies
 section of DPDK documentation.
 
 Supported features
 ------------------
+
 - Unicast, multicast and broadcast transmission and reception
 - Receive queue polling
 - Port Hardware Statistics
@@ -216,9 +312,11 @@ Supported features
 - IPV4, IPV6 and TCP RSS hashing
 - Scattered Rx
 - MTU update
+- SR-IOV on UCS managed servers connected to Fabric Interconnects.
 
-Known bugs and Unsupported features in this release
+Known bugs and unsupported features in this release
 ---------------------------------------------------
+
 - Signature or flex byte based flow director
 - Drop feature of flow director
 - VLAN based flow director
@@ -229,6 +327,7 @@ Known bugs and Unsupported features in this release
 
 Prerequisites
 -------------
+
 - Prepare the system as recommended by the DPDK suite. This includes environment
   variables, hugepages configuration, tool-chains and configuration
 - Insert vfio-pci kernel module using the command 'modprobe vfio-pci' if the
@@ -238,9 +337,8 @@ Prerequisites
 - DPDK suite should be configured based on the user's decision to use VFIO or
   UIO framework
 - If the vNIC device(s) to be used is bound to the kernel mode Ethernet driver,
-  (enic), use 'ifconfig' to bring the interface down. The dpdk-devbind.py tool
-  can then be used to unbind the device's bus id from the enic kernel mode
-  driver.
+  use 'ifconfig' to bring the interface down. The dpdk-devbind.py tool can
+  then be used to unbind the device's bus id from the ENIC kernel mode driver.
 - Bind the intended vNIC to vfio-pci in case the user wants ENIC PMD to use
   VFIO framework using dpdk-devbind.py.
 - Bind the intended vNIC to igb_uio in case the user wants ENIC PMD to use
@@ -271,10 +369,12 @@ libraries and the initialization time of the application.
 
 Additional Reference
 --------------------
+
 - http://www.cisco.com/c/en/us/products/servers-unified-computing
 
 Contact Information
 -------------------
+
 Any questions or bugs should be reported to DPDK community and to the ENIC PMD
 maintainers:
diff --git a/doc/guides/nics/features/ark.ini b/doc/guides/nics/features/ark.ini
new file mode 100644
index 00000000..31a35279
--- /dev/null
+++ b/doc/guides/nics/features/ark.ini
@@ -0,0 +1,14 @@
+;
+; Supported features of the 'ark' poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Queue start/stop = Y
+Jumbo frame = Y
+Scattered Rx = Y
+Basic stats = Y
+Stats per queue = Y
+Linux UIO = Y
+x86-64 = Y
+Usage doc = Y
diff --git a/doc/guides/nics/features/avp.ini b/doc/guides/nics/features/avp.ini
new file mode 100644
index 00000000..ceb69939
--- /dev/null
+++ b/doc/guides/nics/features/avp.ini
@@ -0,0 +1,16 @@
+;
+; Supported features of the 'AVP' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+; +[Features] +Link status = Y +Jumbo frame = Y +Scattered Rx = Y +Promiscuous mode = Y +Unicast MAC filter = Y +VLAN offload = Y +Basic stats = Y +Stats per queue = Y +Linux UIO = Y +x86-64 = Y diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini index f1bf9bf2..cafc6c70 100644 --- a/doc/guides/nics/features/default.ini +++ b/doc/guides/nics/features/default.ini @@ -3,14 +3,17 @@ ; ; This file defines the features that are valid for inclusion in ; the other driver files and also the order that they appear in -; the features table in the documentation. +; the features table in the documentation. The feature description +; string should not exceed feature_str_len defined in conf.py. ; [Features] Speed capabilities = Link status = Link status event = +Removal event = Queue status event = Rx interrupt = +Free Tx mbuf on demand = Queue start/stop = MTU update = Jumbo frame = @@ -36,6 +39,7 @@ Flexible filter = Hash filter = Flow director = Flow control = +Flow API = Rate limitation = Traffic mirroring = CRC offload = @@ -43,13 +47,17 @@ VLAN offload = QinQ offload = L3 checksum offload = L4 checksum offload = +MACsec offload = Inner L3 checksum = Inner L4 checksum = Packet type parsing = Timesync = +Rx descriptor status = +Tx descriptor status = Basic stats = Extended stats = Stats per queue = +FW version = EEPROM dump = Registers dump = Multiprocess aware = @@ -60,7 +68,6 @@ Other kdrv = ARMv7 = ARMv8 = Power8 = -TILE-Gx = x86-32 = x86-64 = Usage doc = diff --git a/doc/guides/nics/features/dpaa2.ini b/doc/guides/nics/features/dpaa2.ini new file mode 100644 index 00000000..d43f4046 --- /dev/null +++ b/doc/guides/nics/features/dpaa2.ini @@ -0,0 +1,18 @@ +; +; Supported features of the 'dpaa2' network poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. 
+; +[Features] +Link status = Y +Queue start/stop = Y +MTU update = Y +Promiscuous mode = Y +RSS hash = Y +L3 checksum offload = Y +L4 checksum offload = Y +Packet type parsing = Y +Basic stats = Y +Linux VFIO = Y +ARMv8 = Y +Usage doc = Y diff --git a/doc/guides/nics/features/e1000.ini b/doc/guides/nics/features/e1000.ini index 7f6d55c4..260d46da 100644 --- a/doc/guides/nics/features/e1000.ini +++ b/doc/guides/nics/features/e1000.ini @@ -7,6 +7,7 @@ Link status = Y Link status event = Y Rx interrupt = Y +Free Tx mbuf on demand = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y @@ -20,6 +21,8 @@ VLAN offload = Y QinQ offload = Y L3 checksum offload = Y L4 checksum offload = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y BSD nic_uio = Y Linux UIO = Y diff --git a/doc/guides/nics/features/enic.ini b/doc/guides/nics/features/enic.ini index 86576a75..94e7f3cb 100644 --- a/doc/guides/nics/features/enic.ini +++ b/doc/guides/nics/features/enic.ini @@ -10,10 +10,12 @@ Queue start/stop = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y +TSO = Y Promiscuous mode = Y Unicast MAC filter = Y Multicast MAC filter = Y RSS hash = Y +SR-IOV = Y VLAN filter = Y CRC offload = Y VLAN offload = Y diff --git a/doc/guides/nics/features/i40e.ini b/doc/guides/nics/features/i40e.ini index 0d143bca..ecabce0b 100644 --- a/doc/guides/nics/features/i40e.ini +++ b/doc/guides/nics/features/i40e.ini @@ -27,6 +27,7 @@ Tunnel filter = Y Hash filter = Y Flow director = Y Flow control = Y +Flow API = Y Traffic mirroring = Y CRC offload = Y VLAN offload = Y @@ -37,8 +38,11 @@ Inner L3 checksum = Y Inner L4 checksum = Y Packet type parsing = Y Timesync = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y +FW version = Y Multiprocess aware = Y BSD nic_uio = Y Linux UIO = Y @@ -46,3 +50,4 @@ Linux VFIO = Y x86-32 = Y x86-64 = Y ARMv8 = Y +Power8 = Y diff --git a/doc/guides/nics/features/i40e_vec.ini b/doc/guides/nics/features/i40e_vec.ini index edd6b717..206f348b 100644 --- a/doc/guides/nics/features/i40e_vec.ini +++ b/doc/guides/nics/features/i40e_vec.ini @@ -29,6 +29,8 @@ Flow director = Y Flow control = Y Traffic mirroring = Y Timesync = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Multiprocess aware = Y @@ -38,3 +40,4 @@ Linux VFIO = Y x86-32 = Y x86-64 = Y ARMv8 = Y +Power8 = Y diff --git a/doc/guides/nics/features/i40e_vf.ini b/doc/guides/nics/features/i40e_vf.ini index 2f82c6b9..46e0d9fc 100644 --- a/doc/guides/nics/features/i40e_vf.ini +++ b/doc/guides/nics/features/i40e_vf.ini @@ -26,6 +26,8 @@ L4 checksum offload = Y Inner L3 checksum = Y Inner L4 checksum = Y Packet type parsing = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Multiprocess aware = Y diff --git a/doc/guides/nics/features/i40e_vf_vec.ini b/doc/guides/nics/features/i40e_vf_vec.ini index d6674f76..c2c6c19f 100644 --- a/doc/guides/nics/features/i40e_vf_vec.ini +++ b/doc/guides/nics/features/i40e_vf_vec.ini @@ -18,6 +18,8 @@ RSS key update = Y RSS reta update = Y VLAN filter = Y Hash filter = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Multiprocess aware = Y diff --git a/doc/guides/nics/features/igb.ini b/doc/guides/nics/features/igb.ini index 9fafe72d..11450270 100644 --- a/doc/guides/nics/features/igb.ini +++ b/doc/guides/nics/features/igb.ini @@ -33,8 +33,11 @@ L3 checksum offload = Y L4 checksum offload = Y Packet type parsing = Y Timesync = Y +Rx descriptor 
status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y +FW version = Y EEPROM dump = Y Registers dump = Y BSD nic_uio = Y diff --git a/doc/guides/nics/features/igb_vf.ini b/doc/guides/nics/features/igb_vf.ini index b6178202..e641a2c9 100644 --- a/doc/guides/nics/features/igb_vf.ini +++ b/doc/guides/nics/features/igb_vf.ini @@ -17,6 +17,8 @@ QinQ offload = Y L3 checksum offload = Y L4 checksum offload = Y Packet type parsing = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Registers dump = Y diff --git a/doc/guides/nics/features/ixgbe.ini b/doc/guides/nics/features/ixgbe.ini index 4a5667f0..4aa7af6d 100644 --- a/doc/guides/nics/features/ixgbe.ini +++ b/doc/guides/nics/features/ixgbe.ini @@ -29,6 +29,7 @@ SYN filter = Y Tunnel filter = Y Flow director = Y Flow control = Y +Flow API = Y Rate limitation = Y Traffic mirroring = Y CRC offload = Y @@ -36,13 +37,17 @@ VLAN offload = Y QinQ offload = Y L3 checksum offload = Y L4 checksum offload = Y +MACsec offload = Y Inner L3 checksum = Y Inner L4 checksum = Y Packet type parsing = Y Timesync = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Stats per queue = Y +FW version = Y EEPROM dump = Y Registers dump = Y Multiprocess aware = Y diff --git a/doc/guides/nics/features/ixgbe_vec.ini b/doc/guides/nics/features/ixgbe_vec.ini index e1773dd6..4da81182 100644 --- a/doc/guides/nics/features/ixgbe_vec.ini +++ b/doc/guides/nics/features/ixgbe_vec.ini @@ -32,6 +32,8 @@ Flow control = Y Rate limitation = Y Traffic mirroring = Y Timesync = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Stats per queue = Y diff --git a/doc/guides/nics/features/ixgbe_vf.ini b/doc/guides/nics/features/ixgbe_vf.ini index bf28215d..b63e32ce 100644 --- a/doc/guides/nics/features/ixgbe_vf.ini +++ b/doc/guides/nics/features/ixgbe_vf.ini @@ -25,6 +25,8 @@ L4 checksum offload = Y Inner L3 checksum = Y Inner L4 checksum = Y Packet type parsing = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Registers dump = Y diff --git a/doc/guides/nics/features/ixgbe_vf_vec.ini b/doc/guides/nics/features/ixgbe_vf_vec.ini index 8b8c90ba..c994857e 100644 --- a/doc/guides/nics/features/ixgbe_vf_vec.ini +++ b/doc/guides/nics/features/ixgbe_vf_vec.ini @@ -17,6 +17,8 @@ RSS hash = Y RSS key update = Y RSS reta update = Y VLAN filter = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Registers dump = Y diff --git a/doc/guides/nics/features/mpipe.ini b/doc/guides/nics/features/kni.ini index ca609331..6deb66ae 100644 --- a/doc/guides/nics/features/mpipe.ini +++ b/doc/guides/nics/features/kni.ini @@ -1,6 +1,7 @@ ; -; Supported features of the 'mpipe' network poll mode driver. +; Supported features of the 'kni' network poll mode driver. ; ; Refer to default.ini for the full list of available PMD features. ; [Features] +Usage doc = Y diff --git a/doc/guides/nics/features/liquidio.ini b/doc/guides/nics/features/liquidio.ini new file mode 100644 index 00000000..49cc3566 --- /dev/null +++ b/doc/guides/nics/features/liquidio.ini @@ -0,0 +1,28 @@ +; +; Supported features of the 'LiquidIO' network poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. 
+; +[Features] +Link status = Y +Link status event = Y +Jumbo frame = Y +Scattered Rx = Y +Allmulticast mode = Y +RSS hash = Y +RSS key update = Y +RSS reta update = Y +VLAN filter = Y +CRC offload = Y +VLAN offload = P +L3 checksum offload = Y +L4 checksum offload = Y +Inner L3 checksum = Y +Inner L4 checksum = Y +Basic stats = Y +Extended stats = Y +Multiprocess aware = Y +Linux UIO = Y +Linux VFIO = Y +x86-64 = Y +Usage doc = Y diff --git a/doc/guides/nics/features/mlx4.ini b/doc/guides/nics/features/mlx4.ini index c9828f71..285f0ecf 100644 --- a/doc/guides/nics/features/mlx4.ini +++ b/doc/guides/nics/features/mlx4.ini @@ -6,6 +6,7 @@ [Features] Link status = Y Link status event = Y +Removal event = Y Queue start/stop = Y MTU update = Y Jumbo frame = Y diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini index f811e3fb..e228c412 100644 --- a/doc/guides/nics/features/mlx5.ini +++ b/doc/guides/nics/features/mlx5.ini @@ -7,10 +7,12 @@ Speed capabilities = Y Link status = Y Link status event = Y +Rx interrupt = Y Queue start/stop = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y +TSO = Y Promiscuous mode = Y Allmulticast mode = Y Unicast MAC filter = Y @@ -21,11 +23,16 @@ RSS reta update = Y SR-IOV = Y VLAN filter = Y Flow director = Y +Flow API = Y CRC offload = Y VLAN offload = Y L3 checksum offload = Y L4 checksum offload = Y +Inner L3 checksum = Y +Inner L4 checksum = Y Packet type parsing = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Stats per queue = Y Multiprocess aware = Y diff --git a/doc/guides/nics/features/nfp.ini b/doc/guides/nics/features/nfp.ini index d9671512..a1281d2a 100644 --- a/doc/guides/nics/features/nfp.ini +++ b/doc/guides/nics/features/nfp.ini @@ -4,3 +4,26 @@ ; Refer to default.ini for the full list of available PMD features. 
; [Features] +Speed capabilities = Y +Link status = Y +Link status event = Y +Rx interrupt = Y +Queue start/stop = Y +MTU update = Y +Jumbo frame = Y +Promiscuous mode = Y +TSO = Y +RSS hash = Y +RSS key update = Y +RSS reta update = Y +SR-IOV = Y +Flow control = Y +VLAN offload = Y +L3 checksum offload = Y +L4 checksum offload = Y +Basic stats = Y +Stats per queue = Y +Linux UIO = Y +Linux VFIO = Y +x86-64 = Y +Usage doc = Y diff --git a/doc/guides/nics/features/pcap.ini b/doc/guides/nics/features/pcap.ini index 8245cbfb..28e64880 100644 --- a/doc/guides/nics/features/pcap.ini +++ b/doc/guides/nics/features/pcap.ini @@ -10,7 +10,6 @@ Multiprocess aware = Y ARMv7 = Y ARMv8 = Y Power8 = Y -TILE-Gx = Y x86-32 = Y x86-64 = Y Usage doc = Y diff --git a/doc/guides/nics/features/qede.ini b/doc/guides/nics/features/qede.ini index 7d75030a..fba5dc33 100644 --- a/doc/guides/nics/features/qede.ini +++ b/doc/guides/nics/features/qede.ini @@ -23,6 +23,9 @@ CRC offload = Y VLAN offload = Y L3 checksum offload = Y L4 checksum offload = Y +Tunnel filter = Y +Inner L3 checksum = Y +Inner L4 checksum = Y Packet type parsing = Y Basic stats = Y Extended stats = Y @@ -31,3 +34,7 @@ Multiprocess aware = Y Linux UIO = Y x86-64 = Y Usage doc = Y +N-tuple filter = Y +Flow director = Y +LRO = Y +TSO = Y diff --git a/doc/guides/nics/features/qede_vf.ini b/doc/guides/nics/features/qede_vf.ini index acb1b991..21ec40fa 100644 --- a/doc/guides/nics/features/qede_vf.ini +++ b/doc/guides/nics/features/qede_vf.ini @@ -31,4 +31,6 @@ Stats per queue = Y Multiprocess aware = Y Linux UIO = Y x86-64 = Y +LRO = Y +TSO = Y Usage doc = Y diff --git a/doc/guides/nics/features/sfc_efx.ini b/doc/guides/nics/features/sfc_efx.ini new file mode 100644 index 00000000..7957b5e9 --- /dev/null +++ b/doc/guides/nics/features/sfc_efx.ini @@ -0,0 +1,34 @@ +; +; Supported features of the 'sfc_efx' network poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +Speed capabilities = Y +Link status = Y +Link status event = Y +Queue start/stop = Y +MTU update = Y +Jumbo frame = Y +Scattered Rx = Y +TSO = Y +Promiscuous mode = Y +Allmulticast mode = Y +Multicast MAC filter = Y +RSS hash = Y +RSS key update = Y +RSS reta update = Y +SR-IOV = Y +Flow control = Y +Flow API = Y +VLAN offload = P +L3 checksum offload = Y +L4 checksum offload = Y +Packet type parsing = Y +Basic stats = Y +Extended stats = Y +FW version = Y +BSD nic_uio = Y +Linux UIO = Y +Linux VFIO = Y +x86-64 = Y diff --git a/doc/guides/nics/features/tap.ini b/doc/guides/nics/features/tap.ini new file mode 100644 index 00000000..3efae758 --- /dev/null +++ b/doc/guides/nics/features/tap.ini @@ -0,0 +1,26 @@ +; +; Supported features of the 'tap' driver. +; +; Refer to default.ini for the full list of available PMD features. 
+; +[Features] +Link status = Y +Link status event = Y +Jumbo frame = Y +Promiscuous mode = Y +Allmulticast mode = Y +Basic stats = Y +Flow API = Y +MTU update = Y +Multicast MAC filter = Y +Speed capabilities = Y +Unicast MAC filter = Y +Packet type parsing = Y +Flow control = Y +Other kdrv = Y +ARMv7 = Y +ARMv8 = Y +Power8 = Y +x86-32 = Y +x86-64 = Y +Usage doc = Y diff --git a/doc/guides/nics/features/vhost.ini b/doc/guides/nics/features/vhost.ini index 23166fba..dffd1f49 100644 --- a/doc/guides/nics/features/vhost.ini +++ b/doc/guides/nics/features/vhost.ini @@ -6,6 +6,7 @@ [Features] Link status = Y Link status event = Y +Free Tx mbuf on demand = Y Queue status event = Y Basic stats = Y Extended stats = Y diff --git a/doc/guides/nics/features/virtio.ini b/doc/guides/nics/features/virtio.ini index 1d996c65..8e3aca1d 100644 --- a/doc/guides/nics/features/virtio.ini +++ b/doc/guides/nics/features/virtio.ini @@ -5,6 +5,7 @@ ; [Features] Link status = Y +Rx interrupt = Y Queue start/stop = Y Scattered Rx = Y Promiscuous mode = Y @@ -14,6 +15,7 @@ Multicast MAC filter = Y VLAN filter = Y Basic stats = Y Stats per queue = Y +Extended stats = Y Multiprocess aware = Y BSD nic_uio = Y Linux UIO = Y @@ -23,3 +25,4 @@ ARMv8 = Y x86-32 = Y x86-64 = Y Usage doc = Y +MTU update = Y diff --git a/doc/guides/nics/features/virtio_vec.ini b/doc/guides/nics/features/virtio_vec.ini index 6dc7cf02..ec93f5c4 100644 --- a/doc/guides/nics/features/virtio_vec.ini +++ b/doc/guides/nics/features/virtio_vec.ini @@ -5,6 +5,7 @@ ; [Features] Link status = Y +Rx interrupt = Y Queue start/stop = Y Promiscuous mode = Y Allmulticast mode = Y diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index 5780268f..4d3c7ca0 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -64,6 +64,7 @@ Features of the I40E PMD are: - SR-IOV VF - Hot plug - IEEE1588/802.1AS timestamping +- VF Daemon (VFD) - EXPERIMENTAL Prerequisites @@ -104,11 +105,6 @@ Please note that enabling debugging options may affect system performance. Toggle the use of Vector PMD instead of normal RX/TX path. To enable vPMD for RX, bulk allocation for Rx must be allowed. -- ``CONFIG_RTE_LIBRTE_I40E_RX_OLFLAGS_ENABLE`` (default ``y``) - - Toggle to enable RX ``olflags``. - This is only meaningful when Vector PMD is used. - - ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` (default ``n``) Toggle to use a 16-byte RX descriptor, by default the RX descriptor is 32 byte. @@ -130,82 +126,15 @@ Please note that enabling debugging options may affect system performance. Interrupt Throttling interval. -Driver Compilation -~~~~~~~~~~~~~~~~~~ - -To compile the I40E PMD see :ref:`Getting Started Guide for Linux <linux_gsg>` or -:ref:`Getting Started Guide for FreeBSD <freebsd_gsg>` depending on your platform. - - -Linux ------ - - -Running testpmd -~~~~~~~~~~~~~~~ - -This section demonstrates how to launch ``testpmd`` with Intel XL710/X710 -devices managed by ``librte_pmd_i40e`` in the Linux operating system. - -#. Load ``igb_uio`` or ``vfio-pci`` driver: - - .. code-block:: console - - modprobe uio - insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko - - or - - .. code-block:: console - - modprobe vfio-pci - -#. Bind the XL710/X710 adapters to ``igb_uio`` or ``vfio-pci`` loaded in the previous step: - - .. code-block:: console - - ./tools/dpdk-devbind.py --bind igb_uio 0000:83:00.0 - - Or setup VFIO permissions for regular users and then bind to ``vfio-pci``: - - .. code-block:: console - - ./tools/dpdk-devbind.py --bind vfio-pci 0000:83:00.0 - -#. 
Start ``testpmd`` with basic parameters: - - .. code-block:: console - - ./x86_64-native-linuxapp-gcc/app/testpmd -c 0xf -n 4 -w 83:00.0 -- -i - - Example output: - - .. code-block:: console - - ... - EAL: PCI device 0000:83:00.0 on NUMA socket 1 - EAL: probe driver: 8086:1572 rte_i40e_pmd - EAL: PCI memory mapped at 0x7f7f80000000 - EAL: PCI memory mapped at 0x7f7f80800000 - PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a - Interactive-mode selected - Configuring Port 0 (socket 0) - ... - - PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are - satisfied.Rx Burst Bulk Alloc function will be used on port=0, queue=0. - - ... - Port 0: 68:05:CA:26:85:84 - Checking link statuses... - Port 0 Link Up - speed 10000 Mbps - full-duplex - Done +Driver compilation and testing +------------------------------ - testpmd> +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. SR-IOV: Prerequisites and sample Application Notes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +-------------------------------------------------- #. Load the kernel module: @@ -254,6 +183,37 @@ SR-IOV: Prerequisites and sample Application Notes #. Assign VF to VM, and bring up the VM. Please see the documentation for the *I40E/IXGBE/IGB Virtual Function Driver*. +#. Running testpmd: + + Follow instructions available in the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` + to run testpmd. + + Example output: + + .. code-block:: console + + ... + EAL: PCI device 0000:83:00.0 on NUMA socket 1 + EAL: probe driver: 8086:1572 rte_i40e_pmd + EAL: PCI memory mapped at 0x7f7f80000000 + EAL: PCI memory mapped at 0x7f7f80800000 + PMD: eth_i40e_dev_init(): FW 5.0 API 1.5 NVM 05.00.02 eetrack 8000208a + Interactive-mode selected + Configuring Port 0 (socket 0) + ... + + PMD: i40e_dev_rx_queue_setup(): Rx Burst Bulk Alloc Preconditions are + satisfied.Rx Burst Bulk Alloc function will be used on port=0, queue=0. + + ... + Port 0: 68:05:CA:26:85:84 + Checking link statuses... + Port 0 Link Up - speed 10000 Mbps - full-duplex + Done + + testpmd> + Sample Application Notes ------------------------ @@ -267,7 +227,7 @@ To start ``testpmd``, and add vlan 10 to port 0: .. code-block:: console - ./app/testpmd -c ffff -n 4 -- -i --forward-mode=mac + ./app/testpmd -l 0-15 -n 4 -- -i --forward-mode=mac ... testpmd> set promisc 0 off @@ -302,7 +262,7 @@ Start ``testpmd`` with ``--disable-rss`` and ``--pkt-filter-mode=perfect``: .. code-block:: console - ./app/testpmd -c ffff -n 4 -- -i --disable-rss --pkt-filter-mode=perfect \ + ./app/testpmd -l 0-15 -n 4 -- -i --disable-rss --pkt-filter-mode=perfect \ --rxq=8 --txq=8 --nb-cores=8 --nb-ports=1 Add a rule to direct ``ipv4-udp`` packet whose ``dst_ip=2.2.2.5, src_ip=2.2.2.3, src_port=32, dst_port=32`` to queue 1: @@ -444,8 +404,8 @@ is used as the VF driver, DPDK cannot choose 16 byte receive descriptor. That is to say, user should keep ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n`` in config file. -Link down with i40e kernel driver after DPDK application exist -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Link down with i40e kernel driver after DPDK application exit +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ After DPDK application quit, and the device is bound back to Linux i40e kernel driver, the link cannot be up after ``ifconfig <dev> up``. 
@@ -459,3 +419,31 @@ Receive packets with Ethertype 0x88A8
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Due to the FW limitation, PF can receive packets with Ethertype 0x88A8
 only when floating VEB is disabled.
+
+Incorrect Rx statistics when packet is oversized
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+When a packet is over the maximum frame size, the packet is dropped.
+However, the Rx statistics reported by `rte_eth_stats_get` incorrectly
+show it as received.
+
+VF & TC max bandwidth setting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The per VF max bandwidth and per TC max bandwidth cannot be enabled in parallel.
+The behavior is different when handling per VF and per TC max bandwidth setting.
+When enabling per VF max bandwidth, SW will check if per TC max bandwidth is
+enabled. If so, a failure is returned.
+When enabling per TC max bandwidth, SW will check if per VF max bandwidth
+is enabled. If so, per VF max bandwidth is disabled before continuing with the
+per TC max bandwidth setting.
+
+TC TX scheduling mode setting
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There are two TX scheduling modes for TCs: round robin and strict priority.
+If a TC is set to strict priority mode, it can consume unlimited bandwidth.
+This means that if the application has set the max bandwidth for that TC, it
+has no effect.
+It is suggested to set the strict priority mode for a TC that is latency
+sensitive but not consuming much bandwidth.
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index 92d56a59..240d0824 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -36,9 +36,13 @@ Network Interface Controller Drivers
     :numbered:
 
     overview
+    build_and_test
+    ark
+    avp
     bnx2x
     bnxt
     cxgbe
+    dpaa2
     e1000em
     ena
     enic
@@ -46,11 +50,15 @@ Network Interface Controller Drivers
     i40e
     ixgbe
     intel_vf
+    kni
+    liquidio
     mlx4
     mlx5
     nfp
     qede
+    sfc_efx
     szedata2
+    tap
     thunderx
     virtio
     vhost
diff --git a/doc/guides/nics/intel_vf.rst b/doc/guides/nics/intel_vf.rst
index 9fe42093..1e83bf6e 100644
--- a/doc/guides/nics/intel_vf.rst
+++ b/doc/guides/nics/intel_vf.rst
@@ -124,12 +124,12 @@ However:
     The above is an important consideration to take into account when targeting specific packets to a selected port.
 
-Intel® Fortville 10/40 Gigabit Ethernet Controller VF Infrastructure
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Intel® X710/XL710 Gigabit Ethernet Controller VF Infrastructure
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 In a virtualized environment, the programmer can enable a maximum of *128 Virtual Functions (VF)*
-globally per Intel® Fortville 10/40 Gigabit Ethernet Controller NIC device.
-Each VF can have a maximum of 16 queue pairs.
+globally per Intel® X710/XL710 Gigabit Ethernet Controller NIC device.
+The number of queue pairs of each VF can be configured by ``CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF`` in ``config`` file.
 The Physical Function in host could be either configured by the Linux* i40e driver
 (in the case of the Linux Kernel-based Virtual Machine [KVM]) or by DPDK PMD PF driver.
 When using both DPDK PMD PF/VF drivers, the whole NIC will be taken over by DPDK based application.
@@ -156,44 +156,6 @@ For example,
 
     Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.
 
-* Using the DPDK PMD PF ixgbe driver to enable VF RSS:
-
-  Same steps as above to install the modules of uio, igb_uio, specify max_vfs for PCI device, and
-  launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library.
- - The available queue number(at most 4) per VF depends on the total number of pool, which is - determined by the max number of VF at PF initialization stage and the number of queue specified - in config: - - * If the max number of VF is set in the range of 1 to 32: - - If the number of rxq is specified as 4(e.g. '--rxq 4' in testpmd), then there are totally 32 - pools(ETH_32_POOLS), and each VF could have 4 or less(e.g. 2) queues; - - If the number of rxq is specified as 2(e.g. '--rxq 2' in testpmd), then there are totally 32 - pools(ETH_32_POOLS), and each VF could have 2 queues; - - * If the max number of VF is in the range of 33 to 64: - - If the number of rxq is 4 ('--rxq 4' in testpmd), then error message is expected as rxq is not - correct at this case; - - If the number of rxq is 2 ('--rxq 2' in testpmd), then there is totally 64 pools(ETH_64_POOLS), - and each VF have 2 queues; - - On host, to enable VF RSS functionality, rx mq mode should be set as ETH_MQ_RX_VMDQ_RSS - or ETH_MQ_RX_RSS mode, and SRIOV mode should be activated(max_vfs >= 1). - It also needs config VF RSS information like hash function, RSS key, RSS key length. - - .. code-block:: console - - testpmd -c 0xffff -n 4 -- --coremask=<core-mask> --rxq=4 --txq=4 -i - - The limitation for VF RSS on Intel® 82599 10 Gigabit Ethernet Controller is: - The hash and key are shared among PF and all VF, the RETA table with 128 entries is also shared - among PF and all VF; So it could not to provide a method to query the hash and reta content per - VF on guest, while, if possible, please query them on host(PF) for the shared RETA information. - Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC. When you enable the four Virtual Functions with the above command, the four enabled functions have a Function# represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3. @@ -207,6 +169,9 @@ However: The above is an important consideration to take into account when targeting specific packets to a selected port. + For Intel® X710/XL710 Gigabit Ethernet Controller, queues are in pairs. One queue pair means one receive queue and + one transmit queue. The default number of queue pairs per VF is 4, and can be 16 in maximum. + Intel® 82599 10 Gigabit Ethernet Controller VF Infrastructure ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -241,6 +206,42 @@ For example, Launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library. +* Using the DPDK PMD PF ixgbe driver to enable VF RSS: + + Same steps as above to install the modules of uio, igb_uio, specify max_vfs for PCI device, and + launch the DPDK testpmd/example or your own host daemon application using the DPDK PMD library. 
+
+  The available queue number (at most 4) per VF depends on the total number
+  of pools, which is determined by the max number of VFs at PF initialization
+  stage and the number of queues specified in config:
+
+  * If the max number of VFs (max_vfs) is set in the range of 1 to 32:
+
+    If the number of Rx queues is specified as 4 (``--rxq=4`` in testpmd), then there are totally 32
+    pools (ETH_32_POOLS), and each VF could have 4 Rx queues;
+
+    If the number of Rx queues is specified as 2 (``--rxq=2`` in testpmd), then there are totally 32
+    pools (ETH_32_POOLS), and each VF could have 2 Rx queues;
+
+  * If the max number of VFs (max_vfs) is in the range of 33 to 64:
+
+    If the number of Rx queues is specified as 4 (``--rxq=4`` in testpmd), then an error message is
+    expected as ``rxq`` is not correct in this case;
+
+    If the number of rxq is 2 (``--rxq=2`` in testpmd), then there are totally 64 pools (ETH_64_POOLS),
+    and each VF has 2 Rx queues;
+
+  On host, to enable VF RSS functionality, rx mq mode should be set as ETH_MQ_RX_VMDQ_RSS
+  or ETH_MQ_RX_RSS mode, and SRIOV mode should be activated (max_vfs >= 1).
+  It is also necessary to configure VF RSS information such as hash function, RSS key and
+  RSS key length.
+
+.. note::
+
+    The limitation for VF RSS on Intel® 82599 10 Gigabit Ethernet Controller is:
+    the hash and key are shared among PF and all VFs, and the RETA table with 128 entries is also
+    shared among PF and all VFs. It is therefore not possible to query the hash and RETA content
+    per VF on the guest; if needed, please query them on the host for the shared RETA information.
+
 Virtual Function enumeration is performed in the following sequence by the Linux* pci driver for a dual-port NIC.
 When you enable the four Virtual Functions with the above command, the four enabled functions have a Function#
 represented by (Bus#, Device#, Function#) in sequence starting from 0 to 3.
@@ -513,7 +514,7 @@ The setup procedure is as follows:
 
     .. code-block:: console
 
        make install T=x86_64-native-linuxapp-gcc
-       ./x86_64-native-linuxapp-gcc/app/testpmd -c f -n 4 -- -i
+       ./x86_64-native-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -- -i
 
 #. Finally, access the Guest OS using vncviewer with the localhost:5900 port and check the lspci command output in the Guest OS.
    The virtual functions will be listed as available for use.
diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst
index 3b6851b6..696ff693 100644
--- a/doc/guides/nics/ixgbe.rst
+++ b/doc/guides/nics/ixgbe.rst
@@ -95,9 +95,6 @@ Other features are supported using optional MACRO configuration. They include:
 
 * HW extend dual VLAN
 
-* Enabled by RX_OLFLAGS (RTE_IXGBE_RX_OLFLAGS_ENABLE=y)
-
-
 To guarantee the constraint, configuration flags in dev_conf.rxmode will be checked:
 
 * hw_vlan_strip
@@ -129,7 +126,7 @@ The tx_rs_thresh value must be greater than or equal to RTE_PMD_IXGBE_TX_MAX_BURST,
 but less or equal to RTE_IXGBE_TX_MAX_FREE_BUF_SZ.
 Consequently, by default the tx_rs_thresh value is in the range 32 to 64.
 
-Feature not Supported by RX Vector PMD
+Feature not Supported by TX Vector PMD
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 TX vPMD only works when txq_flags is set to IXGBE_SIMPLE_FLAGS.
@@ -148,45 +145,33 @@ The following MACROs are used for these three features:
 
 * ETH_TXQ_FLAGS_NOXSUMTCP
 
 Application Programming Interface
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+---------------------------------
 
 In DPDK release v16.11 an API for ixgbe specific functions has been added to the ixgbe PMD.
The declarations for the API functions are in the header ``rte_pmd_ixgbe.h``. Sample Application Notes -~~~~~~~~~~~~~~~~~~~~~~~~ - -testpmd -^^^^^^^ - -By default, using CONFIG_RTE_IXGBE_RX_OLFLAGS_ENABLE=y: - -.. code-block:: console - - ./x86_64-native-linuxapp-gcc/app/testpmd -c 300 -n 4 -- -i --burst=32 --rxfreet=32 --mbcache=250 --txpt=32 --rxht=8 --rxwt=0 --txfreet=32 --txrst=32 --txqflags=0xf01 - -When CONFIG_RTE_IXGBE_RX_OLFLAGS_ENABLE=n, better performance can be achieved: - -.. code-block:: console - - ./x86_64-native-linuxapp-gcc/app/testpmd -c 300 -n 4 -- -i --burst=32 --rxfreet=32 --mbcache=250 --txpt=32 --rxht=8 --rxwt=0 --txfreet=32 --txrst=32 --txqflags=0xf01 --disable-hw-vlan +------------------------ l3fwd -^^^^^ +~~~~~ When running l3fwd with vPMD, there is one thing to note. In the configuration, ensure that port_conf.rxmode.hw_ip_checksum=0. Otherwise, by default, RX vPMD is disabled. load_balancer -^^^^^^^^^^^^^ +~~~~~~~~~~~~~ As in the case of l3fwd, set configure port_conf.rxmode.hw_ip_checksum=0 to enable vPMD. In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer to avoid using the default burst size of 144. +Limitations or Known issues +--------------------------- + Malicious Driver Detection not Supported ----------------------------------------- +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The Intel x550 series NICs support a feature called MDD (Malicious Driver Detection) which checks the behavior of the VF driver. @@ -200,14 +185,17 @@ There's significant performance impact to support MDD. DPDK should check if the advanced context descriptor should be set and set it. And DPDK has to ask the info about the header length from the upper layer, because parsing the packet itself is not acceptable. So, it's too expensive to support MDD. -When using kernel PF + DPDK VF on x550, please make sure using the kernel -driver that disables MDD or can disable MDD. (Some kernel driver can use -this CLI 'insmod ixgbe.ko MDD=0,0' to disable MDD. Some kernel driver disables -it by default.) +When using kernel PF + DPDK VF on x550, please make sure to use a kernel +PF driver that disables MDD or can disable MDD. + +Some kernel drivers already disable MDD by default while some kernels can use +the command ``insmod ixgbe.ko MDD=0,0`` to disable MDD. Each "0" in the +command refers to a port. For example, if there are 6 ixgbe ports, the command +should be changed to ``insmod ixgbe.ko MDD=0,0,0,0,0,0``. Statistics ----------- +~~~~~~~~~~ The statistics of ixgbe hardware must be polled regularly in order for it to remain consistent. Running a DPDK application without polling the statistics will @@ -230,6 +218,15 @@ be calculated as follows: In order to ensure valid results, it is recommended to poll every 4 minutes. +MTU setting +~~~~~~~~~~~ + +Although the user can set the MTU separately on PF and VF ports, the ixgbe NIC +only supports one global MTU per physical port. +So when the user sets different MTUs on PF and VF ports in one physical port, +the real MTU for all these PF and VF ports is the largest value set. +This behavior is based on the kernel driver behavior. + Supported Chipsets and NICs --------------------------- diff --git a/doc/guides/nics/kni.rst b/doc/guides/nics/kni.rst new file mode 100644 index 00000000..77542b56 --- /dev/null +++ b/doc/guides/nics/kni.rst @@ -0,0 +1,197 @@ +.. BSD LICENSE + Copyright(c) 2017 Intel Corporation. All rights reserved. + All rights reserved. 
+
+   Redistribution and use in source and binary forms, with or without
+   modification, are permitted provided that the following conditions
+   are met:
+
+   * Redistributions of source code must retain the above copyright
+     notice, this list of conditions and the following disclaimer.
+   * Redistributions in binary form must reproduce the above copyright
+     notice, this list of conditions and the following disclaimer in
+     the documentation and/or other materials provided with the
+     distribution.
+   * Neither the name of Intel Corporation nor the names of its
+     contributors may be used to endorse or promote products derived
+     from this software without specific prior written permission.
+
+   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+KNI Poll Mode Driver
+====================
+
+The KNI PMD is a wrapper around the :ref:`librte_kni <kni>` library.
+
+This PMD enables using KNI without having a KNI specific application;
+any forwarding application can use the PMD interface for KNI.
+
+Sending packets to any DPDK controlled interface or sending to the
+Linux networking stack will be transparent to the DPDK application.
+
+To create a KNI device, the ``net_kni#`` device name should be used; this
+will create a ``kni#`` Linux virtual network interface.
+
+There is no physical device backend for the virtual KNI device.
+
+Packets sent to the KNI Linux interface will be received by the DPDK
+application, and the DPDK application may forward packets to a physical NIC
+or to a virtual device (like another KNI interface or PCAP interface).
+
+To forward any traffic from a physical NIC to the Linux networking stack,
+an application should control a physical port, create one virtual KNI port,
+and forward between the two.
+
+Using this PMD requires the KNI kernel module to be inserted.
+
+
+Usage
+-----
+
+The EAL ``--vdev`` argument can be used to create a KNI device instance, like::
+
+    testpmd --vdev=net_kni0 --vdev=net_kni1 -- -i
+
+The above command will create ``kni0`` and ``kni1`` Linux network interfaces;
+those interfaces can be controlled by standard Linux tools.
+
+When testpmd forwarding starts, any packets sent to the ``kni0`` interface
+are forwarded to the ``kni1`` interface and vice versa.
+
+There is no hard limit on the number of interfaces that can be created.
+
+
+Default interface configuration
+-------------------------------
+
+``librte_kni`` can create Linux network interfaces with different features;
+the feature set is controlled by a configuration struct, and the KNI PMD uses
+a fixed configuration:
+
+    .. code-block:: console
+
+        Interface name: kni#
+        force bind kernel thread to a core : NO
+        mbuf size: MAX_PACKET_SZ
+
+KNI control path is not supported with the PMD, since there is no physical
+backend device by default.
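As noted in the Usage section above, the created interfaces are ordinary Linux network interfaces. A small sketch of driving one with standard tools follows; the interface name and address are illustrative assumptions:

.. code-block:: console

    # Bring up the PMD-created interface and assign a test address
    # (kni0 and the address below are examples only).
    ip link set kni0 up
    ip addr add 192.168.10.1/24 dev kni0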
+
+
+PMD arguments
+-------------
+
+``no_request_thread``: by default the PMD creates a pthread for each KNI interface
+to handle Linux network interface control commands, like ``ifconfig kni0 up``.
+
+With the ``no_request_thread`` option, the pthread is not created and control
+commands are not handled by the PMD.
+
+By default the request thread is enabled. This argument should not be used
+most of the time, unless this PMD is used with a customized DPDK application
+that handles requests itself.
+
+Argument usage::
+
+    testpmd --vdev "net_kni0,no_request_thread=1" -- -i
+
+
+PMD log messages
+----------------
+
+If the KNI kernel module (rte_kni.ko) is not inserted, the following error
+log is printed::
+
+    "KNI: KNI subsystem has not been initialized. Invoke rte_kni_init() first"
+
+
+PMD testing
+-----------
+
+It is possible to test the PMD quickly using the KNI kernel module loopback
+feature:
+
+* Insert the KNI kernel module with loopback support:
+
+  .. code-block:: console
+
+      insmod build/kmod/rte_kni.ko lo_mode=lo_mode_fifo_skb
+
+* Start testpmd with no physical device but two KNI virtual devices:
+
+  .. code-block:: console
+
+      ./testpmd --vdev net_kni0 --vdev net_kni1 -- -i
+
+  .. code-block:: console
+
+      ...
+      Configuring Port 0 (socket 0)
+      KNI: pci: 00:00:00 c580:b8
+      Port 0: 1A:4A:5B:7C:A2:8C
+      Configuring Port 1 (socket 0)
+      KNI: pci: 00:00:00 600:b9
+      Port 1: AE:95:21:07:93:DD
+      Checking link statuses...
+      Port 0 Link Up - speed 10000 Mbps - full-duplex
+      Port 1 Link Up - speed 10000 Mbps - full-duplex
+      Done
+      testpmd>
+
+* Observe the Linux interfaces:
+
+  .. code-block:: console
+
+      $ ifconfig kni0 && ifconfig kni1
+      kni0: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+          ether ae:8e:79:8e:9b:c8  txqueuelen 1000  (Ethernet)
+          RX packets 0  bytes 0 (0.0 B)
+          RX errors 0  dropped 0  overruns 0  frame 0
+          TX packets 0  bytes 0 (0.0 B)
+          TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+      kni1: flags=4098<BROADCAST,MULTICAST>  mtu 1500
+          ether 9e:76:43:53:3e:9b  txqueuelen 1000  (Ethernet)
+          RX packets 0  bytes 0 (0.0 B)
+          RX errors 0  dropped 0  overruns 0  frame 0
+          TX packets 0  bytes 0 (0.0 B)
+          TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
+
+* Start forwarding with tx_first:
+
+  .. code-block:: console
+
+      testpmd> start tx_first
+
+* Quit and check forwarding stats:
+
+  .. code-block:: console
+
+      testpmd> quit
+      Telling cores to stop...
+      Waiting for lcores to finish...
+
+      ---------------------- Forward statistics for port 0  ----------------------
+      RX-packets: 35637905       RX-dropped: 0             RX-total: 35637905
+      TX-packets: 35637947       TX-dropped: 0             TX-total: 35637947
+      ----------------------------------------------------------------------------
+
+      ---------------------- Forward statistics for port 1  ----------------------
+      RX-packets: 35637915       RX-dropped: 0             RX-total: 35637915
+      TX-packets: 35637937       TX-dropped: 0             TX-total: 35637937
+      ----------------------------------------------------------------------------
+
+      +++++++++++++++ Accumulated forward statistics for all ports+++++++++++++++
+      RX-packets: 71275820       RX-dropped: 0             RX-total: 71275820
+      TX-packets: 71275884       TX-dropped: 0             TX-total: 71275884
+      ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+
diff --git a/doc/guides/nics/liquidio.rst b/doc/guides/nics/liquidio.rst
new file mode 100644
index 00000000..f04cb16d
--- /dev/null
+++ b/doc/guides/nics/liquidio.rst
@@ -0,0 +1,223 @@
+.. BSD LICENSE
+   Copyright(c) 2017 Cavium, Inc. All rights reserved.
+   All rights reserved.
+ + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Cavium, Inc. nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER(S) OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +LiquidIO VF Poll Mode Driver +============================ + +The LiquidIO VF PMD library (librte_pmd_lio) provides poll mode driver support for +Cavium LiquidIO® II server adapter VFs. PF management and VF creation can be +done using kernel driver. + +More information can be found at `Cavium Official Website +<http://cavium.com/LiquidIO_Adapters.html>`_. + +Supported LiquidIO Adapters +----------------------------- + +- LiquidIO II CN2350 210SV/225SV +- LiquidIO II CN2360 210SV/225SV + + +Pre-Installation Configuration +------------------------------ + +The following options can be modified in the ``config`` file. +Please note that enabling debugging options may affect system performance. + +- ``CONFIG_RTE_LIBRTE_LIO_PMD`` (default ``y``) + + Toggle compilation of LiquidIO PMD. + +- ``CONFIG_RTE_LIBRTE_LIO_DEBUG_DRIVER`` (default ``n``) + + Toggle display of generic debugging messages. + +- ``CONFIG_RTE_LIBRTE_LIO_DEBUG_INIT`` (default ``n``) + + Toggle display of initialization related messages. + +- ``CONFIG_RTE_LIBRTE_LIO_DEBUG_RX`` (default ``n``) + + Toggle display of receive fast path run-time messages. + +- ``CONFIG_RTE_LIBRTE_LIO_DEBUG_TX`` (default ``n``) + + Toggle display of transmit fast path run-time messages. + +- ``CONFIG_RTE_LIBRTE_LIO_DEBUG_MBOX`` (default ``n``) + + Toggle display of mailbox messages. + +- ``CONFIG_RTE_LIBRTE_LIO_DEBUG_REGS`` (default ``n``) + + Toggle display of register reads and writes. + + +SR-IOV: Prerequisites and Sample Application Notes +-------------------------------------------------- + +This section provides instructions to configure SR-IOV with Linux OS. + +#. Verify SR-IOV and ARI capabilities are enabled on the adapter using ``lspci``: + + .. code-block:: console + + lspci -s <slot> -vvv + + Example output: + + .. code-block:: console + + [...] + Capabilities: [148 v1] Alternative Routing-ID Interpretation (ARI) + [...] + Capabilities: [178 v1] Single Root I/O Virtualization (SR-IOV) + [...] + Kernel driver in use: LiquidIO + +#. Load the kernel module: + + .. 
code-block:: console + + modprobe liquidio + +#. Bring up the PF ports: + + .. code-block:: console + + ifconfig p4p1 up + ifconfig p4p2 up + +#. Change PF MTU if required: + + .. code-block:: console + + ifconfig p4p1 mtu 9000 + ifconfig p4p2 mtu 9000 + +#. Create VF device(s): + + Echo number of VFs to be created into ``"sriov_numvfs"`` sysfs entry + of the parent PF. + + .. code-block:: console + + echo 1 > /sys/bus/pci/devices/0000:03:00.0/sriov_numvfs + echo 1 > /sys/bus/pci/devices/0000:03:00.1/sriov_numvfs + +#. Assign VF MAC address: + + Assign MAC address to the VF using iproute2 utility. The syntax is:: + + ip link set <PF iface> vf <VF id> mac <macaddr> + + Example output: + + .. code-block:: console + + ip link set p4p1 vf 0 mac F2:A8:1B:5E:B4:66 + +#. Assign VF(s) to VM. + + The VF devices may be passed through to the guest VM using qemu or + virt-manager or virsh etc. + + Example qemu guest launch command: + + .. code-block:: console + + ./qemu-system-x86_64 -name lio-vm -machine accel=kvm \ + -cpu host -m 4096 -smp 4 \ + -drive file=<disk_file>,if=none,id=disk1,format=<type> \ + -device virtio-blk-pci,scsi=off,drive=disk1,id=virtio-disk1,bootindex=1 \ + -device vfio-pci,host=03:00.3 -device vfio-pci,host=03:08.3 + +#. Running testpmd + + Refer to the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to run + ``testpmd`` application. + + .. note:: + + Use ``igb_uio`` instead of ``vfio-pci`` in VM. + + Example output: + + .. code-block:: console + + [...] + EAL: PCI device 0000:03:00.3 on NUMA socket 0 + EAL: probe driver: 177d:9712 net_liovf + EAL: using IOMMU type 1 (Type 1) + PMD: net_liovf[03:00.3]INFO: DEVICE : CN23XX VF + EAL: PCI device 0000:03:08.3 on NUMA socket 0 + EAL: probe driver: 177d:9712 net_liovf + PMD: net_liovf[03:08.3]INFO: DEVICE : CN23XX VF + Interactive-mode selected + USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0 + Configuring Port 0 (socket 0) + PMD: net_liovf[03:00.3]INFO: Starting port 0 + Port 0: F2:A8:1B:5E:B4:66 + Configuring Port 1 (socket 0) + PMD: net_liovf[03:08.3]INFO: Starting port 1 + Port 1: 32:76:CC:EE:56:D7 + Checking link statuses... + Port 0 Link Up - speed 10000 Mbps - full-duplex + Port 1 Link Up - speed 10000 Mbps - full-duplex + Done + testpmd> + + +Limitations +----------- + +VF MTU +~~~~~~ + +VF MTU is limited by PF MTU. Raise PF value before configuring VF for larger packet size. + +VLAN offload +~~~~~~~~~~~~ + +Tx VLAN insertion is not supported and consequently VLAN offload feature is +marked partial. + +Ring size +~~~~~~~~~ + +Number of descriptors for Rx/Tx ring should be in the range 128 to 512. + +CRC striping +~~~~~~~~~~~~ + +LiquidIO adapters strip ethernet FCS of every packet coming to the host +interface. So, CRC will be stripped even when the ``rxmode.hw_strip_crc`` +member is set to 0 in ``struct rte_eth_conf``. diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst index 49f46263..f1f26d4f 100644 --- a/doc/guides/nics/mlx4.rst +++ b/doc/guides/nics/mlx4.rst @@ -162,6 +162,12 @@ Run-time configuration - **ethtool** operations on related kernel interfaces also affect the PMD. +- ``port`` parameter [int] + + This parameter provides a physical port to probe and can be specified multiple + times for additional ports. All ports are probed by default if left + unspecified. + Kernel module parameters ~~~~~~~~~~~~~~~~~~~~~~~~ @@ -238,8 +244,8 @@ DPDK and must be installed separately: Currently supported by DPDK: -- Mellanox OFED **3.1**. 
-- Firmware version **2.35.5100** and higher. +- Mellanox OFED **4.0-2.0.0.0**. +- Firmware version **2.40.7000**. - Supported architectures: **x86_64** and **POWER8**. Getting Mellanox OFED @@ -262,6 +268,11 @@ required from that distribution. this DPDK release was developed and tested against is strongly recommended. Please check the `prerequisites`_. +Supported NICs +-------------- + +* Mellanox(R) ConnectX(R)-3 Pro 40G MCX354A-FCC_Ax (2*40G) + Usage example ------------- @@ -338,7 +349,7 @@ devices managed by librte_pmd_mlx4. .. code-block:: console - testpmd -c 0xff00 -n 4 -w 0000:83:00.0 -w 0000:84:00.0 -- --rxq=2 --txq=2 -i + testpmd -l 8-15 -n 4 -w 0000:83:00.0 -w 0000:84:00.0 -- --rxq=2 --txq=2 -i Example output: diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 98d13419..da6dc278 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -30,10 +30,10 @@ MLX5 poll mode driver ===================== -The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support for -**Mellanox ConnectX-4** and **Mellanox ConnectX-4 Lx** families of -10/25/40/50/100 Gb/s adapters as well as their virtual functions (VF) in -SR-IOV context. +The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support +for **Mellanox ConnectX-4**, **Mellanox ConnectX-4 Lx** and **Mellanox +ConnectX-5** families of 10/25/40/50/100 Gb/s adapters as well as their +virtual functions (VF) in SR-IOV context. Information and documentation about these adapters can be found on the `Mellanox website <http://www.mellanox.com>`__. Help is also provided by the @@ -86,16 +86,19 @@ Features - Hardware checksum offloads. - Flow director (RTE_FDIR_MODE_PERFECT, RTE_FDIR_MODE_PERFECT_MAC_VLAN and RTE_ETH_FDIR_REJECT). +- Flow API. - Secondary process TX is supported. - KVM and VMware ESX SR-IOV modes are supported. - RSS hash result is supported. +- Hardware TSO. +- Hardware checksum TX offload for VXLAN and GRE. Limitations ----------- - Inner RSS for VXLAN frames is not supported yet. - Port statistics through software counters only. -- Hardware checksum offloads for VXLAN inner header are not supported yet. +- Hardware checksum RX offloads for VXLAN inner header are not supported yet. - Secondary process RX is not supported. Configuration @@ -180,13 +183,47 @@ Run-time configuration - ``txq_mpw_en`` parameter [int] - A nonzero value enables multi-packet send. This feature allows the TX - burst function to pack up to five packets in two descriptors in order to - save PCI bandwidth and improve performance at the cost of a slightly - higher CPU usage. + A nonzero value enables multi-packet send (MPS) for ConnectX-4 Lx and + enhanced multi-packet send (Enhanced MPS) for ConnectX-5. MPS allows the + TX burst function to pack up multiple packets in a single descriptor + session in order to save PCI bandwidth and improve performance at the + cost of a slightly higher CPU usage. When ``txq_inline`` is set along + with ``txq_mpw_en``, TX burst function tries to copy entire packet data + on to TX descriptor instead of including pointer of packet only if there + is enough room remained in the descriptor. ``txq_inline`` sets + per-descriptor space for either pointers or inlined packets. In addition, + Enhanced MPS supports hybrid mode - mixing inlined packets and pointers + in the same descriptor. - It is currently only supported on the ConnectX-4 Lx family of adapters. - Enabled by default. + This option cannot be used in conjunction with ``tso`` below. 
When ``tso``
+  is set, ``txq_mpw_en`` is disabled.
+
+  It is currently only supported on the ConnectX-4 Lx and ConnectX-5
+  families of adapters. Enabled by default.
+
+- ``txq_mpw_hdr_dseg_en`` parameter [int]
+
+  A nonzero value enables including two pointers in the first block of a TX
+  descriptor. This can be used to lessen CPU load for memory copy.
+
+  Effective only when Enhanced MPS is supported. Disabled by default.
+
+- ``txq_max_inline_len`` parameter [int]
+
+  Maximum size of a packet to be inlined. If the size of a packet is larger
+  than the configured value, the packet isn't inlined even though there is
+  enough space remaining in the descriptor; instead, the packet is included
+  by pointer.
+
+  Effective only when Enhanced MPS is supported. The default value is 256.
+
+- ``tso`` parameter [int]
+
+  A nonzero value enables hardware TSO.
+  When hardware TSO is enabled, packets marked with TCP segmentation
+  offload will be divided into segments by the hardware.
+
+  Disabled by default.
 
 Prerequisites
 -------------
@@ -207,8 +244,8 @@ DPDK and must be installed separately:
 
 - **libmlx5**
 
-  Low-level user space driver library for Mellanox ConnectX-4 devices,
-  it is automatically loaded by libibverbs.
+  Low-level user space driver library for Mellanox ConnectX-4/ConnectX-5
+  devices, it is automatically loaded by libibverbs.
 
   This library basically implements send/receive calls to the hardware
   queues.
@@ -222,14 +259,15 @@ DPDK and must be installed separately:
   Unlike most other PMDs, these modules must remain loaded and bound to
   their devices:
 
-  - mlx5_core: hardware driver managing Mellanox ConnectX-4 devices and
-    related Ethernet kernel network devices.
+  - mlx5_core: hardware driver managing Mellanox ConnectX-4/ConnectX-5
+    devices and related Ethernet kernel network devices.
   - mlx5_ib: InfiniBand device driver.
   - ib_uverbs: user space driver for Verbs (entry point for libibverbs).
 
 - **Firmware update**
 
-  Mellanox OFED releases include firmware updates for ConnectX-4 adapters.
+  Mellanox OFED releases include firmware updates for ConnectX-4/ConnectX-5
+  adapters.
 
   Because each release provides new features, these updates must be applied to
   match the kernel modules and libraries they come with.
@@ -241,12 +279,13 @@ DPDK and must be installed separately:
 
 Currently supported by DPDK:
 
-- Mellanox OFED **3.4-1.0.0.0**.
-
+- Mellanox OFED version: **4.0-2.0.0.0**
 - firmware version:
-  - ConnectX-4: **12.17.1010**
-  - ConnectX-4 Lx: **14.17.1010**
+  - ConnectX-4: **12.18.2000**
+  - ConnectX-4 Lx: **14.18.2000**
+  - ConnectX-5: **16.19.1200**
+  - ConnectX-5 Ex: **16.19.1200**
 
 Getting Mellanox OFED
 ~~~~~~~~~~~~~~~~~~~~~
@@ -268,6 +307,29 @@ required from that distribution.
    this DPDK release was developed and tested against is strongly
    recommended. Please check the `prerequisites`_.
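To confirm that the OFED and firmware versions listed above are actually installed, the standard Mellanox OFED tools can be used. This is a hedged sketch; availability of these utilities and the exact output format depend on the OFED installation:

.. code-block:: console

    # Report the installed Mellanox OFED version.
    ofed_info -s

    # Report the firmware version (fw_ver) of each ConnectX device.
    ibv_devinfo | grep fw_ver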
+Supported NICs
+--------------
+
+* Mellanox(R) ConnectX(R)-4 10G MCX4111A-XCAT (1x10G)
+* Mellanox(R) ConnectX(R)-4 10G MCX4121A-XCAT (2x10G)
+* Mellanox(R) ConnectX(R)-4 25G MCX4111A-ACAT (1x25G)
+* Mellanox(R) ConnectX(R)-4 25G MCX4121A-ACAT (2x25G)
+* Mellanox(R) ConnectX(R)-4 40G MCX4131A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 40G MCX413A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 40G MCX415A-BCAT (1x40G)
+* Mellanox(R) ConnectX(R)-4 50G MCX4131A-GCAT (1x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX413A-GCAT (1x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX414A-BCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX415A-GCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX416A-BCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 50G MCX416A-GCAT (2x50G)
+* Mellanox(R) ConnectX(R)-4 100G MCX415A-CCAT (1x100G)
+* Mellanox(R) ConnectX(R)-4 100G MCX416A-CCAT (2x100G)
+* Mellanox(R) ConnectX(R)-4 Lx 10G MCX4121A-XCAT (2x10G)
+* Mellanox(R) ConnectX(R)-4 Lx 25G MCX4121A-ACAT (2x25G)
+* Mellanox(R) ConnectX(R)-5 100G MCX556A-ECAT (2x100G)
+* Mellanox(R) ConnectX(R)-5 Ex EN 100G MCX516A-CDAT (2x100G)
+
 Notes for testpmd
 -----------------
@@ -288,8 +350,8 @@ behavior as librte_pmd_mlx4:
 Usage example
 -------------
-This section demonstrates how to launch **testpmd** with Mellanox ConnectX-4
-devices managed by librte_pmd_mlx5.
+This section demonstrates how to launch **testpmd** with Mellanox
+ConnectX-4/ConnectX-5 devices managed by librte_pmd_mlx5.
 #. Load the kernel modules:
@@ -356,7 +418,7 @@ devices managed by librte_pmd_mlx5.
 .. code-block:: console
- testpmd -c 0xff00 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i
+ testpmd -l 8-15 -n 4 -w 05:00.0 -w 05:00.1 -w 06:00.0 -w 06:00.1 -- --rxq=2 --txq=2 -i
 Example output:
diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst
index 4ef6e026..c732fb1f 100644
--- a/doc/guides/nics/nfp.rst
+++ b/doc/guides/nics/nfp.rst
@@ -68,10 +68,12 @@ Building the software
 ---------------------
 Netronome's PMD code is provided in the **drivers/net/nfp** directory.
-Because Netronome's BSP dependencies the driver is disabled by default
-in DPDK build using **common_linuxapp configuration** file. Enabling the
-driver or if you use another configuration file and want to have NFP
-support, this variable is needed:
+Although the NFP PMD has Netronome's BSP dependencies, it is possible to
+compile it along with other DPDK PMDs even if no BSP was installed
+previously. Of course, a DPDK application will require such a BSP to be
+installed in order to use the NFP PMD.
+
+The default PMD configuration is in the **common_linuxapp configuration** file:
 - **CONFIG_RTE_LIBRTE_NFP_PMD=y**
@@ -79,85 +81,15 @@ Once DPDK is built all the DPDK apps and examples include support for the
 NFP PMD.
code-block:: console - - /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - - This sysfs file is used to specify the number of hugepages to reserve. - For example: - - .. code-block:: console - - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - - This will reserve 2GB of memory using 1024 2MB hugepages. The file may be - read to see if the operation was performed correctly: - - .. code-block:: console - - cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages - - The number of unused hugepages may also be inspected. - - Before executing the DPDK app it should match the value of nr_hugepages. - - .. code-block:: console - - cat /sys/kernel/mm/hugepages/hugepages-2048kB/free_hugepages - - The hugepages reservation should be performed at system initialization and - it is usual to use a kernel parameter for configuration. If the reservation - is attempted on a busy system it will likely fail. Reserving memory for - hugepages may be done adding the following to the grub kernel command line: - - .. code-block:: console - - default_hugepagesz=1M hugepagesz=2M hugepages=1024 - - This will reserve 2GBytes of memory using 2Mbytes huge pages. - - Finally, for a NUMA system the allocation needs to be made on the correct - NUMA node. In a DPDK app there is a master core which will (usually) perform - memory allocation. It is important that some of the hugepages are reserved - on the NUMA memory node where the network device is attached. This is because - of a restriction in DPDK by which TX and RX descriptors rings must be created - on the master code. - - Per-node allocation of hugepages may be inspected and controlled using sysfs. - For example: - - .. code-block:: console +Driver compilation and testing +------------------------------ - cat /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. - For a NUMA system there will be a specific hugepage directory per node - allowing control of hugepage reservation. A common problem may occur when - hugepages reservation is performed after the system has been working for - some time. Configuration using the global sysfs hugepage interface will - succeed but the per-node allocations may be unsatisfactory. - The number of hugepages that need to be reserved depends on how the app uses - TX and RX descriptors, and packets mbufs. +System configuration +-------------------- #. **Enable SR-IOV on the NFP-6xxx device:** The current NFP PMD works with Virtual Functions (VFs) on a NFP device. Make sure that one of the Physical @@ -189,62 +121,3 @@ Using the NFP PMD is not different to using other PMDs. Usual steps are: -k option shows the device driver, if any, that devices are bound to. Depending on the modules loaded at this point the new PCI devices may be bound to nfp_netvf driver. - -#. **To install the uio kernel module (manually):** All major Linux - distributions have support for this kernel module so it is straightforward - to install it: - - .. code-block:: console - - modprobe uio - - The module should now be listed by the lsmod command. - -#. **To install the igb_uio kernel module (manually):** This module is part - of DPDK sources and configured by default (CONFIG_RTE_EAL_IGB_UIO=y). - - .. code-block:: console - - modprobe igb_uio.ko - - The module should now be listed by the lsmod command. 
-
- Depending on which NFP modules are loaded, it could be necessary to
- detach NFP devices from the nfp_netvf module. If this is the case the
- device needs to be unbound, for example:
-
- .. code-block:: console
-
- echo 0000:03:08.0 > /sys/bus/pci/devices/0000:03:08.0/driver/unbind
-
- lspci -d19ee: -k
-
- The output of lspci should now show that 0000:03:08.0 is not bound to
- any driver.
-
- The next step is to add the NFP PCI ID to the IGB UIO driver:
-
- .. code-block:: console
-
- echo 19ee 6003 > /sys/bus/pci/drivers/igb_uio/new_id
-
- And then to bind the device to the igb_uio driver:
-
- .. code-block:: console
-
- echo 0000:03:08.0 > /sys/bus/pci/drivers/igb_uio/bind
-
- lspci -d19ee: -k
-
- lspci should show that device bound to igb_uio driver.
-
-#. **Using scripts to install and bind modules:** DPDK provides scripts which are
- useful for installing the UIO modules and for binding the right device to those
- modules avoiding doing so manually:
-
- * **dpdk-setup.sh**
- * **dpdk-devbind.py**
-
- Configuration may be performed by running dpdk-setup.sh which invokes
- dpdk-devbind.py as needed. Executing dpdk-setup.sh will display a menu of
- configuration options.
diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst
index 2c7f5eb9..757a3c90 100644
--- a/doc/guides/nics/overview.rst
+++ b/doc/guides/nics/overview.rst
@@ -50,28 +50,6 @@ Most of these differences are summarized below.
 .. _table_net_pmd_features:
-.. raw:: html
-
- <style>
- table#id1 th {
- font-size: 80%;
- white-space: pre-wrap;
- text-align: center;
- vertical-align: top;
- padding: 2px;
- }
- table#id1 th:first-child {
- vertical-align: bottom;
- }
- table#id1 td {
- font-size: 70%;
- padding: 1px;
- }
- table#id1 td:first-child {
- padding-left: 1em;
- }
- </style>
-
 .. include:: overview_table.txt
 .. Note::
diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst
index 79c95255..5e4f5f60 100644
--- a/doc/guides/nics/pcap_ring.rst
+++ b/doc/guides/nics/pcap_ring.rst
@@ -69,7 +69,7 @@ Device name and stream options must be separated by commas as shown below:
 .. code-block:: console
- $RTE_TARGET/app/testpmd -c f -n 4 \
+ $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
 --vdev 'net_pcap0,stream_opt0=..,stream_opt1=..' \
 --vdev='net_pcap1,stream_opt0=..'
@@ -122,7 +122,7 @@ Read packets from one pcap file and write them to another:
 .. code-block:: console
- $RTE_TARGET/app/testpmd -c '0xf' -n 4 \
+ $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
 --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_pcap=file_tx.pcap' \
 -- --port-topology=chained
@@ -130,7 +130,7 @@ Read packets from a network interface and write them to a pcap file:
 .. code-block:: console
- $RTE_TARGET/app/testpmd -c '0xf' -n 4 \
+ $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
 --vdev 'net_pcap0,rx_iface=eth0,tx_pcap=file_tx.pcap' \
 -- --port-topology=chained
@@ -138,7 +138,7 @@ Read packets from a pcap file and write them to a network interface:
 .. code-block:: console
- $RTE_TARGET/app/testpmd -c '0xf' -n 4 \
+ $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
 --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_iface=eth1' \
 -- --port-topology=chained
@@ -146,7 +146,7 @@ Forward packets through two network interfaces:
 .. code-block:: console
- $RTE_TARGET/app/testpmd -c '0xf' -n 4 \
+ $RTE_TARGET/app/testpmd -l 0-3 -n 4 \
 --vdev 'net_pcap0,iface=eth0' --vdev='net_pcap1,iface=eth1'
 Using libpcap-based PMD with the testpmd Application
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -171,7 +171,7 @@ Otherwise, the first 512 packets from the input pcap file will be discarded by t
 ..
code-block:: console - $RTE_TARGET/app/testpmd -c '0xf' -n 4 \ + $RTE_TARGET/app/testpmd -l 0-3 -n 4 \ --vdev 'net_pcap0,rx_pcap=file_rx.pcap,tx_pcap=file_tx.pcap' \ -- --port-topology=chained --no-flush-rx @@ -185,7 +185,7 @@ Multiple devices may be specified, separated by commas. .. code-block:: console - ./testpmd -c E -n 4 --vdev=net_ring0 --vdev=net_ring1 -- -i + ./testpmd -l 1-3 -n 4 --vdev=net_ring0 --vdev=net_ring1 -- -i EAL: Detected lcore 1 as core 1 on socket 0 ... diff --git a/doc/guides/nics/qede.rst b/doc/guides/nics/qede.rst index d22ecdd9..afe2df89 100644 --- a/doc/guides/nics/qede.rst +++ b/doc/guides/nics/qede.rst @@ -32,7 +32,7 @@ QEDE Poll Mode Driver ====================== The QEDE poll mode driver library (**librte_pmd_qede**) implements support -for **QLogic FastLinQ QL4xxxx 25G/40G/100G CNA** family of adapters as well +for **QLogic FastLinQ QL4xxxx 10G/25G/40G/50G/100G CNA** family of adapters as well as their virtual functions (VF) in SR-IOV context. It is supported on several standard Linux distros like RHEL7.x, SLES12.x and Ubuntu. It is compile-tested under FreeBSD OS. @@ -59,27 +59,29 @@ Supported Features - MTU change - Multiprocess aware - Scatter-Gather +- VXLAN tunneling offload +- N-tuple filter and flow director (limited support) +- LRO/TSO Non-supported Features ---------------------- - SR-IOV PF -- Tunneling offloads -- LRO/TSO +- GENEVE and NVGRE Tunneling offloads - NPAR Supported QLogic Adapters ------------------------- -- QLogic FastLinQ QL4xxxx 10G/25G/40G/100G CNAs. +- QLogic FastLinQ QL4xxxx 10G/25G/40G/50G/100G CNAs. Prerequisites ------------- -- Requires firmware version **8.10.x.** and management firmware - version **8.10.x or higher**. Firmware may be available +- Requires firmware version **8.18.x.** and management firmware + version **8.18.x or higher**. Firmware may be available inbox in certain newer Linux distros under the standard directory - ``E.g. /lib/firmware/qed/qed_init_values-8.10.9.0.bin`` + ``E.g. /lib/firmware/qed/qed_init_values-8.18.9.0.bin`` - If the required firmware files are not available then visit `QLogic Driver Download Center <http://driverdownloads.qlogic.com>`_. @@ -118,72 +120,104 @@ enabling debugging options may affect system performance. - ``CONFIG_RTE_LIBRTE_QEDE_FW`` (default **""**) Gives absolute path of firmware file. - ``Eg: "/lib/firmware/qed/qed_init_values_zipped-8.10.9.0.bin"`` + ``Eg: "/lib/firmware/qed/qed_init_values_zipped-8.18.9.0.bin"`` Empty string indicates driver will pick up the firmware file from the default location. -Driver Compilation -~~~~~~~~~~~~~~~~~~ +Driver compilation and testing +------------------------------ -To compile QEDE PMD for Linux x86_64 gcc target, run the following ``make`` -command:: +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. - cd <DPDK-source-directory> - make config T=x86_64-native-linuxapp-gcc install +SR-IOV: Prerequisites and Sample Application Notes +-------------------------------------------------- -To compile QEDE PMD for Linux x86_64 clang target, run the following ``make`` -command:: +This section provides instructions to configure SR-IOV with Linux OS. - cd <DPDK-source-directory> - make config T=x86_64-native-linuxapp-clang install +**Note**: librte_pmd_qede will be used to bind to SR-IOV VF device and Linux native kernel driver (QEDE) will function as SR-IOV PF driver. Requires PF driver to be 8.10.x.x or higher. 
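+As a hedged sketch of the note above, the VF device can later be bound to a
+DPDK-compatible driver with the bundled script (the VF PCI address below is
+hypothetical and will depend on the ``sriov_numvfs`` step that follows):
+
+.. code-block:: console
+
+   ./usertools/dpdk-devbind.py --bind igb_uio 0000:81:02.0
+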
-To compile QEDE PMD for FreeBSD x86_64 clang target, run the following ``gmake`` -command:: +#. Verify SR-IOV and ARI capability is enabled on the adapter using ``lspci``: - cd <DPDK-source-directory> - gmake config T=x86_64-native-bsdapp-clang install + .. code-block:: console -To compile QEDE PMD for FreeBSD x86_64 gcc target, run the following ``gmake`` -command:: + lspci -s <slot> -vvv - cd <DPDK-source-directory> - gmake config T=x86_64-native-bsdapp-gcc install -Wl,-rpath=\ - /usr/local/lib/gcc49 CC=gcc49 + Example output: + .. code-block:: console -Sample Application Notes -~~~~~~~~~~~~~~~~~~~~~~~~ + [...] + Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI) + [...] + Capabilities: [1c0 v1] Single Root I/O Virtualization (SR-IOV) + [...] + Kernel driver in use: igb_uio -This section demonstrates how to launch ``testpmd`` with QLogic 4xxxx -devices managed by ``librte_pmd_qede`` in Linux operating system. +#. Load the kernel module: -#. Request huge pages: + .. code-block:: console + + modprobe qede + + Example output: + + .. code-block:: console + + systemd-udevd[4848]: renamed network interface eth0 to ens5f0 + systemd-udevd[4848]: renamed network interface eth1 to ens5f1 + +#. Bring up the PF ports: .. code-block:: console - echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages/ \ - nr_hugepages + ifconfig ens5f0 up + ifconfig ens5f1 up + +#. Create VF device(s): + + Echo the number of VFs to be created into ``"sriov_numvfs"`` sysfs entry + of the parent PF. -#. Load ``igb_uio`` driver: + Example output: .. code-block:: console - insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko + echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs -#. Bind the QLogic 4xxxx adapters to ``igb_uio`` loaded in the - previous step: + +#. Assign VF MAC address: + + Assign MAC address to the VF using iproute2 utility. The syntax is:: + + ip link set <PF iface> vf <VF id> mac <macaddr> + + Example output: .. code-block:: console - ./tools/dpdk-devbind.py --bind igb_uio 0000:84:00.0 0000:84:00.1 \ - 0000:84:00.2 0000:84:00.3 + ip link set ens5f0 vf 0 mac 52:54:00:2f:9d:e8 + -#. Start ``testpmd`` with basic parameters: - (Enable QEDE_DEBUG_INFO=y to view informational messages) +#. PCI Passthrough: + + The VF devices may be passed through to the guest VM using ``virt-manager`` or + ``virsh``. QEDE PMD should be used to bind the VF devices in the guest VM + using the instructions from Driver compilation and testing section above. + + +#. Running testpmd + (Enable QEDE_DEBUG_INFO=y to view informational messages): + + Refer to the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to run + ``testpmd`` application. + + Example output: .. code-block:: console - testpmd -c 0xff1 -n 4 -- -i --nb-cores=8 --portmask=0xf --rxd=4096 \ + testpmd -l 0,4-11 -n 4 -- -i --nb-cores=8 --portmask=0xf --rxd=4096 \ --txd=4096 --txfreet=4068 --enable-rx-cksum --rxq=4 --txq=4 \ --rss-ip --rss-udp @@ -234,79 +268,3 @@ devices managed by ``librte_pmd_qede`` in Linux operating system. Port 3 Link Up - speed 25000 Mbps - full-duplex Done testpmd> - - -SR-IOV: Prerequisites and Sample Application Notes -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This section provides instructions to configure SR-IOV with Linux OS. - -**Note**: librte_pmd_qede will be used to bind to SR-IOV VF device and Linux native kernel driver (QEDE) will function as SR-IOV PF driver. Requires PF driver to be 8.10.x.x or higher. - -#. 
Verify SR-IOV and ARI capability is enabled on the adapter using ``lspci``: - - .. code-block:: console - - lspci -s <slot> -vvv - - Example output: - - .. code-block:: console - - [...] - Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI) - [...] - Capabilities: [1c0 v1] Single Root I/O Virtualization (SR-IOV) - [...] - Kernel driver in use: igb_uio - -#. Load the kernel module: - - .. code-block:: console - - modprobe qede - - Example output: - - .. code-block:: console - - systemd-udevd[4848]: renamed network interface eth0 to ens5f0 - systemd-udevd[4848]: renamed network interface eth1 to ens5f1 - -#. Bring up the PF ports: - - .. code-block:: console - - ifconfig ens5f0 up - ifconfig ens5f1 up - -#. Create VF device(s): - - Echo the number of VFs to be created into ``"sriov_numvfs"`` sysfs entry - of the parent PF. - - Example output: - - .. code-block:: console - - echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs - - -#. Assign VF MAC address: - - Assign MAC address to the VF using iproute2 utility. The syntax is:: - - ip link set <PF iface> vf <VF id> mac <macaddr> - - Example output: - - .. code-block:: console - - ip link set ens5f0 vf 0 mac 52:54:00:2f:9d:e8 - - -#. PCI Passthrough: - - The VF devices may be passed through to the guest VM using ``virt-manager`` or - ``virsh``. QEDE PMD should be used to bind the VF devices in the guest VM - using the instructions outlined in the Application notes above. diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst new file mode 100644 index 00000000..5f825e9a --- /dev/null +++ b/doc/guides/nics/sfc_efx.rst @@ -0,0 +1,277 @@ +.. BSD LICENSE + Copyright (c) 2016 Solarflare Communications Inc. + All rights reserved. + + This software was jointly developed between OKTET Labs (under contract + for Solarflare) and Solarflare Communications, Inc. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions are met: + + 1. Redistributions of source code must retain the above copyright notice, + this list of conditions and the following disclaimer. + 2. Redistributions in binary form must reproduce the above copyright notice, + this list of conditions and the following disclaimer in the documentation + and/or other materials provided with the distribution. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" + AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, + THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR + CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, + EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, + PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; + OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, + WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR + OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, + EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +Solarflare libefx-based Poll Mode Driver +======================================== + +The SFC EFX PMD (**librte_pmd_sfc_efx**) provides poll mode driver support +for **Solarflare SFN7xxx and SFN8xxx** family of 10/40 Gbps adapters. +SFC EFX PMD has support for the latest Linux and FreeBSD operating systems. 
+
+More information can be found at `Solarflare Communications website
+<http://solarflare.com>`_.
+
+
+Features
+--------
+
+SFC EFX PMD has support for:
+
+- Multiple transmit and receive queues
+
+- Link state information including link status change interrupt
+
+- IPv4/IPv6 TCP/UDP transmit checksum offload
+
+- Port hardware statistics
+
+- Extended statistics (see Solarflare Server Adapter User's Guide for
+  the statistics description)
+
+- Basic flow control
+
+- MTU update
+
+- Jumbo frames up to 9K
+
+- Promiscuous mode
+
+- Allmulticast mode
+
+- TCP segmentation offload (TSO)
+
+- Multicast MAC filter
+
+- IPv4/IPv6 TCP/UDP receive checksum offload
+
+- Received packet type information
+
+- Receive side scaling (RSS)
+
+- RSS hash
+
+- Scattered Rx DMA for packets that are larger than a single Rx descriptor
+
+- Deferred receive and transmit queue start
+
+- Transmit VLAN insertion (if the running firmware variant supports it)
+
+- Flow API
+
+
+Non-supported Features
+----------------------
+
+The features not yet supported include:
+
+- Receive queue interrupts
+
+- Priority-based flow control
+
+- Loopback
+
+- Configurable RX CRC stripping (always stripped)
+
+- Header split on receive
+
+- VLAN filtering
+
+- VLAN stripping
+
+- LRO
+
+
+Limitations
+-----------
+
+Due to requirements on receive buffer alignment and usage of the receive
+buffer for auxiliary packet information provided by the NIC, up to 269
+extra bytes (a 14-byte prefix plus up to 255 bytes of end padding) may be
+required in the receive buffer.
+This should be taken into account when the mbuf pool for receive is created.
+
+
+Flow API support
+----------------
+
+Supported attributes:
+
+- Ingress
+
+Supported pattern items:
+
+- VOID
+
+- ETH (exact match of source/destination addresses, individual/group match
+  of destination address, EtherType)
+
+- VLAN (exact match of VID, double-tagging is supported)
+
+- IPV4 (exact match of source/destination addresses,
+  IP transport protocol)
+
+- IPV6 (exact match of source/destination addresses,
+  IP transport protocol)
+
+- TCP (exact match of source/destination ports)
+
+- UDP (exact match of source/destination ports)
+
+Supported actions:
+
+- VOID
+
+- QUEUE
+
+Validating flow rules depends on the firmware variant.
+Ethernet destination individual/group match
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The Ethernet item supports I/G matching if the corresponding bit is the
+only bit set in the destination address mask. If the destination address
+in the spec is multicast, it matches all multicast (and broadcast) packets,
+otherwise it matches unicast packets that are not filtered by other flow
+rules.
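+
+To illustrate the kinds of rules that the supported items and actions above
+can express, a rule steering TCP traffic with a given source IP to queue 1
+could be created from testpmd as follows (a sketch only; all addresses,
+ports and queue numbers are hypothetical):
+
+.. code-block:: console
+
+   testpmd> flow create 0 ingress pattern eth / ipv4 src is 192.168.0.1 / \
+      tcp dst is 80 / end actions queue index 1 / end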
+
+
+Supported NICs
+--------------
+
+- Solarflare Flareon [Ultra] Server Adapters:
+
+  - Solarflare SFN8522 Dual Port SFP+ Server Adapter
+
+  - Solarflare SFN8542 Dual Port QSFP+ Server Adapter
+
+  - Solarflare SFN7002F Dual Port SFP+ Server Adapter
+
+  - Solarflare SFN7004F Quad Port SFP+ Server Adapter
+
+  - Solarflare SFN7042Q Dual Port QSFP+ Server Adapter
+
+  - Solarflare SFN7122F Dual Port SFP+ Server Adapter
+
+  - Solarflare SFN7124F Quad Port SFP+ Server Adapter
+
+  - Solarflare SFN7142Q Dual Port QSFP+ Server Adapter
+
+  - Solarflare SFN7322F Precision Time Synchronization Server Adapter
+
+
+Prerequisites
+-------------
+
+- Requires firmware version:
+
+  - SFN7xxx: **4.7.1.1001** or higher
+
+  - SFN8xxx: **6.0.2.1004** or higher
+
+Visit `Solarflare Support Downloads <https://support.solarflare.com>`_ to get
+Solarflare Utilities (either Linux or FreeBSD) with the latest firmware.
+Follow instructions from Solarflare Server Adapter User's Guide to
+update firmware and configure the adapter.
+
+
+Pre-Installation Configuration
+------------------------------
+
+
+Config File Options
+~~~~~~~~~~~~~~~~~~~
+
+The following options can be modified in the ``.config`` file.
+Please note that enabling debugging options may affect system performance.
+
+- ``CONFIG_RTE_LIBRTE_SFC_EFX_PMD`` (default **y**)
+
+  Enable compilation of Solarflare libefx-based poll-mode driver.
+
+- ``CONFIG_RTE_LIBRTE_SFC_EFX_DEBUG`` (default **n**)
+
+  Enable compilation of the extra run-time consistency checks.
+
+
+Per-Device Parameters
+~~~~~~~~~~~~~~~~~~~~~
+
+The following per-device parameters can be passed via EAL PCI device
+whitelist option like "-w 02:00.0,arg1=value1,...".
+
+Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify
+boolean parameter values.
+
+- ``rx_datapath`` [auto|efx|ef10] (default **auto**)
+
+  Choose receive datapath implementation.
+  **auto** allows the driver itself to make a choice based on firmware
+  features available and required by the datapath implementation.
+  **efx** chooses libefx-based datapath which supports Rx scatter.
+  **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is
+  more efficient than libefx-based and provides richer packet type
+  classification, but lacks Rx scatter support.
+
+- ``tx_datapath`` [auto|efx|ef10|ef10_simple] (default **auto**)
+
+  Choose transmit datapath implementation.
+  **auto** allows the driver itself to make a choice based on firmware
+  features available and required by the datapath implementation.
+  **efx** chooses libefx-based datapath which supports VLAN insertion
+  (full-feature firmware variant only), TSO and multi-segment mbufs.
+  **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is
+  more efficient than libefx-based but has no VLAN insertion and TSO
+  support yet.
+  **ef10_simple** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which
+  is even faster than **ef10** but does not support multi-segment
+  mbufs.
+
+- ``perf_profile`` [auto|throughput|low-latency] (default **throughput**)
+
+  Choose hardware tuning optimized for either throughput or
+  low-latency.
+  **auto** allows NIC firmware to make a choice based on
+  installed licences and firmware variant configured using **sfboot**.
+
+- ``debug_init`` [bool] (default **n**)
+
+  Enable extra logging during device initialization and startup.
+
+- ``mcdi_logging`` [bool] (default **n**)
+
+  Enable extra logging of the communication with the NIC's management CPU.
+ The logging is done using RTE_LOG() with INFO level and PMD type. + The format is consumed by the Solarflare netlogdecode cross-platform tool. + +- ``stats_update_period_ms`` [long] (default **1000**) + + Adjust period in milliseconds to update port hardware statistics. + The accepted range is 0 to 65535. The value of **0** may be used + to disable periodic statistics update. One should note that it's + only possible to set an arbitrary value on SFN8xxx provided that + firmware version is 6.2.1.1033 or higher, otherwise any positive + value will select a fixed update period of **1000** milliseconds diff --git a/doc/guides/nics/szedata2.rst b/doc/guides/nics/szedata2.rst index 741b4008..60080a9f 100644 --- a/doc/guides/nics/szedata2.rst +++ b/doc/guides/nics/szedata2.rst @@ -31,16 +31,16 @@ SZEDATA2 poll mode driver library ================================= -The SZEDATA2 poll mode driver library implements support for cards from COMBO -family (**COMBO-80G**, **COMBO-100G**). -The SZEDATA2 PMD uses interface provided by libsze2 library to communicate -with COMBO cards over sze2 layer. +The SZEDATA2 poll mode driver library implements support for the Netcope +FPGA Boards (**NFB-***), FPGA-based programmable NICs. +The SZEDATA2 PMD uses interface provided by the libsze2 library to communicate +with the NFB cards over the sze2 layer. -More information about family of -`COMBO cards <https://www.liberouter.org/technologies/cards/>`_ +More information about the +`NFB cards <http://www.netcope.com/en/products/fpga-boards>`_ and used technology -(`NetCOPE platform <https://www.liberouter.org/technologies/netcope/>`_) can be -found on the `Liberouter website <https://www.liberouter.org/>`_. +(`Netcope Development Kit <http://www.netcope.com/en/products/fpga-development-kit>`_) +can be found on the `Netcope Technologies website <http://www.netcope.com/>`_. .. note:: @@ -77,7 +77,7 @@ separately: sharing of resources for user space applications. Information about getting the dependencies can be found `here -<https://www.liberouter.org/technologies/netcope/access-to-libsze2-library/>`_. +<http://www.netcope.com/en/company/community-support/dpdk-libsze2>`_. Configuration ------------- @@ -117,7 +117,7 @@ transmit channel: .. code-block:: console - $RTE_TARGET/app/testpmd -c 0xf -n 2 \ + $RTE_TARGET/app/testpmd -l 0-3 -n 2 \ -- --port-topology=chained --rxq=2 --txq=2 --nb-cores=2 -i -a Example output: diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst new file mode 100644 index 00000000..5c5ba535 --- /dev/null +++ b/doc/guides/nics/tap.rst @@ -0,0 +1,197 @@ +.. BSD LICENSE + Copyright(c) 2016 Intel Corporation. All rights reserved. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Intel Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. 
+
+ THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+Tun/Tap Poll Mode Driver
+========================
+
+The ``rte_eth_tap.c`` PMD creates a device using TUN/TAP interfaces on the
+local host. The PMD allows for DPDK and the host to communicate using a raw
+device interface on the host and in the DPDK application.
+
+The device created is a TAP device, which sends/receives packets in a raw
+format with an L2 header. The usage for a TAP PMD is for connectivity to the
+local host using a TAP interface. When the TAP PMD is initialized it will
+create a number of tap devices on the host, visible via the ``ifconfig -a``
+or ``ip`` commands. These commands can be used to assign addresses to and
+query the virtual TAP devices.
+
+These TAP interfaces can be used with Wireshark or tcpdump or Pktgen-DPDK
+along with being able to be used as a network connection to the DPDK
+application. The method to enable one or more interfaces is to use the
+``--vdev=net_tap0`` option on the DPDK application command line. Each
+additional ``--vdev=net_tap1`` option given will create an interface named
+dtap0, dtap1, and so on.
+
+The interface name can be changed by adding ``iface=foo0``, for example::
+
+   --vdev=net_tap0,iface=foo0 --vdev=net_tap1,iface=foo1, ...
+
+The reported speed of the interface can also be changed from the default
+10G to any desired value, although the interface does not enforce that
+speed, for example::
+
+   --vdev=net_tap0,iface=foo0,speed=25000
+
+It is possible to specify a remote netdevice to capture packets from by adding
+``remote=foo1``, for example::
+
+   --vdev=net_tap,iface=tap0,remote=foo1
+
+If a ``remote`` is set, the tap MAC address will be set to match the remote one
+just after netdevice creation. Using TC rules, traffic from the remote netdevice
+will be redirected to the tap. If the tap is in promiscuous mode, then all
+packets will be redirected. In allmulti mode, all multicast packets will be
+redirected.
+
+Using the remote feature is especially useful for capturing traffic from a
+netdevice that is not supported by DPDK. It is possible to add explicit
+rte_flow rules on the tap PMD to capture specific traffic (see next section for
+examples). A combined invocation using these options is sketched below.
+
+After the DPDK application is started you can send and receive packets on the
+interface using the standard rx_burst/tx_burst APIs in DPDK. From the host
+point of view you can use any host tool like tcpdump, Wireshark, ping, Pktgen
+and others to communicate with the DPDK application. The DPDK application may
+not understand network protocols like IPv4/6, UDP or TCP unless the
+application has been written to understand these protocols.
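+
+Tying the options above together, a hedged sketch of a testpmd invocation
+with two TAP ports, one with a custom name and advertised speed and one
+mirroring a hypothetical kernel netdevice ``eth1``, could look like::
+
+   testpmd -l 0-3 -n 4 --vdev=net_tap0,iface=foo0,speed=25000 \
+       --vdev=net_tap1,iface=foo1,remote=eth1 -- -i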
+
+If you need the interface to behave as a real network interface, that is,
+up and with a valid IP address, you can do this with the following commands::
+
+   sudo ip link set dtap0 up; sudo ip addr add 192.168.0.250/24 dev dtap0
+   sudo ip link set dtap1 up; sudo ip addr add 192.168.1.250/24 dev dtap1
+
+Please change the IP addresses as you see fit.
+
+If routing is enabled on the host you can also communicate with the DPDK
+application over the internet via a standard socket layer application as
+long as you account for the protocol handling in the application.
+
+If you have a network stack in your DPDK application or something like it you
+can utilize that stack to handle the network protocols. Plus you would be able
+to address the interface using an IP address assigned to the internal
+interface.
+
+Flow API support
+----------------
+
+The tap PMD supports major flow API pattern items and actions, when running on
+Linux kernels above 4.2 ("Flower" classifier required). Supported items:
+
+- eth: src and dst (with variable masks), and eth_type (0xffff mask).
+- vlan: vid, pcp, tpid, but not eid. (requires kernel 4.9)
+- ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask).
+- udp/tcp: src and dst port (0xffff) mask.
+
+Supported actions:
+
+- DROP
+- QUEUE
+- PASSTHRU
+
+It is generally not possible to provide a "last" item. However, if the "last"
+item, once masked, is identical to the masked spec, then it is supported.
+
+Only IPv4/6 and MAC addresses can use a variable mask. All other items need a
+full mask (exact match).
+
+As rules are translated to TC, it is possible to show them with something like::
+
+   tc -s filter show dev tap1 parent 1:
+
+Examples of testpmd flow rules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Drop packets for destination IP 192.168.0.1::
+
+   testpmd> flow create 0 priority 1 ingress pattern eth / ipv4 dst is 192.168.0.1 \
+            / end actions drop / end
+
+Ensure packets from a given MAC address are received on queue 2::
+
+   testpmd> flow create 0 priority 2 ingress pattern eth src is 06:05:04:03:02:01 \
+            / end actions queue index 2 / end
+
+Drop UDP packets in vlan 3::
+
+   testpmd> flow create 0 priority 3 ingress pattern eth / vlan vid is 3 / \
+            ipv4 proto is 17 / end actions drop / end
+
+Example
+-------
+
+The following is a simple example of using the TUN/TAP PMD with the Pktgen
+packet generator. It requires that the ``socat`` utility is installed on the
+test system.
+
+Build DPDK, then pull down Pktgen and build it using the same DPDK
+SDK/target that was used to build DPDK.
+
+Run pktgen from the pktgen directory in a terminal with a commandline like the
+following::
+
+   sudo ./app/app/x86_64-native-linuxapp-gcc/app/pktgen -l 1-5 -n 4 \
+    --proc-type auto --log-level 8 --socket-mem 512,512 --file-prefix pg \
+    --vdev=net_tap0 --vdev=net_tap1 -b 05:00.0 -b 05:00.1 \
+    -b 04:00.0 -b 04:00.1 -b 04:00.2 -b 04:00.3 \
+    -b 81:00.0 -b 81:00.1 -b 81:00.2 -b 81:00.3 \
+    -b 82:00.0 -b 83:00.0 -- -T -P -m [2:3].0 -m [4:5].1 \
+    -f themes/black-yellow.theme
+
+.. note::
+
+   Change the ``-b`` options to blacklist all of your physical ports. The
+   preceding command line is written as one logical line.
+
+   Also, ``-f themes/black-yellow.theme`` is optional if the default colors
+   work on your system configuration. See the Pktgen docs for more
+   information.
+
+Verify with the ``ifconfig -a`` command in a different xterm window; you
+should see the ``dtap0`` and ``dtap1`` interfaces created.
+ +Next set the links for the two interfaces to up via the commands below:: + + sudo ip link set dtap0 up; sudo ip addr add 192.168.0.250/24 dev dtap0 + sudo ip link set dtap1 up; sudo ip addr add 192.168.1.250/24 dev dtap1 + +Then use socat to create a loopback for the two interfaces:: + + sudo socat interface:dtap0 interface:dtap1 + +Then on the Pktgen command line interface you can start sending packets using +the commands ``start 0`` and ``start 1`` or you can start both at the same +time with ``start all``. The command ``str`` is an alias for ``start all`` and +``stp`` is an alias for ``stop all``. + +While running you should see the 64 byte counters increasing to verify the +traffic is being looped back. You can use ``set all size XXX`` to change the +size of the packets after you stop the traffic. Use pktgen ``help`` +command to see a list of all commands. You can also use the ``-f`` option to +load commands at startup in command line or Lua script in pktgen. diff --git a/doc/guides/nics/thunderx.rst b/doc/guides/nics/thunderx.rst index 187c9a4a..4fa0039d 100644 --- a/doc/guides/nics/thunderx.rst +++ b/doc/guides/nics/thunderx.rst @@ -77,9 +77,8 @@ Config File Options The following options can be modified in the ``config`` file. Please note that enabling debugging options may affect system performance. -- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD`` (default ``n``) +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD`` (default ``y``) - By default it is enabled only for defconfig_arm64-thunderx-* config. Toggle compilation of the ``librte_pmd_thunderx_nicvf`` driver. - ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_INIT`` (default ``n``) @@ -102,95 +101,18 @@ Please note that enabling debugging options may affect system performance. Toggle display of PF mailbox related run-time check messages -Driver Compilation -~~~~~~~~~~~~~~~~~~ - -To compile the ThunderX NICVF PMD for Linux arm64 gcc target, run the -following “make” command: +Driver compilation and testing +------------------------------ -.. code-block:: console +Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` +for details. - cd <DPDK-source-directory> - make config T=arm64-thunderx-linuxapp-gcc install +To compile the ThunderX NICVF PMD for Linux arm64 gcc, +use arm64-thunderx-linuxapp-gcc as target. Linux ----- -.. _thunderx_testpmd_example: - -Running testpmd -~~~~~~~~~~~~~~~ - -This section demonstrates how to launch ``testpmd`` with ThunderX NIC VF device -managed by ``librte_pmd_thunderx_nicvf`` in the Linux operating system. - -#. Load ``vfio-pci`` driver: - - .. code-block:: console - - modprobe vfio-pci - - .. _thunderx_vfio_noiommu: - -#. Enable **VFIO-NOIOMMU** mode (optional): - - .. code-block:: console - - echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode - - .. note:: - - **VFIO-NOIOMMU** is required only when running in VM context and should not be enabled otherwise. - See also :ref:`SR-IOV: Prerequisites and sample Application Notes <thunderx_sriov_example>`. - -#. Bind the ThunderX NIC VF device to ``vfio-pci`` loaded in the previous step: - - Setup VFIO permissions for regular users and then bind to ``vfio-pci``: - - .. code-block:: console - - ./tools/dpdk-devbind.py --bind vfio-pci 0002:01:00.2 - -#. Start ``testpmd`` with basic parameters: - - .. code-block:: console - - ./arm64-thunderx-linuxapp-gcc/app/testpmd -c 0xf -n 4 -w 0002:01:00.2 \ - -- -i --disable-hw-vlan-filter --crc-strip --no-flush-rx \ - --port-topology=loop - - Example output: - - .. 
code-block:: console - - ... - - PMD: rte_nicvf_pmd_init(): librte_pmd_thunderx nicvf version 1.0 - - ... - EAL: probe driver: 177d:11 rte_nicvf_pmd - EAL: using IOMMU type 1 (Type 1) - EAL: PCI memory mapped at 0x3ffade50000 - EAL: Trying to map BAR 4 that contains the MSI-X table. - Trying offsets: 0x40000000000:0x0000, 0x10000:0x1f0000 - EAL: PCI memory mapped at 0x3ffadc60000 - PMD: nicvf_eth_dev_init(): nicvf: device (177d:11) 2:1:0:2 - PMD: nicvf_eth_dev_init(): node=0 vf=1 mode=tns-bypass sqs=false - loopback_supported=true - PMD: nicvf_eth_dev_init(): Port 0 (177d:11) mac=a6:c6:d9:17:78:01 - Interactive-mode selected - Configuring Port 0 (socket 0) - ... - - PMD: nicvf_dev_configure(): Configured ethdev port0 hwcap=0x0 - Port 0: A6:C6:D9:17:78:01 - Checking link statuses... - Port 0 Link Up - speed 10000 Mbps - full-duplex - Done - testpmd> - -.. _thunderx_sriov_example: - SR-IOV: Prerequisites and sample Application Notes ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -249,58 +171,10 @@ This section provides instructions to configure SR-IOV with Linux OS. Unless ``thunder-nicvf`` driver is in use make sure your kernel config includes ``CONFIG_THUNDER_NIC_VF`` setting. -#. Verify PF/VF bind using ``dpdk-devbind.py``: - - .. code-block:: console - - ./tools/dpdk-devbind.py --status - - Example output: - - .. code-block:: console - - ... - 0002:01:00.0 'Device a01e' if= drv=thunder-nic unused=vfio-pci - 0002:01:00.1 'Device 0011' if=eth0 drv=thunder-nicvf unused=vfio-pci - 0002:01:00.2 'Device 0011' if=eth1 drv=thunder-nicvf unused=vfio-pci - ... - -#. Load ``vfio-pci`` driver: - - .. code-block:: console - - modprobe vfio-pci - -#. Bind VF devices to ``vfio-pci`` using ``dpdk-devbind.py``: - - .. code-block:: console - - ./tools/dpdk-devbind.py --bind vfio-pci 0002:01:00.1 - ./tools/dpdk-devbind.py --bind vfio-pci 0002:01:00.2 - -#. Verify VF bind using ``dpdk-devbind.py``: - - .. code-block:: console - - ./tools/dpdk-devbind.py --status - - Example output: - - .. code-block:: console - - ... - 0002:01:00.1 'Device 0011' drv=vfio-pci unused= - 0002:01:00.2 'Device 0011' drv=vfio-pci unused= - ... - 0002:01:00.0 'Device a01e' if= drv=thunder-nic unused=vfio-pci - ... - #. Pass VF device to VM context (PCIe Passthrough): The VF devices may be passed through to the guest VM using qemu or virt-manager or virsh etc. - ``librte_pmd_thunderx_nicvf`` or ``thunder-nicvf`` should be used to bind - the VF devices in the guest VM in :ref:`VFIO-NOIOMMU <thunderx_vfio_noiommu>` mode. Example qemu guest launch command: @@ -321,8 +195,55 @@ This section provides instructions to configure SR-IOV with Linux OS. -serial stdio \ -mem-path /dev/huge -#. Refer to section :ref:`Running testpmd <thunderx_testpmd_example>` for instruction - how to launch ``testpmd`` application. +#. Enable **VFIO-NOIOMMU** mode (optional): + + .. code-block:: console + + echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode + + .. note:: + + **VFIO-NOIOMMU** is required only when running in VM context and should not be enabled otherwise. + +#. Running testpmd: + + Follow instructions available in the document + :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` + to run testpmd. + + Example output: + + .. code-block:: console + + ./arm64-thunderx-linuxapp-gcc/app/testpmd -l 0-3 -n 4 -w 0002:01:00.2 \ + -- -i --disable-hw-vlan-filter --disable-crc-strip --no-flush-rx \ + --port-topology=loop + + ... + + PMD: rte_nicvf_pmd_init(): librte_pmd_thunderx nicvf version 1.0 + + ... 
+ EAL: probe driver: 177d:11 rte_nicvf_pmd
+ EAL: using IOMMU type 1 (Type 1)
+ EAL: PCI memory mapped at 0x3ffade50000
+ EAL: Trying to map BAR 4 that contains the MSI-X table.
+ Trying offsets: 0x40000000000:0x0000, 0x10000:0x1f0000
+ EAL: PCI memory mapped at 0x3ffadc60000
+ PMD: nicvf_eth_dev_init(): nicvf: device (177d:11) 2:1:0:2
+ PMD: nicvf_eth_dev_init(): node=0 vf=1 mode=tns-bypass sqs=false
+ loopback_supported=true
+ PMD: nicvf_eth_dev_init(): Port 0 (177d:11) mac=a6:c6:d9:17:78:01
+ Interactive-mode selected
+ Configuring Port 0 (socket 0)
+ ...
+
+ PMD: nicvf_dev_configure(): Configured ethdev port0 hwcap=0x0
+ Port 0: A6:C6:D9:17:78:01
+ Checking link statuses...
+ Port 0 Link Up - speed 10000 Mbps - full-duplex
+ Done
+ testpmd>
 Multiple Queue Set per DPDK port configuration
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -352,7 +273,7 @@ driver' list, secondary VFs are on the remaining part of the li
 .. note::
 The VNIC driver in the multiqueue setup works differently than other drivers like `ixgbe`.
- We need to bind separately each specific queue set device with the ``tools/dpdk-devbind.py`` utility.
+ We need to bind separately each specific queue set device with the ``usertools/dpdk-devbind.py`` utility.
 .. note::
@@ -372,7 +293,7 @@ on a non-NUMA machine.
 .. code-block:: console
- # tools/dpdk-devbind.py --status
+ # usertools/dpdk-devbind.py --status
 Network devices using DPDK-compatible driver
 ============================================
@@ -416,17 +337,17 @@ We will choose four secondary queue sets from the ending of the list (0002:01:01
 .. code-block:: console
- tools/dpdk-devbind.py -b vfio-pci 0002:01:00.2
- tools/dpdk-devbind.py -b vfio-pci 0002:01:00.3
+ usertools/dpdk-devbind.py -b vfio-pci 0002:01:00.2
+ usertools/dpdk-devbind.py -b vfio-pci 0002:01:00.3
 #. Bind four primary VFs to the ``vfio-pci`` driver:
 .. code-block:: console
- tools/dpdk-devbind.py -b vfio-pci 0002:01:01.7
- tools/dpdk-devbind.py -b vfio-pci 0002:01:02.0
- tools/dpdk-devbind.py -b vfio-pci 0002:01:02.1
- tools/dpdk-devbind.py -b vfio-pci 0002:01:02.2
+ usertools/dpdk-devbind.py -b vfio-pci 0002:01:01.7
+ usertools/dpdk-devbind.py -b vfio-pci 0002:01:02.0
+ usertools/dpdk-devbind.py -b vfio-pci 0002:01:02.1
+ usertools/dpdk-devbind.py -b vfio-pci 0002:01:02.2
 The nicvf thunderx driver will make use of attached secondary VFs automatically during the interface configuration stage.
diff --git a/doc/guides/nics/vhost.rst b/doc/guides/nics/vhost.rst
index 6b30b54e..e651a166 100644
--- a/doc/guides/nics/vhost.rst
+++ b/doc/guides/nics/vhost.rst
@@ -92,7 +92,7 @@ This section demonstrates vhost PMD with testpmd DPDK sample application.
 .. code-block:: console
- ./testpmd -c f -n 4 --vdev 'net_vhost0,iface=/tmp/sock0,queues=1' -- -i
+ ./testpmd -l 0-3 -n 4 --vdev 'net_vhost0,iface=/tmp/sock0,queues=1' -- -i
 Other basic DPDK preparations like hugepage enabling here. Please refer to the
 *DPDK Getting Started Guide* for detailed instructions.
diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst
index 54310157..91bedea6 100644
--- a/doc/guides/nics/virtio.rst
+++ b/doc/guides/nics/virtio.rst
@@ -87,6 +87,8 @@ In this release, the virtio PMD driver provides the basic functionality of packe
 * Virtio supports Link State interrupt.
+* Virtio supports Rx interrupt (so far, only a 1:1 queue/interrupt mapping is supported).
+
 * Virtio supports software vlan stripping and inserting.
 * Virtio supports using port IO to get PCI resource when uio/igb_uio module is not available.
@@ -128,7 +130,7 @@ Host2VM communication example
 .. code-block:: console
- examples/kni/build/app/kni -c 0xf -n 4 -- -p 0x1 -P --config="(0,1,3)"
+ examples/kni/build/app/kni -l 0-3 -n 4 -- -p 0x1 -P --config="(0,1,3)"
 This command generates one network device vEth0 for the physical port.
 If more physical ports are specified, the generated network devices will be
 vEth1, vEth2, and so on.
@@ -172,7 +174,7 @@ Host2VM communication example
 modprobe uio
 echo 512 > /sys/devices/system/node/node0/hugepages/hugepages-2048kB/nr_hugepages
 modprobe uio_pci_generic
- python tools/dpdk-devbind.py -b uio_pci_generic 00:03.0
+ python usertools/dpdk-devbind.py -b uio_pci_generic 00:03.0
 We use testpmd as the forwarding application in this example.
@@ -273,4 +275,62 @@ The corresponding callbacks are:
 Example of using the vector version of the virtio poll mode driver in
 ``testpmd``::
- testpmd -c 0x7 -n 4 -- -i --txqflags=0xF01 --rxq=1 --txq=1 --nb-cores=1
+ testpmd -l 0-2 -n 4 -- -i --txqflags=0xF01 --rxq=1 --txq=1 --nb-cores=1
+
+
+Interrupt mode
+--------------
+
+.. _virtio_interrupt_mode:
+
+There are three kinds of interrupts from a virtio device over PCI bus: config
+interrupt, Rx interrupts, and Tx interrupts. Config interrupt is used for
+notification of device configuration changes, especially link status (lsc).
+Interrupt mode is translated into Rx interrupts in the context of DPDK.
+
+Prerequisites for Rx interrupts
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To support Rx interrupts, complete the following steps:
+
+#. Check if the guest kernel supports VFIO-NOIOMMU:
+
+   Linux has supported VFIO-NOIOMMU since 4.8.0. Make sure the guest
+   kernel is compiled with:
+
+   .. code-block:: console
+
+      CONFIG_VFIO_NOIOMMU=y
+
+#. Properly set msix vectors when starting the VM:
+
+   Enable multi-queue when starting the VM, and specify msix vectors in the
+   qemu cmdline, where N is the number of queue pairs: (N+1) is the minimum,
+   and (2N+2) is recommended.
+
+   .. code-block:: console
+
+      $(QEMU) ... -device virtio-net-pci,mq=on,vectors=2N+2 ...
+
+#. In the VM, insert the vfio module in NOIOMMU mode:
+
+   .. code-block:: console
+
+      modprobe vfio enable_unsafe_noiommu_mode=1
+      modprobe vfio-pci
+
+#. In the VM, bind the virtio device with vfio-pci:
+
+   .. code-block:: console
+
+      python usertools/dpdk-devbind.py -b vfio-pci 00:03.0
+
+Example
+~~~~~~~
+
+Here we use l3fwd-power as an example to show how to get started.
+
+   Example:
+
+   .. code-block:: console
+
+      $ l3fwd-power -l 0-1 -- -p 1 -P --config="(0,0,1)" \
+        --no-numa --parse-ptype
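+
+For completeness, a concrete instance of the (2N+2) vectors formula from the
+prerequisites above: assuming a guest with 2 queue pairs (N=2), 2N+2 gives 6,
+so a hypothetical qemu device option would be:
+
+.. code-block:: console
+
+   $(QEMU) ... -device virtio-net-pci,mq=on,vectors=6 ...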