Diffstat (limited to 'doc/guides/nics')
61 files changed, 2533 insertions, 826 deletions
diff --git a/doc/guides/nics/axgbe.rst b/doc/guides/nics/axgbe.rst
new file mode 100644
index 00000000..e30f4944
--- /dev/null
+++ b/doc/guides/nics/axgbe.rst
@@ -0,0 +1,89 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+    Copyright (c) 2018 Advanced Micro Devices, Inc. All rights reserved.
+
+AXGBE Poll Mode Driver
+======================
+
+The AXGBE poll mode driver library (**librte_pmd_axgbe**) implements support
+for AMD 10 Gbps family of adapters. It is compiled and tested in standard linux distro like Ubuntu.
+
+Detailed information about SoCs that use these devices can be found here:
+
+- `AMD EPYC™ EMBEDDED 3000 family <https://www.amd.com/en/products/embedded-epyc-3000-series>`_.
+
+
+Supported Features
+------------------
+
+AXGBE PMD has support for:
+
+- Base L2 features
+- TSS (Transmit Side Scaling)
+- Promiscuous mode
+- Port statistics
+- Multicast mode
+- RSS (Receive Side Scaling)
+- Checksum offload
+- Jumbo Frame upto 9K
+
+
+Configuration Information
+-------------------------
+
+The following options can be modified in the ``.config`` file. Please note that
+enabling debugging options may affect system performance.
+
+- ``CONFIG_RTE_LIBRTE_AXGBE_PMD`` (default **y**)
+
+  Toggle compilation of axgbe PMD.
+
+- ``CONFIG_RTE_LIBRTE_AXGBE_PMD_DEBUG`` (default **n**)
+
+  Toggle display for PMD debug related messages.
+
+
+Building DPDK
+-------------
+
+See the :ref:`DPDK Getting Started Guide for Linux <linux_gsg>` for
+instructions on how to build DPDK.
+
+By default the AXGBE PMD library will be built into the DPDK library.
+
+For configuring and using UIO frameworks, please also refer :ref:`the
+documentation that comes with DPDK suite <linux_gsg>`.
+
+
+Prerequisites and Pre-conditions
+--------------------------------
+- Prepare the system as recommended by DPDK suite.
+
+- Bind the intended AMD device to ``igb_uio`` or ``vfio-pci`` module.
+
+Now system is ready to run DPDK application.
+
+
+Usage Example
+-------------
+
+Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
+for details.
+
+Example output:
+
+.. code-block:: console
+
+   [...]
+   EAL: PCI device 0000:02:00.4 on NUMA socket 0
+   EAL: probe driver: 1022:1458 net_axgbe
+   Interactive-mode selected
+   USER1: create a new mbuf pool <mbuf_pool_socket_0>: n=171456, size=2176, socket=0
+   USER1: create a new mbuf pool <mbuf_pool_socket_1>: n=171456, size=2176, socket=1
+   USER1: create a new mbuf pool <mbuf_pool_socket_2>: n=171456, size=2176, socket=2
+   USER1: create a new mbuf pool <mbuf_pool_socket_3>: n=171456, size=2176, socket=3
+   Configuring Port 0 (socket 0)
+   Port 0: 00:00:1A:1C:6A:17
+   Checking link statuses...
+   Port 0 Link Up - speed 10000 Mbps - full-duplex
+   Done
+   testpmd>
diff --git a/doc/guides/nics/bnx2x.rst b/doc/guides/nics/bnx2x.rst
index 31f146a0..cecbfc2e 100644
--- a/doc/guides/nics/bnx2x.rst
+++ b/doc/guides/nics/bnx2x.rst
@@ -194,6 +194,7 @@ This section provides instructions to configure SR-IOV with Linux OS.
    using the instructions outlined in the Application notes below.

 #. Running testpmd:
+   (Supply ``--log-level="pmd.net.bnx2x.driver",7`` to view informational messages):

    Follow instructions available in the document
    :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
diff --git a/doc/guides/nics/bnxt.rst b/doc/guides/nics/bnxt.rst
index 9826b350..697b97e6 100644
--- a/doc/guides/nics/bnxt.rst
+++ b/doc/guides/nics/bnxt.rst
@@ -1,46 +1,20 @@
-.. BSD LICENSE
-    Copyright 2016 Broadcom Limited
-
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions
-    are met:
-
-    * Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-    * Redistributions in binary form must reproduce the above copyright
-    notice, this list of conditions and the following disclaimer in
-    the documentation and/or other materials provided with the
-    distribution.
-    * Neither the name of Broadcom Limited nor the names of its
-    contributors may be used to endorse or promote products derived
-    from this software without specific prior written permission.
-
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.. SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2016-2018 Broadcom

 BNXT Poll Mode Driver
 =====================

 The bnxt poll mode library (**librte_pmd_bnxt**) implements support for:

-  * **Broadcom NetXtreme-C®/NetXtreme-E® BCM5730X and BCM574XX family of
-    Ethernet Network Controllers**
+  * **Broadcom NetXtreme-C®/NetXtreme-E®/NetXtreme-S®
+    BCM5730X / BCM574XX / BCM58000 family of Ethernet Network Controllers**

     These adapters support Standards compliant 10/25/50/100Gbps 30MPPS
     full-duplex throughput.

     Information about the NetXtreme family of adapters can be found in the
     `NetXtreme® Brand section
-    <https://www.broadcom.com/products/ethernet-communication-and-switching?technology%5B%5D=88>`_
+    <https://www.broadcom.com/products/ethernet-connectivity/controllers/>`_
     of the `Broadcom website <http://www.broadcom.com/>`_.

   * **Broadcom StrataGX® BCM5871X Series of Communucations Processors**
diff --git a/doc/guides/nics/cxgbe.rst b/doc/guides/nics/cxgbe.rst
index 8651a7be..58d88eef 100644
--- a/doc/guides/nics/cxgbe.rst
+++ b/doc/guides/nics/cxgbe.rst
@@ -1,32 +1,6 @@
-.. BSD LICENSE
-    Copyright 2015-2017 Chelsio Communications.
-    All rights reserved.
-
-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions
-    are met:
-
-    * Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-    * Redistributions in binary form must reproduce the above copyright
-    notice, this list of conditions and the following disclaimer in
-    the documentation and/or other materials provided with the
-    distribution.
-    * Neither the name of Chelsio Communications nor the names of its
-    contributors may be used to endorse or promote products derived
-    from this software without specific prior written permission.
-
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+.. SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2014-2018 Chelsio Communications.
+    All rights reserved.

 CXGBE Poll Mode Driver
 ======================
@@ -35,22 +9,28 @@ The CXGBE PMD (**librte_pmd_cxgbe**) provides poll mode driver support
 for **Chelsio Terminator** 10/25/40/100 Gbps family of adapters. CXGBE PMD
 has support for the latest Linux and FreeBSD operating systems.

+CXGBEVF PMD provides poll mode driver support for SR-IOV Virtual functions
+and has support for the latest Linux operating systems.
+
 More information can be found at `Chelsio Communications Official Website
 <http://www.chelsio.com>`_.

 Features
 --------

-CXGBE PMD has support for:
+CXGBE and CXGBEVF PMD has support for:

 - Multiple queues for TX and RX
 - Receiver Side Steering (RSS)
+  Receiver Side Steering (RSS) on IPv4, IPv6, IPv4-TCP/UDP, IPv6-TCP/UDP.
+  For 4-tuple, enabling 'RSS on TCP' and 'RSS on TCP + UDP' is supported.
 - VLAN filtering
 - Checksum offload
 - Promiscuous mode
 - All multicast mode
 - Port hardware statistics
 - Jumbo frames
+- Flow API - Support for both Wildcard (LE-TCAM) and Exact (HASH) match filters.

 Limitations
 -----------
@@ -63,6 +43,8 @@ port.

 For this reason, one cannot whitelist/blacklist a single port without
 whitelisting/blacklisting the other ports on the same device.

+.. _t5-nics:
+
 Supported Chelsio T5 NICs
 -------------------------
@@ -71,16 +53,24 @@ Supported Chelsio T5 NICs
 - 40G NICs: T580-CR, T580-LP-CR, T580-SO-CR
 - Other T5 NICs: T522-CR

+.. _t6-nics:
+
 Supported Chelsio T6 NICs
 -------------------------

 - 25G NICs: T6425-CR, T6225-CR, T6225-LL-CR, T6225-SO-CR
 - 100G NICs: T62100-CR, T62100-LP-CR, T62100-SO-CR

+Supported SR-IOV Chelsio NICs
+-----------------------------
+
+SR-IOV virtual functions are supported on all the Chelsio NICs listed
+in :ref:`t5-nics` and :ref:`t6-nics`.
+
 Prerequisites
 -------------

-- Requires firmware version **1.16.43.0** and higher. Visit
+- Requires firmware version **1.17.14.0** and higher. Visit
   `Chelsio Download Center <http://service.chelsio.com>`_ to get latest firmware
   bundled with the latest Chelsio Unified Wire package.

@@ -110,6 +100,10 @@ enabling debugging options may affect system performance.

   Toggle compilation of librte_pmd_cxgbe driver.

+  .. note::
+
+     This controls compilation of both CXGBE and CXGBEVF PMD.
+
 - ``CONFIG_RTE_LIBRTE_CXGBE_DEBUG`` (default **n**)

   Toggle display of generic debugging messages.
@@ -134,6 +128,28 @@ enabling debugging options may affect system performance.

   Toggle behaviour to prefer Throughput or Latency.

+Runtime Options
+~~~~~~~~~~~~~~~
+
+The following ``devargs`` options can be enabled at runtime. They must
+be passed as part of EAL arguments. For example,
+
+.. code-block:: console
+
+   testpmd -w 02:00.4,keep_ovlan=1 -- -i
+
+- ``keep_ovlan`` (default **0**)
+
+  Toggle behaviour to keep/strip outer VLAN in Q-in-Q packets. If
+  enabled, the outer VLAN tag is preserved in Q-in-Q packets. Otherwise,
+  the outer VLAN tag is stripped in Q-in-Q packets.
+
+- ``force_link_up`` (default **0**)
+
+  When set to 1, CXGBEVF PMD always forces link as up for all VFs on
+  underlying Chelsio NICs. This enables multiple VFs on the same NIC
+  to send traffic to each other even when the physical link is down.
+
 .. _driver-compilation:

 Driver compilation and testing
@@ -208,7 +224,7 @@ Unified Wire package for Linux operating system are as follows:

    .. code-block:: console

-      firmware-version: 1.16.43.0, TP 0.1.4.9
+      firmware-version: 1.17.14.0, TP 0.1.4.9

 Running testpmd
 ~~~~~~~~~~~~~~~
@@ -266,7 +282,7 @@ devices managed by librte_pmd_cxgbe in Linux operating system.
       EAL: PCI memory mapped at 0x7fd7c0200000
       EAL: PCI memory mapped at 0x7fd77cdfd000
       EAL: PCI memory mapped at 0x7fd7c10b7000
-      PMD: rte_cxgbe_pmd: fw: 1.16.43.0, TP: 0.1.4.9
+      PMD: rte_cxgbe_pmd: fw: 1.17.14.0, TP: 0.1.4.9
       PMD: rte_cxgbe_pmd: Coming up as MASTER: Initializing adapter
       Interactive-mode selected
       Configuring Port 0 (socket 0)
@@ -286,6 +302,114 @@ devices managed by librte_pmd_cxgbe in Linux operating system.
    Flow control pause TX/RX is disabled by default and can be enabled via
    testpmd. Refer section :ref:`flow-control` for more details.

+Configuring SR-IOV Virtual Functions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+This section demonstrates how to enable SR-IOV virtual functions
+on Chelsio NICs and demonstrates how to run testpmd with SR-IOV
+virtual functions.
+
+#. Load the kernel module:
+
+   .. code-block:: console
+
+      modprobe cxgb4
+
+#. Get the PCI bus addresses of the interfaces bound to cxgb4 driver:
+
+   .. code-block:: console
+
+      dmesg | tail -2
+
+   Example output:
+
+   .. code-block:: console
+
+      cxgb4 0000:02:00.4 p1p1: renamed from eth0
+      cxgb4 0000:02:00.4 p1p2: renamed from eth1
+
+   .. note::
+
+      Both the interfaces of a Chelsio 2-port adapter are bound to the
+      same PCI bus address.
+
+#. Use ifconfig to get the interface name assigned to Chelsio card:
+
+   .. code-block:: console
+
+      ifconfig -a | grep "00:07:43"
+
+   Example output:
+
+   .. code-block:: console
+
+      p1p1 Link encap:Ethernet HWaddr 00:07:43:2D:EA:C0
+      p1p2 Link encap:Ethernet HWaddr 00:07:43:2D:EA:C8
+
+#. Bring up the interfaces:
+
+   .. code-block:: console
+
+      ifconfig p1p1 up
+      ifconfig p1p2 up
+
+#. Instantiate SR-IOV Virtual Functions. PF0..3 can be used for
+   SR-IOV VFs. Multiple VFs can be instantiated on each of PF0..3.
+   To instantiate one SR-IOV VF on each PF0 and PF1:
+
+   .. code-block:: console
+
+      echo 1 > /sys/bus/pci/devices/0000\:02\:00.0/sriov_numvfs
+      echo 1 > /sys/bus/pci/devices/0000\:02\:00.1/sriov_numvfs
+
+#. Get the PCI bus addresses of the virtual functions:
+
+   .. code-block:: console
+
+      lspci | grep -i "Chelsio" | grep -i "VF"
+
+   Example output:
+
+   .. code-block:: console
+
+      02:01.0 Ethernet controller: Chelsio Communications Inc T540-CR Unified Wire Ethernet Controller [VF]
+      02:01.1 Ethernet controller: Chelsio Communications Inc T540-CR Unified Wire Ethernet Controller [VF]
+
+#. Running testpmd
+
+   Follow instructions available in the document
+   :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>`
+   to bind virtual functions and run testpmd.
+
+   Example output:
+
+   .. code-block:: console
+
+      [...]
+      EAL: PCI device 0000:02:01.0 on NUMA socket 0
+      EAL: probe driver: 1425:5803 net_cxgbevf
+      PMD: rte_cxgbe_pmd: Firmware version: 1.17.14.0
+      PMD: rte_cxgbe_pmd: TP Microcode version: 0.1.4.9
+      PMD: rte_cxgbe_pmd: Chelsio rev 0
+      PMD: rte_cxgbe_pmd: No bootstrap loaded
+      PMD: rte_cxgbe_pmd: No Expansion ROM loaded
+      PMD: rte_cxgbe_pmd: 0000:02:01.0 Chelsio rev 0 1G/10GBASE-SFP
+      EAL: PCI device 0000:02:01.1 on NUMA socket 0
+      EAL: probe driver: 1425:5803 net_cxgbevf
+      PMD: rte_cxgbe_pmd: Firmware version: 1.17.14.0
+      PMD: rte_cxgbe_pmd: TP Microcode version: 0.1.4.9
+      PMD: rte_cxgbe_pmd: Chelsio rev 0
+      PMD: rte_cxgbe_pmd: No bootstrap loaded
+      PMD: rte_cxgbe_pmd: No Expansion ROM loaded
+      PMD: rte_cxgbe_pmd: 0000:02:01.1 Chelsio rev 0 1G/10GBASE-SFP
+      Configuring Port 0 (socket 0)
+      Port 0: 06:44:29:44:40:00
+      Configuring Port 1 (socket 0)
+      Port 1: 06:44:29:44:40:10
+      Checking link statuses...
+      Done
+      testpmd>
+
 FreeBSD
 -------

@@ -350,7 +474,7 @@ Unified Wire package for FreeBSD operating system are as follows:

    .. code-block:: console

-      dev.t5nex.0.firmware_version: 1.16.43.0
+      dev.t5nex.0.firmware_version: 1.17.14.0

 Running testpmd
 ~~~~~~~~~~~~~~~
@@ -468,7 +592,7 @@ devices managed by librte_pmd_cxgbe in FreeBSD operating system.
       EAL: PCI memory mapped at 0x8007ec000
       EAL: PCI memory mapped at 0x842800000
       EAL: PCI memory mapped at 0x80086c000
-      PMD: rte_cxgbe_pmd: fw: 1.16.43.0, TP: 0.1.4.9
+      PMD: rte_cxgbe_pmd: fw: 1.17.14.0, TP: 0.1.4.9
       PMD: rte_cxgbe_pmd: Coming up as MASTER: Initializing adapter
       Interactive-mode selected
       Configuring Port 0 (socket 0)
diff --git a/doc/guides/nics/dpaa.rst b/doc/guides/nics/dpaa.rst
index 0a13996c..620c045d 100644
--- a/doc/guides/nics/dpaa.rst
+++ b/doc/guides/nics/dpaa.rst
@@ -162,6 +162,16 @@ Manager.
   this pool.


+Whitelisting & Blacklisting
+---------------------------
+
+For blacklisting a DPAA device, following commands can be used.
+
+ .. code-block:: console
+
+    <dpdk app> <EAL args> -b "dpaa_bus:fmX-macY" -- ...
+    e.g. "dpaa_bus:fm1-mac4"
+
 Supported DPAA SoCs
 -------------------

diff --git a/doc/guides/nics/dpaa2.rst b/doc/guides/nics/dpaa2.rst
index 9c66edd4..66c03e10 100644
--- a/doc/guides/nics/dpaa2.rst
+++ b/doc/guides/nics/dpaa2.rst
@@ -494,28 +494,12 @@ Please note that enabling debugging options may affect system performance.

 - ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_DRIVER`` (default ``n``)

-  Toggle display of generic debugging messages
+  Toggle display of debugging messages/logic

 - ``CONFIG_RTE_LIBRTE_DPAA2_USE_PHYS_IOVA`` (default ``y``)

   Toggle to use physical address vs virtual address for hardware accelerators.

-- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_INIT`` (default ``n``)
-
-  Toggle display of initialization related messages.
-
-- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_RX`` (default ``n``)
-
-  Toggle display of receive fast path run-time message
-
-- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_TX`` (default ``n``)
-
-  Toggle display of transmit fast path run-time message
-
-- ``CONFIG_RTE_LIBRTE_DPAA2_DEBUG_TX_FREE`` (default ``n``)
-
-  Toggle display of transmit fast path buffer free run-time message
-
 Driver compilation and testing
 ------------------------------

@@ -532,8 +516,7 @@ for details.

    .. code-block:: console

-      ./arm64-dpaa2-linuxapp-gcc/testpmd -c 0xff -n 1 \
-        -- -i --portmask=0x3 --nb-cores=1 --no-flush-rx
+      ./testpmd -c 0xff -n 1 -- -i --portmask=0x3 --nb-cores=1 --no-flush-rx

       .....
       EAL: Registered [pci] bus.
@@ -557,6 +540,38 @@ for details.
       Done
       testpmd>

+Enabling logs
+-------------
+
+For enabling logging for DPAA2 PMD, following log-level prefix can be used:
+
+ .. code-block:: console
+
+    <dpdk app> <EAL args> --log-level=bus.fslmc:<level> -- ...
+
+Using ``bus.fslmc`` as log matching criteria, all FSLMC bus logs can be enabled
+which are lower than logging ``level``.
+
+ Or
+
+ .. code-block:: console
+
+    <dpdk app> <EAL args> --log-level=pmd.net.dpaa2:<level> -- ...
+
+Using ``pmd.dpaa2`` as log matching criteria, all PMD logs can be enabled
+which are lower than logging ``level``.
+
+Whitelisting & Blacklisting
+---------------------------
+
+For blacklisting a DPAA2 device, following commands can be used.
+
+ .. code-block:: console
+
+    <dpdk app> <EAL args> -b "fslmc:dpni.x" -- ...
+
+Where x is the device object id as configured in resource container.
+
 Limitations
 -----------

diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst
index 4dffce1a..438a83d5 100644
--- a/doc/guides/nics/enic.rst
+++ b/doc/guides/nics/enic.rst
@@ -1,32 +1,7 @@
-.. BSD LICENSE
+.. SPDX-License-Identifier: BSD-3-Clause
     Copyright (c) 2017, Cisco Systems, Inc.
     All rights reserved.

-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions
-    are met:
-
-    1. Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-
-    2. Redistributions in binary form must reproduce the above copyright
-    notice, this list of conditions and the following disclaimer in
-    the documentation and/or other materials provided with the
-    distribution.
-
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS
-    FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE
-    COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT,
-    INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
-    BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
-    LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
-    CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
-    LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN
-    ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
-    POSSIBILITY OF SUCH DAMAGE.
-
 ENIC Poll Mode Driver
 =====================

@@ -114,11 +89,24 @@ Configuration information

   - **Interrupts**

-    Only one interrupt per vNIC interface should be configured in the UCS
+    At least one interrupt per vNIC interface should be configured in the UCS
     manager regardless of the number receive/transmit queues. The ENIC PMD
     uses this interrupt to get information about link status and errors
     in the fast path.

+    In addition to the interrupt for link status and errors, when using Rx queue
+    interrupts, increase the number of configured interrupts so that there is at
+    least one interrupt for each Rx queue. For example, if the app uses 3 Rx
+    queues and wants to use per-queue interrupts, configure 4 (3 + 1) interrupts.
+
+  - **Receive Side Scaling**
+
+    In order to fully utilize RSS in DPDK, enable all RSS related settings in
+    CIMC or UCSM. These include the following items listed under
+    Receive Side Scaling:
+    TCP, IPv4, TCP-IPv4, IPv6, TCP-IPv6, IPv6 Extension, TCP-IPv6 Extension.
+
+
 .. _enic-flow-director:

 Flow director support
@@ -140,20 +128,21 @@ perfect filtering of the 5-tuple with no masking of fields supported.
 SR-IOV mode utilization
 -----------------------

-UCS blade servers configured with dynamic vNIC connection policies in UCS
-manager are capable of supporting assigned devices on virtual machines (VMs)
-through a KVM hypervisor. Assigned devices, also known as 'passthrough'
-devices, are SR-IOV virtual functions (VFs) on the host which are exposed
-to VM instances.
+UCS blade servers configured with dynamic vNIC connection policies in UCSM
+are capable of supporting SR-IOV. SR-IOV virtual functions (VFs) are
+specialized vNICs, distinct from regular Ethernet vNICs. These VFs can be
+directly assigned to virtual machines (VMs) as 'passthrough' devices.

-The Cisco Virtual Machine Fabric Extender (VM-FEX) gives the VM a dedicated
+In UCS, SR-IOV VFs require the use of the Cisco Virtual Machine Fabric Extender
+(VM-FEX), which gives the VM a dedicated
 interface on the Fabric Interconnect (FI). Layer 2 switching is done at
 the FI. This may eliminate the requirement for software switching on the
 host to route intra-host VM traffic.

 Please refer to `Creating a Dynamic vNIC Connection Policy
 <http://www.cisco.com/c/en/us/td/docs/unified_computing/ucs/sw/vm_fex/vmware/gui/config_guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide/b_GUI_VMware_VM-FEX_UCSM_Configuration_Guide_chapter_010.html#task_433E01651F69464783A68E66DA8A47A5>`_
-for information on configuring SR-IOV adapter policies using UCS manager.
+for information on configuring SR-IOV adapter policies and port profiles
+using UCSM.

 Once the policies are in place and the host OS is rebooted, VFs should be
 visible on the host, E.g.:
@@ -170,30 +159,37 @@ visible on the host, E.g.:
      0d:00.6 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)
      0d:00.7 Ethernet controller: Cisco Systems Inc VIC SR-IOV VF (rev a2)

-Enable Intel IOMMU on the host and install KVM and libvirt. A VM instance should
-be created with an assigned device. When using libvirt, this configuration can
-be done within the domain (i.e. VM) config file. For example this entry maps
-host VF 0d:00:01 into the VM.
+Enable Intel IOMMU on the host and install KVM and libvirt, and reboot again as
+required. Then, using libvirt, create a VM instance with an assigned device.
+Below is an example ``interface`` block (part of the domain configuration XML)
+that adds the host VF 0d:00:01 to the VM. ``profileid='pp-vlan-25'`` indicates
+the port profile that has been configured in UCSM.

 .. code-block:: console

     <interface type='hostdev' managed='yes'>
       <mac address='52:54:00:ac:ff:b6'/>
+      <driver name='vfio'/>
       <source>
         <address type='pci' domain='0x0000' bus='0x0d' slot='0x00' function='0x1'/>
       </source>
+      <virtualport type='802.1Qbh'>
+        <parameters profileid='pp-vlan-25'/>
+      </virtualport>
+    </interface>
+

 Alternatively, the configuration can be done in a separate file using the
 ``network`` keyword. These methods are described in the libvirt documentation
 for `Network XML format <https://libvirt.org/formatnetwork.html>`_.

-When the VM instance is started, the ENIC KVM driver will bind the host VF to
+When the VM instance is started, libvirt will bind the host VF to
 vfio, complete provisioning on the FI and bring up the link.

 .. note::

     It is not possible to use a VF directly from the host because it is not
-    fully provisioned until the hypervisor brings up the VM that it is assigned
+    fully provisioned until libvirt brings up the VM that it is assigned
     to.

 In the VM instance, the VF will now be visible. E.g., here the VF 00:04.0 is
@@ -207,9 +203,27 @@ seen on the VM instance and should be available for binding to a DPDK.
 Follow the normal DPDK install procedure, binding the VF to either
 ``igb_uio`` or ``vfio`` in non-IOMMU mode.

+In the VM, the kernel enic driver may be automatically bound to the VF during
+boot. Unbinding it currently hangs due to a known issue with the driver. To
+work around the issue, blacklist the enic module as follows.
 Please see :ref:`Limitations <enic_limitations>` for limitations in
 the use of SR-IOV.

+.. code-block:: console
+
+     # cat /etc/modprobe.d/enic.conf
+     blacklist enic
+
+     # dracut --force
+
+.. note::
+
+    Passthrough does not require SR-IOV. If VM-FEX is not desired, the user
+    may create as many regular vNICs as necessary and assign them to VMs as
+    passthrough devices. Since these vNICs are not SR-IOV VFs, using them as
+    passthrough devices do not require libvirt, port profiles, and VM-FEX.
+
+
 .. _enic-genic-flow-api:

 Generic Flow API support
@@ -227,7 +241,7 @@ Generic Flow API is supported. The baseline support is:
   - Actions: queue and void
   - Selectors: 'is'

-- **1300 series VICS with advanced filters disabled**
+- **1300 and later series VICS with advanced filters disabled**

   With advanced filters disabled, an IPv4 or IPv6 item must be specified
   in the pattern.
@@ -238,17 +252,99 @@ Generic Flow API is supported. The baseline support is:
   - Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
   - In total, up to 64 bytes of mask is allowed across all headers

-- **1300 series VICS with advanced filters enabled**
+- **1300 and later series VICS with advanced filters enabled**

   - Attributes: ingress
   - Items: eth, ipv4, ipv6, udp, tcp, vxlan, inner eth, ipv4, ipv6, udp, tcp
-  - Actions: queue, mark, flag and void
+  - Actions: queue, mark, drop, flag and void
   - Selectors: 'is', 'spec' and 'mask'. 'last' is not supported
   - In total, up to 64 bytes of mask is allowed across all headers

 More features may be added in future firmware and new versions of the VIC.
 Please refer to the release notes.

+.. _overlay_offload:
+
+Overlay Offload
+---------------
+
+Recent hardware models support overlay offload. When enabled, the NIC performs
+the following operations for VXLAN, NVGRE, and GENEVE packets. In all cases,
+inner and outer packets can be IPv4 or IPv6.
+
+- TSO for VXLAN and GENEVE packets.
+
+  Hardware supports NVGRE TSO, but DPDK currently has no NVGRE offload flags.
+
+- Tx checksum offloads.
+
+  The NIC fills in IPv4/UDP/TCP checksums for both inner and outer packets.
+
+- Rx checksum offloads.
+
+  The NIC validates IPv4/UDP/TCP checksums of both inner and outer packets.
+  Good checksum flags (e.g. ``PKT_RX_L4_CKSUM_GOOD``) indicate that the inner
+  packet has the correct checksum, and if applicable, the outer packet also
+  has the correct checksum. Bad checksum flags (e.g. ``PKT_RX_L4_CKSUM_BAD``)
+  indicate that the inner and/or outer packets have invalid checksum values.
+
+- Inner Rx packet type classification
+
+  PMD sets inner L3/L4 packet types (e.g. ``RTE_PTYPE_INNER_L4_TCP``), and
+  ``RTE_PTYPE_TUNNEL_GRENAT`` to indicate that the packet is tunneled.
+  PMD does not set L3/L4 packet types for outer packets.
+
+- Inner RSS
+
+  RSS hash calculation, therefore queue selection, is done on inner packets.
+
+In order to enable overlay offload, the 'Enable VXLAN' box should be checked
+via CIMC or UCSM followed by a reboot of the server. When PMD successfully
+enables overlay offload, it prints the following message on the console.
+
+.. code-block:: console
+
+    Overlay offload is enabled
+
+By default, PMD enables overlay offload if hardware supports it. To disable
+it, set ``devargs`` parameter ``disable-overlay=1``. For example::
+
+    -w 12:00.0,disable-overlay=1
+
+By default, the NIC uses 4789 as the VXLAN port. The user may change
+it through ``rte_eth_dev_udp_tunnel_port_{add,delete}``. However, as
+the current NIC has a single VXLAN port number, the user cannot
+configure multiple port numbers.
+
+Ingress VLAN Rewrite
+--------------------
+
+VIC adapters can tag, untag, or modify the VLAN headers of ingress
+packets. The ingress VLAN rewrite mode controls this behavior. By
+default, it is set to pass-through, where the NIC does not modify the
+VLAN header in any way so that the application can see the original
+header. This mode is sufficient for many applications, but may not be
+suitable for others. Such applications may change the mode by setting
+``devargs`` parameter ``ig-vlan-rewrite`` to one of the following.
+
+- ``pass``: Pass-through mode. The NIC does not modify the VLAN
+  header. This is the default mode.
+
+- ``priority``: Priority-tag default VLAN mode. If the ingress packet
+  is tagged with the default VLAN, the NIC replaces its VLAN header
+  with the priority tag (VLAN ID 0).
+
+- ``trunk``: Default trunk mode. The NIC tags untagged ingress packets
+  with the default VLAN. Tagged ingress packets are not modified. To
+  the application, every packet appears as tagged.
+
+- ``untag``: Untag default VLAN mode. If the ingress packet is tagged
+  with the default VLAN, the NIC removes or untags its VLAN header so
+  that the application sees an untagged packet. As a result, the
+  default VLAN becomes `untagged`. This mode can be useful for
+  applications such as OVS-DPDK performance benchmarks that utilize
+  only the default VLAN and want to see only untagged packets.
+
 .. _enic_limitations:

 Limitations
@@ -264,9 +360,10 @@ Limitations
   In test setups where an Ethernet port of a Cisco adapter in TRUNK mode is
   connected point-to-point to another adapter port or connected though a router
   instead of a switch, all ingress packets will be VLAN tagged. Programs such
-  as l3fwd which do not account for VLAN tags in packets will misbehave. The
-  solution is to enable VLAN stripping on ingress. The following code fragment is
-  an example of how to accomplish this:
+  as l3fwd may not account for VLAN tags in packets and may misbehave. One
+  solution is to enable VLAN stripping on ingress so the VLAN tag is removed
+  from the packet and put into the mbuf->vlan_tci field. Here is an example
+  of how to accomplish this:

   .. code-block:: console

@@ -274,6 +371,14 @@ Limitations
      vlan_offload |= ETH_VLAN_STRIP_OFFLOAD;
      rte_eth_dev_set_vlan_offload(port, vlan_offload);

+Another alternative is modify the adapter's ingress VLAN rewrite mode so that
+packets with the default VLAN tag are stripped by the adapter and presented to
+DPDK as untagged packets. In this case mbuf->vlan_tci and the PKT_RX_VLAN and
+PKT_RX_VLAN_STRIPPED mbuf flags would not be set. This mode is enabled with the
+``devargs`` parameter ``ig-vlan-rewrite=untag``. For example::
+
+    -w 12:00.0,ig-vlan-rewrite=untag
+
 - Limited flow director support on 1200 series and 1300 series Cisco VIC
   adapters with old firmware. Please see :ref:`enic-flow-director`.

@@ -305,6 +410,24 @@ Limitations
   were added. Since there currently is no grouping or priority support,
   'catch-all' filters should be added last.

+- **Statistics**
+
+  - ``rx_good_bytes`` (ibytes) always includes VLAN header (4B) and CRC bytes (4B).
+    This behavior applies to 1300 and older series VIC adapters.
+    1400 series VICs do not count CRC bytes, and count VLAN header only when VLAN
+    stripping is disabled.
+  - When the NIC drops a packet because the Rx queue has no free buffers,
+    ``rx_good_bytes`` still increments by 4B if the packet is not VLAN tagged or
+    VLAN stripping is disabled, or by 8B if the packet is VLAN tagged and stripping
+    is enabled.
+    This behavior applies to 1300 and older series VIC adapters. 1400 series VICs
+    do not increment this byte counter when packets are dropped.
+
+- **RSS Hashing**
+
+  - Hardware enables and disables UDP and TCP RSS hashing together. The driver
+    cannot control UDP and TCP hashing individually.
+
 How to build the suite
 ----------------------

@@ -322,17 +445,9 @@ Supported Cisco VIC adapters

 ENIC PMD supports all recent generations of Cisco VIC adapters including:

-- VIC 1280
-- VIC 1240
-- VIC 1225
-- VIC 1285
-- VIC 1225T
-- VIC 1227
-- VIC 1227T
-- VIC 1380
-- VIC 1340
-- VIC 1385
-- VIC 1387
+- VIC 1200 series
+- VIC 1300 series
+- VIC 1400 series

 Supported Operating Systems
 ---------------------------
@@ -356,10 +471,16 @@ Supported features
 - VLAN filtering (supported via UCSM/CIMC only)
 - Execution of application by unprivileged system users
 - IPV4, IPV6 and TCP RSS hashing
+- UDP RSS hashing (1400 series and later adapters)
 - Scattered Rx
 - MTU update
 - SR-IOV on UCS managed servers connected to Fabric Interconnects
 - Flow API
+- Overlay offload
+
+  - Rx/Tx checksum offloads for VXLAN, NVGRE, GENEVE
+  - TSO for VXLAN and GENEVE packets
+  - Inner RSS

 Known bugs and unsupported features in this release
 ---------------------------------------------------
@@ -369,8 +490,8 @@ Known bugs and unsupported features in this release
 - VLAN based flow direction
 - Non-IPV4 flow direction
 - Setting of extended VLAN
-- UDP RSS hashing
 - MTU update only works if Scattered Rx mode is disabled
+- Maximum receive packet length is ignored if Scattered Rx mode is used

 Prerequisites
 -------------
@@ -427,4 +548,4 @@ Any questions or bugs should be reported to DPDK community and to the ENIC PMD
 maintainers:

 - John Daley <johndale@cisco.com>
-- Nelson Escobar <neescoba@cisco.com>
+- Hyong Youb Kim <hyonkim@cisco.com>
diff --git a/doc/guides/nics/fail_safe.rst b/doc/guides/nics/fail_safe.rst
index 3f72b593..6c02d7ef 100644
--- a/doc/guides/nics/fail_safe.rst
+++ b/doc/guides/nics/fail_safe.rst
@@ -1,32 +1,6 @@
-.. BSD LICENSE
+.. SPDX-License-Identifier: BSD-3-Clause
     Copyright 2017 6WIND S.A.

-    Redistribution and use in source and binary forms, with or without
-    modification, are permitted provided that the following conditions
-    are met:
-
-    * Redistributions of source code must retain the above copyright
-    notice, this list of conditions and the following disclaimer.
-    * Redistributions in binary form must reproduce the above copyright
-    notice, this list of conditions and the following disclaimer in
-    the documentation and/or other materials provided with the
-    distribution.
-    * Neither the name of 6WIND S.A. nor the names of its
-    contributors may be used to endorse or promote products derived
-    from this software without specific prior written permission.
-
-    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- Fail-safe poll mode driver library ================================== diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst index 1b4fb979..cddc877d 100644 --- a/doc/guides/nics/features.rst +++ b/doc/guides/nics/features.rst @@ -278,6 +278,17 @@ Supports RSS hashing on RX. * **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_RSS_HASH``, ``mbuf.rss``. +.. _nic_features_inner_rss: + +Inner RSS +--------- + +Supports RX RSS hashing on Inner headers. + +* **[users] rte_flow_action_rss**: ``level``. +* **[provides] mbuf**: ``mbuf.ol_flags:PKT_RX_RSS_HASH``, ``mbuf.rss``. + + .. _nic_features_rss_key_update: RSS key update @@ -503,7 +514,7 @@ CRC offload Supports CRC stripping by hardware. -* **[uses] rte_eth_rxconf,rte_eth_rxmode**: ``offloads:DEV_RX_OFFLOAD_CRC_STRIP``. +* **[uses] rte_eth_rxconf,rte_eth_rxmode**: ``offloads:DEV_RX_OFFLOAD_CRC_STRIP,DEV_RX_OFFLOAD_KEEP_CRC``. .. _nic_features_vlan_offload: @@ -566,7 +577,6 @@ Supports L4 checksum offload. * **[uses] rte_eth_rxconf,rte_eth_rxmode**: ``offloads:DEV_RX_OFFLOAD_UDP_CKSUM,DEV_RX_OFFLOAD_TCP_CKSUM``. * **[uses] rte_eth_txconf,rte_eth_txmode**: ``offloads:DEV_TX_OFFLOAD_UDP_CKSUM,DEV_TX_OFFLOAD_TCP_CKSUM,DEV_TX_OFFLOAD_SCTP_CKSUM``. -* **[uses] user config**: ``dev_conf.rxmode.hw_ip_checksum``. * **[uses] mbuf**: ``mbuf.ol_flags:PKT_TX_IPV4`` | ``PKT_TX_IPV6``, ``mbuf.ol_flags:PKT_TX_L4_NO_CKSUM`` | ``PKT_TX_TCP_CKSUM`` | ``PKT_TX_SCTP_CKSUM`` | ``PKT_TX_UDP_CKSUM``. @@ -749,6 +759,17 @@ Supports getting/setting device eeprom data. ``rte_eth_dev_set_eeprom()``. +.. _nic_features_module_eeprom_dump: + +Module EEPROM dump +------------------ + +Supports getting information and data of plugin module eeprom. + +* **[implements] eth_dev_ops**: ``get_module_info``, ``get_module_eeprom``. +* **[related] API**: ``rte_eth_dev_get_module_info()``, ``rte_eth_dev_get_module_eeprom()``. + + .. _nic_features_register_dump: Registers dump @@ -892,7 +913,25 @@ Documentation describes performance values. 
See ``dpdk.org/doc/perf/*``. +.. _nic_features_runtime_rx_queue_setup: + +Runtime Rx queue setup +---------------------- + +Supports Rx queue setup after device started. + +* **[provides] rte_eth_dev_info**: ``dev_capa:RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP``. +* **[related] API**: ``rte_eth_dev_info_get()``. +.. _nic_features_runtime_tx_queue_setup: + +Runtime Tx queue setup +---------------------- + +Supports Tx queue setup after device started. + +* **[provides] rte_eth_dev_info**: ``dev_capa:RTE_ETH_DEV_CAPA_RUNTIME_TX_QUEUE_SETUP``. +* **[related] API**: ``rte_eth_dev_info_get()``. .. _nic_features_other: diff --git a/doc/guides/nics/features/avf.ini b/doc/guides/nics/features/avf.ini index ccb9edde..35ceada2 100644 --- a/doc/guides/nics/features/avf.ini +++ b/doc/guides/nics/features/avf.ini @@ -6,7 +6,6 @@ [Features] Speed capabilities = Y Link status = Y -Link status event = Y Rx interrupt = Y Queue start/stop = Y MTU update = Y diff --git a/doc/guides/nics/features/avf_vec.ini b/doc/guides/nics/features/avf_vec.ini index 89249948..3050bc4a 100644 --- a/doc/guides/nics/features/avf_vec.ini +++ b/doc/guides/nics/features/avf_vec.ini @@ -6,7 +6,6 @@ [Features] Speed capabilities = Y Link status = Y -Link status event = Y Rx interrupt = Y Queue start/stop = Y MTU update = Y diff --git a/doc/guides/nics/features/axgbe.ini b/doc/guides/nics/features/axgbe.ini new file mode 100644 index 00000000..ab4da559 --- /dev/null +++ b/doc/guides/nics/features/axgbe.ini @@ -0,0 +1,19 @@ +; +; Supported features of the 'axgbe' network poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. 
+;
+[Features]
+Speed capabilities = Y
+Link status = Y
+Jumbo frame = Y
+Promiscuous mode = Y
+Allmulticast mode = Y
+RSS hash = Y
+CRC offload = Y
+L3 checksum offload = Y
+L4 checksum offload = Y
+Basic stats = Y
+Linux UIO = Y
+x86-32 = Y
+x86-64 = Y
diff --git a/doc/guides/nics/features/cxgbe.ini b/doc/guides/nics/features/cxgbe.ini
index 3d0fde2f..88f2f92b 100644
--- a/doc/guides/nics/features/cxgbe.ini
+++ b/doc/guides/nics/features/cxgbe.ini
@@ -14,7 +14,9 @@ TSO = Y
 Promiscuous mode = Y
 Allmulticast mode = Y
 RSS hash = Y
+RSS key update = Y
 Flow control = Y
+Flow API = Y
 CRC offload = Y
 VLAN offload = Y
 L3 checksum offload = Y
@@ -24,6 +26,7 @@ Basic stats = Y
 Stats per queue = Y
 EEPROM dump = Y
 Registers dump = Y
+Multiprocess aware = Y
 BSD nic_uio = Y
 Linux UIO = Y
 Linux VFIO = Y
diff --git a/doc/guides/nics/features/cxgbevf.ini b/doc/guides/nics/features/cxgbevf.ini
new file mode 100644
index 00000000..b41fc365
--- /dev/null
+++ b/doc/guides/nics/features/cxgbevf.ini
@@ -0,0 +1,29 @@
+;
+; Supported features of the 'cxgbevf' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+; +[Features] +Speed capabilities = Y +Link status = Y +Queue start/stop = Y +MTU update = Y +Jumbo frame = Y +Scattered Rx = Y +TSO = Y +Promiscuous mode = Y +Allmulticast mode = Y +RSS hash = Y +CRC offload = Y +VLAN offload = Y +L3 checksum offload = Y +L4 checksum offload = Y +Packet type parsing = Y +Basic stats = Y +Stats per queue = Y +Multiprocess aware = Y +Linux UIO = Y +Linux VFIO = Y +x86-32 = Y +x86-64 = Y +Usage doc = Y diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini index dae2ad77..f1a39d0f 100644 --- a/doc/guides/nics/features/default.ini +++ b/doc/guides/nics/features/default.ini @@ -17,6 +17,8 @@ Lock-free Tx queue = Fast mbuf free = Free Tx mbuf on demand = Queue start/stop = +Runtime Rx queue setup = +Runtime Tx queue setup = MTU update = Jumbo frame = Scattered Rx = @@ -29,6 +31,7 @@ Multicast MAC filter = RSS hash = RSS key update = RSS reta update = +Inner RSS = VMDq = SR-IOV = DCB = @@ -63,6 +66,7 @@ Extended stats = Stats per queue = FW version = EEPROM dump = +Module EEPROM dump = Registers dump = LED = Multiprocess aware = diff --git a/doc/guides/nics/features/enic.ini b/doc/guides/nics/features/enic.ini index 498341f0..8a4bad29 100644 --- a/doc/guides/nics/features/enic.ini +++ b/doc/guides/nics/features/enic.ini @@ -6,23 +6,29 @@ [Features] Link status = Y Link status event = Y +Rx interrupt = Y Queue start/stop = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y TSO = Y Promiscuous mode = Y +Allmulticast mode = Y Unicast MAC filter = Y -Multicast MAC filter = Y +Multicast MAC filter = RSS hash = Y +RSS key update = Y +RSS reta update = Y +Inner RSS = Y SR-IOV = Y -VLAN filter = Y CRC offload = Y VLAN offload = Y Flow director = Y Flow API = Y L3 checksum offload = Y L4 checksum offload = Y +Inner L3 checksum = Y +Inner L4 checksum = Y Packet type parsing = Y Basic stats = Y Multiprocess aware = Y diff --git a/doc/guides/nics/features/fm10k.ini b/doc/guides/nics/features/fm10k.ini index 
f0f61a7d..0acdf0d3 100644 --- a/doc/guides/nics/features/fm10k.ini +++ b/doc/guides/nics/features/fm10k.ini @@ -5,6 +5,8 @@ ; [Features] Speed capabilities = P +Link status = Y +Link status event = Y Rx interrupt = Y Queue start/stop = Y Jumbo frame = Y @@ -24,6 +26,8 @@ VLAN offload = Y L3 checksum offload = Y L4 checksum offload = Y Packet type parsing = Y +Rx descriptor status = Y +Tx descriptor status = Y Basic stats = Y Extended stats = Y Stats per queue = Y diff --git a/doc/guides/nics/features/fm10k_vf.ini b/doc/guides/nics/features/fm10k_vf.ini index 32b93df4..44b50faa 100644 --- a/doc/guides/nics/features/fm10k_vf.ini +++ b/doc/guides/nics/features/fm10k_vf.ini @@ -5,6 +5,8 @@ ; [Features] Speed capabilities = P +Link status = Y +Link status event = Y Rx interrupt = Y Queue start/stop = Y Jumbo frame = Y diff --git a/doc/guides/nics/features/i40e.ini b/doc/guides/nics/features/i40e.ini index e862712c..16eab7f4 100644 --- a/doc/guides/nics/features/i40e.ini +++ b/doc/guides/nics/features/i40e.ini @@ -9,6 +9,8 @@ Link status = Y Link status event = Y Rx interrupt = Y Queue start/stop = Y +Runtime Rx queue setup = Y +Runtime Tx queue setup = Y Jumbo frame = Y Scattered Rx = Y TSO = Y @@ -44,6 +46,7 @@ Tx descriptor status = Y Basic stats = Y Extended stats = Y FW version = Y +Module EEPROM dump = Y Multiprocess aware = Y BSD nic_uio = Y Linux UIO = Y diff --git a/doc/guides/nics/features/i40e_vec.ini b/doc/guides/nics/features/i40e_vec.ini index 7d7b3a92..c65e8b03 100644 --- a/doc/guides/nics/features/i40e_vec.ini +++ b/doc/guides/nics/features/i40e_vec.ini @@ -34,6 +34,7 @@ Rx descriptor status = Y Tx descriptor status = Y Basic stats = Y Extended stats = Y +Module EEPROM dump = Y Multiprocess aware = Y BSD nic_uio = Y Linux UIO = Y diff --git a/doc/guides/nics/features/i40e_vf.ini b/doc/guides/nics/features/i40e_vf.ini index 46e0d9fc..ba2d8cbe 100644 --- a/doc/guides/nics/features/i40e_vf.ini +++ b/doc/guides/nics/features/i40e_vf.ini @@ -5,6 +5,7 @@ ; 
[Features] Rx interrupt = Y +Link status = Y Queue start/stop = Y Jumbo frame = Y Scattered Rx = Y diff --git a/doc/guides/nics/features/i40e_vf_vec.ini b/doc/guides/nics/features/i40e_vf_vec.ini index c2c6c19f..421ed919 100644 --- a/doc/guides/nics/features/i40e_vf_vec.ini +++ b/doc/guides/nics/features/i40e_vf_vec.ini @@ -5,6 +5,7 @@ ; [Features] Rx interrupt = Y +Link status = Y Queue start/stop = Y Jumbo frame = Y Scattered Rx = Y diff --git a/doc/guides/nics/features/ifcvf.ini b/doc/guides/nics/features/ifcvf.ini new file mode 100644 index 00000000..ef1fc471 --- /dev/null +++ b/doc/guides/nics/features/ifcvf.ini @@ -0,0 +1,8 @@ +; +; Supported features of the 'ifcvf' vDPA driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +x86-32 = Y +x86-64 = Y diff --git a/doc/guides/nics/features/igb.ini b/doc/guides/nics/features/igb.ini index 33d64d99..c53fd075 100644 --- a/doc/guides/nics/features/igb.ini +++ b/doc/guides/nics/features/igb.ini @@ -41,6 +41,7 @@ Basic stats = Y Extended stats = Y FW version = Y EEPROM dump = Y +Module EEPROM dump = Y Registers dump = Y BSD nic_uio = Y Linux UIO = Y diff --git a/doc/guides/nics/features/igb_vf.ini b/doc/guides/nics/features/igb_vf.ini index e641a2c9..d9653234 100644 --- a/doc/guides/nics/features/igb_vf.ini +++ b/doc/guides/nics/features/igb_vf.ini @@ -4,6 +4,7 @@ ; Refer to default.ini for the full list of available PMD features. 
; [Features] +Link status = Y Rx interrupt = Y Scattered Rx = Y TSO = Y diff --git a/doc/guides/nics/features/ixgbe.ini b/doc/guides/nics/features/ixgbe.ini index 1d68ee8e..41431117 100644 --- a/doc/guides/nics/features/ixgbe.ini +++ b/doc/guides/nics/features/ixgbe.ini @@ -51,6 +51,7 @@ Extended stats = Y Stats per queue = Y FW version = Y EEPROM dump = Y +Module EEPROM dump = Y Registers dump = Y Multiprocess aware = Y BSD nic_uio = Y diff --git a/doc/guides/nics/features/ixgbe_vec.ini b/doc/guides/nics/features/ixgbe_vec.ini index 28bc0547..ef3ee688 100644 --- a/doc/guides/nics/features/ixgbe_vec.ini +++ b/doc/guides/nics/features/ixgbe_vec.ini @@ -40,6 +40,7 @@ Basic stats = Y Extended stats = Y Stats per queue = Y EEPROM dump = Y +Module EEPROM dump = Y Registers dump = Y Multiprocess aware = Y BSD nic_uio = Y diff --git a/doc/guides/nics/features/mlx4.ini b/doc/guides/nics/features/mlx4.ini index f6efd21d..98a3f611 100644 --- a/doc/guides/nics/features/mlx4.ini +++ b/doc/guides/nics/features/mlx4.ini @@ -13,6 +13,7 @@ Queue start/stop = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y +TSO = Y Promiscuous mode = Y Allmulticast mode = Y Unicast MAC filter = Y diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini index c3636391..b28b43e5 100644 --- a/doc/guides/nics/features/mlx5.ini +++ b/doc/guides/nics/features/mlx5.ini @@ -21,6 +21,7 @@ Multicast MAC filter = Y RSS hash = Y RSS key update = Y RSS reta update = Y +Inner RSS = Y SR-IOV = Y VLAN filter = Y Flow director = Y @@ -29,6 +30,9 @@ CRC offload = Y VLAN offload = Y L3 checksum offload = Y L4 checksum offload = Y +Timestamp offload = Y +Inner L3 checksum = Y +Inner L4 checksum = Y Packet type parsing = Y Rx descriptor status = Y Tx descriptor status = Y @@ -39,5 +43,6 @@ Multiprocess aware = Y Other kdrv = Y ARMv8 = Y Power8 = Y +x86-32 = Y x86-64 = Y Usage doc = Y diff --git a/doc/guides/nics/features/mrvl.ini b/doc/guides/nics/features/mvpp2.ini index 00d96218..ef47546d 
100644 --- a/doc/guides/nics/features/mrvl.ini +++ b/doc/guides/nics/features/mvpp2.ini @@ -1,5 +1,5 @@ ; -; Supported features of the 'mrvl' network poll mode driver. +; Supported features of the 'mvpp2' network poll mode driver. ; ; Refer to default.ini for the full list of available PMD features. ; @@ -13,11 +13,13 @@ Allmulticast mode = Y Unicast MAC filter = Y Multicast MAC filter = Y RSS hash = Y +Flow control = Y VLAN filter = Y CRC offload = Y L3 checksum offload = Y L4 checksum offload = Y Packet type parsing = Y Basic stats = Y +Extended stats = Y ARMv8 = Y Usage doc = Y diff --git a/doc/guides/nics/features/netvsc.ini b/doc/guides/nics/features/netvsc.ini new file mode 100644 index 00000000..2ff6042b --- /dev/null +++ b/doc/guides/nics/features/netvsc.ini @@ -0,0 +1,23 @@ +; +; Supported features of the 'netvsc' network poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. +; +[Features] +Speed capabilities = P +Link status = Y +Queue start/stop = Y +Scattered Rx = Y +Promiscuous mode = Y +Allmulticast mode = Y +Basic stats = Y +Stats per queue = Y +Extended stats = Y +Multiprocess aware = Y +Other kdrv = Y +ARMv7 = Y +ARMv8 = Y +x86-32 = Y +x86-64 = Y +Usage doc = Y +MTU update = Y diff --git a/doc/guides/nics/features/qede.ini b/doc/guides/nics/features/qede.ini index cbadc194..0d081002 100644 --- a/doc/guides/nics/features/qede.ini +++ b/doc/guides/nics/features/qede.ini @@ -6,10 +6,11 @@ [Features] Speed capabilities = Y Link status = Y -Link status event = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y +LRO = Y +TSO = Y Promiscuous mode = Y Allmulticast mode = Y Unicast MAC filter = Y @@ -18,12 +19,14 @@ RSS hash = Y RSS key update = Y RSS reta update = Y VLAN filter = Y +N-tuple filter = Y +Tunnel filter = Y +Flow director = Y Flow control = Y CRC offload = Y VLAN offload = Y L3 checksum offload = Y L4 checksum offload = Y -Tunnel filter = Y Inner L3 checksum = Y Inner L4 checksum = Y Packet type parsing = Y 
@@ -32,11 +35,8 @@ Extended stats = Y Stats per queue = Y Multiprocess aware = Y Linux UIO = Y +Linux VFIO = Y ARMv8 = Y x86-32 = Y x86-64 = Y Usage doc = Y -N-tuple filter = Y -Flow director = Y -LRO = Y -TSO = Y diff --git a/doc/guides/nics/features/qede_vf.ini b/doc/guides/nics/features/qede_vf.ini index 18857b6e..e796b313 100644 --- a/doc/guides/nics/features/qede_vf.ini +++ b/doc/guides/nics/features/qede_vf.ini @@ -6,7 +6,6 @@ [Features] Speed capabilities = Y Link status = Y -Link status event = Y MTU update = Y Jumbo frame = Y Scattered Rx = Y @@ -30,6 +29,7 @@ Extended stats = Y Stats per queue = Y Multiprocess aware = Y Linux UIO = Y +Linux VFIO = Y ARMv8 = Y x86-32 = Y x86-64 = Y diff --git a/doc/guides/nics/features/softnic.ini b/doc/guides/nics/features/softnic.ini new file mode 100644 index 00000000..0583381c --- /dev/null +++ b/doc/guides/nics/features/softnic.ini @@ -0,0 +1,9 @@ +; +; Supported features of the 'softnic' poll mode driver. +; +; Refer to default.ini for the full list of available PMD features. 
+;
+[Features]
+x86-32 = Y
+x86-64 = Y
+Usage doc = Y
diff --git a/doc/guides/nics/features/vhost.ini b/doc/guides/nics/features/vhost.ini
index dffd1f49..ef81abb4 100644
--- a/doc/guides/nics/features/vhost.ini
+++ b/doc/guides/nics/features/vhost.ini
@@ -5,7 +5,6 @@
 ;
 [Features]
 Link status = Y
-Link status event = Y
 Free Tx mbuf on demand = Y
 Queue status event = Y
 Basic stats = Y
diff --git a/doc/guides/nics/features/virtio.ini b/doc/guides/nics/features/virtio.ini
index 16e577df..a16b8172 100644
--- a/doc/guides/nics/features/virtio.ini
+++ b/doc/guides/nics/features/virtio.ini
@@ -6,6 +6,7 @@
 [Features]
 Speed capabilities = P
 Link status = Y
+Link status event = Y
 Rx interrupt = Y
 Queue start/stop = Y
 Scattered Rx = Y
diff --git a/doc/guides/nics/features/virtio_vec.ini b/doc/guides/nics/features/virtio_vec.ini
index c06c860d..e60fe36a 100644
--- a/doc/guides/nics/features/virtio_vec.ini
+++ b/doc/guides/nics/features/virtio_vec.ini
@@ -6,6 +6,7 @@
 [Features]
 Speed capabilities = P
 Link status = Y
+Link status event = Y
 Rx interrupt = Y
 Queue start/stop = Y
 Promiscuous mode = Y
diff --git a/doc/guides/nics/fm10k.rst b/doc/guides/nics/fm10k.rst
index c44e226e..d1391e99 100644
--- a/doc/guides/nics/fm10k.rst
+++ b/doc/guides/nics/fm10k.rst
@@ -79,14 +79,14 @@ Other features are supported using optional MACRO configuration. They include:
 
 To enable via ``RX_OLFLAGS`` use ``RTE_LIBRTE_FM10K_RX_OLFLAGS_ENABLE=y``.
-To guarantee the constraint, the following configuration flags in ``dev_conf.rxmode`` +To guarantee the constraint, the following capabilities in ``dev_conf.rxmode.offloads`` will be checked: -* ``hw_vlan_extend`` +* ``DEV_RX_OFFLOAD_VLAN_EXTEND`` -* ``hw_ip_checksum`` +* ``DEV_RX_OFFLOAD_CHECKSUM`` -* ``header_split`` +* ``DEV_RX_OFFLOAD_HEADER_SPLIT`` * ``fdir_conf->mode`` @@ -106,19 +106,9 @@ TX Constraint Features not Supported by TX Vector PMD ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -TX vPMD only works when ``txq_flags`` is set to ``FM10K_SIMPLE_TX_FLAG``. -This means that it does not support TX multi-segment, VLAN offload or TX csum -offload. The following MACROs are used for these three features: +TX vPMD only works when offloads is set to 0 -* ``ETH_TXQ_FLAGS_NOMULTSEGS`` - -* ``ETH_TXQ_FLAGS_NOVLANOFFL`` - -* ``ETH_TXQ_FLAGS_NOXSUMSCTP`` - -* ``ETH_TXQ_FLAGS_NOXSUMUDP`` - -* ``ETH_TXQ_FLAGS_NOXSUMTCP`` +This means that it does not support any TX offload. Limitations ----------- @@ -149,9 +139,8 @@ CRC striping ~~~~~~~~~~~~ The FM10000 family of NICs strip the CRC for every packets coming into the -host interface. So, CRC will be stripped even when the -``rxmode.hw_strip_crc`` member is set to 0 in ``struct rte_eth_conf``. - +host interface. So, CRC will be stripped even when ``DEV_RX_OFFLOAD_CRC_STRIP`` +in ``rxmode.offloads`` is NOT set in ``struct rte_eth_conf``. Maximum packet length ~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index e1b8083c..65d87f86 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -4,14 +4,16 @@ I40E Poll Mode Driver ====================== -The I40E PMD (librte_pmd_i40e) provides poll mode driver support -for the Intel X710/XL710/X722 10/40 Gbps family of adapters. 
+The i40e PMD (librte_pmd_i40e) provides poll mode driver support for
+10/25/40 Gbps Intel® Ethernet 700 Series Network Adapters based on
+the Intel Ethernet Controller X710/XL710/XXV710 and Intel Ethernet
+Connection X722 (only support part of features).
 
 
 Features
 --------
 
-Features of the I40E PMD are:
+Features of the i40e PMD are:
 
 - Multiple queues for TX and RX
 - Receiver Side Scaling (RSS)
@@ -40,6 +42,7 @@ Features of the I40E PMD are:
 - VF Daemon (VFD) - EXPERIMENTAL
 - Dynamic Device Personalization (DDP)
 - Queue region configuration
+- Virtual Function Port Representors
 
 Prerequisites
 -------------
@@ -53,7 +56,37 @@ Prerequisites
   section of the :ref:`Getting Started Guide for Linux <linux_gsg>`.
 
 - Upgrade the NVM/FW version following the `Intel® Ethernet NVM Update Tool Quick Usage Guide for Linux
-  <https://www-ssl.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-linux-usage-guide.html>`_ if needed.
+  <https://www-ssl.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-linux-usage-guide.html>`_ and `Intel® Ethernet NVM Update Tool: Quick Usage Guide for EFI <https://www.intel.com/content/www/us/en/embedded/products/networking/nvm-update-tool-quick-efi-usage-guide.html>`_ if needed.
+
+Recommended Matching List
+-------------------------
+
+It is highly recommended to upgrade the i40e kernel driver and firmware to
+avoid the compatibility issues with i40e PMD. Here is the suggested matching
+list which has been tested and verified. The detailed information can refer
+to chapter Tested Platforms/Tested NICs in release notes.
+ + +--------------+-----------------------+------------------+ + | DPDK version | Kernel driver version | Firmware version | + +==============+=======================+==================+ + | 18.05 | 2.4.6 | 6.01 | + +--------------+-----------------------+------------------+ + | 18.02 | 2.4.3 | 6.01 | + +--------------+-----------------------+------------------+ + | 17.11 | 2.1.26 | 6.01 | + +--------------+-----------------------+------------------+ + | 17.08 | 2.0.19 | 6.01 | + +--------------+-----------------------+------------------+ + | 17.05 | 1.5.23 | 5.05 | + +--------------+-----------------------+------------------+ + | 17.02 | 1.5.23 | 5.05 | + +--------------+-----------------------+------------------+ + | 16.11 | 1.5.23 | 5.05 | + +--------------+-----------------------+------------------+ + | 16.07 | 1.4.25 | 5.04 | + +--------------+-----------------------+------------------+ + | 16.04 | 1.4.25 | 5.02 | + +--------------+-----------------------+------------------+ Pre-Installation Configuration ------------------------------ @@ -93,11 +126,6 @@ Please note that enabling debugging options may affect system performance. Number of queues reserved for each VMDQ Pool. -- ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` (default ``-1``) - - Interrupt Throttling interval. - - Runtime Config Options ~~~~~~~~~~~~~~~~~~~~~~ @@ -121,6 +149,20 @@ Runtime Config Options will switch PF interrupt from IntN to Int0 to avoid interrupt conflict between DPDK and Linux Kernel. +- ``Support VF Port Representor`` (default ``not enabled``) + + The i40e PF PMD supports the creation of VF port representors for the control + and monitoring of i40e virtual function devices. Each port representor + corresponds to a single virtual function of that device. 
Using the ``devargs`` + option ``representor`` the user can specify which virtual functions to create + port representors for on initialization of the PF PMD by passing the VF IDs of + the VFs which are required.:: + + -w DBDF,representor=[0,1,4] + + Currently hot-plugging of representor ports is not supported so all required + representors must be specified on the creation of the PF. + Driver compilation and testing ------------------------------ @@ -324,7 +366,7 @@ Delete all flow director rules on a port: Floating VEB ~~~~~~~~~~~~~ -The Intel® Ethernet Controller X710 and XL710 Family support a feature called +The Intel® Ethernet 700 Series support a feature called "Floating VEB". A Virtual Ethernet Bridge (VEB) is an IEEE Edge Virtual Bridging (EVB) term @@ -370,21 +412,22 @@ or greater. Dynamic Device Personalization (DDP) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The Intel® Ethernet Controller X*710 support a feature called "Dynamic Device -Personalization (DDP)", which is used to configure hardware by downloading -a profile to support protocols/filters which are not supported by default. -The DDP functionality requires a NIC firmware version of 6.0 or greater. +The Intel® Ethernet 700 Series except for the Intel Ethernet Connection +X722 support a feature called "Dynamic Device Personalization (DDP)", +which is used to configure hardware by downloading a profile to support +protocols/filters which are not supported by default. The DDP +functionality requires a NIC firmware version of 6.0 or greater. -Current implementation supports MPLSoUDP/MPLSoGRE/GTP-C/GTP-U/PPPoE/PPPoL2TP, +Current implementation supports GTP-C/GTP-U/PPPoE/PPPoL2TP, steering can be used with rte_flow API. -Load a profile which supports MPLSoUDP/MPLSoGRE and store backup profile: +Load a profile which supports GTP and store backup profile: .. 
code-block:: console - testpmd> ddp add 0 ./mpls.pkgo,./backup.pkgo + testpmd> ddp add 0 ./gtp.pkgo,./backup.pkgo -Delete a MPLS profile and restore backup profile: +Delete a GTP profile and restore backup profile: .. code-block:: console @@ -396,11 +439,11 @@ Get loaded DDP package info list: testpmd> ddp get list 0 -Display information about a MPLS profile: +Display information about a GTP profile: .. code-block:: console - testpmd> ddp get info ./mpls.pkgo + testpmd> ddp get info ./gtp.pkgo Input set configuration ~~~~~~~~~~~~~~~~~~~~~~~ @@ -416,7 +459,7 @@ For example, to use only 48bit prefix for IPv6 src address for IPv6 TCP RSS: Queue region configuration ~~~~~~~~~~~~~~~~~~~~~~~~~~~ -The Ethernet Controller X710/XL710 supports a feature of queue regions +The Intel® Ethernet 700 Series supports a feature of queue regions configuration for RSS in the PF, so that different traffic classes or different packet classification types can be separated to different queues in different queue regions. There is an API for configuration @@ -440,8 +483,8 @@ details please refer to :doc:`../testpmd_app_ug/index`. Limitations or Known issues --------------------------- -MPLS packet classification on X710/XL710 -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +MPLS packet classification +~~~~~~~~~~~~~~~~~~~~~~~~~~ For firmware versions prior to 5.0, MPLS packets are not recognized by the NIC. The L2 Payload flow type in flow director can be used to classify MPLS packet @@ -489,14 +532,14 @@ Incorrect Rx statistics when packet is oversize ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When a packet is over maximum frame size, the packet is dropped. -However the Rx statistics, when calling `rte_eth_stats_get` incorrectly +However, the Rx statistics, when calling `rte_eth_stats_get` incorrectly shows it as received. VF & TC max bandwidth setting ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The per VF max bandwidth and per TC max bandwidth cannot be enabled in parallel. 
-The dehavior is different when handling per VF and per TC max bandwidth setting. +The behavior is different when handling per VF and per TC max bandwidth setting. When enabling per VF max bandwidth, SW will check if per TC max bandwidth is enabled. If so, return failure. When enabling per TC max bandwidth, SW will check if per VF max bandwidth @@ -517,11 +560,11 @@ VF performance is impacted by PCI extended tag setting ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ To reach maximum NIC performance in the VF the PCI extended tag must be -enabled. The DPDK I40E PF driver will set this feature during initialization, +enabled. The DPDK i40e PF driver will set this feature during initialization, but the kernel PF driver does not. So when running traffic on a VF which is managed by the kernel PF driver, a significant NIC performance downgrade has -been observed (for 64 byte packets, there is about 25% linerate downgrade for -a 25G device and about 35% for a 40G device). +been observed (for 64 byte packets, there is about 25% line-rate downgrade for +a 25GbE device and about 35% for a 40GbE device). For kernel version >= 4.11, the kernel's PCI driver will enable the extended tag if it detects that the device supports it. So by default, this is not an @@ -562,12 +605,12 @@ with DPDK, then the configuration will also impact port B in the NIC with kernel driver, which don't want to use the TPID. So PMD reports warning to clarify what is changed by writing global register. -High Performance of Small Packets on 40G NIC --------------------------------------------- +High Performance of Small Packets on 40GbE NIC +---------------------------------------------- As there might be firmware fixes for performance enhancement in latest version of firmware image, the firmware update might be needed for getting high performance. -Check with the local Intel's Network Division application engineers for firmware updates. 
+Check the Intel support website for the latest firmware updates. Users should consult the release notes specific to a DPDK release to identify the validated firmware version for a NIC using the i40e driver. @@ -577,23 +620,13 @@ Use 16 Bytes RX Descriptor Size As i40e PMD supports both 16 and 32 bytes RX descriptor sizes, and 16 bytes size can provide helps to high performance of small packets. Configuration of ``CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC`` in config files can be changed to use 16 bytes size RX descriptors. -High Performance and per Packet Latency Tradeoff -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -Due to the hardware design, the interrupt signal inside NIC is needed for per -packet descriptor write-back. The minimum interval of interrupts could be set -at compile time by ``CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL`` in configuration files. -Though there is a default configuration, the interval could be tuned by the -users with that configuration item depends on what the user cares about more, -performance or per packet latency. - Example of getting best performance with l3fwd example ------------------------------------------------------ -The following is an example of running the DPDK ``l3fwd`` sample application to get high performance with an -Intel server platform and Intel XL710 NICs. +The following is an example of running the DPDK ``l3fwd`` sample application to get high performance with a +server with Intel Xeon processors and Intel Ethernet CNA XL710. -The example scenario is to get best performance with two Intel XL710 40GbE ports. +The example scenario is to get best performance with two Intel Ethernet CNA XL710 40GbE ports. See :numref:`figure_intel_perf_test_setup` for the performance test setup. .. _figure_intel_perf_test_setup: @@ -603,9 +636,9 @@ See :numref:`figure_intel_perf_test_setup` for the performance test setup. Performance Test Setup -1. 
Add two Intel XL710 NICs to the platform, and use one port per card to get best performance. - The reason for using two NICs is to overcome a PCIe Gen3's limitation since it cannot provide 80G bandwidth - for two 40G ports, but two different PCIe Gen3 x8 slot can. +1. Add two Intel Ethernet CNA XL710 to the platform, and use one port per card to get best performance. + The reason for using two NICs is to overcome a PCIe v3.0 limitation since it cannot provide 80GbE bandwidth + for two 40GbE ports, but two different PCIe v3.0 x8 slot can. Refer to the sample NICs output above, then we can select ``82:00.0`` and ``85:00.0`` as test ports:: 82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583] @@ -621,7 +654,7 @@ See :numref:`figure_intel_perf_test_setup` for the performance test setup. 4. Bind these two ports to igb_uio. -5. As to XL710 40G port, we need at least two queue pairs to achieve best performance, then two queues per port +5. As to Intel Ethernet CNA XL710 40GbE port, we need at least two queue pairs to achieve best performance, then two queues per port will be required, and each queue pair will need a dedicated CPU core for receiving/transmitting packets. 6. The DPDK sample application ``l3fwd`` will be used for performance testing, with using two ports for bi-directional forwarding. diff --git a/doc/guides/nics/ifc.rst b/doc/guides/nics/ifc.rst new file mode 100644 index 00000000..48f9adf1 --- /dev/null +++ b/doc/guides/nics/ifc.rst @@ -0,0 +1,96 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2018 Intel Corporation. + +IFCVF vDPA driver +================= + +The IFCVF vDPA (vhost data path acceleration) driver provides support for the +Intel FPGA 100G VF (IFCVF). IFCVF's datapath is virtio ring compatible, it +works as a HW vhost backend which can send/receive packets to/from virtio +directly by DMA. Besides, it supports dirty page logging and device state +report/restore, this driver enables its vDPA functionality. 
+ + +Pre-Installation Configuration +------------------------------ + +Config File Options +~~~~~~~~~~~~~~~~~~~ + +The following option can be modified in the ``config`` file. + +- ``CONFIG_RTE_LIBRTE_IFCVF_VDPA_PMD`` (default ``y`` for linux) + + Toggle compilation of the ``librte_ifcvf_vdpa`` driver. + + +IFCVF vDPA Implementation +------------------------- + +IFCVF's vendor ID and device ID are same as that of virtio net pci device, +with its specific subsystem vendor ID and device ID. To let the device be +probed by IFCVF driver, adding "vdpa=1" parameter helps to specify that this +device is to be used in vDPA mode, rather than polling mode, virtio pmd will +skip when it detects this message. + +Different VF devices serve different virtio frontends which are in different +VMs, so each VF needs to have its own DMA address translation service. During +the driver probe a new container is created for this device, with this +container vDPA driver can program DMA remapping table with the VM's memory +region information. + +Key IFCVF vDPA driver ops +~~~~~~~~~~~~~~~~~~~~~~~~~ + +- ifcvf_dev_config: + Enable VF data path with virtio information provided by vhost lib, including + IOMMU programming to enable VF DMA to VM's memory, VFIO interrupt setup to + route HW interrupt to virtio driver, create notify relay thread to translate + virtio driver's kick to a MMIO write onto HW, HW queues configuration. + + This function gets called to set up HW data path backend when virtio driver + in VM gets ready. + +- ifcvf_dev_close: + Revoke all the setup in ifcvf_dev_config. + + This function gets called when virtio driver stops device in VM. + +To create a vhost port with IFC VF +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- Create a vhost socket and assign a VF's device ID to this socket via + vhost API. When QEMU vhost connection gets ready, the assigned VF will + get configured automatically. 
+ + +Features +-------- + +Features of the IFCVF driver are: + +- Compatibility with virtio 0.95 and 1.0. + + +Prerequisites +------------- + +- Platform with IOMMU feature. IFC VF needs address translation service to + Rx/Tx directly with virtio driver in VM. + + +Limitations +----------- + +Dependency on vfio-pci +~~~~~~~~~~~~~~~~~~~~~~ + +vDPA driver needs to setup VF MSIX interrupts, each queue's interrupt vector +is mapped to a callfd associated with a virtio ring. Currently only vfio-pci +allows multiple interrupts, so the IFCVF driver is dependent on vfio-pci. + +Live Migration with VIRTIO_NET_F_GUEST_ANNOUNCE +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +IFC VF doesn't support RARP packet generation, virtio frontend supporting +VIRTIO_NET_F_GUEST_ANNOUNCE feature can help to do that. diff --git a/doc/guides/nics/img/szedata2_nfb200g_architecture.svg b/doc/guides/nics/img/szedata2_nfb200g_architecture.svg new file mode 100644 index 00000000..e152e4a8 --- /dev/null +++ b/doc/guides/nics/img/szedata2_nfb200g_architecture.svg @@ -0,0 +1,214 @@ +<?xml version="1.0" encoding="UTF-8" standalone="no"?> +<svg + xmlns:dc="http://purl.org/dc/elements/1.1/" + xmlns:cc="http://creativecommons.org/ns#" + xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns="http://www.w3.org/2000/svg" + id="svg2" + stroke-miterlimit="10" + stroke-linecap="square" + stroke="none" + fill="none" + viewBox="0.0 0.0 568.7322834645669 352.3937007874016" + version="1.1"> + <metadata + id="metadata65"> + <rdf:RDF> + <cc:Work + rdf:about=""> + <dc:format>image/svg+xml</dc:format> + <dc:type + rdf:resource="http://purl.org/dc/dcmitype/StillImage" /> + <dc:title></dc:title> + </cc:Work> + </rdf:RDF> + </metadata> + <defs + id="defs63" /> + <clipPath + id="p.0"> + <path + id="path5" + clip-rule="nonzero" + d="m0 0l568.7323 0l0 352.3937l-568.7323 0l0 -352.3937z" /> + </clipPath> + <g + id="g7" + clip-path="url(#p.0)"> + <path + id="path9" + 
fill-rule="evenodd" + d="m0 0l568.7323 0l0 352.3937l-568.7323 0z" + fill-opacity="0.0" + fill="#000000" /> + <path + id="path11" + d="m 40.564137,14.365075 254.362203,0 0,131.842535 -254.362203,0 z" + style="fill:#47c3d3;fill-rule:evenodd" /> + <path + id="path15" + d="m 54.075948,146.2076 227.338592,0 0,32.94488 -227.338592,0 z" + style="fill:#c2c2c2;fill-rule:evenodd" /> + <path + id="path19" + d="m 321.90535,146.2076 227.33856,0 0,32.94488 -227.33856,0 z" + style="fill:#c2c2c2;fill-rule:evenodd" /> + <path + id="path23" + d="m 440.30217,146.24338 -11.82364,-20.50632 6.86313,0 0,-44.550399 -120.12924,0 0,6.938519 -20.28345,-11.953539 20.28345,-11.953547 0,6.93852 130.0503,0 0,54.580446 6.8631,0 z" + style="fill:#9a9a9a;fill-rule:evenodd" /> + <path + id="path25" + d="m 112.39353,263.09765 0,0 c 0,-8.08875 6.55722,-14.64597 14.64597,-14.64597 l 58.58208,0 0,0 c 3.88435,0 7.60962,1.54305 10.35626,4.28971 2.74666,2.74664 4.28971,6.47189 4.28971,10.35626 l 0,58.58209 c 0,8.08875 -6.55722,14.64597 -14.64597,14.64597 l -58.58208,0 c -8.08875,0 -14.64597,-6.55722 -14.64597,-14.64597 z" + style="fill:#c2c2c2;fill-rule:evenodd" /> + <path + id="path29" + d="m 391.63763,263.09765 0,0 c 0,-8.08875 6.55722,-14.64597 14.64597,-14.64597 l 58.58209,0 0,0 c 3.88437,0 7.60962,1.54305 10.35626,4.28971 2.74664,2.74664 4.2897,6.47189 4.2897,10.35626 l 0,58.58209 c 0,8.08875 -6.55722,14.64597 -14.64596,14.64597 l -58.58209,0 c -8.08875,0 -14.64597,-6.55722 -14.64597,-14.64597 z" + style="fill:#c2c2c2;fill-rule:evenodd" /> + <path + id="path33" + d="m 135.20981,199.01075 19.85826,-19.85826 19.85828,19.85826 -9.92914,0 0,29.5748 9.92914,0 -19.85828,19.85827 -19.85826,-19.85827 9.92914,0 0,-29.5748 z" + style="fill:#9a9a9a;fill-rule:evenodd" /> + <path + id="path35" + d="m 415.71635,199.01064 19.85828,-19.85826 19.85827,19.85826 -9.92914,0 0,29.57481 9.92914,0 -19.85827,19.85826 -19.85828,-19.85826 9.92914,0 0,-29.57481 z" + style="fill:#9a9a9a;fill-rule:evenodd" /> + <path + 
id="path37" + d="m 15.205,31.273212 74.362206,0 0,32.944885 -74.362206,0 z" + style="fill:#ff8434;fill-rule:evenodd" /> + <path + id="path41" + d="m 16.05531,80.231216 74.3622,0 0,32.944884 -74.3622,0 z" + style="fill:#ff8434;fill-rule:evenodd" /> + <path + id="path45" + d="m 275.44377,174.07111 0,111.55905 -37.16536,0 0,-111.55905 z" + style="fill:#ff8434;fill-rule:evenodd" /> + <path + id="path49" + d="m 97.923493,174.07111 0,111.55905 -37.16535,0 0,-111.55905 z" + style="fill:#ff8434;fill-rule:evenodd" /> + <path + id="path53" + d="m 366.27543,174.07111 0,111.55905 -37.16537,0 0,-111.55905 z" + style="fill:#ff8434;fill-rule:evenodd" /> + <path + id="path57" + d="m 542.0392,174.07111 0,111.55905 -37.16534,0 0,-111.55905 z" + style="fill:#ff8434;fill-rule:evenodd" /> + <text + id="text4480" + y="54.570911" + x="24.425898" + style="font-style:normal;font-weight:normal;font-size:18.75px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + y="54.570911" + x="24.425898" + id="tspan4482">ETH 0</tspan></text> + <text + id="text4480-3" + y="103.53807" + x="25.51882" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502" + y="103.53807" + x="25.51882">ETH 1</tspan></text> + <text + id="text4480-7" + y="86.200645" + x="103.15979" + style="font-style:normal;font-weight:normal;font-size:18.75px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + id="tspan4524" + 
y="86.200645" + x="103.15979">NFB-200G2QL card</tspan></text> + <text + id="text4480-7-3" + y="169.2041" + x="92.195312" + style="font-style:normal;font-weight:normal;font-size:18.75px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4546" + y="169.2041" + x="92.195312">PCI-E master slot</tspan></text> + <text + id="text4480-7-3-6" + y="169.20409" + x="367.98856" + style="font-style:normal;font-weight:normal;font-size:18.75px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4546-2" + y="169.20409" + x="367.98856">PCI-E slave slot</tspan></text> + <text + transform="matrix(0,1,-1,0,0,0)" + id="text4480-3-9" + y="-73.591309" + x="182.29367" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502-1" + y="-73.591309" + x="182.29367">QUEUE 0</tspan></text> + <text + transform="matrix(0,1.0000002,-0.99999976,0,0,0)" + id="text4480-3-9-2" + y="-251.11163" + x="182.29283" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502-1-7" + y="-251.11163" + x="182.29283">QUEUE 15</tspan></text> + <text + 
transform="matrix(0,1,-1,0,0,0)" + id="text4480-3-9-2-0" + y="-341.94324" + x="182.29311" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502-1-7-9" + y="-341.94324" + x="182.29311">QUEUE 16</tspan></text> + <text + transform="matrix(0,1,-1,0,0,0)" + id="text4480-3-9-2-3" + y="-517.70703" + x="182.29356" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502-1-7-6" + y="-517.70703" + x="182.29356">QUEUE 31</tspan></text> + <text + id="text4480-3-0" + y="299.21396" + x="128.3978" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502-6" + y="299.21396" + x="128.3978">CPU 0</tspan></text> + <text + id="text4480-3-0-2" + y="299.21396" + x="407.88452" + style="font-style:normal;font-weight:normal;font-size:20px;line-height:125%;font-family:sans-serif;letter-spacing:0px;word-spacing:0px;fill:#000000;fill-opacity:1;stroke:none;stroke-width:1px;stroke-linecap:butt;stroke-linejoin:miter;stroke-opacity:1" + xml:space="preserve"><tspan + style="font-size:18.75px" + id="tspan4502-6-6" + y="299.21396" + x="407.88452">CPU 1</tspan></text> + </g> +</svg> diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index 59419f43..59f6063d 100644 --- 
a/doc/guides/nics/index.rst +++ b/doc/guides/nics/index.rst @@ -13,6 +13,7 @@ Network Interface Controller Drivers build_and_test ark avp + axgbe bnx2x bnxt cxgbe @@ -23,6 +24,7 @@ Network Interface Controller Drivers enic fm10k i40e + ifc igb ixgbe intel_vf @@ -30,11 +32,13 @@ Network Interface Controller Drivers liquidio mlx4 mlx5 - mrvl + mvpp2 + netvsc nfp octeontx qede sfc_efx + softnic szedata2 tap thunderx diff --git a/doc/guides/nics/ixgbe.rst b/doc/guides/nics/ixgbe.rst index 0c660f29..16d63902 100644 --- a/doc/guides/nics/ixgbe.rst +++ b/doc/guides/nics/ixgbe.rst @@ -68,15 +68,15 @@ Other features are supported using optional MACRO configuration. They include: * HW extend dual VLAN -To guarantee the constraint, configuration flags in dev_conf.rxmode will be checked: +To guarantee the constraint, capabilities in dev_conf.rxmode.offloads will be checked: -* hw_vlan_strip +* DEV_RX_OFFLOAD_VLAN_STRIP -* hw_vlan_extend +* DEV_RX_OFFLOAD_VLAN_EXTEND -* hw_ip_checksum +* DEV_RX_OFFLOAD_CHECKSUM -* header_split +* DEV_RX_OFFLOAD_HEADER_SPLIT * dev_conf @@ -102,20 +102,9 @@ Consequently, by default the tx_rs_thresh value is in the range 32 to 64. Feature not Supported by TX Vector PMD ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -TX vPMD only works when txq_flags is set to IXGBE_SIMPLE_FLAGS. +TX vPMD only works when offloads is set to 0 -This means that it does not support TX multi-segment, VLAN offload and TX csum offload. -The following MACROs are used for these three features: - -* ETH_TXQ_FLAGS_NOMULTSEGS - -* ETH_TXQ_FLAGS_NOVLANOFFL - -* ETH_TXQ_FLAGS_NOXSUMSCTP - -* ETH_TXQ_FLAGS_NOXSUMUDP - -* ETH_TXQ_FLAGS_NOXSUMTCP +This means that it does not support any TX offload. Application Programming Interface --------------------------------- @@ -130,13 +119,13 @@ l3fwd ~~~~~ When running l3fwd with vPMD, there is one thing to note. -In the configuration, ensure that port_conf.rxmode.hw_ip_checksum=0. 
+In the configuration, ensure that DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads is NOT set. Otherwise, by default, RX vPMD is disabled. load_balancer ~~~~~~~~~~~~~ -As in the case of l3fwd, set configure port_conf.rxmode.hw_ip_checksum=0 to enable vPMD. +As in the case of l3fwd, to enable vPMD, do NOT set DEV_RX_OFFLOAD_CHECKSUM in port_conf.rxmode.offloads. In addition, for improved performance, use -bsz "(32,32),(64,64),(32,32)" in load_balancer to avoid using the default burst size of 144. @@ -228,6 +217,20 @@ For more details see the IPsec Security Gateway Sample Application and Security library documentation. +Virtual Function Port Representors +---------------------------------- +The IXGBE PF PMD supports the creation of VF port representors for the control +and monitoring of IXGBE virtual function devices. Each port representor +corresponds to a single virtual function of that device. Using the ``devargs`` +option ``representor`` the user can specify which virtual functions to create +port representors for on initialization of the PF PMD by passing the VF IDs of +the VFs which are required.:: + + -w DBDF,representor=[0,1,4] + +Currently hot-plugging of representor ports is not supported so all required +representors must be specified on the creation of the PF. + Supported Chipsets and NICs --------------------------- diff --git a/doc/guides/nics/liquidio.rst b/doc/guides/nics/liquidio.rst index 61485ade..87b42cdc 100644 --- a/doc/guides/nics/liquidio.rst +++ b/doc/guides/nics/liquidio.rst @@ -201,6 +201,4 @@ Number of descriptors for Rx/Tx ring should be in the range 128 to 512. CRC striping ~~~~~~~~~~~~ -LiquidIO adapters strip ethernet FCS of every packet coming to the host -interface. So, CRC will be stripped even when the ``rxmode.hw_strip_crc`` -member is set to 0 in ``struct rte_eth_conf``. +LiquidIO adapters strip ethernet FCS of every packet coming to the host interface. 
diff --git a/doc/guides/nics/mlx4.rst b/doc/guides/nics/mlx4.rst index 98b97166..4a57c7a6 100644 --- a/doc/guides/nics/mlx4.rst +++ b/doc/guides/nics/mlx4.rst @@ -1,32 +1,6 @@ -.. BSD LICENSE +.. SPDX-License-Identifier: BSD-3-Clause Copyright 2012 6WIND S.A. - Copyright 2015 Mellanox - - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - * Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - * Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in - the documentation and/or other materials provided with the - distribution. - * Neither the name of 6WIND S.A. nor the names of its - contributors may be used to endorse or promote products derived - from this software without specific prior written permission. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + Copyright 2015 Mellanox Technologies, Ltd MLX4 poll mode driver library ============================= @@ -98,9 +72,10 @@ These options can be modified in the ``.config`` file. missing with ``ldd(1)``. 
It works by moving these dependencies to a purpose-built rdma-core "glue" - plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if - set, or in a standard location for the dynamic linker (e.g. ``/lib``) if - left to the default empty string (``""``). + plug-in which must either be installed in a directory whose name is based + on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a + standard location for the dynamic linker (e.g. ``/lib``) if left to the + default empty string (``""``). This option has no performance impact. @@ -110,14 +85,6 @@ These options can be modified in the ``.config`` file. adds additional run-time checks and debugging messages at the cost of lower performance. -- ``CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE`` (default **8**) - - Maximum number of cached memory pools (MPs) per TX queue. Each MP from - which buffers are to be transmitted must be associated to memory regions - (MRs). This is a slow operation that must be cached. - - This value is always 1 for RX queues since they use a single MP. - Environment variables ~~~~~~~~~~~~~~~~~~~~~ @@ -168,6 +135,16 @@ below. following limitation: VLAN filtering is not supported with this mode. This is the recommended mode in case VLAN filter is not needed. +Limitations +----------- + +- CRC stripping is supported by default and always reported as "true". + The ability to enable/disable CRC stripping requires OFED version + 4.3-1.5.0.0 and above or rdma-core version v18 and above. + +- TSO (Transmit Segmentation Offload) is supported in OFED version + 4.4 and above. + Prerequisites ------------- @@ -236,7 +213,7 @@ Current RDMA core package and Linux kernel (recommended) Mellanox OFED as a fallback ~~~~~~~~~~~~~~~~~~~~~~~~~~~ -- `Mellanox OFED`_ version: **4.2, 4.3**. +- `Mellanox OFED`_ version: **4.3, 4.4**. - firmware version: **2.42.5000** and above. .. 
_`Mellanox OFED`: http://www.mellanox.com/page/products_dyn?product_family=26&mtag=linux_sw_drivers @@ -396,6 +373,12 @@ Performance tuning The XXX can be different on different systems. Make sure to configure according to the setpci output. +6. To minimize overhead of searching Memory Regions: + + - '--socket-mem' is recommended to pin memory by predictable amount. + - Configure per-lcore cache when creating Mempools for packet buffer. + - Refrain from dynamically allocating/freeing memory in run-time. + Usage example ------------- diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index 0e6e525c..52e1213c 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -1,40 +1,14 @@ -.. BSD LICENSE +.. SPDX-License-Identifier: BSD-3-Clause Copyright 2015 6WIND S.A. - Copyright 2015 Mellanox - - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - * Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - * Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in - the documentation and/or other materials provided with the - distribution. - * Neither the name of 6WIND S.A. nor the names of its - contributors may be used to endorse or promote products derived - from this software without specific prior written permission. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT - OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + Copyright 2015 Mellanox Technologies, Ltd MLX5 poll mode driver ===================== The MLX5 poll mode driver library (**librte_pmd_mlx5**) provides support -for **Mellanox ConnectX-4**, **Mellanox ConnectX-4 Lx** and **Mellanox -ConnectX-5** families of 10/25/40/50/100 Gb/s adapters as well as their -virtual functions (VF) in SR-IOV context. +for **Mellanox ConnectX-4**, **Mellanox ConnectX-4 Lx** , **Mellanox +ConnectX-5** and **Mellanox Bluefield** families of 10/25/40/50/100 Gb/s +adapters as well as their virtual functions (VF) in SR-IOV context. Information and documentation about these adapters can be found on the `Mellanox website <http://www.mellanox.com>`__. Help is also provided by the @@ -75,7 +49,7 @@ libibverbs. Features -------- -- Multi arch support: x86_64, POWER8, ARMv8. +- Multi arch support: x86_64, POWER8, ARMv8, i686. - Multiple TX and RX queues. - Support for scattered TX and RX frames. - IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues. @@ -95,17 +69,17 @@ Features - Multiple process. - KVM and VMware ESX SR-IOV modes are supported. - RSS hash result is supported. -- Hardware TSO. -- Hardware checksum TX offload for VXLAN and GRE. +- Hardware TSO for generic IP or UDP tunnel, including VXLAN and GRE. +- Hardware checksum Tx offload for generic IP or UDP tunnel, including VXLAN and GRE. - RX interrupts. - Statistics query including Basic, Extended and per queue. - Rx HW timestamp. 
+- Tunnel types: VXLAN, L3 VXLAN, VXLAN-GPE, GRE, MPLSoGRE, MPLSoUDP. +- Tunnel HW offloads: packet type, inner/outer RSS, IP and UDP checksum verification. Limitations ----------- -- Inner RSS for VXLAN frames is not supported yet. -- Hardware checksum RX offloads for VXLAN inner header are not supported yet. - For secondary process: - Forked secondary process not supported. @@ -131,11 +105,37 @@ Limitations - A multi segment packet must have less than 6 segments in case the Tx burst function is set to multi-packet send or Enhanced multi-packet send. Otherwise it must have less than 50 segments. + - Count action for RTE flow is **only supported in Mellanox OFED**. + - Flows with a VXLAN Network Identifier equal (or ends to be equal) to 0 are not supported. + - VXLAN TSO and checksum offloads are not supported on VM. +- L3 VXLAN and VXLAN-GPE tunnels cannot be supported together with MPLSoGRE and MPLSoUDP. + +- VF: flow rules created on VF devices can only match traffic targeted at the + configured MAC addresses (see ``rte_eth_dev_mac_addr_add()``). + +.. note:: + + MAC addresses not already present in the bridge table of the associated + kernel network device will be added and cleaned up by the PMD when closing + the device. In case of ungraceful program termination, some entries may + remain present and should be removed manually by other means. + +- When Multi-Packet Rx queue is configured (``mprq_en``), a Rx packet can be + externally attached to a user-provided mbuf with having EXT_ATTACHED_MBUF in + ol_flags. As the mempool for the external buffer is managed by PMD, all the + Rx mbufs must be freed before the device is closed. Otherwise, the mempool of + the external buffers will be freed by PMD and the application which still + holds the external buffers may be corrupted. + +- If Multi-Packet Rx queue is configured (``mprq_en``) and Rx CQE compression is + enabled (``rxq_cqe_comp_en``) at the same time, RSS hash result is not fully + supported. 
Some Rx packets may not have PKT_RX_RSS_HASH. + Statistics ---------- @@ -171,9 +171,10 @@ These options can be modified in the ``.config`` file. missing with ``ldd(1)``. It works by moving these dependencies to a purpose-built rdma-core "glue" - plug-in, which must either be installed in ``CONFIG_RTE_EAL_PMD_PATH`` if - set, or in a standard location for the dynamic linker (e.g. ``/lib``) if - left to the default empty string (``""``). + plug-in which must either be installed in a directory whose name is based + on ``CONFIG_RTE_EAL_PMD_PATH`` suffixed with ``-glue`` if set, or in a + standard location for the dynamic linker (e.g. ``/lib``) if left to the + default empty string (``""``). This option has no performance impact. @@ -183,14 +184,6 @@ These options can be modified in the ``.config`` file. adds additional run-time checks and debugging messages at the cost of lower performance. -- ``CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE`` (default **8**) - - Maximum number of cached memory pools (MPs) per TX queue. Each MP from - which buffers are to be transmitted must be associated to memory regions - (MRs). This is a slow operation that must be cached. - - This value is always 1 for RX queues since they use a single MP. - Environment variables ~~~~~~~~~~~~~~~~~~~~~ @@ -250,8 +243,55 @@ Run-time configuration Supported on: - - x86_64 with ConnectX-4, ConnectX-4 LX and ConnectX-5. - - POWER8 and ARMv8 with ConnectX-4 LX and ConnectX-5. + - x86_64 with ConnectX-4, ConnectX-4 LX, ConnectX-5 and Bluefield. + - POWER8 and ARMv8 with ConnectX-4 LX, ConnectX-5 and Bluefield. + +- ``mprq_en`` parameter [int] + + A nonzero value enables configuring Multi-Packet Rx queues. Rx queue is + configured as Multi-Packet RQ if the total number of Rx queues is + ``rxqs_min_mprq`` or more and Rx scatter isn't configured. Disabled by + default. + + Multi-Packet Rx Queue (MPRQ a.k.a Striding RQ) can further save PCIe bandwidth + by posting a single large buffer for multiple packets. 
Instead of posting a + buffer per packet, one large buffer is posted in order to receive multiple + packets on the buffer. A MPRQ buffer consists of multiple fixed-size strides + and each stride receives one packet. MPRQ can improve throughput for + small-packet traffic. + + When MPRQ is enabled, max_rx_pkt_len can be larger than the size of + user-provided mbuf even if DEV_RX_OFFLOAD_SCATTER isn't enabled. PMD will + configure large stride size enough to accommodate max_rx_pkt_len as long as + device allows. Note that this can waste system memory compared to enabling Rx + scatter and multi-segment packet. + +- ``mprq_log_stride_num`` parameter [int] + + Log 2 of the number of strides for Multi-Packet Rx queue. Configuring more + strides can reduce PCIe traffic further. If configured value is not in the + range of device capability, the default value will be set with a warning + message. The default value is 4 which is 16 strides per buffer, valid only + if ``mprq_en`` is set. + + The size of Rx queue should be bigger than the number of strides. + +- ``mprq_max_memcpy_len`` parameter [int] + + The maximum length of packet to memcpy in case of Multi-Packet Rx queue. Rx + packet is mem-copied to a user-provided mbuf if the size of Rx packet is less + than or equal to this parameter. Otherwise, PMD will attach the Rx packet to + the mbuf by external buffer attachment - ``rte_pktmbuf_attach_extbuf()``. + A mempool for external buffers will be allocated and managed by PMD. If Rx + packet is externally attached, ol_flags field of the mbuf will have + EXT_ATTACHED_MBUF and this flag must be preserved. ``RTE_MBUF_HAS_EXTBUF()`` + checks the flag. The default value is 128, valid only if ``mprq_en`` is set. + +- ``rxqs_min_mprq`` parameter [int] + + Configure Rx queues as Multi-Packet RQ if the total number of Rx queues is + greater than or equal to this value. The default value is 12, valid only if + ``mprq_en`` is set.
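Reviewer note: the four MPRQ parameters above interact, so a small model may help when reading the hunk. The following is a minimal sketch in Python, not the PMD's code. The class and function names are hypothetical, and only the default values and comparison rules are taken from the added text.

```python
from dataclasses import dataclass


@dataclass
class MprqConfig:
    """Hypothetical model of the mlx5 MPRQ devargs; defaults come from the text."""
    mprq_en: int = 0                # disabled by default
    mprq_log_stride_num: int = 4    # log2 of the number of strides per buffer
    mprq_max_memcpy_len: int = 128  # largest packet (bytes) that is mem-copied
    rxqs_min_mprq: int = 12         # minimum Rx queue count for MPRQ


def strides_per_buffer(cfg):
    # The parameter is a log2 value, so the stride count is 2 ** value.
    return 1 << cfg.mprq_log_stride_num


def uses_mprq(cfg, num_rx_queues, rx_scatter):
    # MPRQ applies only when enabled, with enough Rx queues, and without Rx scatter.
    return bool(cfg.mprq_en) and num_rx_queues >= cfg.rxqs_min_mprq and not rx_scatter


def rx_delivery(cfg, pkt_len):
    # Packets up to the threshold are copied; larger ones are attached externally.
    return "memcpy" if pkt_len <= cfg.mprq_max_memcpy_len else "attach_extbuf"
```

With the defaults, `strides_per_buffer(MprqConfig())` gives 16, and a port with 11 Rx queues stays on the regular Rx path even when `mprq_en=1`. The sketch leaves out the device capability check that the text says falls back to the default stride count with a warning.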
- ``txq_inline`` parameter [int] @@ -270,34 +310,35 @@ Run-time configuration This option should be used in combination with ``txq_inline`` above. - On ConnectX-4, ConnectX-4 LX and ConnectX-5 without Enhanced MPW: + On ConnectX-4, ConnectX-4 LX, ConnectX-5 and Bluefield without + Enhanced MPW: - Disabled by default. - In case ``txq_inline`` is set recommendation is 4. - On ConnectX-5 with Enhanced MPW: + On ConnectX-5 and Bluefield with Enhanced MPW: - Set to 8 by default. - ``txq_mpw_en`` parameter [int] A nonzero value enables multi-packet send (MPS) for ConnectX-4 Lx and - enhanced multi-packet send (Enhanced MPS) for ConnectX-5. MPS allows the - TX burst function to pack up multiple packets in a single descriptor - session in order to save PCI bandwidth and improve performance at the - cost of a slightly higher CPU usage. When ``txq_inline`` is set along - with ``txq_mpw_en``, TX burst function tries to copy entire packet data - on to TX descriptor instead of including pointer of packet only if there - is enough room remained in the descriptor. ``txq_inline`` sets - per-descriptor space for either pointers or inlined packets. In addition, - Enhanced MPS supports hybrid mode - mixing inlined packets and pointers - in the same descriptor. + enhanced multi-packet send (Enhanced MPS) for ConnectX-5 and Bluefield. + MPS allows the TX burst function to pack up multiple packets in a + single descriptor session in order to save PCI bandwidth and improve + performance at the cost of a slightly higher CPU usage. When + ``txq_inline`` is set along with ``txq_mpw_en``, TX burst function tries + to copy entire packet data on to TX descriptor instead of including + pointer of packet only if there is enough room remained in the + descriptor. ``txq_inline`` sets per-descriptor space for either pointers + or inlined packets. In addition, Enhanced MPS supports hybrid mode - + mixing inlined packets and pointers in the same descriptor. 
This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO, DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, DEV_TX_OFFLOAD_VLAN_INSERT``. When those offloads are requested the MPS send function will not be used. - It is currently only supported on the ConnectX-4 Lx and ConnectX-5 + It is currently only supported on the ConnectX-4 Lx, ConnectX-5 and Bluefield families of adapters. Enabled by default. - ``txq_mpw_hdr_dseg_en`` parameter [int] @@ -318,14 +359,14 @@ Run-time configuration - ``tx_vec_en`` parameter [int] - A nonzero value enables Tx vector on ConnectX-5 only NIC if the number of + A nonzero value enables Tx vector on ConnectX-5 and Bluefield NICs if the number of global Tx queues on the port is lesser than MLX5_VPMD_MIN_TXQS. This option cannot be used with certain offloads such as ``DEV_TX_OFFLOAD_TCP_TSO, DEV_TX_OFFLOAD_VXLAN_TNL_TSO, DEV_TX_OFFLOAD_GRE_TNL_TSO, DEV_TX_OFFLOAD_VLAN_INSERT``. When those offloads are requested the MPS send function will not be used. - Enabled by default on ConnectX-5. + Enabled by default on ConnectX-5 and Bluefield. - ``rx_vec_en`` parameter [int] @@ -334,6 +375,53 @@ Run-time configuration Enabled by default. +- ``vf_nl_en`` parameter [int] + + A nonzero value enables Netlink requests from the VF to add/remove MAC + addresses and/or enable/disable promiscuous/all multicast on the Netdevice. + Otherwise the relevant configuration must be run with Linux iproute2 tools. + This is a prerequisite to receive this kind of traffic. + + Enabled by default, valid only on VF devices, ignored otherwise. + +- ``l3_vxlan_en`` parameter [int] + + A nonzero value allows L3 VXLAN and VXLAN-GPE flow creation. To enable + L3 VXLAN or VXLAN-GPE, users have to configure firmware and enable this + parameter. This is a prerequisite to receive this kind of traffic. + + Disabled by default.
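Reviewer note: the run-time parameters in this hunk are passed as device arguments appended to the PCI address, in the same ``-w DBDF,key=value`` form that the ixgbe hunk uses for ``representor``. A minimal Python sketch of composing such an argument follows. The helper name and the PCI address are hypothetical, and the sketch does not check that a key is a real mlx5 parameter.

```python
def build_whitelist_arg(pci_addr, **devargs):
    """Compose an EAL '-w' argument from a PCI address and key=value devargs."""
    parts = [pci_addr]
    # Keyword arguments keep their insertion order, so the output is stable.
    for key, value in devargs.items():
        parts.append(f"{key}={value}")
    return "-w " + ",".join(parts)


# Hypothetical device address; the parameter names are the ones documented above.
print(build_whitelist_arg("0000:03:00.0", l3_vxlan_en=1, vf_nl_en=0))
```

Setting ``l3_vxlan_en=1`` this way covers only the PMD side. The text also requires the firmware to be configured for L3 VXLAN or VXLAN-GPE.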
+ +- ``representor`` parameter [list] + + This parameter can be used to instantiate DPDK Ethernet devices from + existing port (or VF) representors configured on the device. + + It is a standard parameter whose format is described in + :ref:`ethernet_device_standard_device_arguments`. + + For instance, to probe port representors 0 through 2:: + + representor=[0-2] + +Firmware configuration +~~~~~~~~~~~~~~~~~~~~~~ + +- L3 VXLAN and VXLAN-GPE destination UDP port + + .. code-block:: console + + mlxconfig -d <mst device> set IP_OVER_VXLAN_EN=1 + mlxconfig -d <mst device> set IP_OVER_VXLAN_PORT=<udp dport> + + Verify configurations are set: + + .. code-block:: console + + mlxconfig -d <mst device> query | grep IP_OVER_VXLAN + IP_OVER_VXLAN_EN True(1) + IP_OVER_VXLAN_PORT <udp dport> + Prerequisites ------------- @@ -353,12 +441,19 @@ DPDK and must be installed separately: - **libmlx5** - Low-level user space driver library for Mellanox ConnectX-4/ConnectX-5 - devices, it is automatically loaded by libibverbs. + Low-level user space driver library for Mellanox + ConnectX-4/ConnectX-5/Bluefield devices, it is automatically loaded + by libibverbs. This library basically implements send/receive calls to the hardware queues. +- **libmnl** + + Minimalistic Netlink library mainly relied on to manage E-Switch flow + rules (i.e. those with the "transfer" attribute and typically involving + port representors). + - **Kernel modules** They provide the kernel-side Verbs API and low level device drivers that @@ -368,15 +463,16 @@ DPDK and must be installed separately: Unlike most other PMDs, these modules must remain loaded and bound to their devices: - - mlx5_core: hardware driver managing Mellanox ConnectX-4/ConnectX-5 - devices and related Ethernet kernel network devices. + - mlx5_core: hardware driver managing Mellanox + ConnectX-4/ConnectX-5/Bluefield devices and related Ethernet kernel + network devices. - mlx5_ib: InifiniBand device driver. 
- ib_uverbs: user space driver for Verbs (entry point for libibverbs). - **Firmware update** - Mellanox OFED releases include firmware updates for ConnectX-4/ConnectX-5 - adapters. + Mellanox OFED releases include firmware updates for + ConnectX-4/ConnectX-5/Bluefield adapters. Because each release provides new features, these updates must be applied to match the kernel modules and libraries they come with. @@ -399,6 +495,10 @@ RMDA Core with Linux Kernel - Minimal kernel version : v4.14 or the most recent 4.14-rc (see `Linux installation documentation`_) - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull request #227 from yishaih/tm") (see `RDMA Core installation documentation`_) +- When building for i686 use: + + - rdma-core version 18.0 or above built with 32bit support. + - Kernel version 4.14.41 or above. .. _`Linux installation documentation`: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/plain/Documentation/admin-guide/README.rst .. _`RDMA Core installation documentation`: https://raw.githubusercontent.com/linux-rdma/rdma-core/master/README.md @@ -406,13 +506,14 @@ RMDA Core with Linux Kernel Mellanox OFED ^^^^^^^^^^^^^ -- Mellanox OFED version: **4.2, 4.3**. +- Mellanox OFED version: **4.3, 4.4**. - firmware version: - ConnectX-4: **12.21.1000** and above. - ConnectX-4 Lx: **14.21.1000** and above. - ConnectX-5: **16.21.1000** and above. - ConnectX-5 Ex: **16.21.1000** and above. + - Bluefield: **18.99.3950** and above. While these libraries and kernel modules are available on OpenFabrics Alliance's `website <https://www.openfabrics.org/>`__ and provided by package @@ -431,6 +532,19 @@ required from that distribution. this DPDK release was developed and tested against is strongly recommended. Please check the `prerequisites`_. +Libmnl +^^^^^^ + +Minimal version for libmnl is **1.0.3**. + +As a dependency of the **iproute2** suite, this library is often installed +by default. 
It is otherwise readily available through standard system +packages. + +Its development headers must be installed in order to compile this PMD. +These packages are usually named **libmnl-dev** or **libmnl-devel** +depending on the Linux distribution. + Supported NICs -------------- @@ -603,6 +717,12 @@ Performance tuning The XXX can be different on different systems. Make sure to configure according to the setpci output. +7. To minimize overhead of searching Memory Regions: + + - Use ``--socket-mem`` to pin a predictable amount of memory. + - Configure a per-lcore cache when creating mempools for packet buffers. + - Refrain from dynamically allocating/freeing memory at run time. + Notes for testpmd ----------------- @@ -624,7 +744,7 @@ Usage example ------------- This section demonstrates how to launch **testpmd** with Mellanox -ConnectX-4/ConnectX-5 devices managed by librte_pmd_mlx5. +ConnectX-4/ConnectX-5/Bluefield devices managed by librte_pmd_mlx5. #. Load the kernel modules: diff --git a/doc/guides/nics/mrvl.rst b/doc/guides/nics/mrvl.rst deleted file mode 100644 index b7f32921..00000000 --- a/doc/guides/nics/mrvl.rst +++ /dev/null @@ -1,275 +0,0 @@ -.. BSD LICENSE - Copyright(c) 2017 Marvell International Ltd. - Copyright(c) 2017 Semihalf. - All rights reserved. - - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - * Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - * Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in - the documentation and/or other materials provided with the - distribution. - * Neither the name of the copyright holder nor the names of its - contributors may be used to endorse or promote products derived - from this software without specific prior written permission.
- - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - -.. _mrvl_poll_mode_driver: - -MRVL Poll Mode Driver -====================== - -The MRVL PMD (librte_pmd_mrvl) provides poll mode driver support -for the Marvell PPv2 (Packet Processor v2) 1/10 Gbps adapter. - -Detailed information about SoCs that use PPv2 can be obtained here: - -* https://www.marvell.com/embedded-processors/armada-70xx/ -* https://www.marvell.com/embedded-processors/armada-80xx/ - -.. Note:: - - Due to external dependencies, this driver is disabled by default. It must - be enabled manually by setting relevant configuration option manually. - Please refer to `Config File Options`_ section for further details. - - -Features --------- - -Features of the MRVL PMD are: - -- Speed capabilities -- Link status -- Queue start/stop -- MTU update -- Jumbo frame -- Promiscuous mode -- Allmulticast mode -- Unicast MAC filter -- Multicast MAC filter -- RSS hash -- VLAN filter -- CRC offload -- L3 checksum offload -- L4 checksum offload -- Packet type parsing -- Basic stats -- QoS - - -Limitations ------------ - -- Number of lcores is limited to 9 by MUSDK internal design. If more lcores - need to be allocated, locking will have to be considered. 
Number of available - lcores can be changed via ``MRVL_MUSDK_HIFS_RESERVED`` define in - ``mrvl_ethdev.c`` source file. - -- Flushing vlans added for filtering is not possible due to MUSDK missing - functionality. Current workaround is to reset board so that PPv2 has a - chance to start in a sane state. - - -Prerequisites -------------- - -- Custom Linux Kernel sources - - .. code-block:: console - - git clone https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b linux-4.4.52-armada-17.10 - -- Out of tree `mvpp2x_sysfs` kernel module sources - - .. code-block:: console - - git clone https://github.com/MarvellEmbeddedProcessors/mvpp2x-marvell.git -b mvpp2x-armada-17.10 - -- MUSDK (Marvell User-Space SDK) sources - - .. code-block:: console - - git clone https://github.com/MarvellEmbeddedProcessors/musdk-marvell.git -b musdk-armada-17.10 - - MUSDK is a light-weight library that provides direct access to Marvell's - PPv2 (Packet Processor v2). Alternatively prebuilt MUSDK library can be - requested from `Marvell Extranet <https://extranet.marvell.com>`_. Once - approval has been granted, library can be found by typing ``musdk`` in - the search box. - - MUSDK must be configured with the following features: - - .. code-block:: console - - --enable-bpool-dma=64 - -- DPDK environment - - Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to setup - DPDK environment. - - -Config File Options -------------------- - -The following options can be modified in the ``config`` file. - -- ``CONFIG_RTE_LIBRTE_MRVL_PMD`` (default ``n``) - - Toggle compilation of the librte_pmd_mrvl driver. - - -QoS Configuration ------------------ - -QoS configuration is done through external configuration file. Path to the -file must be given as `cfg` in driver's vdev parameter list. - -Configuration syntax -~~~~~~~~~~~~~~~~~~~~ - -.. 
code-block:: console - - [port <portnum> default] - default_tc = <default_tc> - mapping_priority = <mapping_priority> - - [port <portnum> tc <traffic_class>] - rxq = <rx_queue_list> - pcp = <pcp_list> - dscp = <dscp_list> - - [port <portnum> tc <traffic_class>] - rxq = <rx_queue_list> - pcp = <pcp_list> - dscp = <dscp_list> - -Where: - -- ``<portnum>``: DPDK Port number (0..n). - -- ``<default_tc>``: Default traffic class (e.g. 0) - -- ``<mapping_priority>``: QoS priority for mapping (`ip`, `vlan`, `ip/vlan` or `vlan/ip`). - -- ``<traffic_class>``: Traffic Class to be configured. - -- ``<rx_queue_list>``: List of DPDK RX queues (e.g. 0 1 3-4) - -- ``<pcp_list>``: List of PCP values to handle in particular TC (e.g. 0 1 3-4 7). - -- ``<dscp_list>``: List of DSCP values to handle in particular TC (e.g. 0-12 32-48 63). - -Setting PCP/DSCP values for the default TC is not required. All PCP/DSCP -values not assigned explicitly to particular TC will be handled by the -default TC. - -Configuration file example -^^^^^^^^^^^^^^^^^^^^^^^^^^ - -.. code-block:: console - - [port 0 default] - default_tc = 0 - qos_mode = ip - - [port 0 tc 0] - rxq = 0 1 - - [port 0 tc 1] - rxq = 2 - pcp = 5 6 7 - dscp = 26-38 - - [port 1 default] - default_tc = 0 - qos_mode = vlan/ip - - [port 1 tc 0] - rxq = 0 - - [port 1 tc 1] - rxq = 1 2 - pcp = 5 6 7 - dscp = 26-38 - -Usage example -^^^^^^^^^^^^^ - -.. code-block:: console - - ./testpmd --vdev=eth_mrvl,iface=eth0,iface=eth2,cfg=/home/user/mrvl.conf \ - -c 7 -- -i -a --rxq=2 - - -Building DPDK -------------- - -Driver needs precompiled MUSDK library during compilation. - -.. code-block:: console - - export CROSS_COMPILE=<toolchain>/bin/aarch64-linux-gnu- - ./bootstrap - ./configure --host=aarch64-linux-gnu --enable-bpool-dma=64 - make install - -MUSDK will be installed to `usr/local` under current directory. -For the detailed build instructions please consult ``doc/musdk_get_started.txt``. 
- -Before the DPDK build process the environmental variable ``LIBMUSDK_PATH`` with -the path to the MUSDK installation directory needs to be exported. - -.. code-block:: console - - export LIBMUSDK_PATH=<musdk>/usr/local - export CROSS=aarch64-linux-gnu- - make config T=arm64-armv8a-linuxapp-gcc - sed -ri 's,(MRVL_PMD=)n,\1y,' build/.config - make - -Usage Example -------------- - -MRVL PMD requires extra out of tree kernel modules to function properly. -`musdk_uio` and `mv_pp_uio` sources are part of the MUSDK. Please consult -``doc/musdk_get_started.txt`` for the detailed build instructions. -For `mvpp2x_sysfs` please consult ``Documentation/pp22_sysfs.txt`` for the -detailed build instructions. - -.. code-block:: console - - insmod musdk_uio.ko - insmod mv_pp_uio.ko - insmod mvpp2x_sysfs.ko - -Additionally interfaces used by DPDK application need to be put up: - -.. code-block:: console - - ip link set eth0 up - ip link set eth2 up - -In order to run testpmd example application following command can be used: - -.. code-block:: console - - ./testpmd --vdev=eth_mrvl,iface=eth0,iface=eth2 -c 7 -- \ - --burst=128 --txd=2048 --rxd=1024 --rxq=2 --txq=2 --nb-cores=2 \ - -i -a --rss-udp diff --git a/doc/guides/nics/mvpp2.rst b/doc/guides/nics/mvpp2.rst new file mode 100644 index 00000000..0408752c --- /dev/null +++ b/doc/guides/nics/mvpp2.rst @@ -0,0 +1,520 @@ +.. BSD LICENSE + Copyright(c) 2017 Marvell International Ltd. + Copyright(c) 2017 Semihalf. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. 
+ * Neither the name of the copyright holder nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +.. _mvpp2_poll_mode_driver: + +MVPP2 Poll Mode Driver +====================== + +The MVPP2 PMD (librte_pmd_mvpp2) provides poll mode driver support +for the Marvell PPv2 (Packet Processor v2) 1/10 Gbps adapter. + +Detailed information about SoCs that use PPv2 can be obtained here: + +* https://www.marvell.com/embedded-processors/armada-70xx/ +* https://www.marvell.com/embedded-processors/armada-80xx/ + +.. Note:: + + Due to external dependencies, this driver is disabled by default. It must + be enabled manually by setting relevant configuration option manually. + Please refer to `Config File Options`_ section for further details. 
+ + +Features +-------- + +Features of the MVPP2 PMD are: + +- Speed capabilities +- Link status +- Queue start/stop +- MTU update +- Jumbo frame +- Promiscuous mode +- Allmulticast mode +- Unicast MAC filter +- Multicast MAC filter +- RSS hash +- VLAN filter +- CRC offload +- L3 checksum offload +- L4 checksum offload +- Packet type parsing +- Basic stats +- Extended stats +- QoS +- RX flow control +- TX queue start/stop + + +Limitations +----------- + +- Number of lcores is limited to 9 by MUSDK internal design. If more lcores + need to be allocated, locking will have to be considered. Number of available + lcores can be changed via ``MRVL_MUSDK_HIFS_RESERVED`` define in + ``mrvl_ethdev.c`` source file. + +- Flushing vlans added for filtering is not possible due to MUSDK missing + functionality. Current workaround is to reset board so that PPv2 has a + chance to start in a sane state. + + +Prerequisites +------------- + +- Custom Linux Kernel sources + + .. code-block:: console + + git clone https://github.com/MarvellEmbeddedProcessors/linux-marvell.git -b linux-4.4.52-armada-17.10 + +- Out of tree `mvpp2x_sysfs` kernel module sources + + .. code-block:: console + + git clone https://github.com/MarvellEmbeddedProcessors/mvpp2x-marvell.git -b mvpp2x-armada-17.10 + +- MUSDK (Marvell User-Space SDK) sources + + .. code-block:: console + + git clone https://github.com/MarvellEmbeddedProcessors/musdk-marvell.git -b musdk-armada-17.10 + + MUSDK is a light-weight library that provides direct access to Marvell's + PPv2 (Packet Processor v2). Alternatively prebuilt MUSDK library can be + requested from `Marvell Extranet <https://extranet.marvell.com>`_. Once + approval has been granted, library can be found by typing ``musdk`` in + the search box. + + To get better understanding of the library one can consult documentation + available in the ``doc`` top level directory of the MUSDK sources. + + MUSDK must be configured with the following features: + + .. 
code-block:: console + + --enable-bpool-dma=64 + +- DPDK environment + + Follow the DPDK :ref:`Getting Started Guide for Linux <linux_gsg>` to setup + DPDK environment. + + +Config File Options +------------------- + +The following options can be modified in the ``config`` file. + +- ``CONFIG_RTE_LIBRTE_MVPP2_PMD`` (default ``n``) + + Toggle compilation of the librte mvpp2 driver. + + +QoS Configuration +----------------- + +QoS configuration is done through external configuration file. Path to the +file must be given as `cfg` in driver's vdev parameter list. + +Configuration syntax +~~~~~~~~~~~~~~~~~~~~ + +.. code-block:: console + + [port <portnum> default] + default_tc = <default_tc> + mapping_priority = <mapping_priority> + policer_enable = <policer_enable> + token_unit = <token_unit> + color = <color_mode> + cir = <cir> + ebs = <ebs> + cbs = <cbs> + + rate_limit_enable = <rate_limit_enable> + rate_limit = <rate_limit> + burst_size = <burst_size> + + [port <portnum> tc <traffic_class>] + rxq = <rx_queue_list> + pcp = <pcp_list> + dscp = <dscp_list> + default_color = <default_color> + + [port <portnum> tc <traffic_class>] + rxq = <rx_queue_list> + pcp = <pcp_list> + dscp = <dscp_list> + + [port <portnum> txq <txqnum>] + sched_mode = <sched_mode> + wrr_weight = <wrr_weight> + + rate_limit_enable = <rate_limit_enable> + rate_limit = <rate_limit> + burst_size = <burst_size> + +Where: + +- ``<portnum>``: DPDK Port number (0..n). + +- ``<default_tc>``: Default traffic class (e.g. 0) + +- ``<mapping_priority>``: QoS priority for mapping (`ip`, `vlan`, `ip/vlan` or `vlan/ip`). + +- ``<traffic_class>``: Traffic Class to be configured. + +- ``<rx_queue_list>``: List of DPDK RX queues (e.g. 0 1 3-4) + +- ``<pcp_list>``: List of PCP values to handle in particular TC (e.g. 0 1 3-4 7). + +- ``<dscp_list>``: List of DSCP values to handle in particular TC (e.g. 0-12 32-48 63). + +- ``<policer_enable>``: Enable ingress policer. 
+ +- ``<token_unit>``: Policer token unit (`bytes` or `packets`). + +- ``<color_mode>``: Policer color mode (`aware` or `blind`). + +- ``<cir>``: Committed information rate in unit of kilo bits per second (data rate) or packets per second. + +- ``<cbs>``: Committed burst size in unit of kilo bytes or number of packets. + +- ``<ebs>``: Excess burst size in unit of kilo bytes or number of packets. + +- ``<default_color>``: Default color for specific tc. + +- ``<rate_limit_enable>``: Enables per port or per txq rate limiting. + +- ``<rate_limit>``: Committed information rate, in kilo bits per second. + +- ``<burst_size>``: Committed burst size, in kilo bytes. + +- ``<sched_mode>``: Egress scheduler mode (`wrr` or `sp`). + +- ``<wrr_weight>``: Txq weight. + +Setting PCP/DSCP values for the default TC is not required. All PCP/DSCP +values not assigned explicitly to particular TC will be handled by the +default TC. + +Configuration file example +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +.. code-block:: console + + [port 0 default] + default_tc = 0 + mapping_priority = ip + + rate_limit_enable = 1 + rate_limit = 1000 + burst_size = 2000 + + [port 0 tc 0] + rxq = 0 1 + + [port 0 txq 0] + sched_mode = wrr + wrr_weight = 10 + + [port 0 txq 1] + sched_mode = wrr + wrr_weight = 100 + + [port 0 txq 2] + sched_mode = sp + + [port 0 tc 1] + rxq = 2 + pcp = 5 6 7 + dscp = 26-38 + + [port 1 default] + default_tc = 0 + mapping_priority = vlan/ip + + policer_enable = 1 + token_unit = bytes + color = blind + cir = 100000 + ebs = 64 + cbs = 64 + + [port 1 tc 0] + rxq = 0 + dscp = 10 + + [port 1 tc 1] + rxq = 1 + dscp = 11-20 + + [port 1 tc 2] + rxq = 2 + dscp = 30 + + [port 1 txq 0] + rate_limit_enable = 1 + rate_limit = 10000 + burst_size = 2000 + +Usage example +^^^^^^^^^^^^^ + +.. 
code-block:: console + + ./testpmd --vdev=eth_mvpp2,iface=eth0,iface=eth2,cfg=/home/user/mrvl.conf \ + -c 7 -- -i -a --disable-hw-vlan-strip --rxq=3 --txq=3 + + +Building DPDK +------------- + +Driver needs precompiled MUSDK library during compilation. + +.. code-block:: console + + export CROSS_COMPILE=<toolchain>/bin/aarch64-linux-gnu- + ./bootstrap + ./configure --host=aarch64-linux-gnu --enable-bpool-dma=64 + make install + +MUSDK will be installed to `usr/local` under current directory. +For the detailed build instructions please consult ``doc/musdk_get_started.txt``. + +Before the DPDK build process the environmental variable ``LIBMUSDK_PATH`` with +the path to the MUSDK installation directory needs to be exported. + +.. code-block:: console + + export LIBMUSDK_PATH=<musdk>/usr/local + export CROSS=aarch64-linux-gnu- + make config T=arm64-armv8a-linuxapp-gcc + sed -ri 's,(MVPP2_PMD=)n,\1y,' build/.config + make + +Flow API +-------- + +PPv2 offers packet classification capabilities via classifier engine which +can be configured via generic flow API offered by DPDK. + +Supported flow actions +~~~~~~~~~~~~~~~~~~~~~~ + +Following flow action items are supported by the driver: + +* DROP +* QUEUE + +Supported flow items +~~~~~~~~~~~~~~~~~~~~ + +Following flow items and their respective fields are supported by the driver: + +* ETH + + * source MAC + * destination MAC + * ethertype + +* VLAN + + * PCP + * VID + +* IPV4 + + * DSCP + * protocol + * source address + * destination address + +* IPV6 + + * flow label + * next header + * source address + * destination address + +* UDP + + * source port + * destination port + +* TCP + + * source port + * destination port + +Classifier match engine +~~~~~~~~~~~~~~~~~~~~~~~ + +Classifier has an internal match engine which can be configured to +operate in either exact or maskable mode. + +Mode is selected upon creation of the first unique flow rule as follows: + +* maskable, if key size is up to 8 bytes. 
+* exact, otherwise, i.e for keys bigger than 8 bytes. + +Where the key size equals the number of bytes of all fields specified +in the flow items. + +.. table:: Examples of key size calculation + + +----------------------------------------------------------------------------+-------------------+-------------+ + | Flow pattern | Key size in bytes | Used engine | + +============================================================================+===================+=============+ + | ETH (destination MAC) / VLAN (VID) | 6 + 2 = 8 | Maskable | + +----------------------------------------------------------------------------+-------------------+-------------+ + | VLAN (VID) / IPV4 (source address) | 2 + 4 = 6 | Maskable | + +----------------------------------------------------------------------------+-------------------+-------------+ + | TCP (source port, destination port) | 2 + 2 = 4 | Maskable | + +----------------------------------------------------------------------------+-------------------+-------------+ + | VLAN (priority) / IPV4 (source address) | 1 + 4 = 5 | Maskable | + +----------------------------------------------------------------------------+-------------------+-------------+ + | IPV4 (destination address) / UDP (source port, destination port) | 6 + 2 + 2 = 10 | Exact | + +----------------------------------------------------------------------------+-------------------+-------------+ + | VLAN (VID) / IPV6 (flow label, destination address) | 2 + 3 + 16 = 21 | Exact | + +----------------------------------------------------------------------------+-------------------+-------------+ + | IPV4 (DSCP, source address, destination address) | 1 + 4 + 4 = 9 | Exact | + +----------------------------------------------------------------------------+-------------------+-------------+ + | IPV6 (flow label, source address, destination address) | 3 + 16 + 16 = 35 | Exact | + 
+----------------------------------------------------------------------------+-------------------+-------------+ + +From the user perspective maskable mode means that masks specified +via flow rules are respected. In case of exact match mode, masks +which do not provide exact matching (all bits masked) are ignored. + +If the flow matches more than one classifier rule the first +(with the lowest index) matched takes precedence. + +Flow rules usage example +~~~~~~~~~~~~~~~~~~~~~~~~ + +Before proceeding run testpmd user application: + +.. code-block:: console + + ./testpmd --vdev=eth_mvpp2,iface=eth0,iface=eth2 -c 3 -- -i --p 3 -a --disable-hw-vlan-strip + +Example #1 +^^^^^^^^^^ + +.. code-block:: console + + testpmd> flow create 0 ingress pattern eth src is 10:11:12:13:14:15 / end actions drop / end + +In this case key size is 6 bytes thus maskable type is selected. Testpmd +will set mask to ff:ff:ff:ff:ff:ff, i.e. traffic explicitly matching +above rule will be dropped. + +Example #2 +^^^^^^^^^^ + +.. code-block:: console + + testpmd> flow create 0 ingress pattern ipv4 src spec 10.10.10.0 src mask 255.255.255.0 / tcp src spec 0x10 src mask 0x10 / end actions drop / end + +In this case key size is 8 bytes thus maskable type is selected. +Flows which have IPv4 source addresses ranging from 10.10.10.0 to 10.10.10.255 +and tcp source port set to 16 will be dropped. + +Example #3 +^^^^^^^^^^ + +.. code-block:: console + + testpmd> flow create 0 ingress pattern vlan vid spec 0x10 vid mask 0x10 / ipv4 src spec 10.10.1.1 src mask 255.255.0.0 dst spec 11.11.11.1 dst mask 255.255.255.0 / end actions drop / end + +In this case key size is 10 bytes thus exact type is selected. +Even though each item has partial mask set, masks will be ignored. +As a result only flows with VID set to 16 and IPv4 source and destination +addresses set to 10.10.1.1 and 11.11.11.1 respectively will be dropped.
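The engine selection rule from the "Classifier match engine" section (sum the sizes of all matched fields; up to 8 bytes selects maskable, anything larger selects exact) can be sketched as a small shell helper. This is only an illustration and not part of the driver: the ``item.field`` labels and both function names are hypothetical, and the per-field sizes are the ones read off the key-size table in that section.

```shell
#!/bin/sh
# Field sizes in bytes, as listed in the key-size table of this section.
# The "item.field" labels are hypothetical names used only by this sketch.
field_size() {
    case "$1" in
        eth.src|eth.dst)                 echo 6 ;;
        vlan.vid)                        echo 2 ;;
        vlan.pcp|ipv4.dscp)              echo 1 ;;
        ipv4.src|ipv4.dst)               echo 4 ;;
        ipv6.flow)                       echo 3 ;;
        ipv6.src|ipv6.dst)               echo 16 ;;
        tcp.src|tcp.dst|udp.src|udp.dst) echo 2 ;;
        *) echo "unknown field: $1" >&2; return 1 ;;
    esac
}

# Print "<key size> <engine>" for the fields given as arguments.
key_mode() {
    total=0
    for f in "$@"; do
        size=$(field_size "$f") || return 1
        total=$((total + size))
    done
    if [ "$total" -le 8 ]; then
        echo "$total maskable"   # key fits in 8 bytes
    else
        echo "$total exact"      # key larger than 8 bytes
    fi
}

key_mode eth.dst vlan.vid             # ETH (destination MAC) / VLAN (VID)
key_mode vlan.vid ipv4.src ipv4.dst   # pattern used in Example #3
```

The two calls print ``8 maskable`` and ``10 exact``, matching the first table row and Example #3. Because the table format is fixed by the first unique rule, running such a check before creating that rule shows which engine the whole table will use.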
+ +Limitations +~~~~~~~~~~~ + +Following limitations need to be taken into account while creating flow rules: + +* For IPv4 exact match type the key size must be up to 12 bytes. +* For IPv6 exact match type the key size must be up to 36 bytes. +* Following fields cannot be partially masked (all masks are treated as + if they were exact): + + * ETH: ethertype + * VLAN: PCP, VID + * IPv4: protocol + * IPv6: next header + * TCP/UDP: source port, destination port + +* Only one classifier table can be created thus all rules in the table + have to match table format. Table format is set during creation of + the first unique flow rule. +* Up to 5 fields can be specified per flow rule. +* Up to 20 flow rules can be added. + +For additional information about classifier please consult +``doc/musdk_cls_user_guide.txt``. + +Usage Example +------------- + +MVPP2 PMD requires extra out of tree kernel modules to function properly. +`musdk_uio` and `mv_pp_uio` sources are part of the MUSDK. Please consult +``doc/musdk_get_started.txt`` for the detailed build instructions. +For `mvpp2x_sysfs` please consult ``Documentation/pp22_sysfs.txt`` for the +detailed build instructions. + +.. code-block:: console + + insmod musdk_uio.ko + insmod mv_pp_uio.ko + insmod mvpp2x_sysfs.ko + +Additionally interfaces used by DPDK application need to be put up: + +.. code-block:: console + + ip link set eth0 up + ip link set eth2 up + +In order to run testpmd example application following command can be used: + +.. code-block:: console + + ./testpmd --vdev=eth_mvpp2,iface=eth0,iface=eth2 -c 7 -- \ + --burst=128 --txd=2048 --rxd=1024 --rxq=2 --txq=2 --nb-cores=2 \ + -i -a --rss-udp diff --git a/doc/guides/nics/netvsc.rst b/doc/guides/nics/netvsc.rst new file mode 100644 index 00000000..345f393c --- /dev/null +++ b/doc/guides/nics/netvsc.rst @@ -0,0 +1,105 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) Microsoft Corporation. 
+ +Netvsc poll mode driver +======================= + +The Netvsc Poll Mode driver (PMD) provides support for the paravirtualized +network device for Microsoft Hyper-V. It can be used with +Windows Server 2008/2012/2016, Windows 10. +The device offers multi-queue support (if kernel and host support it), +checksum and segmentation offloads. + + +Features and Limitations of Hyper-V PMD +--------------------------------------- + +In this release, the Hyper-V PMD provides the basic functionality of packet reception and transmission. + +* It supports mergeable buffers per packet when receiving packets and scattered buffer per packet + when transmitting packets. The packet size supported is from 64 to 65536. + +* The PMD supports multicast packets and promiscuous mode subject to restrictions on the host. + For this to work, the guest network configuration on Hyper-V must be configured to allow MAC address + spoofing. + +* The device has only a single MAC address. + Hyper-V driver does not support MAC or VLAN filtering because the Hyper-V host does not support it. + +* VLAN tags are always stripped and presented in the mbuf tci field. + +* The Hyper-V driver does not use or support Link State or Rx interrupt. + +* The maximum number of queues is limited by the host (currently 64). + When used with 4.16 kernel only a single queue is available. + +.. note:: + This driver is intended for use with **Hyper-V only** and is + not recommended for use on Azure because Accelerated Networking + (SR-IOV) is not supported. + + On Azure, use the :doc:`vdev_netvsc` which + automatically configures the necessary TAP and failsafe drivers. + + +Installation +------------ + +The Netvsc PMD is a standalone driver, similar to virtio and vmxnet3. +Using Netvsc PMD requires that the associated VMBUS device be bound to the userspace +I/O device driver for Hyper-V (uio_hv_generic).
By default, all netvsc devices +will be bound to the Linux kernel driver; in order to use netvsc PMD the +device must first be overridden. + +The first step is to identify the network device to override. +VMBUS uses Universally Unique Identifiers +(`UUID`_) to identify devices on the bus similar to how PCI uses Domain:Bus:Device.Function. +The UUID associated with a Linux kernel network device can be determined +by looking at the sysfs information. To find the UUID for eth1 and +store it in a shell variable: + + .. code-block:: console + + DEV_UUID=$(basename $(readlink /sys/class/net/eth1/device)) + + +.. _`UUID`: https://en.wikipedia.org/wiki/Universally_unique_identifier + +There are several possible ways to assign the uio device driver for a device. +The easiest way (but only on 4.18 or later) +is to use the `driverctl Device Driver control utility`_ to override +the normal kernel device. + + .. code-block:: console + + driverctl -b vmbus set-override $DEV_UUID uio_hv_generic + +.. _`driverctl Device Driver control utility`: https://gitlab.com/driverctl/driverctl + +Any settings done with driverctl are by default persistent and will be reapplied +on reboot. + +On older kernels, the same effect can be had by manual sysfs bind and unbind +operations: + + .. code-block:: console + + NET_UUID="f8615163-df3e-46c5-913f-f2d2f965ed0e" + modprobe uio_hv_generic + echo $NET_UUID > /sys/bus/vmbus/drivers/uio_hv_generic/new_id + echo $DEV_UUID > /sys/bus/vmbus/drivers/hv_netvsc/unbind + echo $DEV_UUID > /sys/bus/vmbus/drivers/uio_hv_generic/bind + +.. Note:: + + The dpdk-devbind.py script cannot be used since it only handles PCI devices. + + +Prerequisites +------------- + +The following prerequisites apply: + +* Linux kernel support for UIO on vmbus is done with the uio_hv_generic driver. + Full support of multiple queues requires the 4.17 kernel. It is possible + to use the netvsc PMD with 4.16 kernel but it is limited to a single queue.
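As a rough sketch of the manual sysfs steps from the netvsc Installation section, the helper below resolves the VMBUS UUID of an interface and prints the rebind commands instead of running them, so they can be reviewed first. Both function names and the optional sysfs-root argument are assumptions made for this illustration; the GUID is the ``NET_UUID`` value quoted in the manual steps.

```shell
#!/bin/sh
# Resolve the VMBUS device UUID behind a kernel network interface.
# $1: interface name; $2: sysfs root (defaults to /sys, overridable for dry runs).
netvsc_uuid() {
    link="${2:-/sys}/class/net/$1/device"
    [ -L "$link" ] || { echo "no device link for $1" >&2; return 1; }
    basename "$(readlink "$link")"
}

# Print (do not execute) the manual rebind commands for an interface.
netvsc_rebind_cmds() {
    uuid=$(netvsc_uuid "$1" "$2") || return 1
    echo "modprobe uio_hv_generic"
    # NET_UUID value taken from the manual bind steps in the guide
    echo "echo f8615163-df3e-46c5-913f-f2d2f965ed0e > /sys/bus/vmbus/drivers/uio_hv_generic/new_id"
    echo "echo $uuid > /sys/bus/vmbus/drivers/hv_netvsc/unbind"
    echo "echo $uuid > /sys/bus/vmbus/drivers/uio_hv_generic/bind"
}

# Usage (as root, after reviewing the output): netvsc_rebind_cmds eth1 | sh
```

Printing the commands rather than executing them keeps the sketch safe to try on a machine where unbinding the interface would cut the active network connection.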
diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst index 99a3b76e..927c03c6 100644 --- a/doc/guides/nics/nfp.rst +++ b/doc/guides/nics/nfp.rst @@ -34,14 +34,14 @@ NFP poll mode driver library Netronome's sixth generation of flow processors pack 216 programmable cores and over 100 hardware accelerators that uniquely combine packet, flow, security and content processing in a single device that scales -up to 400 Gbps. +up to 400-Gb/s. This document explains how to use DPDK with the Netronome Poll Mode Driver (PMD) supporting Netronome's Network Flow Processor 6xxx (NFP-6xxx) and Netronome's Flow Processor 4xxx (NFP-4xxx). NFP is a SRIOV capable device and the PMD driver supports the physical -function (PF) and virtual functions (VFs). +function (PF) and the virtual functions (VFs). Dependencies ------------ @@ -49,17 +49,18 @@ Dependencies Before using the Netronome's DPDK PMD some NFP configuration, which is not related to DPDK, is required. The system requires installation of **Netronome's BSP (Board Support Package)** along -with some specific NFP firmware application. Netronome's NSP ABI +with a specific NFP firmware application. Netronome's NSP ABI version should be 0.20 or higher. If you have a NFP device you should already have the code and -documentation for doing all this configuration. Contact +documentation for this configuration. Contact **support@netronome.com** to obtain the latest available firmware. -The NFP Linux netdev kernel driver for VFs is part of vanilla kernel -since kernel version 4.5, and support for the PF since kernel version -4.11. Support for older kernels can be obtained on Github at -**https://github.com/Netronome/nfp-drv-kmods** along with build +The NFP Linux netdev kernel driver for VFs has been a part of the +vanilla kernel since kernel version 4.5, and support for the PF +since kernel version 4.11. 
Support for older kernels can be obtained +on Github at +**https://github.com/Netronome/nfp-drv-kmods** along with the build instructions. NFP PMD needs to be used along with UIO ``igb_uio`` or VFIO (``vfio-pci``) @@ -70,15 +71,15 @@ Building the software Netronome's PMD code is provided in the **drivers/net/nfp** directory. Although NFP PMD has Netronome´s BSP dependencies, it is possible to -compile it along with other DPDK PMDs even if no BSP was installed before. +compile it along with other DPDK PMDs even if no BSP was installed previously. Of course, a DPDK app will require such a BSP installed for using the NFP PMD, along with a specific NFP firmware application. -Default PMD configuration is at **common_linuxapp configuration** file: +Default PMD configuration is at the **common_linuxapp configuration** file: - **CONFIG_RTE_LIBRTE_NFP_PMD=y** -Once DPDK is built all the DPDK apps and examples include support for +Once the DPDK is built all the DPDK apps and examples include support for the NFP PMD. @@ -91,37 +92,55 @@ for details. Using the PF ------------ -NFP PMD has support for using the NFP PF as another DPDK port, but it does not +NFP PMD supports using the NFP PF as another DPDK port, but it does not have any functionality for controlling VFs. In fact, it is not possible to use the PMD with the VFs if the PF is being used by DPDK, that is, with the NFP PF -bound to ``igb_uio`` or ``vfio-pci`` kernel drivers. Future DPDK version will +bound to ``igb_uio`` or ``vfio-pci`` kernel drivers. Future DPDK versions will have a PMD able to work with the PF and VFs at the same time and with the PF implementing VF management along with other PF-only functionalities/offloads. The PMD PF has extra work to do which will delay the DPDK app initialization -like checking if a firmware is already available in the device, uploading the -firmware if necessary, and configure the Link state properly when starting or -stopping a PF port. 
Note that firmware upload is not always necessary which is -the main delay for NFP PF PMD initialization. +like uploading the firmware and configure the Link state properly when starting or +stopping a PF port. Since DPDK 18.05 the firmware upload happens when +a PF is initialized, which was not always true with older DPDK versions. Depending on the Netronome product installed in the system, firmware files should be available under ``/lib/firmware/netronome``. DPDK PMD supporting the -PF requires a specific link, ``/lib/firmware/netronome/nic_dpdk_default.nffw``, -which should be created automatically with Netronome's Agilio products -installation. +PF looks for a firmware file in this order: + + 1) First try to find a firmware image specific for this device using the + NFP serial number: + + serial-00-15-4d-12-20-65-10-ff.nffw + + 2) Then try the PCI name: + + pci-0000:04:00.0.nffw + + 3) Finally try the card type and media: + + nic_AMDA0099-0001_2x25.nffw + +Netronome's software packages install firmware files under ``/lib/firmware/netronome`` +to support all the Netronome's SmartNICs and different firmware applications. +This is usually done using file names based on SmartNIC type and media and with a +directory per firmware application. Options 1 and 2 for firmware filenames allow +more than one SmartNIC, same type of SmartNIC or different ones, and to upload a +different firmware to each SmartNIC. + PF multiport support -------------------- Some NFP cards support several physical ports with just one single PCI device. -DPDK core is designed with the 1:1 relationship between PCI devices and DPDK +The DPDK core is designed with a 1:1 relationship between PCI devices and DPDK ports, so NFP PMD PF support requires handling the multiport case specifically. During NFP PF initialization, the PMD will extract the information about the number of PF ports from the firmware and will create as many DPDK ports as needed. 
Because the unusual relationship between a single PCI device and several DPDK -ports, there are some limitations when using more than one PF DPDK ports: there +ports, there are some limitations when using more than one PF DPDK port: there is no support for RX interrupts and it is not possible either to use those PF ports with the device hotplug functionality. @@ -136,7 +155,7 @@ System configuration get the drivers from the above Github repository and follow the instructions for building and installing it. - Virtual Functions need to be enabled before they can be used with the PMD. + VFs need to be enabled before they can be used with the PMD. Before enabling the VFs it is useful to obtain information about the current NFP PCI device detected by the system: diff --git a/doc/guides/nics/octeontx.rst b/doc/guides/nics/octeontx.rst index 8e2a2b75..f8eaaa63 100644 --- a/doc/guides/nics/octeontx.rst +++ b/doc/guides/nics/octeontx.rst @@ -165,8 +165,7 @@ CRC striping ~~~~~~~~~~~~ The OCTEONTX SoC family NICs strip the CRC for every packets coming into the -host interface. So, CRC will be stripped even when the -``rxmode.hw_strip_crc`` member is set to 0 in ``struct rte_eth_conf``. +host interface irrespective of the offload configuration. Maximum packet length ~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst index 0df0ef81..20cd52b0 100644 --- a/doc/guides/nics/overview.rst +++ b/doc/guides/nics/overview.rst @@ -1,32 +1,6 @@ -.. BSD LICENSE +.. SPDX-License-Identifier: BSD-3-Clause Copyright 2016 6WIND S.A. - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - * Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. 
- * Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in - the documentation and/or other materials provided with the - distribution. - * Neither the name of 6WIND S.A. nor the names of its - contributors may be used to endorse or promote products derived - from this software without specific prior written permission. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. - Overview of Networking Drivers ============================== diff --git a/doc/guides/nics/pcap_ring.rst b/doc/guides/nics/pcap_ring.rst index 7fd063c9..879e5430 100644 --- a/doc/guides/nics/pcap_ring.rst +++ b/doc/guides/nics/pcap_ring.rst @@ -71,11 +71,19 @@ The different stream types are: tx_pcap=/path/to/file.pcap * rx_iface: Defines a reception stream based on a network interface name. - The driver reads packets coming from the given interface using the Linux kernel driver for that interface. + The driver reads packets from the given interface using the Linux kernel driver for that interface. + The driver captures both the incoming and outgoing packets on that interface. The value is an interface name. rx_iface=eth0 +* rx_iface_in: Defines a reception stream based on a network interface name. 
+ The driver reads packets from the given interface using the Linux kernel driver for that interface. + The driver captures only the incoming packets on that interface. + The value is an interface name. + + rx_iface_in=eth0 + * tx_iface: Defines a transmission stream based on a network interface name. The driver sends packets to the given interface using the Linux kernel driver for that interface. The value is an interface name. @@ -122,6 +130,21 @@ Forward packets through two network interfaces: $RTE_TARGET/app/testpmd -l 0-3 -n 4 \ --vdev 'net_pcap0,iface=eth0' --vdev='net_pcap1;iface=eth1' +Enable 2 tx queues on a network interface: + +.. code-block:: console + + $RTE_TARGET/app/testpmd -l 0-3 -n 4 \ + --vdev 'net_pcap0,rx_iface=eth1,tx_iface=eth1,tx_iface=eth1' \ + -- --txq 2 + +Read only incoming packets from a network interface and write them back to the same network interface: + +.. code-block:: console + + $RTE_TARGET/app/testpmd -l 0-3 -n 4 \ + --vdev 'net_pcap0,rx_iface_in=eth1,tx_iface=eth1' + Using libpcap-based PMD with the testpmd Application ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ diff --git a/doc/guides/nics/qede.rst b/doc/guides/nics/qede.rst index 63ce9b4c..cba38868 100644 --- a/doc/guides/nics/qede.rst +++ b/doc/guides/nics/qede.rst @@ -35,15 +35,15 @@ Supported Features - N-tuple filter and flow director (limited support) - NPAR (NIC Partitioning) - SR-IOV VF -- VXLAN Tunneling offload +- GRE Tunneling offload - GENEVE Tunneling offload +- VXLAN Tunneling offload - MPLSoUDP Tx Tunneling offload Non-supported Features ---------------------- - SR-IOV PF -- GRE and NVGRE Tunneling offloads Co-existence considerations --------------------------- @@ -58,19 +58,21 @@ Supported QLogic Adapters Prerequisites ------------- -- Requires storm firmware version **8.30.12.0**. Firmware may be available +- Requires storm firmware version **8.33.12.0**. 
Firmware may be available inbox in certain newer Linux distros under the standard directory - ``E.g. /lib/firmware/qed/qed_init_values-8.30.12.0.bin`` + ``E.g. /lib/firmware/qed/qed_init_values-8.33.12.0.bin``. If the required firmware files are not available then download it from - `QLogic Driver Download Center <http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/DefaultNewSearch.aspx>`_. - For downloading firmware file, select adapter category, model and DPDK Poll Mode Driver. - -- Requires management firmware (MFW) version **8.30.x.x** or higher to be - flashed on to the adapter. If the required management firmware is not - available then download from - `QLogic Driver Download Center <http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/DefaultNewSearch.aspx>`_. - For downloading firmware upgrade utility, select adapter category, model and Linux distro. - To flash the management firmware refer to the instructions in the QLogic Firmware Upgrade Utility Readme document. + `linux-firmware git repository <http://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/tree/qed>`_ + or `QLogic Driver Download Center <http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/DefaultNewSearch.aspx>`_. + To download firmware file from QLogic website, select adapter category, model and DPDK Poll Mode Driver. + +- Requires the NIC be updated minimally with **8.30.x.x** Management firmware(MFW) version supported for that NIC. + It is highly recommended that the NIC be updated with the latest available management firmware version to get latest feature set. + Management Firmware and Firmware Upgrade Utility for Cavium FastLinQ(r) branded adapters can be downloaded from + `Driver Download Center <http://driverdownloads.qlogic.com/QLogicDriverDownloads_UI/DefaultNewSearch.aspx>`_. + For downloading Firmware Upgrade Utility, select NIC category, model and Linux distro. 
+ To update the management firmware, refer to the instructions in the Firmware Upgrade Utility Readme document. + For OEM branded adapters please follow the instruction provided by the OEM to update the Management Firmware on the NIC. - SR-IOV requires Linux PF driver version **8.20.x.x** or higher. If the required PF driver is not available then download it from @@ -104,7 +106,7 @@ enabling debugging options may affect system performance. - ``CONFIG_RTE_LIBRTE_QEDE_FW`` (default **""**) Gives absolute path of firmware file. - ``Eg: "/lib/firmware/qed/qed_init_values-8.30.12.0.bin"`` + ``Eg: "/lib/firmware/qed/qed_init_values-8.33.12.0.bin"`` Empty string indicates driver will pick up the firmware file from the default location /lib/firmware/qed. CAUTION this option is more for custom firmware, it is not @@ -121,7 +123,7 @@ SR-IOV: Prerequisites and Sample Application Notes This section provides instructions to configure SR-IOV with Linux OS. -**Note**: librte_pmd_qede will be used to bind to SR-IOV VF device and Linux native kernel driver (qede) will function as SR-IOV PF driver. Requires PF driver to be 8.10.x.x or higher. +**Note**: librte_pmd_qede will be used to bind to SR-IOV VF device and Linux native kernel driver (qede) will function as SR-IOV PF driver. Requires PF driver to be 8.20.x.x or higher. #. Verify SR-IOV and ARI capability is enabled on the adapter using ``lspci``: @@ -193,7 +195,7 @@ This section provides instructions to configure SR-IOV with Linux OS. #. 
Running testpmd - (Supply ``--log-level="pmd.net.qede.driver",7`` to view informational messages): + (Supply ``--log-level="pmd.net.qede.driver:info"`` to view informational messages): Refer to the document :ref:`compiling and testing a PMD for a NIC <pmd_build_and_test>` to run diff --git a/doc/guides/nics/sfc_efx.rst b/doc/guides/nics/sfc_efx.rst index ccdf5ff0..63939ec8 100644 --- a/doc/guides/nics/sfc_efx.rst +++ b/doc/guides/nics/sfc_efx.rst @@ -30,7 +30,8 @@ Solarflare libefx-based Poll Mode Driver ======================================== The SFC EFX PMD (**librte_pmd_sfc_efx**) provides poll mode driver support -for **Solarflare SFN7xxx and SFN8xxx** family of 10/40 Gbps adapters. +for **Solarflare SFN7xxx and SFN8xxx** family of 10/40 Gbps adapters and +**Solarflare XtremeScale X2xxx** family of 10/25/40/50/100 Gbps adapters. SFC EFX PMD has support for the latest Linux and FreeBSD operating systems. More information can be found at `Solarflare Communications website @@ -87,6 +88,8 @@ SFC EFX PMD has support for: - Flow API +- Loopback + Non-supported Features ---------------------- @@ -97,8 +100,6 @@ The features not yet supported include: - Priority-based flow control -- Loopback - - Configurable RX CRC stripping (always stripped) - Header split on receive @@ -120,22 +121,37 @@ required in the receive buffer. It should be taken into account when mbuf pool for receive is created. +Equal stride super-buffer mode +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When the receive queue uses equal stride super-buffer DMA mode, one HW Rx +descriptor carries many Rx buffers which contiguously follow each other +with some stride (equal to total size of rte_mbuf as mempool object). +Each Rx buffer is an independent rte_mbuf. +However, a dedicated mempool manager must be used when the mempool for the Rx +queue is created. The manager must support dequeue of the contiguous +block of objects and provide mempool info API to get the block size.
+ +Another limitation of a equal stride super-buffer mode, imposed by the +firmware, is that it allows for a single RSS context. + + Tunnels support --------------- -NVGRE, VXLAN and GENEVE tunnels are supported on SFN8xxx family adapters -with full-feature firmware variant running. +NVGRE, VXLAN and GENEVE tunnels are supported on SFN8xxx and X2xxx family +adapters with full-feature firmware variant running. **sfboot** should be used to configure NIC to run full-feature firmware variant. See Solarflare Server Adapter User's Guide for details. -SFN8xxx family adapters provide either inner or outer packet classes. +SFN8xxx and X2xxx family adapters provide either inner or outer packet classes. If adapter firmware advertises support for tunnels then the PMD configures the hardware to report inner classes, and outer classes are not reported in received packets. However, for VXLAN and GENEVE tunnels the PMD does report UDP as the outer layer 4 packet type. -SFN8xxx family adapters report GENEVE packets as VXLAN. +SFN8xxx and X2xxx family adapters report GENEVE packets as VXLAN. If UDP ports are configured for only one tunnel type then it is safe to treat VXLAN packet type indication as the corresponding UDP tunnel type. 
@@ -152,7 +168,9 @@ Supported pattern items: - VOID - ETH (exact match of source/destination addresses, individual/group match - of destination address, EtherType) + of destination address, EtherType in the outer frame and exact match of + destination addresses, individual/group match of destination address in + the inner frame) - VLAN (exact match of VID, double-tagging is supported) @@ -166,6 +184,13 @@ Supported pattern items: - UDP (exact match of source/destination ports) +- VXLAN (exact match of VXLAN network identifier) + +- GENEVE (exact match of virtual network identifier, only Ethernet (0x6558) + protocol type is supported) + +- NVGRE (exact match of virtual subnet ID) + Supported actions: - VOID @@ -174,6 +199,12 @@ Supported actions: - RSS +- DROP + +- FLAG (supported only with ef10_essb Rx datapath) + +- MARK (supported only with ef10_essb Rx datapath) + Validating flow rules depends on the firmware variant. Ethernet destination individual/group match @@ -184,10 +215,31 @@ in the mask of destination address. If destination address in the spec is multicast, it matches all multicast (and broadcast) packets, otherwise it matches unicast packets that are not filtered by other flow rules. +Exceptions to flow rules +~~~~~~~~~~~~~~~~~~~~~~~~ + +There is a list of exceptional flow rule patterns which will not be +accepted by the PMD. A pattern will be rejected if at least one of the +conditions is met: + +- Filtering by IPv4 or IPv6 EtherType without pattern items of internet + layer and above. + +- The last item is IPV4 or IPV6, and it's empty. + +- Filtering by TCP or UDP IP transport protocol without pattern items of + transport layer and above. + +- The last item is TCP or UDP, and it's empty.
+ Supported NICs -------------- +- Solarflare XtremeScale Adapters: + + - Solarflare X2522 Dual Port SFP28 10/25GbE Adapter + - Solarflare Flareon [Ultra] Server Adapters: - Solarflare SFN8522 Dual Port SFP+ Server Adapter @@ -258,15 +310,18 @@ whitelist option like "-w 02:00.0,arg1=value1,...". Case-insensitive 1/y/yes/on or 0/n/no/off may be used to specify boolean parameters value. -- ``rx_datapath`` [auto|efx|ef10] (default **auto**) +- ``rx_datapath`` [auto|efx|ef10|ef10_esps] (default **auto**) Choose receive datapath implementation. **auto** allows the driver itself to make a choice based on firmware features available and required by the datapath implementation. **efx** chooses libefx-based datapath which supports Rx scatter. - **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is + **ef10** chooses EF10 (SFN7xxx, SFN8xxx, X2xxx) native datapath which is more efficient than libefx-based and provides richer packet type classification, but lacks Rx scatter support. + **ef10_esps** chooses SFNX2xxx equal stride packed stream datapath + which may be used on DPDK firmware variant only + (see notes about its limitations above). - ``tx_datapath`` [auto|efx|ef10|ef10_simple] (default **auto**) @@ -277,12 +332,12 @@ boolean parameters value. (full-feature firmware variant only), TSO and multi-segment mbufs. Mbuf segments may come from different mempools, and mbuf reference counters are treated responsibly. - **ef10** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which is + **ef10** chooses EF10 (SFN7xxx, SFN8xxx, X2xxx) native datapath which is more efficient than libefx-based but has no VLAN insertion and TSO support yet. Mbuf segments may come from different mempools, and mbuf reference counters are treated responsibly. 
- **ef10_simple** chooses EF10 (SFN7xxx, SFN8xxx) native datapath which + **ef10_simple** chooses EF10 (SFN7xxx, SFN8xxx, X2xxx) native datapath which is even faster than **ef10** but does not support multi-segment mbufs, disallows multiple mempools and neglects mbuf reference counters. @@ -293,21 +348,73 @@ boolean parameters value. **auto** allows NIC firmware to make a choice based on installed licences and firmware variant configured using **sfboot**. -- ``debug_init`` [bool] (default **n**) - - Enable extra logging during device initialization and startup. - -- ``mcdi_logging`` [bool] (default **n**) - - Enable extra logging of the communication with the NIC's management CPU. - The logging is done using RTE_LOG() with INFO level and PMD type. - The format is consumed by the Solarflare netlogdecode cross-platform tool. - - ``stats_update_period_ms`` [long] (default **1000**) Adjust period in milliseconds to update port hardware statistics. The accepted range is 0 to 65535. The value of **0** may be used to disable periodic statistics update. One should note that it's - only possible to set an arbitrary value on SFN8xxx provided that + only possible to set an arbitrary value on SFN8xxx and X2xxx provided that firmware version is 6.2.1.1033 or higher, otherwise any positive value will select a fixed update period of **1000** milliseconds + +- ``fw_variant`` [dont-care|full-feature|ultra-low-latency| + capture-packed-stream|dpdk] (default **dont-care**) + + Choose the preferred firmware variant to use. In order for the selected + option to have an effect, the **sfboot** utility must be configured with the + **auto** firmware-variant option. The preferred firmware variant applies to + all ports on the NIC. + **dont-care** ensures that the driver can attach to an unprivileged function. + The datapath firmware type to use is controlled by the **sfboot** + utility. + **full-feature** chooses full-featured firmware.
+ **ultra-low-latency** chooses firmware with fewer features but lower latency. + **capture-packed-stream** chooses firmware for SolarCapture packed stream + mode. + **dpdk** chooses DPDK firmware with equal stride super-buffer Rx mode + for higher Rx packet rate and packet marks support and firmware subvariant + without checksumming on transmit for higher Tx packet rate if + checksumming is not required. + +- ``rxd_wait_timeout_ns`` [long] (default **200 us**) + + Adjust the timeout, in nanoseconds, for head-of-line blocking while + waiting for Rx descriptors. + The accepted range is 0 to 400 ms. + Flow control should be enabled to make it work. + The value of **0** disables it and packets are dropped immediately. + When a packet is dropped because of no Rx descriptors, + ``rx_nodesc_drop_cnt`` counter grows. + The feature is supported only by the DPDK firmware variant when equal + stride super-buffer Rx mode is used. + + +Dynamic Logging Parameters +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +One may leverage EAL option "--log-level" to change default levels +for the log types supported by the driver. The option is used with +an argument typically consisting of two parts separated by a colon. + +Level value is the last part which takes a symbolic name (or integer). +Log type is the former part which may use shell match syntax. +Depending on the choice of the expression, the given log level may +be used either for some specific log type or for a subset of types. + +SFC EFX PMD provides the following log types available for control: + +- ``pmd.net.sfc.driver`` (default level is **notice**) + + Affects driver-wide messages unrelated to any particular devices. + +- ``pmd.net.sfc.main`` (default level is **notice**) + + Matches a subset of per-port log types registered during runtime. + A full name for a particular type may be obtained by appending a + dot and a PCI device identifier (``XXXX:XX:XX.X``) to the prefix.
+ +- ``pmd.net.sfc.mcdi`` (default level is **notice**) + + Extra logging of the communication with the NIC's management CPU. + The format of the log is consumed by the Solarflare netlogdecode + cross-platform tool. May be managed per-port, as explained above. diff --git a/doc/guides/nics/softnic.rst b/doc/guides/nics/softnic.rst new file mode 100644 index 00000000..6c2287a1 --- /dev/null +++ b/doc/guides/nics/softnic.rst @@ -0,0 +1,250 @@ +.. SPDX-License-Identifier: BSD-3-Clause + Copyright(c) 2018 Intel Corporation. + +Soft NIC Poll Mode Driver +========================= + +The Soft NIC allows building custom NIC pipelines in software. The Soft NIC pipeline +is DIY and reconfigurable through ``firmware`` (DPDK Packet Framework script). + +The Soft NIC leverages the DPDK Packet Framework libraries (librte_port, +librte_table and librte_pipeline) to make it modular, flexible and extensible +with new functionality. Please refer to DPDK Programmer's Guide, Chapter +``Packet Framework`` and DPDK Sample Application User Guide, +Chapter ``IP Pipeline Application`` for more details. + +The Soft NIC is configured through the standard DPDK ethdev API (ethdev, flow, +QoS, security). The internal framework is not externally visible. + +Key benefits: + - Can be used to augment missing features to HW NICs. + - Allows consumption of advanced DPDK features without application redesign. + - Allows out-of-the-box performance boost of DPDK consumers applications simply by + instantiating this type of Ethernet device. + +Flow +---- +* ``Device creation``: Each Soft NIC instance is a virtual device. + +* ``Device start``: The Soft NIC firmware script is executed every time the device + is started. The firmware script typically creates several internal objects, + such as: memory pools, SW queues, traffic manager, action profiles, pipelines, + etc. 
+ +* ``Device stop``: All the internal objects that were previously created by the + firmware script during device start are now destroyed. + +* ``Device run``: Each Soft NIC device needs one or several CPU cores to run. + The firmware script maps each internal pipeline to a CPU core. Multiple + pipelines can be mapped to the same CPU core. In order for a given pipeline + assigned to CPU core X to run, the application needs to periodically call on + CPU core X the ``rte_pmd_softnic_run()`` function for the current Soft NIC + device. + +* ``Application run``: The application reads packets from the Soft NIC device RX + queues and writes packets to the Soft NIC device TX queues. + +Supported Operating Systems +--------------------------- + +Any Linux distribution fulfilling the conditions described in the ``System Requirements`` +section of :ref:`the DPDK documentation <linux_gsg>`. See also the *DPDK +Release Notes*. + +Build options +------------- + +The default PMD configuration is available in the common_linuxapp configuration file: + +CONFIG_RTE_LIBRTE_PMD_SOFTNIC=y + +Once the DPDK is built, all the DPDK applications include support for the +Soft NIC PMD. + +Soft NIC PMD arguments +---------------------- + +The user can specify the arguments below in EAL ``--vdev`` options to create the +Soft NIC device instance: + + --vdev "net_softnic0,firmware=firmware.cli,conn_port=8086" + +#. ``firmware``: path to the firmware script used for Soft NIC configuration. + The example "firmware" script is provided at `drivers/net/softnic/`. + (Optional: No, Default = NA) + +#. ``conn_port``: TCP connection port (non-zero value) used by a remote client + (for example telnet, netcat, etc.) to connect to and configure the Soft NIC device at run-time. + (Optional: yes, Default value: 0, no connection with external client) + +#. ``cpu_id``: NUMA node id. (Optional: yes, Default value: 0) + +#. ``tm_n_queues``: number of traffic manager's scheduler queues.
The traffic manager + is based on DPDK *librte_sched* library. (Optional: yes, Default value: 65,536 queues) + +#. ``tm_qsize0``: size of scheduler queue 0 per traffic class of the pipes/subscribers. + (Optional: yes, Default: 64) + +#. ``tm_qsize1``: size of scheduler queue 1 per traffic class of the pipes/subscribers. + (Optional: yes, Default: 64) + +#. ``tm_qsize2``: size of scheduler queue 2 per traffic class of the pipes/subscribers. + (Optional: yes, Default: 64) + +#. ``tm_qsize3``: size of scheduler queue 3 per traffic class of the pipes/subscribers. + (Optional: yes, Default: 64) + + +Soft NIC testing +---------------- + +* Run testpmd application in Soft NIC forwarding mode with loopback feature + enabled on Soft NIC port: + + .. code-block:: console + + ./testpmd -c 0x3 --vdev 'net_softnic0,firmware=<script path>/firmware.cli,cpu_id=0,conn_port=8086' -- -i + --forward-mode=softnic --portmask=0x2 + + .. code-block:: console + + ... + Interactive-mode selected + Set softnic packet forwarding mode + ... + Configuring Port 0 (socket 0) + Port 0: 90:E2:BA:37:9D:DC + Configuring Port 1 (socket 0) + + ; SPDX-License-Identifier: BSD-3-Clause + ; Copyright(c) 2018 Intel Corporation + + link LINK dev 0000:02:00.0 + + pipeline RX period 10 offset_port_id 0 + pipeline RX port in bsz 32 link LINK rxq 0 + pipeline RX port out bsz 32 swq RXQ0 + pipeline RX table match stub + pipeline RX port in 0 table 0 + + pipeline TX period 10 offset_port_id 0 + pipeline TX port in bsz 32 swq TXQ0 + pipeline TX port out bsz 32 link LINK txq 0 + pipeline TX table match stub + pipeline TX port in 0 table 0 + + thread 1 pipeline RX enable + thread 1 pipeline TX enable + Port 1: 00:00:00:00:00:00 + Checking link statuses... + Done + testpmd> + +* Start forwarding + + .. 
code-block:: console + + testpmd> start + softnic packet forwarding - ports=1 - cores=1 - streams=1 - NUMA support enabled, MP over anonymous pages disabled + Logical Core 1 (socket 0) forwards packets on 1 streams: + RX P=2/Q=0 (socket 0) -> TX P=2/Q=0 (socket 0) peer=02:00:00:00:00:02 + + softnic packet forwarding packets/burst=32 + nb forwarding cores=1 - nb forwarding ports=1 + port 0: RX queue number: 1 Tx queue number: 1 + Rx offloads=0x1000 Tx offloads=0x0 + RX queue: 0 + RX desc=512 - RX free threshold=32 + RX threshold registers: pthresh=8 hthresh=8 wthresh=0 + RX Offloads=0x0 + TX queue: 0 + TX desc=512 - TX free threshold=32 + TX threshold registers: pthresh=32 hthresh=0 wthresh=0 + TX offloads=0x0 - TX RS bit threshold=32 + port 1: RX queue number: 1 Tx queue number: 1 + Rx offloads=0x0 Tx offloads=0x0 + RX queue: 0 + RX desc=0 - RX free threshold=0 + RX threshold registers: pthresh=0 hthresh=0 wthresh=0 + RX Offloads=0x0 + TX queue: 0 + TX desc=0 - TX free threshold=0 + TX threshold registers: pthresh=0 hthresh=0 wthresh=0 + TX offloads=0x0 - TX RS bit threshold=0 + +* Start remote client (e.g. telnet) to communicate with the softnic device: + + .. code-block:: console + + $ telnet 127.0.0.1 8086 + Trying 127.0.0.1... + Connected to 127.0.0.1. + Escape character is '^]'. + + Welcome to Soft NIC! + + softnic> + +* Add/update Soft NIC pipeline table match-action entries from telnet client: + + .. code-block:: console + + softnic> pipeline RX table 0 rule add match default action fwd port 0 + softnic> pipeline TX table 0 rule add match default action fwd port 0 + +Soft NIC Firmware +----------------- + +The Soft NIC firmware, for example- `softnic/firmware.cli`, consists of following CLI commands +for creating and managing software based NIC pipelines. For more details, please refer to CLI +command description provided in `softnic/rte_eth_softnic_cli.c`. + +* Physical port for packets send/receive: + + .. 
code-block:: console + + link LINK dev 0000:02:00.0 + +* Pipeline create: + + .. code-block:: console + + pipeline RX period 10 offset_port_id 0 (Soft NIC rx-path pipeline) + pipeline TX period 10 offset_port_id 0 (Soft NIC tx-path pipeline) + +* Pipeline input/output port create + + .. code-block:: console + + pipeline RX port in bsz 32 link LINK rxq 0 (Soft NIC rx pipeline input port) + pipeline RX port out bsz 32 swq RXQ0 (Soft NIC rx pipeline output port) + pipeline TX port in bsz 32 swq TXQ0 (Soft NIC tx pipeline input port) + pipeline TX port out bsz 32 link LINK txq 0 (Soft NIC tx pipeline output port) + +* Pipeline table create + + .. code-block:: console + + pipeline RX table match stub (Soft NIC rx pipeline match-action table) + pipeline TX table match stub (Soft NIC tx pipeline match-action table) + +* Pipeline input port connection with table + + .. code-block:: console + + pipeline RX port in 0 table 0 (Soft NIC rx pipeline input port 0 connection with table 0) + pipeline TX port in 0 table 0 (Soft NIC tx pipeline input port 0 connection with table 0) + +* Pipeline table match-action rules add + + .. code-block:: console + + pipeline RX table 0 rule add match default action fwd port 0 (Soft NIC rx pipeline table 0 rule) + pipeline TX table 0 rule add match default action fwd port 0 (Soft NIC tx pipeline table 0 rule) + +* Enable pipeline on CPU thread + + .. code-block:: console + + thread 1 pipeline RX enable (Soft NIC rx pipeline enable on cpu thread id 1) + thread 1 pipeline TX enable (Soft NIC tx pipeline enable on cpu thread id 1) diff --git a/doc/guides/nics/szedata2.rst b/doc/guides/nics/szedata2.rst index 1a5d4138..a2092f9c 100644 --- a/doc/guides/nics/szedata2.rst +++ b/doc/guides/nics/szedata2.rst @@ -1,32 +1,5 @@ -.. BSD LICENSE +.. SPDX-License-Identifier: BSD-3-Clause Copyright 2015 - 2016 CESNET - All rights reserved. 
- - Redistribution and use in source and binary forms, with or without - modification, are permitted provided that the following conditions - are met: - - * Redistributions of source code must retain the above copyright - notice, this list of conditions and the following disclaimer. - * Redistributions in binary form must reproduce the above copyright - notice, this list of conditions and the following disclaimer in - the documentation and/or other materials provided with the - distribution. - * Neither the name of CESNET nor the names of its - contributors may be used to endorse or promote products derived - from this software without specific prior written permission. - - THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS - "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT - LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR - A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT - OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, - SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT - LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, - DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY - THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT - (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE - OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. SZEDATA2 poll mode driver library ================================= @@ -70,8 +43,10 @@ separately: * **Kernel modules** + * combo6core * combov3 - * szedata2_cv3 + * szedata2 + * szedata2_cv3 or szedata2_cv3_fdt Kernel modules manage initialization of hardware, allocation and sharing of resources for user space applications. @@ -79,6 +54,15 @@ separately: Information about getting the dependencies can be found `here <http://www.netcope.com/en/company/community-support/dpdk-libsze2>`_. 
+Versions of the packages +~~~~~~~~~~~~~~~~~~~~~~~~ + +The minimum version of the provided packages: + +* for DPDK from 18.05: **4.4.1** + +* for DPDK up to 18.02 (inclusive): **3.0.5** + Configuration ------------- @@ -89,45 +73,53 @@ These configuration options can be modified before compilation in the Value **y** enables compilation of szedata2 PMD. -* ``CONFIG_RTE_LIBRTE_PMD_SZEDATA2_AS`` default value: **0** - - This option defines type of firmware address space and must be set - according to the used card and mode. - Currently supported values are: - - * **0** - for cards (modes): - - * NFB-100G1 (100G1) +Using the SZEDATA2 PMD +---------------------- - * **1** - for cards (modes): +From DPDK version 16.04 the type of SZEDATA2 PMD is changed to PMD_PDEV. +SZEDATA2 device is automatically recognized during EAL initialization. +No special command line options are needed. - * NFB-100G2Q (100G1) +Kernel modules have to be loaded before running the DPDK application. - * **2** - for cards (modes): +NFB card architecture +--------------------- - * NFB-40G2 (40G2) - * NFB-100G2C (100G2) - * NFB-100G2Q (40G2) +The NFB cards are multi-port multi-queue cards, where (generally) data from any +Ethernet port may be sent to any queue. +They were historically represented in DPDK as a single port. - * **3** - for cards (modes): +However, the new NFB-200G2QL card employs an add-on cable which allows it to be +connected to two physical PCI-E slots at the same time (see the diagram below). +This is done to allow 200 Gbps of traffic to be transferred through the PCI-E +bus (note that a single PCI-E 3.0 x16 slot provides only 125 Gbps theoretical +throughput). - * NFB-40G2 (10G8) - * NFB-100G2Q (10G8) +Since each slot may be connected to a different CPU and therefore to a different +NUMA node, the card is represented as two ports in DPDK (each with half of the +queues), which allows DPDK to work with data from the individual queues on the +right NUMA node. 
- * **4** - for cards (modes): +.. figure:: img/szedata2_nfb200g_architecture.* + :align: center - * NFB-100G1 (10G10) + NFB-200G2QL high-level diagram - * **5** - for experimental firmwares and future use +Limitations +----------- -Using the SZEDATA2 PMD ----------------------- +The SZEDATA2 PMD does not support operations related to Ethernet ports +(link_up, link_down, set_mac_address, etc.). -From DPDK version 16.04 the type of SZEDATA2 PMD is changed to PMD_PDEV. -SZEDATA2 device is automatically recognized during EAL initialization. -No special command line options are needed. +NFB cards employ multiple Ethernet ports. +Until now, Ethernet port-related operations were performed on all of them +(since the whole card was represented as a single port). +With NFB-200G2QL card, this is no longer viable (see above). -Kernel modules have to be loaded before running the DPDK application. +Since there is no fixed mapping between the queues and Ethernet ports, and since +a single card can be represented as two ports in DPDK, there is no way of +telling which (if any) physical ports should be associated with individual +ports in DPDK. Example of usage ---------------- diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst index ea61be38..27148681 100644 --- a/doc/guides/nics/tap.rst +++ b/doc/guides/nics/tap.rst @@ -1,8 +1,8 @@ .. SPDX-License-Identifier: BSD-3-Clause Copyright(c) 2016 Intel Corporation. -Tap Poll Mode Driver -==================== +Tun|Tap Poll Mode Driver +======================== The ``rte_eth_tap.c`` PMD creates a device using TAP interfaces on the local host. The PMD allows for DPDK and the host to communicate using a raw @@ -37,6 +37,12 @@ for each interface string containing ``mac=fixed``. The MAC address is formatted as 00:'d':'t':'a':'p':[00-FF]. Convert the characters to hex and you get the actual MAC address: ``00:64:74:61:70:[00-FF]``. + --vdev=net_tap0,mac="00:64:74:61:70:11" + +The MAC address will have a user value passed as string. 
The MAC address is in +format with delimiter ``:``. The string is byte converted to hex and you get +the actual MAC address: ``00:64:74:61:70:11``. + It is possible to specify a remote netdevice to capture packets from by adding ``remote=foo1``, for example:: @@ -77,6 +83,17 @@ can utilize that stack to handle the network protocols. Plus you would be able to address the interface using an IP address assigned to the internal interface. +The TUN PMD allows the user to create a TUN device on the host. The PMD allows the user +to transmit and receive packets with L3 header and payload via DPDK API calls. +The devices on the host can be accessed via the ``ifconfig`` or ``ip`` command. TUN +interfaces are passed to DPDK ``rte_eal_init`` arguments as ``--vdev=net_tunX``, +where X stands for a unique id, for example:: + + --vdev=net_tun0 --vdev=net_tun1,iface=foo1, ... + +Unlike the TAP PMD, the TUN PMD does not support the ``MAC`` or ``remote`` user +options. The default interface name is ``dtunX``, where X stands for a unique id. + Flow API support ---------------- @@ -91,7 +108,7 @@ The kernel support can be checked with this command:: Supported items: - eth: src and dst (with variable masks), and eth_type (0xffff mask). -- vlan: vid, pcp, tpid, but not eid. (requires kernel 4.9) +- vlan: vid, pcp, but not eid. (requires kernel 4.9) - ipv4/6: src and dst (with variable masks), and ip_proto (0xffff mask). - udp/tcp: src and dst port (0xffff) mask. 
@@ -149,7 +166,7 @@ Run pktgen from the pktgen directory in a terminal with a commandline like the following:: sudo ./app/app/x86_64-native-linuxapp-gcc/app/pktgen -l 1-5 -n 4 \ - --proc-type auto --log-level 8 --socket-mem 512,512 --file-prefix pg \ + --proc-type auto --log-level debug --socket-mem 512,512 --file-prefix pg \ --vdev=net_tap0 --vdev=net_tap1 -b 05:00.0 -b 05:00.1 \ -b 04:00.0 -b 04:00.1 -b 04:00.2 -b 04:00.3 \ -b 81:00.0 -b 81:00.1 -b 81:00.2 -b 81:00.3 \ @@ -241,6 +258,11 @@ Please refer to ``iproute2`` package file ``lib/bpf.c`` function An example utility for eBPF instruction generation in the format of C arrays will be added in next releases +TAP reports its supported RSS functions as part of the ``dev_infos_get`` callback: +``ETH_RSS_IP``, ``ETH_RSS_UDP`` and ``ETH_RSS_TCP``. +**Known limitation:** TAP supports all of the above hash functions together +and not in partial combinations. + Systems supporting flow API --------------------------- diff --git a/doc/guides/nics/thunderx.rst b/doc/guides/nics/thunderx.rst index 5270ef23..e84eaafe 100644 --- a/doc/guides/nics/thunderx.rst +++ b/doc/guides/nics/thunderx.rst @@ -30,6 +30,7 @@ Features of the ThunderX PMD are: - SR-IOV VF - NUMA support - Multi queue set support (up to 96 queues (12 queue sets)) per port +- Skip data bytes Supported ThunderX SoCs ----------------------- @@ -312,6 +313,21 @@ We will choose four secondary queue sets from the ending of the list (0002:01:01 The nicvf thunderx driver will make use of attached secondary VFs automatically during the interface configuration stage. + +Module params +-------------- + +skip_data_bytes +~~~~~~~~~~~~~~~ +This feature is used to create a hole between HEADROOM and the actual data. The size of the hole is specified +in bytes as the module param ``skip_data_bytes`` to the PMD. +This scheme is useful when an application would like to insert a VLAN header without disturbing HEADROOM. + +Example: + .. 
code-block:: console + + -w 0002:01:00.2,skip_data_bytes=8 + Limitations ----------- @@ -319,8 +335,7 @@ CRC striping ~~~~~~~~~~~~ The ThunderX SoC family NICs strip the CRC for every packets coming into the -host interface. So, CRC will be stripped even when the -``rxmode.hw_strip_crc`` member is set to 0 in ``struct rte_eth_conf``. +host interface irrespective of the offload configuration. Maximum packet length ~~~~~~~~~~~~~~~~~~~~~ @@ -336,3 +351,8 @@ Maximum packet segments The ThunderX SoC family NICs support up to 12 segments per packet when working in scatter/gather mode. So, setting MTU will result with ``EINVAL`` when the frame size does not fit in the maximum number of segments. + +skip_data_bytes +~~~~~~~~~~~~~~~ + +The maximum value of ``skip_data_bytes`` is 128 bytes, and the value should be a multiple of 8. diff --git a/doc/guides/nics/vdev_netvsc.rst b/doc/guides/nics/vdev_netvsc.rst index 55d130a3..d1da0711 100644 --- a/doc/guides/nics/vdev_netvsc.rst +++ b/doc/guides/nics/vdev_netvsc.rst @@ -1,6 +1,6 @@ .. SPDX-License-Identifier: BSD-3-Clause Copyright 2017 6WIND S.A. - Copyright 2017 Mellanox Technologies, Ltd. + Copyright 2017 Mellanox Technologies, Ltd VDEV_NETVSC driver ================== @@ -89,12 +89,16 @@ The following device parameters are supported: - ``force`` [int] If nonzero, forces the use of specified interfaces even if not detected as - NetVSC or detected as routed NETVSC. + NetVSC. - ``ignore`` [int] - If nonzero, ignores the driver runnig (actually used to disable the + If nonzero, ignores the driver running (actually used to disable the auto-detection in Hyper-V VM). -Not specifying either ``iface`` or ``mac`` makes this driver attach itself to -all unrouted NetVSC interfaces found on the system. +.. note:: + + Not specifying either ``iface`` or ``mac`` makes this driver attach itself to + all unrouted NetVSC interfaces found on the system. 
+ Specifying the device makes this driver attach itself to the device + regardless of the device routes. diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst index ca09cd20..7c099fb7 100644 --- a/doc/guides/nics/virtio.rst +++ b/doc/guides/nics/virtio.rst @@ -201,7 +201,7 @@ The packet transmission flow is: Virtio PMD Rx/Tx Callbacks -------------------------- -Virtio driver has 3 Rx callbacks and 2 Tx callbacks. +Virtio driver has 4 Rx callbacks and 3 Tx callbacks. Rx callbacks: @@ -215,6 +215,9 @@ Rx callbacks: Vector version without mergeable Rx buffer support, also fixes the available ring indexes and uses vector instructions to optimize performance. +#. ``virtio_recv_mergeable_pkts_inorder``: + In-order version with mergeable Rx buffer support. + Tx callbacks: #. ``virtio_xmit_pkts``: @@ -223,6 +226,8 @@ Tx callbacks: #. ``virtio_xmit_pkts_simple``: Vector version fixes the available ring indexes to optimize performance. +#. ``virtio_xmit_pkts_inorder``: + In-order version. By default, the non-vector callbacks are used: @@ -234,7 +239,7 @@ By default, the non-vector callbacks are used: Vector callbacks will be used when: -* ``txq_flags`` is set to ``VIRTIO_SIMPLE_FLAGS`` (0xF01), which implies: +* ``txmode.offloads`` is set to ``0x0``, which implies: * Single segment is specified. @@ -252,8 +257,14 @@ The corresponding callbacks are: Example of using the vector version of the virtio poll mode driver in ``testpmd``:: - testpmd -l 0-2 -n 4 -- -i --txqflags=0xF01 --rxq=1 --txq=1 --nb-cores=1 + testpmd -l 0-2 -n 4 -- -i --tx-offloads=0x0 --rxq=1 --txq=1 --nb-cores=1 + +In-order callbacks only work on simulated virtio user vdev. + +* For Rx: If mergeable Rx buffers are enabled and in-order is enabled then + ``virtio_recv_mergeable_pkts_inorder`` is used. +* For Tx: If in-order is enabled then ``virtio_xmit_pkts_inorder`` is used. Interrupt mode -------------- @@ -318,3 +329,26 @@ Here we use l3fwd-power as an example to show how to get started. 
 $ l3fwd-power -l 0-1 -- -p 1 -P --config="(0,0,1)" \ --no-numa --parse-ptype + + +Virtio PMD arguments +-------------------- + +The user can specify the arguments below in devargs. + +#. ``vdpa``: + + A virtio device could also be driven by a vDPA (vhost data path acceleration) + driver, and work as a HW vhost backend. This argument is used to specify + that a virtio device needs to work in vDPA mode. + (Default: 0 (disabled)) + +#. ``mrg_rxbuf``: + + It is used to enable the virtio device mergeable Rx buffer feature. + (Default: 1 (enabled)) + +#. ``in_order``: + + It is used to enable the virtio device in-order feature. + (Default: 1 (enabled))