From 8b25d1ad5d2264bdfc2818c7bda74ee2697df6db Mon Sep 17 00:00:00 2001 From: Christian Ehrhardt Date: Wed, 6 Jul 2016 09:22:35 +0200 Subject: Imported Upstream version 16.07-rc1 Change-Id: I40a523e52f12e8496fdd69e902824b0226c303de Signed-off-by: Christian Ehrhardt --- doc/api/doxy-api-index.md | 1 + doc/guides/contributing/coding_style.rst | 8 + doc/guides/contributing/versioning.rst | 2 +- doc/guides/cryptodevs/aesni_mb.rst | 5 +- doc/guides/cryptodevs/index.rst | 3 +- doc/guides/cryptodevs/kasumi.rst | 101 +++ doc/guides/cryptodevs/overview.rst | 79 +- doc/guides/cryptodevs/qat.rst | 3 + doc/guides/cryptodevs/snow3g.rst | 52 +- doc/guides/faq/faq.rst | 7 +- doc/guides/freebsd_gsg/build_dpdk.rst | 2 +- doc/guides/linux_gsg/enable_func.rst | 15 - doc/guides/linux_gsg/sys_reqs.rst | 3 + doc/guides/nics/bnxt.rst | 49 ++ doc/guides/nics/fm10k.rst | 5 +- doc/guides/nics/i40e.rst | 45 + doc/guides/nics/index.rst | 3 + doc/guides/nics/mlx5.rst | 96 +-- doc/guides/nics/nfp.rst | 47 +- doc/guides/nics/overview.rst | 153 ++-- doc/guides/nics/qede.rst | 314 +++++++ doc/guides/nics/thunderx.rst | 354 ++++++++ doc/guides/prog_guide/env_abstraction_layer.rst | 4 +- doc/guides/prog_guide/index.rst | 1 + doc/guides/prog_guide/ivshmem_lib.rst | 2 + doc/guides/prog_guide/mempool_lib.rst | 42 +- doc/guides/prog_guide/pdump_lib.rst | 123 +++ doc/guides/prog_guide/poll_mode_drv.rst | 21 +- doc/guides/prog_guide/vhost_lib.rst | 223 +++-- doc/guides/rel_notes/deprecation.rst | 79 +- doc/guides/rel_notes/index.rst | 1 + doc/guides/rel_notes/known_issues.rst | 40 +- doc/guides/rel_notes/release_16_07.rst | 358 ++++++++ doc/guides/sample_app_ug/img/ipsec_endpoints.svg | 850 +++++++++++++++++++ doc/guides/sample_app_ug/index.rst | 1 + doc/guides/sample_app_ug/ip_pipeline.rst | 112 ++- doc/guides/sample_app_ug/ipsec_secgw.rst | 910 +++++++++++++-------- doc/guides/sample_app_ug/ipv4_multicast.rst | 4 +- doc/guides/sample_app_ug/l2_forward_crypto.rst | 4 +- doc/guides/sample_app_ug/l2_forward_job_stats.rst | 3 - .../sample_app_ug/l2_forward_real_virtual.rst | 3 - doc/guides/sample_app_ug/l3_forward.rst | 2 +- doc/guides/sample_app_ug/link_status_intr.rst | 3 - doc/guides/sample_app_ug/pdump.rst | 122 +++ doc/guides/sample_app_ug/vhost.rst | 53 +- doc/guides/testpmd_app_ug/run_app.rst | 5 +- doc/guides/testpmd_app_ug/testpmd_funcs.rst | 68 +- 47 files changed, 3500 insertions(+), 881 deletions(-) create mode 100644 doc/guides/cryptodevs/kasumi.rst create mode 100644 doc/guides/nics/bnxt.rst create mode 100644 doc/guides/nics/qede.rst create mode 100644 doc/guides/nics/thunderx.rst create mode 100644 doc/guides/prog_guide/pdump_lib.rst create mode 100644 doc/guides/rel_notes/release_16_07.rst create mode 100644 doc/guides/sample_app_ug/img/ipsec_endpoints.svg create mode 100644 doc/guides/sample_app_ug/pdump.rst (limited to 'doc') diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md index f6263867..5e7f024d 100644 --- a/doc/api/doxy-api-index.md +++ b/doc/api/doxy-api-index.md @@ -118,6 +118,7 @@ There are many libraries, so their headers may be grouped by topics: [frag] (@ref rte_port_frag.h), [reass] (@ref rte_port_ras.h), [sched] (@ref rte_port_sched.h), + [kni] (@ref rte_port_kni.h), [src/sink] (@ref rte_port_source_sink.h) * [table] (@ref rte_table.h): [lpm IPv4] (@ref rte_table_lpm.h), diff --git a/doc/guides/contributing/coding_style.rst b/doc/guides/contributing/coding_style.rst index ad1392d2..1eb67f34 100644 --- a/doc/guides/contributing/coding_style.rst +++ b/doc/guides/contributing/coding_style.rst @@ -685,3 +685,11 @@ Control Statements usage(); /* NOTREACHED */ } + + +Python Code +----------- + +All python code should be compliant with `PEP8 (Style Guide for Python Code) `_. + +The ``pep8`` tool can be used for testing compliance with the guidelines. diff --git a/doc/guides/contributing/versioning.rst b/doc/guides/contributing/versioning.rst index ae10a984..92b4d7ca 100644 --- a/doc/guides/contributing/versioning.rst +++ b/doc/guides/contributing/versioning.rst @@ -475,7 +475,7 @@ Where ``REV1`` and ``REV2`` are valid gitrevisions(7) https://www.kernel.org/pub/software/scm/git/docs/gitrevisions.html on the local repo and target is the usual DPDK compilation target. -For example: +For example:: # Check between the previous and latest commit: ./scripts/validate-abi.sh HEAD~1 HEAD x86_64-native-linuxapp-gcc diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst index 9e04853a..60a89142 100644 --- a/doc/guides/cryptodevs/aesni_mb.rst +++ b/doc/guides/cryptodevs/aesni_mb.rst @@ -46,8 +46,11 @@ AESNI MB PMD has support for: Cipher algorithms: * RTE_CRYPTO_SYM_CIPHER_AES128_CBC +* RTE_CRYPTO_SYM_CIPHER_AES192_CBC * RTE_CRYPTO_SYM_CIPHER_AES256_CBC -* RTE_CRYPTO_SYM_CIPHER_AES512_CBC +* RTE_CRYPTO_SYM_CIPHER_AES128_CTR +* RTE_CRYPTO_SYM_CIPHER_AES192_CTR +* RTE_CRYPTO_SYM_CIPHER_AES256_CTR Hash algorithms: diff --git a/doc/guides/cryptodevs/index.rst b/doc/guides/cryptodevs/index.rst index a3f11f31..9616de1e 100644 --- a/doc/guides/cryptodevs/index.rst +++ b/doc/guides/cryptodevs/index.rst @@ -38,6 +38,7 @@ Crypto Device Drivers overview aesni_mb aesni_gcm + kasumi null snow3g - qat \ No newline at end of file + qat diff --git a/doc/guides/cryptodevs/kasumi.rst b/doc/guides/cryptodevs/kasumi.rst new file mode 100644 index 00000000..d6b3a975 --- /dev/null +++ b/doc/guides/cryptodevs/kasumi.rst @@ -0,0 +1,101 @@ +.. BSD LICENSE + Copyright(c) 2016 Intel Corporation. All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Intel Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +KASUMI Crypto Poll Mode Driver +=============================== + +The KASUMI PMD (**librte_pmd_kasumi**) provides poll mode crypto driver +support for utilizing Intel Libsso library, which implements F8 and F9 functions +for KASUMI UEA1 cipher and UIA1 hash algorithms. + +Features +-------- + +KASUMI PMD has support for: + +Cipher algorithm: + +* RTE_CRYPTO_SYM_CIPHER_KASUMI_F8 + +Authentication algorithm: + +* RTE_CRYPTO_SYM_AUTH_KASUMI_F9 + +Limitations +----------- + +* Chained mbufs are not supported. +* KASUMI(F9) supported only if hash offset field is byte-aligned. + +Installation +------------ + +To build DPDK with the KASUMI_PMD the user is required to download +the export controlled ``libsso_kasumi`` library, by requesting it from +``_. +Once approval has been granted, the user needs to log in +``_ +and click on "Kasumi Bit Stream crypto library" link, to download the library. +After downloading the library, the user needs to unpack and compile it +on their system before building DPDK:: + + make kasumi + +Initialization +-------------- + +In order to enable this virtual crypto PMD, user must: + +* Export the environmental variable LIBSSO_KASUMI_PATH with the path where + the library was extracted (kasumi folder). + +* Build the LIBSSO library (explained in Installation section). + +* Set CONFIG_RTE_LIBRTE_PMD_KASUMI=y in config/common_base. + +To use the PMD in an application, user must: + +* Call rte_eal_vdev_init("cryptodev_kasumi_pmd") within the application. + +* Use --vdev="cryptodev_kasumi_pmd" in the EAL options, which will call rte_eal_vdev_init() internally. + +The following parameters (all optional) can be provided in the previous two calls: + +* socket_id: Specify the socket where the memory for the device is going to be allocated + (by default, socket_id will be the socket where the core that is creating the PMD is running on). + +* max_nb_queue_pairs: Specify the maximum number of queue pairs in the device (8 by default). + +* max_nb_sessions: Specify the maximum number of sessions that can be created (2048 by default). + +Example: + +.. code-block:: console + + ./l2fwd-crypto -c 40 -n 4 --vdev="cryptodev_kasumi_pmd,socket_id=1,max_nb_sessions=128" diff --git a/doc/guides/cryptodevs/overview.rst b/doc/guides/cryptodevs/overview.rst index 9f9af43c..d612f71b 100644 --- a/doc/guides/cryptodevs/overview.rst +++ b/doc/guides/cryptodevs/overview.rst @@ -33,62 +33,63 @@ Crypto Device Supported Functionality Matrices Supported Feature Flags .. csv-table:: - :header: "Feature Flags", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g" + :header: "Feature Flags", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g", "kasumi" :stub-columns: 1 - "RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO",x,x,,, - "RTE_CRYPTODEV_FF_ASYMMETRIC_CRYPTO",,,,, - "RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING",x,x,x,x,x - "RTE_CRYPTODEV_FF_CPU_SSE",,,x,x,x - "RTE_CRYPTODEV_FF_CPU_AVX",,,x,x,x - "RTE_CRYPTODEV_FF_CPU_AVX2",,,x,x, - "RTE_CRYPTODEV_FF_CPU_AESNI",,,x,x, - "RTE_CRYPTODEV_FF_HW_ACCELERATED",x,,,, + "RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO",x,x,x,x,x,x + "RTE_CRYPTODEV_FF_ASYMMETRIC_CRYPTO",,,,,, + "RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING",x,x,x,x,x,x + "RTE_CRYPTODEV_FF_CPU_SSE",,,x,x,x,x + "RTE_CRYPTODEV_FF_CPU_AVX",,,x,x,x,x + "RTE_CRYPTODEV_FF_CPU_AVX2",,,x,x,, + "RTE_CRYPTODEV_FF_CPU_AESNI",,,x,x,, + "RTE_CRYPTODEV_FF_HW_ACCELERATED",x,,,,, Supported Cipher Algorithms .. csv-table:: - :header: "Cipher Algorithms", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g" + :header: "Cipher Algorithms", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g", "kasumi" :stub-columns: 1 - "NULL",,x,,, - "AES_CBC_128",x,,x,, - "AES_CBC_192",x,,x,, - "AES_CBC_256",x,,x,, - "AES_CTR_128",,,,, - "AES_CTR_192",,,,, - "AES_CTR_256",,,,, - "SNOW3G_UEA2",x,,,,x + "NULL",,x,,,, + "AES_CBC_128",x,,x,,, + "AES_CBC_192",x,,x,,, + "AES_CBC_256",x,,x,,, + "AES_CTR_128",x,,x,,, + "AES_CTR_192",x,,x,,, + "AES_CTR_256",x,,x,,, + "SNOW3G_UEA2",x,,,,x, + "KASUMI_F8",,,,,,x Supported Authentication Algorithms .. csv-table:: - :header: "Cipher Algorithms", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g" + :header: "Cipher Algorithms", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g", "kasumi" :stub-columns: 1 - "NONE",,x,,, - "MD5",,,,, - "MD5_HMAC",,,x,, - "SHA1",,,,, - "SHA1_HMAC",x,,x,, - "SHA224",,,,, - "SHA224_HMAC",,,x,, - "SHA256",,,,, - "SHA256_HMAC",x,,x,, - "SHA384",,,,, - "SHA384_HMAC",,,x,, - "SHA512",,,,, - "SHA512_HMAC",x,,x,, - "AES_XCBC",x,,x,, - "SNOW3G_UIA2",x,,,,x - + "NONE",,x,,,, + "MD5",,,,,, + "MD5_HMAC",,,x,,, + "SHA1",,,,,, + "SHA1_HMAC",x,,x,,, + "SHA224",,,,,, + "SHA224_HMAC",,,x,,, + "SHA256",,,,,, + "SHA256_HMAC",x,,x,,, + "SHA384",,,,,, + "SHA384_HMAC",,,x,,, + "SHA512",,,,,, + "SHA512_HMAC",x,,x,,, + "AES_XCBC",x,,x,,, + "SNOW3G_UIA2",x,,,,x, + "KASUMI_F9",,,,,,x Supported AEAD Algorithms .. csv-table:: - :header: "AEAD Algorithms", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g" + :header: "AEAD Algorithms", "qat", "null", "aesni_mb", "aesni_gcm", "snow3g", "kasumi" :stub-columns: 1 - "AES_GCM_128",x,,x,, - "AES_GCM_192",x,,,, - "AES_GCM_256",x,,,, + "AES_GCM_128",x,,x,,, + "AES_GCM_192",x,,,,, + "AES_GCM_256",x,,,,, diff --git a/doc/guides/cryptodevs/qat.rst b/doc/guides/cryptodevs/qat.rst index 4b8f782a..cae19582 100644 --- a/doc/guides/cryptodevs/qat.rst +++ b/doc/guides/cryptodevs/qat.rst @@ -44,6 +44,9 @@ Cipher algorithms: * ``RTE_CRYPTO_SYM_CIPHER_AES128_CBC`` * ``RTE_CRYPTO_SYM_CIPHER_AES192_CBC`` * ``RTE_CRYPTO_SYM_CIPHER_AES256_CBC`` +* ``RTE_CRYPTO_SYM_CIPHER_AES128_CTR`` +* ``RTE_CRYPTO_SYM_CIPHER_AES192_CTR`` +* ``RTE_CRYPTO_SYM_CIPHER_AES256_CTR`` * ``RTE_CRYPTO_SYM_CIPHER_SNOW3G_UEA2`` * ``RTE_CRYPTO_CIPHER_AES_GCM`` diff --git a/doc/guides/cryptodevs/snow3g.rst b/doc/guides/cryptodevs/snow3g.rst index c1098b1a..670a62a9 100644 --- a/doc/guides/cryptodevs/snow3g.rst +++ b/doc/guides/cryptodevs/snow3g.rst @@ -51,55 +51,33 @@ Limitations ----------- * Chained mbufs are not supported. -* Snow3g(UEA2) supported only if cipher length, cipher offset fields are byte-aligned. -* Snow3g(UIA2) supported only if hash length, hash offset fields are byte-aligned. +* Snow3g(UIA2) supported only if hash offset field is byte-aligned. +* In-place bit-level operations for Snow3g(UEA2) are not supported + (if length and/or offset of data to be ciphered is not byte-aligned). Installation ------------ -To build DPDK with the SNOW3G_PMD the user is required to download -the export controlled ``libsso`` library, by requesting it from -``_, -and compiling it on their system before building DPDK:: - - make -f Makefile_snow3g - -**Note**: If using a gcc version higher than 5.0, and compilation fails, apply the following patch: - -.. code-block:: diff - - /libsso/src/snow3g/sso_snow3g.c - - static inline void ClockFSM_4(sso_snow3gKeyState4_t *pCtx, __m128i *data) - { - __m128i F, R; - - uint32_t K, L; - + uint32_t K; - + /* Declare unused if SNOW3G_WSM/SNB are defined */ - + uint32_t L __attribute__ ((unused)) = 0; - - F = _mm_add_epi32(pCtx->LFSR_X[15], pCtx->FSM_X[0]); - R = _mm_xor_si128(pCtx->LFSR_X[5], pCtx->FSM_X[2]); - - /libsso/include/sso_snow3g_internal.h - - -inline void ClockFSM_1(sso_snow3gKeyState1_t *pCtx, uint32_t *data); - -inline void ClockLFSR_1(sso_snow3gKeyState1_t *pCtx); - -inline void sso_snow3gStateInitialize_1(sso_snow3gKeyState1_t * pCtx, sso_snow3g_key_schedule_t *pKeySched, uint8_t *pIV); - +void ClockFSM_1(sso_snow3gKeyState1_t *pCtx, uint32_t *data); - +void ClockLFSR_1(sso_snow3gKeyState1_t *pCtx); - +void sso_snow3gStateInitialize_1(sso_snow3gKeyState1_t * pCtx, sso_snow3g_key_schedule_t *pKeySched, uint8_t *pIV); +To build DPDK with the KASUMI_PMD the user is required to download +the export controlled ``libsso_snow3g`` library, by requesting it from +``_. +Once approval has been granted, the user needs to log in +``_ +and click on "Snow3G Bit Stream crypto library" link, to download the library. +After downloading the library, the user needs to unpack and compile it +on their system before building DPDK:: + make snow3G Initialization -------------- In order to enable this virtual crypto PMD, user must: -* Export the environmental variable LIBSSO_PATH with the path where - the library was extracted. +* Export the environmental variable LIBSSO_SNOW3G_PATH with the path where + the library was extracted (snow3g folder). -* Build the LIBSSO library (explained in Installation section). +* Build the LIBSSO_SNOW3G library (explained in Installation section). * Set CONFIG_RTE_LIBRTE_PMD_SNOW3G=y in config/common_base. diff --git a/doc/guides/faq/faq.rst b/doc/guides/faq/faq.rst index 368374f5..3228b92c 100644 --- a/doc/guides/faq/faq.rst +++ b/doc/guides/faq/faq.rst @@ -88,9 +88,7 @@ the wrong socket, the application simply will not start. On application startup, there is a lot of EAL information printed. Is there any way to reduce this? --------------------------------------------------------------------------------------------------- -Yes, each EAL has a configuration file that is located in the /config directory. Within each configuration file, you will find CONFIG_RTE_LOG_LEVEL=8. -You can change this to a lower value, such as 6 to reduce this printout of debug information. The following is a list of LOG levels that can be found in the rte_log.h file. -You must remove, then rebuild, the EAL directory for the change to become effective as the configuration file creates the rte_config.h file in the EAL directory. +Yes, the option ``--log-level=`` accepts one of these numbers: .. code-block:: c @@ -103,6 +101,9 @@ You must remove, then rebuild, the EAL directory for the change to become effect #define RTE_LOG_INFO 7U /* Informational. */ #define RTE_LOG_DEBUG 8U /* Debug-level messages. */ +It is also possible to change the maximum (and default level) at compile time +with ``CONFIG_RTE_LOG_LEVEL``. + How can I tune my network application to achieve lower latency? --------------------------------------------------------------- diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst b/doc/guides/freebsd_gsg/build_dpdk.rst index ceacf7f8..1d92c089 100644 --- a/doc/guides/freebsd_gsg/build_dpdk.rst +++ b/doc/guides/freebsd_gsg/build_dpdk.rst @@ -263,7 +263,7 @@ To avoid this error, reduce the number of buffers or the buffer size. Loading the DPDK nic_uio Module ------------------------------- -After loading the contigmem module, the ``nic_uio must`` also be loaded into the +After loading the contigmem module, the ``nic_uio`` module must also be loaded into the running kernel prior to running any DPDK application. This module must be loaded using the kldload command as shown below (assuming that the current directory is the DPDK target directory). diff --git a/doc/guides/linux_gsg/enable_func.rst b/doc/guides/linux_gsg/enable_func.rst index 076770f2..ec0e04d9 100644 --- a/doc/guides/linux_gsg/enable_func.rst +++ b/doc/guides/linux_gsg/enable_func.rst @@ -186,21 +186,6 @@ Check with the local Intel's Network Division application engineers for firmware The base driver to support firmware version of FVL3E will be integrated in the next DPDK release, so currently the validated firmware version is 4.2.6. -Enabling Extended Tag -~~~~~~~~~~~~~~~~~~~~~ - -PCI configuration of ``extended_tag`` has big impact on small packet size -performance of 40G ports. Enabling ``extended_tag`` can help 40G port to -achieve the best performance, especially for small packet size. - -* Disabling/enabling ``extended_tag`` can be done in some BIOS implementations. - -* If BIOS does not enable it, and does not support changing it, tools - (e.g. ``setpci`` on Linux) can be used to enable or disable ``extended_tag``. - -* From release 16.04, ``extended_tag`` is enabled by default during port - initialization, users don't need to care about that anymore. - Use 16 Bytes RX Descriptor Size ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/guides/linux_gsg/sys_reqs.rst b/doc/guides/linux_gsg/sys_reqs.rst index 959709e5..b3215448 100644 --- a/doc/guides/linux_gsg/sys_reqs.rst +++ b/doc/guides/linux_gsg/sys_reqs.rst @@ -101,6 +101,9 @@ Compilation of the DPDK * libpcap headers and libraries (libpcap-devel) to compile and use the libpcap-based poll-mode driver. This driver is disabled by default and can be enabled by setting ``CONFIG_RTE_LIBRTE_PMD_PCAP=y`` in the build time config file. +* libarchive headers and library are needed for some unit tests using tar to get their resources. + + Running DPDK Applications ------------------------- diff --git a/doc/guides/nics/bnxt.rst b/doc/guides/nics/bnxt.rst new file mode 100644 index 00000000..2669e984 --- /dev/null +++ b/doc/guides/nics/bnxt.rst @@ -0,0 +1,49 @@ +.. BSD LICENSE + Copyright 2016 Broadcom Limited + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Broadcom Limited nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +bnxt poll mode driver library +============================= + +The bnxt poll mode library (**librte_pmd_bnxt**) implements support for +**Broadcom NetXtreme® C-Series**. These adapters support Standards- +compliant 10/25/50Gbps 30MPPS full-duplex throughput. + +Information about this family of adapters can be found in the +`NetXtreme® Brand section `_ +of the `Broadcom web site `_. + +Limitations +----------- + +With the current driver, allocated mbufs must be large enough to hold +the entire received frame. If the mbufs are not large enough, the +packets will be dropped. This is most limiting when jumbo frames are +used. + +SR-IOV is not supported. diff --git a/doc/guides/nics/fm10k.rst b/doc/guides/nics/fm10k.rst index c4915d82..7fc48624 100644 --- a/doc/guides/nics/fm10k.rst +++ b/doc/guides/nics/fm10k.rst @@ -157,10 +157,9 @@ Switch manager The Intel FM10000 family of NICs integrate a hardware switch and multiple host interfaces. The FM10000 PMD driver only manages host interfaces. For the switch component another switch driver has to be loaded prior to to the -FM10000 PMD driver. The switch driver can be acquired for Intel support or -from the `Match Interface `_ project. +FM10000 PMD driver. The switch driver can be acquired from Intel support. Only Testpoint is validated with DPDK, the latest version that has been -validated with DPDK2.2 is 4.1.6. +validated with DPDK is 4.1.6. CRC striping ~~~~~~~~~~~~ diff --git a/doc/guides/nics/i40e.rst b/doc/guides/nics/i40e.rst index 934eb027..da695afd 100644 --- a/doc/guides/nics/i40e.rst +++ b/doc/guides/nics/i40e.rst @@ -366,3 +366,48 @@ Delete all flow director rules on a port: testpmd> flush_flow_director 0 +Floating VEB +~~~~~~~~~~~~~ + +The Intel® Ethernet Controller X710 and XL710 Family support a feature called +"Floating VEB". + +A Virtual Ethernet Bridge (VEB) is an IEEE Edge Virtual Bridging (EVB) term +for functionality that allows local switching between virtual endpoints within +a physical endpoint and also with an external bridge/network. + +A "Floating" VEB doesn't have an uplink connection to the outside world so all +switching is done internally and remains within the host. As such, this +feature provides security benefits. + +In addition, a Floating VEB overcomes a limitation of normal VEBs where they +cannot forward packets when the physical link is down. Floating VEBs don't need +to connect to the NIC port so they can still forward traffic from VF to VF +even when the physical link is down. + +Therefore, with this feature enabled VFs can be limited to communicating with +each other but not an outside network, and they can do so even when there is +no physical uplink on the associated NIC port. + +To enable this feature, the user should pass a ``devargs`` parameter to the +EAL, for example:: + + -w 84:00.0,enable_floating_veb=1 + +In this configuration the PMD will use the floating VEB feature for all the +VFs created by this PF device. + +Alternatively, the user can specify which VFs need to connect to this floating +VEB using the ``floating_veb_list`` argument:: + + -w 84:00.0,enable_floating_veb=1,floating_veb_list=1;3-4 + +In this example ``VF1``, ``VF3`` and ``VF4`` connect to the floating VEB, +while other VFs connect to the normal VEB. + +The current implementation only supports one floating VEB and one regular +VEB. VFs can connect to a floating VEB or a regular VEB according to the +configuration passed on the EAL command line. + +The floating VEB functionality requires a NIC firmware version of 5.0 +or greater. diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst index 769f6770..92d56a59 100644 --- a/doc/guides/nics/index.rst +++ b/doc/guides/nics/index.rst @@ -37,6 +37,7 @@ Network Interface Controller Drivers overview bnx2x + bnxt cxgbe e1000em ena @@ -48,7 +49,9 @@ Network Interface Controller Drivers mlx4 mlx5 nfp + qede szedata2 + thunderx virtio vhost vmxnet3 diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index b6f91e6a..063c4a54 100644 --- a/doc/guides/nics/mlx5.rst +++ b/doc/guides/nics/mlx5.rst @@ -86,11 +86,11 @@ Features - Hardware checksum offloads. - Flow director (RTE_FDIR_MODE_PERFECT and RTE_FDIR_MODE_PERFECT_MAC_VLAN). - Secondary process TX is supported. +- KVM and VMware ESX SR-IOV modes are supported. Limitations ----------- -- KVM and VMware ESX SR-IOV modes are not supported yet. - Inner RSS for VXLAN frames is not supported yet. - Port statistics through software counters only. - Hardware checksum offloads for VXLAN inner header are not supported yet. @@ -114,23 +114,6 @@ These options can be modified in the ``.config`` file. adds additional run-time checks and debugging messages at the cost of lower performance. -- ``CONFIG_RTE_LIBRTE_MLX5_SGE_WR_N`` (default **4**) - - Number of scatter/gather elements (SGEs) per work request (WR). Lowering - this number improves performance but also limits the ability to receive - scattered packets (packets that do not fit a single mbuf). The default - value is a safe tradeoff. - -- ``CONFIG_RTE_LIBRTE_MLX5_MAX_INLINE`` (default **0**) - - Amount of data to be inlined during TX operations. Improves latency. - Can improve PPS performance when PCI backpressure is detected and may be - useful for scenarios involving heavy traffic on many queues. - - Since the additional software logic necessary to handle this mode can - lower performance when there is no backpressure, it is not enabled by - default. - - ``CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE`` (default **8**) Maximum number of cached memory pools (MPs) per TX queue. Each MP from @@ -142,16 +125,6 @@ These options can be modified in the ``.config`` file. Environment variables ~~~~~~~~~~~~~~~~~~~~~ -- ``MLX5_ENABLE_CQE_COMPRESSION`` - - A nonzero value lets ConnectX-4 return smaller completion entries to - improve performance when PCI backpressure is detected. It is most useful - for scenarios involving heavy traffic on many queues. - - Since the additional software logic necessary to handle this mode can - lower performance when there is no backpressure, it is not enabled by - default. - - ``MLX5_PMD_ENABLE_PADDING`` Enables HW packet padding in PCI bus transactions. @@ -175,6 +148,39 @@ Run-time configuration - **ethtool** operations on related kernel interfaces also affect the PMD. +- ``rxq_cqe_comp_en`` parameter [int] + + A nonzero value enables the compression of CQE on RX side. This feature + allows to save PCI bandwidth and improve performance at the cost of a + slightly higher CPU usage. Enabled by default. + +- ``txq_inline`` parameter [int] + + Amount of data to be inlined during TX operations. Improves latency. + Can improve PPS performance when PCI back pressure is detected and may be + useful for scenarios involving heavy traffic on many queues. + + It is not enabled by default (set to 0) since the additional software + logic necessary to handle this mode can lower performance when back + pressure is not expected. + +- ``txqs_min_inline`` parameter [int] + + Enable inline send only when the number of TX queues is greater or equal + to this value. + + This option should be used in combination with ``txq_inline`` above. + +- ``txq_mpw_en`` parameter [int] + + A nonzero value enables multi-packet send. This feature allows the TX + burst function to pack up to five packets in two descriptors in order to + save PCI bandwidth and improve performance at the cost of a slightly + higher CPU usage. + + It is currently only supported on the ConnectX-4 Lx family of adapters. + Enabled by default. + Prerequisites ------------- @@ -228,40 +234,12 @@ DPDK and must be installed separately: Currently supported by DPDK: -- Mellanox OFED **3.1-1.0.3**, **3.1-1.5.7.1** or **3.2-2.0.0.0** depending - on usage. - - The following features are supported with version **3.1-1.5.7.1** and - above only: - - - IPv6, UPDv6, TCPv6 RSS. - - RX checksum offloads. - - IBM POWER8. - - The following features are supported with version **3.2-2.0.0.0** and - above only: - - - Flow director. - - RX VLAN stripping. - - TX VLAN insertion. - - RX CRC stripping configuration. +- Mellanox OFED **3.3-1.0.0.0**. - Minimum firmware version: - With MLNX_OFED **3.1-1.0.3**: - - - ConnectX-4: **12.12.1240** - - ConnectX-4 Lx: **14.12.1100** - - With MLNX_OFED **3.1-1.5.7.1**: - - - ConnectX-4: **12.13.0144** - - ConnectX-4 Lx: **14.13.0144** - - With MLNX_OFED **3.2-2.0.0.0**: - - - ConnectX-4: **12.14.2036** - - ConnectX-4 Lx: **14.14.2036** + - ConnectX-4: **12.16.1006** + - ConnectX-4 Lx: **14.16.1006** Getting Mellanox OFED ~~~~~~~~~~~~~~~~~~~~~ diff --git a/doc/guides/nics/nfp.rst b/doc/guides/nics/nfp.rst index dfc36836..e4ebc712 100644 --- a/doc/guides/nics/nfp.rst +++ b/doc/guides/nics/nfp.rst @@ -61,9 +61,8 @@ instructions. DPDK runs in userspace and PMDs uses the Linux kernel UIO interface to allow access to physical devices from userspace. The NFP PMD requires -a separate UIO driver, **nfp_uio**, to perform correct -initialization. This driver is part of Netronome´s BSP and it is -equivalent to Intel's igb_uio driver. +the **igb_uio** UIO driver, available with DPDK, to perform correct +initialization. Building the software --------------------- @@ -201,27 +200,18 @@ Using the NFP PMD is not different to using other PMDs. Usual steps are: The module should now be listed by the lsmod command. -#. **To install the nfp_uio kernel module (manually):** This module supports - NFP-6xxx devices through the UIO interface. - - This module is part of Netronome´s BSP and it should be available when the - BSP is installed. +#. **To install the igb_uio kernel module (manually):** This module is part + of DPDK sources and configured by default (CONFIG_RTE_EAL_IGB_UIO=y). .. code-block:: console - modprobe nfp_uio.ko + modprobe igb_uio.ko The module should now be listed by the lsmod command. - Depending on which NFP modules are loaded, nfp_uio may be automatically - bound to the NFP PCI devices by the system. Otherwise the binding needs - to be done explicitly. This is the case when nfp_netvf, the Linux kernel - driver for NFP VFs, was loaded when VFs were created. As described later - in this document this configuration may also be performed using scripts - provided by the Netronome´s BSP. - - First the device needs to be unbound, for example from the nfp_netvf - driver: + Depending on which NFP modules are loaded, it could be necessary to + detach NFP devices from the nfp_netvf module. If this is the case the + device needs to be unbound, for example: .. code-block:: console @@ -232,30 +222,25 @@ Using the NFP PMD is not different to using other PMDs. Usual steps are: The output of lspci should now show that 0000:03:08.0 is not bound to any driver. - The next step is to add the NFP PCI ID to the NFP UIO driver: + The next step is to add the NFP PCI ID to the IGB UIO driver: .. code-block:: console - echo 19ee 6003 > /sys/bus/pci/drivers/nfp_uio/new_id + echo 19ee 6003 > /sys/bus/pci/drivers/igb_uio/new_id - And then to bind the device to the nfp_uio driver: + And then to bind the device to the igb_uio driver: .. code-block:: console - echo 0000:03:08.0 > /sys/bus/pci/drivers/nfp_uio/bind + echo 0000:03:08.0 > /sys/bus/pci/drivers/igb_uio/bind lspci -d19ee: -k - lspci should show that device bound to nfp_uio driver. - -#. **Using tools from Netronome´s BSP to install and bind modules:** DPDK provides - scripts which are useful for installing the UIO modules and for binding the - right device to those modules avoiding doing so manually. However, these scripts - have not support for Netronome´s UIO driver. Along with drivers, the BSP installs - those DPDK scripts slightly modified with support for Netronome´s UIO driver. + lspci should show that device bound to igb_uio driver. - Those specific scripts can be found in Netronome´s BSP installation directory. - Refer to BSP documentation for more information. +#. **Using scripts to install and bind modules:** DPDK provides scripts which are + useful for installing the UIO modules and for binding the right device to those + modules avoiding doing so manually: * **setup.sh** * **dpdk_nic_bind.py** diff --git a/doc/guides/nics/overview.rst b/doc/guides/nics/overview.rst index ed116e31..a23eb5cc 100644 --- a/doc/guides/nics/overview.rst +++ b/doc/guides/nics/overview.rst @@ -58,7 +58,7 @@ Most of these differences are summarized below. white-space: pre-wrap; text-align: center; vertical-align: top; - padding: 3px; + padding: 2px; } table#id1 th:first-child { vertical-align: bottom; @@ -74,76 +74,81 @@ Most of these differences are summarized below. .. table:: Features availability in networking drivers - ==================== = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = - Feature a b b b c e e e i i i i i i i i i i f f f f m m m n n p r s v v v v x - f n n o x 1 n n 4 4 4 4 g g x x x x m m m m l l p f u c i z h i i m e - p x x n g 0 a i 0 0 0 0 b b g g g g 1 1 1 1 x x i p l a n e o r r x n - a 2 2 d b 0 c e e e e v b b b b 0 0 0 0 4 5 p l p g d s t t n v - c x x i e 0 . v v f e e e e k k k k e a t i i e i - k v n . f f . v v . v v t o o t r - e f g . . . f f . f f a . 3 t - t v v v v v v 2 v - e e e e e e e - c c c c c c c - ==================== = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = - speed capabilities - link status X X X X X X X X X X X X X X X X X X - link status event X X X X X X X X X X X - queue status event X - Rx interrupt X X X X X X X X X X X X X X X - queue start/stop X X X X X X X X X X X X X X X X X X - MTU update X X X X X X X X X X - jumbo frame X X X X X X X X X X X X X X X X X X X X - scattered Rx X X X X X X X X X X X X X X X X X X X X X - LRO X X X X - TSO X X X X X X X X X X X X X X X X - promiscuous mode X X X X X X X X X X X X X X X X X X X X - allmulticast mode X X X X X X X X X X X X X X X X X X X - unicast MAC filter X X X X X X X X X X X X X X X X X X X X - multicast MAC filter X X X X X X X X X X X X X - RSS hash X X X X X X X X X X X X X X X X X X - RSS key update X X X X X X X X X X X X X X X - RSS reta update X X X X X X X X X X X X X X X - VMDq X X X X X X X - SR-IOV X X X X X X X X X - DCB X X X X X - VLAN filter X X X X X X X X X X X X X X X X X X - ethertype filter X X X X X - n-tuple filter X X X - SYN filter X X X - tunnel filter X X X X - flexible filter X - hash filter X X X X - flow director X X X X X - flow control X X X X X X X - rate limitation X X - traffic mirroring X X X X - CRC offload X X X X X X X X X X X X X X X - VLAN offload X X X X X X X X X X X X X X X - QinQ offload X X X X X X X - L3 checksum offload X X X X X X X X X X X X X X X X - L4 checksum offload X X X X X X X X X X X X X X X X - inner L3 checksum X X X X X X - inner L4 checksum X X X X X X - packet type parsing X X X X X X X X X X X X X X - timesync X X X X X - basic stats X X X X X X X X X X X X X X X X X X X X X X X X X X X - extended stats X X X X X X X X X X X X X X X X X - stats per queue X X X X X X X X X X X X - EEPROM dump X X X - registers dump X X X X X X - multiprocess aware X X X X X X X X X X X X X X X - BSD nic_uio X X X X X X X X X X X X X X X X X X X - Linux UIO X X X X X X X X X X X X X X X X X X X X X X - Linux VFIO X X X X X X X X X X X X X X X X X X X - other kdrv X X X - ARMv7 X X X - ARMv8 X X X - Power8 X X X - TILE-Gx X - x86-32 X X X X X X X X X X X X X X X X X X X X X X X X - x86-64 X X X X X X X X X X X X X X X X X X X X X X X X X X X - usage doc X X X X X X X X X - design doc - perf doc - ==================== = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = + ==================== = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = + Feature a b b b b c e e e i i i i i i i i i i f f f f m m m n n p q q r s t v v v v x + f n n n o x 1 n n 4 4 4 4 g g x x x x m m m m l l p f u c e e i z h h i i m e + p x x x n g 0 a i 0 0 0 0 b b g g g g 1 1 1 1 x x i p l a d d n e u o r r x n + a 2 2 t d b 0 c e e e e v b b b b 0 0 0 0 4 5 p l p e e g d n s t t n v + c x x i e 0 . v v f e e e e k k k k e v a d t i i e i + k v n . f f . v v . v v f t e o o t r + e f g . . . f f . f f a r . 3 t + t v v v v v v 2 x v + e e e e e e e + c c c c c c c + ==================== = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = + Speed capabilities + Link status Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Link status event Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Queue status event Y + Rx interrupt Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Queue start/stop Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + MTU update Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Jumbo frame Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Scattered Rx Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + LRO Y Y Y Y + TSO Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Promiscuous mode Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Allmulticast mode Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Unicast MAC filter Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Multicast MAC filter Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + RSS hash Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + RSS key update Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + RSS reta update Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + VMDq Y Y Y Y Y Y Y + SR-IOV Y Y Y Y Y Y Y Y Y Y Y + DCB Y Y Y Y Y + VLAN filter Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Ethertype filter Y Y Y Y Y + N-tuple filter Y Y Y + SYN filter Y Y Y + Tunnel filter Y Y Y Y + Flexible filter Y + Hash filter Y Y Y Y + Flow director Y Y Y Y Y + Flow control Y Y Y Y Y Y Y Y Y + Rate limitation Y Y + Traffic mirroring Y Y Y Y + CRC offload Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + VLAN offload Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y P + QinQ offload Y Y Y Y Y Y Y + L3 checksum offload Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + L4 checksum offload Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Inner L3 checksum Y Y Y Y Y Y + Inner L4 checksum Y Y Y Y Y Y + Packet type parsing Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Timesync Y Y Y Y Y + Basic stats Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Extended stats Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Stats per queue Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + EEPROM dump Y Y Y Y + Registers dump Y Y Y Y Y Y Y Y + Multiprocess aware Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + BSD nic_uio Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Linux UIO Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Linux VFIO Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Other kdrv Y Y Y + ARMv7 Y Y Y + ARMv8 Y Y Y Y Y Y Y Y + Power8 Y Y Y + TILE-Gx Y + x86-32 Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + x86-64 Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y Y + Usage doc Y Y Y Y Y Y Y Y Y Y Y Y + Design doc + Perf doc + ==================== = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = + +.. Note:: + + Features marked with "P" are partially supported. Refer to the appropriate + NIC guide in the following sections for details. diff --git a/doc/guides/nics/qede.rst b/doc/guides/nics/qede.rst new file mode 100644 index 00000000..f7ca8eb9 --- /dev/null +++ b/doc/guides/nics/qede.rst @@ -0,0 +1,314 @@ +.. BSD LICENSE + Copyright (c) 2016 QLogic Corporation + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of QLogic Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +QEDE Poll Mode Driver +====================== + +The QEDE poll mode driver library (**librte_pmd_qede**) implements support +for **QLogic FastLinQ QL4xxxx 25G/40G CNA** family of adapters as well +as their virtual functions (VF) in SR-IOV context. It is supported on +several standard Linux distros like RHEL7.x, SLES12.x and Ubuntu. +It is compile-tested under FreeBSD OS. + +More information can be found at `QLogic Corporation's Website +`_. + +Supported Features +------------------ + +- Unicast/Multicast filtering +- Promiscuous mode +- Allmulti mode +- Port hardware statistics +- Jumbo frames (using single buffer) +- VLAN offload - Filtering and stripping +- Stateless checksum offloads (IPv4/TCP/UDP) +- Multiple Rx/Tx queues (queue-pairs) +- RSS (with user configurable table/key) +- TSS +- Multiple MAC address +- Default pause flow control +- SR-IOV VF for 25G/40G modes + +Non-supported Features +---------------------- + +- Scatter-Gather Rx/Tx frames +- Unequal number of Rx/Tx queues +- MTU change (dynamic) +- SR-IOV PF +- Tunneling offloads +- Reload of the PMD after a non-graceful termination + +Supported QLogic Adapters +------------------------- + +- QLogic FastLinQ QL4xxxx 25G/40G/100G CNAs. + +Prerequisites +------------- + +- Requires firmware version **8.7.x.** and management firmware + version **8.7.x or higher**. Firmware may be available + inbox in certain newer Linux distros under the standard directory + ``E.g. /lib/firmware/qed/qed_init_values_zipped-8.7.7.0.bin`` + +- If the required firmware files are not available then visit + `QLogic Driver Download Center `_. + +- This driver relies on external zlib library (-lz) for uncompressing + the firmware file. + +Performance note +~~~~~~~~~~~~~~~~ + +- For better performance, it is recommended to use 4K or higher RX/TX rings. + +Config File Options +~~~~~~~~~~~~~~~~~~~ + +The following options can be modified in the ``.config`` file. Please note that +enabling debugging options may affect system performance. + +- ``CONFIG_RTE_LIBRTE_QEDE_PMD`` (default **y**) + + Toggle compilation of QEDE PMD driver. + +- ``CONFIG_RTE_LIBRTE_QEDE_DEBUG_INFO`` (default **n**) + + Toggle display of generic debugging messages. + +- ``CONFIG_RTE_LIBRTE_QEDE_DEBUG_DRIVER`` (default **n**) + + Toggle display of ecore related messages. + +- ``CONFIG_RTE_LIBRTE_QEDE_DEBUG_TX`` (default **n**) + + Toggle display of transmit fast path run-time messages. + +- ``CONFIG_RTE_LIBRTE_QEDE_DEBUG_RX`` (default **n**) + + Toggle display of receive fast path run-time messages. + +- ``CONFIG_RTE_LIBRTE_QEDE_FW`` (default **""**) + + Gives absolute path of firmware file. + ``Eg: "/lib/firmware/qed/qed_init_values_zipped-8.7.7.0.bin"`` + Empty string indicates driver will pick up the firmware file + from the default location. + +Driver Compilation +~~~~~~~~~~~~~~~~~~ + +To compile QEDE PMD for Linux x86_64 gcc target, run the following ``make`` +command:: + + cd + make config T=x86_64-native-linuxapp-gcc install + +To compile QEDE PMD for Linux x86_64 clang target, run the following ``make`` +command:: + + cd + make config T=x86_64-native-linuxapp-clang install + +To compile QEDE PMD for FreeBSD x86_64 clang target, run the following ``gmake`` +command:: + + cd + gmake config T=x86_64-native-bsdapp-clang install + +To compile QEDE PMD for FreeBSD x86_64 gcc target, run the following ``gmake`` +command:: + + cd + gmake config T=x86_64-native-bsdapp-gcc install -Wl,-rpath=\ + /usr/local/lib/gcc48 CC=gcc48 + + +Sample Application Notes +~~~~~~~~~~~~~~~~~~~~~~~~ + +This section demonstrates how to launch ``testpmd`` with QLogic 4xxxx +devices managed by ``librte_pmd_qede`` in Linux operating system. + +#. Request huge pages: + + .. code-block:: console + + echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages/ \ + nr_hugepages + +#. Load ``igb_uio`` driver: + + .. code-block:: console + + insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko + +#. Bind the QLogic 4xxxx adapters to ``igb_uio`` loaded in the + previous step: + + .. code-block:: console + + ./tools/dpdk_nic_bind.py --bind igb_uio 0000:84:00.0 0000:84:00.1 \ + 0000:84:00.2 0000:84:00.3 + +#. Start ``testpmd`` with basic parameters: + (Enable QEDE_DEBUG_INFO=y to view informational messages) + + .. code-block:: console + + testpmd -c 0xff1 -n 4 -- -i --nb-cores=8 --portmask=0xf --rxd=4096 \ + --txd=4096 --txfreet=4068 --enable-rx-cksum --rxq=4 --txq=4 \ + --rss-ip --rss-udp + + [...] + + EAL: PCI device 0000:84:00.0 on NUMA socket 1 + EAL: probe driver: 1077:1634 rte_qede_pmd + EAL: Not managed by a supported kernel driver, skipped + EAL: PCI device 0000:84:00.1 on NUMA socket 1 + EAL: probe driver: 1077:1634 rte_qede_pmd + EAL: Not managed by a supported kernel driver, skipped + EAL: PCI device 0000:88:00.0 on NUMA socket 1 + EAL: probe driver: 1077:1656 rte_qede_pmd + EAL: PCI memory mapped at 0x7f738b200000 + EAL: PCI memory mapped at 0x7f738b280000 + EAL: PCI memory mapped at 0x7f738b300000 + PMD: Chip details : BB1 + PMD: Driver version : QEDE PMD 8.7.9.0_1.0.0 + PMD: Firmware version : 8.7.7.0 + PMD: Management firmware version : 8.7.8.0 + PMD: Firmware file : /lib/firmware/qed/qed_init_values_zipped-8.7.7.0.bin + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_common_dev_init:macaddr \ + 00:0e:1e:d2:09:9c + [...] + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_tx_queue_setup:txq 0 num_desc 4096 \ + tx_free_thresh 4068 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_tx_queue_setup:txq 1 num_desc 4096 \ + tx_free_thresh 4068 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_tx_queue_setup:txq 2 num_desc 4096 \ + tx_free_thresh 4068 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_tx_queue_setup:txq 3 num_desc 4096 \ + tx_free_thresh 4068 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_rx_queue_setup:rxq 0 num_desc 4096 \ + rx_buf_size=2148 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_rx_queue_setup:rxq 1 num_desc 4096 \ + rx_buf_size=2148 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_rx_queue_setup:rxq 2 num_desc 4096 \ + rx_buf_size=2148 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_rx_queue_setup:rxq 3 num_desc 4096 \ + rx_buf_size=2148 socket 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_dev_start:port 0 + [QEDE PMD: (84:00.0:dpdk-port-0)]qede_dev_start:link status: down + [...] + Checking link statuses... + Port 0 Link Up - speed 25000 Mbps - full-duplex + Port 1 Link Up - speed 25000 Mbps - full-duplex + Port 2 Link Up - speed 25000 Mbps - full-duplex + Port 3 Link Up - speed 25000 Mbps - full-duplex + Done + testpmd> + + +SR-IOV: Prerequisites and Sample Application Notes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This section provides instructions to configure SR-IOV with Linux OS. + +**Note**: librte_pmd_qede will be used to bind to SR-IOV VF device and Linux native kernel driver (QEDE) will function as SR-IOV PF driver. + +#. Verify SR-IOV and ARI capability is enabled on the adapter using ``lspci``: + + .. code-block:: console + + lspci -s -vvv + + Example output: + + .. code-block:: console + + [...] + Capabilities: [1b8 v1] Alternative Routing-ID Interpretation (ARI) + [...] + Capabilities: [1c0 v1] Single Root I/O Virtualization (SR-IOV) + [...] + Kernel driver in use: igb_uio + +#. Load the kernel module: + + .. code-block:: console + + modprobe qede + + Example output: + + .. code-block:: console + + systemd-udevd[4848]: renamed network interface eth0 to ens5f0 + systemd-udevd[4848]: renamed network interface eth1 to ens5f1 + +#. Bring up the PF ports: + + .. code-block:: console + + ifconfig ens5f0 up + ifconfig ens5f1 up + +#. Create VF device(s): + + Echo the number of VFs to be created into ``"sriov_numvfs"`` sysfs entry + of the parent PF. + + Example output: + + .. code-block:: console + + echo 2 > /sys/devices/pci0000:00/0000:00:03.0/0000:81:00.0/sriov_numvfs + + +#. Assign VF MAC address: + + Assign MAC address to the VF using iproute2 utility. The syntax is:: + + ip link set vf mac + + Example output: + + .. code-block:: console + + ip link set ens5f0 vf 0 mac 52:54:00:2f:9d:e8 + + +#. PCI Passthrough: + + The VF devices may be passed through to the guest VM using ``virt-manager`` or + ``virsh``. QEDE PMD should be used to bind the VF devices in the guest VM + using the instructions outlined in the Application notes above. diff --git a/doc/guides/nics/thunderx.rst b/doc/guides/nics/thunderx.rst new file mode 100644 index 00000000..e38f260b --- /dev/null +++ b/doc/guides/nics/thunderx.rst @@ -0,0 +1,354 @@ +.. BSD LICENSE + Copyright (C) Cavium networks Ltd. 2016. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Cavium networks nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +ThunderX NICVF Poll Mode Driver +=============================== + +The ThunderX NICVF PMD (**librte_pmd_thunderx_nicvf**) provides poll mode driver +support for the inbuilt NIC found in the **Cavium ThunderX** SoC family +as well as their virtual functions (VF) in SR-IOV context. + +More information can be found at `Cavium Networks Official Website +`_. + +Features +-------- + +Features of the ThunderX PMD are: + +- Multiple queues for TX and RX +- Receive Side Scaling (RSS) +- Packet type information +- Checksum offload +- Promiscuous mode +- Multicast mode +- Port hardware statistics +- Jumbo frames +- Link state information +- Scattered and gather for TX and RX +- VLAN stripping +- SR-IOV VF +- NUMA support + +Supported ThunderX SoCs +----------------------- +- CN88xx + +Prerequisites +------------- +- Follow the DPDK :ref:`Getting Started Guide for Linux ` to setup the basic DPDK environment. + +Pre-Installation Configuration +------------------------------ + +Config File Options +~~~~~~~~~~~~~~~~~~~ + +The following options can be modified in the ``config`` file. +Please note that enabling debugging options may affect system performance. + +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_PMD`` (default ``n``) + + By default it is enabled only for defconfig_arm64-thunderx-* config. + Toggle compilation of the ``librte_pmd_thunderx_nicvf`` driver. + +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_INIT`` (default ``n``) + + Toggle display of initialization related messages. + +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_RX`` (default ``n``) + + Toggle display of receive fast path run-time message + +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_TX`` (default ``n``) + + Toggle display of transmit fast path run-time message + +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_DRIVER`` (default ``n``) + + Toggle display of generic debugging messages + +- ``CONFIG_RTE_LIBRTE_THUNDERX_NICVF_DEBUG_MBOX`` (default ``n``) + + Toggle display of PF mailbox related run-time check messages + +Driver Compilation +~~~~~~~~~~~~~~~~~~ + +To compile the ThunderX NICVF PMD for Linux arm64 gcc target, run the +following “make” command: + +.. code-block:: console + + cd + make config T=arm64-thunderx-linuxapp-gcc install + +Linux +----- + +.. _thunderx_testpmd_example: + +Running testpmd +~~~~~~~~~~~~~~~ + +This section demonstrates how to launch ``testpmd`` with ThunderX NIC VF device +managed by ``librte_pmd_thunderx_nicvf`` in the Linux operating system. + +#. Load ``vfio-pci`` driver: + + .. code-block:: console + + modprobe vfio-pci + + .. _thunderx_vfio_noiommu: + +#. Enable **VFIO-NOIOMMU** mode (optional): + + .. code-block:: console + + echo 1 > /sys/module/vfio/parameters/enable_unsafe_noiommu_mode + + .. note:: + + **VFIO-NOIOMMU** is required only when running in VM context and should not be enabled otherwise. + See also :ref:`SR-IOV: Prerequisites and sample Application Notes `. + +#. Bind the ThunderX NIC VF device to ``vfio-pci`` loaded in the previous step: + + Setup VFIO permissions for regular users and then bind to ``vfio-pci``: + + .. code-block:: console + + ./tools/dpdk_nic_bind.py --bind vfio-pci 0002:01:00.2 + +#. Start ``testpmd`` with basic parameters: + + .. code-block:: console + + ./arm64-thunderx-linuxapp-gcc/app/testpmd -c 0xf -n 4 -w 0002:01:00.2 \ + -- -i --disable-hw-vlan-filter --crc-strip --no-flush-rx \ + --port-topology=loop + + Example output: + + .. code-block:: console + + ... + + PMD: rte_nicvf_pmd_init(): librte_pmd_thunderx nicvf version 1.0 + + ... + EAL: probe driver: 177d:11 rte_nicvf_pmd + EAL: using IOMMU type 1 (Type 1) + EAL: PCI memory mapped at 0x3ffade50000 + EAL: Trying to map BAR 4 that contains the MSI-X table. + Trying offsets: 0x40000000000:0x0000, 0x10000:0x1f0000 + EAL: PCI memory mapped at 0x3ffadc60000 + PMD: nicvf_eth_dev_init(): nicvf: device (177d:11) 2:1:0:2 + PMD: nicvf_eth_dev_init(): node=0 vf=1 mode=tns-bypass sqs=false + loopback_supported=true + PMD: nicvf_eth_dev_init(): Port 0 (177d:11) mac=a6:c6:d9:17:78:01 + Interactive-mode selected + Configuring Port 0 (socket 0) + ... + + PMD: nicvf_dev_configure(): Configured ethdev port0 hwcap=0x0 + Port 0: A6:C6:D9:17:78:01 + Checking link statuses... + Port 0 Link Up - speed 10000 Mbps - full-duplex + Done + testpmd> + +.. _thunderx_sriov_example: + +SR-IOV: Prerequisites and sample Application Notes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Current ThunderX NIC PF/VF kernel modules maps each physical Ethernet port +automatically to virtual function (VF) and presented them as PCIe-like SR-IOV device. +This section provides instructions to configure SR-IOV with Linux OS. + +#. Verify PF devices capabilities using ``lspci``: + + .. code-block:: console + + lspci -vvv + + Example output: + + .. code-block:: console + + 0002:01:00.0 Ethernet controller: Cavium Networks Device a01e (rev 01) + ... + Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI) + ... + Capabilities: [180 v1] Single Root I/O Virtualization (SR-IOV) + ... + Kernel driver in use: thunder-nic + ... + + .. note:: + + Unless ``thunder-nic`` driver is in use make sure your kernel config includes ``CONFIG_THUNDER_NIC_PF`` setting. + +#. Verify VF devices capabilities and drivers using ``lspci``: + + .. code-block:: console + + lspci -vvv + + Example output: + + .. code-block:: console + + 0002:01:00.1 Ethernet controller: Cavium Networks Device 0011 (rev 01) + ... + Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI) + ... + Kernel driver in use: thunder-nicvf + ... + + 0002:01:00.2 Ethernet controller: Cavium Networks Device 0011 (rev 01) + ... + Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI) + ... + Kernel driver in use: thunder-nicvf + ... + + .. note:: + + Unless ``thunder-nicvf`` driver is in use make sure your kernel config includes ``CONFIG_THUNDER_NIC_VF`` setting. + +#. Verify PF/VF bind using ``dpdk_nic_bind.py``: + + .. code-block:: console + + ./tools/dpdk_nic_bind.py --status + + Example output: + + .. code-block:: console + + ... + 0002:01:00.0 'Device a01e' if= drv=thunder-nic unused=vfio-pci + 0002:01:00.1 'Device 0011' if=eth0 drv=thunder-nicvf unused=vfio-pci + 0002:01:00.2 'Device 0011' if=eth1 drv=thunder-nicvf unused=vfio-pci + ... + +#. Load ``vfio-pci`` driver: + + .. code-block:: console + + modprobe vfio-pci + +#. Bind VF devices to ``vfio-pci`` using ``dpdk_nic_bind.py``: + + .. code-block:: console + + ./tools/dpdk_nic_bind.py --bind vfio-pci 0002:01:00.1 + ./tools/dpdk_nic_bind.py --bind vfio-pci 0002:01:00.2 + +#. Verify VF bind using ``dpdk_nic_bind.py``: + + .. code-block:: console + + ./tools/dpdk_nic_bind.py --status + + Example output: + + .. code-block:: console + + ... + 0002:01:00.1 'Device 0011' drv=vfio-pci unused= + 0002:01:00.2 'Device 0011' drv=vfio-pci unused= + ... + 0002:01:00.0 'Device a01e' if= drv=thunder-nic unused=vfio-pci + ... + +#. Pass VF device to VM context (PCIe Passthrough): + + The VF devices may be passed through to the guest VM using qemu or + virt-manager or virsh etc. + ``librte_pmd_thunderx_nicvf`` or ``thunder-nicvf`` should be used to bind + the VF devices in the guest VM in :ref:`VFIO-NOIOMMU ` mode. + + Example qemu guest launch command: + + .. code-block:: console + + sudo qemu-system-aarch64 -name vm1 \ + -machine virt,gic_version=3,accel=kvm,usb=off \ + -cpu host -m 4096 \ + -smp 4,sockets=1,cores=8,threads=1 \ + -nographic -nodefaults \ + -kernel \ + -append "root=/dev/vda console=ttyAMA0 rw hugepagesz=512M hugepages=3" \ + -device vfio-pci,host=0002:01:00.1 \ + -drive file=,if=none,id=disk1,format=raw \ + -device virtio-blk-device,scsi=off,drive=disk1,id=virtio-disk1,bootindex=1 \ + -netdev tap,id=net0,ifname=tap0,script=/etc/qemu-ifup_thunder \ + -device virtio-net-device,netdev=net0 \ + -serial stdio \ + -mem-path /dev/huge + +#. Refer to section :ref:`Running testpmd ` for instruction + how to launch ``testpmd`` application. + +Limitations +----------- + +CRC striping +~~~~~~~~~~~~ + +The ThunderX SoC family NICs strip the CRC for every packets coming into the +host interface. So, CRC will be stripped even when the +``rxmode.hw_strip_crc`` member is set to 0 in ``struct rte_eth_conf``. + +Maximum packet length +~~~~~~~~~~~~~~~~~~~~~ + +The ThunderX SoC family NICs support a maximum of a 9K jumbo frame. The value +is fixed and cannot be changed. So, even when the ``rxmode.max_rx_pkt_len`` +member of ``struct rte_eth_conf`` is set to a value lower than 9200, frames +up to 9200 bytes can still reach the host interface. + +Maximum packet segments +~~~~~~~~~~~~~~~~~~~~~~~ + +The ThunderX SoC family NICs support up to 12 segments per packet when working +in scatter/gather mode. So, setting MTU will result with ``EINVAL`` when the +frame size does not fit in the maximum number of segments. + +Limited VFs +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ThunderX SoC family NICs has 128VFs and each VF has 8/8 queues +for RX/TX respectively. Current driver implementation has one to one mapping +between physical port and VF hence only limited VFs can be used. diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst index 4737dc2e..4b9895e4 100644 --- a/doc/guides/prog_guide/env_abstraction_layer.rst +++ b/doc/guides/prog_guide/env_abstraction_layer.rst @@ -322,8 +322,8 @@ Known Issues The rte_mempool uses a per-lcore cache inside the mempool. For non-EAL pthreads, ``rte_lcore_id()`` will not return a valid number. - So for now, when rte_mempool is used with non-EAL pthreads, the put/get operations will bypass the mempool cache and there is a performance penalty because of this bypass. - Support for non-EAL mempool cache is currently being enabled. + So for now, when rte_mempool is used with non-EAL pthreads, the put/get operations will bypass the default mempool cache and there is a performance penalty because of this bypass. + Only user-owned external caches can be used in a non-EAL context in conjunction with ``rte_mempool_generic_put()`` and ``rte_mempool_generic_get()`` that accept an explicit cache parameter. + rte_ring diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst index b862d0cf..07a4d354 100644 --- a/doc/guides/prog_guide/index.rst +++ b/doc/guides/prog_guide/index.rst @@ -52,6 +52,7 @@ Programmer's Guide packet_distrib_lib reorder_lib ip_fragment_reassembly_lib + pdump_lib multi_proc_support kernel_nic_interface thread_safety_dpdk_functions diff --git a/doc/guides/prog_guide/ivshmem_lib.rst b/doc/guides/prog_guide/ivshmem_lib.rst index 9401ccf2..b8a32e4c 100644 --- a/doc/guides/prog_guide/ivshmem_lib.rst +++ b/doc/guides/prog_guide/ivshmem_lib.rst @@ -79,6 +79,8 @@ The following is a simple guide to using the IVSHMEM Library API: Only data structures fully residing in DPDK hugepage memory work correctly. Supported data structures created by malloc(), mmap() or otherwise using non-DPDK memory cause undefined behavior and even a segmentation fault. + Specifically, because the memzone field in an rte_ring refers to a memzone structure residing in local memory, + accessing the memzone field in a shared rte_ring will cause an immediate segmentation fault. IVSHMEM Environment Configuration --------------------------------- diff --git a/doc/guides/prog_guide/mempool_lib.rst b/doc/guides/prog_guide/mempool_lib.rst index 5fae79ab..59466752 100644 --- a/doc/guides/prog_guide/mempool_lib.rst +++ b/doc/guides/prog_guide/mempool_lib.rst @@ -34,13 +34,12 @@ Mempool Library =============== A memory pool is an allocator of a fixed-sized object. -In the DPDK, it is identified by name and uses a ring to store free objects. +In the DPDK, it is identified by name and uses a mempool handler to store free objects. +The default mempool handler is ring based. It provides some other optional services such as a per-core object cache and an alignment helper to ensure that objects are padded to spread them equally on all DRAM or DDR3 channels. -This library is used by the -:ref:`Mbuf Library ` and the -:ref:`Environment Abstraction Layer ` (for logging history). +This library is used by the :ref:`Mbuf Library `. Cookies ------- @@ -116,7 +115,7 @@ While this may mean a number of buffers may sit idle on some core's cache, the speed at which a core can access its own cache for a specific memory pool without locks provides performance gains. The cache is composed of a small, per-core table of pointers and its length (used as a stack). -This cache can be enabled or disabled at creation of the pool. +This internal cache can be enabled or disabled at creation of the pool. The maximum size of the cache is static and is defined at compilation time (CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE). @@ -128,6 +127,39 @@ The maximum size of the cache is static and is defined at compilation time (CONF A mempool in Memory with its Associated Ring +Alternatively to the internal default per-lcore local cache, an application can create and manage external caches through the ``rte_mempool_cache_create()``, ``rte_mempool_cache_free()`` and ``rte_mempool_cache_flush()`` calls. +These user-owned caches can be explicitly passed to ``rte_mempool_generic_put()`` and ``rte_mempool_generic_get()``. +The ``rte_mempool_default_cache()`` call returns the default internal cache if any. +In contrast to the default caches, user-owned caches can be used by non-EAL threads too. + +Mempool Handlers +------------------------ + +This allows external memory subsystems, such as external hardware memory +management systems and software based memory allocators, to be used with DPDK. + +There are two aspects to a mempool handler. + +* Adding the code for your new mempool operations (ops). This is achieved by + adding a new mempool ops code, and using the ``REGISTER_MEMPOOL_OPS`` macro. + +* Using the new API to call ``rte_mempool_create_empty()`` and + ``rte_mempool_set_ops_byname()`` to create a new mempool and specifying which + ops to use. + +Several different mempool handlers may be used in the same application. A new +mempool can be created by using the ``rte_mempool_create_empty()`` function, +then using ``rte_mempool_set_ops_byname()`` to point the mempool to the +relevant mempool handler callback (ops) structure. + +Legacy applications may continue to use the old ``rte_mempool_create()`` API +call, which uses a ring based mempool handler by default. These applications +will need to be modified to use a new mempool handler. + +For applications that use ``rte_pktmbuf_create()``, there is a config setting +(``RTE_MBUF_DEFAULT_MEMPOOL_OPS``) that allows the application to make use of +an alternative mempool handler. + Use Cases --------- diff --git a/doc/guides/prog_guide/pdump_lib.rst b/doc/guides/prog_guide/pdump_lib.rst new file mode 100644 index 00000000..580ffcbd --- /dev/null +++ b/doc/guides/prog_guide/pdump_lib.rst @@ -0,0 +1,123 @@ +.. BSD LICENSE + Copyright(c) 2016 Intel Corporation. All rights reserved. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Intel Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +.. _pdump_library: + +The librte_pdump Library +======================== + +The ``librte_pdump`` library provides a framework for packet capturing in DPDK. +The library does the complete copy of the Rx and Tx mbufs to a new mempool and +hence it slows down the performance of the applications, so it is recommended +to use this library for debugging purposes. + +The library provides the following APIs to initialize the packet capture framework, to enable +or disable the packet capture, and to uninitialize it: + +* ``rte_pdump_init()``: + This API initializes the packet capture framework. + +* ``rte_pdump_enable()``: + This API enables the packet capture on a given port and queue. + Note: The filter option in the API is a place holder for future enhancements. + +* ``rte_pdump_enable_by_deviceid()``: + This API enables the packet capture on a given device id (``vdev name or pci address``) and queue. + Note: The filter option in the API is a place holder for future enhancements. + +* ``rte_pdump_disable()``: + This API disables the packet capture on a given port and queue. + +* ``rte_pdump_disable_by_deviceid()``: + This API disables the packet capture on a given device id (``vdev name or pci address``) and queue. + +* ``rte_pdump_uninit()``: + This API uninitializes the packet capture framework. + +* ``rte_pdump_set_socket_dir()``: + This API sets the server and client socket paths. + Note: This API is not thread-safe. + + +Operation +--------- + +The ``librte_pdump`` library works on a client/server model. The server is responsible for enabling or +disabling the packet capture and the clients are responsible for requesting the enabling or disabling of +the packet capture. + +The packet capture framework, as part of its initialization, creates the pthread and the server socket in +the pthread. The application that calls the framework initialization will have the server socket created, +either under the path that the application has passed or under the default path i.e. either ``/var/run`` for +root user or ``$HOME`` for non root user. + +Applications that request enabling or disabling of the packet capture will have the client socket created either under +the path that the application has passed or under the default path i.e. either ``/var/run/`` for root user or ``$HOME`` +for not root user to send the requests to the server. +The server socket will listen for client requests for enabling or disabling the packet capture. + + +Implementation Details +---------------------- + +The library API ``rte_pdump_init()``, initializes the packet capture framework by creating the pthread and the server +socket. The server socket in the pthread context will be listening to the client requests to enable or disable the +packet capture. + +The library APIs ``rte_pdump_enable()`` and ``rte_pdump_enable_by_deviceid()`` enables the packet capture. +On each call to these APIs, the library creates a separate client socket, creates the "pdump enable" request and sends +the request to the server. The server that is listening on the socket will take the request and enable the packet capture +by registering the Ethernet RX and TX callbacks for the given port or device_id and queue combinations. +Then the server will mirror the packets to the new mempool and enqueue them to the rte_ring that clients have passed +to these APIs. The server also sends the response back to the client about the status of the request that was processed. +After the response is received from the server, the client socket is closed. + +The library APIs ``rte_pdump_disable()`` and ``rte_pdump_disable_by_deviceid()`` disables the packet capture. +On each call to these APIs, the library creates a separate client socket, creates the "pdump disable" request and sends +the request to the server. The server that is listening on the socket will take the request and disable the packet +capture by removing the Ethernet RX and TX callbacks for the given port or device_id and queue combinations. The server +also sends the response back to the client about the status of the request that was processed. After the response is +received from the server, the client socket is closed. + +The library API ``rte_pdump_uninit()``, uninitializes the packet capture framework by closing the pthread and the +server socket. + +The library API ``rte_pdump_set_socket_dir()``, sets the given path as either server socket path +or client socket path based on the ``type`` argument of the API. +If the given path is ``NULL``, default path will be selected, i.e. either ``/var/run/`` for root user or ``$HOME`` +for non root user. Clients also need to call this API to set their server socket path if the server socket +path is different from default path. + + +Use Case: Packet Capturing +-------------------------- + +The DPDK ``app/pdump`` tool is developed based on this library to capture packets in DPDK. +Users can use this as an example to develop their own packet capturing tools. diff --git a/doc/guides/prog_guide/poll_mode_drv.rst b/doc/guides/prog_guide/poll_mode_drv.rst index 76986924..bf3ea9fd 100644 --- a/doc/guides/prog_guide/poll_mode_drv.rst +++ b/doc/guides/prog_guide/poll_mode_drv.rst @@ -299,10 +299,23 @@ Extended Statistics API ~~~~~~~~~~~~~~~~~~~~~~~ The extended statistics API allows each individual PMD to expose a unique set -of statistics. The client of the API provides an array of -``struct rte_eth_xstats`` type. Each ``struct rte_eth_xstats`` contains a -string and value pair. The amount of xstats exposed, and position of the -statistic in the array must remain constant during runtime. +of statistics. Accessing these from application programs is done via two +functions: + +* ``rte_eth_xstats_get``: Fills in an array of ``struct rte_eth_xstat`` + with extended statistics. +* ``rte_eth_xstats_get_names``: Fills in an array of + ``struct rte_eth_xstat_name`` with extended statistic name lookup + information. + +Each ``struct rte_eth_xstat`` contains an identifier and value pair, and +each ``struct rte_eth_xstat_name`` contains a string. Each identifier +within the ``struct rte_eth_xstat`` lookup array must have a corresponding +entry in the ``struct rte_eth_xstat_name`` lookup array. Within the latter +the index of the entry is the identifier the string is associated with. +These identifiers, as well as the number of extended statistic exposed, must +remain constant during runtime. Note that extended statistic identifiers are +driver-specific, and hence might not be the same for different ports. A naming scheme exists for the strings exposed to clients of the API. This is to allow scraping of the API for statistics of interest. The naming scheme uses diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst index 48e1fffb..14d5e675 100644 --- a/doc/guides/prog_guide/vhost_lib.rst +++ b/doc/guides/prog_guide/vhost_lib.rst @@ -1,5 +1,5 @@ .. BSD LICENSE - Copyright(c) 2010-2014 Intel Corporation. All rights reserved. + Copyright(c) 2010-2016 Intel Corporation. All rights reserved. All rights reserved. Redistribution and use in source and binary forms, with or without @@ -31,105 +31,194 @@ Vhost Library ============= -The vhost library implements a user space vhost driver. It supports both vhost-cuse -(cuse: user space character device) and vhost-user(user space socket server). -It also creates, manages and destroys vhost devices for corresponding virtio -devices in the guest. Vhost supported vSwitch could register callbacks to this -library, which will be called when a vhost device is activated or deactivated -by guest virtual machine. +The vhost library implements a user space virtio net server allowing the user +to manipulate the virtio ring directly. In another words, it allows the user +to fetch/put packets from/to the VM virtio net device. To achieve this, a +vhost library should be able to: + +* Access the guest memory: + + For QEMU, this is done by using the ``-object memory-backend-file,share=on,...`` + option. Which means QEMU will create a file to serve as the guest RAM. + The ``share=on`` option allows another process to map that file, which + means it can access the guest RAM. + +* Know all the necessary information about the vring: + + Information such as where the available ring is stored. Vhost defines some + messages to tell the backend all the information it needs to know how to + manipulate the vring. + +Currently, there are two ways to pass these messages and as a result there are +two Vhost implementations in DPDK: *vhost-cuse* (where the character devices +are in user space) and *vhost-user*. + +Vhost-cuse creates a user space character device and hook to a function ioctl, +so that all ioctl commands that are sent from the frontend (QEMU) will be +captured and handled. + +Vhost-user creates a Unix domain socket file through which messages are +passed. + +.. Note:: + + Since DPDK v2.2, the majority of the development effort has gone into + enhancing vhost-user, such as multiple queue, live migration, and + reconnect. Thus, it is strongly advised to use vhost-user instead of + vhost-cuse. + Vhost API Overview ------------------ -* Vhost driver registration +The following is an overview of the Vhost API functions: + +* ``rte_vhost_driver_register(path, flags)`` + + This function registers a vhost driver into the system. For vhost-cuse, a + ``/dev/path`` character device file will be created. For vhost-user server + mode, a Unix domain socket file ``path`` will be created. + + Currently two flags are supported (these are valid for vhost-user only): + + - ``RTE_VHOST_USER_CLIENT`` + + DPDK vhost-user will act as the client when this flag is given. See below + for an explanation. + + - ``RTE_VHOST_USER_NO_RECONNECT`` + + When DPDK vhost-user acts as the client it will keep trying to reconnect + to the server (QEMU) until it succeeds. This is useful in two cases: + + * When QEMU is not started yet. + * When QEMU restarts (for example due to a guest OS reboot). + + This reconnect option is enabled by default. However, it can be turned off + by setting this flag. - rte_vhost_driver_register registers the vhost driver into the system. - For vhost-cuse, character device file will be created under the /dev directory. - Character device name is specified as the parameter. - For vhost-user, a Unix domain socket server will be created with the parameter as - the local socket path. +* ``rte_vhost_driver_session_start()`` -* Vhost session start + This function starts the vhost session loop to handle vhost messages. It + starts an infinite loop, therefore it should be called in a dedicated + thread. - rte_vhost_driver_session_start starts the vhost session loop. - Vhost session is an infinite blocking loop. - Put the session in a dedicate DPDK thread. +* ``rte_vhost_driver_callback_register(virtio_net_device_ops)`` -* Callback register + This function registers a set of callbacks, to let DPDK applications take + the appropriate action when some events happen. The following events are + currently supported: - Vhost supported vSwitch could call rte_vhost_driver_callback_register to - register two callbacks, new_destory and destroy_device. - When virtio device is activated or deactivated by guest virtual machine, - the callback will be called, then vSwitch could put the device onto data - core or remove the device from data core by setting or unsetting - VIRTIO_DEV_RUNNING on the device flags. + * ``new_device(int vid)`` -* Read/write packets from/to guest virtual machine + This callback is invoked when a virtio net device becomes ready. ``vid`` + is the virtio net device ID. - rte_vhost_enqueue_burst transmit host packets to guest. - rte_vhost_dequeue_burst receives packets from guest. + * ``destroy_device(int vid)`` -* Feature enable/disable + This callback is invoked when a virtio net device shuts down (or when the + vhost connection is broken). - Now one negotiate-able feature in vhost is merge-able. - vSwitch could enable/disable this feature for performance consideration. + * ``vring_state_changed(int vid, uint16_t queue_id, int enable)`` -Vhost Implementation --------------------- + This callback is invoked when a specific queue's state is changed, for + example to enabled or disabled. -Vhost cuse implementation +* ``rte_vhost_enqueue_burst(vid, queue_id, pkts, count)`` + + Transmits (enqueues) ``count`` packets from host to guest. + +* ``rte_vhost_dequeue_burst(vid, queue_id, mbuf_pool, pkts, count)`` + + Receives (dequeues) ``count`` packets from guest, and stored them at ``pkts``. + +* ``rte_vhost_feature_disable/rte_vhost_feature_enable(feature_mask)`` + + This function disables/enables some features. For example, it can be used to + disable mergeable buffers and TSO features, which both are enabled by + default. + + +Vhost Implementations +--------------------- + +Vhost-cuse implementation ~~~~~~~~~~~~~~~~~~~~~~~~~ + When vSwitch registers the vhost driver, it will register a cuse device driver into the system and creates a character device file. This cuse driver will -receive vhost open/release/IOCTL message from QEMU simulator. +receive vhost open/release/IOCTL messages from the QEMU simulator. -When the open call is received, vhost driver will create a vhost device for the -virtio device in the guest. +When the open call is received, the vhost driver will create a vhost device +for the virtio device in the guest. -When VHOST_SET_MEM_TABLE IOCTL is received, vhost searches the memory region -to find the starting user space virtual address that maps the memory of guest -virtual machine. Through this virtual address and the QEMU pid, vhost could -find the file QEMU uses to map the guest memory. Vhost maps this file into its -address space, in this way vhost could fully access the guest physical memory, -which means vhost could access the shared virtio ring and the guest physical -address specified in the entry of the ring. +When the ``VHOST_SET_MEM_TABLE`` ioctl is received, vhost searches the memory +region to find the starting user space virtual address that maps the memory of +the guest virtual machine. Through this virtual address and the QEMU pid, +vhost can find the file QEMU uses to map the guest memory. Vhost maps this +file into its address space, in this way vhost can fully access the guest +physical memory, which means vhost could access the shared virtio ring and the +guest physical address specified in the entry of the ring. The guest virtual machine tells the vhost whether the virtio device is ready -for processing or is de-activated through VHOST_NET_SET_BACKEND message. -The registered callback from vSwitch will be called. +for processing or is de-activated through the ``VHOST_NET_SET_BACKEND`` +message. The registered callback from vSwitch will be called. -When the release call is released, vhost will destroy the device. +When the release call is made, vhost will destroy the device. -Vhost user implementation +Vhost-user implementation ~~~~~~~~~~~~~~~~~~~~~~~~~ -When vSwitch registers a vhost driver, it will create a Unix domain socket server -into the system. This server will listen for a connection and process the vhost message from -QEMU simulator. -When there is a new socket connection, it means a new virtio device has been created in -the guest virtual machine, and the vhost driver will create a vhost device for this virtio device. +Vhost-user uses Unix domain sockets for passing messages. This means the DPDK +vhost-user implementation has two options: + +* DPDK vhost-user acts as the server. + + DPDK will create a Unix domain socket server file and listen for + connections from the frontend. + + Note, this is the default mode, and the only mode before DPDK v16.07. + + +* DPDK vhost-user acts as the client. + + Unlike the server mode, this mode doesn't create the socket file; + it just tries to connect to the server (which responses to create the + file instead). + + When the DPDK vhost-user application restarts, DPDK vhost-user will try to + connect to the server again. This is how the "reconnect" feature works. + + Note: the "reconnect" feature requires **QEMU v2.7** (or above). + +No matter which mode is used, once a connection is established, DPDK +vhost-user will start receiving and processing vhost messages from QEMU. + +For messages with a file descriptor, the file descriptor can be used directly +in the vhost process as it is already installed by the Unix domain socket. -For messages with a file descriptor, the file descriptor could be directly used in the vhost -process as it is already installed by Unix domain socket. +The supported vhost messages are: - * VHOST_SET_MEM_TABLE - * VHOST_SET_VRING_KICK - * VHOST_SET_VRING_CALL - * VHOST_SET_LOG_FD - * VHOST_SET_VRING_ERR +* ``VHOST_SET_MEM_TABLE`` +* ``VHOST_SET_VRING_KICK`` +* ``VHOST_SET_VRING_CALL`` +* ``VHOST_SET_LOG_FD`` +* ``VHOST_SET_VRING_ERR`` -For VHOST_SET_MEM_TABLE message, QEMU will send us information for each memory region and its -file descriptor in the ancillary data of the message. The fd is used to map that region. +For ``VHOST_SET_MEM_TABLE`` message, QEMU will send information for each +memory region and its file descriptor in the ancillary data of the message. +The file descriptor is used to map that region. -There is no VHOST_NET_SET_BACKEND message as in vhost cuse to signal us whether virtio device -is ready or should be stopped. -VHOST_SET_VRING_KICK is used as the signal to put the vhost device onto data plane. -VHOST_GET_VRING_BASE is used as the signal to remove vhost device from data plane. +There is no ``VHOST_NET_SET_BACKEND`` message as in vhost-cuse to signal +whether the virtio device is ready or stopped. Instead, +``VHOST_SET_VRING_KICK`` is used as the signal to put the vhost device into +the data plane, and ``VHOST_GET_VRING_BASE`` is used as the signal to remove +the vhost device from the data plane. When the socket connection is closed, vhost will destroy the device. Vhost supported vSwitch reference --------------------------------- -For more vhost details and how to support vhost in vSwitch, please refer to vhost example in the -DPDK Sample Applications Guide. +For more vhost details and how to support vhost in vSwitch, please refer to +the vhost example in the DPDK Sample Applications Guide. diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst index 327fc2bf..f502f863 100644 --- a/doc/guides/rel_notes/deprecation.rst +++ b/doc/guides/rel_notes/deprecation.rst @@ -8,6 +8,9 @@ API and ABI deprecation notices are to be posted here. Deprecation Notices ------------------- +* The log history is deprecated. + It is voided in 16.07 and will be removed in release 16.11. + * The ethdev hotplug API is going to be moved to EAL with a notification mechanism added to crypto and ethdev libraries so that hotplug is now available to both of them. This API will be stripped of the device arguments @@ -20,73 +23,21 @@ Deprecation Notices do not need to care about the kind of devices that are being used, making it easier to add new buses later. -* The EAL function pci_config_space_set is deprecated in release 16.04 - and will be removed from 16.07. - Macros CONFIG_RTE_PCI_CONFIG, CONFIG_RTE_PCI_EXTENDED_TAG and - CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE will be removed. - The /sys entries extended_tag and max_read_request_size created by igb_uio - will be removed. - -* ABI changes are planned for struct rte_pci_id, i.e., add new field ``class``. - This new added ``class`` field can be used to probe pci device by class - related info. This change should impact size of struct rte_pci_id and struct - rte_pci_device. The release 16.04 does not contain these ABI changes, but - release 16.07 will. - -* The following fields have been deprecated in rte_eth_stats: - ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss, - tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff - -* The xstats API and rte_eth_xstats struct will be changed to allow retrieval - of values without any string copies or parsing. - No backwards compatibility is planned, as it would require code duplication - in every PMD that supports xstats. - * ABI changes are planned for adding four new flow types. This impacts RTE_ETH_FLOW_MAX. The release 2.2 does not contain these ABI changes, but release 2.3 will. [postponed] -* ABI change is planned for the rte_mempool structure to allow mempool - cache support to be dynamic depending on the mempool being created - needing cache support. Saves about 1.5M of memory per rte_mempool structure - by removing the per lcore cache memory. Change will occur in DPDK 16.07 - release and will skip the define RTE_NEXT_ABI in DPDK 16.04 release. The - code affected is app/test/test_mempool.c and librte_mempool/rte_mempool.[ch]. - The rte_mempool.local_cache will be converted from an array to a pointer to - allow for dynamic allocation of the per lcore cache memory. - -* ABI will change for rte_mempool struct to move the cache-related fields - to the more appropriate rte_mempool_cache struct. The mempool API is - also changed to enable external cache management that is not tied to EAL - threads. Some mempool get and put calls are removed in favor of a more - compact API. The ones that remain are backwards compatible and use the - per-lcore default cache if available. This change targets release 16.07. - -* The rte_mempool struct will be changed in 16.07 to facilitate the new - external mempool manager functionality. - The ring element will be replaced with a more generic 'pool' opaque pointer - to allow new mempool handlers to use their own user-defined mempool - layout. Also newly added to rte_mempool is a handler index. - The existing API will be backward compatible, but there will be new API - functions added to facilitate the creation of mempools using an external - handler. The 16.07 release will contain these changes. - -* The rte_mempool allocation will be changed in 16.07: - allocation of large mempool in several virtual memory chunks, new API - to populate a mempool, new API to free a mempool, allocation in - anonymous mapping, drop of specific dom0 code. These changes will - induce a modification of the rte_mempool structure, plus a - modification of the API of rte_mempool_obj_iter(), implying a breakage - of the ABI. +* The mbuf flags PKT_RX_VLAN_PKT and PKT_RX_QINQ_PKT are deprecated and + are respectively replaced by PKT_RX_VLAN_STRIPPED and + PKT_RX_QINQ_STRIPPED, that are better described. The old flags and + their behavior will be kept in 16.07 and will be removed in 16.11. -* ABI changes are planned for struct rte_port_source_params in order to - support PCAP file reading feature. The release 16.04 contains this ABI - change wrapped by RTE_NEXT_ABI macro. Release 16.07 will contain this - change, and no backwards compatibility is planned. +* The APIs rte_mempool_count and rte_mempool_free_count are being deprecated + on the basis that they are confusing to use - free_count actually returns + the number of allocated entries, not the number of free entries as expected. + They are being replaced by rte_mempool_avail_count and + rte_mempool_in_use_count respectively. -* A librte_vhost public structures refactor is planned for DPDK 16.07 - that requires both ABI and API change. - The proposed refactor would expose DPDK vhost dev to applications as - a handle, like the way kernel exposes an fd to user for locating a - specific file, and to keep all major structures internally, so that - we are likely to be free from ABI violations in future. +* The mempool functions for single/multi producer/consumer are deprecated and + will be removed in 16.11. + It is replaced by rte_mempool_generic_get/put functions. diff --git a/doc/guides/rel_notes/index.rst b/doc/guides/rel_notes/index.rst index 84317b81..52c63b40 100644 --- a/doc/guides/rel_notes/index.rst +++ b/doc/guides/rel_notes/index.rst @@ -36,6 +36,7 @@ Release Notes :numbered: rel_description + release_16_07 release_16_04 release_2_2 release_2_1 diff --git a/doc/guides/rel_notes/known_issues.rst b/doc/guides/rel_notes/known_issues.rst index 923a202d..5ec19876 100644 --- a/doc/guides/rel_notes/known_issues.rst +++ b/doc/guides/rel_notes/known_issues.rst @@ -532,25 +532,6 @@ Cannot set link speed on Intel® 40G Ethernet controller Poll Mode Driver (PMD). -Stopping the port does not down the link on Intel® 40G Ethernet controller --------------------------------------------------------------------------- - -**Description**: - On Intel® 40G Ethernet Controller stopping the port does not really down the port link. - -**Implication**: - The port link will be still up after stopping the port. - -**Resolution/Workaround**: - None - -**Affected Environment/Platform**: - All. - -**Driver/Module**: - Poll Mode Driver (PMD). - - Devices bound to igb_uio with VT-d enabled do not work on Linux kernel 3.15-3.17 -------------------------------------------------------------------------------- @@ -618,3 +599,24 @@ DPDK may not build on some Intel CPUs using clang < 3.7.0 **Driver/Module**: Environment Abstraction Layer (EAL). + + +The last EAL argument is replaced by the program name in argv[] +--------------------------------------------------------------- + +**Description**: + The last EAL argument is replaced by program name in ``argv[]`` after ``eal_parse_args`` is called. + This is the intended behavior but it causes the pointer to the last EAL argument to be lost. + +**Implication**: + If the last EAL argument in ``argv[]`` is generated by a malloc function, changing it will cause memory + issues when freeing the argument. + +**Resolution/Workaround**: + An application should not consider the value in ``argv[]`` as unchanged. + +**Affected Environment/Platform**: + ALL. + +**Driver/Module**: + Environment Abstraction Layer (EAL). diff --git a/doc/guides/rel_notes/release_16_07.rst b/doc/guides/rel_notes/release_16_07.rst new file mode 100644 index 00000000..569f562f --- /dev/null +++ b/doc/guides/rel_notes/release_16_07.rst @@ -0,0 +1,358 @@ +DPDK Release 16.07 +================== + +.. **Read this first.** + + The text below explains how to update the release notes. + + Use proper spelling, capitalization and punctuation in all sections. + + Variable and config names should be quoted as fixed width text: ``LIKE_THIS``. + + Build the docs and view the output file to ensure the changes are correct:: + + make doc-guides-html + + firefox build/doc/html/guides/rel_notes/release_16_07.html + + +New Features +------------ + +.. This section should contain new features added in this release. Sample format: + + * **Add a title in the past tense with a full stop.** + + Add a short 1-2 sentence description in the past tense. The description + should be enough to allow someone scanning the release notes to understand + the new feature. + + If the feature adds a lot of sub-features you can use a bullet list like this. + + * Added feature foo to do something. + * Enhanced feature bar to do something else. + + Refer to the previous release notes for examples. + +* **Removed mempool cache if not needed.** + + The size of the mempool structure is reduced if the per-lcore cache is disabled. + +* **Added mempool external cache for non-EAL thread.** + + Added new functions to create, free or flush a user-owned mempool + cache for non-EAL threads. Previously, cache was always disabled + on these threads. + +* **Changed the memory allocation in mempool library.** + + * Added ability to allocate a large mempool in virtually fragmented memory. + * Added new APIs to populate a mempool with memory. + * Added an API to free a mempool. + * Modified the API of rte_mempool_obj_iter() function. + * Dropped specific Xen Dom0 code. + * Dropped specific anonymous mempool code in testpmd. + +* **Added new driver for Broadcom NetXtreme-C devices.** + + Added the new bnxt driver for Broadcom NetXtreme-C devices. See the + "Network Interface Controller Drivers" document for more details on this + new driver. + +* **Added new driver for ThunderX nicvf device.** + +* **Added mailbox interrupt support for ixgbe and igb VFs.** + + When the physical NIC link comes down or up, the PF driver will send a + mailbox message to notify each VF. To handle this link up/down event, + add mailbox interrupts support to receive the message and allow the app to + register a callback for it. + +* **Updated the ixgbe base driver.** + + The ixgbe base driver was updated with changes including the + following: + + * Added sgmii link for X550. + * Added mac link setup for X550a SFP and SFP+. + * Added KR support for X550em_a. + * Added new phy definitions for M88E1500. + * Added support for the VLVF to be bypassed when adding/removing a VFTA entry. + * Added X550a flow control auto negotiation support. + +* **Updated the i40e base driver.** + + Updated the i40e base driver, which includes support for new devices IDs. + +* **Supported virtio on IBM POWER8.** + + The ioports are mapped in memory when using Linux UIO. + +* **Virtio support for containers.** + + Add a new virtual device, named virtio-user, to support virtio for containers. + + Known limitations: + + * Control queue and multi-queue are not supported yet. + * Cannot work with --huge-unlink. + * Cannot work with --no-huge. + * Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8) hugepages. + * Root privilege is a must for sorting hugepages by physical address. + * Can only be used with vhost user backend. + +* **Added vhost-user client mode.** + + DPDK vhost-user could be the server as well as the client. It supports + server mode only before, now it also supports client mode. Client mode + is enabled when ``RTE_VHOST_USER_CLIENT`` flag is set while calling + ``rte_vhost_driver_register``. + + When DPDK vhost-user restarts from normal or abnormal quit (say crash), + the client mode would allow DPDK to establish the connect again. Note + that a brand new QEMU version (v2.7 or above) is needed, otherwise, the + reconnect won't work. + + DPDK vhost-user will also try to reconnect by default when + + * the first connect fails (when QEMU is not started yet) + * the connection is broken (when QEMU restarts) + + It can be turned off if flag ``RTE_VHOST_USER_NO_RECONNECT`` is set. + +* **Added NSH packet recognition in i40e.** + +* **Added AES-CTR support to AESNI MB PMD.** + + Now AESNI MB PMD supports 128/192/256-bit counter mode AES encryption and + decryption. + +* **Added support of AES counter mode for Intel QuickAssist devices.** + + Enabled support for the AES CTR algorithm for Intel QuickAssist devices. + Provided support for algorithm-chaining operations. + +* **Added KASUMI SW PMD.** + + A new Crypto PMD has been added, which provides KASUMI F8 (UEA1) ciphering + and KASUMI F9 (UIA1) hashing. + +* **Added multi-writer support for RTE Hash with Intel TSX.** + + The following features/modifications have been added to rte_hash library: + + * Enabled application developers to use an extra flag for rte_hash creation + to specify default behavior (multi-thread safe/unsafe) with rte_hash_add_key + function. + * Changed Cuckoo search algorithm to breadth first search for multi-writer + routine and split Cuckoo Search and Move operations in order to reduce + transactional code region and improve TSX performance. + * Added a hash multi-writer test case for test app. + +* **Improved IP Pipeline Application.** + + The following features have been added to ip_pipeline application: + + * Configure the MAC address in the routing pipeline and automatic routes + updates with change in link state. + * Enable RSS per network interface through the configuration file. + * Streamline the CLI code. + +* **Added keepalive enhancements.** + + Adds support for reporting of core states other than dead to + monitoring applications, enabling the support of broader liveness + reporting to external processes. + +* **Added packet capture framework.** + + * A new library ``librte_pdump`` is added to provide packet capture API. + * A new ``app/pdump`` tool is added to capture packets in DPDK. + + +* **Added floating VEB support for i40e PF driver.** + + A "floating VEB" is a special Virtual Ethernet Bridge (VEB) which does not + have an upload port, but instead is used for switching traffic between + virtual functions (VFs) on a port. + + For information on this feature, please see the "I40E Poll Mode Driver" + section of the "Network Interface Controller Drivers" document. + + +Resolved Issues +--------------- + +.. This section should contain bug fixes added to the relevant sections. Sample format: + + * **code/section Fixed issue in the past tense with a full stop.** + + Add a short 1-2 sentence description of the resolved issue in the past tense. + The title should contain the code/lib section like a commit message. + Add the entries in alphabetic order in the relevant sections below. + + +EAL +~~~ + + +Drivers +~~~~~~~ + +* **i40e: Fixed vlan stripping from inner header.** + + Previously, for tunnel packets, such as VXLAN/NVGRE, the vlan + tags of the inner header will be stripped without putting vlan + info to descriptor. + Now this issue is fixed by disabling vlan stripping from inner header. + +* **i40e: Fixed the type issue of a single VLAN type.** + + Currently, if a single VLAN header is added in a packet, it's treated + as inner VLAN. But generally, a single VLAN header is treated as the + outer VLAN header. + This issue is fixed by changing corresponding register for single VLAN. + + +Libraries +~~~~~~~~~ + +* **mbuf: Fixed refcnt update when detaching.** + + Fix the ``rte_pktmbuf_detach()`` function to decrement the direct + mbuf's reference counter. The previous behavior was not to affect + the reference counter. It lead a memory leak of the direct mbuf. + + +Examples +~~~~~~~~ + + +Other +~~~~~ + + +Known Issues +------------ + +.. This section should contain new known issues in this release. Sample format: + + * **Add title in present tense with full stop.** + + Add a short 1-2 sentence description of the known issue in the present + tense. Add information on any known workarounds. + + +API Changes +----------- + +.. This section should contain API changes. Sample format: + + * Add a short 1-2 sentence description of the API change. Use fixed width + quotes for ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + +* The following counters are removed from ``rte_eth_stats`` structure: + ibadcrc, ibadlen, imcasts, fdirmatch, fdirmiss, + tx_pause_xon, rx_pause_xon, tx_pause_xoff, rx_pause_xoff. + +* The extended statistics are fetched by ids with ``rte_eth_xstats_get`` + after a lookup by name ``rte_eth_xstats_get_names``. + +* The function ``rte_eth_dev_info_get`` fill the new fields ``nb_rx_queues`` + and ``nb_tx_queues`` in the structure ``rte_eth_dev_info``. + +* The vhost function ``rte_vring_available_entries`` is renamed to + ``rte_vhost_avail_entries``. + +* All existing vhost APIs and callbacks with ``virtio_net`` struct pointer + as the parameter have been changed due to the ABI refactoring mentioned + below: it's replaced by ``int vid``. + +* The function ``rte_vhost_enqueue_burst`` no longer supports concurrent enqueuing + packets to the same queue. + +* The function ``rte_eth_dev_set_mtu`` adds a new return value ``-EBUSY``, which + indicates the operation is forbidden because the port is running. + + +ABI Changes +----------- + +.. * Add a short 1-2 sentence description of the ABI change that was announced in + the previous releases and made in this release. Use fixed width quotes for + ``rte_function_names`` or ``rte_struct_names``. Use the past tense. + +* The ``rte_port_source_params`` structure has new fields to support PCAP file. + It was already in release 16.04 with ``RTE_NEXT_ABI`` flag. + +* The ``rte_eth_dev_info`` structure has new fields ``nb_rx_queues`` and ``nb_tx_queues`` + to support number of queues configured by software. + +* vhost ABI refactoring has been made: ``virtio_net`` structure is never + exported to application any more. Instead, a handle, ``vid``, has been + used to represent this structure internally. + + +Shared Library Versions +----------------------- + +.. Update any library version updated in this release and prepend with a ``+`` sign. + +The libraries prepended with a plus sign were incremented in this version. + +.. code-block:: diff + + + libethdev.so.4 + librte_acl.so.2 + librte_cfgfile.so.2 + librte_cmdline.so.2 + librte_distributor.so.1 + librte_eal.so.2 + librte_hash.so.2 + librte_ip_frag.so.1 + librte_ivshmem.so.1 + librte_jobstats.so.1 + librte_kni.so.2 + librte_kvargs.so.1 + librte_lpm.so.2 + librte_mbuf.so.2 + + librte_mempool.so.2 + librte_meter.so.1 + librte_pipeline.so.3 + librte_pmd_bond.so.1 + librte_pmd_ring.so.2 + + librte_port.so.3 + librte_power.so.1 + librte_reorder.so.1 + librte_ring.so.1 + librte_sched.so.1 + librte_table.so.2 + librte_timer.so.1 + + librte_vhost.so.3 + + +Tested Platforms +---------------- + +.. This section should contain a list of platforms that were tested with this + release. + + The format is: + + #. Platform name. + + - Platform details. + - Platform details. + + +Tested NICs +----------- + +.. This section should contain a list of NICs that were tested with this release. + + The format is: + + #. NIC name. + + - NIC details. + - NIC details. diff --git a/doc/guides/sample_app_ug/img/ipsec_endpoints.svg b/doc/guides/sample_app_ug/img/ipsec_endpoints.svg new file mode 100644 index 00000000..e4aba4cc --- /dev/null +++ b/doc/guides/sample_app_ug/img/ipsec_endpoints.svg @@ -0,0 +1,850 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + + + ep0 + ep1 + traffic gen + traffic gen + + + + + + 2 + 3 + 2 + 3 + 0 + 1 + + 0 + 1 + + UNPROTECTEDcipher-text + PROTECTEDclear-text + + + outbound + inbound + + + + + + + + diff --git a/doc/guides/sample_app_ug/index.rst b/doc/guides/sample_app_ug/index.rst index 930f68c0..96bb3179 100644 --- a/doc/guides/sample_app_ug/index.rst +++ b/doc/guides/sample_app_ug/index.rst @@ -76,6 +76,7 @@ Sample Applications User Guide ptpclient performance_thread ipsec_secgw + pdump **Figures** diff --git a/doc/guides/sample_app_ug/ip_pipeline.rst b/doc/guides/sample_app_ug/ip_pipeline.rst index 899fd4a0..09cbc174 100644 --- a/doc/guides/sample_app_ug/ip_pipeline.rst +++ b/doc/guides/sample_app_ug/ip_pipeline.rst @@ -1,5 +1,5 @@ .. BSD LICENSE - Copyright(c) 2015 Intel Corporation. All rights reserved. + Copyright(c) 2015-2016 Intel Corporation. All rights reserved. All rights reserved. Redistribution and use in source and binary forms, with or without @@ -351,33 +351,35 @@ Application resources present in the configuration file .. table:: Application resource names in the configuration file - +--------------------------+-----------------------------+-------------------------------------------------+ - | Resource type | Format | Examples | - +==========================+=============================+=================================================+ - | Pipeline | ``PIPELINE`` | ``PIPELINE0``, ``PIPELINE1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Mempool | ``MEMPOOL`` | ``MEMPOOL0``, ``MEMPOOL1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Link (network interface) | ``LINK`` | ``LINK0``, ``LINK1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Link RX queue | ``RXQ.`` | ``RXQ0.0``, ``RXQ1.5`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Link TX queue | ``TXQ.`` | ``TXQ0.0``, ``TXQ1.5`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Software queue | ``SWQ`` | ``SWQ0``, ``SWQ1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Traffic Manager | ``TM`` | ``TM0``, ``TM1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Source | ``SOURCE`` | ``SOURCE0``, ``SOURCE1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Sink | ``SINK`` | ``SINK0``, ``SINK1`` | - +--------------------------+-----------------------------+-------------------------------------------------+ - | Message queue | ``MSGQ`` | ``MSGQ0``, ``MSGQ1``, | - | | ``MSGQ-REQ-PIPELINE`` | ``MSGQ-REQ-PIPELINE2``, ``MSGQ-RSP-PIPELINE2,`` | - | | ``MSGQ-RSP-PIPELINE`` | ``MSGQ-REQ-CORE-s0c1``, ``MSGQ-RSP-CORE-s0c1`` | - | | ``MSGQ-REQ-CORE-`` | | - | | ``MSGQ-RSP-CORE-`` | | - +--------------------------+-----------------------------+-------------------------------------------------+ + +----------------------------+-----------------------------+-------------------------------------------------+ + | Resource type | Format | Examples | + +============================+=============================+=================================================+ + | Pipeline | ``PIPELINE`` | ``PIPELINE0``, ``PIPELINE1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Mempool | ``MEMPOOL`` | ``MEMPOOL0``, ``MEMPOOL1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Link (network interface) | ``LINK`` | ``LINK0``, ``LINK1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Link RX queue | ``RXQ.`` | ``RXQ0.0``, ``RXQ1.5`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Link TX queue | ``TXQ.`` | ``TXQ0.0``, ``TXQ1.5`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Software queue | ``SWQ`` | ``SWQ0``, ``SWQ1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Traffic Manager | ``TM`` | ``TM0``, ``TM1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | KNI (kernel NIC interface) | ``KNI`` | ``KNI0``, ``KNI1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Source | ``SOURCE`` | ``SOURCE0``, ``SOURCE1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Sink | ``SINK`` | ``SINK0``, ``SINK1`` | + +----------------------------+-----------------------------+-------------------------------------------------+ + | Message queue | ``MSGQ`` | ``MSGQ0``, ``MSGQ1``, | + | | ``MSGQ-REQ-PIPELINE`` | ``MSGQ-REQ-PIPELINE2``, ``MSGQ-RSP-PIPELINE2,`` | + | | ``MSGQ-RSP-PIPELINE`` | ``MSGQ-REQ-CORE-s0c1``, ``MSGQ-RSP-CORE-s0c1`` | + | | ``MSGQ-REQ-CORE-`` | | + | | ``MSGQ-RSP-CORE-`` | | + +----------------------------+-----------------------------+-------------------------------------------------+ ``LINK`` instances are created implicitly based on the ``PORT_MASK`` application startup argument. ``LINK0`` is the first port enabled in the ``PORT_MASK``, port 1 is the next one, etc. @@ -386,7 +388,7 @@ For example, if bit 5 is the first bit set in the bitmask, then ``LINK0`` is hav This mechanism creates a contiguous LINK ID space and isolates the configuration file against changes in the board PCIe slots where NICs are plugged in. -``RXQ``, ``TXQ`` and ``TM`` instances have the LINK ID as part of their name. +``RXQ``, ``TXQ``, ``TM`` and ``KNI`` instances have the LINK ID as part of their name. For example, ``RXQ2.1``, ``TXQ2.1`` and ``TM2`` are all associated with ``LINK2``. @@ -707,6 +709,58 @@ TM section +---------------+---------------------------------------------+----------+----------+---------------+ +KNI section +~~~~~~~~~~~ + +.. _table_ip_pipelines_kni_section: + +.. tabularcolumns:: |p{2.5cm}|p{7cm}|p{1.5cm}|p{1.5cm}|p{1.5cm}| + +.. table:: Configuration file KNI section + + +---------------+----------------------------------------------+----------+------------------+---------------+ + | Section | Description | Optional | Type | Default value | + +===============+==============================================+==========+==================+===============+ + | core | CPU core to run the KNI kernel thread. | YES | See "CPU Core | Not set | + | | When core config is set, the KNI kernel | | notation" | | + | | thread will be bound to the particular core. | | | | + | | When core config is not set, the KNI kernel | | | | + | | thread will be scheduled by the OS. | | | | + +---------------+----------------------------------------------+----------+------------------+---------------+ + | mempool | Mempool to use for buffer allocation for | YES | uint32_t | MEMPOOL0 | + | | current KNI port. The mempool ID has | | | | + | | to be associated with a valid instance | | | | + | | defined in the mempool entry of the global | | | | + | | section. | | | | + +---------------+----------------------------------------------+----------+------------------+---------------+ + | burst_read | Read burst size (number of packets) | YES | uint32_t | 32 | + | | | | power of 2 | | + | | | | 0 < burst < size | | + +---------------+----------------------------------------------+----------+------------------+---------------+ + | burst_write | Write burst size (number of packets) | YES | uint32_t | 32 | + | | | | power of 2 | | + | | | | 0 < burst < size | | + +---------------+----------------------------------------------+----------+------------------+---------------+ + | dropless | When dropless is set to NO, packets can be | YES | YES/NO | NO | + | | dropped if not enough free slots are | | | | + | | currently available in the queue, so the | | | | + | | write operation to the queue is non- | | | | + | | blocking. | | | | + | | When dropless is set to YES, packets cannot | | | | + | | be dropped if not enough free slots are | | | | + | | currently available in the queue, so the | | | | + | | write operation to the queue is blocking, as | | | | + | | the write operation is retried until enough | | | | + | | free slots become available and all the | | | | + | | packets are successfully written to the | | | | + | | queue. | | | | + +---------------+----------------------------------------------+----------+------------------+---------------+ + | n_retries | Number of retries. Valid only when dropless | YES | uint64_t | 0 | + | | is set to YES. When set to 0, it indicates | | | | + | | unlimited number of retries. | | | | + +---------------+----------------------------------------------+----------+------------------+---------------+ + + SOURCE section ~~~~~~~~~~~~~~ diff --git a/doc/guides/sample_app_ug/ipsec_secgw.rst b/doc/guides/sample_app_ug/ipsec_secgw.rst index c11c7e7e..fcb33c26 100644 --- a/doc/guides/sample_app_ug/ipsec_secgw.rst +++ b/doc/guides/sample_app_ug/ipsec_secgw.rst @@ -38,165 +38,171 @@ Overview -------- The application demonstrates the implementation of a Security Gateway -(not IPsec compliant, see Constraints bellow) using DPDK based on RFC4301, +(not IPsec compliant, see the Constraints section below) using DPDK based on RFC4301, RFC4303, RFC3602 and RFC2404. Internet Key Exchange (IKE) is not implemented, so only manual setting of Security Policies and Security Associations is supported. The Security Policies (SP) are implemented as ACL rules, the Security -Associations (SA) are stored in a table and the Routing is implemented +Associations (SA) are stored in a table and the routing is implemented using LPM. -The application classify the ports between Protected and Unprotected. -Thus, traffic received in an Unprotected or Protected port is consider +The application classifies the ports as *Protected* and *Unprotected*. +Thus, traffic received on an Unprotected or Protected port is consider Inbound or Outbound respectively. -Path for IPsec Inbound traffic: +The Path for IPsec Inbound traffic is: -* Read packets from the port +* Read packets from the port. * Classify packets between IPv4 and ESP. -* Inbound SA lookup for ESP packets based on their SPI -* Verification/Decryption -* Removal of ESP and outer IP header -* Inbound SP check using ACL of decrypted packets and any other IPv4 packet - we read. -* Routing -* Write packet to port - -Path for IPsec Outbound traffic: - -* Read packets from the port -* Outbound SP check using ACL of all IPv4 traffic -* Outbound SA lookup for packets that need IPsec protection -* Add ESP and outer IP header -* Encryption/Digest -* Routing -* Write packet to port +* Perform Inbound SA lookup for ESP packets based on their SPI. +* Perform Verification/Decryption. +* Remove ESP and outer IP header +* Inbound SP check using ACL of decrypted packets and any other IPv4 packets. +* Routing. +* Write packet to port. + +The Path for the IPsec Outbound traffic is: + +* Read packets from the port. +* Perform Outbound SP check using ACL of all IPv4 traffic. +* Perform Outbound SA lookup for packets that need IPsec protection. +* Add ESP and outer IP header. +* Perform Encryption/Digest. +* Routing. +* Write packet to port. + Constraints ----------- -* IPv4 traffic -* ESP tunnel mode -* EAS-CBC, HMAC-SHA1 and NULL -* Each SA must be handle by a unique lcore (1 RX queue per port) -* No chained mbufs + +* No IPv6 options headers. +* No AH mode. +* Currently only EAS-CBC, HMAC-SHA1 and NULL. +* Each SA must be handle by a unique lcore (*1 RX queue per port*). +* No chained mbufs. + Compiling the Application ------------------------- To compile the application: -#. Go to the sample application directory: - - .. code-block:: console +#. Go to the sample application directory:: export RTE_SDK=/path/to/rte_sdk cd ${RTE_SDK}/examples/ipsec-secgw -#. Set the target (a default target is used if not specified). For example: +#. Set the target (a default target is used if not specified). For example:: - .. code-block:: console export RTE_TARGET=x86_64-native-linuxapp-gcc See the *DPDK Getting Started Guide* for possible RTE_TARGET values. -#. Build the application: - - .. code-block:: console +#. Build the application:: make +#. [Optional] Build the application for debugging: + This option adds some extra flags, disables compiler optimizations and + is verbose:: + + make DEBUG=1 + + Running the Application ----------------------- -The application has a number of command line options: +The application has a number of command line options:: -.. code-block:: console - ./build/ipsec-secgw [EAL options] -- -p PORTMASK -P -u PORTMASK --config - (port,queue,lcore)[,(port,queue,lcore] --single-sa SAIDX --ep0|--ep1 + ./build/ipsec-secgw [EAL options] -- + -p PORTMASK -P -u PORTMASK + --config (port,queue,lcore)[,(port,queue,lcore] + --single-sa SAIDX + --ep0|--ep1 -where, +Where: -* -p PORTMASK: Hexadecimal bitmask of ports to configure +* ``-p PORTMASK``: Hexadecimal bitmask of ports to configure. -* -P: optional, sets all ports to promiscuous mode so that packets are +* ``-P``: *optional*. Sets all ports to promiscuous mode so that packets are accepted regardless of the packet's Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted (default is enabled). -* -u PORTMASK: hexadecimal bitmask of unprotected ports +* ``-u PORTMASK``: hexadecimal bitmask of unprotected ports -* --config (port,queue,lcore)[,(port,queue,lcore)]: determines which queues - from which ports are mapped to which cores +* ``--config (port,queue,lcore)[,(port,queue,lcore)]``: determines which queues + from which ports are mapped to which cores. -* --single-sa SAIDX: use a single SA for outbound traffic, bypassing the SP +* ``--single-sa SAIDX``: use a single SA for outbound traffic, bypassing the SP on both Inbound and Outbound. This option is meant for debugging/performance purposes. -* --ep0: configure the app as Endpoint 0. +* ``--ep0``: configure the app as Endpoint 0. -* --ep1: configure the app as Endpoint 1. +* ``--ep1``: configure the app as Endpoint 1. -Either one of --ep0 or --ep1 *must* be specified. -The main purpose of these options is two easily configure two systems -back-to-back that would forward traffic through an IPsec tunnel. +Either one of ``--ep0`` or ``--ep1`` **must** be specified. +The main purpose of these options is to easily configure two systems +back-to-back that would forward traffic through an IPsec tunnel (see +:ref:`figure_ipsec_endpoints`). The mapping of lcores to port/queues is similar to other l3fwd applications. -For example, given the following command line: +For example, given the following command line:: -.. code-block:: console - - ./build/ipsec-secgw -l 20,21 -n 4 --socket-mem 0,2048 - --vdev "cryptodev_null_pmd" -- -p 0xf -P -u 0x3 - --config="(0,0,20),(1,0,20),(2,0,21),(3,0,21)" --ep0 + ./build/ipsec-secgw -l 20,21 -n 4 --socket-mem 0,2048 \ + --vdev "cryptodev_null_pmd" -- -p 0xf -P -u 0x3 \ + --config="(0,0,20),(1,0,20),(2,0,21),(3,0,21)" --ep0 \ where each options means: -* The -l option enables cores 20 and 21 +* The ``-l`` option enables cores 20 and 21. -* The -n option sets memory 4 channels +* The ``-n`` option sets memory 4 channels. -* The --socket-mem to use 2GB on socket 1 +* The ``--socket-mem`` to use 2GB on socket 1. -* The --vdev "cryptodev_null_pmd" option creates virtual NULL cryptodev PMD +* The ``--vdev "cryptodev_null_pmd"`` option creates virtual NULL cryptodev PMD. -* The -p option enables ports (detected) 0, 1, 2 and 3 +* The ``-p`` option enables ports (detected) 0, 1, 2 and 3. -* The -P option enables promiscuous mode +* The ``-P`` option enables promiscuous mode. -* The -u option sets ports 1 and 2 as unprotected, leaving 2 and 3 as protected +* The ``-u`` option sets ports 1 and 2 as unprotected, leaving 2 and 3 as protected. -* The --config option enables one queue per port with the following mapping: +* The ``--config`` option enables one queue per port with the following mapping: -+----------+-----------+-----------+---------------------------------------+ -| **Port** | **Queue** | **lcore** | **Description** | -| | | | | -+----------+-----------+-----------+---------------------------------------+ -| 0 | 0 | 20 | Map queue 0 from port 0 to lcore 20. | -| | | | | -+----------+-----------+-----------+---------------------------------------+ -| 1 | 0 | 20 | Map queue 0 from port 1 to lcore 20. | -| | | | | -+----------+-----------+-----------+---------------------------------------+ -| 2 | 0 | 21 | Map queue 0 from port 2 to lcore 21. | -| | | | | -+----------+-----------+-----------+---------------------------------------+ -| 3 | 0 | 21 | Map queue 0 from port 3 to lcore 21. | -| | | | | -+----------+-----------+-----------+---------------------------------------+ + +----------+-----------+-----------+---------------------------------------+ + | **Port** | **Queue** | **lcore** | **Description** | + | | | | | + +----------+-----------+-----------+---------------------------------------+ + | 0 | 0 | 20 | Map queue 0 from port 0 to lcore 20. | + | | | | | + +----------+-----------+-----------+---------------------------------------+ + | 1 | 0 | 20 | Map queue 0 from port 1 to lcore 20. | + | | | | | + +----------+-----------+-----------+---------------------------------------+ + | 2 | 0 | 21 | Map queue 0 from port 2 to lcore 21. | + | | | | | + +----------+-----------+-----------+---------------------------------------+ + | 3 | 0 | 21 | Map queue 0 from port 3 to lcore 21. | + | | | | | + +----------+-----------+-----------+---------------------------------------+ -* The --ep0 options configures the app with a given set of SP, SA and Routing +* The ``--ep0`` options configures the app with a given set of SP, SA and Routing entries as explained below in more detail. Refer to the *DPDK Getting Started Guide* for general information on running applications and the Environment Abstraction Layer (EAL) options. The application would do a best effort to "map" crypto devices to cores, with -hardware devices having priority. +hardware devices having priority. Basically, hardware devices if present would +be assigned to a core before software ones. This means that if the application is using a single core and both hardware and software crypto devices are detected, hardware devices will be used. @@ -208,18 +214,35 @@ For example, something like the following command line: .. code-block:: console - ./build/ipsec-secgw -l 20,21 -n 4 --socket-mem 0,2048 - -w 81:00.0 -w 81:00.1 -w 81:00.2 -w 81:00.3 - --vdev "cryptodev_aesni_mb_pmd" --vdev "cryptodev_null_pmd" -- - -p 0xf -P -u 0x3 --config="(0,0,20),(1,0,20),(2,0,21),(3,0,21)" + ./build/ipsec-secgw -l 20,21 -n 4 --socket-mem 0,2048 \ + -w 81:00.0 -w 81:00.1 -w 81:00.2 -w 81:00.3 \ + --vdev "cryptodev_aesni_mb_pmd" --vdev "cryptodev_null_pmd" \ + -- \ + -p 0xf -P -u 0x3 --config="(0,0,20),(1,0,20),(2,0,21),(3,0,21)" \ --ep0 + Configurations -------------- The following sections provide some details on the default values used to initialize the SP, SA and Routing tables. -Currently all the configuration is hard coded into the application. +Currently all configuration information is hard coded into the application. + +The following image illustrate a few of the concepts regarding IPSec, such +as protected/unprotected and inbound/outbound traffic, from the point of +view of two back-to-back endpoints: + +.. _figure_ipsec_endpoints: + +.. figure:: img/ipsec_endpoints.* + + IPSec Inbound/Outbound traffic + +Note that the above image only displays unidirectional traffic per port +for illustration purposes. +The application supports bidirectional traffic on all ports, + Security Policy Initialization ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -228,107 +251,126 @@ As mention in the overview, the Security Policies are ACL rules. The application defines two ACLs, one each of Inbound and Outbound, and it replicates them per socket in use. -Following are the default rules: - -Endpoint 0 Outbound Security Policies: - -+---------+------------------+-----------+------------+ -| **Src** | **Dst** | **proto** | **SA idx** | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.105.0/24 | Any | 5 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.106.0/24 | Any | 6 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.107.0/24 | Any | 7 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.108.0/24 | Any | 8 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.200.0/24 | Any | 9 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.250.0/24 | Any | BYPASS | -| | | | | -+---------+------------------+-----------+------------+ - -Endpoint 0 Inbound Security Policies: - -+---------+------------------+-----------+------------+ -| **Src** | **Dst** | **proto** | **SA idx** | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.115.0/24 | Any | 5 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.116.0/24 | Any | 6 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.117.0/24 | Any | 7 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.118.0/24 | Any | 8 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.210.0/24 | Any | 9 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.240.0/24 | Any | BYPASS | -| | | | | -+---------+------------------+-----------+------------+ - -Endpoint 1 Outbound Security Policies: - -+---------+------------------+-----------+------------+ -| **Src** | **Dst** | **proto** | **SA idx** | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.115.0/24 | Any | 5 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.116.0/24 | Any | 6 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.117.0/24 | Any | 7 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.118.0/24 | Any | 8 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.210.0/24 | Any | 9 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.240.0/24 | Any | BYPASS | -| | | | | -+---------+------------------+-----------+------------+ - -Endpoint 1 Inbound Security Policies: - -+---------+------------------+-----------+------------+ -| **Src** | **Dst** | **proto** | **SA idx** | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.105.0/24 | Any | 5 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.106.0/24 | Any | 6 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.107.0/24 | Any | 7 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.108.0/24 | Any | 8 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.200.0/24 | Any | 9 | -| | | | | -+---------+------------------+-----------+------------+ -| Any | 192.168.250.0/24 | Any | BYPASS | -| | | | | -+---------+------------------+-----------+------------+ +Following are the default rules which show only the relevant information, +assuming ANY value is valid for the fields not mentioned (src ip, proto, +src/dst ports). + +.. _table_ipsec_endpoint_outbound_sp: + +.. table:: Endpoint 0 Outbound Security Policies + + +-----------------------------------+------------+ + | **Dst** | **SA idx** | + | | | + +-----------------------------------+------------+ + | 192.168.105.0/24 | 5 | + | | | + +-----------------------------------+------------+ + | 192.168.106.0/24 | 6 | + | | | + +-----------------------------------+------------+ + | 192.168.175.0/24 | 10 | + | | | + +-----------------------------------+------------+ + | 192.168.176.0/24 | 11 | + | | | + +-----------------------------------+------------+ + | 192.168.200.0/24 | 15 | + | | | + +-----------------------------------+------------+ + | 192.168.201.0/24 | 16 | + | | | + +-----------------------------------+------------+ + | 192.168.55.0/24 | 25 | + | | | + +-----------------------------------+------------+ + | 192.168.56.0/24 | 26 | + | | | + +-----------------------------------+------------+ + | 192.168.240.0/24 | BYPASS | + | | | + +-----------------------------------+------------+ + | 192.168.241.0/24 | BYPASS | + | | | + +-----------------------------------+------------+ + | 0:0:0:0:5555:5555:0:0/96 | 5 | + | | | + +-----------------------------------+------------+ + | 0:0:0:0:6666:6666:0:0/96 | 6 | + | | | + +-----------------------------------+------------+ + | 0:0:1111:1111:0:0:0:0/96 | 10 | + | | | + +-----------------------------------+------------+ + | 0:0:1111:1111:1111:1111:0:0/96 | 11 | + | | | + +-----------------------------------+------------+ + | 0:0:0:0:aaaa:aaaa:0:0/96 | 25 | + | | | + +-----------------------------------+------------+ + | 0:0:0:0:bbbb:bbbb:0:0/96 | 26 | + | | | + +-----------------------------------+------------+ + +.. _table_ipsec_endpoint_inbound_sp: + +.. table:: Endpoint 0 Inbound Security Policies + + +-----------------------------------+------------+ + | **Dst** | **SA idx** | + | | | + +-----------------------------------+------------+ + | 192.168.115.0/24 | 105 | + | | | + +-----------------------------------+------------+ + | 192.168.116.0/24 | 106 | + | | | + +-----------------------------------+------------+ + | 192.168.185.0/24 | 110 | + | | | + +-----------------------------------+------------+ + | 192.168.186.0/24 | 111 | + | | | + +-----------------------------------+------------+ + | 192.168.210.0/24 | 115 | + | | | + +-----------------------------------+------------+ + | 192.168.211.0/24 | 116 | + | | | + +-----------------------------------+------------+ + | 192.168.65.0/24 | 125 | + | | | + +-----------------------------------+------------+ + | 192.168.66.0/24 | 126 | + | | | + +-----------------------------------+------------+ + | 192.168.245.0/24 | BYPASS | + | | | + +-----------------------------------+------------+ + | 192.168.246.0/24 | BYPASS | + | | | + +-----------------------------------+------------+ + | ffff:0:0:0:5555:5555:0:0/96 | 105 | + | | | + +-----------------------------------+------------+ + | ffff:0:0:0:6666:6666:0:0/96 | 106 | + | | | + +-----------------------------------+------------+ + | ffff:0:1111:1111:0:0:0:0/96 | 110 | + | | | + +-----------------------------------+------------+ + | ffff:0:1111:1111:1111:1111:0:0/96 | 111 | + | | | + +-----------------------------------+------------+ + | ffff:0:0:0:aaaa:aaaa:0:0/96 | 125 | + | | | + +-----------------------------------+------------+ + | ffff:0:0:0:bbbb:bbbb:0:0/96 | 126 | + | | | + +-----------------------------------+------------+ + +For Endpoint 1, we use the same policies in reverse, meaning the Inbound SP +entries are set as Outbound and vice versa. Security Association Initialization @@ -336,7 +378,7 @@ Security Association Initialization The SAs are kept in a array table. -For Inbound, the SPI is used as index module the table size. +For Inbound, the SPI is used as index modulo the table size. This means that on a table for 100 SA, SPI 5 and 105 would use the same index and that is not currently supported. @@ -346,179 +388,327 @@ not the SPI in the Security Policy. All SAs configured with AES-CBC and HMAC-SHA1 share the same values for cipher block size and key, and authentication digest size and key. -Following are the default values: - -Endpoint 0 Outbound Security Associations: - -+---------+------------+-----------+----------------+------------------+ -| **SPI** | **Cipher** | **Auth** | **Tunnel src** | **Tunnel dst** | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 5 | AES-CBC | HMAC-SHA1 | 172.16.1.5 | 172.16.2.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 6 | AES-CBC | HMAC-SHA1 | 172.16.1.6 | 172.16.2.6 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 7 | AES-CBC | HMAC-SHA1 | 172.16.1.7 | 172.16.2.7 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 8 | AES-CBC | HMAC-SHA1 | 172.16.1.8 | 172.16.2.8 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 9 | NULL | NULL | 172.16.1.5 | 172.16.2.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ - -Endpoint 0 Inbound Security Associations: - -+---------+------------+-----------+----------------+------------------+ -| **SPI** | **Cipher** | **Auth** | **Tunnel src** | **Tunnel dst** | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 5 | AES-CBC | HMAC-SHA1 | 172.16.2.5 | 172.16.1.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 6 | AES-CBC | HMAC-SHA1 | 172.16.2.6 | 172.16.1.6 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 7 | AES-CBC | HMAC-SHA1 | 172.16.2.7 | 172.16.1.7 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 8 | AES-CBC | HMAC-SHA1 | 172.16.2.8 | 172.16.1.8 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 9 | NULL | NULL | 172.16.2.5 | 172.16.1.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ - -Endpoint 1 Outbound Security Associations: - -+---------+------------+-----------+----------------+------------------+ -| **SPI** | **Cipher** | **Auth** | **Tunnel src** | **Tunnel dst** | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 5 | AES-CBC | HMAC-SHA1 | 172.16.2.5 | 172.16.1.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 6 | AES-CBC | HMAC-SHA1 | 172.16.2.6 | 172.16.1.6 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 7 | AES-CBC | HMAC-SHA1 | 172.16.2.7 | 172.16.1.7 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 8 | AES-CBC | HMAC-SHA1 | 172.16.2.8 | 172.16.1.8 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 9 | NULL | NULL | 172.16.2.5 | 172.16.1.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ - -Endpoint 1 Inbound Security Associations: - -+---------+------------+-----------+----------------+------------------+ -| **SPI** | **Cipher** | **Auth** | **Tunnel src** | **Tunnel dst** | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 5 | AES-CBC | HMAC-SHA1 | 172.16.1.5 | 172.16.2.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 6 | AES-CBC | HMAC-SHA1 | 172.16.1.6 | 172.16.2.6 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 7 | AES-CBC | HMAC-SHA1 | 172.16.1.7 | 172.16.2.7 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 8 | AES-CBC | HMAC-SHA1 | 172.16.1.8 | 172.16.2.8 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ -| 9 | NULL | NULL | 172.16.1.5 | 172.16.2.5 | -| | | | | | -+---------+------------+-----------+----------------+------------------+ +The following are the default values: + +.. _table_ipsec_endpoint_outbound_sa: + +.. table:: Endpoint 0 Outbound Security Associations + + +---------+----------+------------+-----------+----------------+----------------+ + | **SPI** | **Mode** | **Cipher** | **Auth** | **Tunnel src** | **Tunnel dst** | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 5 | Tunnel | AES-CBC | HMAC-SHA1 | 172.16.1.5 | 172.16.2.5 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 6 | Tunnel | AES-CBC | HMAC-SHA1 | 172.16.1.6 | 172.16.2.6 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 10 | Trans | AES-CBC | HMAC-SHA1 | N/A | N/A | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 11 | Trans | AES-CBC | HMAC-SHA1 | N/A | N/A | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 15 | Tunnel | NULL | NULL | 172.16.1.5 | 172.16.2.5 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 16 | Tunnel | NULL | NULL | 172.16.1.6 | 172.16.2.6 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 25 | Tunnel | AES-CBC | HMAC-SHA1 | 1111:1111: | 2222:2222: | + | | | | | 1111:1111: | 2222:2222: | + | | | | | 1111:1111: | 2222:2222: | + | | | | | 1111:5555 | 2222:5555 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 26 | Tunnel | AES-CBC | HMAC-SHA1 | 1111:1111: | 2222:2222: | + | | | | | 1111:1111: | 2222:2222: | + | | | | | 1111:1111: | 2222:2222: | + | | | | | 1111:6666 | 2222:6666 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + +.. _table_ipsec_endpoint_inbound_sa: + +.. table:: Endpoint 0 Inbound Security Associations + + +---------+----------+------------+-----------+----------------+----------------+ + | **SPI** | **Mode** | **Cipher** | **Auth** | **Tunnel src** | **Tunnel dst** | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 105 | Tunnel | AES-CBC | HMAC-SHA1 | 172.16.2.5 | 172.16.1.5 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 106 | Tunnel | AES-CBC | HMAC-SHA1 | 172.16.2.6 | 172.16.1.6 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 110 | Trans | AES-CBC | HMAC-SHA1 | N/A | N/A | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 111 | Trans | AES-CBC | HMAC-SHA1 | N/A | N/A | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 115 | Tunnel | NULL | NULL | 172.16.2.5 | 172.16.1.5 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 116 | Tunnel | NULL | NULL | 172.16.2.6 | 172.16.1.6 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 125 | Tunnel | AES-CBC | HMAC-SHA1 | 2222:2222: | 1111:1111: | + | | | | | 2222:2222: | 1111:1111: | + | | | | | 2222:2222: | 1111:1111: | + | | | | | 2222:5555 | 1111:5555 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + | 126 | Tunnel | AES-CBC | HMAC-SHA1 | 2222:2222: | 1111:1111: | + | | | | | 2222:2222: | 1111:1111: | + | | | | | 2222:2222: | 1111:1111: | + | | | | | 2222:6666 | 1111:6666 | + | | | | | | | + +---------+----------+------------+-----------+----------------+----------------+ + +For Endpoint 1, we use the same policies in reverse, meaning the Inbound SP +entries are set as Outbound and vice versa. + Routing Initialization ~~~~~~~~~~~~~~~~~~~~~~ -The Routing is implemented using LPM table. +The Routing is implemented using an LPM table. Following default values: -Endpoint 0 Routing Table: - -+------------------+----------+ -| **Dst addr** | **Port** | -| | | -+------------------+----------+ -| 172.16.2.5/32 | 0 | -| | | -+------------------+----------+ -| 172.16.2.6/32 | 0 | -| | | -+------------------+----------+ -| 172.16.2.7/32 | 1 | -| | | -+------------------+----------+ -| 172.16.2.8/32 | 1 | -| | | -+------------------+----------+ -| 192.168.115.0/24 | 2 | -| | | -+------------------+----------+ -| 192.168.116.0/24 | 2 | -| | | -+------------------+----------+ -| 192.168.117.0/24 | 3 | -| | | -+------------------+----------+ -| 192.168.118.0/24 | 3 | -| | | -+------------------+----------+ -| 192.168.210.0/24 | 2 | -| | | -+------------------+----------+ -| 192.168.240.0/24 | 2 | -| | | -+------------------+----------+ -| 192.168.250.0/24 | 0 | -| | | -+------------------+----------+ - -Endpoint 1 Routing Table: - -+------------------+----------+ -| **Dst addr** | **Port** | -| | | -+------------------+----------+ -| 172.16.1.5/32 | 2 | -| | | -+------------------+----------+ -| 172.16.1.6/32 | 2 | -| | | -+------------------+----------+ -| 172.16.1.7/32 | 3 | -| | | -+------------------+----------+ -| 172.16.1.8/32 | 3 | -| | | -+------------------+----------+ -| 192.168.105.0/24 | 0 | -| | | -+------------------+----------+ -| 192.168.106.0/24 | 0 | -| | | -+------------------+----------+ -| 192.168.107.0/24 | 1 | -| | | -+------------------+----------+ -| 192.168.108.0/24 | 1 | -| | | -+------------------+----------+ -| 192.168.200.0/24 | 0 | -| | | -+------------------+----------+ -| 192.168.240.0/24 | 2 | -| | | -+------------------+----------+ -| 192.168.250.0/24 | 0 | -| | | -+------------------+----------+ +.. _table_ipsec_endpoint_outbound_routing: + +.. table:: Endpoint 0 Routing Table + + +------------------+----------+ + | **Dst addr** | **Port** | + | | | + +------------------+----------+ + | 172.16.2.5/32 | 0 | + | | | + +------------------+----------+ + | 172.16.2.6/32 | 1 | + | | | + +------------------+----------+ + | 192.168.175.0/24 | 0 | + | | | + +------------------+----------+ + | 192.168.176.0/24 | 1 | + | | | + +------------------+----------+ + | 192.168.240.0/24 | 0 | + | | | + +------------------+----------+ + | 192.168.241.0/24 | 1 | + | | | + +------------------+----------+ + | 192.168.115.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.116.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.65.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.66.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.185.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.186.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.210.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.211.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.245.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.246.0/24 | 3 | + | | | + +------------------+----------+ + | 2222:2222: | 0 | + | 2222:2222: | | + | 2222:2222: | | + | 2222:5555/116 | | + | | | + +------------------+----------+ + | 2222:2222: | 1 | + | 2222:2222: | | + | 2222:2222: | | + | 2222:6666/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 0 | + | 1111:1111: | | + | 0000:0000: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 1 | + | 1111:1111: | | + | 1111:1111: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 2 | + | 0000:0000: | | + | aaaa:aaaa: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 3 | + | 0000:0000: | | + | bbbb:bbbb: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 2 | + | 0000:0000: | | + | 5555:5555: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 3 | + | 0000:0000: | | + | 6666:6666: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 2 | + | 1111:1111: | | + | 0000:0000: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 3 | + | 1111:1111: | | + | 1111:1111: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + +.. _table_ipsec_endpoint_inbound_routing: + +.. table:: Endpoint 1 Routing Table + + +------------------+----------+ + | **Dst addr** | **Port** | + | | | + +------------------+----------+ + | 172.16.1.5/32 | 0 | + | | | + +------------------+----------+ + | 172.16.1.6/32 | 1 | + | | | + +------------------+----------+ + | 192.168.185.0/24 | 0 | + | | | + +------------------+----------+ + | 192.168.186.0/24 | 1 | + | | | + +------------------+----------+ + | 192.168.245.0/24 | 0 | + | | | + +------------------+----------+ + | 192.168.246.0/24 | 1 | + | | | + +------------------+----------+ + | 192.168.105.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.106.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.55.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.56.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.175.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.176.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.200.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.201.0/24 | 3 | + | | | + +------------------+----------+ + | 192.168.240.0/24 | 2 | + | | | + +------------------+----------+ + | 192.168.241.0/24 | 3 | + | | | + +------------------+----------+ + | 1111:1111: | 0 | + | 1111:1111: | | + | 1111:1111: | | + | 1111:5555/116 | | + | | | + +------------------+----------+ + | 1111:1111: | 1 | + | 1111:1111: | | + | 1111:1111: | | + | 1111:6666/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 0 | + | 1111:1111: | | + | 0000:0000: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + | ffff:0000: | 1 | + | 1111:1111: | | + | 1111:1111: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 2 | + | 0000:0000: | | + | aaaa:aaaa: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 3 | + | 0000:0000: | | + | bbbb:bbbb: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 2 | + | 0000:0000: | | + | 5555:5555: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 3 | + | 0000:0000: | | + | 6666:6666: | | + | 0000:0/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 2 | + | 1111:1111: | | + | 0000:0000: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ + | 0000:0000: | 3 | + | 1111:1111: | | + | 1111:1111: | | + | 0000:0000/116 | | + | | | + +------------------+----------+ diff --git a/doc/guides/sample_app_ug/ipv4_multicast.rst b/doc/guides/sample_app_ug/ipv4_multicast.rst index 67ea944e..72da8c42 100644 --- a/doc/guides/sample_app_ug/ipv4_multicast.rst +++ b/doc/guides/sample_app_ug/ipv4_multicast.rst @@ -193,7 +193,7 @@ Firstly, the Ethernet* header is removed from the packet and the IPv4 address is /* Remove the Ethernet header from the input packet */ iphdr = (struct ipv4_hdr *)rte_pktmbuf_adj(m, sizeof(struct ether_hdr)); - RTE_MBUF_ASSERT(iphdr != NULL); + RTE_ASSERT(iphdr != NULL); dest_addr = rte_be_to_cpu_32(iphdr->dst_addr); Then, the packet is checked to see if it has a multicast destination address and @@ -271,7 +271,7 @@ The actual packet transmission is done in the mcast_send_pkt() function: ethdr = (struct ether_hdr *)rte_pktmbuf_prepend(pkt, (uint16_t) sizeof(*ethdr)); - RTE_MBUF_ASSERT(ethdr != NULL); + RTE_ASSERT(ethdr != NULL); ether_addr_copy(dest_addr, ðdr->d_addr); ether_addr_copy(&ports_eth_addr[port], ðdr->s_addr); diff --git a/doc/guides/sample_app_ug/l2_forward_crypto.rst b/doc/guides/sample_app_ug/l2_forward_crypto.rst index 7cce51be..723376c5 100644 --- a/doc/guides/sample_app_ug/l2_forward_crypto.rst +++ b/doc/guides/sample_app_ug/l2_forward_crypto.rst @@ -167,11 +167,11 @@ To run the application in linuxapp environment with 2 lcores, 2 ports and 2 cryp .. code-block:: console - $ ./build/l2fwd -c 0x3 -n 4 --vdev "cryptodev_aesni_mb_pmd" \ + $ ./build/l2fwd-crypto -c 0x3 -n 4 --vdev "cryptodev_aesni_mb_pmd" \ --vdev "cryptodev_aesni_mb_pmd" -- -p 0x3 --chain CIPHER_HASH \ --cipher_op ENCRYPT --cipher_algo AES_CBC \ --cipher_key 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f \ - --auth_op GENERATE --auth_algo SHA1_HMAC \ + --auth_op GENERATE --auth_algo AES_XCBC_MAC \ --auth_key 10:11:12:13:14:15:16:17:18:19:1a:1b:1c:1d:1e:1f Refer to the *DPDK Getting Started Guide* for general information on running applications diff --git a/doc/guides/sample_app_ug/l2_forward_job_stats.rst b/doc/guides/sample_app_ug/l2_forward_job_stats.rst index 03d99779..2444e36e 100644 --- a/doc/guides/sample_app_ug/l2_forward_job_stats.rst +++ b/doc/guides/sample_app_ug/l2_forward_job_stats.rst @@ -238,9 +238,6 @@ in the *DPDK Programmer's Guide* and the *DPDK API Reference*. if (nb_ports == 0) rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n"); - if (nb_ports > RTE_MAX_ETHPORTS) - nb_ports = RTE_MAX_ETHPORTS; - /* reset l2fwd_dst_ports */ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) diff --git a/doc/guides/sample_app_ug/l2_forward_real_virtual.rst b/doc/guides/sample_app_ug/l2_forward_real_virtual.rst index e77d67c7..b51b2dc9 100644 --- a/doc/guides/sample_app_ug/l2_forward_real_virtual.rst +++ b/doc/guides/sample_app_ug/l2_forward_real_virtual.rst @@ -242,9 +242,6 @@ in the *DPDK Programmer's Guide* - Rel 1.4 EAR and the *DPDK API Reference*. if (nb_ports == 0) rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n"); - if (nb_ports > RTE_MAX_ETHPORTS) - nb_ports = RTE_MAX_ETHPORTS; - /* reset l2fwd_dst_ports */ for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst index 3e070d0b..491f99d9 100644 --- a/doc/guides/sample_app_ug/l3_forward.rst +++ b/doc/guides/sample_app_ug/l3_forward.rst @@ -324,7 +324,7 @@ The key code snippet of simple_ipv4_fwd_4pkts() is shown below: const void *key_array[4] = {&key[0], &key[1], &key[2],&key[3]}; - rte_hash_lookup_multi(qconf->ipv4_lookup_struct, &key_array[0], 4, ret); + rte_hash_lookup_bulk(qconf->ipv4_lookup_struct, &key_array[0], 4, ret); dst_port[0] = (ret[0] < 0)? portid:ipv4_l3fwd_out_if[ret[0]]; dst_port[1] = (ret[1] < 0)? portid:ipv4_l3fwd_out_if[ret[1]]; diff --git a/doc/guides/sample_app_ug/link_status_intr.rst b/doc/guides/sample_app_ug/link_status_intr.rst index a4dbb545..779df97d 100644 --- a/doc/guides/sample_app_ug/link_status_intr.rst +++ b/doc/guides/sample_app_ug/link_status_intr.rst @@ -145,9 +145,6 @@ To fully understand this code, it is recommended to study the chapters that rela if (nb_ports == 0) rte_exit(EXIT_FAILURE, "No Ethernet ports - bye\n"); - if (nb_ports > RTE_MAX_ETHPORTS) - nb_ports = RTE_MAX_ETHPORTS; - /* * Each logical core is assigned a dedicated TX queue on each port. */ diff --git a/doc/guides/sample_app_ug/pdump.rst b/doc/guides/sample_app_ug/pdump.rst new file mode 100644 index 00000000..96c8709e --- /dev/null +++ b/doc/guides/sample_app_ug/pdump.rst @@ -0,0 +1,122 @@ + +.. BSD LICENSE + Copyright(c) 2016 Intel Corporation. All rights reserved. + All rights reserved. + + Redistribution and use in source and binary forms, with or without + modification, are permitted provided that the following conditions + are met: + + * Redistributions of source code must retain the above copyright + notice, this list of conditions and the following disclaimer. + * Redistributions in binary form must reproduce the above copyright + notice, this list of conditions and the following disclaimer in + the documentation and/or other materials provided with the + distribution. + * Neither the name of Intel Corporation nor the names of its + contributors may be used to endorse or promote products derived + from this software without specific prior written permission. + + THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS + "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT + LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT + OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, + SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT + LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, + DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY + THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE + OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + +dpdk_pdump Application +====================== + +The ``dpdk_pdump`` application is a Data Plane Development Kit (DPDK) application that runs as a DPDK secondary process and +is capable of enabling packet capture on dpdk ports. + + +Running the Application +----------------------- + +The application has a ``--pdump`` command line option with various sub arguments: + +.. code-block:: console + + ./build/app/dpdk_pdump -- + --pdump '(port= | device_id=), + (queue=), + (rx-dev= | + tx-dev=), + [ring-size=], + [mbuf-size=], + [total-num-mbufs=]' + +Note: + +* Parameters inside the parentheses represents mandatory parameters. + +* Parameters inside the square brackets represents optional parameters. + +Multiple instances of ``--pdump`` can be passed to capture packets on different port and queue combinations. + + +Parameters +~~~~~~~~~~ + +``port``: +Port id of the eth device on which packets should be captured. + +``device_id``: +PCI address (or) name of the eth device on which packets should be captured. + + .. Note:: + + * As of now the ``dpdk_pdump`` tool cannot capture the packets of virtual devices + in the primary process due to a bug in the ethdev library. Due to this bug, in a multi process context, + when the primary and secondary have different ports set, then the secondary process + (here the ``dpdk_pdump`` tool) overwrites the ``rte_eth_devices[]`` entries of the primary process. + +``queue``: +Queue id of the eth device on which packets should be captured. The user can pass a queue value of ``*`` to enable +packet capture on all queues of the eth device. + +``rx-dev``: +Can be either a pcap file name or any Linux iface. + +``tx-dev``: +Can be either a pcap file name or any Linux iface. + + .. Note:: + + * To receive ingress packets only, ``rx-dev`` should be passed. + + * To receive egress packets only, ``tx-dev`` should be passed. + + * To receive ingress and egress packets separately ``rx-dev`` and ``tx-dev`` + should both be passed with the different file names or the Linux iface names. + + * To receive ingress and egress packets separately ``rx-dev`` and ``tx-dev`` + should both be passed with the same file names or the the Linux iface names. + +``ring-size``: +Size of the ring. This value is used internally for ring creation. The ring will be used to enqueue the packets from +the primary application to the secondary. This is an optional parameter with default size 16384. + +``mbuf-size``: +Size of the mbuf data. This is used internally for mempool creation. Ideally this value must be same as +the primary application's mempool's mbuf data size which is used for packet RX. This is an optional parameter with +default size 2176. + +``total-num-mbufs``: +Total number mbufs in mempool. This is used internally for mempool creation. This is an optional parameter with default +value 65535. + + +Example +------- + +.. code-block:: console + + $ sudo ./build/app/dpdk_pdump -- --pdump 'port=0,queue=*,rx-dev=/tmp/rx.pcap' diff --git a/doc/guides/sample_app_ug/vhost.rst b/doc/guides/sample_app_ug/vhost.rst index 47ce36ce..a93e54d8 100644 --- a/doc/guides/sample_app_ug/vhost.rst +++ b/doc/guides/sample_app_ug/vhost.rst @@ -491,39 +491,9 @@ The default value is 15. -- --rx-retry 1 --rx-retry-delay 20 **Zero copy.** -The zero copy option enables/disables the zero copy mode for RX/TX packet, -in the zero copy mode the packet buffer address from guest translate into host physical address -and then set directly as DMA address. -If the zero copy mode is disabled, then one copy mode is utilized in the sample. -This option is disabled by default. - -.. code-block:: console - - ./vhost-switch -c f -n 4 --socket-mem 1024 --huge-dir /mnt/huge \ - -- --zero-copy [0,1] - -**RX descriptor number.** -The RX descriptor number option specify the Ethernet RX descriptor number, -Linux legacy virtio-net has different behavior in how to use the vring descriptor from DPDK based virtio-net PMD, -the former likely allocate half for virtio header, another half for frame buffer, -while the latter allocate all for frame buffer, -this lead to different number for available frame buffer in vring, -and then lead to different Ethernet RX descriptor number could be used in zero copy mode. -So it is valid only in zero copy mode is enabled. The value is 32 by default. - -.. code-block:: console - - ./vhost-switch -c f -n 4 --socket-mem 1024 --huge-dir /mnt/huge \ - -- --zero-copy 1 --rx-desc-num [0, n] - -**TX descriptor number.** -The TX descriptor number option specify the Ethernet TX descriptor number, it is valid only in zero copy mode is enabled. -The value is 64 by default. - -.. code-block:: console - - ./vhost-switch -c f -n 4 --socket-mem 1024 --huge-dir /mnt/huge \ - -- --zero-copy 1 --tx-desc-num [0, n] +Zero copy mode is removed, due to it has not been working for a while. And +due to the large and complex code, it's better to redesign it than fixing +it to make it work again. Hence, zero copy may be added back later. **VLAN strip.** The VLAN strip option enable/disable the VLAN strip on host, if disabled, the guest will receive the packets with VLAN tag. @@ -863,3 +833,20 @@ For example: The above message indicates that device 0 has been registered with MAC address cc:bb:bb:bb:bb:bb and VLAN tag 1000. Any packets received on the NIC with these values is placed on the devices receive queue. When a virtio-net device transmits packets, the VLAN tag is added to the packet by the DPDK vhost sample code. + +Running virtio-user with vhost-switch +------------------------------------- + +We can also use virtio-user with vhost-switch now. +Virtio-user is a virtual device that can be run in a application (container) parallelly with vhost in the same OS, +aka, there is no need to start a VM. We just run it with a different --file-prefix to avoid startup failure. + +.. code-block:: console + + cd ${RTE_SDK}/x86_64-native-linuxapp-gcc/app + ./testpmd -c 0x3 -n 4 --socket-mem 1024 --no-pci --file-prefix=virtio-user-testpmd \ + --vdev=virtio-user0,mac=00:01:02:03:04:05,path=$path_vhost \ + -- -i --txqflags=0xf01 --disable-hw-vlan + +There is no difference on the vhost side. +Pleae note that there are some limitations (see release note for more information) in the usage of virtio-user. diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst index f6055645..7712bd24 100644 --- a/doc/guides/testpmd_app_ug/run_app.rst +++ b/doc/guides/testpmd_app_ug/run_app.rst @@ -289,6 +289,10 @@ The commandline options are: Enable hardware RX checksum offload. +* ``--enable-scatter`` + + Enable scatter (multi-segment) RX. + * ``--disable-hw-vlan`` Disable hardware VLAN. @@ -329,7 +333,6 @@ The commandline options are: io (the default) mac - mac_retry mac_swap flowgen rxonly diff --git a/doc/guides/testpmd_app_ug/testpmd_funcs.rst b/doc/guides/testpmd_app_ug/testpmd_funcs.rst index aed5e477..30e410dd 100644 --- a/doc/guides/testpmd_app_ug/testpmd_funcs.rst +++ b/doc/guides/testpmd_app_ug/testpmd_funcs.rst @@ -98,9 +98,11 @@ Start packet forwarding with current configuration:: start tx_first ~~~~~~~~~~~~~~ -Start packet forwarding with current configuration after sending one burst of packets:: +Start packet forwarding with current configuration after sending specified number of bursts of packets:: - testpmd> start tx_first + testpmd> start tx_first (""|burst_num) + +The default burst number is 1 when ``burst_num`` not presented. stop ~~~~ @@ -177,6 +179,10 @@ For example: ipv6-sctp ipv6-other l2_payload + port + vxlan + geneve + nvgre show port rss reta ~~~~~~~~~~~~~~~~~~ @@ -249,8 +255,10 @@ set fwd Set the packet forwarding mode:: - testpmd> set fwd (io|mac|mac_retry|macswap|flowgen| \ - rxonly|txonly|csum|icmpecho) + testpmd> set fwd (io|mac|macswap|flowgen| \ + rxonly|txonly|csum|icmpecho) (""|retry) + +``retry`` can be specified for forwarding engines except ``rx_only``. The available information categories are: @@ -260,8 +268,6 @@ The available information categories are: * ``mac``: Changes the source and the destination Ethernet addresses of packets before forwarding them. -* ``mac_retry``: Same as "mac" forwarding mode, but includes retries if the destination queue is full. - * ``macswap``: MAC swap forwarding mode. Swaps the source and the destination Ethernet addresses of packets before forwarding them. @@ -392,9 +398,9 @@ Set number of packets per burst:: This is equivalent to the ``--burst command-line`` option. -In ``mac_retry`` forwarding mode, the transmit delay time and number of retries can also be set:: +When retry is enabled, the transmit delay time and number of retries can also be set:: - testpmd> set burst tx delay (micrseconds) retry (num) + testpmd> set burst tx delay (microseconds) retry (num) set txpkts ~~~~~~~~~~ @@ -980,7 +986,9 @@ The following sections show functions for configuring ports. port attach ~~~~~~~~~~~ -Attach a port specified by pci address or virtual device args. +Attach a port specified by pci address or virtual device args:: + + testpmd> port attach (identifier) To attach a new pci device, the device should be recognized by kernel first. Then it should be moved under DPDK management. @@ -1014,8 +1022,6 @@ For example, to move a pci device using ixgbe under DPDK management: To attach a port created by virtual device, above steps are not needed. -port attach (identifier) - For example, to attach a port whose pci address is 0000:0a:00.0. .. code-block:: console @@ -1061,16 +1067,19 @@ the mode and slave parameters must be given. port detach ~~~~~~~~~~~ -Detach a specific port. - -Before detaching a port, the port should be closed:: +Detach a specific port:: testpmd> port detach (port_id) +Before detaching a port, the port should be stopped and closed. + For example, to detach a pci device port 0. .. code-block:: console + testpmd> port stop 0 + Stopping ports... + Done testpmd> port close 0 Closing ports... Done @@ -1088,6 +1097,9 @@ For example, to detach a virtual device port 0. .. code-block:: console + testpmd> port stop 0 + Stopping ports... + Done testpmd> port close 0 Closing ports... Done @@ -1187,6 +1199,26 @@ CRC stripping is off by default. The ``on`` option is equivalent to the ``--crc-strip`` command-line option. +port config - scatter +~~~~~~~~~~~~~~~~~~~~~~~ + +Set RX scatter mode on or off for all ports:: + + testpmd> port config all scatter (on|off) + +RX scatter mode is off by default. + +The ``on`` option is equivalent to the ``--enable-scatter`` command-line option. + +port config - TX queue flags +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Set a hexadecimal bitmap of TX queue flags for all ports:: + + testpmd> port config all txqflags value + +This command is equivalent to the ``--txqflags`` command-line option. + port config - RX Checksum ~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -1258,7 +1290,7 @@ port config - RSS Set the RSS (Receive Side Scaling) mode on or off:: - testpmd> port config all rss (all|ip|tcp|udp|sctp|ether|none) + testpmd> port config all rss (all|ip|tcp|udp|sctp|ether|port|vxlan|geneve|nvgre|none) RSS is on by default. @@ -1782,13 +1814,13 @@ Different NICs may have different capabilities, command show port fdir (port_id) For example, to add an ipv4-udp flow type filter:: - testpmd> flow_director_filter 0 add flow ipv4-udp src 2.2.2.3 32 \ + testpmd> flow_director_filter 0 mode IP add flow ipv4-udp src 2.2.2.3 32 \ dst 2.2.2.5 33 tos 2 ttl 40 vlan 0x1 flexbytes (0x88,0x48) \ fwd pf queue 1 fd_id 1 For example, add an ipv4-other flow type filter:: - testpmd> flow_director_filter 0 add flow ipv4-other src 2.2.2.3 \ + testpmd> flow_director_filter 0 mode IP add flow ipv4-other src 2.2.2.3 \ dst 2.2.2.5 tos 2 proto 20 ttl 40 vlan 0x1 \ flexbytes (0x88,0x48) fwd pf queue 1 fd_id 1 @@ -1821,7 +1853,7 @@ Set flow director's input masks:: Example, to set flow director mask on port 0:: - testpmd> flow_director_mask 0 vlan 0xefff \ + testpmd> flow_director_mask 0 mode IP vlan 0xefff \ src_mask 255.255.255.255 \ FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF 0xFFFF \ dst_mask 255.255.255.255 \ -- cgit 1.2.3-korg