author Luca Boccassi <luca.boccassi@gmail.com> 2018-11-01 11:59:50 +0000
committer Luca Boccassi <luca.boccassi@gmail.com> 2018-11-01 12:00:19 +0000
commit 8d01b9cd70a67cdafd5b965a70420c3bd7fb3f82
tree 208e3bc33c220854d89d010e3abf720a2e62e546 /doc/guides/prog_guide
parent b63264c8342e6a1b6971c79550d2af2024b6a4de
New upstream version 18.11-rc1 [upstream/18.11-rc1]
Change-Id: Iaa71986dd6332e878d8f4bf493101b2bbc6313bb
Signed-off-by: Luca Boccassi <luca.boccassi@gmail.com>
Diffstat (limited to 'doc/guides/prog_guide')
-rw-r--r-- doc/guides/prog_guide/env_abstraction_layer.rst | 45
-rw-r--r-- doc/guides/prog_guide/event_ethernet_tx_adapter.rst | 165
-rw-r--r-- doc/guides/prog_guide/hash_lib.rst | 33
-rw-r--r-- doc/guides/prog_guide/index.rst | 2
-rw-r--r-- doc/guides/prog_guide/kernel_nic_interface.rst | 239
-rw-r--r-- doc/guides/prog_guide/packet_framework.rst | 11
-rw-r--r-- doc/guides/prog_guide/port_hotplug_framework.rst | 106
-rw-r--r-- doc/guides/prog_guide/power_man.rst | 86
-rw-r--r-- doc/guides/prog_guide/profile_app.rst | 34
-rw-r--r-- doc/guides/prog_guide/rte_flow.rst | 285
-rw-r--r-- doc/guides/prog_guide/rte_security.rst | 107
-rw-r--r-- doc/guides/prog_guide/vhost_lib.rst | 8
12 files changed, 927 insertions, 194 deletions
diff --git a/doc/guides/prog_guide/env_abstraction_layer.rst b/doc/guides/prog_guide/env_abstraction_layer.rst
index d362c920..4f8612a2 100644
--- a/doc/guides/prog_guide/env_abstraction_layer.rst
+++ b/doc/guides/prog_guide/env_abstraction_layer.rst
@@ -213,6 +213,43 @@ Normally, these options do not need to be changed.
can later be mapped into that preallocated VA space (if dynamic memory mode
is enabled), and can optionally be mapped into it at startup.
+Support for Externally Allocated Memory
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+It is possible to use externally allocated memory in DPDK, using a set of malloc
+heap APIs. Support for externally allocated memory is implemented through
+overloading the socket ID - externally allocated heaps will have socket IDs
+that would be considered invalid under normal circumstances. Requesting an
+allocation to take place from a specified externally allocated memory is a
+matter of supplying the correct socket ID to the DPDK allocator, either directly
+(e.g. through a call to ``rte_malloc``) or indirectly (through data
+structure-specific allocation APIs such as ``rte_ring_create``).
+
+Since there is no way DPDK can verify whether a memory area is available or valid,
+this responsibility falls on the shoulders of the user. All multiprocess
+synchronization is also the user's responsibility, as well as ensuring that all
+calls to add/attach/detach/remove memory are done in the correct order. It is
+not required to attach to a memory area in all processes - only attach to memory
+areas as needed.
+
+The expected workflow is as follows:
+
+* Get a pointer to memory area
+* Create a named heap
+* Add memory area(s) to the heap
+ - If IOVA table is not specified, IOVA addresses will be assumed to be
+ unavailable, and DMA mappings will not be performed
+ - Other processes must attach to the memory area before they can use it
+* Get socket ID used for the heap
+* Use normal DPDK allocation procedures, using supplied socket ID
+* If memory area is no longer needed, it can be removed from the heap
+ - Other processes must detach from this memory area before it can be removed
+* If heap is no longer needed, remove it
+ - Socket ID will become invalid and will not be reused
+
+For more information, please refer to ``rte_malloc`` API documentation,
+specifically the ``rte_malloc_heap_*`` family of function calls.
+
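The socket ID overloading described in this section can be pictured with a small standalone model. This is only a sketch and not DPDK code: ``heap_create``, ``heap_get_socket``, ``heap_destroy``, ``MAX_NUMA_SOCKETS`` and ``MAX_EXT_HEAPS`` are hypothetical names invented for illustration, and the real interface is the ``rte_malloc_heap_*`` family. The model assumes that IDs 0 to 7 stand for real sockets, so an external heap receives an ID of 8 or higher, and that IDs come from a counter that only grows, so the ID of a removed heap is never handed out again.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NUMA_SOCKETS 8   /* assumption: IDs 0..7 model real sockets */
#define MAX_EXT_HEAPS    4   /* assumption: small fixed registry */
#define HEAP_NAME_LEN    32

/* One registry entry per named external heap (hypothetical layout). */
struct ext_heap {
    char name[HEAP_NAME_LEN];
    int in_use;
    int socket_id;
};

static struct ext_heap heaps[MAX_EXT_HEAPS];   /* zero-initialised */
static int next_socket_id = MAX_NUMA_SOCKETS;  /* only ever grows */

/* Return the live registry entry with this name, or NULL. */
static struct ext_heap *find_heap(const char *name)
{
    if (name == NULL)
        return NULL;
    for (size_t i = 0; i < MAX_EXT_HEAPS; i++)
        if (heaps[i].in_use && strcmp(heaps[i].name, name) == 0)
            return &heaps[i];
    return NULL;
}

/* Create a named heap; returns its socket ID, or -1 on failure. */
int heap_create(const char *name)
{
    if (name == NULL || name[0] == '\0' || strlen(name) >= HEAP_NAME_LEN)
        return -1;
    if (find_heap(name) != NULL)
        return -1;                       /* name already taken */
    for (size_t i = 0; i < MAX_EXT_HEAPS; i++) {
        if (!heaps[i].in_use) {
            strcpy(heaps[i].name, name);
            heaps[i].in_use = 1;
            heaps[i].socket_id = next_socket_id++;
            return heaps[i].socket_id;
        }
    }
    return -1;                           /* registry full */
}

/* Look up the socket ID to pass to the allocator; -1 if unknown. */
int heap_get_socket(const char *name)
{
    struct ext_heap *h = find_heap(name);

    return h != NULL ? h->socket_id : -1;
}

/* Remove a heap; the registry slot is recycled, the socket ID is not. */
int heap_destroy(const char *name)
{
    struct ext_heap *h = find_heap(name);

    if (h == NULL)
        return -1;
    h->in_use = 0;
    return 0;
}
```

In this model, a caller would create a heap, fetch its socket ID, and pass that ID wherever a socket ID is normally expected. After ``heap_destroy``, a lookup by name fails and a newly created heap gets a fresh ID.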
PCI Access
~~~~~~~~~~
@@ -321,6 +358,14 @@ Misc Functions
Locks and atomic operations are per-architecture (i686 and x86_64).
+IOVA Mode Configuration
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Auto detection of the IOVA mode, based on probing the bus and IOMMU configuration, may not report
+the desired addressing mode when virtual devices that are not directly attached to the bus are present.
+To force the IOVA mode to a specific value, the EAL command line option ``--iova-mode`` can
+be used to select either physical addressing ('pa') or virtual addressing ('va').
+
Memory Segments and Memory Zones (memzone)
------------------------------------------
diff --git a/doc/guides/prog_guide/event_ethernet_tx_adapter.rst b/doc/guides/prog_guide/event_ethernet_tx_adapter.rst
new file mode 100644
index 00000000..192f9e1c
--- /dev/null
+++ b/doc/guides/prog_guide/event_ethernet_tx_adapter.rst
@@ -0,0 +1,165 @@
+.. SPDX-License-Identifier: BSD-3-Clause
+ Copyright(c) 2017 Intel Corporation.
+
+Event Ethernet Tx Adapter Library
+=================================
+
+The DPDK Eventdev API allows the application to use an event driven programming
+model for packet processing in which the event device distributes events
+referencing packets to the application cores in a dynamic load balanced fashion
+while handling atomicity and packet ordering. Event adapters provide the interface
+between the ethernet, crypto and timer devices and the event device. Event adapter
+APIs enable common application code by abstracting PMD specific capabilities.
+The Event ethernet Tx adapter provides configuration and data path APIs for the
+transmit stage of the application allowing the same application code to use eventdev
+PMD support or in its absence, a common implementation.
+
+In the common implementation, the application enqueues mbufs to the adapter
+which runs as a rte_service function. The service function dequeues events
+from its event port and transmits the mbufs referenced by these events.
+
+
+API Walk-through
+----------------
+
+This section will introduce the reader to the adapter API. The
+application first instantiates an adapter, which is associated with
+a single eventdev. Next, the adapter instance is configured with Tx queues.
+Finally, the adapter is started and the application can start enqueuing mbufs
+to it.
+
+Creating an Adapter Instance
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+An adapter instance is created using ``rte_event_eth_tx_adapter_create()``. This
+function is passed the event device to be associated with the adapter and port
+configuration for the adapter to setup an event port if the adapter needs to use
+a service function.
+
+If the application desires to have finer control of eventdev port configuration,
+it can use the ``rte_event_eth_tx_adapter_create_ext()`` function. The
+``rte_event_eth_tx_adapter_create_ext()`` function is passed a callback function.
+The callback function is invoked if the adapter needs to use a service function
+and needs to create an event port for it. The callback is expected to fill the
+``struct rte_event_eth_tx_adapter_conf`` structure passed to it.
+
+.. code-block:: c
+
+ struct rte_event_dev_info dev_info;
+ struct rte_event_port_conf tx_p_conf = {0};
+
+ err = rte_event_dev_info_get(dev_id, &dev_info);
+
+ tx_p_conf.new_event_threshold = dev_info.max_num_events;
+ tx_p_conf.dequeue_depth = dev_info.max_event_port_dequeue_depth;
+ tx_p_conf.enqueue_depth = dev_info.max_event_port_enqueue_depth;
+
+ err = rte_event_eth_tx_adapter_create(id, dev_id, &tx_p_conf);
+
+Adding Tx Queues to the Adapter Instance
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Ethdev Tx queues are added to the instance using the
+``rte_event_eth_tx_adapter_queue_add()`` function. A queue value
+of -1 is used to indicate all queues within a device.
+
+.. code-block:: c
+
+ int err = rte_event_eth_tx_adapter_queue_add(id,
+ eth_dev_id,
+ q);
+
+Querying Adapter Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``rte_event_eth_tx_adapter_caps_get()`` function allows
+the application to query the adapter capabilities for an eventdev and ethdev
+combination. Currently, the only capability flag defined is
+``RTE_EVENT_ETH_TX_ADAPTER_CAP_INTERNAL_PORT``. The application can
+query this flag to determine if a service function is associated with the
+adapter and retrieve its service identifier using the
+``rte_event_eth_tx_adapter_service_id_get()`` API.
+
+
+.. code-block:: c
+
+ int err = rte_event_eth_tx_adapter_caps_get(dev_id, eth_dev_id, &cap);
+
+ if (!(cap & RTE_EVENT_ETH_TX_ADAPTER_CAP_INTERNAL_PORT))
+ err = rte_event_eth_tx_adapter_service_id_get(id, &service_id);
+
+Linking a Queue to the Adapter's Event Port
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the adapter uses a service function as described in the previous section, the
+application is required to link a queue to the adapter's event port. The adapter's
+event port can be obtained using the ``rte_event_eth_tx_adapter_event_port_get()``
+function. The queue can be configured with the ``RTE_EVENT_QUEUE_CFG_SINGLE_LINK``
+flag since it is linked to a single event port.
+
+Configuring the Service Function
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+If the adapter uses a service function, the application can assign
+a service core to the service function as shown below.
+
+.. code-block:: c
+
+ if (rte_event_eth_tx_adapter_service_id_get(id, &service_id) == 0)
+ rte_service_map_lcore_set(service_id, TX_CORE_ID);
+
+Starting the Adapter Instance
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The application calls ``rte_event_eth_tx_adapter_start()`` to start the adapter.
+This function calls the start callback of the eventdev PMD if supported,
+and calls ``rte_service_run_state_set()`` to enable the service function if one exists.
+
+Enqueuing Packets to the Adapter
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The application needs to notify the adapter about the transmit port and queue used
+to send the packet. The transmit port is set in the ``struct rte_mbuf::port`` field
+and the transmit queue is set using the ``rte_event_eth_tx_adapter_txq_set()``
+function.
+
+If the eventdev PMD supports the ``RTE_EVENT_ETH_TX_ADAPTER_CAP_INTERNAL_PORT``
+capability for a given ethernet device, the application should use the
+``rte_event_eth_tx_adapter_enqueue()`` function to enqueue packets to the adapter.
+
+If the adapter uses a service function for the ethernet device then the application
+should use the ``rte_event_enqueue_burst()`` function.
+
+.. code-block:: c
+
+ struct rte_event event;
+
+ if (cap & RTE_EVENT_ETH_TX_ADAPTER_CAP_INTERNAL_PORT) {
+
+ event.mbuf = m;
+
+ m->port = tx_port;
+ rte_event_eth_tx_adapter_txq_set(m, tx_queue_id);
+
+ rte_event_eth_tx_adapter_enqueue(dev_id, ev_port, &event, 1);
+ } else {
+
+ event.queue_id = qid; /* event queue linked to adapter port */
+ event.op = RTE_EVENT_OP_NEW;
+ event.event_type = RTE_EVENT_TYPE_CPU;
+ event.sched_type = RTE_SCHED_TYPE_ATOMIC;
+ event.mbuf = m;
+
+ m->port = tx_port;
+ rte_event_eth_tx_adapter_txq_set(m, tx_queue_id);
+
+ rte_event_enqueue_burst(dev_id, ev_port, &event, 1);
+ }
+
+Getting Adapter Statistics
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``rte_event_eth_tx_adapter_stats_get()`` function reports counters defined
+in struct ``rte_event_eth_tx_adapter_stats``. The counter values are the sum of
+the counts from the eventdev PMD callback if the callback is supported, and
+the counts maintained by the service function, if one exists.
diff --git a/doc/guides/prog_guide/hash_lib.rst b/doc/guides/prog_guide/hash_lib.rst
index 76a1f323..f5beec1d 100644
--- a/doc/guides/prog_guide/hash_lib.rst
+++ b/doc/guides/prog_guide/hash_lib.rst
@@ -1,5 +1,6 @@
.. SPDX-License-Identifier: BSD-3-Clause
Copyright(c) 2010-2015 Intel Corporation.
+ Copyright(c) 2018 Arm Limited.
.. _Hash_Library:
@@ -38,7 +39,7 @@ The main methods exported by the hash are:
* Lookup for entry with key: The key is provided as input. If an entry with the specified key is found in the hash (lookup hit),
then the position of the entry is returned, otherwise (lookup miss) a negative value is returned.
-Apart from these method explained above, the API allows the user three more options:
+Apart from the methods explained above, the API provides the user with a few more options:
* Add / lookup / delete with key and precomputed hash: Both the key and its precomputed hash are provided as input. This allows
the user to perform these operations faster, as hash is already computed.
@@ -48,6 +49,9 @@ Apart from these method explained above, the API allows the user three more opti
* Combination of the two options above: User can provide key, precomputed hash and data.
+* Ability to not free the position of the entry in the hash upon calling delete. This is useful for multi-threaded scenarios where
+ readers continue to use the position even after the entry is deleted.
+
Also, the API contains a method to allow the user to look up entries in bursts, achieving higher performance
than looking up individual entries, as the function prefetches next entries at the time it is operating
with the first ones, which reduces significantly the impact of the necessary memory accesses.
@@ -83,13 +87,20 @@ For concurrent writes, and concurrent reads and writes the following flag values
Key add, delete, and table reset are protected from other writer threads. With only this flag set, readers are not protected from ongoing writes.
* If the read/write concurrency (RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) is set, multithread read/write operation is safe
- (i.e., no need to stop the readers from accessing the hash table until writers finish their updates. Reads and writes can operate table concurrently).
+ (i.e., application does not need to stop the readers from accessing the hash table until writers finish their updates. Readers and writers can operate on the table concurrently).
+ The library uses a reader-writer lock to provide the concurrency.
* In addition to these two flag values, if the transactional memory flag (RTE_HASH_EXTRA_FLAGS_TRANS_MEM_SUPPORT) is also set,
- hardware transactional memory will be used to guarantee the thread safety as long as it is supported by the hardware (for example the Intel® TSX support).
+ the reader-writer lock will use hardware transactional memory to guarantee thread safety as long as it is supported by the hardware (for example the Intel® TSX support).
+ If the platform supports Intel® TSX, it is advised to set the transactional memory flag, as this will speed up concurrent table operations.
+ Otherwise concurrent operations will be slower because of the overhead associated with the software locking mechanisms.
+
+* If lock free read/write concurrency (RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF) is set, read/write concurrency is provided without using a reader-writer lock.
+ For platforms that do not support transactional memory (e.g. current Arm based platforms), it is advised to set this flag to achieve greater scalability in performance.
-If the platform supports Intel® TSX, it is advised to set the transactional memory flag, as this will speed up concurrent table operations.
-Otherwise concurrent operations will be slower because of the overhead associated with the software locking mechanisms.
+* If the 'do not free on delete' (RTE_HASH_EXTRA_FLAGS_NO_FREE_ON_DEL) flag is set, the position of the entry in the hash is not freed upon calling delete. This flag is enabled
+ by default when lock free read/write concurrency is set. The application should free the position after all the readers have stopped referencing the position.
+ Where required, the application can make use of RCU mechanisms to determine when the readers have stopped referencing the position.
Implementation Details
----------------------
@@ -148,6 +159,14 @@ key is considered not able to be stored.
With random keys, this method allows the user to get around 90% of the table utilization, without
having to drop any stored entry (LRU) or allocate more memory (extended buckets).
+
+Example of deletion:
+
+Similar to lookup, the key is searched in its primary and secondary buckets. If the key is found, the bucket
+entry is marked as an empty slot. If the hash table was configured with 'no free on delete' or 'lock free read/write concurrency',
+the position of the key is not freed. It is the responsibility of the user to free the position while making sure that
+readers are not referencing the position anymore.
+
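The two-step deletion described above (delete the key first, free its position later) can be modelled with a small standalone table. This is a sketch under stated assumptions and not the library's implementation: ``toy_table``, ``toy_add``, ``toy_lookup``, ``toy_del`` and ``toy_free_position`` are hypothetical names, the table is a flat array rather than a cuckoo hash, and the step where the application waits for readers to finish (for example with an RCU mechanism) is left to the caller.

```c
#include <stdint.h>

#define SLOTS 4  /* assumption: tiny fixed-size table for illustration */

enum slot_state {
    SLOT_FREE = 0,   /* position may be handed out by toy_add() */
    SLOT_IN_USE,     /* key is stored and visible to lookups */
    SLOT_DELETED     /* key is gone, position is still reserved */
};

struct toy_table {
    uint32_t key[SLOTS];
    enum slot_state state[SLOTS];
};

/* Return the position of a stored key, or -1 on a lookup miss. */
int toy_lookup(const struct toy_table *t, uint32_t key)
{
    for (int i = 0; i < SLOTS; i++)
        if (t->state[i] == SLOT_IN_USE && t->key[i] == key)
            return i;
    return -1;
}

/* Add a key and return its position, or -1 if no position is free. */
int toy_add(struct toy_table *t, uint32_t key)
{
    int pos = toy_lookup(t, key);

    if (pos >= 0)
        return pos;                      /* key already present */
    for (int i = 0; i < SLOTS; i++) {
        if (t->state[i] == SLOT_FREE) {  /* deleted slots are skipped */
            t->key[i] = key;
            t->state[i] = SLOT_IN_USE;
            return i;
        }
    }
    return -1;
}

/* Delete a key without freeing its position; returns the position. */
int toy_del(struct toy_table *t, uint32_t key)
{
    int pos = toy_lookup(t, key);

    if (pos >= 0)
        t->state[pos] = SLOT_DELETED;
    return pos;
}

/* Free a deleted position once no reader references it any more. */
int toy_free_position(struct toy_table *t, int pos)
{
    if (pos < 0 || pos >= SLOTS || t->state[pos] != SLOT_DELETED)
        return -1;
    t->state[pos] = SLOT_FREE;
    return 0;
}
```

The point of the model is that a position returned by ``toy_del`` cannot be reassigned to a new key until ``toy_free_position`` is called, so data that readers index by that position stays stable in the meantime.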
Entry distribution in hash table
--------------------------------
@@ -240,6 +259,10 @@ The flow table operations on the application side are described below:
* Delete flow: Delete the flow key from the hash. If the returned position is valid,
use it to access the flow entry in the flow table to invalidate the information associated with the flow.
+* Free flow: Free flow key position. If 'no free on delete' or 'lock-free read/write concurrency' flags are set,
+ wait till the readers are not referencing the position returned during add/delete flow and then free the position.
+ RCU mechanisms can be used to find out when the readers are not referencing the position anymore.
+
* Lookup flow: Lookup for the flow key in the hash.
If the returned position is valid (flow lookup hit), use the returned position to access the flow entry in the flow table.
Otherwise (flow lookup miss) there is no flow registered for the current packet.
diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
index 3b920e53..2086e244 100644
--- a/doc/guides/prog_guide/index.rst
+++ b/doc/guides/prog_guide/index.rst
@@ -44,6 +44,7 @@ Programmer's Guide
thread_safety_dpdk_functions
eventdev
event_ethernet_rx_adapter
+ event_ethernet_tx_adapter
event_timer_adapter
event_crypto_adapter
qos_framework
@@ -52,7 +53,6 @@ Programmer's Guide
packet_framework
vhost_lib
metrics_lib
- port_hotplug_framework
bpf_lib
source_org
dev_kit_build_system
diff --git a/doc/guides/prog_guide/kernel_nic_interface.rst b/doc/guides/prog_guide/kernel_nic_interface.rst
index 8fa13fa1..33ea980e 100644
--- a/doc/guides/prog_guide/kernel_nic_interface.rst
+++ b/doc/guides/prog_guide/kernel_nic_interface.rst
@@ -29,58 +29,222 @@ The components of an application using the DPDK Kernel NIC Interface are shown i
The DPDK KNI Kernel Module
--------------------------
-The KNI kernel loadable module provides support for two types of devices:
+The KNI kernel loadable module ``rte_kni`` provides the kernel interface
+for DPDK applications.
-* A Miscellaneous device (/dev/kni) that:
+When the ``rte_kni`` module is loaded, it will create a device ``/dev/kni``
+that is used by the DPDK KNI API functions to control and communicate with
+the kernel module.
- * Creates net devices (via ioctl calls).
+The ``rte_kni`` kernel module contains several optional parameters which
+can be specified when the module is loaded to control its behavior:
- * Maintains a kernel thread context shared by all KNI instances
- (simulating the RX side of the net driver).
+.. code-block:: console
- * For single kernel thread mode, maintains a kernel thread context shared by all KNI instances
- (simulating the RX side of the net driver).
+ # modinfo rte_kni.ko
+ <snip>
+ parm: lo_mode: KNI loopback mode (default=lo_mode_none):
+ lo_mode_none Kernel loopback disabled
+ lo_mode_fifo Enable kernel loopback with fifo
+ lo_mode_fifo_skb Enable kernel loopback with fifo and skb buffer
+ (charp)
+ parm: kthread_mode: Kernel thread mode (default=single):
+ single Single kernel thread mode enabled.
+ multiple Multiple kernel thread mode enabled.
+ (charp)
+ parm: carrier: Default carrier state for KNI interface (default=off):
+ off Interfaces will be created with carrier state set to off.
+ on Interfaces will be created with carrier state set to on.
+ (charp)
- * For multiple kernel thread mode, maintains a kernel thread context for each KNI instance
- (simulating the RX side of the net driver).
+Loading the ``rte_kni`` kernel module without any optional parameters is
+the typical way a DPDK application gets packets into and out of the kernel
+network stack. Without any parameters, only one kernel thread is created
+for all KNI devices to receive packets on the kernel side, loopback mode is
+disabled, and the default carrier state of KNI interfaces is set to *off*.
-* Net device:
+.. code-block:: console
- * Net functionality provided by implementing several operations such as netdev_ops,
- header_ops, ethtool_ops that are defined by struct net_device,
- including support for DPDK mbufs and FIFOs.
+ # insmod kmod/rte_kni.ko
- * The interface name is provided from userspace.
+.. _kni_loopback_mode:
- * The MAC address can be the real NIC MAC address or random.
+Loopback Mode
+~~~~~~~~~~~~~
+
+For testing, the ``rte_kni`` kernel module can be loaded in loopback mode
+by specifying the ``lo_mode`` parameter:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko lo_mode=lo_mode_fifo
+
+The ``lo_mode_fifo`` loopback option will loop back ring enqueue/dequeue
+operations in kernel space.
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko lo_mode=lo_mode_fifo_skb
+
+The ``lo_mode_fifo_skb`` loopback option will loop back ring enqueue/dequeue
+operations and sk buffer copies in kernel space.
+
+If the ``lo_mode`` parameter is not specified, loopback mode is disabled.
+
+.. _kni_kernel_thread_mode:
+
+Kernel Thread Mode
+~~~~~~~~~~~~~~~~~~
+
+To provide flexibility of performance, the ``rte_kni`` KNI kernel module
+can be loaded with the ``kthread_mode`` parameter. The ``rte_kni`` kernel
+module supports two options: "single kernel thread" mode and "multiple
+kernel thread" mode.
+
+Single kernel thread mode is enabled as follows:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko kthread_mode=single
+
+This mode will create only one kernel thread for all KNI interfaces to
+receive data on the kernel side. By default, this kernel thread is not
+bound to any particular core, but the user can set the core affinity for
+this kernel thread by setting the ``core_id`` and ``force_bind`` parameters
+in ``struct rte_kni_conf`` when the first KNI interface is created.
+
+For optimum performance, the kernel thread should be bound to a core
+on the same socket as the DPDK lcores used in the application.
+
+The KNI kernel module can also be configured to start a separate kernel
+thread for each KNI interface created by the DPDK application. Multiple
+kernel thread mode is enabled as follows:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko kthread_mode=multiple
+
+This mode will create a separate kernel thread for each KNI interface to
+receive data on the kernel side. The core affinity of each ``kni_thread``
+kernel thread can be specified by setting the ``core_id`` and ``force_bind``
+parameters in ``struct rte_kni_conf`` when each KNI interface is created.
+
+Multiple kernel thread mode can provide scalable higher performance if
+sufficient unused cores are available on the host system.
+
+If the ``kthread_mode`` parameter is not specified, the "single kernel
+thread" mode is used.
+
+.. _kni_default_carrier_state:
+
+Default Carrier State
+~~~~~~~~~~~~~~~~~~~~~
+
+The default carrier state of KNI interfaces created by the ``rte_kni``
+kernel module is controlled via the ``carrier`` option when the module
+is loaded.
+
+If ``carrier=off`` is specified, the kernel module will leave the carrier
+state of the interface *down* when the interface is management enabled.
+The DPDK application can set the carrier state of the KNI interface using the
+``rte_kni_update_link()`` function. This is useful for DPDK applications
+which require that the carrier state of the KNI interface reflect the
+actual link state of the corresponding physical NIC port.
+
+If ``carrier=on`` is specified, the kernel module will automatically set
+the carrier state of the interface to *up* when the interface is management
+enabled. This is useful for DPDK applications which use the KNI interface as
+a purely virtual interface that does not correspond to any physical hardware
+and do not wish to explicitly set the carrier state of the interface with
+``rte_kni_update_link()``. It is also useful for testing in loopback mode
+where the NIC port may not be physically connected to anything.
+
+To set the default carrier state to *on*:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko carrier=on
+
+To set the default carrier state to *off*:
+
+.. code-block:: console
+
+ # insmod kmod/rte_kni.ko carrier=off
+
+If the ``carrier`` parameter is not specified, the default carrier state
+of KNI interfaces will be set to *off*.
KNI Creation and Deletion
-------------------------
-The KNI interfaces are created by a DPDK application dynamically.
-The interface name and FIFO details are provided by the application through an ioctl call
-using the rte_kni_device_info struct which contains:
+Before any KNI interfaces can be created, the ``rte_kni`` kernel module must
+be loaded into the kernel and configured with the ``rte_kni_init()`` function.
+
+The KNI interfaces are created by a DPDK application dynamically via the
+``rte_kni_alloc()`` function.
+
+The ``struct rte_kni_conf`` structure contains fields which allow the
+user to specify the interface name, set the MTU size, set an explicit or
+random MAC address and control the affinity of the kernel Rx thread(s)
+(both single and multi-threaded modes).
+
+The ``struct rte_kni_ops`` structure contains pointers to functions to
+handle requests from the ``rte_kni`` kernel module. These functions
+allow DPDK applications to perform actions when the KNI interfaces are
+manipulated by control commands or functions external to the application.
+
+For example, the DPDK application may wish to enable/disable a physical
+NIC port when a user enables/disables a KNI interface with ``ip link set
+[up|down] dev <ifaceX>``. The DPDK application can register a callback for
+``config_network_if`` which will be called when the interface management
+state changes.
+
+There are currently four callbacks for which the user can register
+application functions:
-* The interface name.
+``config_network_if``:
-* Physical addresses of the corresponding memzones for the relevant FIFOs.
+ Called when the management state of the KNI interface changes.
+ For example, when the user runs ``ip link set [up|down] dev <ifaceX>``.
-* Mbuf mempool details, both physical and virtual (to calculate the offset for mbuf pointers).
+``change_mtu``:
-* PCI information.
+ Called when the user changes the MTU size of the KNI
+ interface. For example, when the user runs ``ip link set mtu <size>
+ dev <ifaceX>``.
-* Core affinity.
+``config_mac_address``:
-Refer to rte_kni_common.h in the DPDK source code for more details.
+ Called when the user changes the MAC address of the KNI interface.
+ For example, when the user runs ``ip link set address <MAC>
+ dev <ifaceX>``. If the user sets this callback function to NULL,
+ but sets the ``port_id`` field to a value other than -1, a default
+ callback handler in the rte_kni library ``kni_config_mac_address()``
+ will be called which calls ``rte_eth_dev_default_mac_addr_set()``
+ on the specified ``port_id``.
-The physical addresses will be re-mapped into the kernel address space and stored in separate KNI contexts.
+``config_promiscusity``:
-The affinity of kernel RX thread (both single and multi-threaded modes) is controlled by force_bind and
-core_id config parameters.
+ Called when the user changes the promiscuity state of the KNI
+ interface. For example, when the user runs ``ip link set promisc
+ [on|off] dev <ifaceX>``. If the user sets this callback function to
+ NULL, but sets the ``port_id`` field to a value other than -1, a default
+ callback handler in the rte_kni library ``kni_config_promiscusity()``
+ will be called which calls ``rte_eth_promiscuous_<enable|disable>()``
+ on the specified ``port_id``.
-The KNI interfaces can be deleted by a DPDK application dynamically after being created.
-Furthermore, all those KNI interfaces not deleted will be deleted on the release operation
-of the miscellaneous device (when the DPDK application is closed).
+In order to run these callbacks, the application must periodically call
+the ``rte_kni_handle_request()`` function. Any user callback function
+registered will be called directly from ``rte_kni_handle_request()`` so
+care must be taken to prevent deadlock and to not block any DPDK fastpath
+tasks. Typically DPDK applications which use these callbacks will need
+to create a separate thread or secondary process to periodically call
+``rte_kni_handle_request()``.
+
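The polling model in the paragraph above can be sketched with a standalone dispatcher. This is an illustration only, not the rte_kni implementation: every ``toy_`` and ``app_`` name is hypothetical, the ops structure carries just two of the four callbacks, and a single-slot mailbox stands in for the request channel shared with the kernel module. It shows that a registered callback runs inside the polling call itself and that a request with no registered handler is answered with an error.

```c
#include <stddef.h>
#include <stdint.h>

enum toy_req_id {
    TOY_REQ_NONE = 0,
    TOY_REQ_CHANGE_MTU,
    TOY_REQ_CFG_NETWORK_IF
};

/* One pending request, as the kernel side would post it (modelled). */
struct toy_request {
    enum toy_req_id id;
    unsigned int arg;   /* new MTU, or 1/0 for interface up/down */
    int result;         /* filled in by the handler */
};

/* Reduced stand-in for an ops structure with user callbacks. */
struct toy_kni_ops {
    uint16_t port_id;
    int (*change_mtu)(uint16_t port_id, unsigned int new_mtu);
    int (*config_network_if)(uint16_t port_id, uint8_t if_up);
};

struct toy_kni {
    struct toy_kni_ops ops;
    struct toy_request pending;   /* single-slot mailbox */
};

/* Example application callback: records the last accepted MTU. */
unsigned int app_last_mtu;

int app_change_mtu(uint16_t port_id, unsigned int new_mtu)
{
    (void)port_id;
    if (new_mtu < 68)             /* assumption: reject tiny MTUs */
        return -1;
    app_last_mtu = new_mtu;
    return 0;
}

/*
 * Poll once. Returns 1 if a request was handled, 0 if none was pending.
 * The callback runs in the caller's context, so a slow callback stalls
 * whichever thread does the polling.
 */
int toy_handle_request(struct toy_kni *kni)
{
    struct toy_request *req = &kni->pending;

    if (req->id == TOY_REQ_NONE)
        return 0;

    switch (req->id) {
    case TOY_REQ_CHANGE_MTU:
        req->result = kni->ops.change_mtu != NULL ?
            kni->ops.change_mtu(kni->ops.port_id, req->arg) : -1;
        break;
    case TOY_REQ_CFG_NETWORK_IF:
        req->result = kni->ops.config_network_if != NULL ?
            kni->ops.config_network_if(kni->ops.port_id,
                                       (uint8_t)req->arg) : -1;
        break;
    default:
        req->result = -1;         /* unknown request */
        break;
    }
    req->id = TOY_REQ_NONE;       /* mark the mailbox as consumed */
    return 1;
}
```

Because the callback executes inside ``toy_handle_request``, the thread that polls is the thread that pays for the callback's run time, which is why the text recommends polling from a thread or process that is not on the fast path.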
+The KNI interfaces can be deleted by a DPDK application with
+``rte_kni_release()``. All KNI interfaces not explicitly deleted will be
+deleted when the ``/dev/kni`` device is closed, either explicitly with
+``rte_kni_close()`` or when the DPDK application is closed.
DPDK mbuf Flow
--------------
@@ -118,7 +282,7 @@ The packet is received from the Linux net stack, by calling the kni_net_tx() cal
The mbuf is dequeued (without waiting due the cache) and filled with data from sk_buff.
The sk_buff is then freed and the mbuf sent in the tx_q FIFO.
-The DPDK TX thread dequeues the mbuf and sends it to the PMD (via rte_eth_tx_burst()).
+The DPDK TX thread dequeues the mbuf and sends it to the PMD via ``rte_eth_tx_burst()``.
It then puts the mbuf back in the cache.
Ethtool
@@ -128,16 +292,3 @@ Ethtool is a Linux-specific tool with corresponding support in the kernel
where each net device must register its own callbacks for the supported operations.
The current implementation uses the igb/ixgbe modified Linux drivers for ethtool support.
Ethtool is not supported in i40e and VMs (VF or EM devices).
-
-Link state and MTU change
--------------------------
-
-Link state and MTU change are network interface specific operations usually done via ifconfig.
-The request is initiated from the kernel side (in the context of the ifconfig process)
-and handled by the user space DPDK application.
-The application polls the request, calls the application handler and returns the response back into the kernel space.
-
-The application handlers can be registered upon interface creation or explicitly registered/unregistered in runtime.
-This provides flexibility in multiprocess scenarios
-(where the KNI is created in the primary process but the callbacks are handled in the secondary one).
-The constraint is that a single process can register and handle the requests.
diff --git a/doc/guides/prog_guide/packet_framework.rst b/doc/guides/prog_guide/packet_framework.rst
index f0b48566..48d25750 100644
--- a/doc/guides/prog_guide/packet_framework.rst
+++ b/doc/guides/prog_guide/packet_framework.rst
@@ -98,6 +98,10 @@ Port Types
| | | character device. |
| | | |
+---+------------------+---------------------------------------------------------------------------------------+
+ | 9 | Sym_crypto | Output port used to extract DPDK Cryptodev operations from a fixed offset of the |
+ | | | packet and then enqueue to the Cryptodev PMD. Input port used to dequeue the |
+ | | | Cryptodev operations from the Cryptodev PMD and then retrieve the packets from them. |
+ +---+------------------+---------------------------------------------------------------------------------------+
Port Interface
~~~~~~~~~~~~~~
@@ -1078,6 +1082,11 @@ with each table entry having its own set of enabled user actions and its own cop
| | | checksum. |
| | | |
+---+-----------------------------------+---------------------------------------------------------------------+
+ | 7 | Sym Crypto | Generate Cryptodev session based on the user-specified algorithm |
+ | | | and key(s), and assemble the cryptodev operation based on the |
+ | | | predefined offsets. |
+ | | | |
+ +---+-----------------------------------+---------------------------------------------------------------------+
Multicore Scaling
-----------------
@@ -1133,7 +1142,7 @@ Typical devices with acceleration capabilities are:
* Inline accelerators: NICs, switches, FPGAs, etc;
-* Look-aside accelerators: chipsets, FPGAs, etc.
+* Look-aside accelerators: chipsets, FPGAs, Intel QuickAssist, etc.
Usually, to support a specific functional block, specific implementation of Packet Framework tables and/or ports and/or actions has to be provided for each accelerator,
with all the implementations sharing the same API: pure SW implementation (no acceleration), implementation using accelerator A, implementation using accelerator B, etc.
diff --git a/doc/guides/prog_guide/port_hotplug_framework.rst b/doc/guides/prog_guide/port_hotplug_framework.rst
deleted file mode 100644
index fb0efc18..00000000
--- a/doc/guides/prog_guide/port_hotplug_framework.rst
+++ /dev/null
@@ -1,106 +0,0 @@
-.. BSD LICENSE
- Copyright(c) 2015 IGEL Co.,Ltd. All rights reserved.
- All rights reserved.
-
- Redistribution and use in source and binary forms, with or without
- modification, are permitted provided that the following conditions
- are met:
-
- * Redistributions of source code must retain the above copyright
- notice, this list of conditions and the following disclaimer.
- * Redistributions in binary form must reproduce the above copyright
- notice, this list of conditions and the following disclaimer in
- the documentation and/or other materials provided with the
- distribution.
- * Neither the name of IGEL Co.,Ltd. nor the names of its
- contributors may be used to endorse or promote products derived
- from this software without specific prior written permission.
-
- THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
- "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
- LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
- A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
- OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
- SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
- LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
- DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
- THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
- (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
- OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-
-Port Hotplug Framework
-======================
-
-The Port Hotplug Framework provides DPDK applications with the ability to
-attach and detach ports at runtime. Because the framework depends on PMD
-implementation, the ports that PMDs cannot handle are out of scope of this
-framework. Furthermore, after detaching a port from a DPDK application, the
-framework doesn't provide a way for removing the devices from the system.
-For the ports backed by a physical NIC, the kernel will need to support PCI
-Hotplug feature.
-
-Overview
---------
-
-The basic requirements of the Port Hotplug Framework are:
-
-* DPDK applications that use the Port Hotplug Framework must manage their
- own ports.
-
- The Port Hotplug Framework is implemented to allow DPDK applications to
- manage ports. For example, when DPDK applications call the port attach
- function, the attached port number is returned. DPDK applications can
- also detach the port by port number.
-
-* Kernel support is needed for attaching or detaching physical device
- ports.
-
- To attach new physical device ports, the device will be recognized by
- userspace driver I/O framework in kernel at first. Then DPDK
- applications can call the Port Hotplug functions to attach the ports.
- For detaching, steps are vice versa.
-
-* Before detaching, they must be stopped and closed.
-
- DPDK applications must call "rte_eth_dev_stop()" and
- "rte_eth_dev_close()" APIs before detaching ports. These functions will
- start finalization sequence of the PMDs.
-
-* The framework doesn't affect legacy DPDK applications behavior.
-
- If the Port Hotplug functions aren't called, all legacy DPDK apps can
- still work without modifications.
-
-Port Hotplug API overview
--------------------------
-
-* Attaching a port
-
- "rte_eth_dev_attach()" API attaches a port to DPDK application, and
- returns the attached port number. Before calling the API, the device
- should be recognized by an userspace driver I/O framework. The API
- receives a pci address like "0000:01:00.0" or a virtual device name
- like "net_pcap0,iface=eth0". In the case of virtual device name, the
- format is the same as the general "--vdev" option of DPDK.
-
-* Detaching a port
-
- "rte_eth_dev_detach()" API detaches a port from DPDK application, and
- returns a pci address of the detached device or a virtual device name
- of the device.
-
-Reference
----------
-
- "testpmd" supports the Port Hotplug Framework.
-
-Limitations
------------
-
-* The Port Hotplug APIs are not thread safe.
-
-* The framework can only be enabled with Linux. BSD is not supported.
-
-* Not all PMDs support detaching feature.
- The underlying bus must support hot-unplug. If not supported,
- the function ``rte_eth_dev_detach()`` will return negative ENOTSUP.
diff --git a/doc/guides/prog_guide/power_man.rst b/doc/guides/prog_guide/power_man.rst
index eba1cc6b..68b7e8b6 100644
--- a/doc/guides/prog_guide/power_man.rst
+++ b/doc/guides/prog_guide/power_man.rst
@@ -106,6 +106,92 @@ User Cases
The power management mechanism is used to save power when performing L3 forwarding.
+
+Empty Poll API
+--------------
+
+Abstract
+~~~~~~~~
+
+For packet processing workloads such as DPDK, polling is continuous.
+This means CPU cores always show 100% busy independent of how much work
+those cores are doing. It is important to determine accurately how busy
+a core is, because otherwise:
+
+ * There is no indication of overload conditions
+ * The user does not know how much real load is on the system, resulting
+ in wasted energy as no power management is utilized
+
+Compared to the original l3fwd-power design, instead of going to sleep
+after detecting an empty poll, the new mechanism just lowers the core frequency.
+As a result, the application does not stop polling the device, which leads
+to improved handling of bursts of traffic.
+
+When the system becomes busy, the empty poll mechanism can also increase the core
+frequency (including turbo) to make a best effort at handling intensive traffic.
+This gives more flexible and balanced traffic awareness than the standard
+l3fwd-power application.
+
+
+Proposed Solution
+~~~~~~~~~~~~~~~~~
+The proposed solution focuses on how many times empty polls are executed.
+A low number of empty polls means the current core is busy processing its
+workload, so a higher frequency is needed. A high number of empty polls
+indicates the current core is not doing any real work, so the frequency
+can be lowered to save power.
+
+In the current implementation, each core has 1 empty-poll counter, which assumes
+1 core is dedicated to 1 queue. This will need to be expanded in the future to
+support multiple queues per core.
+
+Power state definition:
+^^^^^^^^^^^^^^^^^^^^^^^
+
+* LOW: Not currently used, reserved for future use.
+
+* MED: the frequency is used to process modest traffic workload.
+
+* HIGH: the frequency is used to process busy traffic workload.
+
+There are two phases to establish the power management system:
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Training phase. This phase is used to measure the optimal frequency
+ change thresholds for a given system. The thresholds will differ from
+ system to system due to differences in processor micro-architecture,
+ cache and device configurations.
+ In this phase, the user must ensure that no traffic can enter the
+ system so that counts can be measured for empty polls at low, medium
+ and high frequencies. Each frequency is measured for two seconds.
+ Once the training phase is complete, the threshold numbers are
+ displayed, normal mode resumes, and traffic can be allowed into
+ the system. These threshold numbers can be used on the command line
+ when starting the application in normal mode to avoid re-training
+ every time.
+
+* Normal phase. Every 10ms the run-time counters are compared
+ to the supplied threshold values, and the decision will be made
+ whether to move to a different power state (by adjusting the
+ frequency).
+
+API Overview for Empty Poll Power Management
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+* **State Init**: initialize the power management system.
+
+* **State Free**: free the resources held by the power management system.
+
+* **Update Empty Poll Counter**: update the empty poll counter.
+
+* **Update Valid Poll Counter**: update the valid poll counter.
+
+* **Set the Frequency Index**: update the power state/frequency mapping.
+
+* **Detect empty poll state change**: run the empty poll state change detection algorithm, then take action.
+
+User Cases
+----------
+The mechanism can be applied to any polling-based device, e.g. NIC, FPGA.
+
References
----------
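The normal-phase decision described in the power_man.rst hunk above (compare the empty-poll count for a sampling interval against a trained threshold, then pick a power state) can be sketched in plain C. This is only an illustration under stated assumptions: `sketch_power_state`, `sketch_pick_state` and the single `threshold` argument are hypothetical names and are not part of the DPDK power library, whose actual comparison logic may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical states mirroring the MED/HIGH definitions in the text.
 * LOW is omitted because the text marks it as reserved for future use. */
enum sketch_power_state {
	SKETCH_STATE_MED,
	SKETCH_STATE_HIGH
};

/*
 * Assumption: one empty-poll counter per core, sampled once per interval.
 * Fewer empty polls than the trained threshold means the core is busy,
 * so the higher frequency is selected; otherwise the core is mostly idle
 * and the medium frequency is selected.
 */
static enum sketch_power_state
sketch_pick_state(uint64_t empty_polls, uint64_t threshold)
{
	return empty_polls < threshold ? SKETCH_STATE_HIGH : SKETCH_STATE_MED;
}
```

A caller would invoke `sketch_pick_state()` once per sampling interval with the counter value accumulated since the previous call, then reset the counter.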
diff --git a/doc/guides/prog_guide/profile_app.rst b/doc/guides/prog_guide/profile_app.rst
index 1106216a..02f05614 100644
--- a/doc/guides/prog_guide/profile_app.rst
+++ b/doc/guides/prog_guide/profile_app.rst
@@ -33,38 +33,12 @@ Refer to the
for details about application profiling.
-Empty cycles tracing
+Profiling with VTune
~~~~~~~~~~~~~~~~~~~~
-Iterations that yielded no RX packets (empty cycles, wasted iterations) can
-be analyzed using VTune Amplifier. This profiling employs the
-`Instrumentation and Tracing Technology (ITT) API
-<https://software.intel.com/en-us/node/544195>`_
-feature of VTune Amplifier and requires only reconfiguring the DPDK library,
-no changes in a DPDK application are needed.
-
-To trace wasted iterations on RX queues, first reconfigure DPDK with
-``CONFIG_RTE_ETHDEV_RXTX_CALLBACKS`` and
-``CONFIG_RTE_ETHDEV_PROFILE_ITT_WASTED_RX_ITERATIONS`` enabled.
-
-Then rebuild DPDK, specifying paths to the ITT header and library, which can
-be found in any VTune Amplifier distribution in the *include* and *lib*
-directories respectively:
-
-.. code-block:: console
-
- make EXTRA_CFLAGS=-I<path to ittnotify.h> \
- EXTRA_LDLIBS="-L<path to libittnotify.a> -littnotify"
-
-Finally, to see wasted iterations in your performance analysis results,
-select the *"Analyze user tasks, events, and counters"* checkbox in the
-*"Analysis Type"* tab when configuring analysis via VTune Amplifier GUI.
-Alternatively, when running VTune Amplifier via command line, specify
-``-knob enable-user-tasks=true`` option.
-
-Collected regions of wasted iterations will be marked on VTune Amplifier's
-timeline as ITT tasks. These ITT tasks have predefined names, containing
-Ethernet device and RX queue identifiers.
+To allow VTune to attach to the DPDK application, reconfigure and recompile
+DPDK with ``CONFIG_RTE_ETHDEV_RXTX_CALLBACKS`` and
+``CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE`` enabled.
Profiling on ARM64
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index b305a72a..c1863750 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -1191,6 +1191,27 @@ Normally preceded by any of:
- `Item: ICMP6_ND_NS`_
- `Item: ICMP6_ND_OPT`_
+Item: ``META``
+^^^^^^^^^^^^^^
+
+Matches an application specific 32 bit metadata item.
+
+- Default ``mask`` matches the specified metadata value.
+
+.. _table_rte_flow_item_meta:
+
+.. table:: META
+
+ +----------+----------+---------------------------------------+
+ | Field | Subfield | Value |
+ +==========+==========+=======================================+
+ | ``spec`` | ``data`` | 32 bit metadata value |
+ +----------+----------+---------------------------------------+
+ | ``last`` | ``data`` | upper range value |
+ +----------+----------+---------------------------------------+
+ | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
+ +----------+----------+---------------------------------------+
+
Actions
~~~~~~~
@@ -2076,6 +2097,250 @@ RTE_FLOW_ERROR_TYPE_ACTION error should be returned.
This action modifies the payload of matched flows.
+Action: ``RAW_ENCAP``
+^^^^^^^^^^^^^^^^^^^^^
+
+Adds outer header whose template is provided in its data buffer,
+as defined in the ``rte_flow_action_raw_encap`` definition.
+
+This action modifies the payload of matched flows. The data supplied must
+be a valid header, either holding layer 2 data in case of adding layer 2 after
+decap layer 3 tunnel (for example MPLSoGRE) or complete tunnel definition
+starting from layer 2 and moving to the tunnel item itself. When applied to
+the original packet the resulting packet must be a valid packet.
+
+.. _table_rte_flow_action_raw_encap:
+
+.. table:: RAW_ENCAP
+
+ +----------------+----------------------------------------+
+ | Field | Value |
+ +================+========================================+
+ | ``data`` | Encapsulation data |
+ +----------------+----------------------------------------+
+ | ``preserve`` | Bit-mask of data to preserve on output |
+ +----------------+----------------------------------------+
+ | ``size`` | Size of data and preserve |
+ +----------------+----------------------------------------+
+
+Action: ``RAW_DECAP``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Removes the outer header whose template is provided in its data buffer,
+as defined in the ``rte_flow_action_raw_decap`` definition.
+
+This action modifies the payload of matched flows. The data supplied must
+be a valid header, either holding layer 2 data in case of removing layer 2
+before encapsulation of layer 3 tunnel (for example MPLSoGRE) or complete
+tunnel definition starting from layer 2 and moving to the tunnel item itself.
+When applied to the original packet the resulting packet must be a
+valid packet.
+
+.. _table_rte_flow_action_raw_decap:
+
+.. table:: RAW_DECAP
+
+ +----------------+----------------------------------------+
+ | Field | Value |
+ +================+========================================+
+ | ``data`` | Decapsulation data |
+ +----------------+----------------------------------------+
+ | ``size`` | Size of data |
+ +----------------+----------------------------------------+
+
+Action: ``SET_IPV4_SRC``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new IPv4 source address in the outermost IPv4 header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_IPV4 flow pattern item.
+Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_set_ipv4_src:
+
+.. table:: SET_IPV4_SRC
+
+ +---------------+-------------------------+
+ | Field | Value |
+ +===============+=========================+
+ | ``ipv4_addr`` | new IPv4 source address |
+ +---------------+-------------------------+
+
+Action: ``SET_IPV4_DST``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new IPv4 destination address in the outermost IPv4 header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_IPV4 flow pattern item.
+Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_set_ipv4_dst:
+
+.. table:: SET_IPV4_DST
+
+ +---------------+------------------------------+
+ | Field | Value |
+ +===============+==============================+
+ | ``ipv4_addr`` | new IPv4 destination address |
+ +---------------+------------------------------+
+
+Action: ``SET_IPV6_SRC``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new IPv6 source address in the outermost IPv6 header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_IPV6 flow pattern item.
+Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_set_ipv6_src:
+
+.. table:: SET_IPV6_SRC
+
+ +---------------+-------------------------+
+ | Field | Value |
+ +===============+=========================+
+ | ``ipv6_addr`` | new IPv6 source address |
+ +---------------+-------------------------+
+
+Action: ``SET_IPV6_DST``
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new IPv6 destination address in the outermost IPv6 header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_IPV6 flow pattern item.
+Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_set_ipv6_dst:
+
+.. table:: SET_IPV6_DST
+
+ +---------------+------------------------------+
+ | Field | Value |
+ +===============+==============================+
+ | ``ipv6_addr`` | new IPv6 destination address |
+ +---------------+------------------------------+
+
+Action: ``SET_TP_SRC``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new source port number in the outermost TCP/UDP header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_TCP or RTE_FLOW_ITEM_TYPE_UDP
+flow pattern item. Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_set_tp_src:
+
+.. table:: SET_TP_SRC
+
+ +----------+-------------------------+
+ | Field | Value |
+ +==========+=========================+
+ | ``port`` | new TCP/UDP source port |
+ +----------+-------------------------+
+
+Action: ``SET_TP_DST``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new destination port number in the outermost TCP/UDP header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_TCP or RTE_FLOW_ITEM_TYPE_UDP
+flow pattern item. Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_set_tp_dst:
+
+.. table:: SET_TP_DST
+
+ +----------+------------------------------+
+ | Field | Value |
+ +==========+==============================+
+ | ``port`` | new TCP/UDP destination port |
+ +----------+------------------------------+
+
+Action: ``MAC_SWAP``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Swap the source and destination MAC addresses in the outermost Ethernet
+header.
+
+It must be used with a valid RTE_FLOW_ITEM_TYPE_ETH flow pattern item.
+Otherwise, RTE_FLOW_ERROR_TYPE_ACTION error will be returned.
+
+.. _table_rte_flow_action_mac_swap:
+
+.. table:: MAC_SWAP
+
+ +---------------+
+ | Field |
+ +===============+
+ | no properties |
+ +---------------+
+
+Action: ``DEC_TTL``
+^^^^^^^^^^^^^^^^^^^
+
+Decrease TTL value.
+
+If there is no valid RTE_FLOW_ITEM_TYPE_IPV4 or RTE_FLOW_ITEM_TYPE_IPV6
+in the pattern, some PMDs will reject the rule because the behaviour would be undefined.
+
+.. _table_rte_flow_action_dec_ttl:
+
+.. table:: DEC_TTL
+
+ +---------------+
+ | Field |
+ +===============+
+ | no properties |
+ +---------------+
+
+Action: ``SET_TTL``
+^^^^^^^^^^^^^^^^^^^
+
+Assigns a new TTL value.
+
+If there is no valid RTE_FLOW_ITEM_TYPE_IPV4 or RTE_FLOW_ITEM_TYPE_IPV6
+in the pattern, some PMDs will reject the rule because the behaviour would be undefined.
+
+.. _table_rte_flow_action_set_ttl:
+
+.. table:: SET_TTL
+
+ +---------------+--------------------+
+ | Field | Value |
+ +===============+====================+
+ | ``ttl_value`` | new TTL value |
+ +---------------+--------------------+
+
+Action: ``SET_MAC_SRC``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new source MAC address.
+
+.. _table_rte_flow_action_set_mac_src:
+
+.. table:: SET_MAC_SRC
+
+ +--------------+---------------+
+ | Field | Value |
+ +==============+===============+
+ | ``mac_addr`` | MAC address |
+ +--------------+---------------+
+
+Action: ``SET_MAC_DST``
+^^^^^^^^^^^^^^^^^^^^^^^
+
+Set a new destination MAC address.
+
+.. _table_rte_flow_action_set_mac_dst:
+
+.. table:: SET_MAC_DST
+
+ +--------------+---------------+
+ | Field | Value |
+ +==============+===============+
+ | ``mac_addr`` | MAC address |
+ +--------------+---------------+
+
Negative types
~~~~~~~~~~~~~~
@@ -2419,6 +2684,26 @@ This function initializes ``error`` (if non-NULL) with the provided
parameters and sets ``rte_errno`` to ``code``. A negative error ``code`` is
then returned.
+Object conversion
+~~~~~~~~~~~~~~~~~
+
+.. code-block:: c
+
+ int
+ rte_flow_conv(enum rte_flow_conv_op op,
+ void *dst,
+ size_t size,
+ const void *src,
+ struct rte_flow_error *error);
+
+Convert ``src`` to ``dst`` according to operation ``op``. Possible
+operations include:
+
+- Attributes, pattern item or action duplication.
+- Duplication of an entire pattern or list of actions.
+- Duplication of a complete flow rule description.
+- Pattern item or action name retrieval.
+
Caveats
-------
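The ``META`` item added in the rte_flow.rst hunk above matches a 32 bit metadata value under a bit-mask. The masked comparison can be sketched in plain C as follows. This is a minimal sketch, not PMD code: `sketch_meta` and `sketch_meta_match` are hypothetical names, the struct only stands in for the ``data`` subfield shown in the table, and range matching via ``last`` is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the "data" subfield of a META spec or mask. */
struct sketch_meta {
	uint32_t data;
};

/*
 * Returns true when the packet metadata equals spec->data on every bit
 * selected by mask->data. Bits cleared in the mask are ignored.
 */
static bool
sketch_meta_match(uint32_t pkt_meta, const struct sketch_meta *spec,
		  const struct sketch_meta *mask)
{
	return (pkt_meta & mask->data) == (spec->data & mask->data);
}
```

With a mask of `0xffff0000`, for example, only the upper 16 bits of the metadata take part in the comparison, so a single rule can cover a whole group of metadata values.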
diff --git a/doc/guides/prog_guide/rte_security.rst b/doc/guides/prog_guide/rte_security.rst
index 0812abe7..cb70caa7 100644
--- a/doc/guides/prog_guide/rte_security.rst
+++ b/doc/guides/prog_guide/rte_security.rst
@@ -10,8 +10,8 @@ The security library provides a framework for management and provisioning
of security protocol operations offloaded to hardware based devices. The
library defines generic APIs to create and free security sessions which can
support full protocol offload as well as inline crypto operation with
-NIC or crypto devices. The framework currently only supports the IPSec protocol
-and associated operations, other protocols will be added in future.
+NIC or crypto devices. The framework currently supports only the IPsec and PDCP
+protocols and associated operations; other protocols will be added in the future.
Design Principles
-----------------
@@ -253,6 +253,49 @@ for any protocol header addition.
+--------|--------+
V
+PDCP Flow Diagram
+~~~~~~~~~~~~~~~~~
+
+Based on 3GPP TS 36.323 Evolved Universal Terrestrial Radio Access (E-UTRA);
+Packet Data Convergence Protocol (PDCP) specification.
+
+.. code-block:: c
+
+ Transmitting PDCP Entity Receiving PDCP Entity
+ | ^
+ | +-----------|-----------+
+ V | In order delivery and |
+ +---------|----------+ | Duplicate detection |
+ | Sequence Numbering | | (Data Plane only) |
+ +---------|----------+ +-----------|-----------+
+ | |
+ +---------|----------+ +-----------|----------+
+ | Header Compression*| | Header Decompression*|
+ | (Data-Plane only) | | (Data Plane only) |
+ +---------|----------+ +-----------|----------+
+ | |
+ +---------|-----------+ +-----------|----------+
+ | Integrity Protection| |Integrity Verification|
+ | (Control Plane only)| | (Control Plane only) |
+ +---------|-----------+ +-----------|----------+
+ +---------|-----------+ +----------|----------+
+ | Ciphering | | Deciphering |
+ +---------|-----------+ +----------|----------+
+ +---------|-----------+ +----------|----------+
+ | Add PDCP header | | Remove PDCP Header |
+ +---------|-----------+ +----------|----------+
+ | |
+ +----------------->>----------------+
+
+
+.. note::
+
+ * Header Compression and decompression are not supported currently.
+
+As with IPsec, PDCP header addition/deletion, ciphering/deciphering and
+integrity protection/verification are performed according to the action
+type chosen.
+
Device Features and Capabilities
---------------------------------
@@ -271,7 +314,7 @@ structure in the *DPDK API Reference*.
Each driver (crypto or ethernet) defines its own private array of capabilities
for the operations it supports. Below is an example of the capabilities for a
-PMD which supports the IPSec protocol.
+PMD which supports the IPsec and PDCP protocols.
.. code-block:: c
@@ -298,6 +341,24 @@ PMD which supports the IPSec protocol.
},
.crypto_capabilities = pmd_capabilities
},
+ { /* PDCP Lookaside Protocol offload Data Plane */
+ .action = RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
+ .protocol = RTE_SECURITY_PROTOCOL_PDCP,
+ .pdcp = {
+ .domain = RTE_SECURITY_PDCP_MODE_DATA,
+ .capa_flags = 0
+ },
+ .crypto_capabilities = pmd_capabilities
+ },
+ { /* PDCP Lookaside Protocol offload Control */
+ .action = RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
+ .protocol = RTE_SECURITY_PROTOCOL_PDCP,
+ .pdcp = {
+ .domain = RTE_SECURITY_PDCP_MODE_CONTROL,
+ .capa_flags = 0
+ },
+ .crypto_capabilities = pmd_capabilities
+ },
{
.action = RTE_SECURITY_ACTION_TYPE_NONE
}
@@ -429,6 +490,7 @@ Security Session configuration structure is defined as ``rte_security_session_co
union {
struct rte_security_ipsec_xform ipsec;
struct rte_security_macsec_xform macsec;
+ struct rte_security_pdcp_xform pdcp;
};
/**< Configuration parameters for security session */
struct rte_crypto_sym_xform *crypto_xform;
@@ -463,15 +525,17 @@ The ``rte_security_session_protocol`` is defined as
.. code-block:: c
enum rte_security_session_protocol {
- RTE_SECURITY_PROTOCOL_IPSEC,
+ RTE_SECURITY_PROTOCOL_IPSEC = 1,
/**< IPsec Protocol */
RTE_SECURITY_PROTOCOL_MACSEC,
/**< MACSec Protocol */
+ RTE_SECURITY_PROTOCOL_PDCP,
+ /**< PDCP Protocol */
};
-Currently the library defines configuration parameters for IPSec only. For other
-protocols like MACSec, structures and enums are defined as place holders which
-will be updated in the future.
+Currently the library defines configuration parameters for IPsec and PDCP only.
+For other protocols like MACSec, structures and enums are defined as placeholders
+which will be updated in the future.
IPsec related configuration parameters are defined in ``rte_security_ipsec_xform``
@@ -494,6 +558,35 @@ IPsec related configuration parameters are defined in ``rte_security_ipsec_xform
/**< Tunnel parameters, NULL for transport mode */
};
+PDCP related configuration parameters are defined in ``rte_security_pdcp_xform``
+
+.. code-block:: c
+
+ struct rte_security_pdcp_xform {
+ int8_t bearer; /**< PDCP bearer ID */
+ /** Enable in order delivery, this field shall be set only if
+ * driver/HW is capable. See RTE_SECURITY_PDCP_ORDERING_CAP.
+ */
+ uint8_t en_ordering;
+ /** Notify driver/HW to detect and remove duplicate packets.
+ * This field should be set only when driver/hw is capable.
+ * See RTE_SECURITY_PDCP_DUP_DETECT_CAP.
+ */
+ uint8_t remove_duplicates;
+ /** PDCP mode of operation: Control or data */
+ enum rte_security_pdcp_domain domain;
+ /** PDCP Frame Direction 0:UL 1:DL */
+ enum rte_security_pdcp_direction pkt_dir;
+ /** Sequence number size, 5/7/12/15/18 */
+ enum rte_security_pdcp_sn_size sn_size;
+ /** Starting Hyper Frame Number to be used together with the SN
+ * from the PDCP frames
+ */
+ uint32_t hfn;
+ /** HFN Threshold for key renegotiation */
+ uint32_t hfn_threshold;
+ };
+
Security API
~~~~~~~~~~~~
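The ``hfn`` field of ``rte_security_pdcp_xform`` in the rte_security.rst hunk above is described as being "used together with the SN from the PDCP frames". In 3GPP TS 36.323 that combination is the 32-bit COUNT value, with the HFN in the upper bits and the sequence number in the lower ``sn_size`` bits. The sketch below shows the arithmetic only. `sketch_pdcp_count` is a hypothetical helper and not part of the security library, and how a given driver or device tracks COUNT internally is not specified in the text.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Build COUNT = HFN concatenated with SN.
 * sn_bits is the sequence number size in bits (5/7/12/15/18 in the text).
 * HFN bits that do not fit in the remaining (32 - sn_bits) bits are
 * discarded by the shift.
 */
static uint32_t
sketch_pdcp_count(uint32_t hfn, uint32_t sn, unsigned int sn_bits)
{
	uint32_t sn_mask;

	assert(sn_bits > 0 && sn_bits < 32);
	sn_mask = (UINT32_C(1) << sn_bits) - 1;
	return (hfn << sn_bits) | (sn & sn_mask);
}
```

For a 12-bit SN, an HFN of 1 and an SN of 5 give a COUNT of 4101 (4096 + 5).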
diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index 77af4d77..c77df338 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -106,6 +106,14 @@ The following is an overview of some key Vhost API functions:
Enabling this flag with these Qemu version results in Qemu being blocked
when multiple queue pairs are declared.
+ - ``RTE_VHOST_USER_POSTCOPY_SUPPORT``
+
+ Postcopy live-migration support will be enabled when this flag is set.
+ It is disabled by default.
+
+ Enabling this flag should only be done when the calling application does
+ not pre-fault the guest shared memory, otherwise migration would fail.
+
* ``rte_vhost_driver_set_features(path, features)``
This function sets the feature bits the vhost-user driver supports. The