author Maciek Konstantynowicz <mkonstan@cisco.com> 2018-02-19 14:16:30 +0000
committer Tibor Frank <tifrank@cisco.com> 2018-02-19 14:33:02 +0000
commit cddac498bafc7a6092dade5e183e5c7a95cff64d
tree 995f4d8be7530ee537a2c133900264e1532681da /docs/report/vpp_performance_tests
parent ec52d31c064fff2a6bf9c1e0716efa881bcf106e
rls1801 report: edits to static content for vpp and dpdk perf sections.
Change-Id: I22a38d2704b3a414798823c1846ff12f8f69d7b7
Signed-off-by: Maciek Konstantynowicz <mkonstan@cisco.com>
Diffstat (limited to 'docs/report/vpp_performance_tests')
-rw-r--r-- docs/report/vpp_performance_tests/csit_release_notes.rst | 72
-rw-r--r-- docs/report/vpp_performance_tests/overview.rst | 164
2 files changed, 142 insertions, 94 deletions
diff --git a/docs/report/vpp_performance_tests/csit_release_notes.rst b/docs/report/vpp_performance_tests/csit_release_notes.rst
index 754abc0d13..17003bc85a 100644
--- a/docs/report/vpp_performance_tests/csit_release_notes.rst
+++ b/docs/report/vpp_performance_tests/csit_release_notes.rst
@@ -6,32 +6,32 @@ Changes in CSIT |release|
#. Added VPP performance tests
- - **Container Topologies Orchestrated by K8s with VPP memif tests**
-
- - Added tests with VPP in L2 Cross-Connect and Bridge-Domain
- configurations containers, with service chain topologies orchestrated by
- Kubernetes. Added following forwarding topologies: i) "Parallel" with
- packets flowing from NIC via VPP to container and back to VPP and NIC;
- ii) "Chained" a.k.a. "Snake" with packets flowing via VPP to container,
- back to VPP, to next container, back to VPP and so on until the last
- container in chain, then back to VPP and NIC; iii) "Horizontal" with
- packets flowing via VPP to container, then via "horizontal" memif to
- next container, and so on until the last container, then back to VPP and
- NIC;.
+ - **Container Service Chain Topologies Orchestrated by K8s with VPP Memif**
+
+ - Added tests with VPP vswitch in container connecting a number of VPP-
+ in-container service chain topologies with L2 Cross-Connect and L2
+ Bridge-Domain configurations, orchestrated by Kubernetes. Added
+ following forwarding topologies: i) "Parallel" with packets flowing from
+ NIC via VPP to container and back to VPP and NIC; ii) "Chained" (a.k.a.
+ "Snake") with packets flowing via VPP to container, back to VPP, to next
+ container, back to VPP and so on until the last container in a chain,
+ then back to VPP and NIC; iii) "Horizontal" with packets flowing via VPP
+ to container, then via "horizontal" memif to next container, and so on
+ until the last container, then back to VPP and NIC;
- **VPP TCP/IP stack**
- Added tests for VPP TCP/IP stack using VPP built-in HTTP server.
WRK traffic generator is used as a client-side;
- - **SRv6 tests**
+ - **SRv6**
- Initial SRv6 (Segment Routing IPv6) tests verifying performance of
IPv6 and SRH (Segment Routing Header) encapsulation, decapsulation,
lookups and rewrites based on configured End and End.DX6 SRv6 egress
functions;
- - **IPSecSW tests**
+ - **IPSecSW**
- SW computed IPSec encryption with AES-GCM, CBC-SHA1 ciphers, in
combination with IPv4 routed-forwarding;
@@ -42,7 +42,7 @@ Changes in CSIT |release|
VPP tests into Presentation and Analytics Layer (PAL) for automated
CSIT test results analysis;
-#. Other improvements
+#. Other changes
- **Framework optimizations**
@@ -50,15 +50,30 @@ Changes in CSIT |release|
- Overall stability improvements;
+ - **NDR and PDR throughput binary search change**
+
+ - Increased binary search resolution by reducing final step from
+ 100kpps to 50kpps;
+
+ - **VPP plugin loaded as needed by tests**
+
+ - From this release only plugins required by tests are loaded at
+ VPP initialization time. Previously all plugins were loaded for
+ all tests;
+
Performance Changes
-------------------
-Substantial changes in measured packet throughput have been observed in a
-number of CSIT |release| tests listed below. Relative changes for this release
-are calculated against the test results listed in CSIT |release-1| report. The
-comparison is calculated between the mean values based on collected and
-archived test results' samples for involved VPP releases. Standard deviation
-has been also listed for CSIT |release|.
+Relative performance changes in measured packet throughput in CSIT
+|release| are calculated against the results from CSIT |release-1|
+report. Listed mean and standard deviation values are computed based on
+a series of the same tests executed against respective VPP releases to
+verify test results repeatability, with percentage change calculated for
+mean values. Note that the standard deviation is quite high for a small
+number of packet throughput tests, which indicates poor test results
+repeatability and makes the relative change of mean throughput value not
+fully representative for these tests. The root causes behind poor
+results repeatability vary between the test cases.
NDR Throughput Changes
~~~~~~~~~~~~~~~~~~~~~~
@@ -97,13 +112,14 @@ Here is the list of known issues in CSIT |release| for VPP performance tests:
| 1 | Vic1385 and Vic1227 low performance. | VPP-664 | Low NDR performance. |
| | | | |
+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
-| 2 | Sporadic NDR discovery test failures on x520. | CSIT-750 | Suspected issue with HW combination of X710-X520 in LF |
-| | | | infrastructure. Issue can't be replicated outside LF. |
+| 2 | Sporadic (1 in 200) NDR discovery test failures | CSIT-570 | DPDK reporting rx-errors, indicating L1 issue. Suspected issue |
+| | on x520. | | with HW combination of X710-X520 in LF testbeds. Not observed |
+| | | | outside of LF testbeds. |
+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
-| 3 | VPP in 2t2c setups - large variation | CSIT-568 | Suspected NIC firmware or DPDK driver issue affecting NDR |
-| | of discovered NDR throughput values across | | throughput. Applies to XL710 and X710 NICs, x520 NICs are fine. |
-| | multiple test runs with xl710 and x710 NICs. | | |
-+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
-| 4 | Lower than expected NDR throughput with | CSIT-569 | Suspected NIC firmware or DPDK driver issue affecting NDR and |
+| 3 | Lower than expected NDR throughput with | CSIT-571 | Suspected NIC firmware or DPDK driver issue affecting NDR and |
| | xl710 and x710 NICs, compared to x520 NICs. | | PDR throughput. Applies to XL710 and X710 NICs. |
+---+-------------------------------------------------+------------+-----------------------------------------------------------------+
+| 4 | QAT IPSec scale with 1000 tunnels (interfaces) | VPP-1121 | VPP crashes during configuration of 1000 IPsec tunnels. |
+| | in 2t2c config, all tests are failing. | | 1t1c tests are not affected |
++---+-------------------------------------------------+------------+-----------------------------------------------------------------+
+
diff --git a/docs/report/vpp_performance_tests/overview.rst b/docs/report/vpp_performance_tests/overview.rst
index f243637a6f..86bea87c0b 100644
--- a/docs/report/vpp_performance_tests/overview.rst
+++ b/docs/report/vpp_performance_tests/overview.rst
@@ -10,23 +10,23 @@ CSIT VPP performance tests are executed on physical baremetal servers hosted by
:abbr:`LF (Linux Foundation)` FD.io project. Testbed physical topology is shown
in the figure below.::
-    +------------------------+           +------------------------+
-    |                        |           |                        |
-    |  +------------------+  |           |  +------------------+  |
-    |  |                  |  |           |  |                  |  |
-    |  |                  <----------------->                  |  |
-    |  |       DUT1       |  |           |  |       DUT2       |  |
-    |  +--^---------------+  |           |  +---------------^--+  |
-    |     |                  |           |                  |     |
-    |     |             SUT1 |           | SUT2             |     |
-    +------------------------+           +------------------^-----+
-          |                                                 |
-          |                                                 |
-          |                  +-----------+                  |
-          |                  |           |                  |
-          +------------------>    TG     <------------------+
-                             |           |
-                             +-----------+
+    +------------------------+           +------------------------+
+    |                        |           |                        |
+    |  +------------------+  |           |  +------------------+  |
+    |  |                  |  |           |  |                  |  |
+    |  |                  <----------------->                  |  |
+    |  |       DUT1       |  |           |  |       DUT2       |  |
+    |  +--^---------------+  |           |  +---------------^--+  |
+    |     |                  |           |                  |     |
+    |     |             SUT1 |           | SUT2             |     |
+    +------------------------+           +------------------^-----+
+          |                                                 |
+          |                                                 |
+          |                  +-----------+                  |
+          |                  |           |                  |
+          +------------------>    TG     <------------------+
+                             |           |
+                             +-----------+
SUT1 and SUT2 are two System Under Test servers (Cisco UCS C240, each with two
Intel XEON CPUs), TG is a Traffic Generator (TG, another Cisco UCS C240, with
@@ -53,43 +53,59 @@ Going forward CSIT project will be looking to add more hardware into FD.io
performance labs to address larger scale multi-interface and multi-NIC
performance testing scenarios.
-For test cases that require DUT (VPP) to communicate with
-VirtualMachines (VMs) / Linux or Docker Containers (Ctrs) over
+For service chain topology test cases that require DUT (VPP) to communicate with
+VirtualMachines (VMs) or with Linux/Docker Containers (Ctrs) over
vhost-user/memif interfaces, N of VM/Ctr instances are created on SUT1
-and SUT2. For N=1 DUT forwards packets between vhost/memif and physical
-interfaces. For N>1 DUT a logical service chain forwarding topology is
-created on DUT by applying L2 or IPv4/IPv6 configuration depending on
-the test suite. DUT test topology with N VM/Ctr instances is shown in
-the figure below including applicable packet flow thru the DUTs and
-VMs/Ctrs (marked in the figure with ``***``).::
-
-    +-------------------------+           +-------------------------+
-    | +---------+ +---------+ |           | +---------+ +---------+ |
-    | |VM/Ctr[1]| |VM/Ctr[N]| |           | |VM/Ctr[1]| |VM/Ctr[N]| |
-    | |  *****  | |  *****  | |           | |  *****  | |  *****  | |
-    | +--^---^--+ +--^---^--+ |           | +--^---^--+ +--^---^--+ |
-    |   *|   |*     *|   |*   |           |   *|   |*     *|   |*   |
-    | +--v---v-------v---v--+ |           | +--v---v-------v---v--+ |
-    | |  *   *       *   *  |*|***********|*|  *   *       *   *  | |
-    | |  *   *********   ***<-|-----------|->***   *********   *  | |
-    | |  *    DUT1          | |           | |          DUT2    *  | |
-    | +--^------------------+ |           | +------------------^--+ |
-    |   *|                    |           |                    |*   |
-    |   *|            SUT1    |           |    SUT2            |*   |
-    +-------------------------+           +-------------------------+
-        *|                                                     |*
-        *|                                                     |*
-        *|                    +-----------+                    |*
-        *|                    |           |                    |*
-        *+-------------------->    TG     <--------------------+*
-        **********************|           |**********************
-                              +-----------+
-
-For VM/Ctr tests, packets are switched by DUT multiple times: twice for
-a single VM/Ctr, three times for two VMs/Ctrs, N+1 times for N VMs/Ctrs.
-Hence the external throughput rates measured by TG and listed in this
-report must be multiplied by (N+1) to represent the actual DUT aggregate
-packet forwarding rate.
+and SUT2. Three types of service chain topologies are tested in CSIT |release|:
+
+#. "Parallel" topology with packets flowing from NIC via DUT (VPP) to
+ VM/Container and back to VPP and NIC;
+
+#. "Chained" topology (a.k.a. "Snake") with packets flowing via DUT (VPP) to
+ VM/Container, back to DUT, then to the next VM/Container, back to DUT and
+ so on until the last VM/Container in a chain, then back to DUT and NIC;
+
+#. "Horizontal" topology with packets flowing via DUT (VPP) to Container,
+ then via "horizontal" memif to the next Container, and so on until the
+ last Container, then back to DUT and NIC. "Horizontal" topology is not
+ supported for VMs;
+
+For each of the above topologies, DUT (VPP) is tested in a range of L2
+or IPv4/IPv6 configurations depending on the test suite. A sample DUT
+"Chained" service topology with N of VM/Ctr instances is shown in the
+figure below. Packet flow thru the DUTs and VMs/Ctrs is marked with
+``***``::
+
+    +-------------------------+           +-------------------------+
+    | +---------+ +---------+ |           | +---------+ +---------+ |
+    | |VM/Ctr[1]| |VM/Ctr[N]| |           | |VM/Ctr[1]| |VM/Ctr[N]| |
+    | |  *****  | |  *****  | |           | |  *****  | |  *****  | |
+    | +--^---^--+ +--^---^--+ |           | +--^---^--+ +--^---^--+ |
+    |   *|   |*     *|   |*   |           |   *|   |*     *|   |*   |
+    | +--v---v-------v---v--+ |           | +--v---v-------v---v--+ |
+    | |  *   *       *   *  |*|***********|*|  *   *       *   *  | |
+    | |  *   *********   ***<-|-----------|->***   *********   *  | |
+    | |  *    DUT1          | |           | |          DUT2    *  | |
+    | +--^------------------+ |           | +------------------^--+ |
+    |   *|                    |           |                    |*   |
+    |   *|            SUT1    |           |    SUT2            |*   |
+    +-------------------------+           +-------------------------+
+        *|                                                     |*
+        *|                                                     |*
+        *|                    +-----------+                    |*
+        *|                    |           |                    |*
+        *+-------------------->    TG     <--------------------+*
+        **********************|           |**********************
+                              +-----------+
+
+In the above "Chained" topology, packets are switched by DUT multiple times:
+twice for a single VM/Ctr, three times for two VMs/Ctrs, N+1 times for N
+VMs/Ctrs. Hence the external throughput rates measured by TG and listed
+in this report must be multiplied by (N+1) to represent the actual DUT
+aggregate packet forwarding rate.
+
+For "Parallel" and "Horizontal" service topologies, packets are always
+switched by DUT twice per service chain.
Note that reported DUT (VPP) performance results are specific to the SUTs
tested. Current :abbr:`LF (Linux Foundation)` FD.io SUTs are based on Intel
@@ -162,8 +178,8 @@ CSIT |release| includes following performance test suites, listed per NIC type:
number of users and ports per user.
- **Container memif connections** - VPP memif virtual interface tests to
interconnect VPP instances with L2XC and L2BD.
- - **Container K8s Orchestrated Topologies** - Container topologies connected over
- the memif virtual interface.
+ - **Container K8s Orchestrated Topologies** - Container topologies connected
+ over the memif virtual interface.
- **SRv6** - Segment Routing IPv6 tests.
- 2port40GE XL710 Intel
@@ -236,11 +252,17 @@ following VPP thread and core configurations:
#. 1t1c - 1 VPP worker thread on 1 CPU physical core.
#. 2t2c - 2 VPP worker threads on 2 CPU physical cores.
+#. 4t4c - 4 VPP worker threads on 4 CPU physical cores.
-VPP worker threads are the data plane threads. VPP control thread is running on
-a separate non-isolated core together with other Linux processes. Note that in
-quite a few test cases running VPP workers on 2 physical cores hits the tested
-NIC I/O bandwidth or packets-per-second limit.
+VPP worker threads are the data plane threads. VPP control thread is
+running on a separate non-isolated core together with other Linux
+processes. Note that in quite a few test cases running VPP workers on 2
+or 4 physical cores hits the I/O bandwidth or packets-per-second limit
+of tested NIC.
+
+Section :ref:`throughput_speedup_multi_core` includes a set of graphs
+illustrating packet throughput speedup when running VPP on multiple
+cores.
Methodology: Packet Throughput
------------------------------
@@ -250,23 +272,33 @@ Following values are measured and reported for packet throughput tests:
- NDR binary search per :rfc:`2544`:
- Packet rate: "RATE: <aggregate packet rate in packets-per-second> pps
- (2x <per direction packets-per-second>)"
+ (2x <per direction packets-per-second>)";
- Aggregate bandwidth: "BANDWIDTH: <aggregate bandwidth in Gigabits per
- second> Gbps (untagged)"
+ second> Gbps (untagged)";
- PDR binary search per :rfc:`2544`:
- Packet rate: "RATE: <aggregate packet rate in packets-per-second> pps (2x
- <per direction packets-per-second>)"
+ <per direction packets-per-second>)";
- Aggregate bandwidth: "BANDWIDTH: <aggregate bandwidth in Gigabits per
- second> Gbps (untagged)"
+ second> Gbps (untagged)";
- Packet loss tolerance: "LOSS_ACCEPTANCE <accepted percentage of packets
- lost at PDR rate>""
+ lost at PDR rate>";
- NDR and PDR are measured for the following L2 frame sizes:
- - IPv4: 64B, IMIX_v4_1 (28x64B,16x570B,4x1518B), 1518B, 9000B.
- - IPv6: 78B, 1518B, 9000B.
+ - IPv4: 64B, IMIX_v4_1 (28x64B,16x570B,4x1518B), 1518B, 9000B;
+ - IPv6: 78B, 1518B, 9000B;
+
+- NDR and PDR binary search resolution is determined by the final value of the
+ rate change, referred to as the final step:
+
+ - The final step is set to 50kpps for all NIC to NIC tests and all L2
+ frame sizes except 9000B (changed from 100kpps used in previous
+ releases).
+
+ - The final step is set to 10kpps for all remaining tests, including 9000B
+ and all vhost VM and memif Container tests.
All rates are reported from external Traffic Generator perspective.
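
The two pieces of arithmetic described in overview.rst above can be sketched in Python. This is only an illustration under stated assumptions, not the CSIT implementation: the function names, the ``trial_passes`` callback and the rate values are hypothetical, the search is a plain bisection that stops once the interval is no wider than the final step, and applying a multiplier of 2 to "Parallel" and "Horizontal" results is inferred from the "switched by DUT twice" statement rather than stated in the text.

```python
from typing import Callable


def dut_aggregate_rate(tg_rate_pps: float, n_instances: int,
                       topology: str = "chained") -> float:
    """Scale a TG-measured rate to the DUT aggregate forwarding rate.

    "chained": DUT switches each packet N+1 times for N VMs/Ctrs.
    "parallel"/"horizontal": DUT switches each packet twice per chain
    (using 2 as the multiplier here is an assumption of this sketch).
    """
    if n_instances < 1:
        raise ValueError("n_instances must be >= 1")
    if topology == "chained":
        switch_count = n_instances + 1
    elif topology in ("parallel", "horizontal"):
        switch_count = 2
    else:
        raise ValueError(f"unknown topology: {topology}")
    return tg_rate_pps * switch_count


def binary_search_rate(trial_passes: Callable[[float], bool],
                       min_rate_pps: float, max_rate_pps: float,
                       final_step_pps: float) -> float:
    """Return the highest passing rate found, to within final_step_pps.

    trial_passes(rate) is a hypothetical callback that runs one trial and
    returns True when loss is acceptable (zero loss for NDR, loss within
    LOSS_ACCEPTANCE for PDR).
    """
    if final_step_pps <= 0:
        raise ValueError("final_step_pps must be positive")
    if trial_passes(max_rate_pps):
        return max_rate_pps
    if not trial_passes(min_rate_pps):
        raise ValueError("minimum rate does not pass the loss criterion")
    lo, hi = min_rate_pps, max_rate_pps  # lo passes, hi fails
    while hi - lo > final_step_pps:
        mid = (lo + hi) / 2.0
        if trial_passes(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With a 50kpps final step the returned rate is at most 50kpps below the true threshold of the simulated device. A 10kpps final step narrows that to 10kpps at the cost of more trials.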