From 107438e93a51eefc61dc171cfa9b959007ccc739 Mon Sep 17 00:00:00 2001 From: Tibor Frank Date: Thu, 30 Jan 2020 09:10:38 +0100 Subject: Report: Placeholder for LD preload tests - methodology - test results Change-Id: I0d102875045ab295d9b44fa7bc328f2a728803d7 Signed-off-by: Tibor Frank Signed-off-by: Dave Wallace --- docs/report/index.html.template | 2 +- docs/report/introduction/methodology.rst | 6 +- .../introduction/methodology_http_tcp_with_wrk.rst | 39 ++++++++ .../methodology_http_tcp_with_wrk_tool.rst | 40 -------- .../introduction/methodology_quic_with_vppecho.rst | 43 ++++++++ .../introduction/methodology_tcp_with_iperf3.rst | 41 ++++++++ .../http_server_performance/index.rst | 111 +++++++++++++++++++++ .../hoststack_testing/index.rst | 8 ++ .../hoststack_testing/iperf3/index.rst | 2 + .../hoststack_testing/quic/index.rst | 2 + .../http_server_performance/index.rst | 111 --------------------- docs/report/vpp_performance_tests/index.rst | 1 + 12 files changed, 251 insertions(+), 155 deletions(-) create mode 100644 docs/report/introduction/methodology_http_tcp_with_wrk.rst delete mode 100644 docs/report/introduction/methodology_http_tcp_with_wrk_tool.rst create mode 100644 docs/report/introduction/methodology_quic_with_vppecho.rst create mode 100644 docs/report/introduction/methodology_tcp_with_iperf3.rst create mode 100644 docs/report/vpp_performance_tests/hoststack_testing/http_server_performance/index.rst create mode 100644 docs/report/vpp_performance_tests/hoststack_testing/index.rst create mode 100644 docs/report/vpp_performance_tests/hoststack_testing/iperf3/index.rst create mode 100644 docs/report/vpp_performance_tests/hoststack_testing/quic/index.rst delete mode 100644 docs/report/vpp_performance_tests/http_server_performance/index.rst diff --git a/docs/report/index.html.template b/docs/report/index.html.template index d8f3b6f281..cbbde1ed25 100644 --- a/docs/report/index.html.template +++ b/docs/report/index.html.template @@ -25,7 +25,7 @@ CSIT-2001 
vpp_performance_tests/soak_tests/index vpp_performance_tests/reconf_tests/index vpp_performance_tests/nf_service_density/index - vpp_performance_tests/http_server_performance/index + vpp_performance_tests/hoststack_testing/index vpp_performance_tests/comparisons/index vpp_performance_tests/throughput_trending vpp_performance_tests/test_environment diff --git a/docs/report/introduction/methodology.rst b/docs/report/introduction/methodology.rst index ff3ecc31a4..107a6954c6 100644 --- a/docs/report/introduction/methodology.rst +++ b/docs/report/introduction/methodology.rst @@ -13,6 +13,9 @@ Test Methodology methodology_data_plane_throughput/index methodology_packet_latency methodology_multi_core_speedup + methodology_http_tcp_with_wrk + methodology_tcp_with_iperf3 + methodology_quic_with_vppecho methodology_reconf methodology_vpp_startup_settings methodology_kvm_vms_vhost_user @@ -21,6 +24,3 @@ Test Methodology methodology_vpp_device_functional methodology_ipsec_on_intel_qat methodology_trex_traffic_generator - -.. - methodology_http_tcp_with_wrk_tool diff --git a/docs/report/introduction/methodology_http_tcp_with_wrk.rst b/docs/report/introduction/methodology_http_tcp_with_wrk.rst new file mode 100644 index 0000000000..cd831b4481 --- /dev/null +++ b/docs/report/introduction/methodology_http_tcp_with_wrk.rst @@ -0,0 +1,39 @@ +HTTP/TCP with WRK +----------------- + +`WRK HTTP benchmarking tool `_ is used for +TCP/IP and HTTP tests of VPP Host Stack and built-in static HTTP server. +WRK has been chosen as it is capable of generating significant TCP/IP +and HTTP loads by scaling number of threads across multi-core processors. + +This in turn enables high scale benchmarking of the VPP Host Stack TCP/IP +and HTTP service including HTTP TCP/IP Connections-Per-Second (CPS) and +HTTP Requests-Per-Second. + +The initial tests are designed as follows: + +- HTTP and TCP/IP Connections-Per-Second (CPS) + + - WRK configured to use 8 threads across 8 cores, 1 thread per core. 
+ - Maximum of 50 concurrent connections across all WRK threads. + - Timeout for server responses set to 5 seconds. + - Test duration is 30 seconds. + - Expected HTTP test sequence: + + - Single HTTP GET Request sent per open connection. + - Connection close after valid HTTP reply. + - Resulting flow sequence - 8 packets: >Syn, Ack, >Req, + Fin, Ack. + +- HTTP Requests-Per-Second + + - WRK configured to use 8 threads across 8 cores, 1 thread per core. + - Maximum of 50 concurrent connections across all WRK threads. + - Timeout for server responses set to 5 seconds. + - Test duration is 30 seconds. + - Expected HTTP test sequence: + + - Multiple HTTP GET Requests sent in sequence per open connection. + - Connection close after set test duration time. + - Resulting flow sequence: >Syn, Ack, >Req[1], Req[n], Fin, Ack. diff --git a/docs/report/introduction/methodology_http_tcp_with_wrk_tool.rst b/docs/report/introduction/methodology_http_tcp_with_wrk_tool.rst deleted file mode 100644 index 28f3fc6bbb..0000000000 --- a/docs/report/introduction/methodology_http_tcp_with_wrk_tool.rst +++ /dev/null @@ -1,40 +0,0 @@ -HTTP/TCP with WRK Tool ----------------------- - -`WRK HTTP benchmarking tool `_ is used for -experimental TCP/IP and HTTP tests of VPP TCP/IP stack and built-in -static HTTP server. WRK has been chosen as it is capable of generating -significant TCP/IP and HTTP loads by scaling number of threads across -multi-core processors. - -This in turn enables quite high scale benchmarking of the main TCP/IP -and HTTP service including HTTP TCP/IP Connections-Per-Second (CPS), -HTTP Requests-Per-Second and HTTP Bandwidth Throughput. - -The initial tests are designed as follows: - -- HTTP and TCP/IP Connections-Per-Second (CPS) - - - WRK configured to use 8 threads across 8 cores, 1 thread per core. - - Maximum of 50 concurrent connections across all WRK threads. - - Timeout for server responses set to 5 seconds. - - Test duration is 30 seconds. 
- - Expected HTTP test sequence: - - - Single HTTP GET Request sent per open connection. - - Connection close after valid HTTP reply. - - Resulting flow sequence - 8 packets: >Syn, Ack, >Req, - Fin, Ack. - -- HTTP Requests-Per-Second - - - WRK configured to use 8 threads across 8 cores, 1 thread per core. - - Maximum of 50 concurrent connections across all WRK threads. - - Timeout for server responses set to 5 seconds. - - Test duration is 30 seconds. - - Expected HTTP test sequence: - - - Multiple HTTP GET Requests sent in sequence per open connection. - - Connection close after set test duration time. - - Resulting flow sequence: >Syn, Ack, >Req[1], Req[n], Fin, Ack. diff --git a/docs/report/introduction/methodology_quic_with_vppecho.rst b/docs/report/introduction/methodology_quic_with_vppecho.rst new file mode 100644 index 0000000000..12b64203db --- /dev/null +++ b/docs/report/introduction/methodology_quic_with_vppecho.rst @@ -0,0 +1,43 @@ +Hoststack Throughput Testing over QUIC/UDP/IP with vpp_echo +----------------------------------------------------------- + +`vpp_echo performance testing tool `_ +is a bespoke performance test application which utilizes the 'native +HostStack APIs' to verify performance and correct handling of +connection/stream events with uni-directional and bi-directional +streams of data. + +Because iperf3 does not support the QUIC transport protocol, vpp_echo +is used for measuring the maximum attainable bandwidth of the VPP Host +Stack connection utilizing the QUIC transport protocol across two +instances of VPP running on separate DUT nodes. The QUIC transport +protocol supports multiple streams per connection and test cases +utilize different combinations of QUIC connections and number of +streams per connection. + +The test configuration is as follows: + + DUT1 Network DUT2 +[ vpp_echo-client -> VPP1 ]=======[ VPP2 -> vpp_echo-server] + N-streams/connection + +where, + + 1. 
vpp_echo server attaches to VPP2 and LISTENs on VPP2:UDP port 1234. + 2. vpp_echo client creates one or more connections to VPP1 and opens + one or more streams per connection to VPP2:UDP port 1234. + 3. vpp_echo client transmits a uni-directional stream as fast as the + VPP Host Stack allows to the vpp_echo server for the test duration. + 4. At the end of the test the vpp_echo client emits the goodput + measurements for all streams and the sum of all streams. + + Test cases include + 1. 1 QUIC connection with 1 Stream + 2. 1 QUIC connection with 10 Streams + 3. 10 QUIC connections with 1 Stream + 4. 10 QUIC connections with 10 Streams + + with stream sizes chosen to provide reasonable test durations. The VPP Host + Stack QUIC transport is configured to utilize the picotls encryption + library. In the future, tests utilizing additional encryption + algorithms will be added. diff --git a/docs/report/introduction/methodology_tcp_with_iperf3.rst b/docs/report/introduction/methodology_tcp_with_iperf3.rst new file mode 100644 index 0000000000..ef28dec4a3 --- /dev/null +++ b/docs/report/introduction/methodology_tcp_with_iperf3.rst @@ -0,0 +1,41 @@ +Hoststack Throughput Testing over TCP/IP with iperf3 +---------------------------------------------------- + +`iperf3 bandwidth measurement tool `_ +is used for measuring the maximum attainable bandwidth of the VPP Host +Stack connection across two instances of VPP running on separate DUT +nodes. iperf3 is a popular open source tool for active measurements +of the maximum achievable bandwidth on IP networks. + +Because iperf3 utilizes the POSIX socket interface APIs, the current +test configuration utilizes the LD_PRELOAD mechanism of the Linux +dynamic linker to connect iperf3 to the VPP Host Stack using the VPP +Communications Library (VCL) LD_PRELOAD library (libvcl_ldpreload.so). 
+ +In the future, a forked version of iperf3 which has been modified to +directly use the VCL application APIs may be added to determine the +difference in performance of 'VCL Native' applications versus utilizing +LD_PRELOAD, which inherently has more overhead and other limitations. + +The test configuration is as follows: + + DUT1 Network DUT2 +[ iperf3-client -> VPP1 ]=======[ VPP2 -> iperf3-server] + +where, + + 1. iperf3 server attaches to VPP2 and LISTENs on VPP2:TCP port 5201. + 2. iperf3 client attaches to VPP1 and opens one or more stream + connections to VPP2:TCP port 5201. + 3. iperf3 client transmits a uni-directional stream as fast as the + VPP Host Stack allows to the iperf3 server for the test duration. + 4. At the end of the test the iperf3 client emits the goodput + measurements for all streams and the sum of all streams. + + Test cases include 1 and 10 Streams with a 20 second test duration + with the VPP Host Stack configured to utilize the Cubic TCP + congestion control algorithm. + + Note: iperf3 is single threaded, so it is expected that the 10 stream + test does not show any performance improvement due to + multi-thread/multi-core execution. diff --git a/docs/report/vpp_performance_tests/hoststack_testing/http_server_performance/index.rst b/docs/report/vpp_performance_tests/hoststack_testing/http_server_performance/index.rst new file mode 100644 index 0000000000..412ff6af63 --- /dev/null +++ b/docs/report/vpp_performance_tests/hoststack_testing/http_server_performance/index.rst @@ -0,0 +1,111 @@ + +.. raw:: latex + + \clearpage + +.. raw:: html + + + +HTTP and TCP/IP +=============== + +Performance graphs are generated by multiple executions of the same +performance tests across physical testbeds hosted in LF FD.io labs: 3n-hsw. +Box-and-Whisker plots are used to display variations in measured +throughput values, without making any assumptions about the underlying +statistical distribution. 
+ +For each test case, Box-and-Whisker plots show the quartiles (Min, 1st +quartile / 25th percentile, 2nd quartile / 50th percentile / median, 3rd +quartile / 75th percentile, Max) across the collected data set. Outliers are +plotted as individual points. + +Additional information about graph data: + +#. **X-axis Labels**: indices of individual test suites as listed in + Graph Legend. + +#. **Y-axis Labels**: measured Connections Per Second [cps] or Requests Per + Second [rps] throughput values. + +#. **Graph Legend**: lists X-axis indices with associated CSIT test + suites executed to generate graphed test results. + +#. **Hover Information**: lists minimum, first quartile, median, + third quartile, and maximum. If either type of outlier is present the + whisker on the appropriate side is taken to 1.5×IQR from the quartile + (the "inner fence") rather than the max or min, and individual outlying + data points are displayed as unfilled circles (for suspected outliers) + or filled circles (for outliers). (The "outer fence" is 3×IQR from the + quartile.) + +.. note:: + + Data sources for reported test results: i) `FD.io test executor vpp + performance job 2n-skx`_, ii) archived FD.io jobs test result `output files + <../../_static/archive/>`_. + + CSIT source code for the test cases used for plots can be found in + `CSIT git repository `_. + +.. raw:: latex + + \clearpage + +Connections per second +---------------------- + +.. raw:: html + + + +.. raw:: latex + + \begin{figure}[H] + \centering + \graphicspath{{../_build/_static/vpp/}} + \includegraphics[clip, trim=0cm 0cm 5cm 0cm, width=0.70\textwidth]{http-server-performance-cps} + \label{fig:http-server-performance-cps} + \end{figure} + +.. raw:: latex + + \clearpage + +Requests per second +------------------- + +.. raw:: html + + + +.. 
raw:: latex + + \begin{figure}[H] + \centering + \graphicspath{{../_build/_static/vpp/}} + \includegraphics[clip, trim=0cm 0cm 5cm 0cm, width=0.70\textwidth]{http-server-performance-rps} + \label{fig:http-server-performance-rps} + \end{figure} diff --git a/docs/report/vpp_performance_tests/hoststack_testing/index.rst b/docs/report/vpp_performance_tests/hoststack_testing/index.rst new file mode 100644 index 0000000000..e6da504128 --- /dev/null +++ b/docs/report/vpp_performance_tests/hoststack_testing/index.rst @@ -0,0 +1,8 @@ +Hoststack Testing +================= + +.. toctree:: + + http_server_performance/index + iperf3/index + quic/index diff --git a/docs/report/vpp_performance_tests/hoststack_testing/iperf3/index.rst b/docs/report/vpp_performance_tests/hoststack_testing/iperf3/index.rst new file mode 100644 index 0000000000..85d120c31c --- /dev/null +++ b/docs/report/vpp_performance_tests/hoststack_testing/iperf3/index.rst @@ -0,0 +1,2 @@ +Hoststack Throughput Testing over TCP/IP with iperf3 +---------------------------------------------------- diff --git a/docs/report/vpp_performance_tests/hoststack_testing/quic/index.rst b/docs/report/vpp_performance_tests/hoststack_testing/quic/index.rst new file mode 100644 index 0000000000..c1ec15bd6f --- /dev/null +++ b/docs/report/vpp_performance_tests/hoststack_testing/quic/index.rst @@ -0,0 +1,2 @@ +Hoststack Throughput Testing over QUIC(picotls)/UDP/IP with vpp_echo +-------------------------------------------------------------------- diff --git a/docs/report/vpp_performance_tests/http_server_performance/index.rst b/docs/report/vpp_performance_tests/http_server_performance/index.rst deleted file mode 100644 index 412ff6af63..0000000000 --- a/docs/report/vpp_performance_tests/http_server_performance/index.rst +++ /dev/null @@ -1,111 +0,0 @@ - -.. raw:: latex - - \clearpage - -.. 
raw:: html - - - -HTTP and TCP/IP -=============== - -Performance graphs are generated by multiple executions of the same -performance tests across physical testbeds hosted LF FD.io labs: 3n-hsw. -Box-and-Whisker plots are used to display variations in measured -throughput values, without making any assumptions of the underlying -statistical distribution. - -For each test case, Box-and-Whisker plots show the quartiles (Min, 1st -quartile / 25th percentile, 2nd quartile / 50th percentile / mean, 3rd -quartile / 75th percentile, Max) across collected data set. Outliers are -plotted as individual points. - -Additional information about graph data: - -#. **X-axis Labels**: indices of individual test suites as listed in - Graph Legend. - -#. **Y-axis Labels**: measured Connections Per Second [cps] or Requests Per - Second [rps] throughput values. - -#. **Graph Legend**: lists X-axis indices with associated CSIT test - suites executed to generate graphed test results. - -#. **Hover Information**: lists minimum, first quartile, median, - third quartile, and maximum. If either type of outlier is present the - whisker on the appropriate side is taken to 1.5×IQR from the quartile - (the "inner fence") rather than the max or min, and individual outlying - data points are displayed as unfilled circles (for suspected outliers) - or filled circles (for outliers). (The "outer fence" is 3×IQR from the - quartile.) - -.. note:: - - Data sources for reported test results: i) `FD.io test executor vpp - performance job 2n-skx`_, ii) archived FD.io jobs test result `output files - <../../_static/archive/>`_. - - CSIT source code for the test cases used for plots can be found in - `CSIT git repository `_. - -.. raw:: latex - - \clearpage - -Connections per second ----------------------- - -.. raw:: html - - - -.. 
raw:: latex - - \begin{figure}[H] - \centering - \graphicspath{{../_build/_static/vpp/}} - \includegraphics[clip, trim=0cm 0cm 5cm 0cm, width=0.70\textwidth]{http-server-performance-cps} - \label{fig:http-server-performance-cps} - \end{figure} - -.. raw:: latex - - \clearpage - -Requests per second -------------------- - -.. raw:: html - - - -.. raw:: latex - - \begin{figure}[H] - \centering - \graphicspath{{../_build/_static/vpp/}} - \includegraphics[clip, trim=0cm 0cm 5cm 0cm, width=0.70\textwidth]{http-server-performance-rps} - \label{fig:http-server-performance-rps} - \end{figure} diff --git a/docs/report/vpp_performance_tests/index.rst b/docs/report/vpp_performance_tests/index.rst index beb4dce988..042ee4fc24 100644 --- a/docs/report/vpp_performance_tests/index.rst +++ b/docs/report/vpp_performance_tests/index.rst @@ -13,6 +13,7 @@ VPP Performance soak_tests/index reconf_tests/index nf_service_density/index + hoststack_testing/index http_server_performance/index comparisons/index throughput_trending -- cgit 1.2.3-korg