author    Hanoh Haim <hhaim@cisco.com>  2016-12-26 14:42:10 +0200
committer Hanoh Haim <hhaim@cisco.com>  2016-12-26 14:42:40 +0200
commit    0bb904dc45e53cbdd4e8fe1d000fb61920a80f17 (patch)
tree      0445db5787fb8c81288b5d8cc9c02115f1f95689 /doc
parent    ea175dea96f23fe6033a9babe7130eb07eafb62d (diff)
add tw appendix for many active flows
Signed-off-by: Hanoh Haim <hhaim@cisco.com>
Diffstat (limited to 'doc')
-rw-r--r--  doc/images/tw0_0_chart.png   bin 0 -> 24148 bytes
-rw-r--r--  doc/images/tw1.png           bin 0 -> 32875 bytes
-rw-r--r--  doc/images/tw1_0.png         bin 0 -> 7726 bytes
-rw-r--r--  doc/images/tw1_tbl.png       bin 0 -> 12039 bytes
-rw-r--r--  doc/images/tw2.png           bin 0 -> 22878 bytes
-rw-r--r--  doc/images/tw3.png           bin 0 -> 33105 bytes
-rwxr-xr-x  doc/trex_book.asciidoc       270
-rw-r--r--  doc/visio_drawings/tw.xlsx   bin 0 -> 25465 bytes
8 files changed, 270 insertions, 0 deletions
diff --git a/doc/images/tw0_0_chart.png b/doc/images/tw0_0_chart.png
new file mode 100644
index 00000000..194c0bfe
--- /dev/null
+++ b/doc/images/tw0_0_chart.png
Binary files differ
diff --git a/doc/images/tw1.png b/doc/images/tw1.png
new file mode 100644
index 00000000..059410d7
--- /dev/null
+++ b/doc/images/tw1.png
Binary files differ
diff --git a/doc/images/tw1_0.png b/doc/images/tw1_0.png
new file mode 100644
index 00000000..4a00c611
--- /dev/null
+++ b/doc/images/tw1_0.png
Binary files differ
diff --git a/doc/images/tw1_tbl.png b/doc/images/tw1_tbl.png
new file mode 100644
index 00000000..e28ce361
--- /dev/null
+++ b/doc/images/tw1_tbl.png
Binary files differ
diff --git a/doc/images/tw2.png b/doc/images/tw2.png
new file mode 100644
index 00000000..08cd9683
--- /dev/null
+++ b/doc/images/tw2.png
Binary files differ
diff --git a/doc/images/tw3.png b/doc/images/tw3.png
new file mode 100644
index 00000000..f446c96f
--- /dev/null
+++ b/doc/images/tw3.png
Binary files differ
diff --git a/doc/trex_book.asciidoc b/doc/trex_book.asciidoc
index b82c5765..cd53bb22 100755
--- a/doc/trex_book.asciidoc
+++ b/doc/trex_book.asciidoc
@@ -1671,6 +1671,12 @@ based routes to pass all traffic from one DUT port to the other. +
*-w <num seconds>*::
Wait additional time between NICs initialization and sending traffic. Can be useful if DUT needs extra setup time. Default is 1 second.
+*--active-flows*::
+ An experimental switch to scale the number of active flows up or down.
+ It is not accurate, due to the quantization of the flow scheduler, and in some cases it does not work.
+ Example: `--active-flows 500000` will set the number of active flows to a ballpark of ~0.5M.
+
+
ifndef::backend-docbook[]
@@ -2604,4 +2610,268 @@ anchor:ciscovic_support[]
* link:https://trex-tgn.cisco.com/youtrack/issue/trex-272[QSFP+ issue]
+=== More active flows
+
+From version v2.13 there is a new Stateful scheduler that works better in the case of a high number of concurrent/active flows.
+In the case of EMIX traffic, 70% better performance was observed.
+In this tutorial there are 14 DP cores and up to 8M flows.
+A special config file is used to enlarge the number of flows. This tutorial presents the difference in performance between the old scheduler and the new one.
+
+==== Setup details
+
+[cols="1,5"]
+|=================
+| Server: | UCSC-C240-M4SX
+| CPU: | 2 x Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
+| RAM: | 65536 MB @ 2133 MHz
+| NICs: | 2 x Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
+| QSFP: | Cisco QSFP-H40G-AOC1M
+| OS: | Fedora 18
+| Switch: | Cisco Nexus 3172 Chassis, System version: 6.0(2)U5(2).
+| TRex: | v2.13/v2.12 using 7 cores per dual interface.
+|=================
+
+==== Traffic profile
+
+.cap2/cur_flow_single.yaml
+[source,python]
+----
+- duration : 0.1
+  generator :
+    distribution : "seq"
+    clients_start : "16.0.0.1"
+    clients_end   : "16.0.0.255"
+    servers_start : "48.0.0.1"
+    servers_end   : "48.0.255.255"
+    clients_per_gb : 201
+    min_clients    : 101
+    dual_port_mask : "1.0.0.0"
+  cap_info :
+    - name: cap2/udp_10_pkts.pcap <1>
+      cps : 100
+      ipg : 200
+      rtt : 200
+      w   : 1
+----
+<1> A unidirectional UDP flow of 10 packets, 64B each
+
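+As a rough sanity check on this profile (illustrative arithmetic, not TRex code; the helper name is made up), the steady-state number of active flows follows Little's law: new flows per second times flow lifetime.
+
+[source,python]
+----
+# Each flow sends 10 packets spaced `ipg` microseconds apart, so it stays
+# active for roughly (pkts - 1) * ipg microseconds.
+def expected_active_flows(cps, pkts, ipg_usec, multiplier):
+    flow_lifetime_sec = (pkts - 1) * ipg_usec * 1e-6
+    # Little's law: active flows ~= arrival rate * lifetime
+    return cps * multiplier * flow_lifetime_sec
+
+# This profile at -m 30000: 3M new flows/sec, each alive ~1.8 msec
+print(expected_active_flows(cps=100, pkts=10, ipg_usec=200, multiplier=30000))
+# => ~5400 active flows at the native IPG; `--active-flows` scales this
+# (presumably by stretching the flow lifetime) toward the requested target.
+----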
+
+==== Config file
+
+./cfg/trex_08_5mflows.yaml
+[source,python]
+----
+- port_limit: 4
+  version: 2
+  interfaces: ['05:00.0', '05:00.1', '84:00.0', '84:00.1']
+  port_info:
+    - ip: 1.1.1.1
+      default_gw: 2.2.2.2
+    - ip: 3.3.3.3
+      default_gw: 4.4.4.4
+
+    - ip: 4.4.4.4
+      default_gw: 3.3.3.3
+    - ip: 2.2.2.2
+      default_gw: 1.1.1.1
+
+  platform:
+    master_thread_id: 0
+    latency_thread_id: 15
+    dual_if:
+      - socket: 0
+        threads: [1,2,3,4,5,6,7]
+
+      - socket: 1
+        threads: [8,9,10,11,12,13,14]
+  memory :
+    dp_flows : 1048576 <1>
+----
+<1> Add a memory section to allow more flows
+
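+A rough capacity check for this value (illustrative arithmetic; it assumes, as a labeled guess, that `dp_flows` is allocated per DP core):
+
+[source,python]
+----
+dp_flows = 1048576            # flow objects in the memory section (assumed per DP core)
+dp_cores = 14                 # 7 threads per dual interface, 2 dual interfaces
+print(dp_flows * dp_cores)    # ~14.7M flow objects, above the 8M active-flow target
+----
+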
+==== Traffic command
+
+.command
+[source,bash]
+----
+$sudo ./t-rex-64 -f cap2/cur_flow_single.yaml -m 30000 -c 7 -d 40 -l 1000 --active-flows 5000000 -p --cfg cfg/trex_08_5mflows.yaml
+----
+
+The number of active flows can be changed using the `--active-flows` CLI option. In this example it is set to 5M flows.
+
+
+==== Script to get performance per number of active flows
+
+[source,python]
+----
+import argparse
+import csv
+import math
+
+# CTRexClient is the TRex Stateful control-plane client; the exact import
+# path depends on the TRex version/package layout.
+from trex_client import CTRexClient
+
+
+def minimal_stateful_test(server, csv_file, a_active_flows):
+
+    trex_client = CTRexClient(server) <1>
+
+    trex_client.start_trex( <2>
+        c = 7,
+        m = 30000,
+        f = 'cap2/cur_flow_single.yaml',
+        d = 30,
+        l = 1000,
+        p = True,
+        cfg = "cfg/trex_08_5mflows.yaml",
+        active_flows = a_active_flows,
+        nc = True
+    )
+
+    result = trex_client.sample_to_run_finish() <3>
+
+    active_flows = result.get_value_list('trex-global.data.m_active_flows')
+    cpu_utl      = result.get_value_list('trex-global.data.m_cpu_util')
+    pps          = result.get_value_list('trex-global.data.m_tx_pps')
+    queue_full   = result.get_value_list('trex-global.data.m_total_queue_full')
+    if queue_full[-1] > 10000:
+        print("WARNING: QUEUE WAS FULL")
+    # sample near the end of the run, after the rates have stabilized
+    row = (active_flows[-5], cpu_utl[-5], pps[-5], queue_full[-1]) <4>
+    file_writer = csv.writer(csv_file)
+    file_writer.writerow(row)
+
+
+if __name__ == '__main__':
+    parser = argparse.ArgumentParser(description="Example for TRex Stateful, assuming the server daemon is running.")
+    parser.add_argument('-s', '--server',
+                        dest='server',
+                        help='Remote TRex address',
+                        default='127.0.0.1',
+                        type=str)
+    args = parser.parse_args()
+
+    max_flows = 8000000
+    min_flows = 100
+    active_flow = min_flows
+    num_point = 10
+    # geometric step: 11 points covering 100..8M on a log scale
+    factor = math.exp(math.log(float(max_flows) / min_flows) / num_point)
+    with open('tw_2_layers.csv', 'w') as test_file:
+        for i in range(num_point + 1):
+            print("=====================", i, int(active_flow))
+            minimal_stateful_test(args.server, test_file, int(active_flow))
+            active_flow = active_flow * factor
+----
+<1> Connect to the TRex server
+<2> Start TRex with the given number of active flows
+<3> Wait for the test to finish and collect the results
+<4> Save the sampled results to the CSV file
+
+This script iterates from 100 to 8M active flows and saves the results to a CSV file.
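+
+The sweep points it generates are spaced geometrically (a quick illustration of the factor computed above; values are rounded):
+
+[source,python]
+----
+import math
+
+min_flows, max_flows, num_point = 100, 8000000, 10
+factor = math.exp(math.log(float(max_flows) / min_flows) / num_point)  # ~3.09
+points = [int(min_flows * factor ** i) for i in range(num_point + 1)]
+print(points)  # [100, 309, 956, ..., ~8000000] - 11 points on a log scale
+----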
+
+==== The results v2.12 vs v2.13
+
+.MPPS/core
+image:images/tw1_0.png[title="results",align="center"]
+
+.MPPS/core
+image:images/tw0_0_chart.png[title="results",align="center",width=800]
+
+* TW0 - v2.13 default configuration
+* PQ - v2.12 default configuration
+
+* To run the same script on v2.12 (which does not support the `active_flows` directive), a patch was introduced.
+
+*Observation*::
+ * TW works better (up to 250%) in the case of 200-500K flows
+ * TW scales better with the number of active flows
+
+==== Tuning
+
+Let's add another mode called *TW1*. In this mode, the scheduler is tuned to have more buckets (more memory).
+
+.TW1 cap2/cur_flow_single_tw_8.yaml
+[source,python]
+----
+- duration : 0.1
+  generator :
+    distribution : "seq"
+    clients_start : "16.0.0.1"
+    clients_end   : "16.0.0.255"
+    servers_start : "48.0.0.1"
+    servers_end   : "48.0.255.255"
+    clients_per_gb : 201
+    min_clients    : 101
+    dual_port_mask : "1.0.0.0"
+  tw :
+    buckets : 16384 <1>
+    levels  : 2 <2>
+    bucket_time_usec : 20.0
+  cap_info :
+    - name: cap2/udp_10_pkts.pcap
+      cps : 100
+      ipg : 200
+      rtt : 200
+      w   : 1
+----
+<1> More buckets
+<2> Fewer levels
+
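+To build intuition for these knobs, here is a minimal sketch of a two-level timer wheel (illustrative only, not the TRex implementation; the class and method names are made up):
+
+[source,python]
+----
+class TwoLevelTimerWheel(object):
+    """Level 0 has `buckets` slots of `tick_usec` each; level-1 slots are
+    `buckets` times coarser, so far-away events are parked there."""
+    def __init__(self, buckets=16384, tick_usec=20.0):
+        self.buckets = buckets
+        self.tick_usec = tick_usec
+        self.wheel = [[[] for _ in range(buckets)] for _ in range(2)]
+        self.now = 0  # elapsed level-0 ticks
+
+    def schedule(self, event, delay_usec):
+        offset = int(delay_usec / self.tick_usec)  # delay in level-0 ticks
+        if offset < self.buckets:
+            # near event: scheduled at the full 20 usec resolution
+            self.wheel[0][(self.now + offset) % self.buckets].append(event)
+        else:
+            # far event: parked at a buckets-times-coarser resolution
+            slot = ((self.now + offset) // self.buckets) % self.buckets
+            self.wheel[1][slot].append(event)
+----
+
+With 16384 buckets of 20 usec each, level 0 alone spans ~327 msec, so the IPGs in this profile are scheduled at full resolution without falling into the coarser level.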
+
+In *TW2* mode we have the same template duplicated: one with a short IPG and another with a long IPG.
+10% of the new flows will have the long IPG (a sanity check of this mix is sketched after the profile).
+
+.TW2 cap2/cur_flow.yaml
+[source,python]
+----
+- duration : 0.1
+  generator :
+    distribution : "seq"
+    clients_start : "16.0.0.1"
+    clients_end   : "16.0.0.255"
+    servers_start : "48.0.0.1"
+    servers_end   : "48.0.255.255"
+    clients_per_gb : 201
+    min_clients    : 101
+    dual_port_mask : "1.0.0.0"
+  tcp_aging : 0
+  udp_aging : 0
+  mac : [0x0,0x0,0x0,0x1,0x0,0x00]
+  #cap_ipg : true
+  cap_info :
+    - name: cap2/udp_10_pkts.pcap
+      cps : 10
+      ipg : 100000
+      rtt : 100000
+      w   : 1
+    - name: cap2/udp_10_pkts.pcap
+      cps : 90
+      ipg : 2
+      rtt : 2
+      w   : 1
+----
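+
+A quick sanity check on this mix (illustrative arithmetic, reusing the hypothetical `expected_active_flows()` helper sketched earlier):
+
+[source,python]
+----
+# Long-IPG template: 10% of new flows, each alive ~ 9 * 100000 usec = 0.9 sec
+# Short-IPG template: 90% of new flows, each alive ~ 9 * 2 usec = 18 usec
+long_active  = expected_active_flows(cps=10, pkts=10, ipg_usec=100000, multiplier=1)
+short_active = expected_active_flows(cps=90, pkts=10, ipg_usec=2, multiplier=1)
+print(long_active, short_active)  # ~9.0 vs ~0.0016 per -m unit
+# The 10% of flows with the long IPG contribute ~99.98% of the active-flow
+# count, while the 90% short-IPG flows carry most of the packet rate.
+----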
+
+==== Full results
+
+
+* PQ - v2.12 default configuration
+* TW0 - v2.13 default configuration
+* TW1 - v2.13 with more buckets (16K)
+* TW2 - v2.13 with two templates
+
+.MPPS/core Comparison
+image:images/tw1.png[title="results",align="center",width=800]
+
+.MPPS/core
+image:images/tw1_tbl.png[title="results",align="center"]
+
+.Factor relative to v2.12 results
+image:images/tw2.png[title="results",align="center",width=800]
+
+.Extrapolated total GbE per UCS with an average packet size of 600B
+image:images/tw3.png[title="results",align="center",width=800]
+
+Observation:
+
+* TW2 (two templates) has almost no performance impact
+* TW1 (more buckets) improves the performance up to a point
+* TW in general is better than PQ
+
+
diff --git a/doc/visio_drawings/tw.xlsx b/doc/visio_drawings/tw.xlsx
new file mode 100644
index 00000000..2ddc86d3
--- /dev/null
+++ b/doc/visio_drawings/tw.xlsx
Binary files differ