TRex Stateless support
======================
:email: trex.tgen@gmail.com 
:quotes.++:
:numbered:
:web_server_url: https://trex-tgn.cisco.com/trex
:local_web_server_url: csi-wiki-01:8181/trex
:toclevels: 6
:tabledef-default.subs: normal,callouts

include::trex_ga.asciidoc[]

// PDF version - image width variable
ifdef::backend-docbook[]
:p_width: 450
endif::backend-docbook[]

// HTML version - image width variable
ifdef::backend-xhtml11[]
:p_width: 800
endif::backend-xhtml11[]


== TRex stateless L2 benchmarks using XL710 40G NICs

=== Setup details

[cols="1,5"]
|=================
| Server: | UCSC-C240-M4SX
| CPU:    | 2 x Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
| RAM:    | 65536 MB @ 2133 MHz
| NICs:   | 2 x Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
| QSFP:   | Cisco QSFP-H40G-AOC1M
| OS:     | Fedora 18
| Switch: | Cisco Nexus 3172 Chassis, System version: 6.0(2)U5(2).
| TRex:   | v2.09 using 7 cores per dual interface.
|=================

=== Topology

TRex port 1 ↔ Switch port Eth1/50 (vlan 1005) ↔ Switch port Eth1/52 (vlan 1005) ↔ TRex port 2

=== Results

.Cached VM
[cols="2,2^,2^,2^,2^,2^,2^,2^,3", options="header"]
|=================
| Packet size | Line Utilization (%) | Total L1 (Gb/s) | Total L2 (Gb/s) | CPU Util (%) | Total MPPS | BW per core (Gb/s) <1> | MPPS per core <2> | Multiplier
| Imix        | 100.04               | 80.03           | 76.03           | 2.7          | 25.03      | 89.74                  | 28.07             | 100%
| 1514        | 100.12               | 80.1            | 79.05           | 1.33         | 6.53       | 430.18                 | 35.07             | 100%
| 590         | 99.36                | 79.49           | 76.89           | 3.2          | 16.29      | 177.43                 | 36.36             | 99.5%
| 128         | 99.56                | 79.65           | 68.89           | 15.4         | 67.27      | 36.94                  | 31.2              | 99.5%
| 64          | 52.8                 | 42.3            | 32.23           | 14.1         | 62.95      | 21.43                  | 31.89             | 31.5mpps
|=================

.VM with 1 variable
[cols="2,2^,2^,2^,2^,2^,2^,2^,3", options="header"]
|=================
| Packet size | Line Utilization (%) | Total L1 (Gb/s) | Total L2 (Gb/s) | CPU Util (%) | Total MPPS | BW per core (Gb/s) <1> | MPPS per core <2> | Multiplier
| Imix        | 100.04               | 80.03           | 76.03           | 12.6         | 25.03      | 45.37                  | 14.19             | 100%
| 1514        | 100.12               | 80.1            | 79.05           | 2.6          | 6.53       | 220.05                 | 17.94             | 100%
| 590         | 99.36                | 79.49           | 76.89           | 5.6          | 16.29      | 101.39                 | 20.78             | 99.5%
| 128         | 99.56                | 79.65           | 68.89           | 33.1         | 67.27      | 17.19                  | 14.52             | 99.5%
| 64          | 52.8                 | 42.3            | 32.23           | 31.3         | 63.06      | 9.65                   | 14.37             | 31.5mpps
|=================

<1> Extrapolated L1 bandwidth per 1 core @ 100% CPU utilization.
<2> Extrapolated amount of MPPS per 1 core @ 100% CPU utilization.
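
The derived columns can be cross-checked with a short calculation. The sketch below is not taken from the TRex sources. It assumes 14 data-plane cores in total (7 cores per dual interface, two dual interfaces) and the usual 20 bytes of per-packet Ethernet L1 overhead (preamble, start-of-frame delimiter and inter-frame gap). The function names are hypothetical.

```python
# Assumption: 20 bytes of L1 overhead per frame (7 preamble + 1 SFD + 12 inter-frame gap).
L1_OVERHEAD_BYTES = 20

# Assumption: 7 cores per dual interface and 2 dual interfaces, 14 cores in total.
CORES = 14


def l2_gbps(mpps, packet_size):
    """L2 bandwidth in Gb/s for a packet rate in Mpps and a frame size in bytes."""
    return mpps * packet_size * 8 / 1000.0


def l1_gbps(mpps, packet_size):
    """L1 bandwidth in Gb/s: the L2 frame plus the per-packet L1 overhead."""
    return mpps * (packet_size + L1_OVERHEAD_BYTES) * 8 / 1000.0


def per_core(total, cpu_util_percent, cores=CORES):
    """Extrapolate a measured total to a single core running at 100% CPU."""
    if cpu_util_percent <= 0 or cores <= 0:
        raise ValueError("CPU utilization and core count must be positive")
    return total / (cpu_util_percent / 100.0) / cores


# 64-byte row of the "Cached VM" table: 62.95 Mpps
print(round(l2_gbps(62.95, 64), 2))    # Total L2 (Gb/s)
print(round(l1_gbps(62.95, 64), 2))    # Total L1 (Gb/s)

# 1514-byte row of the "Cached VM" table: 80.1 Gb/s L1, 6.53 Mpps, 1.33% CPU
print(round(per_core(80.1, 1.33), 2))  # BW per core (Gb/s)
print(round(per_core(6.53, 1.33), 2))  # MPPS per core
```

With these assumptions the four printed values agree with the corresponding table cells (32.23, 42.3, 430.18 and 35.07).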

== Appendix

=== Preparing the setup and running the tests

==== Hardware preparations

Order a UCS server with the hardware described above.

* Several NICs are built on this chipset. +
Bare Intel NICs do not work with Cisco QSFP+ optics; in that case you will need Silicom NICs.
* Use NICs with 2x40G ports each.
* Place the NICs on different NUMA nodes (first on the left side, second on the right side).

==== Software preparations

* Install the OS (bare-metal Linux, *not* a VM!)
* Obtain the latest TRex package: wget https://trex-tgn.cisco.com/trex/release/latest
* Untar the package: tar -xzf latest
* Change to the directory of the unpacked TRex package
* Create the config file with the command: sudo python dpdk_setup_ports.py -i
** On Ubuntu 16, python3 is needed
** See link:trex_stateless_bench.html#_config_creation[config creation] for a detailed step-by-step walkthrough

==== The tests

* Run the TRex server: sudo ./t-rex-64 -i -c 7
* In another shell, run the TRex console: trex-console
** The console can be run from another computer using the -s argument; see --help for more info.
** Other options for the TRex client are automation or the GUI
* In the console, run the "tui" command, and then send traffic with commands like:
** start -f stl/bench.py -m 50% --port 0 3 -t size=590,vm=var1
** stop
** clear
** start -f stl/bench.py -m 30mpps --port 0 -t size=64,vm=cached
** start -f stl/bench.py -m 100% -t size=1514,vm=random --force
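
When sweeping several packet sizes and VM modes, the console commands can be generated rather than typed by hand. The helper below is a hypothetical sketch and not part of TRex. It only assembles command strings in the same form as the examples above, to be pasted into the console.

```python
def bench_command(size, vm, mult, ports=None, force=False):
    """Build a console 'start' command for stl/bench.py.

    size  -- packet size in bytes, passed as the 'size' tunable
    vm    -- VM mode tunable, e.g. 'cached', 'var1' or 'random'
    mult  -- multiplier string, e.g. '100%' or '30mpps'
    ports -- optional list of port IDs for --port
    force -- append --force when True
    """
    parts = ["start", "-f", "stl/bench.py", "-m", mult]
    if ports:
        parts += ["--port"] + [str(p) for p in ports]
    parts += ["-t", "size={},vm={}".format(size, vm)]
    if force:
        parts.append("--force")
    return " ".join(parts)


# Print one command per (size, VM mode) combination.
for size in (64, 128, 590, 1514):
    for vm in ("cached", "var1"):
        print(bench_command(size, vm, "100%"))
```

For example, `bench_command(590, "var1", "50%", ports=[0, 3])` reproduces the first command in the list above.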

==== Config creation

Our setup does not use hyper-threading, so we start with the command: +
sudo ./dpdk_setup_ports.py -i --no-ht +
 +
The script prints a table with interface information:

[cols="4,6,9,19,33,9,10,10", options="header"]
|=================
^| ID ^| NUMA ^|   PCI   ^|        MAC        ^|              Name               ^| Driver  ^| Linux IF ^|  Active
| 0  | 0    | 02:00.0 | 68:05:ca:32:15:b0 | Device 1583                     | i40e    | p1p1     |
| 1  | 0    | 02:00.1 | 68:05:ca:32:15:b1 | Device 1583                     | i40e    | p1p2     |
| 2  | 0    | 05:00.0 | 00:E0:ED:5D:82:D1 | Device 1583                     | igb_uio |          |
| 3  | 0    | 05:00.1 | 00:E0:ED:5D:82:D2 | Device 1583                     | igb_uio |          |
| 4  | 0    | 0a:00.0 | 04:62:73:5f:e8:a8 | I350 Gigabit Network Connection | igb     | p4p1     | \*Active*
| 5  | 0    | 0a:00.1 | 04:62:73:5f:e8:a9 | I350 Gigabit Network Connection | igb     | p4p2     |
| 6  | 1    | 84:00.0 | 68:05:CA:32:0C:38 | Device 1583                     | igb_uio |          |
| 7  | 1    | 84:00.1 | 68:05:CA:32:0C:39 | Device 1583                     | igb_uio |          |
|=================

The script then asks which interfaces TRex should use:

==========================
Please choose even number of interfaces either by ID or PCI or Linux IF (look at columns above). +
Stateful will use order of interfaces: Client1 Server1 Client2 Server2 etc. for flows. +
Stateless can be in any order. +
Try to choose each pair of interfaces to be on same NUMA within the pair for performance. +
Enter list of interfaces in line (for example: 1 3) : *2 3 6 7*
==========================

In our setup we used interfaces 2, 3, 6 and 7. +
Next, we need to specify destination MAC addresses for the chosen interfaces. +
By default, the script assumes loopback or an L2 switch with the ports connected as: 1^st^ port&#8596;2^nd^ port, 3^rd^ port&#8596;4^th^ port, etc. +
If you have a router, an L3 switch, or some other connection, change the destination MACs accordingly. +
In our case, the ports are connected 2&#8596;7, 3&#8596;6. +
We supply the proper destination MACs by answering "y" and pasting the MAC:

==========================
For interface 2, assuming loopback to it's dual interface 3. +
Destination MAC is 00:E0:ED:5D:82:D2. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 2: *68:05:CA:32:0C:39* +
For interface 3, assuming loopback to it's dual interface 2. +
Destination MAC is 00:E0:ED:5D:82:D1. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 3: *68:05:CA:32:0C:38* +
For interface 6, assuming loopback to it's dual interface 7. +
Destination MAC is 68:05:CA:32:0C:39. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 6: *00:E0:ED:5D:82:D2* +
For interface 7, assuming loopback to it's dual interface 6. +
Destination MAC is 68:05:CA:32:0C:38. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 7: *00:E0:ED:5D:82:D1*
==========================

Finally, you can preview the generated config and save it to a file:

==========================
Print preview of generated config? (Y/n) +
++++
<pre>### Config file generated by dpdk_setup_ports.py ###

- port_limit: 4
  version: 2
  interfaces: ['05:00.0', '05:00.1', '84:00.0', '84:00.1']
  port_info:
      - dest_mac: [0x68, 0x05, 0xca, 0x32, 0x0c, 0x39]
        src_mac:  [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd1]
      - dest_mac: [0x68, 0x05, 0xca, 0x32, 0x0c, 0x38]
        src_mac:  [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd2]

      - dest_mac: [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd2]
        src_mac:  [0x68, 0x05, 0xca, 0x32, 0x0c, 0x38]
      - dest_mac: [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd1]
        src_mac:  [0x68, 0x05, 0xca, 0x32, 0x0c, 0x39]

  platform:
      master_thread_id: 0
      latency_thread_id: 15
      dual_if:
        - socket: 0
          threads: [1,2,3,4,5,6,7]

        - socket: 1
          threads: [8,9,10,11,12,13,14]

</pre>
++++
Save the config to file? (Y/n) +
Default filename is /etc/trex_cfg.yaml +
Press ENTER to confirm or enter new file: +
File /etc/trex_cfg.yaml already exist, overwrite? (y/N)*y* +
Saved.
==========================
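
The destination MACs entered above follow directly from the cabling: each interface sends to the MAC of the interface on the other end of its link. The sketch below illustrates that mapping and the byte-list notation used in the generated config. It is not part of dpdk_setup_ports.py, and the function names are hypothetical.

```python
def mac_to_byte_list(mac):
    """Format 'aa:bb:cc:dd:ee:ff' as the '[0xaa, 0xbb, ...]' list used in the config."""
    octets = mac.split(":")
    if len(octets) != 6:
        raise ValueError("expected 6 octets: %r" % mac)
    return "[" + ", ".join("0x%02x" % int(o, 16) for o in octets) + "]"


def dest_macs(src_macs, links):
    """Map each interface ID to the source MAC of its link peer."""
    peers = {}
    for a, b in links:
        peers[a] = b
        peers[b] = a
    return {port: src_macs[peers[port]] for port in src_macs}


# Interface ID -> MAC, taken from the interface table above.
SRC = {
    2: "00:E0:ED:5D:82:D1",
    3: "00:E0:ED:5D:82:D2",
    6: "68:05:CA:32:0C:38",
    7: "68:05:CA:32:0C:39",
}
# Physical connections in this setup: 2<->7 and 3<->6.
LINKS = [(2, 7), (3, 6)]

for port, dst in sorted(dest_macs(SRC, LINKS).items()):
    print("interface %d: dest_mac: %s" % (port, mac_to_byte_list(dst)))
```

The printed lists can be compared against the `dest_mac` entries of the generated config as a sanity check before starting the server.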


=== Console screenshots with commands

==== 64 bytes

Utilization:

image:images/64_util.png[title="64 bytes util",align="left",width={p_width}, link="images/64_util.png"]

No drops:

image:images/64_nodrop.png[title="64 bytes no drops",align="left",width={p_width}, link="images/64_nodrop.png"]

==== 128 bytes

Utilization:

image:images/128_util.png[title="128 bytes util",align="left",width={p_width}, link="images/128_util.png"]

No drops:

image:images/128_nodrop.png[title="128 bytes no drops",align="left",width={p_width}, link="images/128_nodrop.png"]

==== 590 bytes

Utilization:

image:images/590_util.png[title="590 bytes util",align="left",width={p_width}, link="images/590_util.png"]

No drops:

image:images/590_nodrop.png[title="590 bytes no drops",align="left",width={p_width}, link="images/590_nodrop.png"]

==== 1514 bytes

Utilization:

image:images/1514_util.png[title="1514 bytes util",align="left",width={p_width}, link="images/1514_util.png"]

No drops:

image:images/1514_nodrop.png[title="1514 bytes no drops",align="left",width={p_width}, link="images/1514_nodrop.png"]