TRex Stateless support
======================
:email: trex.tgen@gmail.com
:quotes.++:
:numbered:
:web_server_url: https://trex-tgn.cisco.com/trex
:local_web_server_url: csi-wiki-01:8181/trex
:toclevels: 6
:tabledef-default.subs: normal,callouts

include::trex_ga.asciidoc[]

// PDF version - image width variable
ifdef::backend-docbook[]
:p_width: 450
endif::backend-docbook[]

// HTML version - image width variable
ifdef::backend-xhtml11[]
:p_width: 800
endif::backend-xhtml11[]

== TRex stateless L2 benchmarks using XL710 40G NICs

=== Setup details

[cols="1,5"]
|=================
| Server: | UCSC-C240-M4SX
| CPU:    | 2 x Intel(R) Xeon(R) CPU E5-2667 v3 @ 3.20GHz
| RAM:    | 65536 @ 2133 MHz
| NICs:   | 2 x Intel Corporation Ethernet Controller XL710 for 40GbE QSFP+ (rev 01)
| QSFP:   | Cisco QSFP-H40G-AOC1M
| OS:     | Fedora 18
| Switch: | Cisco Nexus 3172 Chassis, System version: 6.0(2)U5(2).
| TRex:   | v2.09 using 7 cores per dual interface.
|=================

=== Topology

TRex port 1 ↔ Switch port Eth1/50 (vlan 1005) ↔ Switch port Eth1/52 (vlan 1005) ↔ TRex port 2

=== Results

.Cached VM
[cols="2,2^,2^,2^,2^,2^,2^,2^,3", options="header"]
|=================
| Packet size | Line Utilization (%) | Total L1 (Gb/s) | Total L2 (Gb/s) | CPU Util (%) | Total MPPS | BW per core (Gb/s) <1> | MPPS per core <2> | Multiplier
| Imix | 100.04 | 80.03 | 76.03 | 2.7 | 25.03 | 89.74 | 28.07 | 100%
| 1514 | 100.12 | 80.1 | 79.05 | 1.33 | 6.53 | 430.18 | 35.07 | 100%
| 590 | 99.36 | 79.49 | 76.89 | 3.2 | 16.29 | 177.43 | 36.36 | 99.5%
| 128 | 99.56 | 79.65 | 68.89 | 15.4 | 67.27 | 36.94 | 31.2 | 99.5%
| 64 | 52.8 | 42.3 | 32.23 | 14.1 | 62.95 | 21.43 | 31.89 | 31.5mpps
|=================

.VM with 1 variable
[cols="2,2^,2^,2^,2^,2^,2^,2^,3", options="header"]
|=================
| Packet size | Line Utilization (%) | Total L1 (Gb/s) | Total L2 (Gb/s) | CPU Util (%) | Total MPPS | BW per core (Gb/s) <1> | MPPS per core <2> | Multiplier
| Imix | 100.04 | 80.03 | 76.03 | 12.6 | 25.03 | 45.37 | 14.19 |
100%
| 1514 | 100.12 | 80.1 | 79.05 | 2.6 | 6.53 | 220.05 | 17.94 | 100%
| 590 | 99.36 | 79.49 | 76.89 | 5.6 | 16.29 | 101.39 | 20.78 | 99.5%
| 128 | 99.56 | 79.65 | 68.89 | 33.1 | 67.27 | 17.19 | 14.52 | 99.5%
| 64 | 52.8 | 42.3 | 32.23 | 31.3 | 63.06 | 9.65 | 14.37 | 31.5mpps
|=================
<1> Extrapolated L1 bandwidth per 1 core @ 100% CPU utilization.
<2> Extrapolated amount of MPPS per 1 core @ 100% CPU utilization.

== Appendix

=== Preparing setup and running the tests.

==== Hardware preparations

Order the UCS with the hardware described above.

* There are several NICs with this chipset. +
Bare Intel NICs don't work with Cisco QSFP+ optics; in that case you will need Silicom NICs.
* Use NICs with 2x40G ports each.
* Put the NICs on different NUMA nodes (first on the left side, second on the right side).

==== Software preparations

* Install the OS (bare-metal Linux, *not* a VM!)
* Obtain the latest TRex package: wget https://trex-tgn.cisco.com/trex/release/latest
* Untar the package: tar -xzf latest
* Change to the directory of the unpacked TRex package
* Create the config file using the command: sudo python dpdk_setup_ports.py -i
** On Ubuntu 16, python3 is needed
** See paragraph link:trex_stateless_bench.html#_config_creation[config creation] for a detailed step-by-step walkthrough

==== The tests

* Run the TRex server: sudo ./t-rex-64 -i -c 7
* In another shell, run the TRex console: trex-console
** The console can be run from another computer with the -s argument; see --help for more info.
** Other options for the TRex client are automation or the GUI
* In the console, run the "tui" command, and then send the traffic with commands like:
** start -f stl/bench.py -m 50% --port 0 3 -t size=590,vm=var1
** stop
** clear
** start -f stl/bench.py -m 30mpps --port 0 -t size=64,vm=cached
** start -f stl/bench.py -m 100% -t size=1514,vm=random --force

==== Config creation

In our setup we will not use hyper-threading.
We will start with the command: +
sudo ./dpdk_setup_ports.py -i --no-ht

The script prints a table with interface info:

[cols="4,6,9,19,33,9,10,10", options="header"]
|=================
^| ID ^| NUMA ^| PCI ^| MAC ^| Name ^| Driver ^| Linux IF ^| Active
| 0 | 0 | 02:00.0 | 68:05:ca:32:15:b0 | Device 1583 | i40e | p1p1 |
| 1 | 0 | 02:00.1 | 68:05:ca:32:15:b1 | Device 1583 | i40e | p1p2 |
| 2 | 0 | 05:00.0 | 00:E0:ED:5D:82:D1 | Device 1583 | igb_uio | |
| 3 | 0 | 05:00.1 | 00:E0:ED:5D:82:D2 | Device 1583 | igb_uio | |
| 4 | 0 | 0a:00.0 | 04:62:73:5f:e8:a8 | I350 Gigabit Network Connection | igb | p4p1 | \*Active*
| 5 | 0 | 0a:00.1 | 04:62:73:5f:e8:a9 | I350 Gigabit Network Connection | igb | p4p2 |
| 6 | 1 | 84:00.0 | 68:05:CA:32:0C:38 | Device 1583 | igb_uio | |
| 7 | 1 | 84:00.1 | 68:05:CA:32:0C:39 | Device 1583 | igb_uio | |
|=================

We will be asked to specify the interfaces for TRex to use:

==========================
Please choose even number of interfaces either by ID or PCI or Linux IF (look at columns above). +
Stateful will use order of interfaces: Client1 Server1 Client2 Server2 etc. for flows. +
Stateless can be in any order. +
Try to choose each pair of interfaces to be on same NUMA within the pair for performance. +
Enter list of interfaces in line (for example: 1 3) : *2 3 6 7*
==========================

In our setup we used interfaces 2, 3, 6, 7. +
Next, we need to specify destination MAC addresses for the chosen interfaces. +
By default, the script assumes loopback or an L2 switch with ports connected in pairs: 1^st^ port↔2^nd^ port, 3^rd^ port↔4^th^ port etc. +
If you have a router, an L3 switch or some other connection, change the destination MACs accordingly. +
In our case, the ports are connected 2↔7, 3↔6. +
We will set the proper destination MACs by answering "y" and pasting the MAC:

==========================
For interface 2, assuming loopback to it's dual interface 3. +
Destination MAC is 00:E0:ED:5D:82:D2. Change it to MAC of DUT?
(y/N).*y* +
Please enter new destination MAC of interface 2: *68:05:CA:32:0C:39* +
For interface 3, assuming loopback to it's dual interface 2. +
Destination MAC is 00:E0:ED:5D:82:D1. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 3: *68:05:CA:32:0C:38* +
For interface 6, assuming loopback to it's dual interface 7. +
Destination MAC is 68:05:CA:32:0C:39. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 6: *00:E0:ED:5D:82:D2* +
For interface 7, assuming loopback to it's dual interface 6. +
Destination MAC is 68:05:CA:32:0C:38. Change it to MAC of DUT? (y/N).*y* +
Please enter new destination MAC of interface 7: *00:E0:ED:5D:82:D1*
==========================

Finally, you can print the generated config and save it to a file:

==========================
Print preview of generated config? (Y/n) +
++++
### Config file generated by dpdk_setup_ports.py ###

- port_limit: 4
  version: 2
  interfaces: ['05:00.0', '05:00.1', '84:00.0', '84:00.1']
  port_info:
      - dest_mac: [0x68, 0x05, 0xca, 0x32, 0x0c, 0x39]
        src_mac:  [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd1]
      - dest_mac: [0x68, 0x05, 0xca, 0x32, 0x0c, 0x38]
        src_mac:  [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd2]

      - dest_mac: [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd2]
        src_mac:  [0x68, 0x05, 0xca, 0x32, 0x0c, 0x38]
      - dest_mac: [0x00, 0xe0, 0xed, 0x5d, 0x82, 0xd1]
        src_mac:  [0x68, 0x05, 0xca, 0x32, 0x0c, 0x39]

  platform:
      master_thread_id: 0
      latency_thread_id: 15
      dual_if:
        - socket: 0
          threads: [1,2,3,4,5,6,7]

        - socket: 1
          threads: [8,9,10,11,12,13,14]

++++
Save the config to file? (Y/n) +
Default filename is /etc/trex_cfg.yaml +
Press ENTER to confirm or enter new file: +
File /etc/trex_cfg.yaml already exist, overwrite? (y/N)*y* +
Saved.
==========================

=== Console screenshots with commands

==== 64 bytes

Utilization:

image:images/64_util.png[title="64 bytes util",align="left",width={p_width}, link="images/64_util.png"]

No drops:

image:images/64_nodrop.png[title="64 bytes no drops",align="left",width={p_width}, link="images/64_nodrop.png"]

==== 128 bytes

Utilization:

image:images/128_util.png[title="128 bytes util",align="left",width={p_width}, link="images/128_util.png"]

No drops:

image:images/128_nodrop.png[title="128 bytes no drops",align="left",width={p_width}, link="images/128_nodrop.png"]

==== 590 bytes

Utilization:

image:images/590_util.png[title="590 bytes util",align="left",width={p_width}, link="images/590_util.png"]

No drops:

image:images/590_nodrop.png[title="590 bytes no drops",align="left",width={p_width}, link="images/590_nodrop.png"]

==== 1514 bytes

Utilization:

image:images/1514_util.png[title="1514 bytes util",align="left",width={p_width}, link="images/1514_util.png"]

No drops:

image:images/1514_nodrop.png[title="1514 bytes no drops",align="left",width={p_width}, link="images/1514_nodrop.png"]
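
=== Cross-checking the result columns

The derived columns in the Results tables appear to follow from the measured ones by simple arithmetic. The sketch below is not part of TRex, and its function names are hypothetical. It assumes that the L1 rate adds 20 bytes per packet (preamble, start-of-frame delimiter and inter-frame gap) to the L2 rate, and that the per-core columns divide the totals by the CPU utilization spread over 14 data-path cores (7 per dual interface, two dual interfaces, as in the generated config). The core count is inferred from the table values and is not stated explicitly in the text. Under those assumptions, the 1514-byte row of the "VM with 1 variable" table and the 64-byte L1 figure are reproduced to within rounding.

```python
# Hypothetical helpers for sanity-checking the benchmark tables.
# Assumptions (not taken from TRex itself):
#   * 20 bytes of L1 overhead per frame (8 preamble/SFD + 12 inter-frame gap)
#   * 14 data-path cores in total (2 dual interfaces x 7 cores each)

L1_OVERHEAD_BYTES = 20
ASSUMED_CORES = 14


def l1_gbps(l2_gbps, mpps):
    """Estimate the L1 rate in Gb/s from the L2 rate and the packet rate."""
    overhead_gbps = mpps * 1e6 * L1_OVERHEAD_BYTES * 8 / 1e9
    return l2_gbps + overhead_gbps


def per_core(total, cpu_util_pct, cores=ASSUMED_CORES):
    """Extrapolate a total (Gb/s or MPPS) to one core at 100% utilization."""
    busy_cores = cpu_util_pct / 100.0 * cores  # equivalent fully busy cores
    return total / busy_cores


if __name__ == "__main__":
    # 1514-byte row, "VM with 1 variable": L1 = 80.1 Gb/s, 6.53 MPPS, CPU 2.6%
    print("BW per core:   %.2f Gb/s" % per_core(80.1, 2.6))
    print("MPPS per core: %.2f" % per_core(6.53, 2.6))
    # 64-byte row, "Cached VM": L2 = 32.23 Gb/s at 62.95 MPPS
    print("L1 rate:       %.2f Gb/s" % l1_gbps(32.23, 62.95))
```

The same helpers can be applied to the other rows. Small differences in the last digit are expected because the table inputs are already rounded.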