C-Docs: Fixes and improvments

Change-Id: I167720869e15236fde76a91559c17bf94f0ed68d Signed-off-by: Tibor Frank <tifrank@cisco.com>
author: Tibor Frank <tifrank@cisco.com> 2023-06-16 12:27:45 +0000
committer: Tibor Frank <tifrank@cisco.com> 2023-06-19 11:01:51 +0000
commit: 62a485fa3753dbb9e0cf337164fe65d213e89ef2 (patch)
tree: f2de17f41dc07d48906e232dd0f8032826718bcd /docs/content/methodology/overview
parent: 67c40f2918753bbadc56545d84fb81030334e9dc (diff)
2 files changed, 204 insertions, 0 deletions
diff --git a/docs/content/methodology/overview/_index.md b/docs/content/methodology/overview/_index.md
index 10f362013f..d336361754 100644
--- a/docs/content/methodology/overview/_index.md
+++ b/docs/content/methodology/overview/_index.md
@@ -4,3 +4,12 @@ bookFlatSection: false
 title: "Overview"
 weight: 1
 ---
+
+# Methodology
+
+- [Terminology](terminology)
+- [Per Thread Resources](per_thread_resources)
+- [Multi-Core Speedup](multi_core_speedup)
+- [VPP Forwarding Modes](vpp_forwarding_modes)
+- [DUT State Considerations](dut_state_considerations)
+- [TRex Traffic Generator](trex_traffic_generator)
diff --git a/docs/content/methodology/overview/trex_traffic_generator.md b/docs/content/methodology/overview/trex_traffic_generator.md
new file mode 100644
index 0000000000..8771bf9780
--- /dev/null
+++ b/docs/content/methodology/overview/trex_traffic_generator.md
@@ -0,0 +1,195 @@
+---
+title: "TRex Traffic Generator"
+weight: 6
+---
+
+# TRex Traffic Generator
+
+## Usage
+
+[TRex traffic generator](https://trex-tgn.cisco.com) is used for majority of
+CSIT performance tests. TRex is used in multiple types of performance tests,
+see [Data Plane Throughtput]({{< ref "../measurements/data_plane_throughput/data_plane_throughput/#Data Plane Throughtput" >}})
+for more details.
+
+## Traffic modes
+
+TRex is primarily used in two (mutually incompatible) modes.
+
+### Stateless mode
+
+Sometimes abbreviated as STL.
+A mode with high performance, which is unable to react to incoming traffic.
+We use this mode whenever it is possible.
+Typical test where this mode is not applicable is NAT44ED,
+as DUT does not assign deterministic outside address+port combinations,
+so we are unable to create traffic that does not lose packets
+in out2in direction.
+
+Measurement results are based on simple L2 counters
+(opackets, ipackets) for each traffic direction.
+
+### Stateful mode
+
+A mode capable of reacting to incoming traffic.
+Contrary to the stateless mode, only UDP and TCP is supported
+(carried over IPv4 or IPv6 packets).
+Performance is limited, as TRex needs to do more CPU processing.
+TRex suports two subtypes of stateful traffic,
+CSIT uses ASTF (Advanced STateFul mode).
+
+This mode is suitable for NAT44ED tests, as clients send packets from inside,
+and servers react to it, so they see the outside address and port to respond to.
+Also, they do not send traffic before NAT44ED has created the corresponding
+translation entry.
+
+When possible, L2 counters (opackets, ipackets) are used.
+Some tests need L7 counters, which track protocol state (e.g. TCP),
+but those values are less than reliable on high loads.
+
+## Traffic Continuity
+
+Generated traffic is either continuous, or limited (by number of transactions).
+Both modes support both continuities in principle.
+
+### Continuous traffic
+
+Traffic is started without any data size goal.
+Traffic is ended based on time duration, as hinted by search algorithm.
+This is useful when DUT behavior does not depend on the traffic duration.
+The default for stateless mode.
+
+### Limited traffic
+
+Traffic has defined data size goal (given as number of transactions),
+duration is computed based on this goal.
+Traffic is ended when the size goal is reached,
+or when the computed duration is reached.
+This is useful when DUT behavior depends on traffic size,
+e.g. target number of NAT translation entries, each to be hit exactly once
+per direction.
+This is used mainly for stateful mode.
+
+## Traffic synchronicity
+
+Traffic can be generated synchronously (test waits for duration)
+or asynchronously (test operates during traffic and stops traffic explicitly).
+
+### Synchronous traffic
+
+Trial measurement is driven by given (or precomputed) duration,
+no activity from test driver during the traffic.
+Used for most trials.
+
+### Asynchronous traffic
+
+Traffic is started, but then the test driver is free to perform
+other actions, before stopping the traffic explicitly.
+This is used mainly by reconf tests, but also by some trials
+used for runtime telemetry.
+
+## Trafic profiles
+
+TRex supports several ways to define the traffic.
+CSIT uses small Python modules based on Scapy as definitions.
+Details of traffic profiles depend on modes (STL or ASTF),
+but some are common for both modes.
+
+Search algorithms are intentionally unaware of the traffic mode used,
+so CSIT defines some terms to use instead of mode-specific TRex terms.
+
+### Transactions
+
+TRex traffic profile defines a small number of behaviors,
+in CSIT called transaction templates. Traffic profiles also instruct
+TRex how to create a large number of transactions based on the templates.
+
+Continuous traffic loops over the generated transactions.
+Limited traffic usually executes each transaction once
+(typically as constant number of loops over source addresses,
+each loop with different source ports).
+
+Currently, ASTF profiles define one transaction template each.
+Number of packets expected per one transaction varies based on profile details,
+as does the criterion for when a transaction is considered successful.
+
+Stateless transactions are just one packet (sent from one TG port,
+successful if received on the other TG port).
+Thus unidirectional stateless profiles define one transaction template,
+bidirectional stateless profiles define two transaction templates.
+
+### TPS multiplier
+
+TRex aims to open transaction specified by the profile at a steady rate.
+While TRex allows the transaction template to define its intended "cps" value,
+CSIT does not specify it, so the default value of 1 is applied,
+meaning TRex will open one transaction per second (and transaction template)
+by default. But CSIT invocation uses "multiplier" (mult) argument
+when starting the traffic, that multiplies the cps value,
+meaning it acts as TPS (transactions per second) input.
+
+With a slight abuse of nomenclature, bidirectional stateless tests
+set "packets per transaction" value to 2, just to keep the TPS semantics
+as a unidirectional input value.
+
+### Duration stretching
+
+TRex can be IO-bound, CPU-bound, or have any other reason
+why it is not able to generate the traffic at the requested TPS.
+Some conditions are detected, leading to TRex failure,
+for example when the bandwidth does not fit into the line capacity.
+But many reasons are not detected.
+
+Unfortunately, TRex frequently reacts by not honoring the duration
+in synchronous mode, taking longer to send the traffic,
+leading to lower then requested load offered to DUT.
+This usualy breaks assumptions used in search algorithms,
+so it has to be avoided.
+
+For stateless traffic, the behavior is quite deterministic,
+so the workaround is to apply a fictional TPS limit (max_rate)
+to search algorithms, usually depending only on the NIC used.
+
+For stateful traffic the behavior is not deterministic enough,
+for example the limit for TCP traffic depends on DUT packet loss.
+In CSIT we decided to use logic similar to asynchronous traffic.
+The traffic driver sleeps for a time, then stops the traffic explicitly.
+The library that parses counters into measurement results
+than usually treats unsent packets/transactions as lost/failed.
+
+We have added a IP4base tests for every NAT44ED test,
+so that users can compare results.
+If the results are very similar, it is probable TRex was the bottleneck.
+
+### Startup delay
+
+By investigating TRex behavior, it was found that TRex does not start
+the traffic in ASTF mode immediately. There is a delay of zero traffic,
+after which the traffic rate ramps up to the defined TPS value.
+
+It is possible to poll for counters during the traffic
+(fist nonzero means traffic has started),
+but that was found to influence the NDR results.
+
+Thus "sleep and stop" stategy is used, which needs a correction
+to the computed duration so traffic is stopped after the intended
+duration of real traffic. Luckily, it turns out this correction
+is not dependend on traffic profile nor CPU used by TRex,
+so a fixed constant (0.112 seconds) works well.
+Unfortunately, the constant may depend on TRex version,
+or execution environment (e.g. TRex in AWS).
+
+The result computations need a precise enough duration of the real traffic,
+luckily server side of TRex has precise enough counter for that.
+
+It is unknown whether stateless traffic profiles also exhibit a startup delay.
+Unfortunately, stateless mode does not have similarly precise duration counter,
+so some results (mostly MRR) are affected by less precise duration measurement
+in Python part of CSIT code.
+
+## Measuring Latency
+
+If measurement of latency is requested, two more packet streams are
+created (one for each direction) with TRex flow_stats parameter set to
+STLFlowLatencyStats. In that case, returned statistics will also include
+min/avg/max latency values and encoded HDRHistogram data.
author	Tibor Frank <tifrank@cisco.com>	2023-06-16 12:27:45 +0000
committer	Tibor Frank <tifrank@cisco.com>	2023-06-19 11:01:51 +0000
commit	62a485fa3753dbb9e0cf337164fe65d213e89ef2 (patch)
tree	f2de17f41dc07d48906e232dd0f8032826718bcd /docs/content/methodology/overview
parent	67c40f2918753bbadc56545d84fb81030334e9dc (diff)