csit - Integration tests

Age	Commit message	Author	Files	Lines
2023-05-26	feat(core): T-Rex 3.03	pmikus	1	-1/+1
	Signed-off-by: pmikus <peter.mikus@protonmail.ch> Change-Id: I58607f50e2889092e40ff831ed4f1515444e29f8
2023-02-09	Add 1M flows test suites of VPP IP4 and VPP IP6	Yulong Pei	2	-0/+277
	VPP routing with one million FIB rules is also a key indicator for performance benchmarking. Signed-off-by: xinfeng zhao <xinfengx.zhao@intel.com> Signed-off-by: Yulong Pei <yulong.pei@intel.com> Change-Id: I19b52f6b96bdc5ed1e76305d258825fb17bd3af9
2022-10-13	feat(trex): Bump T-Rex to v3.00	pmikus	1	-1/+1
	Signed-off-by: pmikus <peter.mikus@protonmail.ch> Change-Id: I9aa88fb094b03888fc30d84edc1deaa406075db4
2022-09-30	Add wireguard multiple tunnels test suites	Yulong Pei	6	-0/+920
	Signed-off-by: Yulong Pei <yulong.pei@intel.com> Change-Id: I7abd546e67fdbe481b204bb6a1ec7e9c654dcdae
2022-07-20	fix(astf): avoid issues in pps tput	Vratko Polak	5	-45/+60
	When more than 1 data packet is sent in the same chunk, TRex is sometimes not fully deterministic in its usage of delayed ACKs. This changes the "application protocol" to send 5 chunks (1 data packet each), c2s and s2c interleaved, so each subsequent chunk acts as an ACK. The overall packet count remains the same, and even though this interleaved way may be more demanding on TRex CPU, preliminary results show NAT performance is still well below ip4base performance. As a side effect, the interleaved way seems to work also for 100B data frames, so we are avoiding two issues at once. Ticket: CSIT-1846 Ticket: CSIT-1830 Change-Id: Ia4dcfa7c89f2c08fc32bd6118e2e009316b33c25 Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2022-05-16	Core: T-rex 2.97	pmikus	1	-1/+1
	Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Id4d84aa7268080843b099fd7ab9851234612968b
2022-05-03	feat(astf): Support framesizes for ASTF	Vratko Polak	21	-640/+617
	- No support for IMIX.
	+ Fix a bad bug in padding (most ASTF profiles had wrong frame sizes).
	+ Fix a big typo in TCP PPS profiles (s->c was not data, just RST).
	+ Control transaction size via ASTF_N_DATA_FRAMES env variable.
	- Default value 5 leads to transactions smaller than before.
	+ It ensures transaction is one burst (per direction) even for jumbo.
	+ Edit autogen to set supported frame sizes based on suite id.
	+ Both TCP and UDP use the same values:
	+ 64B for CPS (exact for UDP, nominal for TCP).
	+ 100B, 1518B and 9000B for TPUT and PPS.
	- TCP TPUT achievable minimum is 70B.
	+ Used 100B to leave room for possible IPv6 ASTF tests.
	+ Separate function for code reused by vpp and trex tests.
	- I do not really like the new "copy and edit" approach added here.
	+ But it is a quick edit, better autogen refactor is low priority.
	+ Consider both established and transitory sessions as valid.
	- Mostly for compatibility with 2202 behavior and to avoid ramp-ups.
	- Assuming both session states have similar enough VPP CPU overhead.
	+ Added a TODO to investigate and maybe reconsider later.
	+ Update the state timeout value to 240s.
	+ That is the default for TCP (for transitory state).
	- UDP could keep using 300s.
	+ But I prefer UDP and TCP to behave as similarly as possible.
	+ Use TRex tunables to get the exact frame size (for data packets).
	- It is not clear why the recipe for MSS has to be this complicated.
	+ Move code away from profile init, as frame size is not known there.
	+ Change internal profile API, so values related to MSS are passed.
	+ Lower ramp-up rate for TCP TPUT tests.
	+ Because without lower rate, jumbo fails on packet loss in ramp-up.
	+ UDP TPUT ramp-up rate also lowered (just to keep suites more similar).
	+ Distinguish one-direction and aggregated average frame size.
	+ Update keyword documentation where the distinction matters.
	+ One-direction is needed for turning bandwidth limit to TPS limit.
	+ Aggregated is needed for correct NDRPDR bandwidth result value.
	- TCP TPUT will always be a few percent below bidirectional maximum.
	+ That is unavoidable, as one direction sends more control packets.
	+ Add runtime consistency checks so future refactors are safer.
	+ Fail if padding requested would be negative.
	+ Fail if suite claims unexpected values for packets per transaction.
	+ Edit the 4 types of ASTF profiles to keep them similar to each other.
	+ Move UDP TPUT limit value from a field back to direct argument.
	+ Stop pretending first UDP packet is not data.
	+ Apply small improvements where convenient.
	+ Replace "aggregate" with "aggregated" where possible.
	+ To lower probability of any future typos in variable names.
	+ Avoid calling Set Numeric Frame Sizes twice.
	+ Code formatting, keyword documentation, code comments, ...
	+ Add TODOs for less important code quality improvements.
	- Postpone updating of methodology pages to a subsequent change.
	Change-Id: I4b381e5210e69669f972326202fdcc5a2c9c923b Signed-off-by: Vratko Polak <vrpolak@cisco.com>
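The distinction between one-direction and aggregated average frame size in the message above can be illustrated with a small sketch. It is not the CSIT implementation: the function names and the example transaction shape (one small control frame plus data frames in one direction, small frames in the other) are assumptions made for illustration. Only the ASTF_N_DATA_FRAMES variable name and its default of 5 come from the commit message.

```python
import os

# Ethernet layer-1 overhead per frame: 8 B preamble plus 12 B inter-frame gap.
L1_OVERHEAD_BYTES = 20


def n_data_frames(default=5):
    """Read ASTF_N_DATA_FRAMES; 5 is the default named in the commit message."""
    return int(os.environ.get("ASTF_N_DATA_FRAMES", default))


def example_transaction(data_frames):
    """Hypothetical frame sizes (bytes) per direction for one transaction."""
    c2s = [64] + [1518] * data_frames  # one control frame, then data frames
    s2c = [64] * (data_frames + 1)     # small frames only
    return c2s, s2c


def one_direction_average(sizes):
    """Average frame size within a single direction."""
    return sum(sizes) / len(sizes)


def aggregated_average(c2s, s2c):
    """Average frame size over both directions together."""
    return (sum(c2s) + sum(s2c)) / (len(c2s) + len(s2c))


def tps_limit(bandwidth_bps, c2s, s2c):
    """Transactions per second that fit the link in the heavier direction."""
    def wire_bits(sizes):
        average = one_direction_average(sizes)
        return len(sizes) * (average + L1_OVERHEAD_BYTES) * 8
    return bandwidth_bps / max(wire_bits(c2s), wire_bits(s2c))
```

With this hypothetical shape, `tps_limit(10_000_000_000, *example_transaction(5))` is bounded by the c2s direction, while `aggregated_average` is the value one would use when converting a packet rate over both directions into a bandwidth figure.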
2021-06-10	FIX: Pylint reduce	pmikus	44	-81/+46
	Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I909942dbb920df7f0fe15c0c92cda92c3cd8d8ad
2021-03-16	Perf: Bump T-Rex to 2.88	pmikus	1	-1/+1
	+ Mellanox 4.6 is not for Ubuntu 20.04
	+ Mellanox for Ubuntu 20.04 is 4.9+
	+ T-Rex 2.86 is not for Mellanox 4.6+
	+ T-Rex for Mellanox 5.2 is 2.88+
	==================================
	= Bump
	Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: I902dfc2a43e6718b385e89f31a34260e09d61bd3
2021-02-26	IPsec: add 2n crypto udir perf tests	Juraj Linkeš	12	-0/+1261
	Add unidirectional 2n crypto tests. Only one direction can be tested on a 2 node topology, since we can't use the same interface for both encrypted and unencrypted traffic.
	Add the following tests:
	* {n}tnlsw-ip4base-int ndrpdr tests
	* {n}tnlsw-1atnl-ip4base-int reconf tests
	* {n}tnlswasync-scheduler-ip4base-int ndrpdr tests
	Where n is the number of tunnels:
	1, 4, 40, 400, 1000, 5000, 10000, 20000, 40000, 60000 for the first two
	1, 2, 4, 8 for the async scheduler tests
	All of these with the following encryption-auth algorithms:
	aes128gcm
	aes256gcm
	aes128cbc-hmac256sha
	aes128cbc-hmac512sha
	Also add the corresponding trex profiles: trex-stl-2n-ethip4-ip4dst{n}-udir.py
	Where n is the number of tunnels listed above. The profiles are shared among the tests.
	Change-Id: I22bb46e6ad59801581a78aa19310bee8a5293e56 Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
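Because the profiles are shared among the tests, the set of profile files is the union of both tunnel-count lists. A minimal sketch of that enumeration follows. The constant and function names are hypothetical; the tunnel counts, algorithm names and the file name pattern are taken from the commit message.

```python
# Tunnel counts quoted in the commit message.
SW_TUNNEL_COUNTS = (1, 4, 40, 400, 1000, 5000, 10000, 20000, 40000, 60000)
ASYNC_TUNNEL_COUNTS = (1, 2, 4, 8)

# Encryption-auth algorithm combinations quoted in the commit message.
ALGORITHMS = (
    "aes128gcm",
    "aes256gcm",
    "aes128cbc-hmac256sha",
    "aes128cbc-hmac512sha",
)


def profile_name(tunnels):
    """Fill the tunnel count into the profile file name pattern."""
    return f"trex-stl-2n-ethip4-ip4dst{tunnels}-udir.py"


def shared_profiles():
    """One profile per distinct tunnel count, in ascending order."""
    counts = sorted(set(SW_TUNNEL_COUNTS) | set(ASYNC_TUNNEL_COUNTS))
    return [profile_name(count) for count in counts]
```

Counts 1 and 4 appear in both lists, so the union yields 12 profile names rather than 14.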
2021-02-19	Add test suites for crypto sw scheduler engine	Yulong Pei	2	-0/+270
	This patch adds test suites for the VPP plugin crypto_sw_scheduler. IPsec sync mode does crypto and packet forwarding work on the same worker cores, whereas crypto_sw_scheduler can schedule crypto work to other async crypto cores to improve overall crypto processing capability. These test suites configure a fixed 1 rx queue per port, then measure IPsec performance with 1, 2, 3 crypto cores. This patchset includes test cases for 1, 2, 4, 8 IPsec tunnels. + Vratko helped change the test cases to count total physical cores instead of counting only crypto cores. Change-Id: I0e67182e3d13273890a23703d838101900e25126 Signed-off-by: Yulong Pei <yulong.pei@intel.com> Signed-off-by: Vratko Polak <vrpolak@cisco.com> Signed-off-by: pmikus <pmikus@cisco.com>
2021-02-03	UDP_PPS: Ensure keepalive is long enough	Vratko Polak	5	-10/+10
	Previously, it was long enough for the current performance, but not long enough for the theoretical worst case. See https://gerrit.fd.io/r/c/csit/+/29803/9/docs/report/introduction/methodology_nat44.rst#293 Change-Id: I3f57a834c77d93b38bca81fdbb714a6374b81bae Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2021-01-20	Add suites with randomized ip6 profiles	Vratko Polak	6	-16/+25
	+ Replace pair of traffic profiles (2n and 3n) with single nodeless one. + Compared to incremental suites, randomized ones add IP6_RND tag. Change-Id: I2f0dfc9e04bbcd0f88e95b92edf2da2c73faaab6 Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2021-01-18	Random flows: Use seeds again and increase limit.	Vratko Polak	3	-45/+54
	TRex does mix seeds when distributing over workers, but it is multiplicative [0], so zero is the only bad value. Limit restricts the cycle length of PRNG (by resetting [1] the seed). We want the cycle as long as possible. [0] https://github.com/cisco-system-traffic-generator/trex-core/blob/v2.73/src/stx/stl/trex_stl_stream_vm.h#L1616 [1] https://github.com/cisco-system-traffic-generator/trex-core/blob/v2.73/src/stx/stl/trex_stl_stream_vm.h#L313-L314 Change-Id: I33a29496f0853ef60d592c988f81a9d1109b5878 Signed-off-by: Vratko Polak <vrpolak@cisco.com>
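The two claims in the message above (a multiplicative mix makes zero the only bad seed, and a limit shortens the PRNG cycle by resetting the seed) can be shown with a toy model. This is not TRex code: the mixing formula and the linear congruential generator below are assumptions chosen only to make the behavior visible.

```python
def mix_seed(seed, worker_index):
    """Toy multiplicative mixing per worker; NOT the actual TRex formula."""
    return (seed * (worker_index + 1)) & 0xFFFFFFFF


def limited_stream(seed, limit, count):
    """Yield `count` values from a toy LCG, resetting state every `limit` values."""
    state = seed
    for index in range(count):
        if index % limit == 0:
            state = seed  # the reset is what restricts the cycle length
        state = (1103515245 * state + 12345) & 0x7FFFFFFF
        yield state
```

With a zero seed every worker gets the same mixed value, while any nonzero seed gives distinct values in this toy. A stream with `limit=4` repeats after 4 values, so a larger limit gives a longer cycle.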
2021-01-14	perf: GENEVE tunnel test, l3 mode	Jan Gelety	7	-12/+1403
	Jira: CSIT-1768 Change-Id: I888ae1a5754fa07297d4cdf65c2be0e3e49d89a5 Signed-off-by: Jan Gelety <jgelety@cisco.com>
2021-01-13	Add 3n ip4-rnd tests	Vratko Polak	3	-0/+0
	+ Rename traffic profiles to avoid mentioning number of nodes. + Improve 2n rnd suite documentation slightly. Change-Id: I82d6fb6a99133163a58d56f2acf8a7b9568ee77c Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2021-01-12	License: Wrap GPL block to 80 characters	Vratko Polak	163	-325/+488
	The original license block was adapted from https://wiki.fd.io/view/TSC/Proposed_Header_Python_Test_Scripts resulting in a line longer than 80 chars, but those are reported (although not blocked) by tox verify job. As the text from wiki was not used verbatim (it uses c-style comment block), minor formatting change like this should not be a big deal. + Bump copyright year. Change-Id: I55e3a0232639b448b1a6d7b1f3af84d903a8d0a5 Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2021-01-11	tests: add 2n1l l2 acl tests, update 2n-tx2 specs	Juraj Linkeš	1	-0/+170
	Modify initialize L2BD ACL keywords to be usable for both 2-node and 3-node topologies and update testsuites accordingly. Add the missing macip T-rex profile. Add classifier tests to 2n-tx2 job specs. Change-Id: I17b84b8fc18ef9a6f275ae0238a0665ac2017f01 Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
2020-11-18	T-Rex: 2.86	pmikus	1	-1/+1
	Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Id56b87ab868f2897a6563914b0beca2acc25e706
2020-11-12	Switch licenses in GPL directory	Vratko Polak	162	-486/+1944
	To be merged after this completes: https://wiki.fd.io/view/TSC/Relicensing_Procedure Change-Id: I003e53a620a5f82ba2bcc65b12f9c84ae92264ef Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2020-11-10	ASTF: Lessen L7 processing in UDP_CPS profiles	Vratko Polak	5	-40/+30
	Change-Id: I8b154156120821adb24273db2a232fa82200c0fe Signed-off-by: Vratko Polak <vrpolak@cisco.com>
2020-10-29	Support existing test types with ASTF	Vratko Polak	20	-70/+1340
	+ Add UDP_CPS, TCP_CPS, UDP_PPS and TCP_PPS suites.
	+ Update existing cps traffic profiles.
	+ Add missing traffic profiles.
	+ UDP:
	+ Single burst of 32 packets was confirmed as safe enough for TRex.
	+ Maybe 64 could work, but not enough testing for that.
	+ Multiple bursts have led to reduced TRex performance, as overlapping bursts (from different client instances) tend to fill up the buffers.
	+ TCP:
	+ Data size set to 11111 bytes, completely arbitrarily.
	+ Results look reasonable, so I have kept that.
	- MSS not set at all.
	- No tested support for frame size other than 64B.
	- Frame size does not even factor into TCP profiles.
	+ So other frame sizes are skipped in autogen.
	+ Update tags in related suites.
	- HOSTS_{n} and SRC_USER_{n} should be unified.
	- Questionable clarification on difference between IP4BASE and SCALE.
	+ Add NAT state resetters to tests that need them.
	+ Resetter is called (if set) before each measurement.
	+ If ramp-up is detected, resetter is not set.
	+ Rename "mult" argument to "multiplier".
	+ Abstracted from packets to transactions.
	+ Transaction corresponds to profile.
	+ TRex multiplier argument sets target rate in transactions per second.
	+ The familiar STL traffic:
	+ Bidirectional is considered to be 2 packets per transaction.
	+ Unidirectional is considered to be 1 packet per transaction.
	+ The newer ASTF traffic:
	+ 4 subtypes, each has different number of packets per transaction.
	+ For max rate computation:
	+ Packets in the more numerous direction are considered.
	+ Rely on TRex reported traffic duration for ASTF:
	+ Use the server side value.
	- Client side value is higher by an overhead.
	- TRex is not sending traffic during that time.
	+ Remove delays from traffic profiles.
	- Those delays would increase the reported traffic time.
	+ Support for scale limited trials.
	+ Only for ASTF profiles, each ASTF profile has limited scale.
	+ Scale defined in suite variables.
	+ For TRex to send all transactions, the provided duration value is ignored.
	+ The appropriate value is computed in TrafficGenerator.
	+ An ad-hoc time constant is added to match the TRex client side time overhead.
	+ The profile driver receives the computed duration.
	+ Measurement for PLRsearch adds a sleep if the computed duration is smaller.
	+ Alternative argument for search algos if scale is limited.
	+ Both need higher timeout to accommodate big scales.
	+ MLRsearch can afford fewer phases.
	+ Added a parameter to optionally shorten the duration.
	+ Use short duration for runtime stats trial and failure stats trial.
	+ Use very large keepalive values in udp profiles to avoid ka packets.
	+ No polling in ASTF profile driver.
	- Polling could eliminate the time overhead value.
	+ But polling proved to introduce some loss, affecting the results.
	+ Handle duration stretching in ASTF by stopping traffic.
	+ The stop has several steps so that:
	+ The traffic is really stopped entirely.
	+ Late packets do not count (maybe as errors).
	+ Stats are preserved to read for results (and cleared afterwards).
	+ Several quantities added to ReceiveRateMeasurement:
	+ Original target duration is preserved (algos need that).
	+ Input estimate (tps) for early search iterations.
	+ Output estimate (maybe pps) for MRR output.
	+ Strict result (unsent counts as loss) for NDR.
	+ Use L2 counters (opackets, ipackets) where possible.
	- TRex has trouble processing packets for the L7 ones at high loads.
	+ Remove warmup from profile drivers and keywords.
	+ Suites should call "Send ramp-up traffic" explicitly if needed.
	+ Added parsing for a few more counters.
	+ Both to use in formulas or just for debug purposes.
	- Only 64B cases in autogen, framesize support to be added later.
	+ Latency streams during search can be enabled via PERF_USE_LATENCY env var.
	+ MLRsearch improvements:
	+ Rename argument names to min_rate and max_rate.
	+ Use relative receive rate in initial phase.
	+ PLRsearch improvements:
	+ Careful computation when output (pps) does not match input (tps).
	+ Use geometric distribution (instead of Poisson).
	+ Helps against math errors.
	+ This should improve estimate stability.
	- But in practice big losses still lead to significant jumps.
	+ Traffic generator improvements:
	+ send_traffic_on_tg now calls the full set_rate_provider_defaults.
	+ _send_traffic_on_tg_internal for the logic without provider defaults.
	+ As the internal function is re-used by measure() without affecting defaults.
	+ Move _parse_traffic_results just before get_measurement_result.
	+ As the latter uses fields set by the former, it is now easier to read.
	+ Multiple sources for approximate duration.
	+ Tried from more precise to more available.
	+ Includes logic for _pps tests (added in later change).
	+ Move explicit type conversions to earlier occurrences.
	+ Profile driver output field uses semicolons to simplify parsing.
	+ Performance Robot lib file split to several smaller ones.
	+ performance_actions.robot:
	+ Hosts Additional Statistics Action For * keywords.
	+ performance_display.robot:
	+ Hosts keyword for displaying and verifying results.
	+ Change test message to use the correct unit (pps or cps).
	+ performance_limits.robot renamed to performance_vars.robot
	+ Added many keywords, mostly for accessing test variables.
	+ Moved variables for Policer into a new keyword there.
	+ Some keywords need sophisticated logic.
	- Others are basically Get Variable Value.
	+ But in future more logic can be added, without editing callers.
	+ Documentation for the new keywords acts as a documentation for test variables.
	+ performance_utils.robot has the rest.
	+ Eliminated arguments if the value is in test variable.
	+ Small improvements to documentation.
	- Still not enough cleanup with respect to arguments and test variables.
	+ Keywords are sorted alphabetically now in each one.
	+ Suites:
	+ Unified variables table:
	+ No colons in comments.
	+ ${n_hosts}, ${n_ports} and use them instead of hardcoded numbers.
	+ Add -cps to existing cps suite names.
	+ Remove "trial data overwrite".
	+ Compute max rate as in STL suites.
	+ Each NAT suite has ip4base suite to compare results to.
	- Those act as indirect TRex calibration.
	- VPP does not lose packets in those.
	+ Latency in ASTF suites is disabled hard.
	- As we do not support latency in ASTF profiles yet.
	+ Unidirectional tests governed by suite variable, not an argument.
	+ Write long argument lists vertically.
	+ Prefer to use argument names.
	+ In Python, also the last argument is followed by comma.
	+ It makes renaming and reordering easier.
	+ Similarly applies to prints with long lists of values.
	+ A TODO to update api crc file comments.
	Change-Id: I84729355edbec051298a9de1162107f88ff5737d Signed-off-by: Vratko Polak <vrpolak@cisco.com>
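The transaction abstraction in the message above (rate expressed in transactions per second, with the maximum rate governed by the direction carrying more packets) can be sketched as follows. The function names and the example packet counts for ASTF are hypothetical; the STL counts (2 packets per bidirectional transaction, 1 per unidirectional) come from the commit message.

```python
def packets_per_transaction(packets_c2s, packets_s2c):
    """Total packets in one transaction, over both directions."""
    return packets_c2s + packets_s2c


def max_transaction_rate(max_pps_per_direction, packets_c2s, packets_s2c):
    """Highest tps at which the busier direction stays within its pps limit."""
    busiest = max(packets_c2s, packets_s2c)
    if busiest < 1:
        raise ValueError("A transaction needs at least one packet.")
    return max_pps_per_direction / busiest


# STL bidirectional: one packet each way, so tps equals per-direction pps.
stl_bidirectional = max_transaction_rate(10_000_000, 1, 1)
# STL unidirectional: a single packet in one direction.
stl_unidirectional = max_transaction_rate(10_000_000, 1, 0)
# Hypothetical ASTF subtype with 6 packets one way and 5 the other.
astf_example = max_transaction_rate(12_000_000, 6, 5)
```

Dividing by the larger per-direction count, rather than the total, reflects that each direction has its own link capacity.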
2020-09-24	test: nat44det - add session number check	Jan Gelety	2	-338/+0
	- some tests need to reduce rate for ramp-up phase
	- some tests need to extend trial duration in ramp-up phase
	- removed 2n1l-10ge2p1x710-ethip4udp-nat44det-h1-p63-s63 suite as nat out ports are randomly selected from available port range so T-Rex stateless is not able to provide required out2in traffic
	Change-Id: I1145496610d202f81d911e68aa819844d7600918 Signed-off-by: Jan Gelety <jgelety@cisco.com>
2020-09-21	Tests: nat44ed-uni	pmikus	5	-0/+590
	Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Iee488e2244a4e253471310bc7fb9640c69c6b0cb
2020-08-24	T-Rex: 2.82, core pin, 8 workers	pmikus	1	-1/+1
	+ Bump T-Rex version. We need new features for ASTF test.
	+ Apply core pinning. Results in a more stable performance.
	+ Tweak the number of T-Rex workers.
	+ We need an even value to achieve symmetric performance with pinning.
	+ Value 8 was selected as a best compromise.
	This is a combination of 3 commits.
	This is the 1st commit message: T-Rex: 2.82
	This is the commit message #2: Change TRex to CORE_MASK_PIN mode to improve performance https://trex-tgn.cisco.com/trex/doc/trex_stateless.html#_core_masking_per_interface The above link has the following explanation: "When the profile is symmetric, performance can be improved by pinning half of the cores to port 0, and half of the cores to port 1, thus avoiding cache trashing and bouncing." The reason for this change is that CSIT runs with a 100G NIC often failed with "TRex stateless runtime error timeout", caused by TRex not being able to send enough traffic within the fixed duration. Changing to CORE_MASK_PIN mode fixes the issue. Not editing ASTF, as that supports different options.
	This is the commit message #3: Experiment: Vary number of TRex workers With CORE_MASK_PIN, we can get more predictable time distribution. Decided to use 8 workers, that gives good results both for high end (RDMA-core l2patch) and low end (vhost) tests.
	Change-Id: I5c61127799e0624464e960fcb980ad1c4058e744 Signed-off-by: pmikus <pmikus@cisco.com> Signed-off-by: Yulong Pei <yulong.pei@intel.com> Signed-off-by: Vratko Polak <vrpolak@cisco.com>
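Why pinning calls for an even worker count can be seen from a small sketch of the "half of the cores to port 0, half to port 1" split quoted above. This only models the assignment; it does not call TRex, and the function name is hypothetical.

```python
def pin_workers(n_workers):
    """Split worker indices into two halves, one per port."""
    workers = list(range(n_workers))
    half = n_workers // 2
    return {0: workers[:half], 1: workers[half:]}


def is_symmetric(n_workers):
    """True when both ports get the same number of workers."""
    assignment = pin_workers(n_workers)
    return len(assignment[0]) == len(assignment[1])
```

With 8 workers each port gets 4. With an odd count such as 7 the split is 3 and 4, so the two directions of a symmetric profile would not get equal generator capacity.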
2020-08-20	Framework: use 'stl' in trex stateless profile names	Jan Gelety	137	-0/+0
	Change-Id: I74641cc89d2f25d50b67d51bf2567082b420aabb Signed-off-by: Jan Gelety <jgelety@cisco.com>
2020-08-19	Perf: NAT44 endpoint-dependent mode - tcp, part I	Jan Gelety	10	-0/+620
	Jira: CSIT-1736 - tcp synthetic profiles w/o data packets - tcp cps perf tests, phase I (no special "search cps" KW) Change-Id: I52be34b0fdd51d7a33c8c5de9b46d7064c48f7fa Signed-off-by: Jan Gelety <jgelety@cisco.com>
2020-08-07	Perf: NAT44 endpoint-dependent mode - udp, part I	Jan Gelety	7	-7/+540
	- continuation of https://gerrit.fd.io/r/c/csit/+/26898, as the limit of changes (1000) was reached there
	Jira: CSIT-1711
	- udp synthetic profiles w/o data packets
	- udp cps perf tests, phase I (no special "search cps" KW)
	Part I means that we are using MRR tests to collect traffic data until a new CPS test type with a corresponding algorithm is ready.
	Change-Id: I0d30feb9ecf1d0bff937152656f8eb422f831378 Signed-off-by: Jan Gelety <jgelety@cisco.com>
2020-07-23	T-Rex: Add advanced stateful mode	Jan Gelety	2	-0/+246
	- provide base routines to run T-Rex in advanced stateful mode Change-Id: Ib0dc5f2919c370753335f6446860683dc4b12d93 Signed-off-by: Jan Gelety <jgelety@cisco.com>
2020-06-17	NAT44-EI traffic profiles fix.	Maros Mullner	8	-8/+16
	Signed-off-by: Maros Mullner <maros.mullner@pantheon.tech> Change-Id: I200d566aa94c2ae183aa3bc9db85a290f9da858d
2020-06-10	NAT44 EI tests	Maros Mullner	9	-1/+1345
	Signed-off-by: Maros Mullner <maros.mullner@pantheon.tech> Change-Id: Ib5f58f60a1409ed139e2846793bf52fdc02a6571
2020-05-12	FIX: L3fwd properly	pmikus	1	-6/+6
	Signed-off-by: pmikus <pmikus@cisco.com> Change-Id: Ibdfc0350a101c4815f25456176e25bb1d90fd881
2020-05-06	Separate files needing GPL license	Vratko Polak	130	-0/+20864
	+ Keep apache license for now, until this is completed: https://wiki.fd.io/view/TSC/Relicensing_Procedure
	+ Add utilities for switching license comment blocks.
	- They do not preserve attributes, so executable flag is lost.
	+ Move the affected files to GPL/.
	+ Update paths so files are executed from the new location.
	+ Change the way scripts are started so they do not require executable flag.
	+ Employ OptionString when constructing longer command lines.
	+ Move also PacketVerifier.py and TrafficScriptArg.py as they are linked with traffic scripts.
	+ That means the two files are outside "resources" package tree now.
	+ Added __init__.py files so relative imports work in new package tree.
	+ Start traffic scripts as python modules to allow relative imports.
	+ Once again needed because they are outside the default PYTHONPATH.
	Change-Id: Ieb135629e890adbaf5b79497570f3be25b746f9f Signed-off-by: Vratko Polak <vrpolak@cisco.com>