|
dpdk_nic_bind.py from <trex>/scripts/ is out of date and often fails
when used to bind a NIC port, e.g.
/usr/bin/python3 dpdk_nic_bind.py --bind=vfio-pci 0000:ca:00.0
/opt/trex-core-3.00/scripts/dpdk_nic_bind.py:40: DeprecationWarning:
The distutils package is deprecated and slated for removal in Python 3.12.
Use setuptools or check PEP 632 for potential alternatives
from distutils.util import strtobool
Error: bind failed for 0000:ca:00.0 - Cannot bind to driver vfio-pci
So remove the dpdk_nic_bind.py dependency in CSIT.
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Change-Id: I5a3f641cd77d339aa7a213f410ce2efe7c322b8a
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
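The commit above drops the script rather than patching it. For scripts that hit the same DeprecationWarning, a minimal sketch of a local stand-in for distutils.util.strtobool could look like the following. It is an illustration only, not code from CSIT or TRex, and unlike the original it returns bool instead of 1/0 and strips surrounding whitespace.

```python
def strtobool(value: str) -> bool:
    """Local stand-in for the deprecated distutils.util.strtobool.

    Accepts the same words as the original helper, case-insensitively.
    Returning bool (not int) and stripping whitespace are choices of this sketch.
    """
    truthy = {"y", "yes", "t", "true", "on", "1"}
    falsy = {"n", "no", "f", "false", "off", "0"}
    lowered = value.strip().lower()
    if lowered in truthy:
        return True
    if lowered in falsy:
        return False
    raise ValueError(f"invalid truth value {value!r}")
```

This only removes the warning; it does not address the "Cannot bind to driver vfio-pci" failure shown in the log.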
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: I3f7efcbfc82f683e7afc986d00fa71ae7413d93d
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: I652844e722e24cb49f09a3f30aabe3103e271079
|
|
- Some older documentation files are not updated yet.
Change-Id: If1717e12308f0e2e76c10024e6eebe68ddeddc9f
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: I23f2ab678e6666a1423620fa373261d822030bc8
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: Ia7eefc28645c78ad346d294099ef6258faa9814f
|
|
- Static content will be removed separately
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: I0992e941e1c24552837eaf7e8d3f6564b3cb21c8
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: Ibfab25ee82eaa207987f4070cf2386ea9d0781cd
|
|
- Due to divergence from the original design path, RAW was never
consumed. It adds too much code complexity and requires processing
on both storage and compute. Removing it entirely to make modeling
efficient.
- The log (apparently SSH) section will never be consumed in the way it
is coded in the model. This section is also not part of the model schema
itself, due to the point above.
- Introducing a telemetry section that is going to carry the telemetry
items required for CDash.
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: I7e0256c6c9715de8ee559eed29dce96329aac97d
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: I9aa88fb094b03888fc30d84edc1deaa406075db4
|
|
Add a VPP wireguard async mode test suite that uses a QAT device for crypto.
Also rename keyword ipsechw to cryptohw in suite_setup.robot, since the
crypto device is currently not used only by IPSec.
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Signed-off-by: xinfeng zhao <xinfengx.zhao@intel.com>
Change-Id: Ibdadb3b09c04b7181415ffd4a248abddc6289075
|
|
Signed-off-by: pmikus <peter.mikus@protonmail.ch>
Change-Id: If898b07e7406bfe28a63e4a793cf2f4e08a53e9d
|
|
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: Ib3e9b9bb7937557f9880fb230856eb96534446d0
|
|
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: I276e12e881a4db0a82d5e03107fe153d02a762c4
|
|
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: I0a84ce88b318c488ba2d519b20237c88b9f9f1e6
|
|
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: Ie358593f9977d04aca9e50a14e0d14158e1b0cf1
|
|
- This is actually a bug, not a feature.
- AB to be added later
- Tested on TREX and iPerf3
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: Ib6f2d13e3b9401a9fb5759e42a8a310ee11b9d41
|
|
GTPU rx offload is implemented using the NIC's ip4_gtpu flow MARK action
together with the VPP flow REDIRECT_TO_NODE and BUFFER_ADVANCE functions.
Received GTPU flows are directed straight to the gtpu4-flow-input graph node,
skipping the normal processing in the ethernet-input, ip4-input, ip4-lookup,
ip4-local and ip4-udp-lookup graph nodes.
Verified on 3n-clx with an Intel E810 NIC: with a single core and 64B packets,
performance improves by ~33% compared with the pure software path.
Signed-off-by: xinfeng zhao <xinfengx.zhao@intel.com>
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Change-Id: I2af4589448bdb1729e4ce206a8cf3a1239c61af8
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
|
|
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: Id0e7d31dc1368140c2c829fb2fcab009fbbed26d
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Id4d84aa7268080843b099fd7ab9851234612968b
|
|
- No support for IMIX.
+ Fix a bad bug in padding (most ASTF profiles had wrong frame sizes).
+ Fix a big typo in TCP PPS profiles (s->c was not data, just RST).
+ Control transaction size via ASTF_N_DATA_FRAMES env variable.
- Default value 5 leads to transactions smaller than before.
+ It ensures transaction is one burst (per direction) even for jumbo.
+ Edit autogen to set supported frame sizes based on suite id.
+ Both TCP and UDP use the same values:
+ 64B for CPS (exact for UDP, nominal for TCP).
+ 100B, 1518B and 9000B for TPUT and PPS.
- TCP TPUT achievable minimum is 70B.
+ Used 100B to leave room for possible IPv6 ASTF tests.
+ Separate function for code reused by vpp and trex tests.
- I do not really like the new "copy and edit" approach added here.
+ But it is a quick edit, better autogen refactor is low priority.
+ Consider both established and transitory sessions as valid.
- Mostly for compatibility with 2202 behavior and to avoid ramp-ups.
- Assuming both session states have similar enough VPP CPU overhead.
+ Added a TODO to investigate and maybe reconsider later.
+ Update the state timeout value to 240s.
+ That is the default for TCP (for transitory state).
- UDP could keep using 300s.
+ But I prefer UDP and TCP to behave as similarly as possible.
+ Use TRex tunables to get the exact frame size (for data packets).
- It is not clear why the recipe for MSS has to be this complicated.
+ Move code away from profile init, as frame size is not known there.
+ Change internal profile API, so values related to MSS are passed.
+ Lower ramp-up rate for TCP TPUT tests.
+ Because without lower rate, jumbo fails on packet loss in ramp-up.
+ UDP TPUT ramp-up rate also lowered (just to keep suites more similar).
+ Distinguish one-direction and aggregated average frame size.
+ Update keyword documentation where the distinction matters.
+ One-direction is needed for turning bandwidth limit to TPS limit.
+ Aggregated is needed for correct NDRPDR bandwidth result value.
- TCP TPUT will always be few percent below bidirectional maximum.
+ That is unavoidable, as one direction sends more control packets.
+ Add runtime consistency checks so future refactors are safer.
+ Fail if padding requested would be negative.
+ Fail if suite claims unexpected values for packets per transaction.
+ Edit the 4 types of ASTF profiles to keep them similar to each other.
+ Move UDP TPUT limit value from a field back to direct argument.
+ Stop pretending first UDP packet is not data.
+ Apply small improvements where convenient.
+ Replace "aggregate" with "aggregated" where possible.
+ To lower probability of any future typos in variable names.
+ Avoid calling Set Numeric Frame Sizes twice.
+ Code formatting, keyword documentation, code comments, ...
+ Add TODOs for less important code quality improvements.
- Postpone updating of methodology pages to a subsequent change.
Change-Id: I4b381e5210e69669f972326202fdcc5a2c9c923b
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
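Three items from the list above (the ASTF_N_DATA_FRAMES variable with default 5, the check for negative padding, and turning a bandwidth limit into a TPS limit using the one-direction average frame size) can be sketched as follows. Function names and signatures are hypothetical and not taken from the CSIT code. The 20-byte L1 overhead (preamble, start delimiter and inter-frame gap) is an assumption of this sketch.

```python
import os

# Assumed per-frame L1 overhead: 8 B preamble/SFD + 12 B inter-frame gap.
L1_OVERHEAD_BYTES = 20


def get_n_data_frames(default: int = 5) -> int:
    """Read transaction size from the environment (hypothetical helper)."""
    return int(os.environ.get("ASTF_N_DATA_FRAMES", default))


def compute_padding(frame_size: int, unpadded_size: int) -> int:
    """Return padding bytes needed; fail if the result would be negative."""
    padding = frame_size - unpadded_size
    if padding < 0:
        raise ValueError(
            f"Frame size {frame_size} too small for {unpadded_size} B of headers."
        )
    return padding


def bandwidth_to_tps_limit(
    bandwidth_bps: float, avg_frame_size: float, packets_per_transaction: int
) -> float:
    """Convert a one-direction bandwidth limit to transactions per second.

    avg_frame_size is the one-direction average frame size (L2 bytes),
    packets_per_transaction counts packets sent in that same direction.
    """
    bits_per_transaction = (
        8.0 * (avg_frame_size + L1_OVERHEAD_BYTES) * packets_per_transaction
    )
    return bandwidth_bps / bits_per_transaction
```

For example, a 10 Gbps limit with a 105 B average frame and 10 packets per transaction in that direction gives a limit of one million transactions per second.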
|
|
Signed-off-by: Peter Mikus <pmikus@cisco.com>
Change-Id: Ia5a097797b54c2e71acdcc8d72706e5540536252
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ib52fab112d458decfecf39c77085bcd85f811eba
|
|
+ Model version 1.0.0.
- Only some result types are exported.
+ MRR, NDRPDR and SOAK.
- Other result types to be added later.
+ In contrast, all test types are detected.
+ Convert custom classes to JSON-serializable equivalents.
+ Sort dict keys before converting to JSON.
+ Override the order for some known keys.
+ Export sets as sorted arrays.
+ Convert to info content from serialized raw content.
+ Also export outputs for suite setups and teardowns.
+ Info files for setup/teardown exist only temporarily.
+ The data is merged into suite.info.json file.
+ This simplifies presentation of total suite duration.
+ Define model via JSON schema:
- Just test case, suite setup/teardown/suite to be added later.
- Just info, raw to be added later.
+ Proper descriptions.
+ Json is generated from yaml.
+ This is a convenience for maintainers.
+ The officially used schema is the .json one.
+ TODOs written into a separate .txt file.
+ Validate exported instance against the schema.
+ Include format checking.
+ Update CSIT requirements for validation dependencies.
+ This needs python-dateutil==2.8.2, only a patch bump.
+ Compute bandwidth also for soak tests.
+ This unifies with NDRPDR to simplify schema definition.
- PAL may need an update for parsing soak test message.
+ Include SSH log items, raw output only.
+ Generate all outputs in a single filesystem tree.
+ Move raw outputs into test_output_raw.tar.xz.
+ Rename existing tar with suites to generated_robot_files.tar.xz.
Change-Id: I69ff7b330ed1a14dc435fd0ef008e753c0d7f78c
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
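The serialization rules listed above (sort dict keys, override the order for some known keys, export sets as sorted arrays) can be sketched with the standard library alone. This is a minimal illustration, not the CSIT exporter. The key names in KNOWN_KEY_ORDER are hypothetical, and the sketch assumes string dict keys and sets of mutually comparable items.

```python
import json

# Hypothetical keys that should come first, in this order.
KNOWN_KEY_ORDER = ("version", "test_id", "result")


def to_json_ready(data):
    """Recursively convert data to a JSON-serializable, stably ordered form."""
    if isinstance(data, (set, frozenset)):
        # Sets have no order, so export them as sorted arrays.
        return sorted(to_json_ready(item) for item in data)
    if isinstance(data, (list, tuple)):
        return [to_json_ready(item) for item in data]
    if isinstance(data, dict):
        # Known keys keep their prescribed order, the rest are sorted.
        leading = [key for key in KNOWN_KEY_ORDER if key in data]
        rest = sorted(key for key in data if key not in KNOWN_KEY_ORDER)
        return {key: to_json_ready(data[key]) for key in leading + rest}
    return data


def dumps(data) -> str:
    """Serialize with the custom key order (so sort_keys is not used)."""
    return json.dumps(to_json_ready(data))
```

Because dicts keep insertion order, json.dumps emits the keys in exactly the order built here, which makes repeated exports easy to compare.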
|
|
+ Implementation stub so checker can check already.
+ Also add documentation stub for the implemented model.
+ Checker checks also for bumps in documentation version.
- Not comparing implementation and documentation version yet.
Change-Id: I4d19c00315a1c171de325c4494c28f5210635f32
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
When building documentation using sphinx we see ~1200 similar warnings [0]
[0] - https://s3-logs.fd.io/vex-yul-rot-jenkins-1/csit-verify-tox-master-ubuntu2004-x86_64/3289/doc_verify.log.gz
These warnings are harmless and can be fixed later.
Signed-off-by: Viliam Luc <vluc@cisco.com>
Change-Id: I1ac1099d38935971d47491dde905715345d3935c
|
|
+ Add ability to switch between hugepages.
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I84d8eae28ed414a32e5ba82e6c9ed10d7f0ef9cb
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ic249493a39faa8f429ae4fa1de644d74d874151b
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I81958fbf6ef240d53a0fb8708ca882baf02f606c
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I331fe9134c62594189ca231349ab4c5ba43b51e5
|
|
- enabling for fortville, columbiaville
- enabling experimental for mlx
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I1b7ceb54769f4a0089ac7309350499e60c5cca0a
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I5bbb573fb75d0ee7b5e9b21591e4d6ff48df917e
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I2f019a083916aec9f7816266f6ad5b92dcc31fa0
|
|
1. Suite setup: add nginx download
2. Add nginx-1.14.2/1.15.0 ldp test suite
3. Add NginxUtils, NginxConfigGenerator methods
4. Taskset the nginx PID to cores not used by VPP; these cores are on the NIC's NUMA node
5. Cleanup: add Kill Processes - nohup
Signed-off-by: xizhanx <xix.zhang@intel.com>
Change-Id: Idbf0e4ec3bf63e88281a8e3e34f52e00a6801c85
Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
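Step 4 above (pinning nginx to cores that VPP does not use, on the NIC's NUMA node) can be sketched as below. The helper names are hypothetical and the real NginxUtils implementation may differ. The sketch only builds the command string and does not execute it.

```python
def pick_unused_cores(numa_cores, vpp_cores, count):
    """Pick `count` cores from the NIC's NUMA node that VPP does not use."""
    used = set(vpp_cores)
    unused = [core for core in sorted(numa_cores) if core not in used]
    if len(unused) < count:
        raise RuntimeError(
            f"Only {len(unused)} unused cores available, {count} requested."
        )
    return unused[:count]


def taskset_command(pid, cores):
    """Build a taskset command pinning an existing PID to a CPU list."""
    cpu_list = ",".join(str(core) for core in cores)
    return f"taskset -cp {cpu_list} {pid}"
```

For instance, with NUMA cores 0 to 5 and VPP on cores 1 and 2, requesting two cores yields cores 0 and 3.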
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I6edd980cb72111a008ae7fa19e1a4df279febdb2
|
|
+ PPS limit for AWS set to 1.2 Mpps.
+ The logic is very similar to the one in the ASTF driver.
+ This helps for testbeds with high duration stretching (e.g. AWS).
+ Difference: No transaction scale, and we deal with floats.
+ Update loss counting to count unsent packets as lost.
+ Also count "unsent" transactions for other transaction types.
+ If nonzero, log the number of unsent packets/transactions.
+ Make STL and ASTF time overhead constant (called delay) configurable.
+ Subtract delay from approximated_duration, also for ASTF.
Change-Id: I6ee6aa6fba4f110ba1636e1b0ff76cac64383e33
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
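The loss counting and duration handling described above can be sketched as follows. All names are hypothetical and this is an illustration, not the TrafficGenerator code. The sketch assumes that "unsent" means the difference between the intended and the actually sent packet count.

```python
AWS_PPS_LIMIT = 1.2e6  # The 1.2 Mpps limit mentioned above.


def limit_rate(requested_pps: float, pps_limit: float = AWS_PPS_LIMIT) -> float:
    """Cap the requested rate at the testbed PPS limit."""
    return min(requested_pps, pps_limit)


def count_loss(target_count: int, sent: int, received: int):
    """Return (unsent, loss), where unsent packets also count as lost."""
    unsent = max(0, target_count - sent)
    lost_in_flight = sent - received
    return unsent, unsent + lost_in_flight


def effective_duration(approximated_duration: float, delay: float) -> float:
    """Subtract the configurable time overhead (delay), never below zero."""
    return max(0.0, approximated_duration - delay)
```

With 1000 packets intended, 900 sent and 890 received, this reports 100 unsent packets and a total loss of 110.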
|
|
Additional configuration can provide performance boosts when running in virtual environments.
If set to 0, the default DPDK value is used.
By tuning this value it is also possible to run T-Rex 2.88 with ENA NICs on AWS.
Signed-off-by: Tomas Alexy <tomas.alexy@pantheon.tech>
Change-Id: I43c86ea1d9aa854a1087f07fe544ac77a5b80397
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I0341a1564ba510acf46bda3e24225209abef2f82
|
|
+ Mellanox 4.6 is not for Ubuntu 20.04
+ Mellanox for ubuntu 20.04 is 4.9+
+ T-Rex 2.86 is not for Mellanox 4.6+
+ T-Rex for Mellanox 5.2 is 2.88+
==================================
= Bump
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I902dfc2a43e6718b385e89f31a34260e09d61bd3
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Id56b87ab868f2897a6563914b0beca2acc25e706
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I145a4b5511141f1e2b4e387daa358e32dd2c8015
|
|
+ Add UDP_CPS, TCP_CPS, UDP_PPS and TCP_PPS suites.
+ Update existing cps traffic profiles.
+ Add missing traffic profiles.
+ UDP:
+ Single burst of 32 packets was confirmed as safe enough for TRex.
+ Maybe 64 could work, but not enough testing for that.
+ Multiple bursts have led to reduced TRex performance,
as overlapping bursts (from different client instances)
tend to fill up the buffers.
+ TCP:
+ Data size set to 11111 bytes, completely arbitrarily.
+ Results look reasonable, so I have kept that.
- MSS not set at all
- No tested support for frame size other than 64B.
- Frame size does not even factor into TCP profiles.
+ So other frame sizes are skipped in autogen.
+ Update tags in related suites.
- HOSTS_{n} and SRC_USER_{n} should be unified.
- Questionable clarification on difference between IP4BASE and SCALE.
+ Add NAT state resetters to tests that need them.
+ Resetter is called (if set) before each measurement.
+ If ramp-up is detected, resetter is not set.
+ Rename "mult" argument to "multiplier".
+ Abstracted from packets to transactions.
+ Transaction corresponds to profile.
+ TRex multiplier argument sets target rate in transactions per second.
+ The familiar STL traffic:
+ Bidirectional is considered to be 2 packets per transaction.
+ Unidirectional is considered to be 1 packet per transaction.
+ The newer ASTF traffic:
+ 4 subtypes, each has different number of packets per transaction.
+ For max rate computation:
+ Packets in the more numerous direction are considered.
+ Rely on TRex reported traffic duration for ASTF:
+ Use the server side value.
- Client side value is higher by an overhead.
- TRex is not sending traffic during that time.
+ Remove delays from traffic profiles.
- Those delays would increase the reported traffic time.
+ Support for scale limited trials.
+ Only for ASTF profiles, each ASTF profile has limited scale.
+ Scale defined in suite variables.
+ For TRex to send all transactions, the provided duration value is ignored.
+ The appropriate value is computed in TrafficGenerator.
+ An ad-hoc time constant is added to match the TRex client side time overhead.
+ The profile driver receives the computed duration.
+ Measurement for PLRsearch adds a sleep if the computed duration is smaller.
+ Alternative argument for search algos if scale is limited.
+ Both need higher timeout to accommodate big scales.
+ MLRsearch can afford fewer phases.
+ Added a parameter to optionally shorten the duration.
+ Use short duration for runtime stats trial and failure stats trial.
+ Use very large keepalive values in udp profiles to avoid ka packets.
+ No polling in ASTF profile driver.
- Polling could eliminate the time overhead value.
+ But polling proved to introduce some loss, affecting the results.
+ Handle duration stretching in ASTF by stopping traffic.
+ The stop has several steps so that:
+ The traffic is really stopped entirely.
+ Late packets do not count (maybe as errors).
+ Stats are preserved to read for results (and cleared afterwards).
+ Several quantities added to ReceiveRateMeasurement:
+ Original target duration is preserved (algos need that).
+ Input estimate (tps) for early search iterations.
+ Output estimate (maybe pps) for MRR output.
+ Strict result (unsent counts as loss) for NDR.
+ Use L2 counters (opackets, ipackets) where possible.
- TRex has trouble processing packets for the L7 ones at high loads.
+ Remove warmup from profile drivers and keywords.
+ Suites should call "Send ramp-up traffic" explicitly if needed.
+ Added parsing for few more counters.
+ Both to use in formulas or just for debug purposes.
- Only 64B cases in autogen, framesize support to be added later.
+ Latency streams during search can be enabled via PERF_USE_LATENCY env var.
+ MLRsearch improvements:
+ Rename argument names to min_rate and max_rate.
+ Use relative receive rate in initial phase.
+ PLRsearch improvements:
+ Careful computation when output (pps) does not match input (tps).
+ Use geometric distribution (instead of Poisson).
+ Helps against math errors.
+ This should improve estimate stability.
- But in practice big losses still lead to significant jumps.
+ Traffic generator improvements:
+ send_traffic_on_tg now calls the full set_rate_provider_defaults.
+ _send_traffic_on_tg_internal for the logic without provider defaults.
+ As the internal function is re-used by measure() without affecting defaults.
+ Move _parse_traffic_results just before get_measurement_result.
+ As the latter uses fields set by the former, it is now easier to read.
+ Multiple sources for approximate duration.
+ Tried from more precise to more available.
+ Includes logic for _pps tests (added in later change).
+ Move explicit type conversions to earlier occurrences.
+ Profile driver output field uses semicolons to simplify parsing.
+ Performance Robot lib file split to several smaller ones.
+ performance_actions.robot:
+ Hosts Additional Statistics Action For * keywords.
+ performance_display.robot:
+ Hosts keyword for displaying and verifying results.
+ Change test message to use the correct unit (pps or cps).
+ performance_limits.robot renamed to performance_vars.robot
+ Added many keywords, mostly for accessing test variables.
+ Moved variables for Policer into a new keyword there.
+ Some keywords need sophisticated logic.
- Others are basically Get Variable Value.
+ But in future more logic can be added, without editing callers.
+ Documentation for the new keywords acts as a documentation for test variables.
+ performance_utils.robot has the rest.
+ Eliminated arguments if the value is in test variable.
+ Small improvements to documentation.
- Still not enough cleanup with respect to arguments and test variables.
+ Keywords are sorted alphabetically now in each one.
+ Suites:
+ Unified variables table:
+ No colons in comments.
+ Use ${n_hosts}, ${n_ports} instead of hardcoded numbers.
+ Add -cps to existing cps suite names.
+ Remove "trial data overwrite".
+ Compute max rate as in STL suites.
+ Each NAT suite has ip4base suite to compare results to.
- Those act as indirect TRex calibration.
- VPP does not lose packets in those.
+ Latency in ASTF suites is disabled hard.
- As we do not support latency in ASTF profiles yet.
+ Unidirectional tests governed by suite variable, not an argument.
+ Write long argument lists vertically.
+ Prefer to use argument names.
+ In Python, also the last argument is followed by comma.
+ It makes renaming and reordering easier.
+ Similarly applies to prints with long lists of values.
+ A TODO to update api crc file comments.
Change-Id: I84729355edbec051298a9de1162107f88ff5737d
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
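Two computations from the list above can be sketched briefly: the maximal transaction rate (only packets in the more numerous direction are considered) and the duration for scale limited trials (the provided duration is ignored and a time constant is added). Names and the example packet counts are hypothetical. The actual logic lives in TrafficGenerator and may differ.

```python
def max_transaction_rate(
    max_pps_per_direction: float, packets_c2s: int, packets_s2c: int
) -> float:
    """Return max rate in transactions per second.

    The direction carrying more packets per transaction is the bottleneck.
    """
    return max_pps_per_direction / max(packets_c2s, packets_s2c)


def scale_limited_duration(
    n_transactions: int, tps: float, time_overhead: float
) -> float:
    """Duration needed for TRex to send all transactions at the given rate.

    time_overhead stands for the ad-hoc constant matching the TRex
    client side overhead; its value here is up to the caller.
    """
    return n_transactions / tps + time_overhead
```

Bidirectional STL traffic has 2 packets per transaction, one in each direction, so its transaction rate equals the one-direction packet rate. An ASTF profile with, say, 4 packets one way and 6 the other is limited by the 6.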
|
|
The previous 60 Mpps still leads to ~30% duration stretching.
+ Add comment on why 36 Mpps was chosen as the new limit.
Change-Id: Ic11e8ece03939bdc8680cd7bc4122373583a2f17
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
+ Unifying code structures
- To easily plug another DUT
+ New PCI PassThrough templates
+ Improved perf stat on cores allocated in test.
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I325f17b977314f93cb91818feddfddf3e607eb8a
|
|
+ Bump T-Rex version. We need new features for ASTF test.
+ Apply core pinning. Results in a more stable performance.
+ Tweak the number of T-Rex workers.
+ We need an even value to achieve symmetric performance with pinning.
+ Value 8 was selected as a best compromise.
This is a combination of 3 commits.
This is the 1st commit message:
T-Rex: 2.82
This is the commit message #2:
Change Trex to CORE_MASK_PIN mode to improve performance
https://trex-tgn.cisco.com/trex/doc/trex_stateless.html#_core_masking_per_interface
The above link gives the following explanation:
"When the profile is symmetric, performance can be improved by pinning half of
the cores to port 0, and half of the cores to port 1, thus avoiding cache
trashing and bouncing."
The reason for this change is that CSIT runs with a 100G NIC often failed with
"TRex stateless runtime error timeout", caused by TRex not being able to send
enough traffic within the fixed duration.
Changing to CORE_MASK_PIN mode fixes the issue.
Not editing ASTF, as that supports different options.
This is the commit message #3:
Experiment: Vary number of TRex workers
With CORE_MASK_PIN, we can get more predictable time distribution.
Decided to use 8 workers, that gives good results
both for high end (RDMA-core l2patch) and low end (vhost) tests.
Change-Id: I5c61127799e0624464e960fcb980ad1c4058e744
Signed-off-by: pmikus <pmikus@cisco.com>
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
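The quoted idea (half of the cores pinned to port 0, half to port 1) also shows why the worker count has to be even. The sketch below is an illustration only: the function is hypothetical, and the way TRex actually lays out its masks under CORE_MASK_PIN may differ from this simple low-half/high-half split.

```python
def split_core_masks(n_workers: int):
    """Split workers into two equal halves, returned as per-port bitmasks."""
    if n_workers <= 0 or n_workers % 2:
        raise ValueError("An even, positive number of workers is required.")
    half = n_workers // 2
    port0_mask = (1 << half) - 1      # Lower half of the workers.
    port1_mask = port0_mask << half   # Upper half of the workers.
    return port0_mask, port1_mask
```

With the 8 workers chosen above, each port gets 4 workers (masks 0x0F and 0xF0 in this sketch). An odd count cannot be split symmetrically.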
|
|
For the Intel E810CQ 100G NIC, the PF kernel driver is ice and the VF kernel
driver is iavf; the VF hardware supports the VPP native avf driver.
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Change-Id: Ic8d86e5ee00057bbbcd09df619a38bd1371c8fd7
|
|
- continuation of https://gerrit.fd.io/r/c/csit/+/26898, as the limit
of changes (1000) was reached
Jira: CSIT-1711
- udp synthetic profiles w/o data packets
- udp cps perf tests, phase I (no special "search cps" KW)
Phase I means that we use MRR tests to collect traffic data
until a new CPS test type with a corresponding algorithm is ready.
Change-Id: I0d30feb9ecf1d0bff937152656f8eb422f831378
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I1b535ea61ab68f6e37989ffc942979cdfd24f55e
|
|
Change-Id: I3bbe1fe0073ddeead5219993675f24955e8c3dfd
Signed-off-by: Peter Mikus <pmikus@cisco.com>
|
|
- test/suite/global
- binary logic is not working
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ia3d81cbf2c5f04d1093a0a408c84a9ffc6f3eef0
|