Age | Commit message | Author | Files | Lines |
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I33fefddda9055524c1817b13b5c99bb1b97ebff4
(cherry picked from commit 96c5f5ef45cf039691404a4451b1c6d9260d6ea0)
|
|
Change-Id: Iebd62fd6b0c798f7b4dd1f3b093c156e533b3900
Signed-off-by: Jan Gelety <jgelety@cisco.com>
(cherry picked from commit a5b3a8b91d8e6c9baa4361d70b96769a98bfc454)
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I7d83ba048e0609d6b8623fab5c2960e48a37c023
(cherry picked from commit 3c3930b9f6f9a40d6b30f6e56e8c40279e34f650)
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ieff60a44d42d66acee8ba1680e7e285d6cd01bc9
(cherry picked from commit e62ade12b7c9918cda2d363387ed0f517aa9840d)
|
|
+ Reduce time overhead when parsing --include vs --test
+ Input files will remain the same
+ 3n-hsw 150include ~24min, 150test ~5min
+ 2n-clx 489include ~61min, 489test ~9min
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ia453b1bc1d1862bfc378a7611064a67ee564e2f2
|
|
Attempt to unbind a driver from a device only if it is bound to a
driver.
Remove the dynamic addition of an existing device ID to a driver. From
the docs [0]:
Writing a device ID to this file will attempt to
dynamically add a new device ID to a PCI device driver.
Since we assume the VFs are bound to the kernel driver when VPP Device
topology creation is done, it implies that the kernel driver supports
the device ID of those VFs, removing the need to add the support.
[0]: https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-pci
Change-Id: I20f3ca071a5a84a06ff358ba514532248a8f9ad0
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
(cherry picked from commit 71d7150a65a7c006bf46b2c1001dbaa00b5681fb)
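
The unbind condition described in this message can be sketched as follows. This is a minimal Python illustration, not the CSIT code itself: the function name `unbind_if_bound` and the `sysfs_root` parameter are hypothetical, and only the `driver/unbind` layout is taken from the sysfs ABI document cited as [0].

```python
from pathlib import Path


def unbind_if_bound(pci_addr, sysfs_root="/sys/bus/pci/devices"):
    """Unbind a PCI device from its driver only if a driver is bound.

    Returns True if an unbind was attempted, False if no driver was bound.
    (Hypothetical helper; names are assumptions for illustration.)
    """
    device_dir = Path(sysfs_root) / pci_addr
    driver_link = device_dir / "driver"
    # The "driver" entry exists only while the device is bound to a driver.
    if not driver_link.exists():
        return False
    # Writing the PCI address to <driver>/unbind detaches the device.
    (driver_link / "unbind").write_text(pci_addr)
    return True
```

No device ID is written to `new_id` here, matching the reasoning that the kernel driver already supports the VF device ID.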
|
|
This is a follow-up to https://gerrit.fd.io/r/c/csit/+/20394
https://gerrit.fd.io/r/20119
has changed the way archival works,
everything should now go to logs.fd.io
(instead of appearing on run page in jenkins.fd.io).
The glob pattern for archiving is quite eager,
doing recursive search. That is good, as it can find
also misplaced useful outputs.
But it also means our usage of copy_archives function
creates two copies of archived directories,
usually archives/ and archives/archive/.
This change renames copy_archives to move_archives,
with a few workarounds to support multiple calls.
I also renamed ARCHIVE_DIR value from $CSIT_DIR/archive
to $CSIT_DIR/archives to make "move" operation look natural.
Finally, download_builds function is being removed,
as after recent improvements to VPP compilation speed
nobody seems to be using it.
Change-Id: I19a429e1dfdfaab7fcf32a9609963b1aebd33c6c
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
(cherry picked from commit 523c6e6e24101206ff1318ca17c310dff8b3c9d2)
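
The idea of moving instead of copying, while tolerating repeated calls, might look like the sketch below. It is written in Python for illustration only; the real function lives in the CSIT shell libraries, and the overwrite-on-repeat behaviour shown here is an assumed workaround, not necessarily the one the change uses.

```python
import shutil
from pathlib import Path


def move_archives(source_dir, archive_dir):
    """Move the contents of source_dir into archive_dir.

    Safe to call several times: a missing or empty source is a no-op.
    (Illustrative sketch; the overwrite policy is an assumption.)
    """
    source = Path(source_dir)
    target = Path(archive_dir)
    target.mkdir(parents=True, exist_ok=True)
    if not source.is_dir():
        return
    # Snapshot the listing first, as entries disappear while we move them.
    for item in list(source.iterdir()):
        destination = target / item.name
        # Replace leftovers from an earlier call instead of nesting copies.
        if destination.is_dir():
            shutil.rmtree(destination)
        elif destination.exists():
            destination.unlink()
        shutil.move(str(item), str(destination))
```

Because the files are moved, the recursive archiving glob finds only one copy of each output.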
|
|
+ Move "|| true" to a place that really avoids errors.
+ Attempt to parse decoded string if trigger is not found in plain one.
Change-Id: If3587229ec588f9ad41acb3050add1142032d2d8
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
(cherry picked from commit 4582f0f408616cdff8e606ac3abfe154f8f0514b)
(cherry picked from commit f89bc7a87e0b6015e50de3557a1724c8aaafbf60)
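
The fallback from the plain string to the decoded one could be sketched like this. The message does not say which encoding is involved, so base64 is an assumption here, as are the function name `find_trigger` and the example trigger word `perftest`.

```python
import base64
import binascii


def find_trigger(comment, trigger="perftest"):
    """Return the text containing the trigger, or None if it is absent.

    The plain comment is checked first; only if that fails is the comment
    treated as base64 (an assumed encoding) and checked again.
    """
    if trigger in comment:
        return comment
    try:
        decoded = base64.b64decode(comment, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError, ValueError):
        # Not valid base64 or not valid text, so there is nothing to parse.
        return None
    return decoded if trigger in decoded else None
```

Returning `None` rather than raising plays the same role as the relocated `|| true`: a missing trigger is not treated as an error.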
|
|
- Better to get it fully aligned than cherry-pick spaghetti.
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: If223ef3f0247413d53225eb57f8903a7675632e3
|
|
The previous code counted full TCP connections,
which need one more packet, leading to worse results.
Change-Id: Ifcf78356b6ed54819ea0bf5aa069d7d9cb951183
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
(cherry picked from commit b71112bc323b55e39d8a9992a46530e1eb7f6f58)
|
|
Manual cherry-pick from master [1],
reverting impact of [2] and [3].
[1] https://gerrit.fd.io/r/c/csit/+/28208/176
[2] https://gerrit.fd.io/r/c/csit/+/29077
[3] https://gerrit.fd.io/r/c/csit/+/29529
The heap multipliers are left in suites,
as that simplifies cherry-picking between branches.
Original [1] commit message:
Support existing test types with ASTF
+ Add UDP_CPS, TCP_CPS, UDP_PPS and TCP_PPS suites.
+ Update existing cps traffic profiles.
+ Add missing traffic profiles.
+ UDP:
+ Single burst of 32 packets was confirmed as safe enough for TRex.
+ Maybe 64 could work, but not enough testing for that.
+ Multiple bursts have led to reduced TRex performance,
as overlapping bursts (from different client instances)
tend to fill up the buffers.
+ TCP:
+ Data size set to 11111 bytes, completely arbitrarily.
+ Results look reasonable, so I have kept that.
- MSS not set at all
- No tested support for frame size other than 64B.
- Frame size does not even factor into TCP profiles.
+ So other frame sizes are skipped in autogen.
+ Update tags in related suites.
- HOSTS_{n} and SRC_USER_{n} should be unified.
- Questionable clarification on difference between IP4BASE and SCALE.
+ Add NAT state resetters to tests that need them.
+ Resetter is called (if set) before each measurement.
+ If ramp-up is detected, resetter is not set.
+ Rename "mult" argument to "multiplier".
+ Abstracted from packets to transactions.
+ Transaction corresponds to profile.
+ TRex multiplier argument sets target rate in transactions per second.
+ The familiar STL traffic:
+ Bidirectional is considered to be 2 packets per transaction.
+ Unidirectional is considered to be 1 packet per transaction.
+ The newer ASTF traffic:
+ 4 subtypes, each has different number of packets per transaction.
+ For max rate computation:
+ Packets in the more numerous direction are considered.
+ Rely on TRex reported traffic duration for ASTF:
+ Use the server side value.
- Client side value is higher by an overhead.
- TRex is not sending traffic during that time.
+ Remove delays from traffic profiles.
- Those delays would increase the reported traffic time.
+ Support for scale limited trials.
+ Only for ASTF profiles, each ASTF profile has limited scale.
+ Scale defined in suite variables.
+ For TRex to send all transactions, the provided duration value is ignored.
+ The appropriate value is computed in TrafficGenerator.
+ An ad-hoc time constant is added to match the TRex client side time overhead.
+ The profile driver receives the computed duration.
+ Measurement for PLRsearch adds a sleep if the computed duration is smaller.
+ Alternative argument for search algos if scale is limited.
+ Both need higher timeout to accommodate big scales.
+ MLRsearch can afford fewer phases.
+ Added a parameter to optionally shorten the duration.
+ Use short duration for runtime stats trial and failure stats trial.
+ Use very large keepalive values in udp profiles to avoid ka packets.
+ No polling in ASTF profile driver.
- Polling could eliminate the time overhead value.
+ But polling proved to introduce some loss, affecting the results.
+ Handle duration stretching in ASTF by stopping traffic.
+ The stop has several steps so that:
+ The traffic is really stopped entirely.
+ Late packets do not count (maybe as errors).
+ Stats are preserved to read for results (and cleared afterwards).
+ Several quantities added to ReceiveRateMeasurement:
+ Original target duration is preserved (algos need that).
+ Input estimate (tps) for early search iterations.
+ Output estimate (maybe pps) for MRR output.
+ Strict result (unsent counts as loss) for NDR.
+ Use L2 counters (opackets, ipackets) where possible.
- TRex has trouble processing packets for the L7 ones at high loads.
+ Remove warmup from profile drivers and keywords.
+ Suites should call "Send ramp-up traffic" explicitly if needed.
+ Added parsing for few more counters.
+ Both to use in formulas or just for debug purposes.
- Only 64B cases in autogen, framesize support to be added later.
+ Latency streams during search can be enabled via PERF_USE_LATENCY env var.
+ MLRsearch improvements:
+ Rename argument names to min_rate and max_rate.
+ Use relative receive rate in initial phase.
+ PLRsearch improvements:
+ Careful computation when output (pps) does not match input (tps).
+ Use geometric distribution (instead of Poisson).
+ Helps against math errors.
+ This should improve estimate stability.
- But in practice big losses still lead to significant jumps.
+ Traffic generator improvements:
+ send_traffic_on_tg now calls the full set_rate_provider_defaults.
+ _send_traffic_on_tg_internal for the logic without provider defaults.
+ As the internal function is re-used by measure() without affecting defaults.
+ Move _parse_traffic_results just before get_measurement_result.
+ As the latter uses fields set by the former, it is now easier to read.
+ Multiple sources for approximate duration.
+ Tried from more precise to more available.
+ Includes logic for _pps tests (added in later change).
+ Move explicit type conversions to earlier occurrences.
+ Profile driver output field uses semicolons to simplify parsing.
+ Performance Robot lib file split to several smaller ones.
+ performance_actions.robot:
+ Hosts Additional Statistics Action For * keywords.
+ performance_display.robot:
+ Hosts keyword for displaying and verifying results.
+ Change test message to use the correct unit (pps or cps).
+ performance_limits.robot renamed to performance_vars.robot
+ Added many keywords, mostly for accessing test variables.
+ Moved variables for Policer into a new keyword there.
+ Some keywords need sophisticated logic.
- Others are basically Get Variable Value.
+ But in future more logic can be added, without editing callers.
+ Documentation for the new keywords acts as a documentation for test variables.
+ performance_utils.robot has the rest.
+ Eliminated arguments if the value is in test variable.
+ Small improvements to documentation.
- Still not enough cleanup with respect to arguments and test variables.
+ Keywords are sorted alphabetically now in each one.
+ Suites:
+ Unified variables table:
+ No colons in comments.
+ ${n_hosts}, ${n_ports} and use them instead of hardcoded numbers.
+ Add -cps to existing cps suite names.
+ Remove "trial data overwrite".
+ Compute max rate as in STL suites.
+ Each NAT suite has ip4base suite to compare results to.
- Those act as indirect TRex calibration.
- VPP does not lose packets in those.
+ Latency in ASTF suites is disabled hard.
- As we do not support latency in ASTF profiles yet.
+ Unidirectional tests governed by suite variable, not an argument.
+ Write long argument lists vertically.
+ Prefer to use argument names.
+ In Python, also the last argument is followed by comma.
+ It makes renaming and reordering easier.
+ Similarly applies to prints with long lists of values.
+ A TODO to update api crc file comments.
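
The argument-list style described in the last bullets above (vertical lists, named arguments, trailing comma) can be illustrated as follows. The function `describe_trial` and its parameters are hypothetical and exist only to show the layout.

```python
def describe_trial(rate, duration, multiplier):
    """Format trial parameters (hypothetical helper for illustration)."""
    return f"rate={rate} duration={duration} multiplier={multiplier}"


# One argument per line, passed by name, with a comma after the last one.
# Renaming or reordering an argument then touches exactly one line.
message = describe_trial(
    rate=1000.0,
    duration=1.0,
    multiplier=2,
)
```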
Change-Id: I84729355edbec051298a9de1162107f88ff5737d
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
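
The transaction abstraction described in this message can be sketched in a few lines. This is an illustration under stated assumptions, not the TrafficGenerator code: the function names are hypothetical, and the max rate formula assumes the packet rate limit applies per direction.

```python
def stl_packet_rate(tps, bidirectional=True):
    """Convert transactions per second to packets per second for STL.

    Bidirectional traffic counts as 2 packets per transaction,
    unidirectional as 1, as stated in the commit message.
    """
    packets_per_transaction = 2 if bidirectional else 1
    return tps * packets_per_transaction


def max_transaction_rate(max_pps, packets_forward, packets_reverse):
    """Derive the maximal transaction rate from a packet rate limit.

    Only the direction carrying more packets per transaction is
    considered (assumption: max_pps is a per-direction limit).
    """
    return max_pps / max(packets_forward, packets_reverse)
```

For example, a hypothetical ASTF profile with 4 packets in one direction and 2 in the other is limited by the 4-packet direction.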
|
|
Replace the hacky grep of /etc/resolv.conf with default docker gateway
IP.
Change-Id: Iec3a4658826f2ba871acb14d511e9c79a1273290
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
|
|
One ThunderX2 9975 server (.69) was replaced with two ThunderX2 9980
(.70, .71) servers. Move the .69 server under ansible perf section in
anticipation of repurposing it for that purpose. Update the ansible
scripts with .70 and .71 config and rename port names in device.sh lib
to reflect the NIC differences between .69 and .70 (and .71).
Change-Id: I88b75648735243e5559175d3192ffcc8fc70071c
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
|
|
Change-Id: I3aa50ec1ef9b0445014daa31e767323060f4a03f
Signed-off-by: Jan Gelety <jgelety@cisco.com>
(cherry picked from commit d68be735d882bafcb672ebb27a66efbcabbeb02d)
Change-Id: Iad67c8445e18b22eccbea25d75b91827b398775f
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
Change-Id: Ie24184ca4ac2d6c7abc32f0f103e10bc402ad93b
Signed-off-by: Jan Gelety <jgelety@cisco.com>
(cherry picked from commit 61044d391d6e8d6b47d0d4f156071bd61cd278df)
Change-Id: I50e2a674784688e6eeea566fc2bc4d45a8ecfb8b
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
Change-Id: I1c638aef886bf37a9feb4a29e4949c7c8f19b717
Signed-off-by: Jan Gelety <jgelety@cisco.com>
(cherry picked from commit d99951620507d354c4803eb1ee26609d992b70b3)
Change-Id: Iaf5b4dffe603b0cf5cf0430fc6ca20acb7a01fda
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
The previous 60 Mpps still leads to ~30% duration stretching.
+ Add comment on why 36 Mpps was chosen as the new limit.
Change-Id: Ic11e8ece03939bdc8680cd7bc4122373583a2f17
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
(cherry picked from commit 34dadfe8d168b72340b497469ee6550349689f1a)
|
|
- Follow up https://jira.fd.io/browse/VPP-1934
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Id0a26c5f67f229480332530a8531401d954f4422
|
|
- some tests need to reduce rate for ramp-up phase
- some tests need to extend trial duration in ramp-up phase
- removed 2n1l-10ge2p1x710-ethip4udp-nat44det-h1-p63-s63 suite
as nat out ports are randomly selected from available port range
so T-Rex stateless is not able to provide required out2in traffic
Change-Id: I1145496610d202f81d911e68aa819844d7600918
Signed-off-by: Jan Gelety <jgelety@cisco.com>
(cherry picked from commit 3b408b7ea702dd3817442186035121fe862cbf7f)
Change-Id: I53da8c086373d06e0842e5563964d9287c0fa403
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
The avf tests work, but vfio-pci tests follow them, so the interfaces must
be in down state (ideally unbound, as VPP cannot pick them otherwise).
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I77af85ec4239059a5455ef68683ca129548bd7bd
|
|
Fix 1/3: Explicitly put PF interface up (this patch)
Fix 2/3: Done on TB
Fix 3/3: VPP bug displaying VF up if underlying PF is not up
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I45d66986ec76e6e14eebaad6828ef72724c626ab
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I8d6108af943d729fecbcfe4867ea820a69b4eb1e
|
|
VPP 20.05 added an async crypto engine that can use QAT hardware
for encryption and decryption; vnet/ipsec enables async mode to use this
engine.
The current async crypto engine also uses dpdk_cryptodev as its async handler;
in the future, other native QAT drivers may be added as async handlers.
Note that the async crypto engine supports vnet/ipsec. It differs from
the existing dpdk backend, which has its own ESP implementation
in plugins/dpdk/ipsec.
Change-Id: I4e6eaa7ca1eddb8b1c45212de0684fb26907119b
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
(cherry picked from commit d12f510caf3bb83695488684eb07de79b3e753b9)
|
|
Jira: CSIT-1755
Change-Id: I34baa22a49f44da3fa80d91fa2f4132c982fe610
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
+ Unifying code structures
- To easily plug another DUT
+ New PCI PassThrough templates
+ Improved perf stat on cores allocated in test.
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I325f17b977314f93cb91818feddfddf3e607eb8a
|
|
+ DPDK 20.08
+ Migrate make -> meson
+ Fix all trending issues
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I31dcb22627c0f8d17ec63c5b138a2da958b006f4
|
|
+ Bump T-Rex version. We need new features for ASTF test.
+ Apply core pinning. Results in a more stable performance.
+ Tweak the number of T-Rex workers.
+ We need an even value to achieve symmetric performance with pinning.
+ Value 8 was selected as a best compromise.
This is a combination of 3 commits.
This is the 1st commit message:
T-Rex: 2.82
This is the commit message #2:
Change Trex to CORE_MASK_PIN mode to improve performance
https://trex-tgn.cisco.com/trex/doc/trex_stateless.html#_core_masking_per_interface
The above link has the following explanation:
"When the profile is symmetric, performance can be improved by pinning half of
the cores to port 0, and half of the cores to port 1, thus avoiding cache
trashing and bouncing."
The reason for this change is that CSIT runs with a 100G NIC often failed with
"TRex stateless runtime error timeout", caused by TRex not being able to send
enough traffic within the fixed duration.
Changing to CORE_MASK_PIN mode fixes the issue.
Not editing ASTF, as that supports different options.
This is the commit message #3:
Experiment: Vary number of TRex workers
With CORE_MASK_PIN, we can get more predictable time distribution.
Decided to use 8 workers, that gives good results
both for high end (RDMA-core l2patch) and low end (vhost) tests.
Change-Id: I5c61127799e0624464e960fcb980ad1c4058e744
Signed-off-by: pmikus <pmikus@cisco.com>
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
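
Why pinning calls for an even worker count can be seen from a small sketch. The split below is purely illustrative: it only shows "half of the cores per port" from the quoted TRex documentation, and does not reproduce how TRex actually maps cores under CORE_MASK_PIN. The function name `split_workers` is hypothetical.

```python
def split_workers(worker_count):
    """Split worker indices into two equal halves, one per port.

    Illustrative only; the real core assignment is done inside TRex.
    """
    if worker_count % 2:
        raise ValueError("an even worker count is needed for a symmetric split")
    half = worker_count // 2
    workers = list(range(worker_count))
    return {0: workers[:half], 1: workers[half:]}
```

With the chosen value of 8 workers, each port gets 4 of them, so both directions have the same capacity.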
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I2781e85f44acffb4f8d7f02326ba2ca668dad0c5
|
|
- align CSIT code with VPP code changes for NAT44 deterministic
(DET44) feature
- align test names according to snat44ed tests
- remove obsolete 3-node nat tests
- remove 2n1l-10ge2p1x710-ethip4udp-snat44det-h1048576-p63-s66060288 tests
(not enough memory for such high number of sessions)
Change-Id: I9a22b99b4cfa56d18e9c7ef9c58296e202567d42
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
The old-style variables contain None in some places,
mainly in SRv6 proxy tests.
Change-Id: If3887a7dba051454c504b345a6a316d5d69d0139
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Change-Id: I74641cc89d2f25d50b67d51bf2567082b420aabb
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
For the Intel E810CQ 100G NIC, the PF kernel driver is ice and the VF kernel
driver is iavf; its VF hardware supports the VPP native avf driver.
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
Change-Id: Ic8d86e5ee00057bbbcd09df619a38bd1371c8fd7
|
|
- continuation of https://gerrit.fd.io/r/c/csit/+/26898, as the
limit of changes (1000) was reached
Jira: CSIT-1711
- udp synthetic profiles w/o data packets
- udp cps perf tests, phase I (no special "search cps" KW)
Phase I means that we use MRR tests to collect traffic data
until a new CPS test type with a corresponding algorithm is ready.
Change-Id: I0d30feb9ecf1d0bff937152656f8eb422f831378
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I20098fca8fb513accef3edc9a72bfd3c56bf9be2
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: I1b535ea61ab68f6e37989ffc942979cdfd24f55e
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Iab84aff31a23bb9d8e1165f5314004803fd8a501
|
|
+ Measure latency in 90/50/10/0% PDR loads in ndrpdr tests.
+ Do not measure latency anywhere else.
- Needs manual editing to re-enable in soak tests.
Change-Id: I69fa11bfcf71012f683061c5effea52a1be91620
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
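
The four latency loads can be derived from a PDR value as in the sketch below. The function name `latency_loads` is hypothetical, and the sketch only computes the nominal rates; any minimal rate the traffic generator needs for the 0% case is not modelled.

```python
def latency_loads(pdr_rate, percentages=(90, 50, 10, 0)):
    """Return the loads at which latency is measured in ndrpdr tests.

    Each load is the given percentage of the PDR rate found by the search.
    """
    return [pdr_rate * percent / 100.0 for percent in percentages]
```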
|
|
To avoid runs such as:
https://jenkins.fd.io/job/vpp-csit-verify-perf-master-3n-hsw/340/
Change-Id: I1b30d5f440ddf8ff32b11265b2ac2176f4b9a665
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
- provide base routines to run T-Rex in advanced stateful mode
Change-Id: Ib0dc5f2919c370753335f6446860683dc4b12d93
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
Change-Id: I3bbe1fe0073ddeead5219993675f24955e8c3dfd
Signed-off-by: Peter Mikus <pmikus@cisco.com>
|
|
- test/suite/global
- binary logic is not working
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ia3d81cbf2c5f04d1093a0a408c84a9ffc6f3eef0
|
|
Change-Id: I31c2d7744b5cd3021132fb188480b8edec74986c
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
+ Latency measurements need more than 9000 pps.
- Previously 0% measurement used 9500 pps.
Change-Id: Ic0841de096dfa8a61329f98aa1ba6d3f0ce60c66
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Change-Id: If0975b1d54882390c5be418927e2961d0f4c8429
Signed-off-by: Jan Gelety <jgelety@cisco.com>
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
VPP uses MAKE_PARALLEL_FLAGS or MAKE_PARALLEL_JOBS to limit the number
of cpus to use during build, so emit a line on stdout if it's used.
Change-Id: I669398d474d172abb6c848a45f24f1bdd56990d8
Signed-off-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
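
The check described in this message might be sketched as follows. It is a Python illustration rather than the actual build script, and both the function name `parallel_build_lines` and the wording of the emitted line are assumptions; only the two variable names come from the message.

```python
import os


def parallel_build_lines(env=None):
    """Return one log line per VPP parallel-build variable that is set.

    (Hypothetical helper; the message format is an assumption.)
    """
    env = os.environ if env is None else env
    lines = []
    for name in ("MAKE_PARALLEL_FLAGS", "MAKE_PARALLEL_JOBS"):
        value = env.get(name)
        if value:
            lines.append(f"Building VPP. Using {name}='{value}'")
    return lines


if __name__ == "__main__":
    for line in parallel_build_lines():
        print(line)
```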
|
|
Change-Id: If61783fb717757c6189f06924412bd079e15a08f
Signed-off-by: Tibor Frank <tifrank@cisco.com>
|
|
Change-Id: I6da359d25edc415e44263d3f85f166369e564987
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Change-Id: I636f020e97df1b37ac8b6a30af511eebe611b56f
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Previously, number of directions was not taken into account.
Also, ideally PLRsearch never reports value under the hard minimum,
so successful results are now required to be more than 10% better.
Change-Id: I8622726b97bd1da3e139c8044a2932837fc268b7
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
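
One reading of the check described in this message is sketched below. The exact formula is not given, so this assumes the hard minimum is specified per direction and the result is an aggregate rate; the function name and parameters are hypothetical.

```python
def result_is_acceptable(result_rate, min_rate_per_direction, directions=2):
    """Check that a PLRsearch result is more than 10% above the hard minimum.

    Assumption: the hard minimum is per direction, so it is scaled by the
    number of directions before the comparison.
    """
    hard_minimum = min_rate_per_direction * directions
    return result_rate > hard_minimum * 1.1
```

With a hypothetical minimum of 100 per direction and two directions, a result of 221 passes while 219 does not.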
|
|
Signed-off-by: pmikus <pmikus@cisco.com>
Change-Id: Ie88f0df239725a4de62d727e1923cdb3ad040809
|