Age | Commit message (Collapse) | Author | Files | Lines |
|
The old limit of 1522 was introduced long time ago.
First, it appeared here [0], where it is correct
as that suite has zero overhead.
The first suite with wrong logic seems to be here [1]
(no "Add No Multi Seg to all DUTs" in 1518B testcase).
And when I was moving that logic to a keyword in [2],
I did not realize it is wrong with overhead.
This Change uses 1800 as the new threshold,
matching the value used for non-jumbo MTU.
[0] https://gerrit.fd.io/r/c/csit/+/2652/12/tests/perf/Bridge_Domain_Intel-X520-DA2.robot#70
[1] https://gerrit.fd.io/r/c/csit/+/4454/96/tests/perf/40ge2p1xl710-ethip4ipsecscaleip4-ip4base-interfaces-aes-gcm-ndrpdrdisc.robot#229
[2] https://gerrit.fd.io/r/c/csit/+/13411/35/resources/libraries/robot/performance/performance_utils.robot#84
Change-Id: Iff3703fcff0e4bbb1a6b10be359fa5ef67fd5422
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
When changing MTU on a running VPP, the interface has to be down.
- Other plugins (rdma, avf, af_xdp) need vastly different logic,
so support for them will be added later.
+ Mlx5-core does not need to set MTU on Linux interface.
+ MTU setting now does not happen at final setting path up,
it happens in driver initialization layer instead
E.g. AVF tests will not attempt to change MTU.
+ MTU edit removed from some non-hardware interfaces (including memif)
e.g. bond interfaces. MTU on parent hw interface seems to be enough.
+ The non-jumbo MTU value used is 1800,
so 1518B tests with additional encapsulation can still work.
+ When VPP MTU setting fails, the failure is now propagated.
Previously, the failure was just logged and ignored,
but now there is no reason to hide it.
Ticket: CSIT-1797
Change-Id: I3b853f1faf90001d544cbbb87b2affbb882ffba0
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Previous change did not consider TCP_PPS and TCP_CPS robot tags
are used by two different suite types (hoststack or ASTF).
This fixes the unintended impact on hoststack.
+ Add HOSTSTACK tag to VSAP suites.
- They could also get VSAP tag, but not needed for this Change.
Fixes: 1daa6fdc0bae284dee1b61f34534e59b60b7526a
Change-Id: Ic583b5ae336c9b74794706fefc232f221a243c87
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
- No support for IMIX.
+ Fix a bad bug in padding (most ASTF profiles had wrong frame sizes).
+ Fix a big typo in TCP PPS profiles (s->c was not data, just RST).
+ Control transaction size via ASTF_N_DATA_FRAMES env variable.
- Default value 5 leads to transactions smaller than before.
+ It ensures transaction is one burst (per direction) even for jumbo.
+ Edit autogen to set supported frame sizes based on suite id.
+ Both TCP and UDP use the same values:
+ 64B for CPS (exact for UDP, nominal for TCP).
+ 100B, 1518B and 9000B for TPUT and PPS.
- TCP TPUT achievable minimum is 70B.
+ Used 100B to leave room for possible IPv6 ASTF tests.
+ Separate function for code reused by vpp and trex tests.
- I do not really like the new "copy and edit" approach added here.
+ But it is a quick edit, better autogen refactor is low priority.
+ Consider both established and transitory sessions as valid.
- Mostly for compatibility with 2202 behavior and to avoid ramp-ups.
- Assuming both session states have similar enough VPP CPU overhead.
+ Added a TODO to investigate and maybe reconsider later.
+ Update the state timeout value to 240s.
+ That is the default for TCP (for transitory state).
- UDP could keep using 300s.
+ But I prefer UDP and TCP to behave as similarly as possible.
+ Use TRex tunables to get the exact frame size (for data packets).
- It is not clear why the recipe for MSS has to be this complicated.
+ Move code away from profile init, as frame size is not known there.
+ Change internal profile API, so values related to MSS are passed.
+ Lower ramp-up rate for TCP TPUT tests.
+ Because without lower rate, jumbo fails on packet loss in ramp-up.
+ UDP TPUT ramp-up rate also lowered (just to keep suites more similar).
+ Distinguish one-direction and aggregated average frame size.
+ Update keyword documentation where the distiction matters.
+ One-direction is needed for turning bandwidth limit to TPS limit.
+ Aggregated is needed for correct NDRPDR bandwidth result value.
- TCP TPUT will always be few percent below bidirectional maximum.
+ That is unavoidable, as one direction sends more control packets.
+ Add runtime consistency checks so future refactors are safer.
+ Fail if padding requested would be negative.
+ Fail if suite claims unexpected values for packets per transaction.
+ Edit the 4 types of ASTF profiles to keep them similar to each other.
+ Move UDP TPUT limit value from a field back to direct argument.
+ Stop pretending first UDP packet is not data.
+ Apply small improvements where convenient.
+ Replace "aggregate" with "aggregated" where possible.
+ To lower probability of any future typos in variable names.
+ Avoid calling Set Numeric Frame Sizes twice.
+ Code formatting, keyword documentation, code comments, ...
+ Add TODOs for less important code quality improvements.
- Postpone updating of methodology pages to a subsequent change.
Change-Id: I4b381e5210e69669f972326202fdcc5a2c9c923b
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
+ Set overhead in those suites to numeric values.
+ Change the library to tolerate string representations anyway.
Change-Id: Ic6215840f7797801c994a38db5637999eb85a034
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
+ With ramp-up, without reset, with session verification.
+ Uses the same profile as pps tests.
+ Ramp up duration is not specified, as duration is computed.
+ Timeout tracking with automated ramp-up.
+ Correct computation of next duration.
+ Checking both early and late sessions.
+ No loss measurement also acts as a ramp-up.
+ Return ReceiveRateMeasurement from send_traffic_on_tg_internal,
as that is needed for detecting whether trial is ok as ramp up.
- Udp needs quite low ramp-up rate after recent regression.
- Max scale has higher rate (so failing) to avoid session timeouts.
+ Bump copyright year.
Change-Id: I50c928659cd5b985b490a2e5fb69c5cd790600b0
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|
|
Jira: CSIT-1768
Change-Id: I888ae1a5754fa07297d4cdf65c2be0e3e49d89a5
Signed-off-by: Jan Gelety <jgelety@cisco.com>
|
|
+ Add UDP_CPS, TCP_CPS, UDP_PPS and TCP_PPS suites.
+ Update existing cps traffic profiles.
+ Add missing traffic profiles.
+ UDP:
+ Single burst of 32 packets was confirmed as safe enough for TRex.
+ Maybe 64 could work, but not enough testing for that.
+ Multiple bursts have lead to reduced TRex performance,
as overlaping bursts (from different client instances)
tend to fill up the buffers.
+ TCP:
+ Data size set to 11111 bytes, completely arbitrarily.
+ Results look reasonable, so I have kept that.
- MSS not set at all
- No tested support for frame size other than 64B.
- Frame size does not even factor into TCP profiles.
+ So other frame sizes are skipped in autogen.
+ Update tags in related suites.
- HOSTS_{n} and SRC_USER_{n} should be unified.
- Questionable clarification on difference between IP4BASE and SCALE.
+ Add NAT state resetters to tests that need them.
+ Resetter is called (if set) before each measurement.
+ If ramp-up is detected, resetter is not set.
+ Rename "mult" argument to "multiplier".
+ Abstracted from packets to transactions.
+ Transaction corresponds to profile.
+ TRex multiplier argument sets target rate in transactions per second.
+ The familiar STL traffic:
+ Bidirectional is considered to be 2 packets per transaction.
+ Unidirectional is considered to be 1 packet per transaction.
+ The newer ASTF traffic:
+ 4 subtypes, each has different number of packets per transaction.
+ For max rate computation:
+ Packets in the more numerous direction are considered.
+ Rely on TRex reported traffic duration for ASTF:
+ Use the server side value.
- Client side value is higher by an overhead.
- TRex is not sending traffic during that time.
+ Remove delays from traffic profiles.
- Those delays would increase the reprted traffic time.
+ Support for scale lmited trials.
+ Only for ASTF profiles, each ASTF profile has limited scale.
+ Scale defined in suite variables.
+ For TRex to send all transactions provided duration value is ignored.
+ The appropriate value is computed in TrafficGenerator.
+ An ad-hoc time constant is added to match the TRex client side time overhead.
+ The profile driver receives the computed duration.
+ Measurement for PLRsearch add a sleep if the computed duration is smaller.
+ Alternative argument for search algos if scale is limited.
+ Both need higher timeout to accomodate big scales.
+ MLRsearch can afford fewer phases.
+ Added a parameter to optionally shorten the duration.
+ Use short duration for runtime stats trial and failure stats trial.
+ Use very large keepalive values in udp profiles to avoid ka packets.
+ No polling in ASTF profile driver.
- Polling could eliminate the time overhead value.
+ But polling proved to introduce some loss, affecting the results.
+ Handle duration stretching in ASTF by stopping traffic.
+ The stop has several steps so that:
+ The traffic is really stopped entirely.
+ Late packets do not count (maybe as errors).
+ Stats are preserved to read for results (and cleared afterwards).
+ Several quantities added to ReceiveRateMeasurement:
+ Original target duration is preserved (algos need that).
+ Input estimate (tps) for early search iterations.
+ Output estimate (maybe pps) for MRR output.
+ Strict result (unsent counts as loss) for NDR.
+ Use L2 counters (opackets, ipackets) where possible.
- TRex has trouble processing packets for the L7 ones at high loads.
+ Remove warmup from profile drivers and keywords.
+ Suites should call "Send ramp-up traffic" explicitly if needed.
+ Added parsing for few more counters.
+ Both to use in formulas or just for debug purposes.
- Only 64B cases in autogen, framesize support to be added later.
+ Latency streams during search can be enabled via PERF_USE_LATENCY env var.
+ MLRsearch improvments:
+ Rename argument names to min_rate and max_rate.
+ Use relative receive rate in initial phase.
+ PLRsearch improvements:
+ Careful computation when output (pps) does not match input (tps).
+ Use geometric distribution (instead of Poisson).
+ Helps agains math errors.
+ This should improve estimate stability.
- But in practice big losses still lead to significant jumps.
+ Traffic generator improvements:
+ send_traffic_on_tg now calls the full set_rate_provider_defaults.
+ _send_traffic_on_tg_internal for the logic without provider defaults.
+ As the internal function is re-used by measure() without affecting defaults.
+ Move _parse_traffic_results just before get_measurement_result.
+ As the latter uses fields set bu the former, it is now easier to read.
+ Multiple sources for approximate duration.
+ Tried from more precise to more available.
+ Includes logic for _pps tests (added in later change).
+ Move explicit type conversions to earlier occurences.
+ Profile driver output field uses semicolons to simplify parsing.
+ Performance Robot lib file split to several smaller ones.
+ performance_actions.robot:
+ Hosts Additional Statistics Action For * keywords.
+ performance_display.robot:
+ Hosts keyword for displaying and verifying results.
+ Change test message to use the correct unit (pps or cps).
+ performance_limits.robot renamed to performance_vars.robot
+ Added many keywords, mostly for accessing test variables.
+ Moved variables for Policer into a new keyword there.
+ Some keywords need sophisticated logic.
- Other are basically Get Variable Value.
+ But in future more logic can be added, without editing callers.
+ Documentation for the new keywords acts as a documentation for test variables.
+ performance_utils.robot has the rest.
+ Eliminated arguments if the value is in test variable.
+ Small improvements to documentation.
- Still not enough cleanup with respect to arguments and test variables.
+ Keywords are sorted alphabetically now in each one.
+ Suites:
+ Unified variables table:
+ No colons in comments.
+ ${n_hosts}, ${n_ports} and use them instead hardcoded numbers.
+ Add -cps to existing cps suite names.
+ Remove "trial data overwrite".
+ Compute max rate as in STL suites.
+ Each NAT suite has ip4base suite to compare results to.
- Those act as indirect TRex calibration.
- VPP does not lose packets in those.
+ Latency in ASTF suites is disabled hard.
- As we do not support latency in ASTF profiles yet.
+ Unidirectional tests governed by suite variable, not an argument.
+ Write long argument lists vertically.
+ Prefer to use argument names.
+ In Python, also the last argument is followed by comma.
+ It makes renaming and reordering easier.
+ Similarly applies to prints with long lists of values.
+ A TODO to update api crc file comments.
Change-Id: I84729355edbec051298a9de1162107f88ff5737d
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
|