diff --git a/docs/ietf/draft-ietf-bmwg-mlrsearch-07.txt b/docs/ietf/draft-ietf-bmwg-mlrsearch-07.txt
deleted file mode 100644
index c5c94410a8..0000000000
--- a/docs/ietf/draft-ietf-bmwg-mlrsearch-07.txt
+++ /dev/null
@@ -1,2800 +0,0 @@
-
-
-
-
-Benchmarking Working Group M. Konstantynowicz
-Internet-Draft V. Polak
-Intended status: Informational Cisco Systems
-Expires: 18 January 2025 18 July 2024
-
-
- Multiple Loss Ratio Search
- draft-ietf-bmwg-mlrsearch-07
-
-Abstract
-
- This document proposes extensions to [RFC2544] throughput search by
- defining a new methodology called Multiple Loss Ratio search
- (MLRsearch). MLRsearch aims to minimize search duration, support
- multiple loss ratio searches, and enhance result repeatability and
- comparability.
-
- The primary reason for extending [RFC2544] is to address the
- challenges and requirements presented by the evaluation and testing
- of software-based networking systems' data planes.
-
- To give users more freedom, MLRsearch provides additional
- configuration options such as allowing multiple short trials per load
- instead of one large trial, tolerating a certain percentage of trial
- results with higher loss, and supporting the search for multiple
- goals with varying loss ratios.
-
-Status of This Memo
-
- This Internet-Draft is submitted in full conformance with the
- provisions of BCP 78 and BCP 79.
-
- Internet-Drafts are working documents of the Internet Engineering
- Task Force (IETF). Note that other groups may also distribute
- working documents as Internet-Drafts. The list of current Internet-
- Drafts is at https://datatracker.ietf.org/drafts/current/.
-
- Internet-Drafts are draft documents valid for a maximum of six months
- and may be updated, replaced, or obsoleted by other documents at any
- time. It is inappropriate to use Internet-Drafts as reference
- material or to cite them other than as "work in progress."
-
- This Internet-Draft will expire on 18 January 2025.
-
-Copyright Notice
-
- Copyright (c) 2024 IETF Trust and the persons identified as the
- document authors. All rights reserved.
-
- This document is subject to BCP 78 and the IETF Trust's Legal
- Provisions Relating to IETF Documents (https://trustee.ietf.org/
- license-info) in effect on the date of publication of this document.
- Please review these documents carefully, as they describe your rights
- and restrictions with respect to this document. Code Components
- extracted from this document must include Revised BSD License text as
- described in Section 4.e of the Trust Legal Provisions and are
- provided without warranty as described in the Revised BSD License.
-
-Table of Contents
-
-   1. Purpose and Scope
-   2. Identified Problems
-      2.1. Long Search Duration
-      2.2. DUT in SUT
-      2.3. Repeatability and Comparability
-      2.4. Throughput with Non-Zero Loss
-      2.5. Inconsistent Trial Results
-   3. MLRsearch Specification
-      3.1. Overview
-      3.2. Measurement Quantities
-      3.3. Existing Terms
-         3.3.1. SUT
-         3.3.2. DUT
-         3.3.3. Trial
-      3.4. Trial Terms
-         3.4.1. Trial Duration
-         3.4.2. Trial Load
-         3.4.3. Trial Input
-         3.4.4. Traffic Profile
-         3.4.5. Trial Forwarding Ratio
-         3.4.6. Trial Loss Ratio
-         3.4.7. Trial Forwarding Rate
-         3.4.8. Trial Effective Duration
-         3.4.9. Trial Output
-         3.4.10. Trial Result
-      3.5. Goal Terms
-         3.5.1. Goal Final Trial Duration
-         3.5.2. Goal Duration Sum
-         3.5.3. Goal Loss Ratio
-         3.5.4. Goal Exceed Ratio
-         3.5.5. Goal Width
-         3.5.6. Search Goal
-         3.5.7. Controller Input
-      3.6. Search Goal Examples
-         3.6.1. RFC2544 Goal
-         3.6.2. TST009 Goal
-      3.7. Result Terms
-         3.7.1. Relevant Upper Bound
-         3.7.2. Relevant Lower Bound
-         3.7.3. Conditional Throughput
-         3.7.4. Goal Result
-         3.7.5. Search Result
-         3.7.6. Controller Output
-      3.8. MLRsearch Architecture
-         3.8.1. Measurer
-         3.8.2. Controller
-         3.8.3. Manager
-      3.9. Implementation Compliance
-   4. Additional Considerations
-      4.1. MLRsearch Versions
-      4.2. Stopping Conditions
-      4.3. Load Classification
-      4.4. Loss Ratios
-      4.5. Loss Inversion
-      4.6. Exceed Ratio
-      4.7. Duration Sum
-      4.8. Short Trials
-      4.9. Throughput
-      4.10. Search Time
-      4.11. RFC2544 Compliance
-   5. Logic of Load Classification
-      5.1. Introductory Remarks
-      5.2. Performance Spectrum
-         5.2.1. First Example
-         5.2.2. Second Example
-         5.2.3. Third Example
-         5.2.4. Summary
-      5.3. Trials with Single Duration
-      5.4. Trials with Short Duration
-         5.4.1. Scenarios
-         5.4.2. Classification Logic
-      5.5. Trials with Longer Duration
-   6. IANA Considerations
-   7. Security Considerations
-   8. Acknowledgements
-   9. Appendix A: Load Classification
-   10. Appendix B: Conditional Throughput
-   11. References
-      11.1. Normative References
-      11.2. Informative References
-   Authors' Addresses
-
-1. Purpose and Scope
-
- The purpose of this document is to describe Multiple Loss Ratio
- search (MLRsearch), a data plane throughput search methodology
- optimized for software networking DUTs.
-
- Applying vanilla [RFC2544] throughput bisection to software DUTs
- results in several problems:
-
- * Binary search takes too long as most trials are done far from the
- eventually found throughput.
-
- * The required final trial duration and pauses between trials
- prolong the overall search duration.
-
- * Software DUTs show noisy trial results, leading to a big spread of
- possible discovered throughput values.
-
- * Throughput requires a loss of exactly zero frames, but the
- industry frequently allows for small but non-zero losses.
-
- * The definition of throughput is not clear when trial results are
- inconsistent.
-
- To address the problems mentioned above, the MLRsearch test
- methodology specification employs the following enhancements:
-
- * Allow multiple short trials instead of one big trial per load.
-
- - Optionally, tolerate a percentage of trial results with higher
- loss.
-
- * Allow searching for multiple Search Goals, with differing loss
- ratios.
-
- - Any trial result can affect each Search Goal in principle.
-
-   * Insert multiple coarse targets for each Search Goal; earlier ones
-     spend less time on trials.
-
-     - Earlier targets also aim for lesser precision.
-
-     - Use Forwarding Rate (FR) at maximum offered load [RFC2285]
-       (section 3.6.2) to initialize the earliest targets.
-
- * Take care when dealing with inconsistent trial results.
-
- - Reported throughput is smaller than the smallest load with high
- loss.
-
- - Smaller load candidates are measured first.
-
- * Apply several load selection heuristics to save even more time by
- trying hard to avoid unnecessarily narrow bounds.
-
-   Some of these enhancements are formalized in the MLRsearch
-   specification; the remaining enhancements are treated as
-   implementation details, thus achieving high comparability without
-   limiting future improvements.
-
- MLRsearch configuration options are flexible enough to support both
- conservative settings and aggressive settings. The conservative
- settings lead to results unconditionally compliant with [RFC2544],
- but longer search duration and worse repeatability. Conversely,
- aggressive settings lead to shorter search duration and better
- repeatability, but the results are not compliant with [RFC2544].
-
- No part of [RFC2544] is intended to be obsoleted by this document.
-
-2. Identified Problems
-
- This chapter describes the problems affecting usability of various
- performance testing methodologies, mainly a binary search for
- [RFC2544] unconditionally compliant throughput.
-
-2.1. Long Search Duration
-
- The emergence of software DUTs, with frequent software updates and a
- number of different frame processing modes and configurations, has
- increased both the number of performance tests required to verify the
- DUT update and the frequency of running those tests. This makes the
- overall test execution time even more important than before.
-
- The current [RFC2544] throughput definition restricts the potential
- for time-efficiency improvements. A more generalized throughput
- concept could enable further enhancements while maintaining the
- precision of simpler methods.
-
- The bisection method, when unconditionally compliant with [RFC2544],
- is excessively slow. This is because a significant amount of time is
- spent on trials with loads that, in retrospect, are far from the
- final determined throughput.
-
-
-   [RFC2544] does not specify any stopping condition for throughput
-   search, so users already have access to a limited trade-off between
-   search duration and achieved precision.  However, each full
-   60-second trial merely doubles the precision (halves the uncertainty
-   interval), so not many trials can be removed without a substantial
-   loss of precision.
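-
-   As an informal illustration of this trade-off (not part of any
-   specification), the following Python sketch relates the number of
-   full-length bisection trials to the achieved precision; the load
-   values are assumptions of this sketch:
-
-      def bisection_width(max_load: float, trials: int) -> float:
-          """Width of the remaining uncertainty interval.
-
-          Each trial halves the interval, so each removed trial
-          doubles the remaining uncertainty.
-          """
-          return max_load / (2 ** trials)
-
-      # Example: searching below 20 Mpps, 11 trials (11 minutes of
-      # traffic) narrow the interval to roughly 10 kpps; removing
-      # one trial doubles that to roughly 20 kpps.
-      assert bisection_width(20e6, 10) == 2 * bisection_width(20e6, 11)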
-
-2.2. DUT in SUT
-
-   [RFC2285] defines:
-
-   * DUT as: The network forwarding device to which stimulus is offered
-     and response measured [RFC2285] (section 3.1.1).
-
-   * SUT as: The collective set of network devices to which stimulus is
-     offered as a single entity and response measured [RFC2285]
-     (section 3.1.2).
-
- [RFC2544] specifies a test setup with an external tester stimulating
- the networking system, treating it either as a single DUT, or as a
- system of devices, an SUT.
-
-   In the case of software networking, the SUT consists not only of the
-   DUT as a software program processing frames, but also of server
-   hardware and operating system functions, with the server's hardware
-   resources shared across all programs, including the operating
-   system.
-
- Given that the SUT is a shared multi-tenant environment encompassing
- the DUT and other components, the DUT might inadvertently experience
- interference from the operating system or other software operating on
- the same server.
-
- Some of this interference can be mitigated. For instance, pinning
- DUT program threads to specific CPU cores and isolating those cores
- can prevent context switching.
-
- Despite taking all feasible precautions, some adverse effects may
- still impact the DUT's network performance. In this document, these
- effects are collectively referred to as SUT noise, even if the
- effects are not as unpredictable as what other engineering
- disciplines call noise.
-
-   The DUT can also exhibit fluctuating performance itself, for reasons
-   unrelated to the rest of the SUT, for example due to pauses in
-   execution needed for internal stateful processing.  In many cases
-   this may be expected, by-design behavior, as it would be observable
-   even in a hypothetical scenario where all sources of SUT noise are
-   eliminated.  Such behavior affects trial results in a way similar to
-   SUT noise.  As the two phenomena are hard to distinguish, in this
-   document the term 'noise' is used to encompass both the internal
-   performance fluctuations of the DUT and the genuine noise of the
-   SUT.
-
- A simple model of SUT performance consists of an idealized noiseless
- performance, and additional noise effects. For a specific SUT, the
- noiseless performance is assumed to be constant, with all observed
- performance variations being attributed to noise. The impact of the
- noise can vary in time, sometimes wildly, even within a single trial.
-   The noise can sometimes be negligible, but frequently it lowers the
-   SUT performance as observed in trial results.
-
-   In this model, the SUT does not have a single performance value; it
-   has a spectrum.  One end of the spectrum is the idealized noiseless
-   performance value; the other end can be called the noiseful
-   performance.  In practice, a trial result close to the noiseful end
-   of the spectrum occurs only rarely.  The worse the performance value
-   is, the more rarely it is seen in a trial.  Therefore, the extreme
-   noiseful end of the SUT spectrum is not observable among trial
-   results.  The extreme noiseless end of the SUT spectrum is also
-   unlikely to be observable, this time because some small noise
-   effects are likely to occur multiple times during a trial.
-
- Unless specified otherwise, this document's focus is on the
- potentially observable ends of the SUT performance spectrum, as
- opposed to the extreme ones.
-
- When focusing on the DUT, the benchmarking effort should ideally aim
- to eliminate only the SUT noise from SUT measurements. However, this
- is currently not feasible in practice, as there are no realistic
- enough models available to distinguish SUT noise from DUT
- fluctuations, based on authors' experience and available literature.
-
- Assuming a well-constructed SUT, the DUT is likely its primary
- performance bottleneck. In this case, we can define the DUT's ideal
- noiseless performance as the noiseless end of the SUT performance
- spectrum, especially for throughput. However, other performance
- metrics, such as latency, may require additional considerations.
-
- Note that by this definition, DUT noiseless performance also
- minimizes the impact of DUT fluctuations, as much as realistically
- possible for a given trial duration.
-
- MLRsearch methodology aims to solve the DUT in SUT problem by
- estimating the noiseless end of the SUT performance spectrum using a
- limited number of trial results.
-
- Any improvements to the throughput search algorithm, aimed at better
- dealing with software networking SUT and DUT setup, should employ
- strategies recognizing the presence of SUT noise, allowing the
- discovery of (proxies for) DUT noiseless performance at different
- levels of sensitivity to SUT noise.
-
-2.3. Repeatability and Comparability
-
-   [RFC2544] does not suggest repeating the throughput search, and from
-   just one discovered throughput value, it cannot be determined how
-   repeatable that value is.  Poor repeatability then leads to poor
-   comparability, as different benchmarking teams may obtain varying
-   throughput values for the same SUT, exceeding the expected
-   differences from search precision.
-
-   [RFC2544] throughput requirements (a 60-second trial and zero
-   tolerance of frame loss) affect the throughput results in the
-   following way.  The SUT behavior close to the noiseful end of its
-   performance spectrum consists of rare occasions of significantly low
-   performance, but the long trial duration makes those occasions not
-   so rare at the trial level.  Therefore, binary search results tend
-   to wander away from the noiseless end of the SUT performance
-   spectrum, more frequently and more widely than they would with short
-   trials, thus causing poor throughput repeatability.
-
- The repeatability problem can be addressed by defining a search
- procedure that identifies a consistent level of performance, even if
- it does not meet the strict definition of throughput in [RFC2544].
-
- According to the SUT performance spectrum model, better repeatability
- will be at the noiseless end of the spectrum. Therefore, solutions
- to the DUT in SUT problem will help also with the repeatability
- problem.
-
- Conversely, any alteration to [RFC2544] throughput search that
- improves repeatability should be considered as less dependent on the
- SUT noise.
-
- An alternative option is to simply run a search multiple times, and
- report some statistics (e.g. average and standard deviation). This
- can be used for a subset of tests deemed more important, but it makes
- the search duration problem even more pronounced.
-
-2.4. Throughput with Non-Zero Loss
-
- [RFC1242] (section 3.17 Throughput) defines throughput as: The
- maximum rate at which none of the offered frames are dropped by the
- device.
-
- Then, it says: Since even the loss of one frame in a data stream can
- cause significant delays while waiting for the higher level protocols
- to time out, it is useful to know the actual maximum data rate that
- the device can support.
-
- However, many benchmarking teams accept a small, non-zero loss ratio
- as the goal for their load search.
-
- Motivations are many:
-
- * Modern protocols tolerate frame loss better, compared to the time
- when [RFC1242] and [RFC2544] were specified.
-
- * Trials nowadays send way more frames within the same duration,
- increasing the chance of a small SUT performance fluctuation being
- enough to cause frame loss.
-
-   * Small bursts of frame loss caused by noise have a smaller impact
-     on the average frame loss ratio observed in the trial, as during
-     other parts of the same trial the SUT may work more closely to its
-     noiseless performance, thus perhaps lowering the Trial Loss Ratio
-     below the Goal Loss Ratio value.
-
- * If an approximation of the SUT noise impact on the Trial Loss
- Ratio is known, it can be set as the Goal Loss Ratio.
-
- Regardless of the validity of all similar motivations, support for
- non-zero loss goals makes any search algorithm more user-friendly.
- [RFC2544] throughput is not user-friendly in this regard.
-
- Furthermore, allowing users to specify multiple loss ratio values,
- and enabling a single search to find all relevant bounds,
- significantly enhances the usefulness of the search algorithm.
-
- Searching for multiple Search Goals also helps to describe the SUT
- performance spectrum better than the result of a single Search Goal.
- For example, the repeated wide gap between zero and non-zero loss
- loads indicates the noise has a large impact on the observed
- performance, which is not evident from a single goal load search
- procedure result.
-
- It is easy to modify the vanilla bisection to find a lower bound for
- the intended load that satisfies a non-zero Goal Loss Ratio. But it
- is not that obvious how to search for multiple goals at once, hence
- the support for multiple Search Goals remains a problem.
-
-2.5. Inconsistent Trial Results
-
- While performing throughput search by executing a sequence of
- measurement trials, there is a risk of encountering inconsistencies
- between trial results.
-
-
-   The plain bisection never encounters inconsistent trials.  But
-   [RFC2544] hints at the possibility of inconsistent trial results in
-   two places in its text.  The first place is section 24, where full
-   trial durations are required, presumably because they can be
-   inconsistent with the results from short trial durations.  The
-   second place is section 26.3, where two successive zero-loss trials
-   are recommended, presumably because after one zero-loss trial there
-   can be a subsequent inconsistent non-zero-loss trial.
-
- Examples include:
-
- * A trial at the same load (same or different trial duration)
- results in a different Trial Loss Ratio.
-
- * A trial at a higher load (same or different trial duration)
- results in a smaller Trial Loss Ratio.
-
- Any robust throughput search algorithm needs to decide how to
- continue the search in the presence of such inconsistencies.
- Definitions of throughput in [RFC1242] and [RFC2544] are not specific
- enough to imply a unique way of handling such inconsistencies.
-
- Ideally, there will be a definition of a new quantity which both
- generalizes throughput for non-zero-loss (and other possible
- repeatability enhancements), while being precise enough to force a
- specific way to resolve trial result inconsistencies. But until such
- a definition is agreed upon, the correct way to handle inconsistent
- trial results remains an open problem.
-
-3. MLRsearch Specification
-
-   This section describes the MLRsearch specification, including all
-   technical definitions needed to evaluate whether a particular test
-   procedure complies with it.
-
-3.1. Overview
-
- MLRsearch specification describes a set of abstract system
- components, acting as functions with specified inputs and outputs.
-
- A test procedure is said to comply with MLRsearch specification if it
- can be conceptually divided into analogous components, each
- satisfying requirements for the corresponding MLRsearch component.
-
-
-   The Measurer component is tasked to perform trials; the Controller
-   component is tasked to select trial loads and durations; the Manager
-   component is tasked to pre-configure everything and to produce the
-   test report.  The test report explicitly states Search Goals (as the
-   Controller Inputs) and corresponding Goal Results (Controller
-   Outputs).
-
-   The Manager calls the Controller once; the Controller keeps calling
-   the Measurer until all stopping conditions are met.
-
- The part where Controller calls the Measurer is called the search.
- Any activity done by the Manager before it calls the Controller (or
- after Controller returns) is not considered to be part of the search.
-
-   MLRsearch specification prescribes regular search results and
-   recommends their stopping conditions.  Irregular search results are
-   also allowed; they may have different requirements and stopping
-   conditions.
-
-   Search results are based on load classification.  When measured
-   enough, any chosen load either achieves or fails each Search Goal,
-   thus becoming a lower or an upper bound for that goal.  When the
-   relevant bounds are at loads that are close enough (according to the
-   goal precision), the regular result is found.  The search stops when
-   all regular results are found (or when some goals are proven to have
-   only irregular results).
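-
-   As an informative illustration of this loop for a single zero-loss
-   goal, consider the following Python sketch.  The toy Measurer model,
-   the load values, and all names are assumptions of this sketch, not
-   part of the specification:
-
-      def toy_measurer(load: float, duration: float) -> float:
-          """Stand-in for one trial; returns the Trial Loss Ratio."""
-          return 0.0 if load <= 7_000_000 else 0.01  # toy SUT model
-
-      def controller(max_load: float, width: float):
-          """Classify loads until the bounds are within the width."""
-          lower, upper = 0.0, max_load
-          while upper - lower > width:
-              load = (lower + upper) / 2
-              if toy_measurer(load, duration=60.0) == 0.0:
-                  lower = load  # load achieved the goal: lower bound
-              else:
-                  upper = load  # load failed the goal: upper bound
-          return lower, upper  # the relevant bounds for this goal
-
-      print(controller(max_load=10e6, width=10e3))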
-
-3.2. Measurement Quantities
-
- MLRsearch specification uses a number of measurement quantities.
-
- In general, MLRsearch specification does not require particular units
- to be used, but it is REQUIRED for the test report to state all the
- units. For example, ratio quantities can be dimensionless numbers
- between zero and one, but may be expressed as percentages instead.
-
-   For convenience, a group of quantities can be treated as a composite
-   quantity.  One constituent of a composite quantity is called an
-   attribute, and a group of attribute values is called an instance of
-   that composite quantity.
-
-   Some attributes are not independent of others, and they can be
-   calculated from other attributes.  Such quantities are called
-   derived quantities.
-
-3.3. Existing Terms
-
- RFC 1242 "Benchmarking Terminology for Network Interconnect Devices"
- contains basic definitions, and RFC 2544 "Benchmarking Methodology
- for Network Interconnect Devices" contains discussions of a number of
- terms and additional methodology requirements. RFC 2285 adds more
-   terms and discussions, describing some known situations in a more
-   precise way.
-
- All three documents should be consulted before attempting to make use
- of this document.
-
- Definitions of some central terms are copied and discussed in
- subsections.
-
-3.3.1. SUT
-
- Defined in [RFC2285] (section 3.1.2 System Under Test (SUT)) as
- follows.
-
- Definition:
-
- The collective set of network devices to which stimulus is offered as
- a single entity and response measured.
-
- Discussion:
-
- An SUT consisting of a single network device is also allowed.
-
-3.3.2. DUT
-
- Defined in [RFC2285] (section 3.1.1 Device Under Test (DUT)) as
- follows.
-
- Definition:
-
- The network forwarding device to which stimulus is offered and
- response measured.
-
- Discussion:
-
- DUT, as a sub-component of SUT, is only indirectly mentioned in
- MLRsearch specification, but is of key relevance for its motivation.
-
-3.3.3. Trial
-
- A trial is the part of the test described in [RFC2544] (section 23.
- Trial description).
-
- Definition:
-
- A particular test consists of multiple trials. Each trial returns
- one piece of information, for example the loss rate at a particular
- input frame rate. Each trial consists of a number of phases:
-
- a) If the DUT is a router, send the routing update to the "input"
- port and pause two seconds to be sure that the routing has settled.
-
- b) Send the "learning frames" to the "output" port and wait 2 seconds
- to be sure that the learning has settled. Bridge learning frames are
- frames with source addresses that are the same as the destination
- addresses used by the test frames. Learning frames for other
- protocols are used to prime the address resolution tables in the DUT.
- The formats of the learning frame that should be used are shown in
- the Test Frame Formats document.
-
- c) Run the test trial.
-
- d) Wait for two seconds for any residual frames to be received.
-
- e) Wait for at least five seconds for the DUT to restabilize.
-
- Discussion:
-
-   The definition describes some traits; it is not clear whether all of
-   them are REQUIRED, or some of them are only RECOMMENDED.
-
- For the purposes of the MLRsearch specification, it is ALLOWED for
- the test procedure to deviate from the [RFC2544] description, but any
- such deviation MUST be made explicit in the test report.
-
- Trials are the only stimuli the SUT is expected to experience during
- the search.
-
- In some discussion paragraphs, it is useful to consider the traffic
- as sent and received by a tester, as implicitly defined in [RFC2544]
- (section 6. Test set up).
-
- An example of deviation from [RFC2544] is using shorter wait times.
-
-3.4. Trial Terms
-
-   This section defines new terms, and redefines some existing terms,
-   for quantities relevant as inputs or outputs of a trial, as used by
-   the Measurer component.
-
-3.4.1. Trial Duration
-
- Definition:
-
- Trial duration is the intended duration of the traffic for a trial.
-
- Discussion:
-
-   In general, this quantity does not include any preparation or
-   waiting described in [RFC2544] (section 23.  Trial description).
-
-   While any positive real value may be provided, some Measurer
-   implementations MAY limit the possible values, e.g. by rounding down
-   to the nearest integer number of seconds.  In that case, it is
-   RECOMMENDED to provide such limits to the Controller, so that the
-   Controller only proposes the accepted values.  Alternatively, the
-   test report MUST present the rounded values as Search Goal
-   attributes.
-
-3.4.2. Trial Load
-
- Definition:
-
-   The trial load is the intended load for a trial.
-
- Discussion:
-
-   For test report purposes, it is assumed that this is a constant load
-   by default.  It MAY be only an average load, e.g. when the traffic
-   is intended to be bursty, as suggested in [RFC2544] (section 21.
-   Bursty traffic), but the test report MUST explicitly mention how
-   non-constant the traffic is.
-
- Trial load is the quantity defined as Constant Load of [RFC1242]
- (section 3.4 Constant Load), Data Rate of [RFC2544] (section 14.
- Bidirectional traffic) and Intended Load of [RFC2285] (section 3.5.1
- Intended load (Iload)). All three definitions specify that this
- value applies to one (input or output) interface.
-
- For test report purposes, multi-interface aggregate load MAY be
- reported, this is understood as the same quantity expressed using
- different units. From the report it MUST be clear whether a
- particular trial load value is per one interface, or an aggregate
- over all interfaces.
-
- Similarly to trial duration, some Measurers may limit the possible
- values of trial load. Contrary to trial duration, the test report is
- NOT REQUIRED to document such behavior.
-
- It is ALLOWED to combine trial load and trial duration in a way that
- would not be possible to achieve using any integer number of data
- frames.
-
-3.4.3. Trial Input
-
- Definition:
-
- Trial Input is a composite quantity, consisting of two attributes:
- trial duration and trial load.
-
- Discussion:
-
- When talking about multiple trials, it is common to say "Trial
- Inputs" to denote all corresponding Trial Input instances.
-
- A Trial Input instance acts as the input for one call of the Measurer
- component.
-
- Contrary to other composite quantities, MLRsearch implementations are
- NOT ALLOWED to add optional attributes here. This improves
- interoperability between various implementations of the Controller
- and the Measurer.
-
-3.4.4. Traffic Profile
-
- Definition:
-
- Traffic profile is a composite quantity containing attributes other
- than trial load and trial duration, needed for unique determination
- of the trial to be performed.
-
- Discussion:
-
- All its attributes are assumed to be constant during the search, and
- the composite is configured on the Measurer by the Manager before the
- search starts. This is why the traffic profile is not part of the
- Trial Input.
-
- As a consequence, implementations of the Manager and the Measurer
- must be aware of their common set of capabilities, so that the
- traffic profile uniquely defines the traffic during the search. The
- important fact is that none of those capabilities have to be known by
- the Controller implementations.
-
- The traffic profile SHOULD contain some specific quantities, for
- example [RFC2544] (section 9. Frame sizes) governs data link frame
- size as defined in [RFC1242] (section 3.5 Data link frame size).
-
- Several more specific quantities may be RECOMMENDED, depending on
- media type. For example, [RFC2544] (Appendix C) lists frame formats
- and protocol addresses, as recommended from [RFC2544] (section 8.
- Frame formats) and [RFC2544] (section 12. Protocol addresses).
-
- Depending on SUT configuration, e.g. when testing specific protocols,
- additional attributes MUST be included in the traffic profile and in
- the test report.
-
- Example: [RFC8219] (section 5.3. Traffic Setup) introduces traffic
- setups consisting of a mix of IPv4 and IPv6 traffic - the implied
- traffic profile therefore must include an attribute for their
- percentage.
-
- Other traffic properties that need to be somehow specified in Traffic
- Profile include: [RFC2544] (section 14. Bidirectional traffic),
- [RFC2285] (section 3.3.3 Fully meshed traffic), and [RFC2544]
- (section 11. Modifiers).
-
-3.4.5. Trial Forwarding Ratio
-
- Definition:
-
- The trial forwarding ratio is a dimensionless floating point value.
-   The trial forwarding ratio is a dimensionless floating point value.
-   It MUST range between 0.0 and 1.0, both inclusive.  It is calculated
-   by dividing the number of frames successfully forwarded by the SUT
-   by the total number of frames expected to be forwarded during the
-   trial.
- Discussion:
-
- For most traffic profiles, "expected to be forwarded" means "intended
- to get transmitted from Tester towards SUT".
-
- Trial forwarding ratio MAY be expressed in other units (e.g. as a
- percentage) in the test report.
-
- Note that, contrary to loads, frame counts used to compute trial
- forwarding ratio are aggregates over all SUT output interfaces.
-
-   Questions about the correct number of frames that should have been
-   forwarded are generally outside the scope of this document.
-
-3.4.6. Trial Loss Ratio
-
- Definition:
-
- The Trial Loss Ratio is equal to one minus the trial forwarding
- ratio.
-
- Discussion:
-
- 100% minus the trial forwarding ratio, when expressed as a
- percentage.
-
- This is almost identical to Frame Loss Rate of [RFC1242] (section 3.6
- Frame Loss Rate), the only minor difference is that Trial Loss Ratio
- does not need to be expressed as a percentage.
-
-3.4.7. Trial Forwarding Rate
-
- Definition:
-
- The trial forwarding rate is a derived quantity, calculated by
- multiplying the trial load by the trial forwarding ratio.
-
- Discussion:
-
-   It is important to note that while similar, this quantity is not
-   identical to the Forwarding Rate as defined in [RFC2285] (section
-   3.6.1 Forwarding rate (FR)).  The latter is specific to one output
-   interface only, whereas the trial forwarding rate is based on frame
-   counts aggregated over all SUT output interfaces.
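-
-   As an informative numeric example (all values are illustrative
-   assumptions), the three trial quantities defined above relate as
-   follows:
-
-      frames_expected = 6_000_000    # aggregate, 100 kpps for 60 s
-      frames_forwarded = 5_994_000   # aggregate over all SUT outputs
-      trial_load = 100_000.0         # intended load, frames per second
-
-      trial_forwarding_ratio = frames_forwarded / frames_expected
-      trial_loss_ratio = 1.0 - trial_forwarding_ratio
-      trial_forwarding_rate = trial_load * trial_forwarding_ratio
-
-      print(trial_forwarding_ratio)  # 0.999
-      print(trial_loss_ratio)        # 0.001 (approximately, in floats)
-      print(trial_forwarding_rate)   # 99900.0 frames per second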
-
-3.4.8. Trial Effective Duration
-
- Definition:
-
- Trial effective duration is a time quantity related to the trial, by
- default equal to the trial duration.
-
- Discussion:
-
- This is an optional feature. If the Measurer does not return any
- trial effective duration value, the Controller MUST use the trial
- duration value instead.
-
- Trial effective duration may be any time quantity chosen by the
- Measurer to be used for time-based decisions in the Controller.
-
- The test report MUST explain how the Measurer computes the returned
- trial effective duration values, if they are not always equal to the
- trial duration.
-
-   This feature can be beneficial for users who wish to manage the
-   overall search duration, rather than solely the traffic portion of
-   it.  Simply measure the duration of the whole trial (including
-   waits) and use that as the trial effective duration.
-
- Also, this is a way for the Measurer to inform the Controller about
- its surprising behavior, for example when rounding the trial duration
- value.
-
-3.4.9. Trial Output
-
- Definition:
-
- Trial Output is a composite quantity. The REQUIRED attributes are
- Trial Loss Ratio, trial effective duration and trial forwarding rate.
-
- Discussion:
-
- When talking about multiple trials, it is common to say "Trial
- Outputs" to denote all corresponding Trial Output instances.
-
- Implementations may provide additional (optional) attributes. The
- Controller implementations MUST ignore values of any optional
- attribute they are not familiar with, except when passing Trial
- Output instance to the Manager.
-
-   Example of an optional attribute: the aggregate number of frames
-   expected to be forwarded during the trial, especially if it is not
-   just a (rounded-up) value implied by trial load and trial duration.
-
- While [RFC2285] (Section 3.5.2 Offered load (Oload)) requires the
- offered load value to be reported for forwarding rate measurements,
- it is NOT REQUIRED in MLRsearch specification.
-
-3.4.10. Trial Result
-
- Definition:
-
- Trial result is a composite quantity, consisting of the Trial Input
- and the Trial Output.
-
- Discussion:
-
- When talking about multiple trials, it is common to say "trial
- results" to denote all corresponding trial result instances.
-
- While implementations SHOULD NOT include additional attributes with
- independent values, they MAY include derived quantities.
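-
-   As an informative sketch only, the trial composites defined in this
-   section can be modeled as Python dataclasses.  Attribute names and
-   units (seconds, frames per second) are assumptions of this sketch:
-
-      from dataclasses import dataclass
-
-      @dataclass(frozen=True)
-      class TrialInput:
-          """Exactly two attributes; optional ones are not allowed."""
-          trial_duration: float  # seconds
-          trial_load: float      # frames per second, per interface
-
-      @dataclass(frozen=True)
-      class TrialOutput:
-          """The REQUIRED attributes; implementations MAY add more."""
-          trial_loss_ratio: float
-          trial_effective_duration: float  # seconds
-          trial_forwarding_rate: float     # frames per second
-
-      @dataclass(frozen=True)
-      class TrialResult:
-          """A Trial Input together with its Trial Output."""
-          trial_input: TrialInput
-          trial_output: TrialOutput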
-
-3.5. Goal Terms
-
-   This section defines new terms, and redefines some existing terms,
-   for quantities indirectly relevant for inputs or outputs of the
-   Controller component.
-
- Several goal attributes are defined before introducing the main
- component quantity: the Search Goal.
-
-3.5.1. Goal Final Trial Duration
-
- Definition:
-
- A threshold value for trial durations.
-
- Discussion:
-
- This attribute value MUST be positive.
-
- A trial with Trial Duration at least as long as the Goal Final Trial
- Duration is called a full-length trial (with respect to the given
- Search Goal).
-
- A trial that is not full-length is called a short trial.
-
- Informally, while MLRsearch is allowed to perform short trials, the
- results from such short trials have only limited impact on search
- results.
-
- One trial may be full-length for some Search Goals, but not for
- others.
-
-   The full relation of this goal to the Controller Output is defined
-   later in this document, in the subsections of the Goal Result
-   section.  For example, the Conditional Throughput for this goal is
-   computed only from full-length trial results.
-
-3.5.2. Goal Duration Sum
-
- Definition:
-
- A threshold value for a particular sum of trial effective durations.
-
- Discussion:
-
- This attribute value MUST be positive.
-
- Informally, even when looking only at full-length trials, MLRsearch
- may spend up to this time measuring the same load value.
-
- If the Goal Duration Sum is larger than the Goal Final Trial
- Duration, multiple full-length trials may need to be performed at the
- same load.
-
-   See the TST009 Goal example (Section 3.6.2) for a case where the
-   possibility of multiple full-length trials at the same load is
-   intended.
-
-   A Goal Duration Sum value lower than the Goal Final Trial Duration
-   (of the same goal) could save some search time, but is NOT
-   RECOMMENDED.  See the Relevant Upper Bound section for a partial
-   explanation.
-
-3.5.3. Goal Loss Ratio
-
- Definition:
-
- A threshold value for Trial Loss Ratios.
-
- Discussion:
-
- Attribute value MUST be non-negative and smaller than one.
-
-   A trial with a Trial Loss Ratio larger than the Goal Loss Ratio
-   value is called a lossy trial, with respect to the given Search
-   Goal.
-
- Informally, if a load causes too many lossy trials, the Relevant
- Lower Bound for this goal will be smaller than that load.
-
- If a trial is not lossy, it is called a low-loss trial, or
- (specifically for zero Goal Loss Ratio value) zero-loss trial.
-
-3.5.4. Goal Exceed Ratio
-
- Definition:
-
- A threshold value for a particular ratio of sums of Trial Effective
- Durations.
-
- Discussion:
-
- Attribute value MUST be non-negative and smaller than one.
-
-
-   See later sections for details on which sums.  Specifically, the
-   direct usage is only in Appendix A: Load Classification and
-   Appendix B: Conditional Throughput.  The impact of that usage is
-   discussed in the subsections leading to the Goal Result section.
-
-   Informally, the impact of lossy trials is controlled by this value.
-   Effectively, the Goal Exceed Ratio is the fraction of full-length
-   trials that may be lossy without the load being classified as the
-   Relevant Upper Bound.
-
-3.5.5. Goal Width
-
- Definition:
-
- A value used as a threshold for deciding whether two trial load
- values are close enough.
-
- Discussion:
-
- If present, the value MUST be positive.
-
- Informally, this acts as a stopping condition, controlling the
- precision of the search. The search stops if every goal has reached
- its precision.
-
- Implementations without this attribute MUST give the Controller other
- ways to control the search stopping conditions.
-
- Absolute load difference and relative load difference are two popular
- choices, but implementations may choose a different way to specify
- width.
-
- The test report MUST make it clear what specific quantity is used as
- Goal Width.
-
-   It is RECOMMENDED to set the Goal Width (as a relative difference)
-   to a value no smaller than the Goal Loss Ratio.  (The reason is not
-   obvious; see the Throughput section (Section 4.9) if interested.)
-
-3.5.6. Search Goal
-
- Definition:
-
- The Search Goal is a composite quantity consisting of several
- attributes, some of them are required.
-
-   Required attributes:
-
-   * Goal Final Trial Duration
-
-   * Goal Duration Sum
-
-   * Goal Loss Ratio
-
-   * Goal Exceed Ratio
-
-   Optional attribute:
-
-   * Goal Width
-
- Discussion:
-
- Implementations MAY add their own attributes. Those additional
- attributes may be required by the implementation even if they are not
- required by MLRsearch specification. But it is RECOMMENDED for those
- implementations to support missing values by computing reasonable
- defaults.
-
- The meaning of listed attributes is formally given only by their
- indirect effect on the search results.
-
- Informally, later sections provide additional intuitions and examples
- of the Search Goal attribute values.
-
- An example of additional attributes required by some implementations
- is Goal Initial Trial Duration, together with another attribute that
- controls possible intermediate Trial Duration values. The reasonable
- default in this case is using the Goal Final Trial Duration and no
- intermediate values.
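-
-   A minimal Python sketch of the Search Goal composite, assuming
-   seconds and dimensionless ratios as units (informative only;
-   implementations may add attributes and compute defaults):
-
-      from dataclasses import dataclass
-      from typing import Optional
-
-      @dataclass(frozen=True)
-      class SearchGoal:
-          goal_final_trial_duration: float    # seconds, positive
-          goal_duration_sum: float            # seconds, positive
-          goal_loss_ratio: float              # 0.0 <= value < 1.0
-          goal_exceed_ratio: float            # 0.0 <= value < 1.0
-          goal_width: Optional[float] = None  # optional attribute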
-
-3.5.7. Controller Input
-
- Definition:
-
- Controller Input is a composite quantity required as an input for the
- Controller. The only REQUIRED attribute is a list of Search Goal
- instances.
-
- Discussion:
-
- MLRsearch implementations MAY use additional attributes. Those
- additional attributes may be required by the implementation even if
- they are not required by MLRsearch specification.
-
- Formally, the Manager does not apply any Controller configuration
- apart from one Controller Input instance.
-
- For example, Traffic Profile is configured on the Measurer by the
- Manager (without explicit assistance of the Controller).
-
-
-   The order of Search Goal instances in a list SHOULD NOT have a big
-   impact on the Controller Output (see the Controller Output section),
-   but MLRsearch implementations MAY base their behavior on the order
-   of Search Goal instances in a list.
-
- An example of an optional attribute (outside the list of Search
- Goals) required by some implementations is Max Load. While this is a
- frequently used configuration parameter, already governed by
- [RFC2544] (section 20. Maximum frame rate) and [RFC2285] (3.5.3
- Maximum offered load (MOL)), some implementations may detect or
- discover it instead.
-
-   In the MLRsearch specification, the Relevant Upper Bound is added as
-   a required attribute precisely because it makes the search result
-   independent of the Max Load value.
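-
-   Building on the SearchGoal sketch shown earlier (informative only),
-   a hypothetical Controller Input with one example optional attribute
-   could look like this:
-
-      from dataclasses import dataclass
-      from typing import Optional, Tuple
-
-      @dataclass(frozen=True)
-      class ControllerInput:
-          # Search Goal instances: the only REQUIRED attribute.
-          search_goals: Tuple["SearchGoal", ...]
-          # Max Load: an example optional attribute.
-          max_load: Optional[float] = None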
-
-3.6. Search Goal Examples
-
-3.6.1. RFC2544 Goal
-
-   The following set of values makes the search result unconditionally
-   compliant with [RFC2544] (section 24.  Trial duration):
-
- * Goal Final Trial Duration = 60 seconds
-
- * Goal Duration Sum = 60 seconds
-
- * Goal Loss Ratio = 0%
-
- * Goal Exceed Ratio = 0%
-
-   The latter two attributes are enough to make the search goal
-   conditionally compliant; adding the first attribute makes it
-   unconditionally compliant.
-
- The second attribute (Goal Duration Sum) only prevents MLRsearch from
- repeating zero-loss full-length trials.
-
-   A non-zero exceed ratio could prolong the search and allow loss
-   inversion between a lower-load lossy short trial and a higher-load
-   full-length zero-loss trial.  From [RFC2544] alone, it is not clear
-   whether that higher load could be considered as compliant
-   throughput.
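-
-   Using the informative SearchGoal sketch from an earlier section,
-   this goal can be written as:
-
-      rfc2544_goal = SearchGoal(
-          goal_final_trial_duration=60.0,  # seconds
-          goal_duration_sum=60.0,          # one full-length trial
-          goal_loss_ratio=0.0,
-          goal_exceed_ratio=0.0,
-      )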
-
-3.6.2. TST009 Goal
-
-   One of the alternatives to RFC2544 is described in [TST009] (section
-   12.3.3 Binary search with loss verification).  The idea there is to
-   repeat lossy trials, hoping for zero loss on the second try, so the
-   results are closer to the noiseless end of the performance spectrum,
-   and are more repeatable and comparable.
-
- Only the variant with "z = infinity" is achievable with MLRsearch.
-
-   For example, for the "r = 2" variant, the following search goal
-   should be used:
-
- * Goal Final Trial Duration = 60 seconds
-
- * Goal Duration Sum = 120 seconds
-
- * Goal Loss Ratio = 0%
-
- * Goal Exceed Ratio = 50%
-
- If the first 60s trial has zero loss, it is enough for MLRsearch to
- stop measuring at that load, as even a second lossy trial would still
- fit within the exceed ratio.
-
-   But if the first trial is lossy, MLRsearch also needs to perform the
-   second trial to classify that load.  As the Goal Duration Sum is
-   twice as long as the Goal Final Trial Duration, a third full-length
-   trial is never needed.
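-
-   Using the same informative SearchGoal sketch, the "r = 2" variant
-   above can be written as:
-
-      tst009_goal = SearchGoal(
-          goal_final_trial_duration=60.0,  # seconds
-          goal_duration_sum=120.0,         # room for two trials
-          goal_loss_ratio=0.0,
-          goal_exceed_ratio=0.5,           # one of two may be lossy
-      )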
-
-3.7. Result Terms
-
- Before defining the output of the Controller, it is useful to define
- what the Goal Result is.
-
- The Goal Result is a composite quantity.
-
-   The following subsections define its attributes first, before
-   describing the Goal Result quantity itself.
-
- There is a correspondence between Search Goals and Goal Results.
- Most of the following subsections refer to a given Search Goal, when
- defining attributes of the Goal Result. Conversely, at the end of
- the search, each Search Goal has its corresponding Goal Result.
-
- Conceptually, the search can be seen as a process of load
- classification, where the Controller attempts to classify some loads
- as an Upper Bound or a Lower Bound with respect to some Search Goal.
-
- Before defining real attributes of the goal result, it is useful to
- define bounds in general.
-
-3.7.1. Relevant Upper Bound
-
- Definition:
-
-   The Relevant Upper Bound is the smallest trial load value that is
-   classified at the end of the search as an upper bound (see
-   Appendix A: Load Classification) for the given Search Goal.
-
- Discussion:
-
-   One Search Goal can have many different loads classified as an upper
-   bound.  At the end of the search, the smallest of those loads
-   becomes the Relevant Upper Bound for that goal.
-
- In more detail, the set of all trial outputs (both short and full-
- length, enough of them according to Goal Duration Sum) performed at
- that smallest load failed to uphold all the requirements of the given
- Search Goal, mainly the Goal Loss Ratio in combination with the Goal
- Exceed Ratio.
-
- If Max Load does not cause enough lossy trials, the Relevant Upper
- Bound does not exist. Conversely, if Relevant Upper Bound exists, it
- is not affected by Max Load value.
-
-3.7.2. Relevant Lower Bound
-
- Definition:
-
-   The Relevant Lower Bound is the largest trial load value among those
-   smaller than the Relevant Upper Bound that got classified at the end
-   of the search as a lower bound (see Appendix A: Load Classification)
-   for the given Search Goal.
-
- Discussion:
-
-   Only among the loads smaller than the Relevant Upper Bound can the
-   largest load become the Relevant Lower Bound.  When loss inversion
-   occurs, the stricter (smaller) upper bound matters.
-
- In more detail, the set of all trial outputs (both short and full-
- length, enough of them according to Goal Duration Sum) performed at
- that largest load managed to uphold all the requirements of the given
- Search Goal, mainly the Goal Loss Ratio in combination with the Goal
- Exceed Ratio.
-
-   If no load had enough low-loss trials, the Relevant Lower Bound MAY
-   not exist.
-
- Strictly speaking, if the Relevant Upper Bound does not exist, the
- Relevant Lower Bound also does not exist. In that case, Max Load is
- classified as a lower bound, but it is not clear whether a higher
- lower bound would be found if the search used a higher Max Load
- value.
-
- For a regular Goal Result, the distance between the Relevant Lower
- Bound and the Relevant Upper Bound MUST NOT be larger than the Goal
- Width, if the implementation offers width as a goal attribute.
-
-   Searching for another Search Goal may cause a loss inversion
-   phenomenon, where a lower load is classified as an upper bound, but
-   a higher load is also classified as a lower bound for the same
-   Search Goal.  The definition of the Relevant Lower Bound ignores
-   such high lower bounds.
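-
-   The following informative Python sketch shows how the two relevant
-   bounds could be selected from already classified loads, including
-   the loss inversion rule above (function and variable names are
-   assumptions of this sketch):
-
-      from typing import Optional, Tuple
-
-      def relevant_bounds(
-          upper_bounds: list,  # loads classified as upper bounds
-          lower_bounds: list,  # loads classified as lower bounds
-      ) -> Tuple[Optional[float], Optional[float]]:
-          """Return (Relevant Upper Bound, Relevant Lower Bound)."""
-          if not upper_bounds:
-              return None, None  # irregular: no upper bound exists
-          upper = min(upper_bounds)
-          # Loss inversion: ignore lower bounds above the strictest
-          # (smallest) upper bound.
-          candidates = [load for load in lower_bounds if load < upper]
-          lower = max(candidates) if candidates else None
-          return upper, lower
-
-      print(relevant_bounds([8e6, 9e6], [7e6, 8.5e6]))  # -> 8e6, 7e6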
-
-3.7.3. Conditional Throughput
-
- Definition:
-
-   The Conditional Throughput (see Appendix B: Conditional Throughput)
-   as evaluated at the Relevant Lower Bound of the given Search Goal at
-   the end of the search.
-
- Discussion:
-
- Informally, this is a typical trial forwarding rate, expected to be
- seen at the Relevant Lower Bound of the given Search Goal.
-
- But frequently it is only a conservative estimate thereof, as
- MLRsearch implementations tend to stop gathering more data as soon as
- they confirm the value cannot get worse than this estimate within the
- Goal Duration Sum.
-
-   This value is RECOMMENDED to be used when evaluating repeatability
-   and comparability of different MLRsearch implementations.
-
-3.7.4. Goal Result
-
- Definition:
-
- The Goal Result is a composite quantity consisting of several
- attributes. Relevant Upper Bound and Relevant Lower Bound are
- REQUIRED attributes, Conditional Throughput is a RECOMMENDED
- attribute.
-
- Discussion:
-
-   Depending on SUT behavior, it is possible that one or both relevant
-   bounds do not exist.  A Goal Result instance where the required
-   attribute values exist is informally called a Regular Goal Result
-   instance; otherwise, we say the goal reached an Irregular Goal
-   Result.
-
- A typical Irregular Goal Result is when all trials at the Max Load
- have zero loss, as the Relevant Upper Bound does not exist in that
- case.
-
-   It is RECOMMENDED that the test report display such results
-   appropriately, although MLRsearch specification does not prescribe
-   how.
-
-   Anything else regarding Irregular Goal Results, including their role
-   in the stopping conditions of the search, is outside the scope of
-   this document.
-
-3.7.5. Search Result
-
- Definition:
-
- The Search Result is a single composite object that maps each Search
- Goal instance to a corresponding Goal Result instance.
-
- Discussion:
-
- Alternatively, the Search Result can be implemented as an ordered
- list of the Goal Result instances, matching the order of Search Goal
- instances.
-
- The Search Result (as a mapping) MUST map from all the Search Goal
- instances present in the Controller Input.
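-
-   An informative Python sketch of the Goal Result, and of the Search
-   Result as a mapping, reusing the earlier SearchGoal sketch (a frozen
-   dataclass is hashable and can therefore serve as a mapping key):
-
-      from dataclasses import dataclass
-      from typing import Dict, Optional
-
-      @dataclass(frozen=True)
-      class GoalResult:
-          relevant_upper_bound: Optional[float]  # None if irregular
-          relevant_lower_bound: Optional[float]  # None if irregular
-          conditional_throughput: Optional[float] = None  # RECOMMENDED
-
-      # One Goal Result per Search Goal from the Controller Input,
-      # for example: search_result: Dict[SearchGoal, GoalResult]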
-
-3.7.6. Controller Output
-
- Definition:
-
- The Controller Output is a composite quantity returned from the
- Controller to the Manager at the end of the search. The Search
- Result instance is its only REQUIRED attribute.
-
-
- Discussion:
-
-   MLRsearch implementations MAY return additional data in the
-   Controller Output.
-
-3.8. MLRsearch Architecture
-
- MLRsearch architecture consists of three main system components: the
- Manager, the Controller, and the Measurer.
-
- The architecture also implies the presence of other components, such
- as the SUT and the Tester (as a sub-component of the Measurer).
-
- Protocols of communication between components are generally left
- unspecified. For example, when MLRsearch specification mentions
- "Controller calls Measurer", it is possible that the Controller
- notifies the Manager to call the Measurer indirectly instead. This
- way the Measurer implementations can be fully independent from the
- Controller implementations, e.g. programmed in different programming
- languages.
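-
-   An informative Python sketch of the three components as abstract
-   interfaces follows; the method names are assumptions of this sketch,
-   not prescribed by this specification:
-
-      from abc import ABC, abstractmethod
-
-      class AbstractMeasurer(ABC):
-          @abstractmethod
-          def measure(self, trial_input) -> "TrialOutput":
-              """Perform one trial, return its Trial Output."""
-
-      class AbstractController(ABC):
-          @abstractmethod
-          def search(self, controller_input,
-                     measurer: AbstractMeasurer) -> "ControllerOutput":
-              """Keep calling the Measurer, return Controller Output."""
-
-      class AbstractManager(ABC):
-          @abstractmethod
-          def run(self) -> None:
-              """Configure components, call the Controller, report."""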
-
-3.8.1. Measurer
-
- Definition:
-
-   The Measurer is an abstract system component that, when called with
-   a Trial Input instance, performs one Trial, and returns a Trial
-   Output instance.
-
- Discussion:
-
- This definition assumes the Measurer is already initialized. In
- practice, there may be additional steps before the search, e.g. when
- the Manager configures the traffic profile (either on the Measurer or
- on its tester sub-component directly) and performs a warmup (if the
- tester requires one).
-
- It is the responsibility of the Measurer implementation to uphold any
- requirements and assumptions present in MLRsearch specification, e.g.
- trial forwarding ratio not being larger than one.
-
- Implementers have some freedom. For example [RFC2544] (section 10.
- Verifying received frames) gives some suggestions (but not
- requirements) related to duplicated or reordered frames.
- Implementations are RECOMMENDED to document their behavior related to
- such freedoms in as detailed a way as possible.
-
-
- It is RECOMMENDED to benchmark the test equipment first, e.g. connect
- sender and receiver directly (without any SUT in the path), find a
- load value that guarantees the offered load is not too far from the
- intended load, and use that value as the Max Load value. When
- testing the real SUT, it is RECOMMENDED to turn any big difference
- between the intended load and the offered load into increased Trial
- Loss Ratio.
-
-   Neither of the two recommendations is made into a requirement,
-   because it is not easy to tell when the difference is big enough in
-   a way that would be disentangled from other Measurer freedoms.
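-
-   The following minimal sketch (an illustrative assumption, not a
-   normative interface) shows one possible Python shape of the
-   Measurer, with Trial Input and Trial Output reduced to the
-   attributes used in this document:
-
-   import abc
-   from dataclasses import dataclass
-
-   @dataclass
-   class TrialInput:
-       intended_duration: float  # seconds
-       intended_load: float      # frames per second
-
-   @dataclass
-   class TrialOutput:
-       trial_loss_ratio: float          # lost / offered frames
-       trial_effective_duration: float  # seconds, used in duration sums
-       trial_forwarding_rate: float     # frames per second
-
-   class AbstractMeasurer(abc.ABC):
-       @abc.abstractmethod
-       def measure(self, trial_input: TrialInput) -> TrialOutput:
-           """Perform one trial and report the measured quantities."""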
-
-3.8.2. Controller
-
- Definition:
-
-   The Controller is an abstract system component that, when called
-   with a Controller Input instance, repeatedly computes a Trial Input
-   instance for the Measurer, obtains the corresponding Trial Output
-   instances, and eventually returns a Controller Output instance.
-
- Discussion:
-
-   Informally, the Controller has considerable freedom in the selection
-   of Trial Inputs, and implementations aim to achieve the Search Goals
-   in the shortest expected time.
-
- The Controller's role in optimizing the overall search time
- distinguishes MLRsearch algorithms from simpler search procedures.
-
- Informally, each implementation can have different stopping
- conditions. Goal Width is only one example. In practice,
- implementation details do not matter, as long as Goal Results are
- regular.
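-
-   As a toy illustration of the call flow between the Controller and
-   the Measurer (a deliberate oversimplification, not how real
-   implementations select loads: single zero-loss goal, one
-   full-length trial per load, plain bisection), reusing the
-   TrialInput sketch from the Measurer section:
-
-   def toy_controller(measurer, max_load, goal_width, duration):
-       """Return (lower, upper) bounds no wider than goal_width."""
-       lower, upper = 0.0, max_load
-       while upper - lower > goal_width:
-           middle = (lower + upper) / 2.0
-           result = measurer.measure(TrialInput(duration, middle))
-           if result.trial_loss_ratio > 0.0:
-               upper = middle  # lossy trial: new upper bound
-           else:
-               lower = middle  # lossless trial: new lower bound
-       return lower, upper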
-
-3.8.3. Manager
-
- Definition:
-
-   The Manager is an abstract system component that is responsible
- configuring other components, calling the Controller component once,
- and for creating the test report following the reporting format as
- defined in [RFC2544] (section 26. Benchmarking tests).
-
- Discussion:
-
- The Manager initializes the SUT, the Measurer (and the Tester if
- independent) with their intended configurations before calling the
- Controller.
-
- The Manager does not need to be able to tweak any Search Goal
- attributes, but it MUST report all applied attribute values even if
- not tweaked.
-
- In principle, there should be a "user" (human or CI) that "starts" or
- "calls" the Manager and receives the report. The Manager MAY be able
-   to be called more than once this way.
-
-3.9. Implementation Compliance
-
-   Any networking measurement setup in which system components can be
-   logically delineated, and in which components satisfy the
-   requirements for the Measurer, the Controller and the Manager, is
-   considered to be compliant with the MLRsearch design.
-
- These components can be seen as abstractions present in any testing
- procedure. For example, there can be a single component acting both
- as the Manager and the Controller, but as long as values of required
- attributes of Search Goals and Goal Results are visible in the test
- report, the Controller Input instance and output instance are
- implied.
-
-   For example, any setup for conditionally (or unconditionally)
-   compliant [RFC2544] throughput testing can be understood as an
-   MLRsearch architecture, assuming there is enough data to reconstruct
-   the Relevant Upper Bound.
-
- See [RFC2544 Goal] (#RFC2544-Goal) subsection for equivalent Search
- Goal.
-
- Any test procedure that can be understood as (one call to the Manager
- of) MLRsearch architecture is said to be compliant with MLRsearch
- specification.
-
-4. Additional Considerations
-
- This section focuses on additional considerations, intuitions and
- motivations pertaining to MLRsearch methodology.
-
-4.1. MLRsearch Versions
-
-   The MLRsearch algorithm has been developed in a code-first approach:
-   a Python library was created, debugged, used in production and
-   published in PyPI before the first (even informal) descriptions were
-   published.
-
-   But the code (and hence the description) has evolved over time.
-   Multiple versions of the library were used over the past several
-   years, and later code was usually not compatible with earlier
-   descriptions.
-
- The code in (some version of) MLRsearch library fully determines the
- search process (for a given set of configuration parameters), leaving
- no space for deviations.
-
-   This historic meaning of MLRsearch, as a family of search algorithm
-   implementations, leaves plenty of space for future improvements, at
-   the cost of poor comparability of results between search algorithm
-   implementations.
-
- There are two competing needs. There is the need for standardization
- in areas critical to comparability. There is also the need to allow
- flexibility for implementations to innovate and improve in other
- areas. This document defines MLRsearch as a new specification in a
- manner that aims to fairly balance both needs.
-
-4.2. Stopping Conditions
-
- [RFC2544] prescribes that after performing one trial at a specific
- offered load, the next offered load should be larger or smaller,
- based on frame loss.
-
-   The usual implementation uses binary search. Here, a lossy trial
-   becomes a new upper bound, and a lossless trial becomes a new lower
-   bound. The span of values between the tightest lower bound and the
- tightest upper bound (including both values) forms an interval of
- possible results, and after each trial the width of that interval
- halves.
-
- Usually the binary search implementation tracks only the two tightest
- bounds, simply calling them bounds. But the old values still remain
- valid bounds, just not as tight as the new ones.
-
-   After some number of trials, the tightest lower bound becomes the
-   throughput. [RFC2544] does not specify when, if ever, the search
-   should stop.
-
- MLRsearch introduces a concept of [Goal Width] (#Goal-Width).
-
- The search stops when the distance between the tightest upper bound
- and the tightest lower bound is smaller than a user-configured value,
- called Goal Width from now on. In other words, the interval width at
- the end of the search has to be no larger than the Goal Width.
-
- This Goal Width value therefore determines the precision of the
- result. Due to the fact that MLRsearch specification requires a
- particular structure of the result (see [Trial Result] (#Trial-
- Result) section), the result itself does contain enough information
- to determine its precision, thus it is not required to report the
- Goal Width value.
-
- This allows MLRsearch implementations to use stopping conditions
- different from Goal Width.
-
-4.3. Load Classification
-
- MLRsearch keeps the basic logic of binary search (tracking tightest
- bounds, measuring at the middle), perhaps with minor technical
- differences.
-
-   The MLRsearch algorithm chooses an intended load (as opposed to the
-   offered load); the interval between bounds does not need to be split
-   exactly into two equal halves; and the final reported structure
-   specifies both bounds.
-
- The biggest difference is that to classify a load as an upper or
- lower bound, MLRsearch may need more than one trial (depending on
- configuration options) to be performed at the same intended load.
-
-   In consequence, even if a load already has a few trial results, it
-   may still be classified as undecided, being neither a lower bound
-   nor an upper bound.
-
- An explanation of the classification logic is given in the next
- section [Logic of Load Classification] (#Logic-of-Load-
- Classification), as it heavily relies on other subsections of this
- section.
-
- For repeatability and comparability reasons, it is important that
- given a set of trial results, all implementations of MLRsearch
- classify the load equivalently.
-
-4.4. Loss Ratios
-
- Another difference between MLRsearch and [RFC2544] binary search is
- in the goals of the search. [RFC2544] has a single goal, based on
- classifying full-length trials as either lossless or lossy.
-
- MLRsearch, as the name suggests, can search for multiple goals,
- differing in their loss ratios. The precise definition of the Goal
- Loss Ratio will be given later. The [RFC2544] throughput goal then
- simply becomes a zero Goal Loss Ratio. Different goals also may have
- different Goal Widths.
-
- A set of trial results for one specific intended load value can
- classify the load as an upper bound for some goals, but a lower bound
- for some other goals, and undecided for the rest of the goals.
-
- Therefore, the load classification depends not only on trial results,
- but also on the goal. The overall search procedure becomes more
- complicated, when compared to binary search with a single goal, but
- most of the complications do not affect the final result, except for
- one phenomenon, loss inversion.
-
-4.5. Loss Inversion
-
- In [RFC2544] throughput search using bisection, any load with a lossy
- trial becomes a hard upper bound, meaning every subsequent trial has
- a smaller intended load.
-
- But in MLRsearch, a load that is classified as an upper bound for one
- goal may still be a lower bound for another goal, and due to the
- other goal MLRsearch will probably perform trials at even higher
- loads. What to do when all such higher load trials happen to have
- zero loss? Does it mean the earlier upper bound was not real? Does
- it mean the later lossless trials are not considered a lower bound?
- Surely we do not want to have an upper bound at a load smaller than a
- lower bound.
-
- MLRsearch is conservative in these situations. The upper bound is
- considered real, and the lossless trials at higher loads are
- considered to be a coincidence, at least when computing the final
- result.
-
- This is formalized using new notions, the [Relevant Upper Bound]
- (#Relevant-Upper-Bound) and the [Relevant Lower Bound] (#Relevant-
- Lower-Bound). Load classification is still based just on the set of
- trial results at a given intended load (trials at other loads are
- ignored), making it possible to have a lower load classified as an
- upper bound, and a higher load classified as a lower bound (for the
- same goal). The Relevant Upper Bound (for a goal) is the smallest
- load classified as an upper bound. But the Relevant Lower Bound is
- not simply the largest among lower bounds. It is the largest load
- among loads that are lower bounds while also being smaller than the
- Relevant Upper Bound.
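-
-   A minimal sketch of selecting the relevant bounds, assuming the
-   per-load classifications are already known (the dict layout is
-   hypothetical, for illustration only):
-
-   def relevant_bounds(classified):
-       """classified: dict load -> "lower"/"upper"/"undecided"."""
-       uppers = [l for l, c in classified.items() if c == "upper"]
-       # The Relevant Upper Bound is the smallest upper bound load.
-       upper = min(uppers) if uppers else None
-       lowers = [
-           l for l, c in classified.items()
-           if c == "lower" and (upper is None or l < upper)
-       ]
-       # The Relevant Lower Bound is the largest lower bound load
-       # smaller than the Relevant Upper Bound.
-       lower = max(lowers) if lowers else None
-       return lower, upper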
-
- With these definitions, the Relevant Lower Bound is always smaller
- than the Relevant Upper Bound (if both exist), and the two relevant
- bounds are used analogously as the two tightest bounds in the binary
- search. When they are less than the Goal Width apart, the relevant
- bounds are used in the output.
-
- One consequence is that every trial result can have an impact on the
- search result. That means if your SUT (or your traffic generator)
- needs a warmup, be sure to warm it up before starting the search.
-
-4.6. Exceed Ratio
-
- The idea of performing multiple trials at the same load comes from a
- model where some trial results (those with high loss) are affected by
- infrequent effects, causing poor repeatability of [RFC2544]
- throughput results. See the discussion about noiseful and noiseless
- ends of the SUT performance spectrum in section [DUT in SUT] (#DUT-
- in-SUT). Stable results are closer to the noiseless end of the SUT
- performance spectrum, so MLRsearch may need to allow some frequency
- of high-loss trials to ignore the rare but big effects near the
- noiseful end.
-
-   MLRsearch can do such trial result filtering, but it needs a
-   configuration option to tell it how frequent the infrequent big loss
-   can be. This option is called the exceed ratio. It tells MLRsearch
- what ratio of trials (more exactly what ratio of trial seconds) can
- have a [Trial Loss Ratio] (#Trial-Loss-Ratio) larger than the Goal
- Loss Ratio and still be classified as a lower bound. Zero exceed
- ratio means all trials have to have a Trial Loss Ratio equal to or
- smaller than the Goal Loss Ratio.
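-
-   A small sketch (illustrative, not normative) of the quantity the
-   exceed ratio is compared against, computed from trial seconds:
-
-   def load_exceed_ratio(trials, goal_loss_ratio):
-       """trials: list of (duration, loss_ratio) pairs at one load."""
-       bad = sum(d for d, lr in trials if lr > goal_loss_ratio)
-       total = sum(d for d, lr in trials)
-       # Compare this value against the configured exceed ratio.
-       return bad / total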
-
- For explainability reasons, the RECOMMENDED value for exceed ratio is
- 0.5, as it simplifies some later concepts by relating them to the
- concept of median.
-
-4.7. Duration Sum
-
- When more than one trial is intended to classify a load, MLRsearch
- also needs something that controls the number of trials needed.
- Therefore, each goal also has an attribute called duration sum.
-
- The meaning of a [Goal Duration Sum] (#Goal-Duration-Sum) is that
- when a load has (full-length) trials whose trial durations when
- summed up give a value at least as big as the Goal Duration Sum
- value, the load is guaranteed to be classified either as an upper
- bound or a lower bound for that goal.
-
- Due to the fact that the duration sum has a big impact on the overall
- search duration, and [RFC2544] prescribes wait intervals around trial
- traffic, the MLRsearch algorithm is allowed to sum durations that are
- different from the actual trial traffic durations.
-
- In the MLRsearch specification, the different duration values are
- called [Trial Effective Duration] (#Trial-Effective-Duration).
-
-4.8. Short Trials
-
- MLRsearch requires each goal to specify its final trial duration.
- Full-length trial is a shorter name for a trial whose intended trial
- duration is equal to (or longer than) the goal final trial duration.
-
- Section 24 of [RFC2544] already anticipates possible time savings
- when short trials (shorter than full-length trials) are used. Full-
- length trials are the opposite of short trials, so they may also be
- called long trials.
-
- Any MLRsearch implementation may include its own configuration
- options which control when and how MLRsearch chooses to use short
- trial durations.
-
- For explainability reasons, when exceed ratio of 0.5 is used, it is
- recommended for the Goal Duration Sum to be an odd multiple of the
- full trial durations, so Conditional Throughput becomes identical to
- a median of a particular set of trial forwarding rates.
-
- The presence of short trial results complicates the load
- classification logic.
-
- Full details are given later in section [Logic of Load
- Classification] (#Logic-of-Load-Classification). In a nutshell,
- results from short trials may cause a load to be classified as an
- upper bound. This may cause loss inversion, and thus lower the
- Relevant Lower Bound, below what would classification say when
- considering full-length trials only.
-
-4.9. Throughput
-
- Due to the fact that testing equipment takes the intended load as an
- input parameter for a trial measurement, any load search algorithm
- needs to deal with intended load values internally.
-
- But in the presence of goals with a non-zero loss ratio, the intended
- load usually does not match the user's intuition of what a throughput
- is. The forwarding rate (as defined in [RFC2285] section 3.6.1) is
- better, but it is not obvious how to generalize it for loads with
- multiple trial results and a non-zero [Goal Loss Ratio] (#Goal-Loss-
- Ratio).
-
-   The best example is also the main motivation: hard limit
-   performance. Even if the medium allows higher performance, the SUT
-   interfaces may have their own additional limitations, e.g. a
-   specific fps limit on the NIC (a very common occurrence).
-
-   Ideally, those should be known and used when computing Max Load.
-   But if Max Load is higher than what the interface can receive or
-   transmit, there will be a "hard limit" observed in trial results.
-   Imagine the hard limit is at 100 Mfps, Max Load is higher, and the
-   Goal Loss Ratio is 0.5%. If the DUT has no additional losses, a
-   0.5% loss ratio will be achieved at 100.5025 Mfps (the Relevant
-   Lower Bound). But it is not intuitive to report SUT performance as
-   a value that is larger than the known hard limit. We need a
-   generalization of the [RFC2544] throughput, different from just the
-   Relevant Lower Bound.
-
- MLRsearch defines one such generalization, called the Conditional
- Throughput. It is the trial forwarding rate from one of the trials
- performed at the load in question. Determining which trial exactly
- is defined in [MLRsearch Specification] (#MLRsearch-Specification),
- and in [Appendix B: Conditional Throughput] (#Appendix-B:-
- Conditional-Throughput).
-
-   In the hard limit example, the 100.5 Mfps load will still have only
-   a 100.0 Mfps forwarding rate, nicely confirming the known
-   limitation.
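-
-   The arithmetic of this illustrative example can be checked
-   directly:
-
-   hard_limit = 100.0       # Mfps, the SUT cannot forward more
-   goal_loss_ratio = 0.005  # 0.5%
-   # Load at which the hard limit alone causes exactly 0.5% loss:
-   relevant_lower_bound = hard_limit / (1.0 - goal_loss_ratio)
-   # relevant_lower_bound is about 100.5025 Mfps, yet the trial
-   # forwarding rate observed there stays at the 100.0 Mfps limit:
-   forwarding_rate = relevant_lower_bound * (1.0 - goal_loss_ratio)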
-
-   Conditional Throughput is partially related to load classification.
-   If a load is classified as a lower bound for a goal, the Conditional
-   Throughput can be calculated from trial results, and is guaranteed
-   to show a loss ratio no larger than the Goal Loss Ratio.
-
-   Note that when comparing the best case (all trials with zero loss)
-   and the worst case (all trials with loss just below the Goal Loss
-   Ratio), the same Relevant Lower Bound value may result in
-   Conditional Throughput values differing by up to the Goal Loss
-   Ratio.
-
-   Therefore, it is rarely useful to set the Goal Width (if expressed
-   as the relative difference of loads) below the Goal Loss Ratio.
-   Moreover, setting the Goal Width below the Goal Loss Ratio may cause
-   the Conditional Throughput for a goal with a larger Goal Loss Ratio
-   to become smaller than the Conditional Throughput for a goal with a
-   smaller Goal Loss Ratio, which is counter-intuitive, considering
-   they come from the same search. Therefore, it is RECOMMENDED to set
-   the Goal Width to a value no smaller than the Goal Loss Ratio.
-
- Overall, this Conditional Throughput does behave well for
- comparability purposes.
-
-4.10. Search Time
-
- MLRsearch was primarily developed to reduce the time required to
- determine a throughput, either the [RFC2544] compliant one, or some
- generalization thereof. The art of achieving short search times is
- mainly in the smart selection of intended loads (and intended
- durations) for the next trial to perform.
-
- While there is an indirect impact of the load selection on the
- reported values, in practice such impact tends to be small, even for
- SUTs with quite a broad performance spectrum.
-
-   A typical example of two approaches to load selection leading to
-   different Relevant Lower Bounds is when the interval is split in a
-   very uneven way. An implementation choosing loads very close to the
-   current Relevant Lower Bound is quite likely to eventually stumble
-   upon a trial result with poor performance (due to SUT noise). For
-   an implementation choosing loads very close to the current Relevant
-   Upper Bound this is unlikely, as it mostly examines loads whose
-   performance is close to the noiseless end of the SUT performance
-   spectrum.
-
-   However, as even splits optimize the search duration at a given
-   precision, MLRsearch implementations that prioritize minimizing
-   search time are unlikely to suffer from any such bias.
-
- Therefore, this document remains quite vague on load selection and
- other optimization details, and configuration attributes related to
- them. Assuming users prefer libraries that achieve short overall
- search time, the definition of the Relevant Lower Bound should be
- strict enough to ensure result repeatability and comparability
- between different implementations, while not restricting future
- implementations much.
-
-4.11. [RFC2544] Compliance
-
- Some Search Goal instances lead to results compliant with RFC2544.
- See [RFC2544 Goal] (#RFC2544-Goal) for more details regarding both
- conditional and unconditional compliance.
-
- The presence of other Search Goals does not affect the compliance of
- this Goal Result. The Relevant Lower Bound and the Conditional
- Throughput are in this case equal to each other, and the value is the
- [RFC2544] throughput.
-
-5. Logic of Load Classification
-
-5.1. Introductory Remarks
-
- This chapter continues with explanations, but this time more precise
- definitions are needed for readers to follow the explanations.
-
- Descriptions in this section are wordy and implementers should read
- [MLRsearch Specification] (#MLRsearch-Specification) section and
- Appendices for more concise definitions.
-
- The two areas of focus here are load classification and the
- Conditional Throughput.
-
-   To start with, the [Performance Spectrum] (#Performance-Spectrum)
-   subsection contains definitions needed to gain insight into what
-   Conditional Throughput means. The remaining subsections discuss
-   load classification.
-
- For load classification, it is useful to define *good trials* and
- *bad trials*:
-
- * *Bad trial*: Trial is called bad (according to a goal) if its
- [Trial Loss Ratio] (#Trial-Loss-Ratio) is larger than the [Goal
- Loss Ratio] (#Goal-Loss-Ratio).
-
- * *Good trial*: Trial that is not bad is called good.
-
-5.2. Performance Spectrum
-
-   Description:
-
- There are several equivalent ways to explain the Conditional
-   Throughput computation. One of them relies on the performance
-   spectrum.
-
- Take an intended load value, a trial duration value, and a finite set
- of trial results, with all trials measured at that load value and
- duration value.
-
-   The performance spectrum is the function that maps any non-negative
-   real number to the sum of durations of all trials in the set whose
-   trial forwarding rate equals that number (mapping to zero if no
-   trial has that particular forwarding rate).
-
- A related function, defined if there is at least one trial in the
- set, is the performance spectrum divided by the sum of the durations
- of all trials in the set.
-
- That function is called the performance probability function, as it
- satisfies all the requirements for probability mass function of a
- discrete probability distribution, the one-dimensional random
- variable being the trial forwarding rate.
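-
-   A brief sketch (illustrative only) of both functions, with the
-   trial set given as (duration, forwarding_rate) pairs:
-
-   from collections import Counter
-
-   def performance_functions(trials):
-       """Return (spectrum, pmf) computed from the trial set."""
-       spectrum = Counter()
-       for duration, rate in trials:
-           spectrum[rate] += duration  # sum of durations per rate
-       total = sum(spectrum.values())
-       pmf = {rate: dur / total for rate, dur in spectrum.items()}
-       return dict(spectrum), pmf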
-
- These functions are related to the SUT performance spectrum, as
- sampled by the trials in the set.
-
-   Take the set of all full-length trials performed at the Relevant
-   Lower Bound, sorted by decreasing trial forwarding rate. The sum of
-   the durations of those trials may be less than the Goal Duration
-   Sum, or not. If it is less, add an imaginary trial result with zero
-   trial forwarding rate, such that the new sum of durations is equal
-   to the Goal Duration Sum. This is the set of trials to use.
-
-   Within that set, find the trial forwarding rate quantile that
-   corresponds to the Goal Exceed Ratio (the median for the
-   RECOMMENDED exceed ratio of 0.5), treating trial durations as
-   weights. If the quantile touches two trials, the larger trial
-   forwarding rate (from the trial result sorted earlier) is used.
-
-   The resulting quantity is the Conditional Throughput of the goal in
-   question.
-
- A set of examples follows.
-
-5.2.1. First Example
-
- * [Goal Exceed Ratio] (#Goal-Exceed-Ratio) = 0 and [Goal Duration
- Sum] (#Goal-Duration-Sum) has been reached.
-
- * Conditional Throughput is the smallest trial forwarding rate among
- the trials.
-
-5.2.2. Second Example
-
- * Goal Exceed Ratio = 0 and Goal Duration Sum has not been reached
- yet.
-
- * Due to the missing duration sum, the worst case may still happen,
- so the Conditional Throughput is zero.
-
- * This is not reported to the user, as this load cannot become the
- Relevant Lower Bound yet.
-
-5.2.3. Third Example
-
- * Goal Exceed Ratio = 50% and Goal Duration Sum is two seconds.
-
- * One trial is present with the duration of one second and zero
- loss.
-
- * The imaginary trial is added with the duration of one second and
- zero trial forwarding rate.
-
- * The median would touch both trials, so the Conditional Throughput
- is the trial forwarding rate of the one non-imaginary trial.
-
- * As that had zero loss, the value is equal to the offered load.
-
-5.2.4. Summary
-
-   While the Conditional Throughput is a generalization of the trial
-   forwarding rate, its definition is not an obvious one.
-
-   Other than the trial forwarding rate, the other source of intuition
-   is the quantile in general, and the median in the RECOMMENDED case
-   (Goal Exceed Ratio of 0.5).
-
-5.3. Trials with Single Duration
-
- When goal attributes are chosen in such a way that every trial has
- the same intended duration, the load classification is simpler.
-
- The following description follows the motivation of Goal Loss Ratio,
- Goal Exceed Ratio, and Goal Duration Sum.
-
- If the sum of the durations of all trials (at the given load) is less
- than the Goal Duration Sum, imagine two scenarios:
-
- * *best case scenario*: all subsequent trials having zero loss, and
-
- * *worst case scenario*: all subsequent trials having 100% loss.
-
- Here we assume there are as many subsequent trials as needed to make
- the sum of all trials equal to the Goal Duration Sum.
-
-   The exceed ratio is defined using sums of durations (the number of
-   trials does not matter), so it does not matter whether the
-   "subsequent trials" consist of an integer number of full-length
-   trials.
-
-   In either of the two scenarios, best case and worst case, we can
-   compute the load exceed ratio, as the duration sum of bad trials
-   divided by the duration sum of all trials, in both cases including
-   the assumed trials.
-
-   If, even in the best case scenario, the load exceed ratio is larger
-   than the Goal Exceed Ratio, the load is an upper bound.
-
-   If, even in the worst case scenario, the load exceed ratio is not
-   larger than the Goal Exceed Ratio, the load is a lower bound.
-
- More specifically:
-
- * Take all trials measured at a given load.
-
- * The sum of the durations of all bad full-length trials is called
- the bad sum.
-
- * The sum of the durations of all good full-length trials is called
- the good sum.
-
- * The result of adding the bad sum plus the good sum is called the
- measured sum.
-
- * The larger of the measured sum and the Goal Duration Sum is called
- the whole sum.
-
- * The whole sum minus the measured sum is called the missing sum.
-
- * The optimistic exceed ratio is the bad sum divided by the whole
- sum.
-
-   * The pessimistic exceed ratio is the bad sum plus the missing sum,
-     divided by the whole sum.
-
- * If the optimistic exceed ratio is larger than the Goal Exceed
- Ratio, the load is classified as an upper bound.
-
- * If the pessimistic exceed ratio is not larger than the Goal Exceed
- Ratio, the load is classified as a lower bound.
-
- * Else, the load is classified as undecided.
-
- The definition of pessimistic exceed ratio is compatible with the
- logic in the Conditional Throughput computation, so in this single
- trial duration case, a load is a lower bound if and only if the
- Conditional Throughput loss ratio is not larger than the Goal Loss
- Ratio.
-
- If it is larger, the load is either an upper bound or undecided.
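-
-   The classification steps above can be condensed into a short
-   sketch (single trial duration only, durations in seconds):
-
-   def classify(good_sum, bad_sum, goal_duration_sum,
-                goal_exceed_ratio):
-       measured_sum = good_sum + bad_sum
-       whole_sum = max(measured_sum, goal_duration_sum)
-       missing_sum = whole_sum - measured_sum
-       optimistic = bad_sum / whole_sum
-       pessimistic = (bad_sum + missing_sum) / whole_sum
-       if optimistic > goal_exceed_ratio:
-           return "upper bound"
-       if pessimistic <= goal_exceed_ratio:
-           return "lower bound"
-       return "undecided"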
-
-5.4. Trials with Short Duration
-
-5.4.1. Scenarios
-
-   Trials with an intended duration smaller than the goal final trial
-   duration are called short trials. The motivation for the load
-   classification logic in the presence of short trials is based
-   around a counter-factual case: What would the trial result be if a
-   short trial had been measured as a full-length trial instead?
-
- There are three main scenarios where human intuition guides the
- intended behavior of load classification.
-
-5.4.1.1. False Good Scenario
-
-   The user had their reason for not configuring a shorter goal final
-   trial duration. Perhaps the SUT has buffers that may fill up at
-   longer trial durations. Perhaps the SUT shows periodic decreases in
-   performance that the user does not want to be treated as noise.
-
- In any case, many good short trials may become bad full-length trials
- in the counter-factual case.
-
- In extreme cases, there are plenty of good short trials and no bad
- short trials.
-
- In this scenario, we want the load classification NOT to classify the
- load as a lower bound, despite the abundance of good short trials.
-
- Effectively, we want the good short trials to be ignored, so they do
- not contribute to comparisons with the Goal Duration Sum.
-
-5.4.1.2. True Bad Scenario
-
- When there is a frame loss in a short trial, the counter-factual
- full-length trial is expected to lose at least as many frames.
-
- In practice, bad short trials are rarely turning into good full-
- length trials.
-
- In extreme cases, there are no good short trials.
-
- In this scenario, we want the load classification to classify the
- load as an upper bound just based on the abundance of short bad
- trials.
-
- Effectively, we want the bad short trials to contribute to
- comparisons with the Goal Duration Sum, so the load can be classified
- sooner.
-
-5.4.1.3. Balanced Scenario
-
-   Some SUTs are quite indifferent to trial duration. The performance
-   probability function constructed from short trial results is likely
-   to be similar to the one constructed from full-length trial results
-   (perhaps with larger dispersion, but without a big impact on the
-   median quantiles overall).
-
- For a moderate Goal Exceed Ratio value, this may mean there are both
- good short trials and bad short trials.
-
- This scenario is there just to invalidate a simple heuristic of
- always ignoring good short trials and never ignoring bad short
- trials, as that simple heuristic would be too biased.
-
-   Yes, the bad short trials are likely to turn into bad full-length
-   trials in the counter-factual case, but there is no information on
-   what the good short trials would turn into.
-
- The only way to decide safely is to do more trials at full length,
- the same as in False Good Scenario.
-
-5.4.2. Classification Logic
-
- MLRsearch picks a particular logic for load classification in the
- presence of short trials, but it is still RECOMMENDED to use
- configurations that imply no short trials, so the possible
- inefficiencies in that logic do not affect the result, and the result
- has better explainability.
-
- With that said, the logic differs from the single trial duration case
- only in different definition of the bad sum. The good sum is still
- the sum across all good full-length trials.
-
-   A few more notions are needed for defining the new bad sum (a short
-   sketch follows the list):
-
- * The sum of durations of all bad full-length trials is called the
- bad long sum.
-
- * The sum of durations of all bad short trials is called the bad
- short sum.
-
- * The sum of durations of all good short trials is called the good
- short sum.
-
- * One minus the Goal Exceed Ratio is called the subceed ratio.
-
- * The Goal Exceed Ratio divided by the subceed ratio is called the
- exceed coefficient.
-
- * The good short sum multiplied by the exceed coefficient is called
- the balancing sum.
-
- * The bad short sum minus the balancing sum is called the excess
- sum.
-
- * If the excess sum is negative, the bad sum is equal to the bad
- long sum.
-
- * Otherwise, the bad sum is equal to the bad long sum plus the
- excess sum.
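-
-   These notions translate into the following sketch (the one
-   referred to before the list; durations in seconds):
-
-   def new_bad_sum(bad_long_sum, bad_short_sum, good_short_sum,
-                   goal_exceed_ratio):
-       subceed_ratio = 1.0 - goal_exceed_ratio
-       exceed_coefficient = goal_exceed_ratio / subceed_ratio
-       balancing_sum = good_short_sum * exceed_coefficient
-       excess_sum = bad_short_sum - balancing_sum
-       # A negative excess sum has no impact on the bad sum.
-       return bad_long_sum + max(0.0, excess_sum)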
-
-   Here is how the new definition of the bad sum fares in the three
-   scenarios, where the load is close to what the relevant bounds
-   would be if only full-length trials were used for the search.
-
-5.4.2.1. False Good Scenario
-
-   If the duration is too short, we expect to see a higher frequency
-   of good short trials. This could lead to a negative excess sum,
-   which has no impact, hence the load classification is given just by
-   full-length trials. Thus, MLRsearch using too-short trials has no
-   detrimental effect on result comparability in this scenario. But
-   using short trials also does not help with the overall search
-   duration, and probably makes it worse.
-
-5.4.2.2. True Bad Scenario
-
- Settings with a small exceed ratio have a small exceed coefficient,
- so the impact of the good short sum is small, and the bad short sum
- is almost wholly converted into excess sum, thus bad short trials
- have almost as big an impact as full-length bad trials. The same
- conclusion applies to moderate exceed ratio values when the good
- short sum is small. Thus, short trials can cause a load to get
- classified as an upper bound earlier, bringing time savings (while
- not affecting comparability).
-
-5.4.2.3. Balanced Scenario
-
- Here excess sum is small in absolute value, as the balancing sum is
- expected to be similar to the bad short sum. Once again, full-length
- trials are needed for final load classification; but usage of short
- trials probably means MLRsearch needed a shorter overall search time
- before selecting this load for measurement, thus bringing time
- savings (while not affecting comparability).
-
-   Note that in the presence of short trial results, the compatibility
-   between the load classification and the Conditional Throughput is
-   only partial. The Conditional Throughput still comes from a good
-   long trial, but a load higher than the Relevant Lower Bound may
-   also compute to a good value.
-
-5.5. Trials with Longer Duration
-
-   If there are trial results with an intended duration larger than
-   the goal final trial duration, the precise definitions in Appendix
-   A and Appendix B treat them in exactly the same way as trials with
-   a duration equal to the goal final trial duration.
-
-   But in configurations with a moderate (including 0.5) or small Goal
-   Exceed Ratio and a small Goal Loss Ratio (especially zero), bad
-   trials with longer-than-goal durations may bias the search towards
-   lower load values, as the noiseful end of the spectrum gets a
-   larger probability of causing the loss within the longer trials.
-
-6. IANA Considerations
-
-   This document includes no request to IANA.
-
-7. Security Considerations
-
- Benchmarking activities as described in this memo are limited to
- technology characterization of a DUT/SUT using controlled stimuli in
- a laboratory environment, with dedicated address space and the
- constraints specified in the sections above.
-
- The benchmarking network topology will be an independent test setup
- and MUST NOT be connected to devices that may forward the test
- traffic into a production network or misroute traffic to the test
- management network.
-
- Further, benchmarking is performed on a "black-box" basis, relying
- solely on measurements observable external to the DUT/SUT.
-
- Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
- benchmarking purposes. Any implications for network security arising
- from the DUT/SUT SHOULD be identical in the lab and in production
- networks.
-
-8. Acknowledgements
-
-   Some phrases and statements in this document were created with the
-   help of Mistral AI (mistral.ai).
-
- Many thanks to Alec Hothan of the OPNFV NFVbench project for thorough
- review and numerous useful comments and suggestions in the earlier
- versions of this document.
-
- Special wholehearted gratitude and thanks to the late Al Morton for
- his thorough reviews filled with very specific feedback and
- constructive guidelines. Thank you Al for the close collaboration
- over the years, for your continuous unwavering encouragement full of
- empathy and positive attitude. Al, you are dearly missed.
-
-9. Appendix A: Load Classification
-
- This section specifies how to perform the load classification.
-
- Any intended load value can be classified, according to a given
- [Search Goal] (#Search-Goal).
-
- The algorithm uses (some subsets of) the set of all available trial
- results from trials measured at a given intended load at the end of
- the search. All durations are those returned by the Measurer.
-
- The block at the end of this appendix holds pseudocode which computes
- two values, stored in variables named optimistic and pessimistic.
-
-   The pseudocode happens to be valid Python code.
-
- If values of both variables are computed to be true, the load in
- question is classified as a lower bound according to the given Search
- Goal. If values of both variables are false, the load is classified
- as an upper bound. Otherwise, the load is classified as undecided.
-
- The pseudocode expects the following variables to hold values as
- follows:
-
- * goal_duration_sum: The duration sum value of the given Search
- Goal.
-
- * goal_exceed_ratio: The exceed ratio value of the given Search
- Goal.
-
- * good_long_sum: Sum of durations across trials with trial duration
- at least equal to the goal final trial duration and with a Trial
- Loss Ratio not higher than the Goal Loss Ratio.
-
- * bad_long_sum: Sum of durations across trials with trial duration
- at least equal to the goal final trial duration and with a Trial
- Loss Ratio higher than the Goal Loss Ratio.
-
- * good_short_sum: Sum of durations across trials with trial duration
- shorter than the goal final trial duration and with a Trial Loss
- Ratio not higher than the Goal Loss Ratio.
-
- * bad_short_sum: Sum of durations across trials with trial duration
- shorter than the goal final trial duration and with a Trial Loss
- Ratio higher than the Goal Loss Ratio.
-
-   The code also works correctly when there are no trial results at a
-   given load.
-
-   # Good short trials can compensate for bad short trials,
-   # in the proportion given by the exceed coefficient.
-   balancing_sum = good_short_sum * goal_exceed_ratio / (1.0 - goal_exceed_ratio)
-   # Only the uncompensated part of bad short durations counts.
-   effective_bad_sum = bad_long_sum + max(0.0, bad_short_sum - balancing_sum)
-   # Pad the measured durations up to the Goal Duration Sum.
-   effective_whole_sum = max(good_long_sum + effective_bad_sum, goal_duration_sum)
-   # Largest duration sum of bad trials still tolerated by the goal.
-   quantile_duration_sum = effective_whole_sum * goal_exceed_ratio
-   # Optimistic: missing duration is assumed to come from good trials.
-   optimistic = effective_bad_sum <= quantile_duration_sum
-   # Pessimistic: missing duration is assumed to come from bad trials.
-   pessimistic = (effective_whole_sum - good_long_sum) <= quantile_duration_sum
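-
-   As a worked example (hypothetical numbers, not part of the
-   specification), assume a Goal Duration Sum of 120 seconds, a Goal
-   Exceed Ratio of 0.5, 30 seconds of good and 30 seconds of bad
-   full-length trials, and no short trials:
-
-   goal_duration_sum = 120.0
-   goal_exceed_ratio = 0.5
-   good_long_sum, bad_long_sum = 30.0, 30.0
-   good_short_sum, bad_short_sum = 0.0, 0.0
-   # Running the block above yields optimistic True (30 <= 60) and
-   # pessimistic False (90 > 60), so the load is still undecided.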
-
-10. Appendix B: Conditional Throughput
-
- This section specifies how to compute Conditional Throughput, as
- referred to in section [Conditional Throughput] (#Conditional-
- Throughput).
-
- Any intended load value can be used as the basis for the following
- computation, but only the Relevant Lower Bound (at the end of the
- search) leads to the value called the Conditional Throughput for a
- given Search Goal.
-
- The algorithm uses (some subsets of) the set of all available trial
- results from trials measured at a given intended load at the end of
- the search. All durations are those returned by the Measurer.
-
- The block at the end of this appendix holds pseudocode which computes
- a value stored as variable conditional_throughput.
-
-   The pseudocode happens to be valid Python code.
-
- The pseudocode expects the following variables to hold values as
- follows:
-
- * goal_duration_sum: The duration sum value of the given Search
- Goal.
-
- * goal_exceed_ratio: The exceed ratio value of the given Search
- Goal.
-
- * good_long_sum: Sum of durations across trials with trial duration
- at least equal to the goal final trial duration and with a Trial
- Loss Ratio not higher than the Goal Loss Ratio.
-
- * bad_long_sum: Sum of durations across trials with trial duration
- at least equal to the goal final trial duration and with a Trial
- Loss Ratio higher than the Goal Loss Ratio.
-
- * long_trials: An iterable of all trial results from trials with
- trial duration at least equal to the goal final trial duration,
-   sorted by increasing Trial Loss Ratio. A trial result is a
- composite with the following two attributes available:
-
- - trial.loss_ratio: The Trial Loss Ratio as measured for this
- trial.
-
- - trial.duration: The trial duration of this trial.
-
-   The code works correctly only if there is at least one trial result
-   measured at the given load.
-
-   # Pad the measured full-length durations up to the Goal Duration Sum.
-   all_long_sum = max(goal_duration_sum, good_long_sum + bad_long_sum)
-   # Duration sum the quantile has to cover from the best trials down.
-   remaining = all_long_sum * (1.0 - goal_exceed_ratio)
-   quantile_loss_ratio = None
-   # Walk trials from the smallest to the largest Trial Loss Ratio.
-   for trial in long_trials:
-       if quantile_loss_ratio is None or remaining > 0.0:
-           quantile_loss_ratio = trial.loss_ratio
-           remaining -= trial.duration
-       else:
-           break
-   else:
-       # All trials consumed but duration is still missing,
-       # so the worst case (100% loss) has to be assumed.
-       if remaining > 0.0:
-           quantile_loss_ratio = 1.0
-   conditional_throughput = intended_load * (1.0 - quantile_loss_ratio)
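-
-   As a worked example (hypothetical numbers, not part of the
-   specification), assume two 30-second full-length trials at a
-   1 Mfps intended load, with Trial Loss Ratios 0.0 and 0.1, a Goal
-   Duration Sum of 60 seconds, and a Goal Exceed Ratio of 0.5:
-
-   from types import SimpleNamespace
-
-   intended_load = 1e6  # frames per second
-   goal_duration_sum, goal_exceed_ratio = 60.0, 0.5
-   good_long_sum, bad_long_sum = 30.0, 30.0
-   long_trials = [
-       SimpleNamespace(loss_ratio=0.0, duration=30.0),
-       SimpleNamespace(loss_ratio=0.1, duration=30.0),
-   ]
-   # Running the block above: remaining starts at 30.0, the first
-   # (best) trial consumes it exactly, so quantile_loss_ratio stays
-   # 0.0 and conditional_throughput equals the intended load.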
-
-11. References
-
-11.1. Normative References
-
- [RFC1242] Bradner, S., "Benchmarking Terminology for Network
- Interconnection Devices", RFC 1242, DOI 10.17487/RFC1242,
- July 1991, <https://www.rfc-editor.org/info/rfc1242>.
-
- [RFC2285] Mandeville, R., "Benchmarking Terminology for LAN
- Switching Devices", RFC 2285, DOI 10.17487/RFC2285,
- February 1998, <https://www.rfc-editor.org/info/rfc2285>.
-
- [RFC2544] Bradner, S. and J. McQuaid, "Benchmarking Methodology for
- Network Interconnect Devices", RFC 2544,
- DOI 10.17487/RFC2544, March 1999,
- <https://www.rfc-editor.org/info/rfc2544>.
-
- [RFC8219] Georgescu, M., Pislaru, L., and G. Lencse, "Benchmarking
- Methodology for IPv6 Transition Technologies", RFC 8219,
- DOI 10.17487/RFC8219, August 2017,
- <https://www.rfc-editor.org/info/rfc8219>.
-
- [RFC9004] Morton, A., "Updates for the Back-to-Back Frame Benchmark
- in RFC 2544", RFC 9004, DOI 10.17487/RFC9004, May 2021,
- <https://www.rfc-editor.org/info/rfc9004>.
-
-11.2. Informative References
-
- [FDio-CSIT-MLRsearch]
- "FD.io CSIT Test Methodology - MLRsearch", October 2023,
- <https://csit.fd.io/cdocs/methodology/measurements/
- data_plane_throughput/mlr_search/>.
-
- [PyPI-MLRsearch]
- "MLRsearch 1.2.1, Python Package Index", October 2023,
- <https://pypi.org/project/MLRsearch/1.2.1/>.
-
- [TST009] "TST 009", n.d., <https://www.etsi.org/deliver/etsi_gs/
- NFV-TST/001_099/009/03.04.01_60/gs_NFV-
- TST009v030401p.pdf>.
-
-Authors' Addresses
-
- Maciek Konstantynowicz
- Cisco Systems
- Email: mkonstan@cisco.com
-
- Vratko Polak
- Cisco Systems
- Email: vrpolak@cisco.com
-