3 files changed, 504 insertions, 1360 deletions
diff --git a/docs/ietf/draft-ietf-bmwg-mlrsearch-02.md b/docs/ietf/draft-ietf-bmwg-mlrsearch-02.md
deleted file mode 100644
index fef146618c..0000000000
--- a/docs/ietf/draft-ietf-bmwg-mlrsearch-02.md
+++ /dev/null
@@ -1,1359 +0,0 @@
----
-title: Multiple Loss Ratio Search for Packet Throughput (MLRsearch)
-abbrev: Multiple Loss Ratio Search
-docname: draft-ietf-bmwg-mlrsearch-02
-date: 2022-03-07
-
-ipr: trust200902
-area: ops
-wg: Benchmarking Working Group
-kw: Internet-Draft
-cat: info
-
-coding: us-ascii
-pi:    # can use array (if all yes) or hash here
-  toc: yes
-  sortrefs:   # defaults to yes
-  symrefs: yes
-
-author:
-      -
-        ins: M. Konstantynowicz
-        name: Maciek Konstantynowicz
-        org: Cisco Systems
-        role: editor
-        email: mkonstan@cisco.com
-      -
-        ins: V. Polak
-        name: Vratko Polak
-        org: Cisco Systems
-        email: vrpolak@cisco.com
-
-normative:
-  RFC2544:
-
-informative:
-  FDio-CSIT-MLRsearch:
-    target: https://s3-docs.fd.io/csit/rls2110/report/introduction/methodology_data_plane_throughput/methodology_data_plane_throughput.html#mlrsearch-tests
-    title: "FD.io CSIT Test Methodology - MLRsearch"
-    date: 2021-11
-  PyPI-MLRsearch:
-    target: https://pypi.org/project/MLRsearch/0.4.0/
-    title: "MLRsearch 0.4.0, Python Package Index"
-    date: 2021-04
-
---- abstract
-
-TODO: Update after all sections are ready.
-
-This document proposes changes to [RFC2544], specifically to packet
-throughput search methodology, by defining a new search algorithm
-referred to as Multiple Loss Ratio search (MLRsearch for short). Instead
-of relying on binary search with pre-set starting offered load, it
-proposes a novel approach discovering the starting point in the initial
-phase, and then searching for packet throughput based on defined packet
-loss ratio (PLR) input criteria and defined final trial duration time.
-One of the key design principles behind MLRsearch is minimizing the
-total test duration and searching for multiple packet throughput rates
-(each with a corresponding PLR) concurrently, instead of doing it
-sequentially.
-
-The main motivation behind MLRsearch is the new set of challenges and
-requirements posed by NFV (Network Function Virtualization),
-specifically software based implementations of NFV data planes. Using
-[RFC2544] in the experience of the authors yields often not repetitive
-and not replicable end results due to a large number of factors that are
-out of scope for this draft. MLRsearch aims to address this challenge
-in a simple way of getting the same result sooner, so more repetitions
-can be done to describe the replicability.
-
---- middle
-
-{::comment}
-    As we use kramdown to convert from markdown,
-    we use this way of marking comments not to be visible in rendered draft.
-    https://stackoverflow.com/a/42323390
-    If other engine is used, convert to this way:
-    https://stackoverflow.com/a/20885980
-{:/comment}
-
-# Terminology
-
-TODO: Update after most other sections are updated.
-
-{::comment}
-    The following is probably not needed (or defined elsewhere).
-
-    * Frame size: size of an Ethernet Layer-2 frame on the wire, including
-      any VLAN tags (dot1q, dot1ad) and Ethernet FCS, but excluding Ethernet
-      preamble and inter-frame gap. Measured in bytes (octets).
-    * Packet size: same as frame size, both terms used interchangeably.
-    * Device Under Test (DUT): In software networking, "device" denotes a
-      specific piece of software tasked with packet processing. Such device
-      is surrounded with other software components (such as operating system
-      kernel). It is not possible to run devices without also running the
-      other components, and hardware resources are shared between both. For
-      purposes of testing, the whole set of hardware and software components
-      is called "system under test" (SUT). As SUT is the part of the whole
-      test setup performance of which can be measured by [RFC2544] methods,
-      this document uses SUT instead of [RFC2544] DUT. Device under test
-      (DUT) can be re-introduced when analysing test results using whitebox
-      techniques, but this document sticks to blackbox testing.
-    * System Under Test (SUT): System under test (SUT) is a part of the
-      whole test setup whose performance is to be benchmarked. The complete
-      test setup contains other parts, whose performance is either already
-      established, or not affecting the benchmarking result.
-    * Bi-directional throughput tests: involve packets/frames flowing in
-      both transmit and receive directions over every tested interface of
-      SUT/DUT. Packet flow metrics are measured per direction, and can be
-      reported as aggregate for both directions and/or separately
-      for each measured direction. In most cases bi-directional tests
-      use the same (symmetric) load in both directions.
-    * Uni-directional throughput tests: involve packets/frames flowing in
-      only one direction, i.e. either transmit or receive direction, over
-      every tested interface of SUT/DUT. Packet flow metrics are measured
-      and are reported for measured direction.
-    * Packet Throughput Rate: maximum packet offered load DUT/SUT forwards
-      within the specified Packet Loss Ratio (PLR). In many cases the rate
-      depends on the frame size processed by DUT/SUT. Hence packet
-      throughput rate MUST be quoted with specific frame size as received by
-      DUT/SUT during the measurement. For bi-directional tests, packet
-      throughput rate should be reported as aggregate for both directions.
-      Measured in packets-per-second (pps) or frames-per-second (fps),
-      equivalent metrics.
-    * Bandwidth Throughput Rate: a secondary metric calculated from packet
-      throughput rate using formula: bw_rate = pkt_rate * (frame_size +
-      L1_overhead) * 8, where L1_overhead for Ethernet includes preamble (8
-      octets) and inter-frame gap (12 octets). For bi-directional tests,
-      bandwidth throughput rate should be reported as aggregate for both
-      directions. Expressed in bits-per-second (bps).
-    * TODO do we need this as it is identical to RFC2544 Throughput?
-      Non Drop Rate (NDR): maximum packet/bandwidth throughput rate sustained
-      by DUT/SUT at PLR equal zero (zero packet loss) specific to tested
-      frame size(s). MUST be quoted with specific packet size as received by
-      DUT/SUT during the measurement. Packet NDR measured in
-      packets-per-second (or fps), bandwidth NDR expressed in
-      bits-per-second (bps).
-    * TODO if needed, reformulate to make it clear there can be multiple rates
-      for multiple (non-zero) loss ratios.
-      : Partial Drop Rate (PDR): maximum packet/bandwidth throughput rate
-      sustained by DUT/SUT at PLR greater than zero (non-zero packet loss)
-      specific to tested frame size(s). MUST be quoted with specific packet
-      size as received by DUT/SUT during the measurement. Packet PDR
-      measured in packets-per-second (or fps), bandwidth PDR expressed in
-      bits-per-second (bps).
-    * TODO: Refer to FRMOL instead.
-      Maximum Receive Rate (MRR): packet/bandwidth rate regardless of PLR
-      sustained by DUT/SUT under specified Maximum Transmit Rate (MTR)
-      packet load offered by traffic generator. MUST be quoted with both
-      specific packet size and MTR as received by DUT/SUT during the
-      measurement. Packet MRR measured in packets-per-second (or fps),
-      bandwidth MRR expressed in bits-per-second (bps).
-    * TODO just keep using "trial measurement"?
-      Trial: a single measurement step. See [RFC2544] section 23.
-    * TODO already defined in RFC2544:
-      Trial duration: amount of time over which packets are transmitted
-      in a single measurement step.
-{:/comment}
-{::comment}
-{:/comment}
-
-* TODO: The current text uses Throughput for the zero loss ratio load.
-  Is the capital T needed/useful?
-* DUT and SUT: see the definitions in https://gerrit.fd.io/r/c/csit/+/35545
-* Traffic Generator (TG) and Traffic Analyzer (TA): see
-  https://datatracker.ietf.org/doc/html/rfc6894#section-4
-  TODO: Maybe there is an earlier RFC?
-* Overall search time: the time it takes to find all required loads within
-  their precision goals, starting from zero trials measured at given
-  DUT configuration and traffic profile.
-* TODO: traffic profile?
-* Intended load: https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.1
-* Offered load: https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.2
-* Maximum offered load (MOL): see
-  https://datatracker.ietf.org/doc/html/rfc2285#section-3.5.3
-* Forwarding rate at maximum offered load (FRMOL)
-  https://datatracker.ietf.org/doc/html/rfc2285#section-3.6.2
-* Trial Loss Count: the number of frames transmitted
-  minus the number of frames received. Negative count is possible,
-  e.g. when SUT duplicates some frames.
-* Trial Loss Ratio: ratio of frames received relative to frames
-  transmitted over the trial duration.
-  For bi-directional throughput tests, the aggregate ratio is calculated,
-  based on the aggregate number of frames transmitted and received.
-  If the trial loss count is negative, its absolute value MUST be used
-  to keep compliance with RFC2544.
-* Safe load: any value, such that trial measurement at this (or lower)
-  intended load is correcrly handled by both TG and TA, regardless of SUT behavior.
-  Frequently, it is not known what the safe load is.
-* Max load (TODO rename?): Maximal intended load to be used during search.
-  Benchmarking team decides which value is low enough
-  to guarantee values reported by TG and TA are reliable.
-  It has to be a safe load, but it can be lower than a safe load estimate
-  for added safety.
-  See the subsection on unreliable test equipment below.
-  This value MUST NOT be higher than MOL, which itself MUST NOT
-  be higher than Maximum Frame Rate
-  https://datatracker.ietf.org/doc/html/rfc2544#section-20
-* Min load: Minimal intended load to be used during search.
-  Benchmarking team decides which value is high enough
-  to guarantee the trial measurement results are valid.
-  E.g. considerable overall search time can be saved by declaring SUT
-  faulty if min load trial shows too high loss rate.
-  Zero frames per second is a valid min load value
-* Effective loss ratio: a corrected value of trial loss ratio
-  chosen to avoid difficulties if SUT exhibits decreasing loss ratio
-  with increasing load. It is the maximum of trial loss ratios
-  measured at the same duration on all loads smaller than (and including)
-  the current one.
-* Target loss ratio: a loss ratio value acting as an input for the search.
-  The search is finding tight enough lower and upper bounds in intended load,
-  so that the measurement at the lower bound has smaller or equal
-  trial loss ratio, and upper bound has strictly larger trial loss ratio.
-  For the tightest upper bound, the effective loss ratio is the same as
-  trial loss ratio at that upper bound load.
-  For the tightest lower bound, the effective loss ratio can be higher
-  than the trial loss ratio at that lower bound, but still not larger
-  than the target loss ratio.
-* TODO: Search algorithm.
-* TODO: Precision goal.
-* TODO: Define a "benchmarking group".
-* TODO: Upper and lower bound.
-* TODO: Valid and invalid bound?
-* TODO: Interval and interval width?
-
-TODO: Mention NIC/PCI bandwidth/pps limits can be lower than bandwidth of medium.
-
-# Intentions of this document
-
-{::comment}
-    Instead of talking about DUTs being non-deterministic
-    and vendors "gaming" in order to get better Throughput results,
-    Maciek and Vratko currently prefer to talk about result repeatability.
-{:/comment}
-
-The intention of this document is to provide recommendations for:
-* optimizing search for multiple target loss ratios at once,
-* speeding up the overall search time,
-* improve search results repeatability and comparability.
-
-No part of RFC2544 is intended to be obsoleted by this document.
-
-{::comment}
-    This document may contain examples which contradict RFC2544 requirements
-    and suggestions.
-    That is not an ecouragement for benchmarking groups
-    to stop being compliant with RFC2544.
-{:/comment}
-
-# RFC2544
-
-## Throughput search
-
-It is useful to restate the key requirements of RFC2544
-using the new terminology (see section Terminology).
-
-The following sections of RFC2544 are of interest for this document.
-
-* https://datatracker.ietf.org/doc/html/rfc2544#section-20
-  Mentions the max load SHOULD not be larget than the theoretical
-  maximum rate for the frame size on the media.
-
-* https://datatracker.ietf.org/doc/html/rfc2544#section-23
-  Lists the actions to be done for each trial measurement,
-  it also mentions loss rate as an example of trial measurement results.
-  This document uses loss count instead, as that is the quantity
-  that is easier for the current test equipment to measure,
-  e.g. it is not affected by the real traffic duration.
-  TODO: Time uncertainty again.
-
-* https://datatracker.ietf.org/doc/html/rfc2544#section-24
-  Mentions "full length trials" leading to the Throughput found,
-  as opposed to shorter trial durations, allowed in an attempt
-  to "minimize the length of search procedure".
-  This document talks about "final trial duration" and aims to
-  "optimize overal search time".
-
-* https://datatracker.ietf.org/doc/html/rfc2544#section-26.1
-  with https://www.rfc-editor.org/errata/eid422
-  finaly states requirements for the search procedure.
-  It boils down to "increase intended load upon zero trial loss
-  and decrease intended load upon non-zero trial loss".
-
-No additional constraints are placed on the load selection,
-and there is no mention of an exit condition, e.g. when there is enough
-trial measurements to proclaim the largest load with zero trial loss
-(and final trial duration) to be the Throughput found.
-
-{::comment}
-    The following section is probably not useful enough.
-
-    ## Generalized search
-
-    Note that the Throughput search can be restated as a "conditional
-    load search" with a specific condition.
-
-    "increase intended load upon trial result satisfying the condition
-    and decrease intended load upon trial result not satisfying the condition"
-    where the Throughput condition is "trial loss count is zero".
-
-    This works for any condition that can be evaluated from a single
-    trial measurement result, and is likely to be true at low loads
-    and false at high loads.
-
-    MLRsearch can incorporate multiple different conditions,
-    as long as there is total ligical ordering between them
-    (e.g. if a condition for a target loss ratio is not satisfied,
-    it is also not satisfied for any other codition which uses
-    larger target loss ratio).
-
-    TODO: How to call a "load associated with this particular condition"?
-{:/comment}
-
-{::comment}
-
-    TODO: Not sure if this subsection is needed an where.
-
-    ## Simple bisection
-
-    There is one obvious and simple search algorithm which conforms
-    to throughput search requirements: simple bijection.
-
-    Input: target precision, in frames per second.
-
-    Procedure:
-
-    1. Chose min load to be zero.
-       1. No need to measure, loss count has to be zero.
-       2. Use the zero load as the current lower bound.
-    2. Chose max load to be the max value allowed by bandwidth of the medium.
-       1. Perform a trial measurement (at the full length duration) at max load.
-       2. If there is zero trial loss count, return max load as Throughput.
-       3. Use max load as the current upper bound.
-    3. Repeat until the difference between lower bound and upper bound is
-       smaller or equal to the precision goal.
-       1. If it is not larget, return the current lower bound as Throughput.
-       2. Else: Chose new load as the arithmetic average of lower and upper bound.
-       3. Perform a trial measurement (at the full length duration) at this load.
-       4. If the trial loss rate is zero, consider the load as new lower bound.
-       5. Else consider the load as the new upper bound.
-       6. Jump back to the repeat at 3.
-
-    Another possible stop condition is the overal search time so far,
-    but that is not really a different condition, as the time for search to reach
-    the precision goal is just a function of precision goal, trial duration
-    and the difference between max and min load.
-
-    While this algorithm can be accomodated to search for multiple
-    target loss ratios "at the same time (see somewhere below),
-    it is still missing multiple improvement which give MLRsearch
-    considerably better overal search time in practice.
-
-{:/comment}
-
-# Problems
-
-## Repeatability and Comparability
-
-RFC2544 does not suggest to repeat Throughput search,
-{::comment}probably because the full set of tests already takes long{:/comment}
-and from just one Throughput value, it cannot be determined
-how repeatable that value is (how likely it is for a repeated Throughput search
-to end up with a value less then the precision goal away from the first value).
-
-Depending on SUT behavior, different benchmark groups
-can report significantly different Througput values,
-even when using identical SUT and test equipment,
-just because of minor differences in their search algorithm
-(e.g. different max load value).
-
-While repeatability can be addressed by repeating the search several times,
-the differences in the comparability scenario may be systematic,
-e.g. seeming like a bias in one or both benchmark groups.
-
-MLRsearch algorithm does not really help with the repeatability problem.
-This document RECOMMENDS to repeat a selection of "important" tests
-ten times, so users can ascertain the repeatability of the results.
-
-TODO: How to report? Average and standard deviation?
-
-Following MLRsearch algorithm leaves less freedom for the benchmark groups
-to encounter the comparability problem,
-alghough more research is needed to determine the effect
-of MLRsearch's tweakable parameters.
-
-{::comment}
-    Possibly, the old DUTs were quite sharply consistent in their performance,
-    and/or precision goals were quite large in order to save overal search time.
-
-    With software DUTs and with time-efficient search algorithms,
-    nowadays the repeatability of Throughput can be quite low,
-    as in standard deviation of repeated Througput results
-    is considerably higher than the precision goal.
-{:/comment}
-
-{::comment}
-    TODO: Unify with PLRsearch draft.
-    TODO: No-loss region, random region, lossy region.
-    TODO: Tweaks with respect to non-zero loss ratio goal.
-    TODO: Duration dependence?
-
-    Both RFC2544 and MLRsearch return Throughput somewhere inside the random region,
-    or at most the precision goal below it.
-{:/comment}
-
-{::comment}
-    TODO: Make sure this is covered elsewhere, then delete.
-
-    ## Search repeatability
-
-    The goal of RFC1242 and RFC2544 is to limit how vendors benchmark their DUTs,
-    in order to force them to report values that have higher chance
-    to be confirmed by independent benchmarking groups following the same RFCs.
-
-    This works well for deterministic DUTs.
-
-    But for non-deterministic DUTs, the RFC2544 Throughput value
-    is only guaranteed to fall somewhere below the lossy region (TODO define).
-    It is possible to arrive at a value positioned likely high in the random region
-    at the cost of increased overall search duration,
-    simply by lowering the load by very small amounts (instead of exact halving)
-    upon lossy trial and increasing by large amounts upon lossless trial.
-
-    Prescribing an exact search algorithm (bisection or MLRsearch or other)
-    will force vendors to report less "gamey" Throughput values.
-{:/comment}
-
-{::comment}
-    ## Extensions
-
-    The following two sections are probably out of scope,
-    as they does not affect MLRsearch design choices.
-
-    ### Direct and inverse measurements
-
-    TODO expand: Direct measurement is single trial measurement,
-    with predescribed inputs and outputs turned directly into the quality of interest
-    Examples:
-    Latency https://datatracker.ietf.org/doc/html/rfc2544#section-26.2
-    is a single direct measurement.
-    Frame loss rate https://datatracker.ietf.org/doc/html/rfc2544#section-26.3
-    is a sequence of direct measurements.
-
-    TODO expand: Indirect measurement aims to solve an "inverse function problem",
-    meaning (a part of) trial measurement output is prescribed, and the quantity
-    of interest is (derived from) the input parameters of trial measurement
-    that achieves the prescribed output.
-    In general this is a hard problem, but if the unknown input parameter
-    is just one-dimensional quantity, algorithms such as bisection
-    do converge regardless of outputs seen.
-    We call any such algorithm examining one-dimensional input as "search".
-    Of course, some exit condition is needed for the search to end.
-    In case of Throughput, bisection algorithm tracks both upper bound
-    and lower bound, with lower bound at the end of search is the quantity
-    satisfying the definition of Throughput.
-
-    ### Metrics other than frames
-
-    TODO expand: Small TCP transaction can succeed even if some frames are lost.
-
-    TODO expand: It is possible for loss ratio to use different metric than load.
-    E.g. pps loss ratio when traffic profile uses higher level transactions per second.
-
-    ### TODO: Stateful DUT
-
-    ### TODO: Stateful traffic
-{:/comment}
-
-## Non-Zero Target Loss Ratios
-
-https://datatracker.ietf.org/doc/html/rfc1242#section-3.17
-defines Throughput as:
-    The maximum rate at which none of the offered frames
-    are dropped by the device.
-
-and then it says:
-    Since even the loss of one frame in a
-    data stream can cause significant delays while
-    waiting for the higher level protocols to time out,
-    it is useful to know the actual maximum data
-    rate that the device can support.
-
-{::comment}
-
-    While this may still be true for some protocols,
-    research has been performed...
-
-    TODO: Add this link properly: https://www.itu.int/rec/dologin_pub.asp?lang=e&id=T-REC-Y.1541-201112-I!!PDF-E&type=items
-    TODO: List values from that document, from 10^-3 to 4*10^-6.
-
-    ...on other protocols and use cases,
-    resulting in some small but non-zero loss ratios being considered
-    as acceptable. Unfortunately, the acceptable value depends on use case
-    and properties such as TCP window size and round trip time,
-    so no single value of target loss rate (other than zero)
-    is considered to be universally applicable.
-
-{:/comment}
-
-New "software DUTs" (traffic forwarding programs running on
-commercial-off-the-shelf compute server hardware) frequently exhibit quite
-low repeatability of Throughput results per above definition.
-
-This is due to, in general, throughput rates of software DUTs (programs)
-being sensitive to server resource allocation by OS during runtime,
-as well as any interrupts or blocking of software threads involved
-in packet processing.
-
-To deal with this, this document recommends discovery of multiple throughput rates of interest for software DUTs that run on general purpose COTS servers (with x86, AArch64 Instruction Set Architectures):
-* throughput rate with target of zero packet loss ratio.
-* at least one throughput rate with target of non-zero packet loss ratio.
-
-
-In our experience, the higher the target loss ratio is,
-the better is the repeatability of the corresponding load found.
-
-TODO: Define a good name for a load corresponding to a specific non-zero
-target loss ration, while keeping Throughput for the load corresponding
-to zero target loss ratio.
-
-This document RECOMMENDS the benchmark groups to search for corresponding loads
-to at least one non-zero target loss ratio.
-This document does not suggest any particular non-zero target loss ratio value
-to search the corresponding load for.
-
-{::comment}
-    What is worse, some benchmark groups (which groups?; citation needed)
-    started reporting loads that achieved only "approximate zero loss",
-    while still calling that a Throughput (and thus becoming non-compliant
-    with RFC2544).
-{:/comment}
-
-# Solution ideas
-
-This document gives several independent ideas on how to lower the (average)
-overall search time, while remaining unconditionally compliant with RFC2544
-(and adding some of extensions).
-
-This document also specifies one particular way to combine all the ideas
-into a single search algorithm class (single logic with few tweakable parameters).
-
-Little to no research has been done into the question of which combination
-of ideas achieves the best compromise with respect to overal search time,
-high repeatability and high comparability.
-
-TODO: How important it is to discuss particular implementation choices,
-especially when motivated by non-deterministic SUT behavior?
-
-## Short duration trials
-
-https://datatracker.ietf.org/doc/html/rfc2544#section-24
-already mentions the possibity of using shorter duration
-for trials that are not part of "final determination".
-
-Obviously, the upper and lower bound from a smaller duration trial
-can be used as the initial upper and lower bound for the final determination.
-
-MLRsearch makes it clear a re-measurement is always needed
-(new trial measurement with the same load but longer duration).
-It also specifes what to do if the longer trial is no longer a valid bound
-(TODO define?), e.g. start an external search.
-Additionaly one halving can be saved during the shorter duration search.
-
-## FRMOL as reasonable start
-
-TODO expand: Overal search ends with "final determination" search,
-preceded by "shorter duration search" preceded by "bound initialization",
-where the bounds can be considerably different from min and max load.
-
-For SUTs with high repeatability, the FRMOL is usually a good approximation
-of Throughput. But for less repeatable SUTs, forwarding rate (TODO define)
-is frequently a bad approximation to Throughput, therefore halving
-and other robust-to-worst-case approaches have to be used.
-Still, forwarding rate at FRMOL load can be a good initial bound.
-
-## Non-zero loss ratios
-
-See the "Popularity of non-zero target loss ratios" section above.
-
-TODO: Define "trial measurement result classification criteria",
-or keep reusing long phrases without definitions?
-
-A search for a load corresponding to a non-zero target loss rate
-is very similar to a search for Throughput,
-just the criterion when to increase or decrease the intended load
-for the next trial measurement uses the comparison of trial loss ratio
-to the target loss ratio (instead of comparing loss count to zero)
-Any search algorithm that works for Throughput can be easily used also for
-non-zero target loss rates, perhaps with small modifications
-in places where the measured forwarding rate is used.
-
-Note that it is possible to search for multiple loss ratio goals if needed.
-
-## Concurrent ratio search
-
-A single trial measurement result can act as an upper bound for a lower
-target loss ratio, and as a lower bound for a higher target loss ratio
-at the same time. This is an example of how
-it can be advantageous to search for all loss ratio goals "at once",
-or at least "reuse" trial measurement result done so far.
-
-Even when a search algorithm is fully deterministic in load selection
-while focusing on a single loss ratio and trial duration,
-the choice of iteration order between target loss ratios and trial durations
-can affect the obtained results in subtle ways.
-MLRsearch offers one particular ordering.
-
-{::comment}
-    It is not clear if the current ordering is "best",
-    it is not even clear how to measure how good an ordering is.
-    We would need several models for bad SUT behaviors,
-    bug-free implementations of different orderings,
-    simulator to show the distribution of rates found,
-    distribution of overall durations,
-    and a criterion of which rate distribution is "bad"
-    and whether it is worth the time saved.
-{:/comment}
-{::comment}
-{:/comment}
-
-## Load selection heuristics and shortcuts
-
-Aside of the two heuristics already mentioned (FRMOL based initial bounds
-and saving one halving when increasing trial duration),
-there are other tricks that can save some overall search time
-at the cost of keeping the difference between final lower and upper bound
-intentionally large (but still within the precision goal).
-
-TODO: Refer implementation subsections on:
-* Uneven splits.
-* Rounding the interval width up.
-* Using old invalid bounds for interval width guessing.
-
-The impact on overall duration is probably small,
-and the effect on result distribution maybe even smaller.
-TODO: Is the two-liner above useful at all?
-
-# Non-compliance with RFC2544
-
-It is possible to achieve even faster search times by abandoning
-some requirements and suggestions of RFC2544,
-mainly by reducing the wait times at start and end of trial.
-
-Such results are therefore no longer compliant with RFC2544
-(or at least not unconditionally),
-but they may still be useful for internal usage, or for comparing
-results of different DUTs achieved with an identical non-compliant algorithm.
-
-TODO: Refer to the subsection with CSIT customizations.
-
-# Additional Requirements
-
-RFC2544 can be understood as having a number of implicit requirements.
-They are made explicit in this section
-(as requirements for this document, not for RFC2544).
-
-Recommendations on how to properly address the implicit requirements
-are out of scope of this document.
-
-{::comment}
-
-    Although some (insufficient) ideas are proposed.
-
-{:/comment}
-
-## TODO: Search Stop Criteria
-
-TODO: Mention the timeout parameter?
-
-{::comment}
-
-    TODO: highlight importance of results consistency
-    for SUT performance trending and anomaly detection.
-
-{:/comment}
-
-## Reliability of Test Equipment
-
-Both TG and TA MUST be able to handle correctly
-every intended load used during the search.
-
-On TG side, the difference between Intended Load and Offered Load
-MUST be small.
-
-TODO: How small? Difference of one packet may not be measurable
-due to time uncertainties.
-
-{::comment}
-
-    Maciek: 1 packet out of 10M, that's 10**-7 accuracy.
-
-    Vratko: For example, TRex uses several "worker" threads, each doing its own
-    rounding on how many packets to send, separately per each traffic stream.
-    For high loads and durations, the observed number of frames transmitted
-    can differ from the expected (fractional) value by tens of frames.
-
-{:/comment}
-
-TODO expand: time uncertainty.
-
-To ensure that, max load (see Terminology) has to be set to low enough value.
-Benchmark groups MAY list the max load value used,
-especially if the Throughput value is equal (or close) to the max load.
-
-{::comment}
-
-    The following is probably out of scope of this document,
-    but can be useful when put into a separate document.
-
-    TODO expand: If it results in smaller Throughput reported,
-    it is not a big issue. Treat similarly to bandwidth and PPS limits of NICs.
-
-    TODO expand: TA dropping packets when loaded only lowers Throughput,
-    so not an issue.
-
-    TODO expand: TG sending less packets but stopping at target duration
-    is also fine, as long as the forwarding rate is used as Throughput value,
-    not the higher intended load.
-
-    TODO expand: Duration stretching is not fine.
-    Neither "check for actual duration" nor "start+sleep+stop"
-    are reliable solutions due to time overheads and uncertainty
-    of TG starting/stopping traffic (and TA stopping counting packets).
-
-{:/comment}
-
-Solutions (even problem formulations) for the following open problems
-are outside of the scope of this document:
-* Detecting when the test equipment operates above its safe load.
-* Finding a large but safe load value.
-* Correcting any result affected by max load value not being a safe load.
-
-{::comment}
-
-    TODO: Mention 90% of self-test as an idea:
-    https://datatracker.ietf.org/doc/html/rfc8219#section-9.2.1
-
-    This is pointing to DNS testing, nothing to do with throughput,
-    so how is it relevant here?
-
-{:/comment}
-
-{::comment}
-
-    Part of discussion on BMWG mailing list (with small edits):
-
-    This is a hard issue.
-    The algorithm as described has no way of knowing
-    which part of the whole system is limiting the performance.
-
-    It could be SUT only (no problem, testing SUT as expected),
-    it could be TG only (can be mitigated by TG self-test
-    and using small enough loads).
-
-    But it could also be an interaction between DUT and TG.
-    Imagine a TG (the Traffic Analyzer part) which is only able
-    to handle incoming traffic up to some rate,
-    but passes the self-test as the Generator part has maximal rate
-    not larger than that. But what if SUT turns that steady rate
-    into long-enough bursts of a higher rate (with delays between bursts
-    large enough, so average forwarding rate matches the load).
-    This way TA will see some packets as missing (when its buffers
-    fill up), even though SUT has processed them correctly.
-
-{:/comment}
-
-### Very late frames
-
-{::comment}
-
-    In CSIT we are aggressive at skipping all wait times around trial,
-    but few of DUTs have large enough buffers.
-    Or there is another reason why we are seeing negative loss counts.
-
-{:/comment}
-
-
-RFC2544 requires quite conservative time delays
-see https://datatracker.ietf.org/doc/html/rfc2544#section-23
-to prevent frames buffered in one trial measurement
-to be counted as received in a subsequent trial measurement.
-
-However, for some SUTs it may still be possible to buffer enough frames,
-so they are still sending them (perhaps in bursts)
-when the next trial measurement starts.
-Sometimes, this can be detected as a negative trial loss count, e.g. TA receiving
-more frames than TG has sent during this trial measurement. Frame duplication
-is another way of causing the negative trial loss count.
-
-https://datatracker.ietf.org/doc/html/rfc2544#section-10
-recommends to use sequence numbers in frame payloads,
-but generating and verifying them requires test equipment resources,
-which may be not plenty enough to suport at high loads.
-(Using low enough max load would work, but frequently that would be
-smaller than SUT's sctual Throughput.)
-
-RFC2544 does not offer any solution to the negative loss problem,
-except implicitly treating negative trial loss counts
-the same way as positive trial loss counts.
-
-This document also does not offer any practical solution.
-
-Instead, this document SUGGESTS the search algorithm to take any precaution
-necessary to avoid very late frames.
-
-This document also REQUIRES any detected duplicate frames to be counted
-as additional lost frames.
-This document also REQUIRES, any negative trial loss ratio
-to be treated as positive trial loss ratio of the same absolute value.
-
-{::comment}
-
-    !!! Make sure this is covered elsewere, at least in better comments. !!!
-
-    ## TODO: Bad behavior of SUT
-
-    (Highest load with always zero loss can be quite far from lowest load
-    with always nonzero loss.)
-    (Non-determinism: warm up, periodic "stalls", perf decrease over time, ...)
-
-    Big buffers:
-    http://www.hit.bme.hu/~lencse/publications/ECC-2017-B-M-DNS64-revised.pdf
-    See page 8 and search for the word "gaming".
-
-{:/comment}
-
-!!! Nothing below is up-to-date with draft v02. !!!
-
-# MLRsearch Background
-
-TODO: Old section, probably obsoleted by preceding section(s).
-
-Multiple Loss Ratio search (MLRsearch) is a packet throughput search
-algorithm suitable for deterministic systems (as opposed to
-probabilistic systems). MLRsearch discovers multiple packet throughput
-rates in a single search, each rate is associated with a distinct
-Packet Loss Ratio (PLR) criterion.
-
-For cases when multiple rates need to be found, this property makes
-MLRsearch more efficient in terms of time execution, compared to
-traditional throughput search algorithms that discover a single packet
-rate per defined search criteria (e.g. a binary search specified by
-[RFC2544]). MLRsearch reduces execution time even further by relying on
-shorter trial durations of intermediate steps, with only the final
-measurements conducted at the specified final trial duration. This
-results in the shorter overall search execution time when compared to a
-traditional binary search, while guaranteeing the same results for
-deterministic systems.
-
-In practice, two rates with distinct PLRs are commonly used for packet
-throughput measurements of NFV systems: Non Drop Rate (NDR) with PLR=0
-and Partial Drop Rate (PDR) with PLR>0. The rest of this document
-describes MLRsearch with NDR and PDR pair as an example.
-
-Similarly to other throughput search approaches like binary search,
-MLRsearch is effective for SUTs/DUTs with PLR curve that is
-non-decreasing with growing offered load. It may not be as
-effective for SUTs/DUTs with abnormal PLR curves, although
-it will always converge to some value.
-
-MLRsearch relies on traffic generator to qualify the received packet
-stream as error-free, and invalidate the results if any disqualifying
-errors are present e.g. out-of-sequence frames.
-
-MLRsearch can be applied to both uni-directional and bi-directional
-throughput tests.
-
-For bi-directional tests, MLRsearch rates and ratios are aggregates of
-both directions, based on the following assumptions:
-
-* Traffic transmitted by traffic generator and received by SUT/DUT
-  has the same packet rate in each direction,
-  in other words the offered load is symmetric.
-* SUT/DUT packet processing capacity is the same in both directions,
-  resulting in the same packet loss under load.
-
-MLRsearch can be applied even without those assumptions,
-but in that case the aggregate loss ratio is less useful as a metric.
-
-MLRsearch can be used for network transactions consisting of more than
-just one packet, or anything else that has intended load as input
-and loss ratio as output (duration as input is optional).
-This text uses mostly packet-centric language.
-
-# MLRsearch Overview
-
-The main properties of MLRsearch:
-
-* MLRsearch is a duration aware multi-phase multi-rate search algorithm:
-  * Initial Phase determines promising starting interval for the search.
-  * Intermediate Phases progress towards defined final search criteria.
-  * Final Phase executes measurements according to the final search
-    criteria.
-  * Final search criteria are defined by following inputs:
-    * Target PLRs (e.g. 0.0 and 0.005 when searching for NDR and PDR).
-    * Final trial duration.
-    * Measurement resolution.
-* Initial Phase:
-  * Measure MRR over initial trial duration.
-  * Measured MRR is used as an input to the first intermediate phase.
-* Multiple Intermediate Phases:
-  * Trial duration:
-    * Start with initial trial duration in the first intermediate phase.
-    * Converge geometrically towards the final trial duration.
-  * Track all previous trial measurement results:
-    * Duration, offered load and loss ratio are tracked.
-    * Effective loss ratios are tracked.
-      * While in practice, real loss ratios can decrease with increasing load,
-        effective loss ratios never decrease. This is achieved by sorting
-        results by load, and using the effective loss ratio of the previous load
-        if the current loss ratio is smaller than that.
-    * The algorithm queries the results to find best lower and upper bounds.
-      * Effective loss ratios are always used.
-    * The phase ends if all target loss ratios have tight enough bounds.
-  * Search:
-    * Iterate over target loss ratios in increasing order.
-    * If both upper and lower bound are in measurement results for this duration,
-      apply bisect until the bounds are tight enough,
-      and continue with next loss ratio.
-    * If a bound is missing for this duration, but there exists a bound
-      from the previous duration (compatible with the other bound
-      at this duration), re-measure at the current duration.
-    * If a bound in one direction (upper or lower) is missing for this duration,
-      and the previous duration does not have a compatible bound,
-      compute the current "interval size" from the second tightest bound
-      in the other direction (lower or upper respectively)
-      for the current duration, and choose next offered load for external search.
-    * The logic guarantees that a measurement is never repeated with both
-      duration and offered load being the same.
-    * The logic guarantees that measurements for higher target loss ratio
-      iterations (still within the same phase duration) do not affect validity
-      and tightness of bounds for previous target loss ratio iterations
-      (at the same duration).
-  * Use of internal and external searches:
-    * External search:
-      * It is a variant of "exponential search".
-      * The "interval size" is multiplied by a configurable constant
-        (powers of two work well with the subsequent internal search).
-    * Internal search:
-      * A variant of binary search that measures at offered load between
-        the previously found bounds.
-      * The interval does not need to be split into exact halves,
-        if other split can get to the target width goal faster.
-        * The idea is to avoid returning interval narrower than the current
-          width goal. See sample implementation details, below.
-* Final Phase:
-  * Executed with the final test trial duration, and the final width
-    goal that determines resolution of the overall search.
-* Intermediate Phases together with the Final Phase are called
-  Non-Initial Phases.
-* The returned bounds stay within prescribed min_rate and max_rate.
-  * When returning min_rate or max_rate, the returned bounds may be invalid.
-    * E.g. upper bound at max_rate may come from a measurement
-      with loss ratio still not higher than the target loss ratio.
-
-The main benefits of MLRsearch vs. binary search include:
-
-* In general, MLRsearch is likely to execute more trials overall, but
-  likely less trials at a set final trial duration.
-* In well behaving cases, e.g. when results do not depend on trial
-  duration, it greatly reduces (>50%) the overall duration compared to a
-  single PDR (or NDR) binary search over duration, while finding
-  multiple drop rates.
-* In all cases MLRsearch yields the same or similar results to binary
-  search.
-* Note: both binary search and MLRsearch are susceptible to reporting
-  non-repeatable results across multiple runs for very bad behaving
-  cases.
-
-Caveats:
-
-* Worst case MLRsearch can take longer than a binary search, e.g. in case of
-  drastic changes in behaviour for trials at varying durations.
-  * Re-measurement at higher duration can trigger a long external search.
-    That never happens in binary search, which uses the final duration
-    from the start.
-
-# Sample Implementation
-
-Following is a brief description of a sample MLRsearch implementation,
-which is a simplified version of the existing implementation.
-
-## Input Parameters
-
-1. **max_rate** - Maximum Transmit Rate (MTR) of packets to
-   be used by external traffic generator implementing MLRsearch,
-   limited by the actual Ethernet link(s) rate, NIC model or traffic
-   generator capabilities.
-2. **min_rate** - minimum packet transmit rate to be used for
-   measurements. MLRsearch fails if lower transmit rate needs to be
-   used to meet search criteria.
-3. **final_trial_duration** - required trial duration for final rate
-   measurements.
-4. **initial_trial_duration** - trial duration for initial MLRsearch phase.
-5. **final_relative_width** - required measurement resolution expressed as
-   (lower_bound, upper_bound) interval width relative to upper_bound.
-6. **packet_loss_ratios** - list of maximum acceptable PLR search criteria.
-7. **number_of_intermediate_phases** - number of phases between the initial
-   phase and the final phase. Impacts the overall MLRsearch duration.
-   Less phases are required for well behaving cases, more phases
-   may be needed to reduce the overall search duration for worse behaving cases.
-
-## Initial Phase
-
-1. First trial measures at configured maximum transmit rate (MTR) and
-   discovers maximum receive rate (MRR).
-   * IN: trial_duration = initial_trial_duration.
-   * IN: offered_transmit_rate = maximum_transmit_rate.
-   * DO: single trial.
-   * OUT: measured loss ratio.
-   * OUT: MRR = measured receive rate.
-   Received rate is computed as intended load multiplied by pass ratio
-   (which is one minus loss ratio). This is useful when loss ratio is computed
-   from a different metric than intended load. For example, intended load
-   can be in transactions (multiple packets each), but loss ratio is computed
-   on level of packets, not transactions.
-
-   * Example: If MTR is 10 transactions per second, and each transaction has
-     10 packets, and receive rate is 90 packets per second, then loss rate
-     is 10%, and MRR is computed to be 9 transactions per second.
-
-   If MRR is too close to MTR, MRR is set below MTR so that interval width
-   is equal to the width goal of the first intermediate phase.
-   If MRR is less than min_rate, min_rate is used.
-2. Second trial measures at MRR and discovers MRR2.
-   * IN: trial_duration = initial_trial_duration.
-   * IN: offered_transmit_rate = MRR.
-   * DO: single trial.
-   * OUT: measured loss ratio.
-   * OUT: MRR2 = measured receive rate.
-   If MRR2 is less than min_rate, min_rate is used.
-   If loss ratio is less or equal to the smallest target loss ratio,
-   MRR2 is set to a value above MRR, so that interval width is equal
-   to the width goal of the first intermediate phase.
-   MRR2 could end up being equal to MTR (for example if both measurements so far
-   had zero loss), which was already measured, step 3 is skipped in that case.
-3. Third trial measures at MRR2.
-   * IN: trial_duration = initial_trial_duration.
-   * IN: offered_transmit_rate = MRR2.
-   * DO: single trial.
-   * OUT: measured loss ratio.
-   * OUT: MRR3 = measured receive rate.
-   If MRR3 is less than min_rate, min_rate is used.
-   If step 3 is not skipped, the first trial measurement is forgotten.
-   This is done because in practice (if MRR2 is above MRR), external search
-   from MRR and MRR2 is likely to lead to a faster intermediate phase
-   than a bisect between MRR2 and MTR.
-
-## Non-Initial Phases
-
-1. Main phase loop:
-   1. IN: trial_duration for the current phase. Set to
-      initial_trial_duration for the first intermediate phase; to
-      final_trial_duration for the final phase; or to the element of
-      interpolating geometric sequence for other intermediate phases.
-      For example with two intermediate phases, trial_duration of the
-      second intermediate phase is the geometric average of
-      initial_trial_duration and final_trial_duration.
-   2. IN: relative_width_goal for the current phase. Set to
-      final_relative_width for the final phase; doubled for each
-      preceding phase. For example with two intermediate phases, the
-      first intermediate phase uses quadruple of final_relative_width
-      and the second intermediate phase uses double of
-      final_relative_width.
-   3. IN: Measurement results from the previous phase (previous duration).
-   4. Internal target ratio loop:
-      1. IN: Target loss ratio for this iteration of ratio loop.
-      2. IN: Measurement results from all previous ratio loop iterations
-         of current phase (current duration).
-      3. DO: According to the procedure described in point 2:
-         1. either exit the phase (by jumping to 1.5),
-         2. or exit loop iteration (by continuing with next target loss ratio,
-            jumping to 1.4.1),
-         3. or calculate new transmit rate to measure with.
-      4. DO: Perform the trial measurement at the new transmit rate and
-         current trial duration, compute its loss ratio.
-      5. DO: Add the result and go to next iteration (1.4.1),
-         including the added trial result in 1.4.2.
-   5. OUT: Measurement results from this phase.
-   6. OUT: In the final phase, bounds for each target loss ratio
-      are extracted and returned.
-      1. If a valid bound does not exist, use min_rate or max_rate.
-2. New transmit rate (or exit) calculation (for point 1.4.3):
-   1. If the previous duration has the best upper and lower bound,
-      select the middle point as the new transmit rate.
-      1. See 2.5.3. below for the exact splitting logic.
-      2. This can be a no-op if interval is narrow enough already,
-         in that case continue with 2.2.
-      3. Discussion, assuming the middle point is selected and measured:
-         1. Regardless of loss rate measured, the result becomes
-            either best upper or best lower bound at current duration.
-         2. So this condition is satisfied at most once per iteration.
-         3. This also explains why previous phase has double width goal:
-            1. We avoid one more bisection at previous phase.
-            2. At most one bound (per iteration) is re-measured
-               with current duration.
-            3. Each re-measurement can trigger an external search.
-            4. Such surprising external searches are the main hurdle
-               in achieving low overall search durations.
-            5. Even without 1.1, there is at most one external search
-               per phase and target loss ratio.
-            6. But without 1.1 there can be two re-measurements,
-               each coming with a risk of triggering external search.
-   2. If the previous duration has one bound best, select its transmit rate.
-      In deterministic case this is the last measurement needed this iteration.
-   3. If only upper bound exists in current duration results:
-      1. This can only happen for the smallest target loss ratio.
-      2. If the upper bound was measured at min_rate,
-         exit the whole phase early (not investigating other target loss ratios).
-      3. Select new transmit rate using external search:
-         1. For computing previous interval size, use:
-            1. second tightest bound at current duration,
-            2. or tightest bound of previous duration,
-               if compatible and giving a more narrow interval,
-            3. or target interval width if none of the above is available.
-            4. In any case increase to target interval width if smaller.
-         2. Quadruple the interval width.
-         3. Use min_rate if the new transmit rate is lower.
-   4. If only lower bound exists in current duration results:
-      1. If the lower bound was measured at max_rate,
-         exit this iteration (continue with next lowest target loss ratio).
-      2. Select new transmit rate using external search:
-         1. For computing previous interval size, use:
-            1. second tightest bound at current duration,
-            2. or tightest bound of previous duration,
-               if compatible and giving a more narrow interval,
-            3. or target interval width if none of the above is available.
-            4. In any case increase to target interval width if smaller.
-         2. Quadruple the interval width.
-         3. Use max_rate if the new transmit rate is higher.
-   5. The only remaining option is both bounds in current duration results.
-      1. This can happen in two ways, depending on how the lower bound
-         was chosen.
-         1. It could have been selected for the current loss ratio,
-            e.g. in re-measurement (2.2) or in initial bisect (2.1).
-         2. It could have been found as an upper bound for the previous smaller
-            target loss ratio, in which case it might be too low.
-         3. The algorithm does not track which one is the case,
-            as the decision logic works well regardless.
-      2. Compute "extending down" candidate transmit rate exactly as in 2.3.
-      3. Compute "bisecting" candidate transmit rate:
-         1. Compute the current interval width from the two bounds.
-         2. Express the width as a (float) multiple of the target width goal
-            for this phase.
-         3. If the multiple is not higher than one, it means the width goal
-            is met. Exit this iteration and continue with next higher
-            target loss ratio.
-         4. If the multiple is two or less, use half of that
-            for new width if the lower subinterval.
-         5. Round the multiple up to nearest even integer.
-         6. Use half of that for new width if the lower subinterval.
-         7. Example: If lower bound is 2.0 and upper bound is 5.0, and width
-            goal is 1.0, the new candidate transmit rate will be 4.0.
-            This can save a measurement when 4.0 has small loss.
-            Selecting the average (3.5) would never save a measurement,
-            giving more narrow bounds instead.
-      4. If either candidate computation want to exit the iteration,
-         do as bisecting candidate computation says.
-      5. The remaining case is both candidates wanting to measure at some rate.
-         Use the higher rate. This prefers external search down narrow enough
-         interval, competing with perfectly sized lower bisect subinterval.
-
-# FD.io CSIT Implementation
-
-The only known working implementation of MLRsearch is in
-the open-source code running in Linux Foundation
-FD.io CSIT project [FDio-CSIT-MLRsearch] as part of
-a Continuous Integration / Continuous Development (CI/CD) framework.
-
-MLRsearch is also available as a Python package in [PyPI-MLRsearch].
-
-## Additional details
-
-This document so far has been describing a simplified version of
-MLRsearch algorithm. The full algorithm as implemented in CSIT contains
-additional logic, which makes some of the details (but not general
-ideas) above incorrect. Here is a short description of the additional
-logic as a list of principles, explaining their main differences from
-(or additions to) the simplified description, but without detailing
-their mutual interaction.
-
-1. Logarithmic transmit rate.
-   * In order to better fit the relative width goal, the interval
-     doubling and halving is done differently.
-   * For example, the middle of 2 and 8 is 4, not 5.
-2. Timeout for bad cases.
-   * The worst case for MLRsearch is when each phase converges to
-     intervals way different than the results of the previous phase.
-   * Rather than suffer total search time several times larger than pure
-     binary search, the implemented tests fail themselves when the
-     search takes too long (given by argument *timeout*).
-3. Intended count.
-   * The number of packets to send during the trial should be equal to
-     the intended load multiplied by the duration.
-     * Also multiplied by a coefficient, if loss ratio is calculated
-       from a different metric.
-       * Example: If a successful transaction uses 10 packets,
-         load is given in transactions per second, but loss ratio is calculated
-         from packets, so the coefficient to get intended count of packets
-         is 10.
-   * But in practice that does not work.
-     * It could result in a fractional number of packets,
-     * so it has to be rounded in a way traffic generator chooses,
-     * which may depend on the number of traffic flows
-       and traffic generator worker threads.
-4. Attempted count. As the real number of intended packets is not known exactly,
-   the computation uses the number of packets traffic generator reports as sent.
-   Unless overridden by the next point.
-5. Duration stretching.
-   * In some cases, traffic generator may get overloaded,
-     causing it to take significantly longer (than duration) to send all packets.
-   * The implementation uses an explicit stop,
-     * causing lower attempted count in those cases.
-   * The implementation tolerates some small difference between
-     attempted count and intended count.
-     * 10 microseconds worth of traffic is sufficient for our tests.
-   * If the difference is higher, the unsent packets are counted as lost.
-     * This forces the search to avoid the regions of high duration stretching.
-     * The final bounds describe the performance of not just SUT,
-       but of the whole system, including the traffic generator.
-6. Excess packets.
-   * In some test (e.g. using TCP flows) Traffic generator reacts to packet loss
-     by retransmission. Usually, such packet loss is already affecting loss ratio.
-     If a test also wants to treat retransmissions due to heavily delayed packets
-     also as a failure, this is once again visible as a mismatch between
-     the intended count and the attempted count.
-   * The CSIT implementation simply looks at absolute value of the difference,
-     so it offers the same small tolerance before it starts marking a "loss".
-7. For result processing, we use lower bounds and ignore upper bounds.
-
-### FD.io CSIT Input Parameters
-
-1. **max_rate** - Typical values: 2 * 14.88 Mpps for 64B
-   10GE link rate, 2 * 18.75 Mpps for 64B 40GE NIC (specific model).
-2. **min_rate** - Value: 2 * 9001 pps (we reserve 9000 pps
-   for latency measurements).
-3. **final_trial_duration** - Value: 30.0 seconds.
-4. **initial_trial_duration** - Value: 1.0 second.
-5. **final_relative_width** - Value: 0.005 (0.5%).
-6. **packet_loss_ratios** - Value: 0.0, 0.005 (0.0% for NDR, 0.5% for PDR).
-7. **number_of_intermediate_phases** - Value: 2.
-   The value has been chosen based on limited experimentation to date.
-   More experimentation needed to arrive to clearer guidelines.
-8. **timeout** - Limit for the overall search duration (for one search).
-   If MLRsearch oversteps this limit, it immediately declares the test failed,
-   to avoid wasting even more time on a misbehaving SUT.
-   Value: 600.0 (seconds).
-9. **expansion_coefficient** - Width multiplier for external search.
-   Value: 4.0 (interval width is quadroupled).
-   Value of 2.0 is best for well-behaved SUTs, but value of 4.0 has been found
-   to decrease overall search time for worse-behaved SUT configurations,
-   contributing more to the overall set of different SUT configurations tested.
-
-
-## Example MLRsearch Run
-
-
-The following list describes a search from a real test run in CSIT
-(using the default input values as above).
-
-* Initial phase, trial duration 1.0 second.
-
-Measurement 1, intended load 18750000.0 pps (MTR),
-measured loss ratio 0.7089514628479618 (valid upper bound for both NDR and PDR).
-
-Measurement 2, intended load 5457160.071600716 pps (MRR),
-measured loss ratio 0.018650817320118702 (new tightest upper bounds).
-
-Measurement 3, intended load 5348832.933500009 pps (slightly less than MRR2
-in preparation for first intermediate phase target interval width),
-measured loss ratio 0.00964383362905351 (new tightest upper bounds).
-
-* First intermediate phase starts, trial duration still 1.0 seconds.
-
-Measurement 4, intended load 4936605.579021453 pps (no lower bound,
-performing external search downwards, for NDR),
-measured loss ratio 0.0 (valid lower bound for both NDR and PDR).
-
-Measurement 5, intended load 5138587.208637197 pps (bisecting for NDR),
-measured loss ratio 0.0 (new tightest lower bounds).
-
-Measurement 6, intended load 5242656.244044665 pps (bisecting),
-measured loss ratio 0.013523745379347257 (new tightest upper bounds).
-
-* Both intervals are narrow enough.
-* Second intermediate phase starts, trial duration 5.477225575051661 seconds.
-
-Measurement 7, intended load 5190360.904111567 pps (initial bisect for NDR),
-measured loss ratio 0.0023533920869969953 (NDR upper bound, PDR lower bound).
-
-Measurement 8, intended load 5138587.208637197 pps (re-measuring NDR lower bound),
-measured loss ratio 1.2080222912800403e-06 (new tightest NDR upper bound).
-
-* The two intervals have separate bounds from now on.
-
-Measurement 9, intended load 4936605.381062318 pps (external NDR search down),
-measured loss ratio 0.0 (new valid NDR lower bound).
-
-Measurement 10, intended load 5036583.888432355 pps (NDR bisect),
-measured loss ratio 0.0 (new tightest NDR lower bound).
-
-Measurement 11, intended load 5087329.903232804 pps (NDR bisect),
-measured loss ratio 0.0 (new tightest NDR lower bound).
-
-* NDR interval is narrow enough, PDR interval not ready yet.
-
-Measurement 12, intended load 5242656.244044665 pps (re-measuring PDR upper bound),
-measured loss ratio 0.0101174866190136 (still valid PDR upper bound).
-
-* Also PDR interval is narrow enough, with valid bounds for this duration.
-* Final phase starts, trial duration 30.0 seconds.
-
-Measurement 13, intended load 5112894.3238511775 pps (initial bisect for NDR),
-measured loss ratio 0.0 (new tightest NDR lower bound).
-
-Measurement 14, intended load 5138587.208637197 (re-measuring NDR upper bound),
-measured loss ratio 2.030389804256833e-06 (still valid PDR upper bound).
-
-* NDR interval is narrow enough, PDR interval not yet.
-
-Measurement 15, intended load 5216443.04126728 pps (initial bisect for PDR),
-measured loss ratio 0.005620871287975237 (new tightest PDR upper bound).
-
-Measurement 16, intended load 5190360.904111567 (re-measuring PDR lower bound),
-measured loss ratio 0.0027629971184465604 (still valid PDR lower bound).
-
-* PDR interval is also narrow enough.
-* Returning bounds:
-* NDR_LOWER = 5112894.3238511775 pps; NDR_UPPER = 5138587.208637197 pps;
-* PDR_LOWER = 5190360.904111567 pps; PDR_UPPER = 5216443.04126728 pps.
-
-# IANA Considerations
-
-No requests of IANA.
-
-# Security Considerations
-
-Benchmarking activities as described in this memo are limited to
-technology characterization of a DUT/SUT using controlled stimuli in a
-laboratory environment, with dedicated address space and the constraints
-specified in the sections above.
-
-The benchmarking network topology will be an independent test setup and
-MUST NOT be connected to devices that may forward the test traffic into
-a production network or misroute traffic to the test management network.
-
-Further, benchmarking is performed on a "black-box" basis, relying
-solely on measurements observable external to the DUT/SUT.
-
-Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
-benchmarking purposes. Any implications for network security arising
-from the DUT/SUT SHOULD be identical in the lab and in production
-networks.
-
-# Acknowledgements
-
-Many thanks to Alec Hothan of OPNFV NFVbench project for thorough
-review and numerous useful comments and suggestions.
-
---- back
diff --git a/docs/ietf/draft-ietf-bmwg-mlrsearch-03.md b/docs/ietf/draft-ietf-bmwg-mlrsearch-03.md
new file mode 100644
index 0000000000..40180dc55b
--- /dev/null
+++ b/docs/ietf/draft-ietf-bmwg-mlrsearch-03.md
@@ -0,0 +1,501 @@
+---
+title: Multiple Loss Ratio Search
+abbrev: MLRsearch
+docname: draft-ietf-bmwg-mlrsearch-03
+date: 2022-11-09
+
+ipr: trust200902
+area: ops
+wg: Benchmarking Working Group
+kw: Internet-Draft
+cat: info
+
+coding: us-ascii
+pi:    # can use array (if all yes) or hash here
+  toc: yes
+  sortrefs:   # defaults to yes
+  symrefs: yes
+
+author:
+      -
+        ins: M. Konstantynowicz
+        name: Maciek Konstantynowicz
+        org: Cisco Systems
+        email: mkonstan@cisco.com
+      -
+        ins: V. Polak
+        name: Vratko Polak
+        org: Cisco Systems
+        email: vrpolak@cisco.com
+
+normative:
+  RFC1242:
+  RFC2285:
+  RFC2544:
+  RFC9004:
+
+informative:
+  TST009:
+    target: https://www.etsi.org/deliver/etsi_gs/NFV-TST/001_099/009/03.04.01_60/gs_NFV-TST009v030401p.pdf
+    title: "TST 009"
+  FDio-CSIT-MLRsearch:
+    target: https://s3-docs.fd.io/csit/rls2110/report/introduction/methodology_data_plane_throughput/methodology_data_plane_throughput.html#mlrsearch-tests
+    title: "FD.io CSIT Test Methodology - MLRsearch"
+    date: 2021-11
+  PyPI-MLRsearch:
+    target: https://pypi.org/project/MLRsearch/0.3.0/
+    title: "MLRsearch 0.3.0, Python Package Index"
+    date: 2021-04
+
+--- abstract
+
+This document proposes improvements to [RFC2544] throughput search by
+defining a new methodology called Multiple Loss Ratio search
+(MLRsearch). The main objectives for MLRsearch are to minimize the
+total test duration, search for multiple loss ratios and improve
+results repeatibility and comparability.
+
+The main motivation behind MLRsearch is the new set of challenges and
+requirements posed by testing Network Function Virtualization
+(NFV) systems and other software based network data planes.
+
+MLRsearch offers several ways to address these challenges, giving user
+configuration options to select their way.
+
+--- middle
+
+{::comment}
+    As we use kramdown to convert from markdown,
+    we use this way of marking comments not to be visible in rendered draft.
+    https://stackoverflow.com/a/42323390
+    If other engine is used, convert to this way:
+    https://stackoverflow.com/a/20885980
+{:/comment}
+
+# Purpose and Scope
+
+The purpose of this document is to describe Multiple Loss Ratio search
+(MLRsearch), a throughput search methodology optimized for software
+DUTs.
+
+Applying vanilla [RFC2544] throughput bisection to software DUTs
+results in a number of problems:
+
+- Binary search takes too long as most of trials are done far from the
+  eventually found throughput.
+- The required final trial duration and pauses between trials also
+  prolong the overall search duration.
+- Software DUTs show noisy trial results (noisy neighbor problem),
+  leading to big spread of possible discovered throughput values.
+- Throughput requires loss of exactly zero packets, but the industry
+  frequently allows for small but non-zero losses.
+- The definition of throughput is not clear when trial results are
+  inconsistent.
+
+MLRsearch aims to address these problems by applying the following set
+of enhancements:
+
+- Allow searching with multiple loss ratio goals.
+  - Each trial result can affect any search goal in principle
+    (trial reuse).
+- Multiple phases within one loss ratio goal search, middle ones need
+  to spend less time on trials.
+  - Middle phases also aim at lesser precision.
+  - Use Forwarding Rate (FR) at maximum offered load
+    [RFC2285] (section 3.6.2) to initialize the first middle phase.
+- Take care when dealing with inconsistent trial results.
+  - Loss ratios goals are handled in an order that precludes any
+    interference from later trials to earlier goals.
+- Apply several load selection heuristics to save even more time
+  by trying hard to avoid unnecessarily narrow intervals.
+
+MLRsearch configuration options are flexible enough to
+support both conservative settings (unconditionally compliant with [RFC2544],
+but longer search duration and worse repeatability) and aggressive
+settings (shorter search duration and better repeatability but not
+compliant with [RFC2544]).
+
+No part of [RFC2544] is intended to be obsoleted by this document.
+
+# Problems
+
+## Long Test Duration
+
+Emergence of software DUTs, with frequent software updates and a
+number of different packet processing modes and configurations, drives
+the requirement of continuous test execution and bringing down the test
+execution time.
+
+In the context of characterising particular DUT's network performance, this
+calls for improving the time efficiency of throughput search.
+A vanilla bisection (at 60sec trial duration for unconditional [RFC2544]
+compliance) is slow, because most trials spend time quite far from the
+eventual throughput.
+
+[RFC2544] does not specify any stopping condition for throughput search,
+so users can trade-off between search duration and precision goal.
+But, due to exponential behavior of bisection, small improvement
+in search duration needs relatively big sacrifice in the result precision.
+
+## DUT within SUT
+
+[RFC2285] defines:
+- *DUT* as
+  - The network forwarding device to which stimulus is offered and
+    response measured [RFC2285] (section 3.1.1).
+- *SUT* as
+  - The collective set of network devices to which stimulus is offered
+    as a single entity and response measured [RFC2285] (section 3.1.2).
+
+[RFC2544] specifies a test setup with an external tester stimulating the
+networking system, treating it either as a single DUT, or as a system
+of devices, an SUT.
+
+In case of software networking, the SUT consists of a software program
+processing packets (device of interest, the DUT),
+running on a server hardware and using operating system functions as appropriate,
+with server hardware resources shared across all programs
+and the operating system.
+
+DUT is effectively "nested" within SUT.
+
+Due to a shared multi-tenant nature of SUT, DUT is subject to
+interference (noise) coming from the operating system and any other
+software running on the same server. Some sources of noise can be
+eliminated (e.g. by pinning DUT program threads to specific CPU cores
+and isolating those cores to avoid context switching). But some
+noise remains after all such reasonable precautions are applied. This
+noise does negatively affect DUT's network performance. We refer to it
+as an *SUT noise*.
+
+DUT can also exhibit fluctuating performance itself, e.g. while performing
+some "stop the world" internal stateful processing. In many cases this
+may be an expected per-design behavior, as it would be observable even
+in a hypothetical scenario where all sources of SUT noise are
+eliminated. Such behavior affects trial results in a way similar to SUT
+noise. We use *noise* as a shorthand covering both *DUT fluctuations* and
+genuine SUT noise.
+
+A simple model of SUT performance consists of a baseline *noiseless performance*,
+and an additional noise. The baseline is assumed to be constant (enough).
+The noise varies in time, sometimes wildly. The noise can sometimes be negligible,
+but frequently it lowers the observed SUT performance in a trial.
+
+In this model, SUT does not have a single performance value, it has a spectrum.
+One end of the spectrum is the noiseless baseline,
+the other end is a *noiseful performance*. In practice, trial results
+close to the noiseful end of the spectrum happen only rarely.
+The worse performance, the more rarely it is seen.
+
+Focusing on DUT, the benchmarking effort should aim
+at eliminating only the SUT noise from SUT measurement.
+But that is not really possible, as there are no realistic enough models
+able to distinguish SUT noise from DUT fluctuations.
+
+However, assuming that a well-constructed SUT has the DUT as its
+performance bottleneck, the "DUT noiseless performance" can be defined
+as the noiseless end of SUT performance spectrum. (At least for
+throughput. For other quantities such as latency there will be an
+additive difference.) By this definition, DUT noiseless performance
+also minimizes the impact of DUT fluctuations.
+
+In this document, we reduce the "DUT within SUT" problem to estimating
+the noiseless end of SUT performance spectrum from a limited number of
+trial results.
+
+Any improvements to throughput search algorithm, aimed for better
+dealing with software networking SUT and DUT setup, should employ
+strategies recognizing the presence of SUT noise, and allow discovery of
+(proxies for) DUT noiseless performance
+at different levels of sensitivity to SUT noise.
+
+## Repeatability and Comparability
+
+[RFC2544] does not suggest to repeat throughput search, and from just one
+throughput value, it cannot be determined how repeatable that value is.
+In practice, poor repeatability is also the main cause of poor
+comparability, e.g. different benchmarking teams can test the same DUT
+but get different throughput values.
+
+[RFC2544] throughput requirements (60s trial, no tolerance to single frame loss)
+force the search to converge around the noiseful end of SUT performance
+spectrum. As that end is affected by rare trials of significantly low
+performance, the resulting throughput repeatability is poor.
+
+The repeatability problem is the problem of defining a search procedure
+which reports more stable results
+(even if they can no longer be called "throughput" in [RFC2544] sense).
+According to baseline (noiseless) and noiseful model, better repeatability
+will be at the noiseless end of the spectrum.
+Therefore, solutions to the "DUT within SUT" problem
+will help also with the repeatability problem.
+
+Conversely, any alteration to [RFC2544] throughput search
+that improves repeatability should be considered
+as less dependent on the SUT noise.
+
+An alternative option is to simply run a search multiple times, and report some
+statistics (e.g. average and standard deviation). This can be used
+for "important" tests, but it makes the search duration problem even
+bigger.
+
+## Throughput with Non-Zero Loss
+
+[RFC1242] (section 3.17) defines throughput as:
+    The maximum rate at which none of the offered frames
+    are dropped by the device.
+
+and then it says:
+    Since even the loss of one frame in a
+    data stream can cause significant delays while
+    waiting for the higher level protocols to time out,
+    it is useful to know the actual maximum data
+    rate that the device can support.
+
+Contrary to that, many benchmarking teams settle with non-zero
+(small) loss ratio as the goal for a "throughput rate".
+
+Motivations are many: modern protocols tolerate frame loss better;
+trials nowadays send way more frames within the same duration;
+impact of rare noise bursts is smaller as the baseline performance
+can compensate somewhat by keeping the loss ratio below the goal;
+if SUT noise with "ideal DUT" is known, it can be set as the loss ratio goal.
+
+Regardless of validity of any and all similar motivations,
+support for non-zero loss goals makes any search algorithm more user-friendly.
+[RFC2544] throughput is not friendly in this regard.
+
+Searching for multiple loss ratio goals also helps to describe the SUT
+performance better than a single goal result. Repeated wide gap between
+zero and non-zero loss loads indicates the noise has a large impact on
+the overall SUT performance.
+
+It is easy to modify the vanilla bisection to find a lower bound
+for intended load that satisfies a non-zero-loss goal,
+but it is not that obvious how to search for multiple goals at once,
+hence the support for multiple loss goals remains a problem.
+
+## Inconsistent Trial Results
+
+While performing throughput search by executing a sequence of
+measurement trials, there is a risk of encountering inconsistencies
+between trial results.
+
+The plain bisection never encounters inconsistent trials.
+But [RFC2544] hints about possibility if inconsistent trial results in two places.
+The first place is section 24 where full trial durations are required, presumably
+because they can be inconsistent with results from shorter trial durations.
+The second place is section 26.3 where two successive zero-loss trials
+are recommended, presumably because after one zero-loss trial
+there can be subsequent inconsistent non-zero-loss trial.
+
+Examples include:
+
+- a trial at the same load (same or different trial duration) results
+  in a different packet loss ratio.
+- a trial at higher load (same or different trial duration) results
+  in a smaller packet loss ratio.
+
+Any robust throughput search algorithm needs to decide how to continue
+the search in presence of such inconsistencies.
+Definitions of throughput in [RFC1242] and [RFC2544] are not specific enough
+to imply a unique way of handling such inconsistencies.
+
+Ideally, there will be a definition of a quantity which both generalizes
+throughput for non-zero-loss (and other possible repeatibility enhancements),
+while being precise enough to force a specific way to resolve trial
+inconsistencies.
+But until such definition is agreed upon, the correct way to handle
+inconsistent trial results remains an open problem.
+
+# MLRsearch Approach
+
+The following description intentionally leaves out some important implementation
+details. This is both to hide complexity that is not important for overall
+understanding, and to allow future improvements in the implementation.
+
+## Terminology
+
+- *trial duration*: Amount of time over which frames are transmitted
+  towards SUT and DUT in a single measurement step.
+  - **MLRsearch input parameter** for final MLRsearch measurements.
+- *loss ratio*: Ratio of the count of frames lost to the count of frames
+  transmitted over a trial duration, a.k.a. packet loss ratio. Related
+  to packet loss rate [RFC1242] (section 3.6).
+  In MLRsearch loss ratio can mean either a trial result or a goal:
+  - *trial loss ratio*: Loss ratio measured during a trial.
+  - *loss ratio goal*: **MLRsearch input parameter**.
+    - If *trial loss ratio* is smaller or equal to this,
+      the trial **satisfies** the loss ratio goal.
+- *load*: Constant offered load stimulating the SUT and DUT. Consistent
+  with offered load [RFC2285] (section 3.5.2).
+  - MLRsearch works with intended load instead, as it cannot deal with
+    situations where the offered load is considerably different than
+    intended load.
+- *throughput*: The maximum load at which none of the offered frames are
+  dropped by the SUT and DUT. Consistent with [RFC1242] (section 3.17).
+- *conditional throughput*: The forwarding rate measured at the maximum
+  load at which a list of specified conditions are met i.e. loss ratio
+  goal and trial duration.
+  - Throughput is then a special case of conditional throughput
+    for zero loss ratio goal and long enough trial duration.
+  - Conditional throughput is aligned with forwarding rate (FR)
+    [RFC2285] (section 3.6.1), adding trial duration to offered load
+    required when reporting FR.
+- *lower bound*: One of values tracked by MLRsearch during the search runtime.
+  It is specific to the current trial duration and current loss ratio goal.
+  It represents a load value with at least one trial result available.
+  If the trial satisfies the current loss ratio goal,
+  it is a *valid* bound (else *invalid*).
+- *upper bound*: One of values tracked by MLRsearch during the search runtime.
+  It is specific to the current trial duration and current loss ratio goal.
+  It represents a load value with at least one trial result available.
+  If the trial satisfies the current loss ratio goal,
+  it is an *invalid* bound (else *valid*).
+- *interval*: The span between lower and upper bound loads.
+- *precision goal*: **MLRsearch input parameter**, acting as a search
+  stop condition, given as either absolute or relative width goal. An
+  interval meets precision goal if:
+  - The difference of upper and lower bound loads (in pps)
+    is not more than the absolute width goal.
+  - The difference as above, divided by upper bound load (in pps)
+    is not more than the relative width goal.
+
+## Description
+
+The MLRsearch approach to address the identified problems is based
+on the following main strategies:
+
+- MLRsearch main inputs include the following search goals and parameters:
+  - One or more **loss ratio goals**.
+    - e.g. a zero-loss goal and one (or more) non-zero-loss goals.
+  - **Target trial duration** condition governing required trial duration
+    for final measurements.
+  - **Target precision** condition governing how close final lower and
+    upper bound load values must be to each other for final
+    measurements.
+- Search is executed as a sequence of phases:
+  - *Initial phase* initializes bounds for the first middle phase.
+  - *Middle phase*s narrow down the bounds, using shorter trial
+    durations and lower precision goals. Several middle phases can
+    precede each final phase.
+  - *Final phase* (one per loss ratio goal) finds bounds matching input
+    goals and parameters to serve as the overal search output.
+- Each search phase produces its *ending* upper bound and lower bound:
+  - Initial phase may produce invalid bounds.
+  - Middle and final phases produce valid bounds.
+  - Middle or final phases needs at least two values to act as 
+    *starting* bounds (may be invalid).
+  - Each phase may perform several trial measurements, until phase's
+    ending conditions are all met.
+  - Trial results from previous phases may be re-used.
+- Initial phase establishes the starting values for bounds, using
+  forwarding rates (FR) [RFC2285] (section 3.6.1)
+  from a few trials of minimal duration, as follows:
+  - 1st trial is done at *maximum offered load (MOL)* [RFC2285] (section 3.5.3),
+    resulting in Forwarding rate at maximum offered load (FRMOL)
+    [RFC2285] (section 3.6.2).
+  - 2nd trial is done at *FRMOL*, resulting in forwarding rate at FRMOL (FRFRMOL),
+    newly defined here.
+  - 3rd trial is done at *FRFRMOL*, so its results are available for the next phase.
+  - By default, FRMOL is used as an upper bound, FRFRMOL as a lower bound.
+    - Adjustments may apply here for some cases e.g. when 2nd trial got
+      zero loss or if FRFRMOL is too close to FRMOL.
+- Middle phases are producing ending bounds by improving upon starting bounds:
+  - Each middle phase uses the same loss ratio goal as the final phase it precedes.
+    - Called *current loss ratio goal* for upper and lower bound purposes.
+  - Each middle phase has its own *current trial duration*
+    and *current precision goal* parameters, computed from
+    MLRsearch input parameters.
+    As phases progress, these parameters approach MLRsearch main input values.
+    - Current trial duration starts from a configurable minimum (e.g. 1 sec)
+      and increases in a geometric sequence.
+    - Current precision goal always allows twice as wide intervals
+      as the following phase.
+  - The starting bounds are usually the ending bounds from the preceding phase.
+    - Unless there are many previous trial results that are more promising.
+  - Each middle phase operates in a sequence of four actions:
+    1. Perform trial at the load between the starting bounds.
+      - Depending on the trial result this becomes the first
+        new valid upper or lower bound for current phase.
+    2. Re-measure at the remaining starting lower or upper (respectively) bound.
+    3. If that did not result in a valid bound, start an *external search*.
+      - That is a variant of exponential search.
+        - The "growth" is given by input parameter *expansion_coefficient*.
+      - This action ends when a new valid bound is found.
+        - Or if an already existing valid bound becomes close enough.
+    4. Repeatedly bisect the current interval until the bounds are close enough.
+- Final search phase operates in exactly the same way as middle phases.
+  There are two reasons why it is named differently:
+  - The current trial duration and current precision goal within the phase
+    are equal to the target trial duration and target precision input parameters.
+  - The forwarding rates of the ending bounds become the output of MLRsearch.
+    - Specifically, the forwarding rates of the final lower bounds
+      are the conditional throughput values per given loss ratio goals.
+
+## Enhancement: Multiple trials per load
+
+An enhancement of MLRsearch is to introduce a *noise tolerance* input parameter.
+The idea is to perform several medium-length trials (instead of a single long trial)
+and tolerate a configurable fraction of them to not-satisfy the loss ratio goal.
+
+MLRsearch implementation with this enhancement exists in FD.io CSIT project
+and test results of VPP and DPDK (testpmd, l3fwd) DUTs look promising.
+
+This enhancement would make the description of MLRsearch approach
+considerably more complicated, so this document version only describes
+MLRsearch without this enhancement.
+
+# How the problems are addressed
+
+Configurable loss ratio goals are in direct support for non-zero-loss conditional througput.
+In practice the conditional throughput results' stability
+increases with higher loss ratio goals.
+
+Multiple trials with noise tolerance enhancement will also indirectly
+increase result stability and it will allow MLRsearch
+to add all the benefits of Binary Search with Loss Verification,
+as recommended in [RFC9004] (section 6.2)
+and specified in [TST009] (section 12.3.3).
+
+The main factor improving the overall search time is the introduction
+of middle phases. The full implementation can bring a large number of
+heuristics related to how exactly should the next trial load be chosen,
+but the impact of those is not as big.
+
+The Description subsection lacks any details on how to handle inconsistent
+trial results. In practice, there tend to be a three-way trade-off
+between i) short overall search time, ii) result stability
+and iii) how simple the definition of the returned conditional throughput can be.
+The third one is important for comparability between different MLRsearch
+implementations.
+
+# IANA Considerations
+
+No requests of IANA.
+
+# Security Considerations
+
+Benchmarking activities as described in this memo are limited to
+technology characterization of a DUT/SUT using controlled stimuli in a
+laboratory environment, with dedicated address space and the constraints
+specified in the sections above.
+
+The benchmarking network topology will be an independent test setup and
+MUST NOT be connected to devices that may forward the test traffic into
+a production network or misroute traffic to the test management network.
+
+Further, benchmarking is performed on a "black-box" basis, relying
+solely on measurements observable external to the DUT/SUT.
+
+Special capabilities SHOULD NOT exist in the DUT/SUT specifically for
+benchmarking purposes. Any implications for network security arising
+from the DUT/SUT SHOULD be identical in the lab and in production
+networks.
+
+# Acknowledgements
+
+Many thanks to Alec Hothan of OPNFV NFVbench project for thorough
+review and numerous useful comments and suggestions.
+
+--- back
diff --git a/docs/ietf/process.txt b/docs/ietf/process.txt
index e170352cb9..261756fc8a 100644
--- a/docs/ietf/process.txt
+++ b/docs/ietf/process.txt
@@ -19,4 +19,6 @@ $ kdrfc --version
 $ sudo gem install kramdown-rfc2629
 
 Main:
-$ kdrfc draft-ietf-bmwg-mlrsearch-02.md
+$ kdrfc draft-ietf-bmwg-mlrsearch-03.md
+
+If that complains, do it manually at https://author-tools.ietf.org/
+\ No newline at end of file