aboutsummaryrefslogtreecommitdiffstats
path: root/docs/content/methodology/data_plane_throughput
diff options
context:
space:
mode:
Diffstat (limited to 'docs/content/methodology/data_plane_throughput')
-rw-r--r--docs/content/methodology/data_plane_throughput/_index.md6
-rw-r--r--docs/content/methodology/data_plane_throughput/data_plane_throughput.md129
-rw-r--r--docs/content/methodology/data_plane_throughput/mlrsearch.md88
-rw-r--r--docs/content/methodology/data_plane_throughput/mrr_throughput.md56
-rw-r--r--docs/content/methodology/data_plane_throughput/plrsearch.md383
5 files changed, 0 insertions, 662 deletions
diff --git a/docs/content/methodology/data_plane_throughput/_index.md b/docs/content/methodology/data_plane_throughput/_index.md
deleted file mode 100644
index 5791438b3b..0000000000
--- a/docs/content/methodology/data_plane_throughput/_index.md
+++ /dev/null
@@ -1,6 +0,0 @@
----
-bookCollapseSection: true
-bookFlatSection: false
-title: "Data Plane Throughput"
-weight: 4
---- \ No newline at end of file
diff --git a/docs/content/methodology/data_plane_throughput/data_plane_throughput.md b/docs/content/methodology/data_plane_throughput/data_plane_throughput.md
deleted file mode 100644
index 7ff1d38d17..0000000000
--- a/docs/content/methodology/data_plane_throughput/data_plane_throughput.md
+++ /dev/null
@@ -1,129 +0,0 @@
----
-title: "Data Plane Throughput"
-weight: 1
----
-
-# Data Plane Throughput
-
-Network data plane throughput is measured using multiple test methods in
-order to obtain representative and repeatable results across the large
-set of performance test cases implemented and executed within CSIT.
-
-Following throughput test methods are used:
-
-- MLRsearch - Multiple Loss Ratio search
-- MRR - Maximum Receive Rate
-- PLRsearch - Probabilistic Loss Ratio search
-
-Description of each test method is followed by generic test properties
-shared by all methods.
-
-## MLRsearch Tests
-
-### Description
-
-Multiple Loss Ratio search (MLRsearch) tests discover multiple packet
-throughput rates in a single search, reducing the overall test execution
-time compared to a binary search. Each rate is associated with a
-distinct Packet Loss Ratio (PLR) criteria. In FD.io CSIT two throughput
-rates are discovered: Non-Drop Rate (NDR, with zero packet loss, PLR=0)
-and Partial Drop Rate (PDR, with PLR<0.5%). MLRsearch is compliant with
-RFC2544.
-
-### Usage
-
-MLRsearch tests are run to discover NDR and PDR rates for each VPP and
-DPDK release covered by CSIT report. Results for small frame sizes
-(64b/78B, IMIX) are presented in packet throughput graphs
-(Box-and-Whisker Plots) with NDR and PDR rates plotted against the test
-cases covering popular VPP packet paths.
-
-Each test is executed at least 10 times to verify measurements
-repeatability and results are compared between releases and test
-environments. NDR and PDR packet and bandwidth throughput results for
-all frame sizes and for all tests are presented in detailed results
-tables.
-
-### Details
-
-See [MLRSearch]({{< ref "mlrsearch/#MLRsearch" >}}) section for more detail.
-MLRsearch is being standardized in IETF in
-[draft-ietf-bmwg-mlrsearch](https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-mlrsearch-01).
-
-## MRR Tests
-
-### Description
-
-Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
-as they provide a maximum “raw” throughput benchmark for development and
-testing community.
-
-MRR tests measure the packet forwarding rate under the maximum load
-offered by traffic generator (dependent on link type and NIC model) over
-a set trial duration, regardless of packet loss. Maximum load for
-specified Ethernet frame size is set to the bi-directional link rate.
-
-### Usage
-
-MRR tests are much faster than MLRsearch as they rely on a single trial
-or a small set of trials with very short duration. It is this property
-that makes them suitable for continuous execution in daily performance
-trending jobs enabling detection of performance anomalies (regressions,
-progressions) resulting from data plane code changes.
-
-MRR tests are also used for VPP per patch performance jobs verifying
-patch performance vs parent. CSIT reports include MRR throughput
-comparisons between releases and test environments. Small frame sizes
-only (64b/78B, IMIX).
-
-### Details
-
-See [MRR Throughput]({{< ref "mrr_throughput/#MRR Throughput" >}})
-section for more detail about MRR tests configuration.
-
-FD.io CSIT performance dashboard includes complete description of
-[daily performance trending tests](https://s3-docs.fd.io/csit/master/trending/methodology/performance_tests.html)
-and [VPP per patch tests](https://s3-docs.fd.io/csit/master/trending/methodology/perpatch_performance_tests.html).
-
-## PLRsearch Tests
-
-### Description
-
-Probabilistic Loss Ratio search (PLRsearch) tests discovers a packet
-throughput rate associated with configured Packet Loss Ratio (PLR)
-criteria for tests run over an extended period of time a.k.a. soak
-testing. PLRsearch assumes that system under test is probabilistic in
-nature, and not deterministic.
-
-### Usage
-
-PLRsearch are run to discover a sustained throughput for PLR=10^-7
-(close to NDR) for VPP release covered by CSIT report. Results for small
-frame sizes (64b/78B) are presented in packet throughput graphs (Box
-Plots) for a small subset of baseline tests.
-
-Each soak test lasts 30 minutes and is executed at least twice. Results are
-compared against NDR and PDR rates discovered with MLRsearch.
-
-### Details
-
-See [PLRSearch]({{< ref "plrsearch/#PLRsearch" >}}) methodology section for
-more detail. PLRsearch is being standardized in IETF in
-[draft-vpolak-bmwg-plrsearch](https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch).
-
-## Generic Test Properties
-
-All data plane throughput test methodologies share following generic
-properties:
-
-- Tested L2 frame sizes (untagged Ethernet):
-
- - IPv4 payload: 64B, IMIX (28x64B, 16x570B, 4x1518B), 1518B, 9000B.
- - IPv6 payload: 78B, IMIX (28x78B, 16x570B, 4x1518B), 1518B, 9000B.
- - All quoted sizes include frame CRC, but exclude per frame
- transmission overhead of 20B (preamble, inter frame gap).
-
-- Offered packet load is always bi-directional and symmetric.
-- All measured and reported packet and bandwidth rates are aggregate
- bi-directional rates reported from external Traffic Generator
- perspective. \ No newline at end of file
diff --git a/docs/content/methodology/data_plane_throughput/mlrsearch.md b/docs/content/methodology/data_plane_throughput/mlrsearch.md
deleted file mode 100644
index 73039c9b02..0000000000
--- a/docs/content/methodology/data_plane_throughput/mlrsearch.md
+++ /dev/null
@@ -1,88 +0,0 @@
----
-title: "MLRsearch"
-weight: 2
----
-
-# MLRsearch
-
-## Overview
-
-Multiple Loss Ratio search (MLRsearch) tests use an optimized search algorithm
-implemented in FD.io CSIT project. MLRsearch discovers any number of
-loss ratio loads in a single search.
-
-Two loss ratio goals are of interest in FD.io CSIT, leading to Non-Drop Rate
-(NDR, loss ratio goal is exact zero) and Partial Drop Rate
-(PDR, non-zero loss ratio goal, currently 0.5%).
-
-MLRsearch discovers all the loads in a single pass, reducing required time
-duration compared to separate `binary search`es[^1] for each rate. Overall
-search time is reduced even further by relying on shorter trial
-durations of intermediate steps, with only the final measurements
-conducted at the specified final trial duration. This results in the
-shorter overall execution time when compared to standard NDR/PDR binary
-search, while guaranteeing similar results.
-
-.. Note:: All throughput rates are *always* bi-directional
- aggregates of two equal (symmetric) uni-directional packet rates
- received and reported by an external traffic generator,
- unless the test specifically requires unidirectional traffic.
-
-## Search Implementation
-
-Detailed description of the MLRsearch algorithm is included in the IETF
-draft
-[draft-ietf-bmwg-mlrsearch-02](https://datatracker.ietf.org/doc/html/draft-ietf-bmwg-mlrsearch-02)
-that is in the process of being standardized in the IETF Benchmarking
-Methodology Working Group (BMWG).
-(Newer version is published in IETF, describing improvements not yet used
-in CSIT production.)
-
-MLRsearch is also available as a
-[PyPI (Python Package Index) library](https://pypi.org/project/MLRsearch/).
-
-## Algorithm highlights
-
-MRR and receive rate at MRR load are used as initial guesses for the search.
-
-All previously measured trials (except the very first one which can act
-as a warm-up) are taken into consideration, unless superseded
-by a trial at the same load but higher duration.
-
-For every loss ratio goal, tightest upper and lower bound
-(from results of large enough trial duration) form an interval.
-Exit condition is given by that interval reaching low enough relative width.
-Small enough width is achieved by bisecting the current interval.
-The bisection can be uneven, to save measurements based on information theory.
-
-Switching to higher trial duration generally requires a re-measure
-at a load from previous trial duration.
-When the re-measurement does not confirm previous bound classification
-(e.g. tightest lower bound at shorter trial duration becomes
-a newest tightest upper bound upon re-measurement),
-external search is used to find close enough bound of the lost type.
-External search is a generalization of the first stage of
-`exponential search`[^2].
-
-Shorter trial durations use double width goal,
-because one bisection is always safe before risking external search.
-
-Within an iteration for a specific trial duration, smaller loss ratios (NDR)
-are narrowed down first before search continues with higher loss ratios (PDR).
-
-Other heuristics are there, aimed to prevent unneccessarily narrow intervals,
-and to handle corner cases around min and max load.
-
-## Deviations from RFC 2544
-
-CSIT does not have any explicit wait times before and after trial traffic.
-
-Small differences between intended and offered load are tolerated,
-mainly due to various time overheads preventing precise measurement
-of the traffic duration (and TRex can sometimes suffer from duration
-stretching).
-
-The final trial duration is only 30s (10s for reconf tests).
-
-[^1]: [binary search](https://en.wikipedia.org/wiki/Binary_search)
-[^2]: [exponential search](https://en.wikipedia.org/wiki/Exponential_search)
diff --git a/docs/content/methodology/data_plane_throughput/mrr_throughput.md b/docs/content/methodology/data_plane_throughput/mrr_throughput.md
deleted file mode 100644
index 076946fb66..0000000000
--- a/docs/content/methodology/data_plane_throughput/mrr_throughput.md
+++ /dev/null
@@ -1,56 +0,0 @@
----
-title: "MRR Throughput"
-weight: 4
----
-
-# MRR Throughput
-
-Maximum Receive Rate (MRR) tests are complementary to MLRsearch tests,
-as they provide a maximum "raw" throughput benchmark for development and
-testing community. MRR tests measure the packet forwarding rate under
-the maximum load offered by traffic generator over a set trial duration,
-regardless of packet loss.
-
-MRR tests are currently used for following test jobs:
-
-- Report performance comparison: 64B, IMIX for vhost, memif.
-- Daily performance trending: 64B, IMIX for vhost, memif.
-- Per-patch performance verification: 64B.
-- Initial iterations of MLRsearch and PLRsearch: 64B.
-
-Maximum offered load for specific L2 Ethernet frame size is set to
-either the maximum bi-directional link rate or tested NIC model
-capacity, as follows:
-
-- For 10GE NICs the maximum packet rate load is 2x14.88 Mpps for 64B, a
- 10GE bi-directional link rate.
-- For 25GE NICs the maximum packet rate load is 2x18.75 Mpps for 64B, a
- 25GE bi-directional link sub-rate limited by 25GE NIC used on TRex TG,
- XXV710.
-- For 40GE NICs the maximum packet rate load is 2x18.75 Mpps for 64B, a
- 40GE bi-directional link sub-rate limited by 40GE NIC used on TRex
- TG,XL710. Packet rate for other tested frame sizes is limited by
- PCIeGen3 x8 bandwidth limitation of ~50Gbps.
-
-MRR test code implements multiple bursts of offered packet load and has
-two configurable burst parameters: individual trial duration and number
-of trials in a single burst. This enables more precise performance
-trending by providing more results data for analysis.
-
-Burst parameter settings vary between different tests using MRR:
-
-- MRR individual trial duration:
-
- - Report performance comparison: 1 sec.
- - Daily performance trending: 1 sec.
- - Per-patch performance verification: 10 sec.
- - Initial iteration for MLRsearch: 1 sec.
- - Initial iteration for PLRsearch: 5.2 sec.
-
-- Number of MRR trials per burst:
-
- - Report performance comparison: 10.
- - Daily performance trending: 10.
- - Per-patch performance verification: 5.
- - Initial iteration for MLRsearch: 1.
- - Initial iteration for PLRsearch: 1. \ No newline at end of file
diff --git a/docs/content/methodology/data_plane_throughput/plrsearch.md b/docs/content/methodology/data_plane_throughput/plrsearch.md
deleted file mode 100644
index 1facccc63b..0000000000
--- a/docs/content/methodology/data_plane_throughput/plrsearch.md
+++ /dev/null
@@ -1,383 +0,0 @@
----
-title: "PLRsearch"
-weight: 3
----
-
-# PLRsearch
-
-## Motivation for PLRsearch
-
-Network providers are interested in throughput a system can sustain.
-
-`RFC 2544`[^3] assumes loss ratio is given by a deterministic function of
-offered load. But NFV software systems are not deterministic enough.
-This makes deterministic algorithms (such as `binary search`[^9] per RFC 2544
-and MLRsearch with single trial) to return results,
-which when repeated show relatively high standard deviation,
-thus making it harder to tell what "the throughput" actually is.
-
-We need another algorithm, which takes this indeterminism into account.
-
-## Generic Algorithm
-
-Detailed description of the PLRsearch algorithm is included in the IETF
-draft `draft-vpolak-bmwg-plrsearch-02`[^1] that is in the process
-of being standardized in the IETF Benchmarking Methodology Working Group (BMWG).
-
-### Terms
-
-The rest of this page assumes the reader is familiar with the following terms
-defined in the IETF draft:
-
-+ Trial Order Independent System
-+ Duration Independent System
-+ Target Loss Ratio
-+ Critical Load
-+ Offered Load regions
-
- + Zero Loss Region
- + Non-Deterministic Region
- + Guaranteed Loss Region
-
-+ Fitting Function
-
- + Stretch Function
- + Erf Function
-
-+ Bayesian Inference
-
- + Prior distribution
- + Posterior Distribution
-
-+ Numeric Integration
-
- + Monte Carlo
- + Importance Sampling
-
-## FD.io CSIT Implementation Specifics
-
-The search receives min_rate and max_rate values, to avoid measurements
-at offered loads not supporeted by the traffic generator.
-
-The implemented tests cases use bidirectional traffic.
-The algorithm stores each rate as bidirectional rate (internally,
-the algorithm is agnostic to flows and directions,
-it only cares about aggregate counts of packets sent and packets lost),
-but debug output from traffic generator lists unidirectional values.
-
-### Measurement Delay
-
-In a sample implemenation in FD.io CSIT project, there is roughly 0.5
-second delay between trials due to restrictons imposed by packet traffic
-generator in use (T-Rex).
-
-As measurements results come in, posterior distribution computation takes
-more time (per sample), although there is a considerable constant part
-(mostly for inverting the fitting functions).
-
-Also, the integrator needs a fair amount of samples to reach the region
-the posterior distribution is concentrated at.
-
-And of course, the speed of the integrator depends on computing power
-of the CPU the algorithm is able to use.
-
-All those timing related effects are addressed by arithmetically increasing
-trial durations with configurable coefficients
-(currently 5.1 seconds for the first trial,
-each subsequent trial being 0.1 second longer).
-
-### Rounding Errors and Underflows
-
-In order to avoid them, the current implementation tracks natural logarithm
-(instead of the original quantity) for any quantity which is never negative.
-Logarithm of zero is minus infinity (not supported by Python),
-so special value "None" is used instead.
-Specific functions for frequent operations (such as "logarithm
-of sum of exponentials") are defined to handle None correctly.
-
-### Fitting Functions
-
-Current implementation uses two fitting functions, called "stretch" and "erf".
-In general, their estimates for critical rate differ,
-which adds a simple source of systematic error,
-on top of randomness error reported by integrator.
-Otherwise the reported stdev of critical rate estimate
-is unrealistically low.
-
-Both functions are not only increasing, but also convex
-(meaning the rate of increase is also increasing).
-
-Both fitting functions have several mathematically equivalent formulas,
-each can lead to an arithmetic overflow or underflow in different sub-terms.
-Overflows can be eliminated by using different exact formulas
-for different argument ranges.
-Underflows can be avoided by using approximate formulas
-in affected argument ranges, such ranges have their own formulas to compute.
-At the end, both fitting function implementations
-contain multiple "if" branches, discontinuities are a possibility
-at range boundaries.
-
-### Prior Distributions
-
-The numeric integrator expects all the parameters to be distributed
-(independently and) uniformly on an interval (-1, 1).
-
-As both "mrr" and "spread" parameters are positive and not dimensionless,
-a transformation is needed. Dimentionality is inherited from max_rate value.
-
-The "mrr" parameter follows a `Lomax distribution`[^4]
-with alpha equal to one, but shifted so that mrr is always greater than 1
-packet per second.
-
-The "stretch" parameter is generated simply as the "mrr" value
-raised to a random power between zero and one;
-thus it follows a `reciprocal distribution`[^5].
-
-### Integrator
-
-After few measurements, the posterior distribution of fitting function
-arguments gets quite concentrated into a small area.
-The integrator is using `Monte Carlo`[^6] with `importance sampling`[^7]
-where the biased distribution is `bivariate Gaussian`[^8] distribution,
-with deliberately larger variance.
-If the generated sample falls outside (-1, 1) interval,
-another sample is generated.
-
-The center and the covariance matrix for the biased distribution
-is based on the first and second moments of samples seen so far
-(within the computation). The center is used directly,
-covariance matrix is scaled up by a heurictic constant (8.0 by default).
-The following additional features are applied
-designed to avoid hyper-focused distributions.
-
-Each computation starts with the biased distribution inherited
-from the previous computation (zero point and unit covariance matrix
-is used in the first computation), but the overal weight of the data
-is set to the weight of the first sample of the computation.
-Also, the center is set to the first sample point.
-When additional samples come, their weight (including the importance correction)
-is compared to sum of the weights of data seen so far (within the iteration).
-If the new sample is more than one e-fold more impactful, both weight values
-(for data so far and for the new sample) are set to (geometric) average
-of the two weights.
-
-This combination showed the best behavior, as the integrator usually follows
-two phases. First phase (where inherited biased distribution
-or single big sample are dominating) is mainly important
-for locating the new area the posterior distribution is concentrated at.
-The second phase (dominated by whole sample population)
-is actually relevant for the critical rate estimation.
-
-### Offered Load Selection
-
-First two measurements are hardcoded to happen at the middle of rate interval
-and at max_rate. Next two measurements follow MRR-like logic,
-offered load is decreased so that it would reach target loss ratio
-if offered load decrease lead to equal decrease of loss rate.
-
-The rest of measurements start directly in between
-erf and stretch estimate average.
-There is one workaround implemented, aimed at reducing the number of consequent
-zero loss measurements (per fitting function). The workaround first stores
-every measurement result which loss ratio was the targed loss ratio or higher.
-Sorted list (called lossy loads) of such results is maintained.
-
-When a sequence of one or more zero loss measurement results is encountered,
-a smallest of lossy loads is drained from the list.
-If the estimate average is smaller than the drained value,
-a weighted average of this estimate and the drained value is used
-as the next offered load. The weight of the estimate decreases exponentially
-with the length of consecutive zero loss results.
-
-This behavior helps the algorithm with convergence speed,
-as it does not need so many zero loss result to get near critical region.
-Using the smallest (not drained yet) of lossy loads makes it sure
-the new offered load is unlikely to result in big loss region.
-Draining even if the estimate is large enough helps to discard
-early measurements when loss hapened at too low offered load.
-Current implementation adds 4 copies of lossy loads and drains 3 of them,
-which leads to fairly stable behavior even for somewhat inconsistent SUTs.
-
-### Caveats
-
-As high loss count measurements add many bits of information,
-they need a large amount of small loss count measurements to balance them,
-making the algorithm converge quite slowly. Typically, this happens
-when few initial measurements suggest spread way bigger then later measurements.
-The workaround in offered load selection helps,
-but more intelligent workarounds could get faster convergence still.
-
-Some systems evidently do not follow the assumption of repeated measurements
-having the same average loss rate (when the offered load is the same).
-The idea of estimating the trend is not implemented at all,
-as the observed trends have varied characteristics.
-
-Probably, using a more realistic fitting functions
-will give better estimates than trend analysis.
-
-## Bottom Line
-
-The notion of Throughput is easy to grasp, but it is harder to measure
-with any accuracy for non-deterministic systems.
-
-Even though the notion of critical rate is harder to grasp than the notion
-of throughput, it is easier to measure using probabilistic methods.
-
-In testing, the difference between througput measurements and critical
-rate measurements is usually small.
-
-In pactice, rules of thumb such as "send at max 95% of purported throughput"
-are common. The correct benchmarking analysis should ask "Which notion is
-95% of throughput an approximation to?" before attempting to answer
-"Is 95% of critical rate safe enough?".
-
-## Algorithmic Analysis
-
-### Motivation
-
-While the estimation computation is based on hard probability science;
-the offered load selection part of PLRsearch logic is pure heuristics,
-motivated by what would a human do based on measurement and computation results.
-
-The quality of any heuristic is not affected by soundness of its motivation,
-just by its ability to achieve the intended goals.
-In case of offered load selection, the goal is to help the search to converge
-to the long duration estimates sooner.
-
-But even those long duration estimates could still be of poor quality.
-Even though the estimate computation is Bayesian (so it is the best it could be
-within the applied assumptions), it can still of poor quality when compared
-to what a human would estimate.
-
-One possible source of poor quality is the randomnes inherently present
-in Monte Carlo numeric integration, but that can be supressed
-by tweaking the time related input parameters.
-
-The most likely source of poor quality then are the assumptions.
-Most importantly, the number and the shape of fitting functions;
-but also others, such as trial order independence and duration independence.
-
-The result can have poor quality in basically two ways.
-One way is related to location. Both upper and lower bounds
-can be overestimates or underestimates, meaning the entire estimated interval
-between lower bound and upper bound lays above or below (respectively)
-of human-estimated interval.
-The other way is related to the estimation interval width.
-The interval can be too wide or too narrow, compared to human estimation.
-
-An estimate from a particular fitting function can be classified
-as an overestimate (or underestimate) just by looking at time evolution
-(without human examining measurement results). Overestimates
-decrease by time, underestimates increase by time (assuming
-the system performance stays constant).
-
-Quality of the width of the estimation interval needs human evaluation,
-and is unrelated to both rate of narrowing (both good and bad estimate intervals
-get narrower at approximately the same relative rate) and relatative width
-(depends heavily on the system being tested).
-
-### Graphical Examples
-
-The following pictures show the upper (red) and lower (blue) bound,
-as well as average of Stretch (pink) and Erf (light green) estimate,
-and offered load chosen (grey), as computed by PLRsearch,
-after each trial measurement within the 30 minute duration of a test run.
-
-Both graphs are focusing on later estimates. Estimates computed from
-few initial measurements are wildly off the y-axis range shown.
-
-The following analysis will rely on frequency of zero loss measurements
-and magnitude of loss ratio if nonzero.
-
-The offered load selection strategy used implies zero loss measurements
-can be gleaned from the graph by looking at offered load points.
-When the points move up farther from lower estimate, it means
-the previous measurement had zero loss. After non-zero loss,
-the offered load starts again right between (the previous values of)
-the estimate curves.
-
-The very big loss ratio results are visible as noticeable jumps
-of both estimates downwards. Medium and small loss ratios are much harder
-to distinguish just by looking at the estimate curves,
-the analysis is based on raw loss ratio measurement results.
-
-The following descriptions should explain why the graphs seem to signal
-low quality estimate at first sight, but a more detailed look
-reveals the quality is good (considering the measurement results).
-
-#### L2 patch
-
-Both fitting functions give similar estimates, the graph shows
-"stochasticity" of measurements (estimates increase and decrease
-within small time regions), and an overall trend of decreasing estimates.
-
-On the first look, the final interval looks fairly narrow,
-especially compared to the region the estimates have travelled
-during the search. But the look at the frequency of zero loss results shows
-this is not a case of overestimation. Measurements at around the same
-offered load have higher probability of zero loss earlier
-(when performed farther from upper bound), but smaller probability later
-(when performed closer to upper bound). That means it is the performance
-of the system under test that decreases (slightly) over time.
-
-With that in mind, the apparent narrowness of the interval
-is not a sign of low quality, just a consequence of PLRsearch assuming
-the performance stays constant.
-
-{{< figure src="/cdocs/PLR_patch.svg" >}}
-
-#### Vhost
-
-This test case shows what looks like a quite broad estimation interval,
-compared to other test cases with similarly looking zero loss frequencies.
-Notable features are infrequent high-loss measurement results
-causing big drops of estimates, and lack of long-term convergence.
-
-Any convergence in medium-sized intervals (during zero loss results)
-is reverted by the big loss results, as they happen quite far
-from the critical load estimates, and the two fitting functions
-extrapolate differently.
-
-In other words, human only seeing estimates from one fitting function
-would expect narrower end interval, but human seeing the measured loss ratios
-agrees that the interval should be wider than that.
-
-{{< figure src="/cdocs/PLR_vhost.svg" >}}
-
-#### Summary
-
-The two graphs show the behavior of PLRsearch algorithm applied to soaking test
-when some of PLRsearch assumptions do not hold:
-
-+ L2 patch measurement results violate the assumption
- of performance not changing over time.
-+ Vhost measurement results violate the assumption
- of Poisson distribution matching the loss counts.
-
-The reported upper and lower bounds can have distance larger or smaller
-than a first look by a human would expect, but a more closer look reveals
-the quality is good, considering the circumstances.
-
-The usefullness of the critical load estimate is of questionable value
-when the assumptions are violated.
-
-Some improvements can be made via more specific workarounds,
-for example long term limit of L2 patch performance could be estmated
-by some heuristic.
-
-Other improvements can be achieved only by asking users
-whether loss patterns matter. Is it better to have single digit losses
-distributed fairly evenly over time (as Poisson distribution would suggest),
-or is it better to have short periods of medium losses
-mixed with long periods of zero losses (as happens in Vhost test)
-with the same overall loss ratio?
-
-[^1]: [draft-vpolak-bmwg-plrsearch-02](https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch-02)
-[^2]: [plrsearch draft](https://tools.ietf.org/html/draft-vpolak-bmwg-plrsearch-00)
-[^3]: [RFC 2544](https://tools.ietf.org/html/rfc2544)
-[^4]: [Lomax distribution](https://en.wikipedia.org/wiki/Lomax_distribution)
-[^5]: [reciprocal distribution](https://en.wikipedia.org/wiki/Reciprocal_distribution)
-[^6]: [Monte Carlo](https://en.wikipedia.org/wiki/Monte_Carlo_integration)
-[^7]: [importance sampling](https://en.wikipedia.org/wiki/Importance_sampling)
-[^8]: [bivariate Gaussian](https://en.wikipedia.org/wiki/Multivariate_normal_distribution)
-[^9]: [binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm) \ No newline at end of file