author     Maciek Konstantynowicz <mkonstan@cisco.com>    2018-04-28 12:50:59 +0100
committer  Maciek Konstantynowicz <mkonstan@cisco.com>    2018-04-28 14:11:43 +0100
commit     1a72adeb35bfd540f882a107ed1007e4a8545dec (patch)
tree       d646c3b43f8d331cd72009633d30b65946f5dee6 /docs/cpta
parent     9eb9f9300831fcdd56dbefac7aaec5659cdfc31c (diff)
More edits in trending docs: methodology, dashboard.
Change-Id: I008fd39c57810dcf4cb84e5a9fc4f9adf6923a4f
Signed-off-by: Maciek Konstantynowicz <mkonstan@cisco.com>
Diffstat (limited to 'docs/cpta')
-rw-r--r--  docs/cpta/introduction/index.rst  37
-rw-r--r--  docs/cpta/methodology/index.rst   51

2 files changed, 46 insertions, 42 deletions
diff --git a/docs/cpta/introduction/index.rst b/docs/cpta/introduction/index.rst
index 8b3c17029d..c724c30bad 100644
--- a/docs/cpta/introduction/index.rst
+++ b/docs/cpta/introduction/index.rst
@@ -4,18 +4,28 @@ VPP Performance Dashboard
 Description
 -----------
 
-Dashboard tables list a summary of latest per test-case VPP Maximum
-Receive Rate (MRR) performance trend, trend compliance metrics and
-detected number of anomalies. Data samples come from the CSIT VPP
-performance trending jobs executed twice a day, every 12 hrs (02:00,
-14:00 UTC). All trend and anomaly evaluation is based on a rolling
-window of <N=14> data samples, covering last 7 days.
+Performance dashboard tables provide the latest VPP throughput trend,
+trend compliance and detected anomalies, all on a per VPP test case
+basis. Linked trendline graphs enable further drill-down into the
+trendline compliance, sequence and nature of anomalies, as well as
+pointers to performance test builds/logs and VPP builds.
+
+Performance trending is currently based on the Maximum Receive Rate
+(MRR) tests. MRR tests measure the packet forwarding rate under the
+maximum load offered by traffic generator over a set trial duration,
+regardless of packet loss. See :ref:`trending_methodology` section for
+more detail including trend and anomaly calculations.
+
+Data samples are generated by the CSIT VPP performance trending jobs
+executed twice a day (target start: every 12 hrs, 02:00, 14:00 UTC). All
+trend and anomaly evaluation is based on a rolling window of <N=14> data
+samples, covering last 7 days.
 
 Legend to table:
 
-  - **Test Case** : name of CSIT test case, naming convention in
-    `CSIT wiki <https://wiki.fd.io/view/CSIT/csit-test-naming>`_.
-  - **Trend [Mpps]** : last value of trend.
+  - **Test Case** : name of FD.io CSIT test case, naming convention
+    `here <https://wiki.fd.io/view/CSIT/csit-test-naming>`_.
+  - **Trend [Mpps]** : last value of performance trend.
   - **Short-Term Change [%]** : Relative change of last trend value vs.
     last week trend value.
   - **Long-Term Change [%]** : Relative change of last trend value vs.
@@ -24,17 +34,10 @@ Legend to table:
   - **Progressions [#]** : Number of progressions detected.
   - **Outliers [#]** : Number of outliers detected.
 
-MRR tests measure the packet forwarding rate under the maximum load
-offered by traffic generator over a set trial duration, regardless of
-packet loss.
-
-For more detail about MRR tests, trend and anomaly calculations please
-refer to :ref:`trending_methodology` section.
-
 Tested VPP worker-thread-core combinations (1t1c, 2t2c, 4t4c) are listed
 in separate tables in section 1.x. Followed by trending methodology in
 section 2. and daily trending graphs in sections 3.x. Daily trending
-data used is provided in sections 4.x.
+data used for graphs is provided in sections 4.x.
 
 VPP worker on 1t1c
 ------------------
diff --git a/docs/cpta/methodology/index.rst b/docs/cpta/methodology/index.rst
index 29dcae2e7f..5efdfaae32 100644
--- a/docs/cpta/methodology/index.rst
+++ b/docs/cpta/methodology/index.rst
@@ -1,10 +1,10 @@
-Performance Trending Methodology
-================================
-
 .. _trending_methodology:
 
-Continuous Trending and Analysis
---------------------------------
+Trending Methodology
+====================
+
+Overview
+--------
 
 This document describes a high-level design of a system for continuous
 performance measuring, trending and change detection for FD.io VPP SW
@@ -22,8 +22,8 @@ trending dashboard and graphs with summary and drill-down views across
 all specified tests that can be reviewed and inspected regularly by
 FD.io developers and users community.
 
-Performance Trending Tests
---------------------------
+Performance Tests
+-----------------
 
 Performance trending is currently relying on the Maximum Receive Rate
 (MRR) tests. MRR tests measure the packet forwarding rate under the
@@ -51,13 +51,14 @@ Current parameters for performance trending MRR tests:
 - Trial duration: 10sec.
 - Execution frequency: twice a day, every 12 hrs (02:00, 14:00 UTC).
 
-In the future if tested VPP configuration can handle the packet rate
-higher than bi-directional 10GE link rate, e.g. all IMIX tests and
-64B/78B multi-core tests, a higher maximum load will be offered
-(25GE|40GE|100GE).
+Note: MRR tests should be reporting bi-directional link rate (or NIC
+rate, if lower) if tested VPP configuration can handle the packet rate
+higher than bi-directional link rate, e.g. large packet tests and/or
+multi-core tests. In other words MRR = min(VPP rate, bi-dir link rate,
+NIC rate).
 
-Performance Trend Analysis
---------------------------
+Trend Analysis
+--------------
 
 All measured performance trend data is treated as time-series data that
 can be modelled using normal distribution. After trimming the outliers,
@@ -65,12 +66,11 @@ the median and deviations from median are used for detecting performance
 change anomalies following the three-sigma rule of thumb
 (a.k.a. 68-95-99.7 rule).
 
-Analysis Metrics
-````````````````
+Metrics
+```````
 
-Following statistical metrics are proposed as performance trend
-indicators over the rolling window of last <N> sets of historical
-measurement data:
+Following statistical metrics are used as performance trend indicators
+over the rolling window of last <N> sets of historical measurement data:
 
 - Q1, Q2, Q3 : Quartiles, three points dividing a ranked data set of
   <N> values into four equal parts, Q2 is the median of the data.
@@ -135,8 +135,8 @@ respectively. This results in following trend compliance calculations:
     Short-Term Change  ((V - R) / R)  TMM[last]  TMM[last - 1week]
     Long-Term Change   ((V - R) / R)  TMM[last]  max(TMM[(last - 3mths)..(last - 1week)])
 
-Performance Trend Presentation
-------------------------------
+Trend Presentation
+------------------
 
 Performance Dashboard
 `````````````````````
@@ -168,8 +168,8 @@ data points, representing (trend job build Id, MRR value) and the actual
 vpp build number (b<XXX>) tested.
 
-Jenkins Jobs Description
-------------------------
+Jenkins Jobs
+------------
 
 Performance Trending (PT)
 `````````````````````````
 
@@ -231,13 +231,14 @@ PA is defined as follows:
 #. Evaluate new test data against trend metrics:
 
    #. If within the range of (TMA +/- 3*TMSD) => Result = Pass,
-      Reason = Normal. (to be updated base on final Jenkins code)
+      Reason = Normal. (to be updated base on the final Jenkins code).
    #. If below the range => Result = Fail, Reason = Regression.
    #. If above the range => Result = Pass, Reason = Progression.
 
 #. Generate and publish results
 
-   #. Relay evaluation result to job result. (to be updated base on final
-      Jenkins code)
+   #. Relay evaluation result to job result. (to be updated base on the
+      final Jenkins code).
    #. Generate a new set of trend summary dashboard and graphs.
-   #. Publish trend dashboard and graphs in html format on https://docs.fd.io/.
+   #. Publish trend dashboard and graphs in html format on
+      https://docs.fd.io/.
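The evaluation flow the methodology text describes (trim outliers from the
rolling window, compute trimmed trend metrics, classify a new sample with the
three-sigma rule, report relative change) can be sketched in a few lines of
Python. This is an illustration only, not the CSIT PAL/Jenkins implementation:
the Tukey fences (k = 1.5) used for trimming, the population standard
deviation, and the function names are all assumptions::

    # Illustrative sketch only -- not the actual CSIT PAL/Jenkins code.
    import statistics

    def trend_metrics(window, k=1.5):
        """Trimmed trend metrics over a rolling window of MRR samples."""
        q1, _q2, q3 = statistics.quantiles(window, n=4)  # quartiles Q1..Q3
        iqr = q3 - q1                                    # inter-quartile range
        # Trim outliers outside the Tukey fences (k=1.5 is an assumption).
        trimmed = [x for x in window if q1 - k * iqr <= x <= q3 + k * iqr]
        tmm = statistics.median(trimmed)    # Trimmed Moving Median
        tma = statistics.mean(trimmed)      # Trimmed Moving Average
        tmsd = statistics.pstdev(trimmed)   # Trimmed Moving Std Deviation
        return tmm, tma, tmsd

    def classify(window, new_sample, n_sigma=3.0):
        """Map a new sample to (Result, Reason) as in the PA steps."""
        _tmm, tma, tmsd = trend_metrics(window)
        if new_sample < tma - n_sigma * tmsd:
            return "Fail", "Regression"
        if new_sample > tma + n_sigma * tmsd:
            return "Pass", "Progression"
        return "Pass", "Normal"

    def relative_change(v, r):
        """Compliance metric ((V - R) / R), e.g. V = TMM[last] and
        R = TMM[last - 1week] for the Short-Term Change column."""
        return (v - r) / r

    # Example: a stable ~9.2 Mpps window (N=14), then two new samples.
    history = [9.2, 9.3, 9.1, 9.25, 9.2, 9.3, 9.15,
               9.2, 9.25, 9.3, 9.2, 9.1, 9.25, 9.2]
    print(classify(history, 9.22))  # ('Pass', 'Normal')
    print(classify(history, 7.40))  # ('Fail', 'Regression')

With a stable window the sketch returns Pass/Normal for an in-range sample and
Fail/Regression for a sample more than three deviations below the trimmed
trend, matching the Result/Reason mapping in the PA steps above.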