CSIT-1110: Integrate anomaly detection into PAL

+ Keep the original detection,
+ add the new one as subdirectory (both in source and in rendered tree).
- The new detection is not rebased over "Add dpdk mrr tests to trending".

New detection features:
+ Do not remove (nor detect) outliers.
+ Trend line shows the constant average within a group.
+ Anomaly circles are placed at the changed average.
+ Small bias against too similar averages.
+ Should be ready for moving the detection library out to pip.

Change-Id: I7ab1a92b79eeeed53ba65a071b1305e927816a89
Signed-off-by: Vratko Polak <vrpolak@cisco.com>
author: Vratko Polak <vrpolak@cisco.com> 2018-06-08 18:07:35 +0200
committer: Tibor Frank <tifrank@cisco.com> 2018-06-11 08:30:21 +0000
commit: beeb2acb9ac153eaa54983bea46a76d596168965 (patch)
tree: 0465617b135a2e64693265969c48ff466db3d287 /resources/tools/presentation/new/doc/pal_lld.rst
parent: 3dcef45002a1b82c4503ec590d680950930fa193 (diff)
1 files changed, 1623 insertions, 0 deletions
diff --git a/resources/tools/presentation/new/doc/pal_lld.rst b/resources/tools/presentation/new/doc/pal_lld.rst
new file mode 100644
index 0000000000..81c2547a82
--- /dev/null
+++ b/resources/tools/presentation/new/doc/pal_lld.rst
@@ -0,0 +1,1623 @@
+Presentation and Analytics Layer
+================================
+
+Overview
+--------
+
+The presentation and analytics layer (PAL) is the fourth layer of the CSIT
+hierarchy. The PAL model consists of four sub-layers, bottom up:
+
+ - sL1 - Data - input data to be processed:
+
+   - Static content - .rst text files, .svg static figures, and other files
+     stored in the CSIT git repository.
+   - Data to process - .xml files generated by Jenkins jobs executing tests,
+     stored as robot results files (output.xml).
+   - Specification - .yaml file with the models of report elements (tables,
+     plots, layout, ...) generated by this tool. There is also the configuration
+     of the tool and the specification of input data (jobs and builds).
+
+ - sL2 - Data processing
+
+   - The data are read from the specified input files (.xml) and stored as
+     multi-indexed `pandas.Series <https://pandas.pydata.org/pandas-docs/stable/
+     generated/pandas.Series.html>`_.
+   - This layer also provides an interface to the input data and filtering of
+     the input data.
+
+ - sL3 - Data presentation - This layer generates the elements specified in the
+   specification file:
+
+   - Tables: .csv files linked to static .rst files.
+   - Plots: .html files generated using plot.ly linked to static .rst files.
+
+ - sL4 - Report generation - Sphinx generates required formats and versions:
+
+   - formats: html, pdf
+   - versions: minimal, full (TODO: define the names and scope of versions)
+
+.. only:: latex
+
+    .. raw:: latex
+
+        \begin{figure}[H]
+        \centering
+            \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_layers}
+            \label{fig:pal_layers}
+        \end{figure}
+
+.. only:: html
+
+    .. figure:: pal_layers.svg
+        :alt: PAL Layers
+        :align: center
+
+Data
+----
+
+Report Specification
+````````````````````
+
+The report specification file defines which data is used and which outputs are
+generated. It is human readable and structured. It is easy to add / remove /
+change items. The specification includes:
+
+ - Specification of the environment.
+ - Configuration of debug mode (optional).
+ - Specification of input data (jobs, builds, files, ...).
+ - Specification of the output.
+ - What and how is generated:
+
+   - What: plots, tables.
+   - How: specification of all properties and parameters.
+
+ - .yaml format.
+
+Structure of the specification file
+'''''''''''''''''''''''''''''''''''
+
+The specification file is organized as a list of dictionaries distinguished by
+the type:
+
+::
+
+    -
+      type: "environment"
+    -
+      type: "configuration"
+    -
+      type: "debug"
+    -
+      type: "static"
+    -
+      type: "input"
+    -
+      type: "output"
+    -
+      type: "table"
+    -
+      type: "plot"
+    -
+      type: "file"
+
+Each type represents a section. The sections "environment", "debug", "static",
+"input" and "output" are listed only once in the specification; "table", "file"
+and "plot" can appear multiple times.
+
+Sections "debug", "table", "file" and "plot" are optional.
+
+Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is
+possible to define and implement other elements if needed.
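+
+The rules above can be checked mechanically once the .yaml file is loaded into
+a list of dictionaries. The following Python 3 sketch is only an illustration:
+the function name "check_sections" is hypothetical and not part of the PAL,
+and the section "configuration" is not checked because the rules above do not
+mention it. The section "input" is treated as mandatory only when no "debug"
+section is present.
+
```python
from collections import Counter

# Sections which may be listed only once, and sections which may be omitted,
# as described in the text. "configuration" is covered by neither rule.
SINGLE = {"environment", "debug", "static", "input", "output"}
OPTIONAL = {"debug", "table", "file", "plot"}


def check_sections(specification):
    """Return a list of problems found in a list of section dictionaries."""
    counts = Counter(section.get("type") for section in specification)
    required = SINGLE - OPTIONAL
    if counts["debug"]:
        # In debug mode the "input" section is ignored, so it may be missing.
        required = required - {"input"}
    problems = []
    for name in sorted(SINGLE):
        if counts[name] > 1:
            problems.append("section '{0}' is listed {1} times".format(
                name, counts[name]))
        if counts[name] == 0 and name in required:
            problems.append("section '{0}' is missing".format(name))
    return problems


specification = [
    {"type": "environment"},
    {"type": "static"},
    {"type": "input"},
    {"type": "output"},
    {"type": "plot"},
    {"type": "plot"},
]
print(check_sections(specification))
```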
+
+
+Section: Environment
+''''''''''''''''''''
+
+This section has the following parts:
+
+ - type: "environment" - says that this is the section "environment".
+ - configuration - configuration of the PAL.
+ - paths - paths used by the PAL.
+ - urls - urls pointing to the data sources.
+ - make-dirs - a list of the directories to be created by the PAL while
+   preparing the environment.
+ - remove-dirs - a list of the directories to be removed while cleaning the
+   environment.
+ - build-dirs - a list of the directories where the results are stored.
+
+The structure of the section "Environment" is as follows (example):
+
+::
+
+    -
+      type: "environment"
+      configuration:
+        # Debug mode:
+        # - Skip:
+        #   - Download of input data files
+        # - Do:
+        #   - Read data from given zip / xml files
+        #   - Set the configuration as it is done in normal mode
+        # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.
+        CFG[DEBUG]: 0
+
+      paths:
+        # Top level directories:
+        ## Working directory
+        DIR[WORKING]: "_tmp"
+        ## Build directories
+        DIR[BUILD,HTML]: "_build"
+        DIR[BUILD,LATEX]: "_build_latex"
+
+        # Static .rst files
+        DIR[RST]: "../../../docs/report"
+
+        # Working directories
+        ## Input data files (.zip, .xml)
+        DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
+        ## Static source files from git
+        DIR[WORKING,SRC]: "{DIR[WORKING]}/src"
+        DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static"
+
+        # Static html content
+        DIR[STATIC]: "{DIR[BUILD,HTML]}/_static"
+        DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp"
+        DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk"
+        DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive"
+
+        # Detailed test results
+        DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results"
+        DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results"
+        DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results"
+        DIR[DTR,PERF,HC]: "{DIR[DTR]}/honeycomb_performance_results"
+        DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results"
+        DIR[DTR,FUNC,HC]: "{DIR[DTR]}/honeycomb_functional_results"
+        DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results"
+        DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements"
+
+        # Detailed test configurations
+        DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration"
+        DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration"
+        DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration"
+
+        # Detailed tests operational data
+        DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data"
+        DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data"
+
+        # .css patch file to fix tables generated by Sphinx
+        DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css"
+        DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css"
+
+      urls:
+        URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job"
+        URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job"
+
+      make-dirs:
+      # List the directories which are created while preparing the environment.
+      # All directories MUST be defined in "paths" section.
+      - "DIR[WORKING,DATA]"
+      - "DIR[STATIC,VPP]"
+      - "DIR[STATIC,DPDK]"
+      - "DIR[STATIC,ARCH]"
+      - "DIR[BUILD,LATEX]"
+      - "DIR[WORKING,SRC]"
+      - "DIR[WORKING,SRC,STATIC]"
+
+      remove-dirs:
+      # List the directories which are deleted while cleaning the environment.
+      # All directories MUST be defined in "paths" section.
+      #- "DIR[BUILD,HTML]"
+
+      build-dirs:
+      # List the directories where the results (build) are stored.
+      # All directories MUST be defined in "paths" section.
+      - "DIR[BUILD,HTML]"
+      - "DIR[BUILD,LATEX]"
+
+It is possible to use defined items in the definition of other items, e.g.:
+
+::
+
+    DIR[WORKING,DATA]: "{DIR[WORKING]}/data"
+
+will be automatically changed to
+
+::
+
+    DIR[WORKING,DATA]: "_tmp/data"
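+
+A minimal Python 3 sketch of such a substitution is shown below. It is an
+illustration only, not the PAL implementation; the function name
+"resolve_paths" is hypothetical. Every "{KEY}" placeholder is replaced by the
+value of the item "KEY", and the pass is repeated so that items which refer
+to other substituted items are expanded as well.
+
```python
import re


def resolve_paths(paths):
    """Expand "{KEY}" placeholders using other entries of the same dict."""
    pattern = re.compile(r"\{([^{}]+)\}")
    resolved = dict(paths)
    # Repeat until nothing changes, so chained references are expanded too.
    for _ in range(len(resolved) + 1):
        changed = False
        for key, value in resolved.items():
            # Unknown keys are left untouched.
            new_value = pattern.sub(
                lambda m: resolved.get(m.group(1), m.group(0)), value)
            if new_value != value:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


paths = {
    "DIR[WORKING]": "_tmp",
    "DIR[WORKING,DATA]": "{DIR[WORKING]}/data",
    "DIR[WORKING,SRC]": "{DIR[WORKING]}/src",
    "DIR[WORKING,SRC,STATIC]": "{DIR[WORKING,SRC]}/_static",
}
resolved = resolve_paths(paths)
print(resolved["DIR[WORKING,DATA]"])
```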
+
+
+Section: Configuration
+''''''''''''''''''''''
+
+This section specifies the groups of parameters which are repeatedly used in the
+elements defined later in the specification file. It has the following parts:
+
+ - data sets - Specification of data sets used later in elements'
+   specifications to define the input data.
+ - plot layouts - Specification of plot layouts used later in plots'
+   specifications to define the plot layout.
+
+The structure of the section "Configuration" is as follows (example):
+
+::
+
+    -
+      type: "configuration"
+      data-sets:
+        plot-vpp-throughput-latency:
+          csit-vpp-perf-1710-all:
+          - 11
+          - 12
+          - 13
+          - 14
+          - 15
+          - 16
+          - 17
+          - 18
+          - 19
+          - 20
+        vpp-perf-results:
+          csit-vpp-perf-1710-all:
+          - 20
+          - 23
+      plot-layouts:
+        plot-throughput:
+          xaxis:
+            autorange: True
+            autotick: False
+            fixedrange: False
+            gridcolor: "rgb(238, 238, 238)"
+            linecolor: "rgb(238, 238, 238)"
+            linewidth: 1
+            showgrid: True
+            showline: True
+            showticklabels: True
+            tickcolor: "rgb(238, 238, 238)"
+            tickmode: "linear"
+            title: "Indexed Test Cases"
+            zeroline: False
+          yaxis:
+            gridcolor: "rgb(238, 238, 238)"
+            hoverformat: ".4s"
+            linecolor: "rgb(238, 238, 238)"
+            linewidth: 1
+            range: []
+            showgrid: True
+            showline: True
+            showticklabels: True
+            tickcolor: "rgb(238, 238, 238)"
+            title: "Packets Per Second [pps]"
+            zeroline: False
+          boxmode: "group"
+          boxgroupgap: 0.5
+          autosize: False
+          margin:
+            t: 50
+            b: 20
+            l: 50
+            r: 20
+          showlegend: True
+          legend:
+            orientation: "h"
+          width: 700
+          height: 1000
+
+The definitions from this section are used in the elements, e.g.:
+
+::
+
+    -
+      type: "plot"
+      title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+      algorithm: "plot_performance_box"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
+      data:
+        "plot-vpp-throughput-latency"
+      filter: "'64B' and ('BASE' or 'SCALE') and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+      parameters:
+      - "throughput"
+      - "parent"
+      traces:
+        hoverinfo: "x+y"
+        boxpoints: "outliers"
+        whiskerwidth: 0
+      layout:
+        title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+        layout:
+          "plot-throughput"
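+
+How a named data set and a named plot layout could be replaced by their
+definitions is sketched below in Python 3. This is an assumption about the
+mechanism, not the PAL code: the function name "resolve_references" is
+hypothetical, and the rule that keys given directly in the element (such as
+"title") are merged over the named layout is an assumption as well.
+
```python
def resolve_references(element, configuration):
    """Replace data-set and plot-layout names by their definitions."""
    resolved = dict(element)

    # "data" given as a string is taken as the name of a data set.
    data = element.get("data")
    if isinstance(data, str):
        resolved["data"] = configuration["data-sets"][data]

    # "layout: layout: <name>" is taken as the name of a plot layout; keys
    # given directly in the element (e.g. "title") are kept (assumption).
    layout = dict(element.get("layout", {}))
    name = layout.pop("layout", None)
    if isinstance(name, str):
        merged = dict(configuration["plot-layouts"][name])
        merged.update(layout)
        resolved["layout"] = merged
    return resolved


configuration = {
    "data-sets": {
        "plot-vpp-throughput-latency": {
            "csit-vpp-perf-1710-all": [11, 12, 13],
        },
    },
    "plot-layouts": {
        "plot-throughput": {"width": 700, "height": 1000},
    },
}
element = {
    "type": "plot",
    "data": "plot-vpp-throughput-latency",
    "layout": {"title": "64B-1t1c-ndrdisc", "layout": "plot-throughput"},
}
plot = resolve_references(element, configuration)
print(plot["data"])
print(plot["layout"])
```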
+
+
+Section: Debug mode
+'''''''''''''''''''
+
+This section is optional. It configures the debug mode, which is used when the
+input data files should not be downloaded and local files are used instead.
+
+If the debug mode is configured, the "input" section is ignored.
+
+This section has the following parts:
+
+ - type: "debug" - says that this is the section "debug".
+ - general:
+
+   - input-format - xml or zip.
+   - extract - if "zip" is defined as the input format, this file is extracted
+     from the zip file, otherwise this parameter is ignored.
+
+ - builds - list of builds from which the data is used. Must include a job
+   name as a key and then a list of builds and their output files.
+
+The structure of the section "Debug" is as follows (example):
+
+::
+
+    -
+      type: "debug"
+      general:
+        input-format: "zip"  # zip or xml
+        extract: "robot-plugin/output.xml"  # Only for zip
+      builds:
+        # The files must be in the directory DIR[WORKING,DATA]
+        csit-dpdk-perf-1707-all:
+        -
+          build: 10
+          file: "csit-dpdk-perf-1707-all__10.xml"
+        -
+          build: 9
+          file: "csit-dpdk-perf-1707-all__9.xml"
+        csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
+        -
+          build: 2
+          file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml"
+        csit-vpp-functional-1707-ubuntu1604-virl:
+        -
+          build: lastSuccessfulBuild
+          file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml"
+        hc2vpp-csit-integration-1707-ubuntu1604:
+        -
+          build: lastSuccessfulBuild
+          file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml"
+        csit-vpp-perf-1707-all:
+        -
+          build: 16
+          file: "csit-vpp-perf-1707-all__16__output.xml"
+        -
+          build: 17
+          file: "csit-vpp-perf-1707-all__17__output.xml"
+
+
+Section: Static
+'''''''''''''''
+
+This section defines the static content which is stored in git and will be used
+as a source to generate the report.
+
+This section has these parts:
+
+ - type: "static" - says that this section is the "static".
+ - src-path - path to the static content.
+ - dst-path - destination path where the static content is copied and then
+   processed.
+
+::
+
+    -
+      type: "static"
+      src-path: "{DIR[RST]}"
+      dst-path: "{DIR[WORKING,SRC]}"
+
+
+Section: Input
+''''''''''''''
+
+This section defines the data used to generate elements. It is mandatory
+if the debug mode is not used.
+
+This section has the following parts:
+
+ - type: "input" - says that this section is the "input".
+ - general - parameters common to all builds:
+
+   - file-name: file to be downloaded.
+   - file-format: format of the downloaded file, ".zip" or ".xml" are supported.
+   - download-path: path to be added to url pointing to the file, e.g.:
+     "{job}/{build}/robot/report/*zip*/{filename}"; {job}, {build} and
+     {filename} are replaced by proper values defined in this section.
+   - extract: file to be extracted from downloaded zip file, e.g.: "output.xml";
+     if xml file is downloaded, this parameter is ignored.
+
+ - builds - list of jobs (keys) and numbers of builds whose output data will be
+   downloaded.
+
+The structure of the section "Input" is as follows (example from 17.07 report):
+
+::
+
+    -
+      type: "input"  # Ignored in debug mode
+      general:
+        file-name: "robot-plugin.zip"
+        file-format: ".zip"
+        download-path: "{job}/{build}/robot/report/*zip*/{filename}"
+        extract: "robot-plugin/output.xml"
+      builds:
+        csit-vpp-perf-1707-all:
+        - 9
+        - 10
+        - 13
+        - 14
+        - 15
+        - 16
+        - 17
+        - 18
+        - 19
+        - 21
+        - 22
+        csit-dpdk-perf-1707-all:
+        - 1
+        - 2
+        - 3
+        - 4
+        - 5
+        - 6
+        - 7
+        - 8
+        - 9
+        - 10
+        csit-vpp-functional-1707-ubuntu1604-virl:
+        - lastSuccessfulBuild
+        hc2vpp-csit-perf-master-ubuntu1604:
+        - 8
+        - 9
+        hc2vpp-csit-integration-1707-ubuntu1604:
+        - lastSuccessfulBuild
+        csit-nsh_sfc-verify-func-1707-ubuntu1604-virl:
+        - 2
+
+
+Section: Output
+'''''''''''''''
+
+This section specifies which format(s) will be generated (html, pdf) and which
+versions will be generated for each format.
+
+This section has the following parts:
+
+ - type: "output" - says that this section is the "output".
+ - format: html or pdf.
+ - version: defined for each format separately.
+
+The structure of the section "Output" is as follows (example):
+
+::
+
+    -
+      type: "output"
+      format:
+        html:
+        - full
+        pdf:
+        - full
+        - minimal
+
+TODO: define the names of versions
+
+
+Content of "minimal" version
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+TODO: define the name and content of this version
+
+
+Section: Table
+''''''''''''''
+
+This section defines a table to be generated. There can be 0 or more "table"
+sections.
+
+This section has the following parts:
+
+ - type: "table" - says that this section defines a table.
+ - title: Title of the table.
+ - algorithm: Algorithm which is used to generate the table. The other
+   parameters in this section must provide all information needed by the used
+   algorithm.
+ - template: (optional) a .csv file used as a template while generating the
+   table.
+ - output-file-ext: extension of the output file.
+ - output-file: file which the table will be written to.
+ - columns: specification of table columns:
+
+   - title: The title used in the table header.
+   - data: Specification of the data, it has two parts - command and arguments:
+
+     - command:
+
+       - template - take the data from template, arguments:
+
+         - number of column in the template.
+
+       - data - take the data from the input data, arguments:
+
+         - jobs and builds whose data will be used.
+
+       - operation - performs an operation with the data already in the table,
+         arguments:
+
+         - operation to be done, e.g.: mean, stdev, relative_change (compute
+           the relative change between two columns) and display number of data
+           samples ~= number of test jobs. The operations are implemented in
+           utils.py.
+           TODO: Move from utils.py to e.g. operations.py
+         - numbers of columns whose data will be used (optional).
+
+ - data: Specify the jobs and builds whose data is used to generate the table.
+ - filter: filter based on tags applied on the input data, if "template" is
+   used, filtering is based on the template.
+ - parameters: Only these parameters will be put to the output data structure.
+
+The structure of the section "Table" is as follows (example of
+"table_performance_improvements"):
+
+::
+
+    -
+      type: "table"
+      title: "Performance improvements"
+      algorithm: "table_performance_improvements"
+      template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv"
+      output-file-ext: ".csv"
+      output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements"
+      columns:
+      -
+        title: "VPP Functionality"
+        data: "template 1"
+      -
+        title: "Test Name"
+        data: "template 2"
+      -
+        title: "VPP-16.09 mean [Mpps]"
+        data: "template 3"
+      -
+        title: "VPP-17.01 mean [Mpps]"
+        data: "template 4"
+      -
+        title: "VPP-17.04 mean [Mpps]"
+        data: "template 5"
+      -
+        title: "VPP-17.07 mean [Mpps]"
+        data: "data csit-vpp-perf-1707-all mean"
+      -
+        title: "VPP-17.07 stdev [Mpps]"
+        data: "data csit-vpp-perf-1707-all stdev"
+      -
+        title: "17.04 to 17.07 change [%]"
+        data: "operation relative_change 5 4"
+      data:
+        csit-vpp-perf-1707-all:
+        - 9
+        - 10
+        - 13
+        - 14
+        - 15
+        - 16
+        - 17
+        - 18
+        - 19
+        - 21
+      filter: "template"
+      parameters:
+      - "throughput"
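+
+Each "data" string of a column consists of a command followed by its
+arguments. A minimal Python 3 sketch of splitting such a string is shown
+below; the function name "parse_column_data" is hypothetical and the sketch
+does not interpret the arguments, it only separates them.
+
```python
COMMANDS = ("template", "data", "operation")


def parse_column_data(spec):
    """Split a column "data" string into its command and its arguments."""
    command, *arguments = spec.split()
    if command not in COMMANDS:
        raise ValueError("Unknown command: {0}".format(command))
    return command, arguments


for spec in ("template 3",
             "data csit-vpp-perf-1707-all mean",
             "operation relative_change 5 4"):
    print(parse_column_data(spec))
```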
+
+Example of "table_details" which generates "Detailed Test Results - VPP
+Performance Results":
+
+::
+
+    -
+      type: "table"
+      title: "Detailed Test Results - VPP Performance Results"
+      algorithm: "table_details"
+      output-file-ext: ".csv"
+      output-file: "{DIR[WORKING]}/vpp_performance_results"
+      columns:
+      -
+        title: "Name"
+        data: "data test_name"
+      -
+        title: "Documentation"
+        data: "data test_documentation"
+      -
+        title: "Status"
+        data: "data test_msg"
+      data:
+        csit-vpp-perf-1707-all:
+        - 17
+      filter: "all"
+      parameters:
+      - "parent"
+      - "doc"
+      - "msg"
+
+Example of "table_details" which generates "Test configuration - VPP Performance
+Test Configs":
+
+::
+
+    -
+      type: "table"
+      title: "Test configuration - VPP Performance Test Configs"
+      algorithm: "table_details"
+      output-file-ext: ".csv"
+      output-file: "{DIR[WORKING]}/vpp_test_configuration"
+      columns:
+      -
+        title: "Name"
+        data: "data name"
+      -
+        title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case"
+        data: "data show-run"
+      data:
+        csit-vpp-perf-1707-all:
+        - 17
+      filter: "all"
+      parameters:
+      - "parent"
+      - "name"
+      - "show-run"
+
+
+Section: Plot
+'''''''''''''
+
+This section defines a plot to be generated. There can be 0 or more "plot"
+sections.
+
+This section has these parts:
+
+ - type: "plot" - says that this section defines a plot.
+ - title: Plot title used in the logs. The displayed title is set in the
+   section "layout".
+ - output-file-type: format of the output file.
+ - output-file: file which the plot will be written to.
+ - algorithm: Algorithm used to generate the plot. The other parameters in this
+   section must provide all information needed by plot.ly to generate the plot.
+   For example:
+
+   - traces
+   - layout
+
+   These parameters are transparently passed to plot.ly.
+
+ - data: Specify the jobs and numbers of builds whose data is used to generate
+   the plot.
+ - filter: filter applied on the input data.
+ - parameters: Only these parameters will be put to the output data structure.
+
+The structure of the section "Plot" is as follows (example of a plot showing
+throughput in a box-and-whiskers chart):
+
+::
+
+    -
+      type: "plot"
+      title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+      algorithm: "plot_performance_box"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc"
+      data:
+        csit-vpp-perf-1707-all:
+        - 9
+        - 10
+        - 13
+        - 14
+        - 15
+        - 16
+        - 17
+        - 18
+        - 19
+        - 21
+      # Keep this formatting, the filter is enclosed with " (quotation mark) and
+      # each tag is enclosed with ' (apostrophe).
+      filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+      parameters:
+      - "throughput"
+      - "parent"
+      traces:
+        hoverinfo: "x+y"
+        boxpoints: "outliers"
+        whiskerwidth: 0
+      layout:
+        title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+        xaxis:
+          autorange: True
+          autotick: False
+          fixedrange: False
+          gridcolor: "rgb(238, 238, 238)"
+          linecolor: "rgb(238, 238, 238)"
+          linewidth: 1
+          showgrid: True
+          showline: True
+          showticklabels: True
+          tickcolor: "rgb(238, 238, 238)"
+          tickmode: "linear"
+          title: "Indexed Test Cases"
+          zeroline: False
+        yaxis:
+          gridcolor: "rgb(238, 238, 238)"
+          hoverformat: ".4s"
+          linecolor: "rgb(238, 238, 238)"
+          linewidth: 1
+          range: []
+          showgrid: True
+          showline: True
+          showticklabels: True
+          tickcolor: "rgb(238, 238, 238)"
+          title: "Packets Per Second [pps]"
+          zeroline: False
+        boxmode: "group"
+        boxgroupgap: 0.5
+        autosize: False
+        margin:
+          t: 50
+          b: 20
+          l: 50
+          r: 20
+        showlegend: True
+        legend:
+          orientation: "h"
+        width: 700
+        height: 1000
+
+The structure of the section "Plot" is as follows (example of a plot showing
+latency in a box chart):
+
+::
+
+    -
+      type: "plot"
+      title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+      algorithm: "plot_latency_box"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50"
+      data:
+        csit-vpp-perf-1707-all:
+        - 9
+        - 10
+        - 13
+        - 14
+        - 15
+        - 16
+        - 17
+        - 18
+        - 19
+        - 21
+      filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+      parameters:
+      - "latency"
+      - "parent"
+      traces:
+        boxmean: False
+      layout:
+        title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+        xaxis:
+          autorange: True
+          autotick: False
+          fixedrange: False
+          gridcolor: "rgb(238, 238, 238)"
+          linecolor: "rgb(238, 238, 238)"
+          linewidth: 1
+          showgrid: True
+          showline: True
+          showticklabels: True
+          tickcolor: "rgb(238, 238, 238)"
+          tickmode: "linear"
+          title: "Indexed Test Cases"
+          zeroline: False
+        yaxis:
+          gridcolor: "rgb(238, 238, 238)"
+          hoverformat: ""
+          linecolor: "rgb(238, 238, 238)"
+          linewidth: 1
+          range: []
+          showgrid: True
+          showline: True
+          showticklabels: True
+          tickcolor: "rgb(238, 238, 238)"
+          title: "Latency min/avg/max [uSec]"
+          zeroline: False
+        boxmode: "group"
+        boxgroupgap: 0.5
+        autosize: False
+        margin:
+          t: 50
+          b: 20
+          l: 50
+          r: 20
+        showlegend: True
+        legend:
+          orientation: "h"
+        width: 700
+        height: 1000
+
+The structure of the section "Plot" is as follows (example of a plot showing
+VPP HTTP server performance in a box chart with the pre-defined data set
+"plot-vpp-httlp-server-performance" and the plot layout "plot-cps"):
+
+::
+
+    -
+      type: "plot"
+      title: "VPP HTTP Server Performance"
+      algorithm: "plot_http_server_performance_box"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/http-server-performance-cps"
+      data:
+        "plot-vpp-httlp-server-performance"
+      # Keep this formatting, the filter is enclosed with " (quotation mark) and
+      # each tag is enclosed with ' (apostrophe).
+      filter: "'HTTP' and 'TCP_CPS'"
+      parameters:
+      - "result"
+      - "name"
+      traces:
+        hoverinfo: "x+y"
+        boxpoints: "outliers"
+        whiskerwidth: 0
+      layout:
+        title: "VPP HTTP Server Performance"
+        layout:
+          "plot-cps"
+
+
+Section: file
+'''''''''''''
+
+This section defines a file to be generated. There can be 0 or more "file"
+sections.
+
+This section has the following parts:
+
+ - type: "file" - says that this section defines a file.
+ - title: Title of the file.
+ - algorithm: Algorithm which is used to generate the file. The other
+   parameters in this section must provide all information needed by the used
+   algorithm.
+ - output-file-ext: extension of the output file.
+ - output-file: file which the output will be written to.
+ - file-header: The header of the generated .rst file.
+ - dir-tables: The directory with the tables.
+ - data: Specify the jobs and builds whose data is used to generate the file.
+ - filter: filter based on tags applied on the input data, if "all" is
+   used, no filtering is done.
+ - parameters: Only these parameters will be put to the output data structure.
+ - chapters: the hierarchy of chapters in the generated file.
+ - start-level: the level of the top-level chapter.
+
+The structure of the section "file" is as follows (example):
+
+::
+
+    -
+      type: "file"
+      title: "VPP Performance Results"
+      algorithm: "file_test_results"
+      output-file-ext: ".rst"
+      output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results"
+      file-header: "\n.. |br| raw:: html\n\n    <br />\n\n\n.. |prein| raw:: html\n\n    <pre>\n\n\n.. |preout| raw:: html\n\n    </pre>\n\n"
+      dir-tables: "{DIR[DTR,PERF,VPP]}"
+      data:
+        csit-vpp-perf-1707-all:
+        - 22
+      filter: "all"
+      parameters:
+      - "name"
+      - "doc"
+      - "level"
+      data-start-level: 2  # 0, 1, 2, ...
+      chapters-start-level: 2  # 0, 1, 2, ...
+
+
+Static content
+``````````````
+
+ - Manually created / edited files.
+ - .rst files, static .csv files, static pictures (.svg), ...
+ - Stored in CSIT git repository.
+
+This document gives no further details about the static content.
+
+
+Data to process
+```````````````
+
+The PAL processes test results and other information produced by Jenkins jobs.
+The data are now stored as robot results in Jenkins (TODO: store the data in
+nexus) as .zip and / or .xml files.
+
+
+Data processing
+---------------
+
+As the first step, the data are downloaded and stored locally (typically on a
+Jenkins slave). If .zip files are used, the given .xml files are extracted for
+further processing.
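+
+The extraction step can be sketched with the Python standard library module
+"zipfile". The sketch below is an illustration, not the PAL code: the function
+name "extract_member" is hypothetical, and a tiny archive is built in memory
+in place of a file downloaded from Jenkins. The member name
+"robot-plugin/output.xml" follows the "extract" parameter shown in the
+specification examples.
+
```python
import io
import zipfile


def extract_member(zip_bytes, member):
    """Return the content of one member of a zip archive held in memory."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.read(member)


# Build a tiny archive in memory so that the sketch runs without Jenkins.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("robot-plugin/output.xml", "<robot></robot>")
    archive.writestr("robot-plugin/log.html", "<html></html>")

xml = extract_member(buffer.getvalue(), "robot-plugin/output.xml")
print(xml.decode("utf-8"))
```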
+
+Parsing of the .xml files is performed by a class derived from
+"robot.api.ResultVisitor"; only the necessary methods are overridden. Only the
+necessary data are extracted from the .xml file and stored in a structured form.
+
+The parsed data are stored as the multi-indexed pandas.Series data type. Its
+structure is as follows:
+
+::
+
+    <job name>
+      <build>
+        <metadata>
+        <suites>
+        <tests>
+
+"job name", "build", "metadata", "suites", "tests" are indexes to access the
+data. For example:
+
+::
+
+    data =
+
+    job 1 name:
+      build 1:
+        metadata: metadata
+        suites: suites
+        tests: tests
+      ...
+      build N:
+        metadata: metadata
+        suites: suites
+        tests: tests
+    ...
+    job M name:
+      build 1:
+        metadata: metadata
+        suites: suites
+        tests: tests
+      ...
+      build N:
+        metadata: metadata
+        suites: suites
+        tests: tests
+
+Using indexes data["job 1 name"]["build 1"]["tests"] (e.g.:
+data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get all tests of the given
+build together with all their data.
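+
+The indexing can be illustrated with a small Python 3 sketch. Plain nested
+dictionaries are used here as a stand-in for the multi-indexed pandas.Series,
+so that the example has no dependencies; the test IDs and values are made up
+for illustration.
+
```python
# Plain dictionaries standing in for the multi-indexed pandas.Series.
data = {
    "csit-vpp-perf-1704-all": {
        "17": {
            "metadata": {"version": "VPP version"},
            "suites": {},
            "tests": {
                # Hypothetical test IDs (lowercase full paths to the tests).
                "tests.perf.l2.test-a": {"name": "Test A", "tags": ["64B"]},
                "tests.perf.l2.test-b": {"name": "Test B", "tags": ["IMIX"]},
            },
        },
    },
}

# Job name, build and "tests" are used as indexes, one after another.
tests = data["csit-vpp-perf-1704-all"]["17"]["tests"]
names = sorted(test["name"] for test in tests.values())
print(names)
```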
+
+Data will not be accessible directly using indexes, but using getters and
+filters.
+
+**Structure of metadata:**
+
+::
+
+    "metadata": {
+        "version": "VPP version",
+        "job": "Jenkins job name",
+        "build": "Information about the build"
+    },
+
+**Structure of suites:**
+
+::
+
+    "suites": {
+        "Suite name 1": {
+            "doc": "Suite 1 documentation",
+            "parent": "Suite 1 parent"
+        },
+        "Suite name N": {
+            "doc": "Suite N documentation",
+            "parent": "Suite N parent"
+        }
+    }
+
+**Structure of tests:**
+
+Performance tests:
+
+::
+
+    "tests": {
+        "ID": {
+            "name": "Test name",
+            "parent": "Name of the parent of the test",
+            "doc": "Test documentation",
+            "msg": "Test message",
+            "tags": ["tag 1", "tag 2", "tag n"],
+            "type": "PDR" | "NDR",
+            "throughput": {
+                "value": int,
+                "unit": "pps" | "bps" | "percentage"
+            },
+            "latency": {
+                "direction1": {
+                    "100": {
+                        "min": int,
+                        "avg": int,
+                        "max": int
+                    },
+                    "50": {  # Only for NDR
+                        "min": int,
+                        "avg": int,
+                        "max": int
+                    },
+                    "10": {  # Only for NDR
+                        "min": int,
+                        "avg": int,
+                        "max": int
+                    }
+                },
+                "direction2": {
+                    "100": {
+                        "min": int,
+                        "avg": int,
+                        "max": int
+                    },
+                    "50": {  # Only for NDR
+                        "min": int,
+                        "avg": int,
+                        "max": int
+                    },
+                    "10": {  # Only for NDR
+                        "min": int,
+                        "avg": int,
+                        "max": int
+                    }
+                }
+            },
+            "lossTolerance": "lossTolerance",  # Only for PDR
+            "vat-history": "DUT1 and DUT2 VAT History",
+            "show-run": "Show Run"
+        },
+        "ID": {
+            # next test
+        }
+    }
+
+Functional tests:
+
+::
+
+    "tests": {
+        "ID": {
+            "name": "Test name",
+            "parent": "Name of the parent of the test",
+            "doc": "Test documentation",
+            "msg": "Test message",
+            "tags": ["tag 1", "tag 2", "tag n"],
+            "vat-history": "DUT1 and DUT2 VAT History",
+            "show-run": "Show Run",
+            "status": "PASS" | "FAIL"
+        },
+        "ID": {
+            # next test
+        }
+    }
+
+Note: ID is the lowercase full path to the test.
+
+
+Data filtering
+``````````````
+
+The first step when generating an element is getting the data needed to
+construct the element. The data are filtered from the processed input data.
+
+The data filtering is based on:
+
+ - job name(s).
+ - build number(s).
+ - tag(s).
+ - required data - only this data is included in the output.
+
+WARNING: The filtering is based on tags, so be careful with tagging.
+
+For example, the element whose specification includes:
+
+::
+
+    data:
+      csit-vpp-perf-1707-all:
+      - 9
+      - 10
+      - 13
+      - 14
+      - 15
+      - 16
+      - 17
+      - 18
+      - 19
+      - 21
+    filter:
+      - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+
+will be constructed using data from the job "csit-vpp-perf-1707-all", for all
+listed builds and the tests with the list of tags matching the filter
+conditions.
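+
+A minimal Python sketch of how such a tag filter could be evaluated against
+the tags of a single test. The helper name ``tags_match`` and the approach
+(rewriting each quoted tag into a membership check and evaluating the result)
+are assumptions made for illustration, not necessarily what PAL does:
+
```python
import re


def tags_match(tag_filter, tags):
    """Return True if the tags satisfy the boolean filter expression.

    Each quoted tag, e.g. '64B', is rewritten to ('64B' in tags) and the
    resulting expression is evaluated with no builtins available.
    """
    condition = re.sub(r"'([^']*)'", r"('\1' in tags)", tag_filter)
    return bool(eval(condition, {"__builtins__": {}}, {"tags": set(tags)}))


tag_filter = (
    "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and "
    "('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
)
selected = tags_match(tag_filter, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD"])
```
+
+Because the expression is evaluated, the filter string must come from a
+trusted specification file.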
+
+The output data structure for filtered test data is:
+
+::
+
+    - job 1
+      - build 1
+        - test 1
+          - parameter 1
+          - parameter 2
+          ...
+          - parameter n
+        ...
+        - test n
+        ...
+      ...
+      - build n
+    ...
+    - job n
+
+
+Data analytics
+``````````````
+
+The data analytics part implements:
+
+ - methods to compute statistical data from the filtered input data.
+ - trending.
+
+Throughput Speedup Analysis - Multi-Core with Multi-Threading
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+Throughput Speedup Analysis (TSA) calculates throughput speedup ratios
+for tested 1-, 2- and 4-core multi-threaded VPP configurations using the
+following formula:
+
+::
+
+                                N_core_throughput
+    N_core_throughput_speedup = -----------------
+                                1_core_throughput
+
+Multi-core throughput speedup ratios are plotted in grouped bar graphs
+for throughput tests with 64B/78B frame size, with number of cores on
+X-axis and speedup ratio on Y-axis.
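+
+The speedup formula can be sketched in Python as follows. The function name
+``speedup_ratios`` and the sample throughput values are illustrative
+assumptions, not measured data:
+
```python
def speedup_ratios(throughput_by_cores):
    """Return N-core throughput divided by 1-core throughput for each N."""
    base = throughput_by_cores[1]
    return {
        cores: value / base
        for cores, value in sorted(throughput_by_cores.items())
    }


# Placeholder throughput values (e.g. in Mpps) for 1, 2 and 4 cores.
ratios = speedup_ratios({1: 10.0, 2: 19.0, 4: 36.0})
```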
+
+For better comparison, data sets from multiple test results are plotted in
+each graph:
+
+    - graph type: grouped bars;
+    - graph X-axis: (testcase index, number of cores);
+    - graph Y-axis: speedup factor.
+
+A subset of existing performance tests is covered by TSA graphs.
+
+**Model for TSA:**
+
+::
+
+    -
+      type: "plot"
+      title: "TSA: 64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+      algorithm: "plot_throughput_speedup_analysis"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/10ge2p1x520-64B-l2-tsa-ndrdisc"
+      data:
+        "plot-throughput-speedup-analysis"
+      filter: "'NIC_Intel-X520-DA2' and '64B' and 'BASE' and 'NDRDISC' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
+      parameters:
+      - "throughput"
+      - "parent"
+      - "tags"
+      layout:
+        title: "64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
+        layout:
+          "plot-throughput-speedup-analysis"
+
+
+Comparison of results from two sets of the same test executions
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+This algorithm enables comparison of results coming from two sets of the
+same test executions. It is used to quantify performance changes across
+all tests after test environment changes, e.g. operating system
+upgrades/patches or hardware changes.
+
+It is assumed that each set of test executions includes multiple runs
+of the same tests, 10 or more, to verify test results repeatability and
+to yield statistically meaningful results data.
+
+Comparison results are presented in a table with a specified number of
+the best and the worst relative changes between the two sets. The following
+table columns are defined:
+
+    - name of the test;
+    - throughput mean values of the reference set;
+    - throughput standard deviation of the reference set;
+    - throughput mean values of the set to compare;
+    - throughput standard deviation of the set to compare;
+    - relative change of the mean values.
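+
+A minimal sketch of how the values in one table row could be computed. The
+function name ``compare_sets``, the dictionary keys, the use of the sample
+standard deviation and the percentage form of the relative change are
+assumptions for illustration:
+
```python
import statistics


def compare_sets(reference, compare):
    """Return mean, standard deviation and relative change for two sets."""
    ref_mean = statistics.mean(reference)
    cmp_mean = statistics.mean(compare)
    return {
        "ref-mean": ref_mean,
        "ref-stdev": statistics.stdev(reference),
        "cmp-mean": cmp_mean,
        "cmp-stdev": statistics.stdev(compare),
        # Relative change of the mean values, in percent of the reference.
        "change-pct": (cmp_mean - ref_mean) / ref_mean * 100.0,
    }


# Placeholder throughput samples, not real measurements.
row = compare_sets([100.0, 102.0, 98.0], [110.0, 112.0, 108.0])
```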
+
+**The model**
+
+The model specifies:
+
+    - type: "table" - means this section defines a table.
+    - title: Title of the table.
+    - algorithm: Algorithm which is used to generate the table. The other
+      parameters in this section must provide all information needed by the used
+      algorithm.
+    - output-file-ext: Extension of the output file.
+    - output-file: File which the table will be written to.
+    - reference - the builds which are used as the reference for comparison.
+    - compare - the builds which are compared to the reference.
+    - data: Specify the sources, jobs and builds, providing data for generating
+      the table.
+    - filter: Filter based on tags applied on the input data. If "template" is
+      used, filtering is based on the template.
+    - parameters: Only these parameters will be put to the output data
+      structure.
+    - nr-of-tests-shown: Number of the best and the worst tests presented in the
+      table. Use 0 (zero) to present all tests.
+
+*Example:*
+
+::
+
+    -
+      type: "table"
+      title: "Performance comparison"
+      algorithm: "table_performance_comparison"
+      output-file-ext: ".csv"
+      output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/vpp_performance_comparison"
+      reference:
+        title: "csit-vpp-perf-1801-all - 1"
+        data:
+          csit-vpp-perf-1801-all:
+          - 1
+          - 2
+      compare:
+        title: "csit-vpp-perf-1801-all - 2"
+        data:
+          csit-vpp-perf-1801-all:
+          - 1
+          - 2
+      data:
+        "vpp-perf-comparison"
+      filter: "all"
+      parameters:
+      - "name"
+      - "parent"
+      - "throughput"
+      nr-of-tests-shown: 20
+
+
+Advanced data analytics
+```````````````````````
+
+In the future, advanced data analytics (ADA) will be added to analyze the
+telemetry data collected from SUT telemetry sources and correlate it to
+performance test results.
+
+:TODO:
+
+    - describe the concept of ADA.
+    - add specification.
+
+
+Data presentation
+-----------------
+
+PAL generates the plots and tables according to the report models in the
+specification file. The elements are generated using algorithms and data
+specified in their models.
+
+
+Tables
+``````
+
+ - tables are generated by algorithms implemented in PAL; the model includes the
+   algorithm and all necessary information.
+ - output format: csv
+ - generated tables are stored in specified directories and linked to .rst
+   files.
+
+
+Plots
+`````
+
+ - `plot.ly <https://plot.ly/>`_ is currently used to generate plots; the model
+   includes the type of plot and all the necessary information to render it.
+ - output format: html.
+ - generated plots are stored in specified directories and linked to .rst files.
+
+
+Report generation
+-----------------
+
+The report is generated using Sphinx and the Read the Docs template. PAL
+generates html and pdf formats. It is possible to define the content of the
+report by specifying the version (TODO: define the names and content of
+versions).
+
+
+The process
+```````````
+
+1. Read the specification.
+2. Read the input data.
+3. Process the input data.
+4. For each element (plot, table, file) defined in the specification:
+
+   a. Get the data needed to construct the element using a filter.
+   b. Generate the element.
+   c. Store the element.
+
+5. Generate the report.
+6. Store the report (Nexus).
+
+The process is model driven. The elements' models (tables, plots, files
+and the report itself) are defined in the specification file. The script reads
+the elements' models from the specification file and generates the elements.
+
+It is easy to add elements to be generated in the report. If a new type
+of an element is required, only a new algorithm needs to be implemented
+and integrated.
+
+
+Continuous Performance Measurements and Trending
+------------------------------------------------
+
+Performance analysis and trending execution sequence:
+`````````````````````````````````````````````````````
+
+CSIT PA runs performance analysis, change detection and trending using specified
+trend analysis metrics over the rolling window of last <N> sets of historical
+measurement data. PA is defined as follows:
+
+    #. PA job triggers:
+
+        #. By PT job at its completion.
+        #. Manually from Jenkins UI.
+
+    #. Download and parse archived historical data and the new data:
+
+        #. New data from latest PT job is evaluated against the rolling window
+           of <N> sets of historical data.
+        #. Download RF output.xml files and compressed archived data.
+        #. Parse out the data filtering test cases listed in PA specification
+           (part of CSIT PAL specification file).
+
+    #. Calculate trend metrics for the rolling window of <N> sets of historical data:
+
+        #. Calculate quartiles Q1, Q2, Q3.
+        #. Trim outliers using IQR.
+        #. Calculate TMA and TMSD.
+        #. Calculate normal trending range per test case based on TMA and TMSD.
+
+    #. Evaluate new test data against trend metrics:
+
+        #. If within the range of (TMA +/- 3*TMSD) => Result = Pass,
+           Reason = Normal.
+        #. If below the range => Result = Fail, Reason = Regression.
+        #. If above the range => Result = Pass, Reason = Progression.
+
+    #. Generate and publish results:
+
+        #. Relay evaluation result to job result.
+        #. Generate a new set of trend analysis summary graphs and drill-down
+           graphs.
+
+            #. Summary graphs to include measured values with Normal,
+               Progression and Regression markers. MM shown in the background if
+               possible.
+            #. Drill-down graphs to include MM, TMA and TMSD.
+
+        #. Publish trend analysis graphs in html format on
+           https://docs.fd.io/csit/master/trending/.
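+
+The trend metrics and the evaluation steps above can be sketched in Python as
+follows. This is only an illustration under stated assumptions: TMA and TMSD
+are taken to be the mean and the sample standard deviation of the trimmed
+window, outliers are values outside Q1 - 1.5*IQR and Q3 + 1.5*IQR, and the
+function names are hypothetical. ``statistics.quantiles`` requires
+Python 3.8 or newer:
+
```python
import statistics


def trim_outliers(window, factor=1.5):
    """Drop values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    q1, _q2, q3 = statistics.quantiles(window, n=4)
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [value for value in window if low <= value <= high]


def evaluate(window, new_value):
    """Classify a new result against the range TMA +/- 3*TMSD."""
    trimmed = trim_outliers(window)
    tma = statistics.mean(trimmed)
    tmsd = statistics.stdev(trimmed)  # needs at least two trimmed values
    if new_value < tma - 3 * tmsd:
        return "Fail", "Regression"
    if new_value > tma + 3 * tmsd:
        return "Pass", "Progression"
    return "Pass", "Normal"


# Placeholder window of historical results with one obvious outlier (50.0).
window = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 50.0]
result = evaluate(window, 10.1)
```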
+
+
+Parameters to specify:
+``````````````````````
+
+- job to be monitored - the Jenkins job whose results are used as input data for
+  this test;
+- builds used for trending plot(s) - specified by a list of build numbers or by
+  a range of builds defined by the first and the last build number;
+- list of plots to generate:
+
+  - plot title;
+  - output file name;
+  - data for plots;
+  - tests to be displayed in the plot defined by a filter;
+  - list of parameters to extract from the data;
+  - periods (daily = 1, weekly = 5, monthly = 30);
+  - plot layout.
+
+*Example:*
+
+::
+
+    -
+      type: "cpta"
+      title: "Continuous Performance Trending and Analysis"
+      algorithm: "cpta"
+      output-file-type: ".html"
+      output-file: "{DIR[STATIC,VPP]}/cpta"
+      data: "plot-performance-trending"
+      plots:
+        - title: "VPP 1T1C L2 64B Packet Throughput - {period} Trending"
+          output-file-name: "l2-1t1c-x520"
+          data: "plot-performance-trending"
+          filter: "'NIC_Intel-X520-DA2' and 'MRR' and '64B' and ('BASE' or 'SCALE') and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST' and not 'MEMIF'"
+          parameters:
+          - "result"
+    #      - "name"
+          periods:
+          - 1
+          - 5
+          - 30
+          layout: "plot-cpta"
+
+        - title: "VPP 2T2C L2 64B Packet Throughput - {period} Trending"
+          output-file-name: "l2-2t2c-x520"
+          data: "plot-performance-trending"
+          filter: "'NIC_Intel-X520-DA2' and 'MRR' and '64B' and ('BASE' or 'SCALE') and '2T2C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST' and not 'MEMIF'"
+          parameters:
+          - "result"
+    #      - "name"
+          periods:
+          - 1
+          - 5
+          - 30
+          layout: "plot-cpta"
+
+API
+---
+
+List of modules, classes, methods and functions
+```````````````````````````````````````````````
+
+::
+
+    specification_parser.py
+
+        class Specification
+
+            Methods:
+                read_specification
+                set_input_state
+                set_input_file_name
+
+            Getters:
+                specification
+                environment
+                debug
+                is_debug
+                input
+                builds
+                output
+                tables
+                plots
+                files
+                static
+
+
+    input_data_parser.py
+
+        class InputData
+
+            Methods:
+                read_data
+                filter_data
+
+            Getters:
+                data
+                metadata
+                suites
+                tests
+
+
+    environment.py
+
+        Functions:
+            clean_environment
+
+        class Environment
+
+            Methods:
+                set_environment
+
+            Getters:
+                environment
+
+
+    input_data_files.py
+
+        Functions:
+            download_data_files
+            unzip_files
+
+
+    generator_tables.py
+
+        Functions:
+            generate_tables
+
+        Functions implementing algorithms to generate particular types of
+        tables (called by the function "generate_tables"):
+            table_details
+            table_performance_improvements
+
+
+    generator_plots.py
+
+        Functions:
+            generate_plots
+
+        Functions implementing algorithms to generate particular types of
+        plots (called by the function "generate_plots"):
+            plot_performance_box
+            plot_latency_box
+
+
+    generator_files.py
+
+        Functions:
+            generate_files
+
+        Functions implementing algorithms to generate particular types of
+        files (called by the function "generate_files"):
+            file_test_results
+
+
+    report.py
+
+        Functions:
+            generate_report
+
+        Functions implementing algorithms to generate particular types of
+        report (called by the function "generate_report"):
+            generate_html_report
+            generate_pdf_report
+
+        Other functions called by the function "generate_report":
+            archive_input_data
+            archive_report
+
+
+PAL functional diagram
+``````````````````````
+
+.. only:: latex
+
+    .. raw:: latex
+
+        \begin{figure}[H]
+        \centering
+            \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_func_diagram}
+            \label{fig:pal_func_diagram}
+        \end{figure}
+
+.. only:: html
+
+    .. figure:: pal_func_diagram.svg
+        :alt: PAL functional diagram
+        :align: center
+
+
+How to add an element
+`````````````````````
+
+An element can be added by adding its model to the specification file. If
+the element is to be generated by an existing algorithm, only its
+parameters must be set.
+
+If a brand new type of element needs to be added, the algorithm must also
+be implemented. Element generation algorithms are implemented in
+the files with names starting with "generator" prefix. The name of the
+function implementing the algorithm and the name of algorithm in the
+specification file have to be the same.
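+
+The naming rule can be illustrated with a small dispatch sketch in Python. The
+algorithm ``table_example``, the ``ALGORITHMS`` registry and the function
+``generate_element`` are hypothetical names used only to show how the
+"algorithm" value from a model could select the function of the same name:
+
```python
def table_example(spec, data):
    """Hypothetical algorithm: report the table title and number of rows."""
    return "{0}: {1} rows".format(spec["title"], len(data))


# Registry keyed by function name, so the name in the specification
# and the name of the implementing function are necessarily the same.
ALGORITHMS = {func.__name__: func for func in (table_example,)}


def generate_element(spec, data):
    """Call the algorithm named in the element model."""
    name = spec["algorithm"]
    if name not in ALGORITHMS:
        raise NotImplementedError(
            "Algorithm '{0}' is not implemented.".format(name))
    return ALGORITHMS[name](spec, data)


output = generate_element(
    {"type": "table", "title": "Demo", "algorithm": "table_example"},
    [1, 2],
)
```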