Presentation and Analytics
==========================

Overview
--------

The presentation and analytics layer (PAL) is the fourth layer of the CSIT
hierarchy. The model of the presentation and analytics layer consists of four
sub-layers, bottom up:

- sL1 - Data - input data to be processed:

  - Static content - .rst text files, .svg static figures, and other files
    stored in the CSIT git repository.
  - Data to process - .xml files generated by Jenkins jobs executing tests,
    stored as robot results files (output.xml).
  - Specification - .yaml file with the models of report elements (tables,
    plots, layout, ...) generated by this tool. There is also the configuration
    of the tool and the specification of input data (jobs and builds).

- sL2 - Data processing

  - The data are read from the specified input files (.xml) and stored as
    multi-indexed `pandas.Series <https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.html>`_.
  - This layer also provides an interface to the input data and filtering of
    the input data.

- sL3 - Data presentation - This layer generates the elements specified in the
  specification file:

  - Tables: .csv files linked to static .rst files.
  - Plots: .html files generated using plot.ly linked to static .rst files.

- sL4 - Report generation - Sphinx generates required formats and versions:

  - formats: html, pdf
  - versions: minimal, full (TODO: define the names and scope of versions)

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/csit_framework_documentation/}}
                \includegraphics[width=0.90\textwidth]{pal_layers}
                \label{fig:pal_layers}
        \end{figure}

.. only:: html

    .. figure:: pal_layers.svg
        :alt: PAL Layers
        :align: center

Data
----

Report Specification
````````````````````

The report specification file defines which data is used and which outputs are
generated. It is human readable and structured. It is easy to add / remove /
change items. The specification includes:

- Specification of the environment.
- Configuration of debug mode (optional).
- Specification of input data (jobs, builds, files, ...).
- Specification of the output.
- What and how is generated:

  - What: plots, tables.
  - How: specification of all properties and parameters.

- .yaml format.

Structure of the specification file
'''''''''''''''''''''''''''''''''''

The specification file is organized as a list of dictionaries distinguished by
the type:

::

    - type: "environment"
    - type: "configuration"
    - type: "debug"
    - type: "static"
    - type: "input"
    - type: "output"
    - type: "table"
    - type: "plot"
    - type: "file"

Each type represents a section. The sections "environment", "debug", "static",
"input" and "output" are listed only once in the specification; "table", "file"
and "plot" can be there multiple times.

Sections "debug", "table", "file" and "plot" are optional.

Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is
possible to define and implement other elements if needed.

Section: Environment
''''''''''''''''''''

This section has the following parts:

- type: "environment" - says that this is the section "environment".
- configuration - configuration of the PAL.
- paths - paths used by the PAL.
- urls - urls pointing to the data sources.
- make-dirs - a list of the directories to be created by the PAL while
  preparing the environment.
- remove-dirs - a list of the directories to be removed while cleaning the
  environment.
- build-dirs - a list of the directories where the results are stored.

The structure of the section "Environment" is as follows (example):

::

    - type: "environment"
      configuration:
        # Debug mode:
        # - Skip:
        #   - Download of input data files
        # - Do:
        #   - Read data from given zip / xml files
        #   - Set the configuration as it is done in normal mode
        # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.
CFG[DEBUG]: 0 paths: # Top level directories: ## Working directory DIR[WORKING]: "_tmp" ## Build directories DIR[BUILD,HTML]: "_build" DIR[BUILD,LATEX]: "_build_latex" # Static .rst files DIR[RST]: "../../../docs/report" # Working directories ## Input data files (.zip, .xml) DIR[WORKING,DATA]: "{DIR[WORKING]}/data" ## Static source files from git DIR[WORKING,SRC]: "{DIR[WORKING]}/src" DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static" # Static html content DIR[STATIC]: "{DIR[BUILD,HTML]}/_static" DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp" DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk" DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive" # Detailed test results DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results" DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results" DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results" DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results" DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results" DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements" # Detailed test configurations DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration" DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration" DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration" # Detailed tests operational data DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data" DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data" # .css patch file to fix tables generated by Sphinx DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css" DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css" urls: URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job" URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job" make-dirs: # List the directories which are created while preparing the environment. # All directories MUST be defined in "paths" section. 
- "DIR[WORKING,DATA]" - "DIR[STATIC,VPP]" - "DIR[STATIC,DPDK]" - "DIR[STATIC,ARCH]" - "DIR[BUILD,LATEX]" - "DIR[WORKING,SRC]" - "DIR[WORKING,SRC,STATIC]" remove-dirs: # List the directories which are deleted while cleaning the environment. # All directories MUST be defined in "paths" section. #- "DIR[BUILD,HTML]" build-dirs: # List the directories where the results (build) is stored. # All directories MUST be defined in "paths" section. - "DIR[BUILD,HTML]" - "DIR[BUILD,LATEX]" It is possible to use defined items in the definition of other items, e.g.: :: DIR[WORKING,DATA]: "{DIR[WORKING]}/data" will be automatically changed to :: DIR[WORKING,DATA]: "_tmp/data" Section: Configuration '''''''''''''''''''''' This section specifies the groups of parameters which are repeatedly used in the elements defined later in the specification file. It has the following parts: - data sets - Specification of data sets used later in element's specifications to define the input data. - plot layouts - Specification of plot layouts used later in plots' specifications to define the plot layout. 
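The substitution of defined items described for the "paths" part of the
"environment" section can be sketched in Python as follows. This is a minimal
sketch, not the PAL implementation: the function name ``resolve_paths`` and the
regular expression are assumptions made for illustration only.

```python
import re

# Matches a reference such as {DIR[WORKING]} or {DIR[WORKING,SRC]}.
_REFERENCE = re.compile(r"\{(\w+\[[^\]]+\])\}")


def resolve_paths(paths):
    """Return a copy of ``paths`` with all {KEY} references expanded.

    ``resolve_paths`` is a hypothetical helper, not part of PAL.
    Unknown references are left untouched.
    """
    resolved = dict(paths)

    def expand(match):
        return resolved.get(match.group(1), match.group(0))

    # Repeat so that chained references (an item using another item which
    # itself uses a third one) are expanded as well.
    for _ in range(len(resolved) + 1):
        changed = False
        for key in list(resolved):
            new_value = _REFERENCE.sub(expand, resolved[key])
            if new_value != resolved[key]:
                resolved[key] = new_value
                changed = True
        if not changed:
            break
    return resolved


paths = {
    "DIR[WORKING]": "_tmp",
    "DIR[WORKING,DATA]": "{DIR[WORKING]}/data",
    "DIR[WORKING,SRC]": "{DIR[WORKING]}/src",
    "DIR[WORKING,SRC,STATIC]": "{DIR[WORKING,SRC]}/_static",
}
print(resolve_paths(paths)["DIR[WORKING,DATA]"])  # prints "_tmp/data"
```

Repeating the expansion until nothing changes keeps the sketch independent of
the order in which the items are defined.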
The structure of the section "Configuration" is as follows (example): :: - type: "configuration" data-sets: plot-vpp-throughput-latency: csit-vpp-perf-1710-all: - 11 - 12 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 20 vpp-perf-results: csit-vpp-perf-1710-all: - 20 - 23 plot-layouts: plot-throughput: xaxis: autorange: True autotick: False fixedrange: False gridcolor: "rgb(238, 238, 238)" linecolor: "rgb(238, 238, 238)" linewidth: 1 showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" tickmode: "linear" title: "Indexed Test Cases" zeroline: False yaxis: gridcolor: "rgb(238, 238, 238)'" hoverformat: ".4s" linecolor: "rgb(238, 238, 238)" linewidth: 1 range: [] showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" title: "Packets Per Second [pps]" zeroline: False boxmode: "group" boxgroupgap: 0.5 autosize: False margin: t: 50 b: 20 l: 50 r: 20 showlegend: True legend: orientation: "h" width: 700 height: 1000 The definitions from this sections are used in the elements, e.g.: :: - type: "plot" title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" algorithm: "plot_performance_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc" data: "plot-vpp-throughput-latency" filter: "'64B' and ('BASE' or 'SCALE') and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" parameters: - "throughput" - "parent" traces: hoverinfo: "x+y" boxpoints: "outliers" whiskerwidth: 0 layout: title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" layout: "plot-throughput" Section: Debug mode ''''''''''''''''''' This section is optional as it configures the debug mode. It is used if one does not want to download input data files and use local files instead. If the debug mode is configured, the "input" section is ignored. This section has the following parts: - type: "debug" - says that this is the section "debug". 
- general: - input-format - xml or zip. - extract - if "zip" is defined as the input format, this file is extracted from the zip file, otherwise this parameter is ignored. - builds - list of builds from which the data is used. Must include a job name as a key and then a list of builds and their output files. The structure of the section "Debug" is as follows (example): :: - type: "debug" general: input-format: "zip" # zip or xml extract: "robot-plugin/output.xml" # Only for zip builds: # The files must be in the directory DIR[WORKING,DATA] csit-dpdk-perf-1707-all: - build: 10 file: "csit-dpdk-perf-1707-all__10.xml" - build: 9 file: "csit-dpdk-perf-1707-all__9.xml" csit-nsh_sfc-verify-func-1707-ubuntu1604-virl: - build: 2 file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml" csit-vpp-functional-1707-ubuntu1604-virl: - build: lastSuccessfulBuild file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml" hc2vpp-csit-integration-1707-ubuntu1604: - build: lastSuccessfulBuild file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml" csit-vpp-perf-1707-all: - build: 16 file: "csit-vpp-perf-1707-all__16__output.xml" - build: 17 file: "csit-vpp-perf-1707-all__17__output.xml" Section: Static ''''''''''''''' This section defines the static content which is stored in git and will be used as a source to generate the report. This section has these parts: - type: "static" - says that this section is the "static". - src-path - path to the static content. - dst-path - destination path where the static content is copied and then processed. :: - type: "static" src-path: "{DIR[RST]}" dst-path: "{DIR[WORKING,SRC]}" Section: Input '''''''''''''' This section defines the data used to generate elements. It is mandatory if the debug mode is not used. This section has the following parts: - type: "input" - says that this section is the "input". - general - parameters common to all builds: - file-name: file to be downloaded. 
- file-format: format of the downloaded file, ".zip" or ".xml" are supported. - download-path: path to be added to url pointing to the file, e.g.: "{job}/{build}/robot/report/*zip*/{filename}"; {job}, {build} and {filename} are replaced by proper values defined in this section. - extract: file to be extracted from downloaded zip file, e.g.: "output.xml"; if xml file is downloaded, this parameter is ignored. - builds - list of jobs (keys) and numbers of builds which output data will be downloaded. The structure of the section "Input" is as follows (example from 17.07 report): :: - type: "input" # Ignored in debug mode general: file-name: "robot-plugin.zip" file-format: ".zip" download-path: "{job}/{build}/robot/report/*zip*/{filename}" extract: "robot-plugin/output.xml" builds: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 - 22 csit-dpdk-perf-1707-all: - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10 csit-vpp-functional-1707-ubuntu1604-virl: - lastSuccessfulBuild hc2vpp-csit-perf-master-ubuntu1604: - 8 - 9 hc2vpp-csit-integration-1707-ubuntu1604: - lastSuccessfulBuild csit-nsh_sfc-verify-func-1707-ubuntu1604-virl: - 2 Section: Output ''''''''''''''' This section specifies which format(s) will be generated (html, pdf) and which versions will be generated for each format. This section has the following parts: - type: "output" - says that this section is the "output". - format: html or pdf. - version: defined for each format separately. The structure of the section "Output" is as follows (example): :: - type: "output" format: html: - full pdf: - full - minimal TODO: define the names of versions Content of "minimal" version ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ TODO: define the name and content of this version Section: Table '''''''''''''' This section defines a table to be generated. There can be 0 or more "table" sections. This section has the following parts: - type: "table" - says that this section defines a table. - title: Title of the table. 
- algorithm: Algorithm which is used to generate the table. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- template: (optional) a .csv file used as a template while generating the
  table.
- output-file-ext: extension of the output file.
- output-file: file which the table will be written to.
- columns: specification of table columns:

  - title: The title used in the table header.
  - data: Specification of the data, it has two parts - command and arguments:

    - command:

      - template - take the data from template, arguments:

        - number of column in the template.

      - data - take the data from the input data, arguments:

        - jobs and builds which data will be used.

      - operation - performs an operation with the data already in the table,
        arguments:

        - operation to be done, e.g.: mean, stdev, relative_change (compute
          the relative change between two columns) and display number of data
          samples ~= number of test jobs. The operations are implemented in
          utils.py (TODO: move from utils.py to e.g. operations.py).
        - numbers of columns which data will be used (optional).

- data: Specify the jobs and builds which data is used to generate the table.
- filter: filter based on tags applied on the input data, if "template" is
  used, filtering is based on the template.
- parameters: Only these parameters will be put to the output data structure.
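The column operations named above (mean, stdev, relative_change) can be
illustrated by a short Python sketch. It is not the code from utils.py: the
function names, the conversion from pps to Mpps and the use of the sample
standard deviation are assumptions made for this example.

```python
from statistics import mean, stdev


def summarize(samples_pps):
    """Return (mean, stdev) of throughput samples, converted to Mpps.

    Hypothetical helper; assumes at least two samples given in pps and
    uses the sample standard deviation.
    """
    samples_mpps = [sample / 1e6 for sample in samples_pps]
    return mean(samples_mpps), stdev(samples_mpps)


def relative_change(reference, value):
    """Return the change from ``reference`` to ``value`` in percent."""
    if reference == 0:
        raise ValueError("The reference value must not be zero.")
    return (value - reference) / reference * 100.0


# Example: three builds of one test, compared with an older mean of 11.5 Mpps.
new_mean, new_stdev = summarize([10e6, 12e6, 14e6])
print(new_mean, new_stdev, relative_change(11.5, new_mean))
```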
The structure of the section "Table" is as follows (example of "table_performance_improvements"): :: - type: "table" title: "Performance improvements" algorithm: "table_performance_improvements" template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv" output-file-ext: ".csv" output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements" columns: - title: "VPP Functionality" data: "template 1" - title: "Test Name" data: "template 2" - title: "VPP-16.09 mean [Mpps]" data: "template 3" - title: "VPP-17.01 mean [Mpps]" data: "template 4" - title: "VPP-17.04 mean [Mpps]" data: "template 5" - title: "VPP-17.07 mean [Mpps]" data: "data csit-vpp-perf-1707-all mean" - title: "VPP-17.07 stdev [Mpps]" data: "data csit-vpp-perf-1707-all stdev" - title: "17.04 to 17.07 change [%]" data: "operation relative_change 5 4" data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 filter: "template" parameters: - "throughput" Example of "table_details" which generates "Detailed Test Results - VPP Performance Results": :: - type: "table" title: "Detailed Test Results - VPP Performance Results" algorithm: "table_details" output-file-ext: ".csv" output-file: "{DIR[WORKING]}/vpp_performance_results" columns: - title: "Name" data: "data test_name" - title: "Documentation" data: "data test_documentation" - title: "Status" data: "data test_msg" data: csit-vpp-perf-1707-all: - 17 filter: "all" parameters: - "parent" - "doc" - "msg" Example of "table_details" which generates "Test configuration - VPP Performance Test Configs": :: - type: "table" title: "Test configuration - VPP Performance Test Configs" algorithm: "table_details" output-file-ext: ".csv" output-file: "{DIR[WORKING]}/vpp_test_configuration" columns: - title: "Name" data: "data name" - title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case" data: "data show-run" data: csit-vpp-perf-1707-all: - 17 filter: "all" parameters: - "parent" - "name" - "show-run" Section: Plot 
''''''''''''' This section defines a plot to be generated. There can be 0 or more "plot" sections. This section has these parts: - type: "plot" - says that this section defines a plot. - title: Plot title used in the logs. Title which is displayed is in the section "layout". - output-file-type: format of the output file. - output-file: file which the plot will be written to. - algorithm: Algorithm used to generate the plot. The other parameters in this section must provide all information needed by plot.ly to generate the plot. For example: - traces - layout - These parameters are transparently passed to plot.ly. - data: Specify the jobs and numbers of builds which data is used to generate the plot. - filter: filter applied on the input data. - parameters: Only these parameters will be put to the output data structure. The structure of the section "Plot" is as follows (example of a plot showing throughput in a chart box-with-whiskers): :: - type: "plot" title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" algorithm: "plot_performance_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc" data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 # Keep this formatting, the filter is enclosed with " (quotation mark) and # each tag is enclosed with ' (apostrophe). 
filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" parameters: - "throughput" - "parent" traces: hoverinfo: "x+y" boxpoints: "outliers" whiskerwidth: 0 layout: title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" xaxis: autorange: True autotick: False fixedrange: False gridcolor: "rgb(238, 238, 238)" linecolor: "rgb(238, 238, 238)" linewidth: 1 showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" tickmode: "linear" title: "Indexed Test Cases" zeroline: False yaxis: gridcolor: "rgb(238, 238, 238)'" hoverformat: ".4s" linecolor: "rgb(238, 238, 238)" linewidth: 1 range: [] showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" title: "Packets Per Second [pps]" zeroline: False boxmode: "group" boxgroupgap: 0.5 autosize: False margin: t: 50 b: 20 l: 50 r: 20 showlegend: True legend: orientation: "h" width: 700 height: 1000 The structure of the section "Plot" is as follows (example of a plot showing latency in a box chart): :: - type: "plot" title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" algorithm: "plot_latency_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50" data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" parameters: - "latency" - "parent" traces: boxmean: False layout: title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" xaxis: autorange: True autotick: False fixedrange: False gridcolor: "rgb(238, 238, 238)" linecolor: "rgb(238, 238, 238)" linewidth: 1 showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" tickmode: "linear" title: "Indexed Test Cases" zeroline: False yaxis: gridcolor: "rgb(238, 238, 238)'" hoverformat: "" linecolor: "rgb(238, 238, 238)" 
linewidth: 1 range: [] showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" title: "Latency min/avg/max [uSec]" zeroline: False boxmode: "group" boxgroupgap: 0.5 autosize: False margin: t: 50 b: 20 l: 50 r: 20 showlegend: True legend: orientation: "h" width: 700 height: 1000 The structure of the section "Plot" is as follows (example of a plot showing VPP HTTP server performance in a box chart with pre-defined data "plot-vpp-httlp-server-performance" set and plot layout "plot-cps"): :: - type: "plot" title: "VPP HTTP Server Performance" algorithm: "plot_http_server_performance_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/http-server-performance-cps" data: "plot-vpp-httlp-server-performance" # Keep this formatting, the filter is enclosed with " (quotation mark) and # each tag is enclosed with ' (apostrophe). filter: "'HTTP' and 'TCP_CPS'" parameters: - "result" - "name" traces: hoverinfo: "x+y" boxpoints: "outliers" whiskerwidth: 0 layout: title: "VPP HTTP Server Performance" layout: "plot-cps" Section: file ''''''''''''' This section defines a file to be generated. There can be 0 or more "file" sections. This section has the following parts: - type: "file" - says that this section defines a file. - title: Title of the table. - algorithm: Algorithm which is used to generate the file. The other parameters in this section must provide all information needed by the used algorithm. - output-file-ext: extension of the output file. - output-file: file which the file will be written to. - file-header: The header of the generated .rst file. - dir-tables: The directory with the tables. - data: Specify the jobs and builds which data is used to generate the table. - filter: filter based on tags applied on the input data, if "all" is used, no filtering is done. - parameters: Only these parameters will be put to the output data structure. - chapters: the hierarchy of chapters in the generated file. 
- start-level: the level of the top-level chapter.

The structure of the section "file" is as follows (example):

::

    - type: "file"
      title: "VPP Performance Results"
      algorithm: "file_test_results"
      output-file-ext: ".rst"
      output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results"
      file-header: "\n.. |br| raw:: html\n\n <br />\n\n\n.. |prein| raw:: html\n\n <pre>\n\n\n.. |preout| raw:: html\n\n </pre>\n\n"
      dir-tables: "{DIR[DTR,PERF,VPP]}"
      data:
        csit-vpp-perf-1707-all:
        - 22
      filter: "all"
      parameters:
      - "name"
      - "doc"
      - "level"
      data-start-level: 2  # 0, 1, 2, ...
      chapters-start-level: 2  # 0, 1, 2, ...

Static content
``````````````

- Manually created / edited files.
- .rst files, static .csv files, static pictures (.svg), ...
- Stored in CSIT git repository.

No more details about the static content in this document.

Data to process
```````````````

The PAL processes test results and other information produced by Jenkins jobs.
The data are now stored as robot results in Jenkins (TODO: store the data in
nexus) as .zip and / or .xml files.

Data processing
---------------

As the first step, the data are downloaded and stored locally (typically on a
Jenkins slave). If .zip files are used, the given .xml files are extracted for
further processing.

Parsing of the .xml files is performed by a class derived from
"robot.api.ResultVisitor"; only the necessary methods are overridden. Only the
necessary data is extracted from the .xml file and stored in a structured form.

The parsed data are stored as the multi-indexed pandas.Series data type. Its
structure is as follows:

::

    <job name>
      <build>
        <metadata>
        <suites>
        <tests>

"job name", "build", "metadata", "suites", "tests" are indexes to access the
data. For example:

::

    data =

    job 1 name:
      build 1:
        metadata: metadata
        suites: suites
        tests: tests
      ...
      build N:
        metadata: metadata
        suites: suites
        tests: tests
    ...
    job M name:
      build 1:
        metadata: metadata
        suites: suites
        tests: tests
      ...
build N: metadata: metadata suites: suites tests: tests Using indexes data["job 1 name"]["build 1"]["tests"] (e.g.: data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get a list of all tests with all tests data. Data will not be accessible directly using indexes, but using getters and filters. **Structure of metadata:** :: "metadata": { "version": "VPP version", "job": "Jenkins job name" "build": "Information about the build" }, **Structure of suites:** :: "suites": { "Suite name 1": { "doc": "Suite 1 documentation" "parent": "Suite 1 parent" } "Suite name N": { "doc": "Suite N documentation" "parent": "Suite N parent" } **Structure of tests:** Performance tests: :: "tests": { "ID": { "name": "Test name", "parent": "Name of the parent of the test", "doc": "Test documentation" "msg": "Test message" "tags": ["tag 1", "tag 2", "tag n"], "type": "PDR" | "NDR", "throughput": { "value": int, "unit": "pps" | "bps" | "percentage" }, "latency": { "direction1": { "100": { "min": int, "avg": int, "max": int }, "50": { # Only for NDR "min": int, "avg": int, "max": int }, "10": { # Only for NDR "min": int, "avg": int, "max": int } }, "direction2": { "100": { "min": int, "avg": int, "max": int }, "50": { # Only for NDR "min": int, "avg": int, "max": int }, "10": { # Only for NDR "min": int, "avg": int, "max": int } } }, "lossTolerance": "lossTolerance" # Only for PDR "vat-history": "DUT1 and DUT2 VAT History" }, "show-run": "Show Run" }, "ID" { # next test } Functional tests: :: "tests": { "ID": { "name": "Test name", "parent": "Name of the parent of the test", "doc": "Test documentation" "msg": "Test message" "tags": ["tag 1", "tag 2", "tag n"], "vat-history": "DUT1 and DUT2 VAT History" "show-run": "Show Run" "status": "PASS" | "FAIL" }, "ID" { # next test } } Note: ID is the lowercase full path to the test. Data filtering `````````````` The first step when generating an element is getting the data needed to construct the element. 
The data are filtered from the processed input data. The data filtering is based on: - job name(s). - build number(s). - tag(s). - required data - only this data is included in the output. WARNING: The filtering is based on tags, so be careful with tagging. For example, the element which specification includes: :: data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 filter: - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" will be constructed using data from the job "csit-vpp-perf-1707-all", for all listed builds and the tests with the list of tags matching the filter conditions. The output data structure for filtered test data is: :: - job 1 - build 1 - test 1 - parameter 1 - parameter 2 ... - parameter n ... - test n ... ... - build n ... - job n Data analytics `````````````` Data analytics part implements: - methods to compute statistical data from the filtered input data. - trending. Throughput Speedup Analysis - Multi-Core with Multi-Threading ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' Throughput Speedup Analysis (TSA) calculates throughput speedup ratios for tested 1-, 2- and 4-core multi-threaded VPP configurations using the following formula: :: N_core_throughput N_core_throughput_speedup = ----------------- 1_core_throughput Multi-core throughput speedup ratios are plotted in grouped bar graphs for throughput tests with 64B/78B frame size, with number of cores on X-axis and speedup ratio on Y-axis. For better comparison multiple test results' data sets are plotted per each graph: - graph type: grouped bars; - graph X-axis: (testcase index, number of cores); - graph Y-axis: speedup factor. Subset of existing performance tests is covered by TSA graphs. 
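The speedup formula above can be expressed as a short Python sketch. The
function name ``throughput_speedup`` and the input layout (a mapping from the
number of cores to the measured throughput of one test) are assumptions made
for illustration; this is not the "plot_throughput_speedup_analysis" algorithm
itself.

```python
def throughput_speedup(throughput_by_cores):
    """Map the number of cores to the speedup ratio of one test.

    ``throughput_by_cores`` is a hypothetical input, e.g.
    {1: 10.0e6, 2: 19.0e6, 4: 36.0e6} with the throughput in pps.
    The 1-core result is the reference, so its ratio is always 1.0.
    """
    base = throughput_by_cores[1]
    if base <= 0:
        raise ValueError("The 1-core throughput must be positive.")
    return {
        cores: value / base
        for cores, value in sorted(throughput_by_cores.items())
    }


print(throughput_speedup({1: 10.0e6, 2: 19.0e6, 4: 36.0e6}))
```

A ratio close to the number of cores indicates near-linear scaling of the
tested configuration.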
**Model for TSA:**

::

    - type: "plot"
      title: "TSA: 64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
      algorithm: "plot_throughput_speedup_analysis"
      output-file-type: ".html"
      output-file: "{DIR[STATIC,VPP]}/10ge2p1x520-64B-l2-tsa-ndrdisc"
      data: "plot-throughput-speedup-analysis"
      filter: "'NIC_Intel-X520-DA2' and '64B' and 'BASE' and 'NDRDISC' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"
      parameters:
      - "throughput"
      - "parent"
      - "tags"
      layout:
        title: "64B-*-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc"
        layout: "plot-throughput-speedup-analysis"

Comparison of results from two sets of the same test executions
'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''

This algorithm enables comparison of results coming from two sets of the
same test executions. It is used to quantify performance changes across
all tests after test environment changes, e.g. operating system
upgrades/patches or hardware changes.

It is assumed that each set of test executions includes multiple runs
of the same tests, 10 or more, to verify test results repeatability and
to yield statistically meaningful results data.

Comparison results are presented in a table with a specified number of
the best and the worst relative changes between the two sets. The following
table columns are defined:

- name of the test;
- throughput mean values of the reference set;
- throughput standard deviation of the reference set;
- throughput mean values of the set to compare;
- throughput standard deviation of the set to compare;
- relative change of the mean values.

**The model**

The model specifies:

- type: "table" - means this section defines a table.
- title: Title of the table.
- algorithm: Algorithm which is used to generate the table. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- output-file-ext: Extension of the output file.
- output-file: File which the table will be written to.
- reference - the builds which are used as the reference for comparison. - compare - the builds which are compared to the reference. - data: Specify the sources, jobs and builds, providing data for generating the table. - filter: Filter based on tags applied on the input data, if "template" is used, filtering is based on the template. - parameters: Only these parameters will be put to the output data structure. - nr-of-tests-shown: Number of the best and the worst tests presented in the table. Use 0 (zero) to present all tests. *Example:* :: - type: "table" title: "Performance comparison" algorithm: "table_performance_comparison" output-file-ext: ".csv" output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/vpp_performance_comparison" reference: title: "csit-vpp-perf-1801-all - 1" data: csit-vpp-perf-1801-all: - 1 - 2 compare: title: "csit-vpp-perf-1801-all - 2" data: csit-vpp-perf-1801-all: - 1 - 2 data: "vpp-perf-comparison" filter: "all" parameters: - "name" - "parent" - "throughput" nr-of-tests-shown: 20 Advanced data analytics ``````````````````````` In the future advanced data analytics (ADA) will be added to analyze the telemetry data collected from SUT telemetry sources and correlate it to performance test results. :TODO: - describe the concept of ADA. - add specification. Data presentation ----------------- Generates the plots and tables according to the report models per specification file. The elements are generated using algorithms and data specified in their models. Tables `````` - tables are generated by algorithms implemented in PAL, the model includes the algorithm and all necessary information. - output format: csv - generated tables are stored in specified directories and linked to .rst files. Plots ````` - `plot.ly <https://plot.ly/>`_ is currently used to generate plots, the model includes the type of plot and all the necessary information to render it. - output format: html. - generated plots are stored in specified directories and linked to .rst files. 
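As an illustration of how a generated table ends up as a .csv file, the
following Python sketch selects the best and the worst relative changes, in the
spirit of the "nr-of-tests-shown" parameter of the performance comparison
table, and writes them out with the standard ``csv`` module. The function
names and the row layout are assumptions made for this example, not the PAL
implementation.

```python
import csv
import io


def select_best_and_worst(rows, nr_of_tests_shown):
    """Return the N best and the N worst rows by relative change.

    ``rows`` is a hypothetical list of (test name, relative change [%])
    tuples. Following the model above, 0 means that all tests are kept.
    """
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    if nr_of_tests_shown == 0 or len(ordered) <= 2 * nr_of_tests_shown:
        return ordered
    return ordered[:nr_of_tests_shown] + ordered[-nr_of_tests_shown:]


def write_table(rows, header, stream):
    """Write the header and the rows to ``stream`` in .csv format."""
    writer = csv.writer(stream)
    writer.writerow(header)
    writer.writerows(rows)


rows = [("test-a", 5.0), ("test-b", -3.0), ("test-c", 1.0), ("test-d", -7.0)]
output = io.StringIO()
write_table(select_best_and_worst(rows, 1), ["Test Name", "Change [%]"], output)
print(output.getvalue())
```

Writing to a stream rather than to a file name keeps the sketch easy to try
out; a real table would be written to the path given by "output-file".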
Report generation
-----------------

The report is generated using Sphinx and the Read_the_Docs template. PAL
generates html and pdf formats. It is possible to define the content of the
report by specifying the version (TODO: define the names and content of
versions).

The process
```````````

1. Read the specification.
2. Read the input data.
3. Process the input data.
4. For each element (plot, table, file) defined in the specification:

   a. Get the data needed to construct the element using a filter.
   b. Generate the element.
   c. Store the element.

5. Generate the report.
6. Store the report (Nexus).

The process is model driven. The elements' models (tables, plots, files and
the report itself) are defined in the specification file. The script reads the
elements' models from the specification file and generates the elements.

It is easy to add elements to be generated in the report. If a new type of an
element is required, only a new algorithm needs to be implemented and
integrated.

Continuous Performance Measurements and Trending
------------------------------------------------

Performance analysis and trending execution sequence:
`````````````````````````````````````````````````````

CSIT PA runs performance analysis, change detection and trending using specified
trend analysis metrics over the rolling window of last <N> sets of historical
measurement data. PA is defined as follows:

#. PA job triggers:

   #. By PT job at its completion.
   #. Manually from Jenkins UI.

#. Download and parse archived historical data and the new data:

   #. New data from latest PT job is evaluated against the rolling window
      of <N> sets of historical data.
   #. Download RF output.xml files and compressed archived data.
   #. Parse out the data filtering test cases listed in PA specification
      (part of CSIT PAL specification file).

#. Calculate trend metrics for the rolling window of <N> sets of historical
   data:

   #. Calculate quartiles Q1, Q2, Q3.
   #. Trim outliers using IQR.
   #. Calculate TMA and TMSD.
   #. Calculate normal trending range per test case based on TMA and TMSD.

#. Evaluate new test data against trend metrics:

   #. If within the range of (TMA +/- 3*TMSD) => Result = Pass,
      Reason = Normal.
   #. If below the range => Result = Fail, Reason = Regression.
   #. If above the range => Result = Pass, Reason = Progression.

#. Generate and publish results

   #. Relay evaluation result to job result.
   #. Generate a new set of trend analysis summary graphs and drill-down
      graphs.

      #. Summary graphs to include measured values with Normal,
         Progression and Regression markers. MM shown in the background if
         possible.
      #. Drill-down graphs to include MM, TMA and TMSD.

   #. Publish trend analysis graphs in html format on
      https://docs.fd.io/csit/master/trending/.

Parameters to specify:
``````````````````````

*General section - parameters common to all plots:*

- type: "cpta";
- title: The title of this section;
- output-file-type: only ".html" is supported;
- output-file: path where the generated files will be stored.

*Plots section:*

- plot title;
- output file name;
- input data for plots;

  - job to be monitored - the Jenkins job whose results are used as input
    data for this test;
  - builds used for trending plot(s) - specified by a list of build numbers
    or by a range of builds defined by the first and the last build number;

- tests to be displayed in the plot, defined by a filter;
- list of parameters to extract from the data;
- plot layout.

*Example:*

::

    - type: "cpta"
      title: "Continuous Performance Trending and Analysis"
      output-file-type: ".html"
      output-file: "{DIR[STATIC,VPP]}/cpta"
      plots:

        - title: "VPP 1T1C L2 64B Packet Throughput - Trending"
          output-file-name: "l2-1t1c-x520"
          data: "plot-performance-trending-vpp"
          filter: "'NIC_Intel-X520-DA2' and 'MRR' and '64B' and ('BASE' or 'SCALE') and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST' and not 'MEMIF'"
          parameters:
          - "result"
          layout: "plot-cpta-vpp"

        - title: "DPDK 4T4C IMIX MRR Trending"
          output-file-name: "dpdk-imix-4t4c-xl710"
          data: "plot-performance-trending-dpdk"
          filter: "'NIC_Intel-XL710' and 'IMIX' and 'MRR' and '4T4C' and 'DPDK'"
          parameters:
          - "result"
          layout: "plot-cpta-dpdk"

The Dashboard
`````````````

Performance dashboard tables provide the latest VPP throughput trend, trend
compliance and detected anomalies, all on a per VPP test case basis.
The Dashboard is generated as three tables for 1t1c, 2t2c and 4t4c MRR tests.

First, the .csv tables are generated (only the table for 1t1c is shown):

::

    - type: "table"
      title: "Performance trending dashboard"
      algorithm: "table_performance_trending_dashboard"
      output-file-ext: ".csv"
      output-file: "{DIR[STATIC,VPP]}/performance-trending-dashboard-1t1c"
      data: "plot-performance-trending-all"
      filter: "'MRR' and '1T1C'"
      parameters:
      - "name"
      - "parent"
      - "result"
      ignore-list:
      - "tests.vpp.perf.l2.10ge2p1x520-eth-l2bdscale1mmaclrn-mrr.tc01-64b-1t1c-eth-l2bdscale1mmaclrn-ndrdisc"
      outlier-const: 1.5
      window: 14
      evaluated-window: 14
      long-trend-window: 180

Then, html tables stored inside .rst files are generated:

::

    - type: "table"
      title: "HTML performance trending dashboard 1t1c"
      algorithm: "table_performance_trending_dashboard_html"
      input-file: "{DIR[STATIC,VPP]}/performance-trending-dashboard-1t1c.csv"
      output-file: "{DIR[STATIC,VPP]}/performance-trending-dashboard-1t1c.rst"

Root Cause Analysis
-------------------

Root Cause Analysis (RCA) re-analyses archived performance results for a
specified:

- range of job builds,
- set of specific tests, and
- PASS/FAIL criteria to detect performance change.

In addition, PAL generates trending plots to show performance over the
specified time interval.

Root Cause Analysis - Option 1: Analysing Archived VPP Results
``````````````````````````````````````````````````````````````

This option can be used to speed up the process, or when the existing data is
sufficient. In this case, PAL uses existing data saved in Nexus, searches for
performance degradations and generates plots to show performance over the
specified time interval for the selected tests.

Execution Sequence
''''''''''''''''''

#. Download and parse archived historical data and the new data.
#. Calculate trend metrics.
#. Find regression / progression.
#. Generate and publish results:

   #. Summary graphs to include measured values with Progression and
      Regression markers.
   #. List the DUT build(s) where the anomalies were detected.
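The "calculate trend metrics" and "find regression / progression" steps can be sketched as follows. This is an illustration under stated assumptions, not the PAL code: the function names are hypothetical, TMA and TMSD are taken to be the mean and the population standard deviation of the samples left after IQR trimming, outliers are trimmed on both sides, and the outlier constant 1.5 is borrowed from the ``outlier-const`` value in the dashboard example. The evaluation range (TMA +/- 3*TMSD) follows the trending execution sequence.

```python
import statistics


def trim_outliers(samples, outlier_const=1.5):
    """Drop samples outside [Q1 - c*IQR, Q3 + c*IQR] (assumed trimming rule)."""
    q1, _q2, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    low = q1 - outlier_const * iqr
    high = q3 + outlier_const * iqr
    return [sample for sample in samples if low <= sample <= high]


def classify(history, new_value, outlier_const=1.5):
    """Evaluate a new measurement against the window of historical data.

    Returns a (result, reason) tuple.
    """
    trimmed = trim_outliers(history, outlier_const)
    tma = statistics.mean(trimmed)     # assumed meaning of TMA
    tmsd = statistics.pstdev(trimmed)  # assumed meaning of TMSD
    if new_value < tma - 3 * tmsd:
        return "Fail", "Regression"
    if new_value > tma + 3 * tmsd:
        return "Pass", "Progression"
    return "Pass", "Normal"


# Hypothetical rolling window of measured values, with one low outlier.
history = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7, 2.0]

for value in (10.1, 8.0, 12.0):
    print(value, classify(history, value))
```

Trimming before the mean and deviation are computed keeps a single bad run in the window from widening the normal range and masking a later regression.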

CSIT PAL Specification
''''''''''''''''''''''

- What to test:

  - first build (Good); specified by the Jenkins job name and the build
    number
  - last build (Bad); specified by the Jenkins job name and the build
    number
  - step (1..n).

- Data:

  - tests of interest; list of tests (full name is used) whose results are
    used

*Example:*

::

    TODO

API
---

List of modules, classes, methods and functions
```````````````````````````````````````````````

::

    specification_parser.py

        class Specification

            Methods:
                read_specification
                set_input_state
                set_input_file_name

            Getters:
                specification
                environment
                debug
                is_debug
                input
                builds
                output
                tables
                plots
                files
                static


    input_data_parser.py

        class InputData

            Methods:
                read_data
                filter_data

            Getters:
                data
                metadata
                suites
                tests


    environment.py

        Functions:
            clean_environment

        class Environment

            Methods:
                set_environment

            Getters:
                environment


    input_data_files.py

        Functions:
            download_data_files
            unzip_files


    generator_tables.py

        Functions:
            generate_tables

        Functions implementing algorithms to generate particular types of
        tables (called by the function "generate_tables"):
            table_details
            table_performance_improvements


    generator_plots.py

        Functions:
            generate_plots

        Functions implementing algorithms to generate particular types of
        plots (called by the function "generate_plots"):
            plot_performance_box
            plot_latency_box


    generator_files.py

        Functions:
            generate_files

        Functions implementing algorithms to generate particular types of
        files (called by the function "generate_files"):
            file_test_results


    report.py

        Functions:
            generate_report

        Functions implementing algorithms to generate particular types of
        report (called by the function "generate_report"):
            generate_html_report
            generate_pdf_report

        Other functions called by the function "generate_report":
            archive_input_data
            archive_report

PAL functional diagram
``````````````````````

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \graphicspath{{../_tmp/src/csit_framework_documentation/}}
                \includegraphics[width=0.90\textwidth]{pal_func_diagram}
                \label{fig:pal_func_diagram}
        \end{figure}

.. only:: html

    .. figure:: pal_func_diagram.svg
        :alt: PAL functional diagram
        :align: center

How to add an element
`````````````````````

An element can be added by adding its model to the specification file. If the
element is to be generated by an existing algorithm, only its parameters must
be set.

If a brand new type of element needs to be added, the algorithm must also be
implemented. Element generation algorithms are implemented in the files whose
names start with the "generator" prefix. The name of the function
implementing the algorithm and the name of the algorithm in the specification
file have to be the same.
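The naming rule can be illustrated with a minimal sketch of a generator that looks up the algorithm by the name given in the model. The function names ``generate_tables`` and ``table_details`` are taken from the API list, but their signatures, the ``ALGORITHMS`` registry and the sample data are assumptions made for this example; the actual PAL lookup may differ.

```python
def table_details(table, data):
    """Hypothetical algorithm: one row per test, sorted by test name."""
    return [[name, test["result"]] for name, test in sorted(data.items())]


# Registry keyed by function name, so that the "algorithm" value in the
# specification must match the name of the implementing function.
ALGORITHMS = {func.__name__: func for func in (table_details,)}


def generate_tables(spec_tables, data):
    """Generate every table model using the algorithm it names."""
    generated = {}
    for table in spec_tables:
        name = table["algorithm"]
        try:
            algorithm = ALGORITHMS[name]
        except KeyError:
            raise ValueError(
                f"Algorithm '{name}' is not implemented."
            ) from None
        generated[table["title"]] = algorithm(table, data)
    return generated


# Hypothetical model and input data.
spec_tables = [
    {"type": "table", "title": "Details", "algorithm": "table_details"},
]
data = {"tc02": {"result": 7.5}, "tc01": {"result": 9.1}}

print(generate_tables(spec_tables, data))
```

With this layout, adding a new type of table means writing one new function and registering it; the specification then refers to it by name.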