Presentation and Analytics Layer
================================

Overview
--------

The presentation and analytics layer (PAL) is the fourth layer of the CSIT
hierarchy. The model of the presentation and analytics layer consists of four
sub-layers, bottom up:

- sL1 - Data - input data to be processed:

  - Static content - .rst text files, .svg static figures, and other files
    stored in the CSIT git repository.
  - Data to process - .xml files generated by Jenkins jobs executing tests,
    stored as robot results files (output.xml).
  - Specification - .yaml file with the models of report elements (tables,
    plots, layout, ...) generated by this tool. It also contains the
    configuration of the tool and the specification of input data (jobs and
    builds).

- sL2 - Data processing

  - The data are read from the specified input files (.xml) and stored as
    multi-indexed pandas.Series.
  - This layer also provides an interface to the input data and filtering of
    the input data.

- sL3 - Data presentation - This layer generates the elements specified in the
  specification file:

  - Tables: .csv files linked to static .rst files.
  - Plots: .html files generated using plot.ly linked to static .rst files.

- sL4 - Report generation - Sphinx generates required formats and versions:

  - formats: html, pdf
  - versions: minimal, full (TODO: define the names and scope of versions)

.. only:: latex

    .. raw:: latex

        \begin{figure}[H]
            \centering
                \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_layers}
                \label{fig:pal_layers}
        \end{figure}

.. only:: html

    .. figure:: pal_layers.svg
        :alt: PAL Layers
        :align: center

Data
----

Report Specification
````````````````````

The report specification file defines which data is used and which outputs are
generated. It is human readable and structured. It is easy to add / remove /
change items. The specification includes:

- Specification of the environment.
- Configuration of debug mode (optional).
- Specification of input data (jobs, builds, files, ...).
- Specification of the output.
- What and how is generated:

  - What: plots, tables.
  - How: specification of all properties and parameters.

- .yaml format.

Structure of the specification file
'''''''''''''''''''''''''''''''''''

The specification file is organized as a list of dictionaries distinguished by
the type:

::

    - type: "environment"
    - type: "configuration"
    - type: "debug"
    - type: "static"
    - type: "input"
    - type: "output"
    - type: "table"
    - type: "plot"
    - type: "file"

Each type represents a section. The sections "environment", "debug", "static",
"input" and "output" are listed only once in the specification; "table", "file"
and "plot" can be there multiple times.

Sections "debug", "table", "file" and "plot" are optional.

Table(s), file(s) and plot(s) are referred to as "elements" in this text. It is
possible to define and implement other elements if needed.

Section: Environment
''''''''''''''''''''

This section has the following parts:

- type: "environment" - says that this is the section "environment".
- configuration - configuration of the PAL.
- paths - paths used by the PAL.
- urls - urls pointing to the data sources.
- make-dirs - a list of the directories to be created by the PAL while
  preparing the environment.
- remove-dirs - a list of the directories to be removed while cleaning the
  environment.
- build-dirs - a list of the directories where the results are stored.

The structure of the section "Environment" is as follows (example):

::

    - type: "environment"
      configuration:
        # Debug mode:
        # - Skip:
        #   - Download of input data files
        # - Do:
        #   - Read data from given zip / xml files
        #   - Set the configuration as it is done in normal mode
        # If the section "type: debug" is missing, CFG[DEBUG] is set to 0.
CFG[DEBUG]: 0 paths: # Top level directories: ## Working directory DIR[WORKING]: "_tmp" ## Build directories DIR[BUILD,HTML]: "_build" DIR[BUILD,LATEX]: "_build_latex" # Static .rst files DIR[RST]: "../../../docs/report" # Working directories ## Input data files (.zip, .xml) DIR[WORKING,DATA]: "{DIR[WORKING]}/data" ## Static source files from git DIR[WORKING,SRC]: "{DIR[WORKING]}/src" DIR[WORKING,SRC,STATIC]: "{DIR[WORKING,SRC]}/_static" # Static html content DIR[STATIC]: "{DIR[BUILD,HTML]}/_static" DIR[STATIC,VPP]: "{DIR[STATIC]}/vpp" DIR[STATIC,DPDK]: "{DIR[STATIC]}/dpdk" DIR[STATIC,ARCH]: "{DIR[STATIC]}/archive" # Detailed test results DIR[DTR]: "{DIR[WORKING,SRC]}/detailed_test_results" DIR[DTR,PERF,DPDK]: "{DIR[DTR]}/dpdk_performance_results" DIR[DTR,PERF,VPP]: "{DIR[DTR]}/vpp_performance_results" DIR[DTR,PERF,HC]: "{DIR[DTR]}/honeycomb_performance_results" DIR[DTR,FUNC,VPP]: "{DIR[DTR]}/vpp_functional_results" DIR[DTR,FUNC,HC]: "{DIR[DTR]}/honeycomb_functional_results" DIR[DTR,FUNC,NSHSFC]: "{DIR[DTR]}/nshsfc_functional_results" DIR[DTR,PERF,VPP,IMPRV]: "{DIR[WORKING,SRC]}/vpp_performance_tests/performance_improvements" # Detailed test configurations DIR[DTC]: "{DIR[WORKING,SRC]}/test_configuration" DIR[DTC,PERF,VPP]: "{DIR[DTC]}/vpp_performance_configuration" DIR[DTC,FUNC,VPP]: "{DIR[DTC]}/vpp_functional_configuration" # Detailed tests operational data DIR[DTO]: "{DIR[WORKING,SRC]}/test_operational_data" DIR[DTO,PERF,VPP]: "{DIR[DTO]}/vpp_performance_operational_data" # .css patch file to fix tables generated by Sphinx DIR[CSS_PATCH_FILE]: "{DIR[STATIC]}/theme_overrides.css" DIR[CSS_PATCH_FILE2]: "{DIR[WORKING,SRC,STATIC]}/theme_overrides.css" urls: URL[JENKINS,CSIT]: "https://jenkins.fd.io/view/csit/job" URL[JENKINS,HC]: "https://jenkins.fd.io/view/hc2vpp/job" make-dirs: # List the directories which are created while preparing the environment. # All directories MUST be defined in "paths" section. 
- "DIR[WORKING,DATA]" - "DIR[STATIC,VPP]" - "DIR[STATIC,DPDK]" - "DIR[STATIC,ARCH]" - "DIR[BUILD,LATEX]" - "DIR[WORKING,SRC]" - "DIR[WORKING,SRC,STATIC]" remove-dirs: # List the directories which are deleted while cleaning the environment. # All directories MUST be defined in "paths" section. #- "DIR[BUILD,HTML]" build-dirs: # List the directories where the results (build) is stored. # All directories MUST be defined in "paths" section. - "DIR[BUILD,HTML]" - "DIR[BUILD,LATEX]" It is possible to use defined items in the definition of other items, e.g.: :: DIR[WORKING,DATA]: "{DIR[WORKING]}/data" will be automatically changed to :: DIR[WORKING,DATA]: "_tmp/data" Section: Configuration '''''''''''''''''''''' This section specifies the groups of parameters which are repeatedly used in the elements defined later in the specification file. It has the following parts: - data sets - Specification of data sets used later in element's specifications to define the input data. - plot layouts - Specification of plot layouts used later in plots' specifications to define the plot layout. 
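How an element could pick up a named data set and a named plot layout from this section can be sketched as follows. This is only an illustration: the function ``resolve_element`` is hypothetical and is not part of the PAL API described in this document, and the layout is shortened to two keys.

```python
import copy

# A shortened "configuration" section, as it might look after loading the YAML.
CONFIGURATION = {
    "data-sets": {
        "vpp-perf-results": {"csit-vpp-perf-1710-all": [20, 23]},
    },
    "plot-layouts": {
        "plot-throughput": {"width": 700, "height": 1000},
    },
}


def resolve_element(element, configuration):
    """Return a copy of the element with named references expanded.

    Hypothetical helper: a string in "data" is looked up in "data-sets",
    a string in layout["layout"] is looked up in "plot-layouts".
    """
    resolved = copy.deepcopy(element)
    data = resolved.get("data")
    if isinstance(data, str):
        resolved["data"] = copy.deepcopy(configuration["data-sets"][data])
    layout = resolved.get("layout", {})
    layout_name = layout.get("layout")
    if isinstance(layout_name, str):
        layout["layout"] = copy.deepcopy(
            configuration["plot-layouts"][layout_name])
    return resolved


plot = {
    "type": "plot",
    "data": "vpp-perf-results",
    "layout": {"title": "64B-1t1c", "layout": "plot-throughput"},
}
resolved_plot = resolve_element(plot, CONFIGURATION)
```

Working on a deep copy keeps the element model from the specification unchanged, so the same named data set can be reused by several elements.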
The structure of the section "Configuration" is as follows (example): :: - type: "configuration" data-sets: plot-vpp-throughput-latency: csit-vpp-perf-1710-all: - 11 - 12 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 20 vpp-perf-results: csit-vpp-perf-1710-all: - 20 - 23 plot-layouts: plot-throughput: xaxis: autorange: True autotick: False fixedrange: False gridcolor: "rgb(238, 238, 238)" linecolor: "rgb(238, 238, 238)" linewidth: 1 showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" tickmode: "linear" title: "Indexed Test Cases" zeroline: False yaxis: gridcolor: "rgb(238, 238, 238)'" hoverformat: ".4s" linecolor: "rgb(238, 238, 238)" linewidth: 1 range: [] showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" title: "Packets Per Second [pps]" zeroline: False boxmode: "group" boxgroupgap: 0.5 autosize: False margin: t: 50 b: 20 l: 50 r: 20 showlegend: True legend: orientation: "h" width: 700 height: 1000 The definitions from this sections are used in the elements, e.g.: :: - type: "plot" title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" algorithm: "plot_performance_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc" data: "plot-vpp-throughput-latency" filter: "'64B' and ('BASE' or 'SCALE') and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" parameters: - "throughput" - "parent" traces: hoverinfo: "x+y" boxpoints: "outliers" whiskerwidth: 0 layout: title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" layout: "plot-throughput" Section: Debug mode ''''''''''''''''''' This section is optional as it configures the debug mode. It is used if one does not want to download input data files and use local files instead. If the debug mode is configured, the "input" section is ignored. This section has the following parts: - type: "debug" - says that this is the section "debug". 
- general: - input-format - xml or zip. - extract - if "zip" is defined as the input format, this file is extracted from the zip file, otherwise this parameter is ignored. - builds - list of builds from which the data is used. Must include a job name as a key and then a list of builds and their output files. The structure of the section "Debug" is as follows (example): :: - type: "debug" general: input-format: "zip" # zip or xml extract: "robot-plugin/output.xml" # Only for zip builds: # The files must be in the directory DIR[WORKING,DATA] csit-dpdk-perf-1707-all: - build: 10 file: "csit-dpdk-perf-1707-all__10.xml" - build: 9 file: "csit-dpdk-perf-1707-all__9.xml" csit-nsh_sfc-verify-func-1707-ubuntu1604-virl: - build: 2 file: "csit-nsh_sfc-verify-func-1707-ubuntu1604-virl-2.xml" csit-vpp-functional-1707-ubuntu1604-virl: - build: lastSuccessfulBuild file: "csit-vpp-functional-1707-ubuntu1604-virl-lastSuccessfulBuild.xml" hc2vpp-csit-integration-1707-ubuntu1604: - build: lastSuccessfulBuild file: "hc2vpp-csit-integration-1707-ubuntu1604-lastSuccessfulBuild.xml" csit-vpp-perf-1707-all: - build: 16 file: "csit-vpp-perf-1707-all__16__output.xml" - build: 17 file: "csit-vpp-perf-1707-all__17__output.xml" Section: Static ''''''''''''''' This section defines the static content which is stored in git and will be used as a source to generate the report. This section has these parts: - type: "static" - says that this section is the "static". - src-path - path to the static content. - dst-path - destination path where the static content is copied and then processed. :: - type: "static" src-path: "{DIR[RST]}" dst-path: "{DIR[WORKING,SRC]}" Section: Input '''''''''''''' This section defines the data used to generate elements. It is mandatory if the debug mode is not used. This section has the following parts: - type: "input" - says that this section is the "input". - general - parameters common to all builds: - file-name: file to be downloaded. 
- file-format: format of the downloaded file, ".zip" or ".xml" are supported. - download-path: path to be added to url pointing to the file, e.g.: "{job}/{build}/robot/report/*zip*/{filename}"; {job}, {build} and {filename} are replaced by proper values defined in this section. - extract: file to be extracted from downloaded zip file, e.g.: "output.xml"; if xml file is downloaded, this parameter is ignored. - builds - list of jobs (keys) and numbers of builds which output data will be downloaded. The structure of the section "Input" is as follows (example from 17.07 report): :: - type: "input" # Ignored in debug mode general: file-name: "robot-plugin.zip" file-format: ".zip" download-path: "{job}/{build}/robot/report/*zip*/{filename}" extract: "robot-plugin/output.xml" builds: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 - 22 csit-dpdk-perf-1707-all: - 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10 csit-vpp-functional-1707-ubuntu1604-virl: - lastSuccessfulBuild hc2vpp-csit-perf-master-ubuntu1604: - 8 - 9 hc2vpp-csit-integration-1707-ubuntu1604: - lastSuccessfulBuild csit-nsh_sfc-verify-func-1707-ubuntu1604-virl: - 2 Section: Output ''''''''''''''' This section specifies which format(s) will be generated (html, pdf) and which versions will be generated for each format. This section has the following parts: - type: "output" - says that this section is the "output". - format: html or pdf. - version: defined for each format separately. The structure of the section "Output" is as follows (example): :: - type: "output" format: html: - full pdf: - full - minimal TODO: define the names of versions Content of "minimal" version ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ TODO: define the name and content of this version Section: Table '''''''''''''' This section defines a table to be generated. There can be 0 or more "table" sections. This section has the following parts: - type: "table" - says that this section defines a table. - title: Title of the table. 
- algorithm: Algorithm which is used to generate the table. The other
  parameters in this section must provide all information needed by the used
  algorithm.
- template: (optional) a .csv file used as a template while generating the
  table.
- output-file-ext: extension of the output file.
- output-file: file which the table will be written to.
- columns: specification of table columns:

  - title: The title used in the table header.
  - data: Specification of the data, it has two parts - command and arguments:

    - command:

      - template - take the data from template, arguments:

        - number of column in the template.

      - data - take the data from the input data, arguments:

        - jobs and builds which data will be used.

      - operation - performs an operation with the data already in the table,
        arguments:

        - operation to be done, e.g.: mean, stdev, relative_change (compute
          the relative change between two columns) and display number of data
          samples ~= number of test jobs. The operations are implemented in
          utils.py.
          TODO: Move from utils.py to e.g. operations.py
        - numbers of columns which data will be used (optional).

- data: Specify the jobs and builds which data is used to generate the table.
- filter: filter based on tags applied on the input data, if "template" is
  used, filtering is based on the template.
- parameters: Only these parameters will be put to the output data structure.
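The operations named above (mean, stdev, relative_change) can be illustrated with a minimal sketch. The real implementations live in utils.py and are not shown in this document, so the function signature and the choice to express the relative change in percent are assumptions made only for this example.

```python
from statistics import mean, stdev


def relative_change(reference, value):
    """Relative change from reference to value, in percent (assumed unit)."""
    return (value - reference) / reference * 100.0


# Throughput samples of one test taken from several builds (made-up numbers).
samples = [10.0, 12.0, 11.0]
row_mean = mean(samples)
row_stdev = stdev(samples)

# Change between two columns holding the means of two releases.
change = relative_change(8.0, 10.0)
```

With these inputs the mean is 11.0, the sample standard deviation is 1.0 and the relative change is 25.0 percent.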
The structure of the section "Table" is as follows (example of "table_performance_improvements"): :: - type: "table" title: "Performance improvements" algorithm: "table_performance_improvements" template: "{DIR[DTR,PERF,VPP,IMPRV]}/tmpl_performance_improvements.csv" output-file-ext: ".csv" output-file: "{DIR[DTR,PERF,VPP,IMPRV]}/performance_improvements" columns: - title: "VPP Functionality" data: "template 1" - title: "Test Name" data: "template 2" - title: "VPP-16.09 mean [Mpps]" data: "template 3" - title: "VPP-17.01 mean [Mpps]" data: "template 4" - title: "VPP-17.04 mean [Mpps]" data: "template 5" - title: "VPP-17.07 mean [Mpps]" data: "data csit-vpp-perf-1707-all mean" - title: "VPP-17.07 stdev [Mpps]" data: "data csit-vpp-perf-1707-all stdev" - title: "17.04 to 17.07 change [%]" data: "operation relative_change 5 4" data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 filter: "template" parameters: - "throughput" Example of "table_details" which generates "Detailed Test Results - VPP Performance Results": :: - type: "table" title: "Detailed Test Results - VPP Performance Results" algorithm: "table_details" output-file-ext: ".csv" output-file: "{DIR[WORKING]}/vpp_performance_results" columns: - title: "Name" data: "data test_name" - title: "Documentation" data: "data test_documentation" - title: "Status" data: "data test_msg" data: csit-vpp-perf-1707-all: - 17 filter: "all" parameters: - "parent" - "doc" - "msg" Example of "table_details" which generates "Test configuration - VPP Performance Test Configs": :: - type: "table" title: "Test configuration - VPP Performance Test Configs" algorithm: "table_details" output-file-ext: ".csv" output-file: "{DIR[WORKING]}/vpp_test_configuration" columns: - title: "Name" data: "data name" - title: "VPP API Test (VAT) Commands History - Commands Used Per Test Case" data: "data show-run" data: csit-vpp-perf-1707-all: - 17 filter: "all" parameters: - "parent" - "name" - "show-run" Section: Plot 
''''''''''''' This section defines a plot to be generated. There can be 0 or more "plot" sections. This section has these parts: - type: "plot" - says that this section defines a plot. - title: Plot title used in the logs. Title which is displayed is in the section "layout". - output-file-type: format of the output file. - output-file: file which the plot will be written to. - algorithm: Algorithm used to generate the plot. The other parameters in this section must provide all information needed by plot.ly to generate the plot. For example: - traces - layout - These parameters are transparently passed to plot.ly. - data: Specify the jobs and numbers of builds which data is used to generate the plot. - filter: filter applied on the input data. - parameters: Only these parameters will be put to the output data structure. The structure of the section "Plot" is as follows (example of a plot showing throughput in a chart box-with-whiskers): :: - type: "plot" title: "VPP Performance 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" algorithm: "plot_performance_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc" data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 # Keep this formatting, the filter is enclosed with " (quotation mark) and # each tag is enclosed with ' (apostrophe). 
filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" parameters: - "throughput" - "parent" traces: hoverinfo: "x+y" boxpoints: "outliers" whiskerwidth: 0 layout: title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" xaxis: autorange: True autotick: False fixedrange: False gridcolor: "rgb(238, 238, 238)" linecolor: "rgb(238, 238, 238)" linewidth: 1 showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" tickmode: "linear" title: "Indexed Test Cases" zeroline: False yaxis: gridcolor: "rgb(238, 238, 238)'" hoverformat: ".4s" linecolor: "rgb(238, 238, 238)" linewidth: 1 range: [] showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" title: "Packets Per Second [pps]" zeroline: False boxmode: "group" boxgroupgap: 0.5 autosize: False margin: t: 50 b: 20 l: 50 r: 20 showlegend: True legend: orientation: "h" width: 700 height: 1000 The structure of the section "Plot" is as follows (example of a plot showing latency in a box chart): :: - type: "plot" title: "VPP Latency 64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" algorithm: "plot_latency_box" output-file-type: ".html" output-file: "{DIR[STATIC,VPP]}/64B-1t1c-l2-sel1-ndrdisc-lat50" data: csit-vpp-perf-1707-all: - 9 - 10 - 13 - 14 - 15 - 16 - 17 - 18 - 19 - 21 filter: "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'" parameters: - "latency" - "parent" traces: boxmean: False layout: title: "64B-1t1c-(eth|dot1q|dot1ad)-(l2xcbase|l2bdbasemaclrn)-ndrdisc" xaxis: autorange: True autotick: False fixedrange: False gridcolor: "rgb(238, 238, 238)" linecolor: "rgb(238, 238, 238)" linewidth: 1 showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" tickmode: "linear" title: "Indexed Test Cases" zeroline: False yaxis: gridcolor: "rgb(238, 238, 238)'" hoverformat: "" linecolor: "rgb(238, 238, 238)" 
linewidth: 1 range: [] showgrid: True showline: True showticklabels: True tickcolor: "rgb(238, 238, 238)" title: "Latency min/avg/max [uSec]" zeroline: False boxmode: "group" boxgroupgap: 0.5 autosize: False margin: t: 50 b: 20 l: 50 r: 20 showlegend: True legend: orientation: "h" width: 700 height: 1000 Section: file ''''''''''''' This section defines a file to be generated. There can be 0 or more "file" sections. This section has the following parts: - type: "file" - says that this section defines a file. - title: Title of the table. - algorithm: Algorithm which is used to generate the file. The other parameters in this section must provide all information needed by the used algorithm. - output-file-ext: extension of the output file. - output-file: file which the file will be written to. - file-header: The header of the generated .rst file. - dir-tables: The directory with the tables. - data: Specify the jobs and builds which data is used to generate the table. - filter: filter based on tags applied on the input data, if "all" is used, no filtering is done. - parameters: Only these parameters will be put to the output data structure. - chapters: the hierarchy of chapters in the generated file. - start-level: the level of the the top-level chapter. The structure of the section "file" is as follows (example): :: - type: "file" title: "VPP Performance Results" algorithm: "file_test_results" output-file-ext: ".rst" output-file: "{DIR[DTR,PERF,VPP]}/vpp_performance_results" file-header: "\n.. |br| raw:: html\n\n
\n\n\n.. |prein| raw:: html\n\n
\n\n\n.. |preout| raw:: html\n\n    
    \n\n"
      dir-tables: "{DIR[DTR,PERF,VPP]}"
      data:
        csit-vpp-perf-1707-all:
        - 22
      filter: "all"
      parameters:
      - "name"
      - "doc"
      - "level"
      data-start-level: 2  # 0, 1, 2, ...
      chapters-start-level: 2  # 0, 1, 2, ...

Static content
``````````````

- Manually created / edited files.
- .rst files, static .csv files, static pictures (.svg), ...
- Stored in CSIT git repository.

No more details about the static content in this document.

Data to process
```````````````

The PAL processes test results and other information produced by Jenkins jobs.
The data are now stored as robot results in Jenkins (TODO: store the data in
nexus) as .zip and / or .xml files.

Data processing
---------------

As the first step, the data are downloaded and stored locally (typically on a
Jenkins slave). If .zip files are used, the given .xml files are extracted for
further processing.

Parsing of the .xml files is performed by a class derived from
"robot.api.ResultVisitor"; only the necessary methods are overridden. Only the
necessary data is extracted from the .xml file and stored in a structured form.

The parsed data are stored as the multi-indexed pandas.Series data type.
"job name", "build", "metadata", "suites" and "tests" are indexes to access the
data. For example:

::

    data =

    job 1 name:
      build 1:
        metadata: metadata
        suites: suites
        tests: tests
      ...
      build N:
        metadata: metadata
        suites: suites
        tests: tests
    ...
    job M name:
      build 1:
        metadata: metadata
        suites: suites
        tests: tests
      ...
      build N:
        metadata: metadata
        suites: suites
        tests: tests

Using indexes data["job 1 name"]["build 1"]["tests"] (e.g.:
data["csit-vpp-perf-1704-all"]["17"]["tests"]) we get a list of all tests with
all test data.

Data will not be accessible directly using indexes, but using getters and
filters.
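Access through getters rather than raw indexes could look roughly like the sketch below. It is not the PAL implementation: the class name ``InputDataSketch`` and its method signatures are hypothetical, and plain nested dictionaries stand in for the multi-indexed pandas.Series so that the example has no external dependencies.

```python
class InputDataSketch:
    """Hypothetical read-only wrapper around the parsed input data."""

    def __init__(self, data):
        # data[job][build] holds the keys "metadata", "suites" and "tests".
        self._data = data

    def _part(self, job, build, part):
        # Fail with a readable message instead of a bare KeyError.
        try:
            return self._data[job][build][part]
        except KeyError as err:
            raise KeyError(
                f"no {part} for job {job!r}, build {build!r}") from err

    def metadata(self, job, build):
        return self._part(job, build, "metadata")

    def suites(self, job, build):
        return self._part(job, build, "suites")

    def tests(self, job, build):
        return self._part(job, build, "tests")


raw = {
    "csit-vpp-perf-1704-all": {
        "17": {
            "metadata": {"version": "VPP version"},
            "suites": {},
            "tests": {"test-id-1": {"name": "Test name", "tags": ["64B"]}},
        },
    },
}
input_data = InputDataSketch(raw)
tests = input_data.tests("csit-vpp-perf-1704-all", "17")
```

Keeping the storage behind getters means the underlying container can change without touching the code that generates the elements.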
**Structure of metadata:**

::

    "metadata": {
        "version": "VPP version",
        "job": "Jenkins job name",
        "build": "Information about the build"
    },

**Structure of suites:**

::

    "suites": {
        "Suite name 1": {
            "doc": "Suite 1 documentation",
            "parent": "Suite 1 parent"
        },
        "Suite name N": {
            "doc": "Suite N documentation",
            "parent": "Suite N parent"
        }
    }

**Structure of tests:**

Performance tests:

::

    "tests": {
        "ID": {
            "name": "Test name",
            "parent": "Name of the parent of the test",
            "doc": "Test documentation",
            "msg": "Test message",
            "tags": ["tag 1", "tag 2", "tag n"],
            "type": "PDR" | "NDR",
            "throughput": {
                "value": int,
                "unit": "pps" | "bps" | "percentage"
            },
            "latency": {
                "direction1": {
                    "100": {
                        "min": int,
                        "avg": int,
                        "max": int
                    },
                    "50": {  # Only for NDR
                        "min": int,
                        "avg": int,
                        "max": int
                    },
                    "10": {  # Only for NDR
                        "min": int,
                        "avg": int,
                        "max": int
                    }
                },
                "direction2": {
                    "100": {
                        "min": int,
                        "avg": int,
                        "max": int
                    },
                    "50": {  # Only for NDR
                        "min": int,
                        "avg": int,
                        "max": int
                    },
                    "10": {  # Only for NDR
                        "min": int,
                        "avg": int,
                        "max": int
                    }
                }
            },
            "lossTolerance": "lossTolerance",  # Only for PDR
            "vat-history": "DUT1 and DUT2 VAT History",
            "show-run": "Show Run"
        },
        "ID": {
            # next test
        }
    }

Functional tests:

::

    "tests": {
        "ID": {
            "name": "Test name",
            "parent": "Name of the parent of the test",
            "doc": "Test documentation",
            "msg": "Test message",
            "tags": ["tag 1", "tag 2", "tag n"],
            "vat-history": "DUT1 and DUT2 VAT History",
            "show-run": "Show Run",
            "status": "PASS" | "FAIL"
        },
        "ID": {
            # next test
        }
    }

Note: ID is the lowercase full path to the test.

Data filtering
``````````````

The first step when generating an element is getting the data needed to
construct the element. The data are filtered from the processed input data.

The data filtering is based on:

- job name(s).
- build number(s).
- tag(s).
- required data - only this data is included in the output.

WARNING: The filtering is based on tags, so be careful with tagging.
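Tag-based matching of this kind could be sketched as follows. The helper ``tags_match`` is hypothetical and only illustrates the idea: every tag enclosed in apostrophes is turned into a membership test against the tags of one test, and the resulting boolean expression is evaluated. The sketch uses ``eval``, so it assumes the filter string comes from a trusted specification file.

```python
import re


def tags_match(filter_expr, tags):
    """Return True if the test tags satisfy the filter expression."""
    if filter_expr == "all":
        # "all" means that no filtering is done.
        return True
    # 'TAG' becomes ('TAG' in tags); and / or / not are kept as they are.
    condition = re.sub(r"'([^']+)'", r"('\1' in tags)", filter_expr)
    # No builtins are exposed; only the name "tags" is visible.
    return bool(eval(condition, {"__builtins__": {}}, {"tags": set(tags)}))


FILTER = ("'64B' and 'BASE' and 'NDRDISC' and '1T1C' and "
          "('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'")

selected = tags_match(FILTER, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD"])
rejected = tags_match(
    FILTER, ["64B", "BASE", "NDRDISC", "1T1C", "L2XCFWD", "VHOST"])
```

A test with a missing or misspelled tag silently drops out of the result, which is why consistent tagging matters.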
For example, the element whose specification includes:

::

    data:
      csit-vpp-perf-1707-all:
      - 9
      - 10
      - 13
      - 14
      - 15
      - 16
      - 17
      - 18
      - 19
      - 21
    filter:
      - "'64B' and 'BASE' and 'NDRDISC' and '1T1C' and ('L2BDMACSTAT' or 'L2BDMACLRN' or 'L2XCFWD') and not 'VHOST'"

will be constructed using data from the job "csit-vpp-perf-1707-all", for all
listed builds and the tests with the list of tags matching the filter
conditions.

The output data structure for filtered test data is:

::

    - job 1
      - build 1
        - test 1
          - parameter 1
          - parameter 2
          ...
          - parameter n
        ...
        - test n
        ...
      ...
      - build n
    ...
    - job n

Data analytics
``````````````

Data analytics part implements:

- methods to compute statistical data from the filtered input data.
- trending.
- etc.

Throughput Speedup Analysis - Multi-Core Speedup Ratio
''''''''''''''''''''''''''''''''''''''''''''''''''''''

Throughput Speedup Analysis (TSA) calculates a speedup factor for 1, 2, 4 cores
which is defined as:

::

                        throughput
    speedup factor = -----------------
                     1-core-throughput

A bar plot displays the speedup factor (normalized throughput for 64B/78B on
1 core). The plot displays number of cores on the X-axis and the speedup factor
on the Y-axis.

For better comparison, more than one set of data can be displayed in a plot.
So, in general:

- graph type: grouped bars;
- graph X-axis: (testcase index, number of cores);
- graph Y-axis: speedup factor.

The data displayed is a subset of existing performance tests with 1core, 2core,
4core.

:TODO: Specify the data model for TSA.

Advanced data analytics
```````````````````````

As the next steps, advanced data analytics (ADA) will be implemented using
machine learning (ML) and artificial intelligence (AI).

:TODO:

    - describe the concept of ADA.
    - add specification.

Data presentation
-----------------

This layer generates the plots and tables according to the report models in the
specification file. The elements are generated using algorithms and data
specified in their models.
Tables
``````

- Tables are generated by algorithms implemented in PAL; the model includes the
  algorithm and all necessary information.
- Output format: csv.
- Generated tables are stored in specified directories and linked to .rst
  files.

Plots
`````

- plot.ly is currently used to generate plots; the model includes the type of
  plot and all the necessary information to render it.
- Output format: html.
- Generated plots are stored in specified directories and linked to .rst files.

Report generation
-----------------

The report is generated using Sphinx and the Read_the_Docs template. PAL
generates html and pdf formats. It is possible to define the content of the
report by specifying the version (TODO: define the names and content of
versions).

The process
```````````

1. Read the specification.
2. Read the input data.
3. Process the input data.
4. For each element (plot, table, file) defined in the specification:

   a. Get the data needed to construct the element using a filter.
   b. Generate the element.
   c. Store the element.

5. Generate the report.
6. Store the report (Nexus).

The process is model driven. The elements’ models (tables, plots, files and
report itself) are defined in the specification file. The script reads the
elements’ models from the specification file and generates the elements.

It is easy to add elements to be generated: if a new kind of element is
required, only a new algorithm needs to be implemented and integrated.
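Step 4 of the process can be sketched as a loop over the element models. The sketch rests on several assumptions: all function names and the ``GENERATORS`` registry are hypothetical, the filter is reduced to a single required tag, and "storing" an element means putting it into a dictionary keyed by the output file name. Steps 1 to 3, 5 and 6 are left out.

```python
def filter_tests(tests, required_tag):
    """Step 4a: keep only the tests carrying the required tag."""
    return {test_id: test for test_id, test in tests.items()
            if required_tag in test["tags"]}


def generate_table(model, tests):
    """Step 4b for a table: one CSV row per test, columns from the model."""
    rows = [",".join(str(test[name]) for name in model["parameters"])
            for test in tests.values()]
    return "\n".join(rows)


# Hypothetical registry mapping the element type to its generator.
GENERATORS = {"table": generate_table}


def generate_elements(specification, tests):
    """Run steps 4a to 4c for every element model in the specification."""
    stored = {}
    for model in specification:
        data = filter_tests(tests, model["filter"])
        element = GENERATORS[model["type"]](model, data)
        stored[model["output-file"]] = element  # step 4c
    return stored


tests = {
    "suite.test-1": {"name": "test-1", "tags": ["NDRDISC"], "throughput": 100},
    "suite.test-2": {"name": "test-2", "tags": ["PDRDISC"], "throughput": 90},
}
specification = [{
    "type": "table",
    "filter": "NDRDISC",
    "parameters": ["name", "throughput"],
    "output-file": "results.csv",
}]
stored = generate_elements(specification, tests)
```

Because the loop only reads the models, supporting another kind of element amounts to registering one more generator.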
API --- List of modules, classes, methods and functions ``````````````````````````````````````````````` :: specification_parser.py class Specification Methods: read_specification set_input_state set_input_file_name Getters: specification environment debug is_debug input builds output tables plots files static input_data_parser.py class InputData Methods: read_data filter_data Getters: data metadata suites tests environment.py Functions: clean_environment class Environment Methods: set_environment Getters: environment input_data_files.py Functions: download_data_files unzip_files generator_tables.py Functions: generate_tables Functions implementing algorithms to generate particular types of tables (called by the function "generate_tables"): table_details table_performance_improvements generator_plots.py Functions: generate_plots Functions implementing algorithms to generate particular types of plots (called by the function "generate_plots"): plot_performance_box plot_latency_box generator_files.py Functions: generate_files Functions implementing algorithms to generate particular types of files (called by the function "generate_files"): file_test_results report.py Functions: generate_report Functions implementing algorithms to generate particular types of report (called by the function "generate_report"): generate_html_report generate_pdf_report Other functions called by the function "generate_report": archive_input_data archive_report PAL functional diagram `````````````````````` .. only:: latex .. raw:: latex \begin{figure}[H] \centering \includesvg[width=0.90\textwidth]{../_tmp/src/csit_framework_documentation/pal_func_diagram} \label{fig:pal_func_diagram} \end{figure} .. only:: html .. figure:: pal_func_diagram.svg :alt: PAL functional diagram :align: center How to add an element ````````````````````` Element can be added by adding its model to the specification file. If the element will be generated by an existing algorithm, only its parameters must be set. 
If a brand new type of element is added, its algorithm must also be implemented. The algorithms are implemented in the files whose names start with "generator". The name of the function implementing the algorithm and the name of the algorithm in the specification file must be the same.
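The naming rule can be illustrated with a minimal sketch of a generator module. The lookup through ``globals()``, the function signatures and the algorithm ``table_test_count`` are assumptions made for this example; they are not taken from the PAL sources.

```python
def table_details(model, data):
    """Stand-in for an existing algorithm."""
    return f"details of {len(data)} tests"


def table_test_count(model, data):
    """Hypothetical new algorithm; the model refers to it by this name."""
    return f"{model['title']}: {len(data)}"


def generate_tables(models, data):
    """Call the function whose name equals the "algorithm" of each model."""
    results = []
    for model in models:
        algorithm = globals().get(model["algorithm"])
        if not callable(algorithm):
            raise NotImplementedError(
                f"unknown algorithm {model['algorithm']!r}")
        results.append(algorithm(model, data))
    return results


models = [{
    "type": "table",
    "title": "Number of tests",
    "algorithm": "table_test_count",
}]
results = generate_tables(models, data={"t1": {}, "t2": {}})
```

If the two names differ, the lookup fails, so a mismatch shows up as soon as the element is generated.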