author     John DeNisco <jdenisco@cisco.com>    2019-11-06 10:58:28 -0800
committer  Dave Barach <dave@barachs.net>       2019-11-06 16:15:49 -0500
commit     c96d618a5dd96e3a40d59860d2cdb9d5c6b71d11 (patch)
tree       74a19b9b8364bf56dceced8ec982c6fbb7ddb8e4 /docs/whatisvpp
parent     340c15c6ed34ce60c821b5260fec3eb11d65dcb7 (diff)
docs: Rewrite the what is VPP (first) section, also fix the build
Signed-off-by: John DeNisco <jdenisco@cisco.com>
Change-Id: Ifb558171f8976a721703e74afea997d006273b5f
Signed-off-by: Dave Barach <dave@barachs.net>
Diffstat (limited to 'docs/whatisvpp')
-rw-r--r--   docs/whatisvpp/developer.rst                              29
-rw-r--r--   docs/whatisvpp/extensible.rst                             43
-rw-r--r--   docs/whatisvpp/hoststack.rst                              26
-rw-r--r--   docs/whatisvpp/index.rst                                  37
-rw-r--r--   docs/whatisvpp/networkstack.rst                           39
-rw-r--r--   docs/whatisvpp/performance.rst                            70
-rw-r--r--   docs/whatisvpp/scalar-vs-vector-packet-processing.rst     69
-rw-r--r--   docs/whatisvpp/supported.rst                              31
8 files changed, 344 insertions, 0 deletions
diff --git a/docs/whatisvpp/developer.rst b/docs/whatisvpp/developer.rst
new file mode 100644
index 00000000000..5151e65ff74
--- /dev/null
+++ b/docs/whatisvpp/developer.rst
@@ -0,0 +1,29 @@
+.. _developer-friendly:
+
+=======================
+Features for Developers
+=======================
+
+This section briefly describes the VPP environment and some of the features
+that can be used by developers.
+
+* Extensive runtime counters: throughput, `instructions per cycle <https://en.wikipedia.org/wiki/Instructions_per_cycle>`_, errors, events, etc.
+* Integrated pipeline tracing facilities
+* Multi-language API bindings
+* Integrated command line for debugging
+* Fault-tolerant and upgradable
+
+  * Runs as a standard user-space process for fault tolerance; software crashes seldom require more than a process restart.
+  * Improved fault tolerance and upgradability compared to running similar packet processing in the kernel; software updates never require system reboots.
+  * Development experience is easier compared to similar kernel code
+  * Hardware isolation and protection (`iommu <https://en.wikipedia.org/wiki/Input%E2%80%93output_memory_management_unit>`_)
+
+* Built for security
+
+  * Extensive white-box testing
+  * Image segment base address randomization
+  * Shared-memory segment base address randomization
+  * Stack bounds checking
+  * Static analysis with `Coverity <https://en.wikipedia.org/wiki/Coverity>`_
+
+For the supported architectures click next.
diff --git a/docs/whatisvpp/extensible.rst b/docs/whatisvpp/extensible.rst
new file mode 100644
index 00000000000..1df3b9fbd2f
--- /dev/null
+++ b/docs/whatisvpp/extensible.rst
@@ -0,0 +1,43 @@
+.. _extensible:
+
+===========================
+The Packet Processing Graph
+===========================
+
+At the core of the FD.io VPP design is the **Packet Processing Graph**.
+
+This makes the software:
+
+* Pluggable: easy to understand and extend
+* Mature: a proven graph node architecture
+* Flexible: full control to reorganize the pipeline
+* Fast: plugins are equal citizens
+
+The FD.io VPP packet processing pipeline is decomposed into a 'packet processing
+graph'. This modular approach means that anyone can 'plug in' new graph
+nodes. This makes VPP easily extensible and means that plugins can be
+customized for specific purposes. VPP is also configurable through its
+Low-Level API.
+
+.. figure:: /_images/VPP_custom_application_packet_processing_graph.280.jpg
+   :alt: Extensible, modular graph node architecture
+
+   Extensible and modular graph node architecture.
+
+At runtime, the FD.io VPP platform assembles a vector of packets from RX rings,
+typically up to 256 packets in a single vector. The packet processing graph is
+then applied, node by node (including plugins), to the entire packet vector. The
+received packets traverse the packet processing graph nodes in the
+vector, as the network processing represented by each graph node is applied to
+each packet in turn. Graph nodes are small, modular, and loosely
+coupled. This makes it easy to introduce new graph nodes and rewire existing
+graph nodes.
+
+Plugins are `shared libraries <https://en.wikipedia.org/wiki/Library_(computing)>`_
+and are loaded at runtime by VPP. VPP finds plugins by searching the plugin path
+for libraries, and then dynamically loads each one in turn on startup.
+A plugin can introduce new graph nodes or rearrange the packet processing graph.
+
+You can build a plugin completely independently of the FD.io VPP source tree,
+which means you can treat it as an independent component.
+
+For more on the network stack press next.
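Editor's note: the "search the plugin path, then load each shared library" mechanism described above can be sketched in a few lines of C. The snippet below is only an illustration of that general pattern, not VPP's actual plugin loader; the directory path, the ``plugin_init`` symbol name, and the ``load_plugins`` helper are assumptions made for this example (build with ``-ldl``).

.. code-block:: c

   #include <dirent.h>
   #include <dlfcn.h>
   #include <stdio.h>
   #include <string.h>

   /* Illustrative only: VPP's real loader lives in vlib and uses its own
    * registration macros.  The path and symbol below are placeholders. */
   #define PLUGIN_PATH "/usr/lib/example_plugins"

   typedef int (*plugin_init_fn) (void);

   static void load_plugins(const char *dir_path)
   {
       DIR *dir = opendir(dir_path);
       if (dir == NULL)
           return;

       struct dirent *entry;
       while ((entry = readdir(dir)) != NULL) {
           /* Only consider shared libraries. */
           if (strstr(entry->d_name, ".so") == NULL)
               continue;

           char path[1024];
           snprintf(path, sizeof(path), "%s/%s", dir_path, entry->d_name);

           void *handle = dlopen(path, RTLD_NOW);
           if (handle == NULL) {
               fprintf(stderr, "skipping %s: %s\n", path, dlerror());
               continue;
           }

           /* A plugin exports a well-known init symbol (hypothetical name)
            * that can register new graph nodes or rewire existing ones. */
           plugin_init_fn init = (plugin_init_fn) dlsym(handle, "plugin_init");
           if (init != NULL)
               init();
       }
       closedir(dir);
   }

   int main(void)
   {
       load_plugins(PLUGIN_PATH);
       return 0;
   }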
diff --git a/docs/whatisvpp/hoststack.rst b/docs/whatisvpp/hoststack.rst
new file mode 100644
index 00000000000..77e259a6731
--- /dev/null
+++ b/docs/whatisvpp/hoststack.rst
@@ -0,0 +1,26 @@
+.. _hoststack:
+
+==============
+TCP Host Stack
+==============
+
+VPP's host stack leverages VPP's graph-based forwarding model and vectorized packet
+processing to ensure high-throughput, scalable transport protocol termination. It
+exposes APIs that, apart from allowing efficient user-space consumption and
+generation of data by applications, also enable highly efficient local inter-application communication.
+
+At a high level, VPP's host stack consists of three major components:
+
+* A session layer that facilitates interaction between transport protocols and applications
+* Pluggable transport protocols, including TCP, QUIC, TLS and UDP
+* VCL (the VPP Communications Library), a set of libraries meant to ease consumption of the stack from the application's perspective
+
+All of these components were custom built to fit within VPP's architecture and to
+leverage its speed. As a result, a significant amount of effort was invested in:
+
+* building a transport-pluggable session layer that abstracts the interaction between applications and transports using a custom-built shared memory infrastructure. Notably, this also allows transport protocols that are typically implemented in applications, like QUIC and TLS, to be implemented within VPP.
+* a clean-slate TCP implementation that supports vectorized packet processing and follows VPP's highly scalable threading model. The implementation is RFC compliant, supports a large set of high-speed TCP protocol features, and was validated using the Defensics Codenomicon 1M+ test suite.
+* VCL, a library that emulates traditional asynchronous communication functions in user space, while allowing new patterns to be developed if needed.
+* a high-performance "cut-through" communication mode that enables applications attached to VPP to transparently exchange data over shared memory without incurring the extra cost of a traditional transport protocol. Testing has shown this to be much more efficient than traditional inter-container networking.
+
+For developer features press next.
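Editor's note: from the application's side, consuming the host stack can look like ordinary socket programming. The sketch below is a plain POSIX TCP client with nothing VPP-specific in it; the point is that an unmodified program of this shape can be attached to the VPP host stack, for example via an LD_PRELOAD shim or by porting it to VCL. The shim library name and the port/address used here are assumptions for the example, not taken from this commit.

.. code-block:: c

   /* Hypothetical invocation through a VCL preload shim (name assumed):
    *   LD_PRELOAD=libvcl_ldpreload.so ./client
    * Without the shim, the same program simply uses the kernel stack. */
   #include <arpa/inet.h>
   #include <netinet/in.h>
   #include <stdio.h>
   #include <sys/socket.h>
   #include <unistd.h>

   int main(void)
   {
       int fd = socket(AF_INET, SOCK_STREAM, 0);
       if (fd < 0) { perror("socket"); return 1; }

       struct sockaddr_in server = { .sin_family = AF_INET,
                                     .sin_port   = htons(9000) };
       inet_pton(AF_INET, "127.0.0.1", &server.sin_addr);

       if (connect(fd, (struct sockaddr *) &server, sizeof(server)) < 0) {
           perror("connect");
           close(fd);
           return 1;
       }

       const char msg[] = "hello from an ordinary sockets application";
       write(fd, msg, sizeof(msg) - 1);
       close(fd);
       return 0;
   }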
diff --git a/docs/whatisvpp/index.rst b/docs/whatisvpp/index.rst
new file mode 100644
index 00000000000..464119ccad5
--- /dev/null
+++ b/docs/whatisvpp/index.rst
@@ -0,0 +1,37 @@
+.. _whatisvpp:
+
+=================================
+The Vector Packet Processor (VPP)
+=================================
+
+This section describes some of the core concepts and features of FD.io VPP.
+
+To start with, FD.io VPP uses a technique called Vector Packet Processing.
+This gives FD.io VPP a significant performance improvement over packet
+processing applications that use scalar processing.
+
+Also, at the heart of FD.io VPP's modular design is a 'Packet Processing Graph'.
+This makes FD.io VPP scalable and easily extensible.
+
+The FD.io software also includes a feature-rich network stack. This includes
+a TCP host stack that utilizes VPP's graph-based forwarding model and vectorized
+packet processing.
+
+FD.io VPP is tested nightly for functionality and performance by the
+CSIT project.
+
+For more information on any of these features, click on the links below or
+press next.
+
+.. toctree::
+   :maxdepth: 1
+
+   scalar-vs-vector-packet-processing.rst
+   extensible.rst
+   networkstack.rst
+   hoststack.rst
+   developer.rst
+   supported.rst
+   performance.rst
+
+Press next for more about scalar versus vector packet processing.
diff --git a/docs/whatisvpp/networkstack.rst b/docs/whatisvpp/networkstack.rst
new file mode 100644
index 00000000000..20c470828b1
--- /dev/null
+++ b/docs/whatisvpp/networkstack.rst
@@ -0,0 +1,39 @@
+.. _network-stack:
+
+=============
+Network Stack
+=============
+
+This section briefly describes the FD.io network stack and some of its benefits:
+
+* Layer 2 - 4 network stack
+
+  * Fast lookup tables for routes and bridge entries
+  * Arbitrary n-tuple classifiers
+  * Control plane, traffic management and overlays
+
+* `Linux <https://en.wikipedia.org/wiki/Linux>`_ and `FreeBSD <https://en.wikipedia.org/wiki/FreeBSD>`_ support
+
+  * Support for standard operating system interfaces such as AF_Packet, Tun/Tap and Netmap
+
+* Network and cryptographic hardware support with `DPDK <https://www.dpdk.org/>`_
+* Container and virtualization support
+
+  * Para-virtualized interfaces: vhost and virtio
+  * Network adapters over PCI passthrough
+  * Native container interfaces: memif
+
+* Host stack
+* Universal data plane: one code base for many use cases
+
+  * Discrete appliances, such as `routers <https://en.wikipedia.org/wiki/Router_(computing)>`_ and `switches <https://en.wikipedia.org/wiki/Network_switch>`_
+  * `Cloud Infrastructure and Virtual Network Functions <https://en.wikipedia.org/wiki/Network_function_virtualization>`_
+  * `Cloud Native Infrastructure <https://www.cncf.io/>`_
+  * The same binary package for all use cases
+
+* Out-of-the-box production quality, with thanks to `CSIT <https://wiki.fd.io/view/CSIT#Start_Here>`_
+
+For the complete list of features, please see :ref:`featuresbyrelease`.
+
+For more on the TCP Host Stack press next.
diff --git a/docs/whatisvpp/performance.rst b/docs/whatisvpp/performance.rst
new file mode 100644
index 00000000000..9b0fb21bb71
--- /dev/null
+++ b/docs/whatisvpp/performance.rst
@@ -0,0 +1,70 @@
+.. _performance:
+
+Performance
+===========
+
+One of the benefits of FD.io VPP is its high performance on relatively low-power computing hardware.
+This includes the following:
+
+* A high-performance user-space network stack designed for commodity hardware:
+
+  - L2, L3 and L4 features and encapsulations.
+
+* Optimized packet interfaces supporting a multitude of use cases:
+
+  - An integrated vhost-user backend for high-speed VM-to-VM connectivity
+  - An integrated memif container backend for high-speed container-to-container connectivity
+  - An integrated vhost-based interface to punt packets to the Linux kernel
+
+* The same optimized code paths execute on the host, inside VMs, and inside Linux containers
+* Leverages best-of-breed open source driver technology: `DPDK <https://www.dpdk.org/>`_
+* Tested at scale: linear core scaling, tested with millions of flows and MAC addresses
+
+These features have been designed to take full advantage of common microprocessor optimization techniques, such as:
+
+* Reducing cache and TLB misses by processing packets in vectors
+* Realizing `IPC <https://en.wikipedia.org/wiki/Instructions_per_cycle>`_ gains with vector instructions such as SSE, AVX and NEON
+* Eliminating mode switching, context switches and blocking, to always be doing useful work
+* Cache-line-aligned buffers for cache and memory efficiency
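Editor's note: two of the techniques listed above, cache-line-aligned buffers and prefetching while working through a vector, can be shown in a short generic C sketch. This is not VPP code: the 64-byte cache-line size, the buffer layout and the loop are assumptions for the example, and ``__builtin_prefetch`` is a GCC/Clang builtin.

.. code-block:: c

   #include <stdint.h>
   #include <stdio.h>
   #include <stdlib.h>
   #include <string.h>

   #define CACHE_LINE 64          /* assumed cache-line size */
   #define VEC_SIZE   256         /* vector size used in the text */

   typedef struct {
       uint8_t hdr[CACHE_LINE];   /* metadata kept in its own cache line */
       uint8_t data[2048];
   } buffer_t;

   int main(void)
   {
       /* Cache-line-aligned allocation: touching one buffer's metadata
        * loads exactly one cache line, never a straddled pair. */
       buffer_t *buffers = aligned_alloc(CACHE_LINE, VEC_SIZE * sizeof(buffer_t));
       if (buffers == NULL)
           return 1;
       memset(buffers, 0, VEC_SIZE * sizeof(buffer_t));

       /* Walk the vector while prefetching a couple of buffers ahead,
        * hiding memory latency behind the work on the current packet. */
       uint64_t sum = 0;
       for (int i = 0; i < VEC_SIZE; i++) {
           if (i + 2 < VEC_SIZE)
               __builtin_prefetch(&buffers[i + 2], 0 /* read */, 3);
           sum += buffers[i].hdr[0];   /* stand-in for real per-packet work */
       }

       printf("%llu\n", (unsigned long long) sum);
       free(buffers);
       return 0;
   }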
+
+Continuous System Integration and Testing (CSIT)
+------------------------------------------------
+
+The Continuous System Integration and Testing (CSIT) project provides functional and performance
+testing for FD.io VPP. This testing is focused on functional and performance regressions. The results
+are posted to the `CSIT Test Report <https://docs.fd.io/csit/master/report/>`_.
+
+For more about CSIT, check out the following links:
+
+* `CSIT Code Documentation <https://docs.fd.io/csit/master/doc/overview.html>`_
+* `CSIT Test Overview <https://docs.fd.io/csit/master/report/introduction/overview.html>`_
+* `VPP Performance Dashboard <https://docs.fd.io/csit/master/trending/introduction/index.html>`_
+
+
+CSIT Packet Throughput Examples
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Following are pointers to a few of the CSIT test reports. The test titles read like this:
+
+<packet size>-<number of threads><number of cores>-<test>-<interface type>
+
+For example, the test titled 64b-2t1c-l2switching-base-i40e does L2 switching of
+64-byte packets, using 2 threads on 1 core and an i40e interface.
+
+Here are a few examples:
+
+* `L2 Ethernet switching <https://docs.fd.io/csit/master/report/vpp_performance_tests/packet_throughput_graphs/l2.html>`_
+* `IPv4 Routing <https://docs.fd.io/csit/master/report/vpp_performance_tests/packet_throughput_graphs/ip4.html>`_
+* `IPv6 Routing <https://docs.fd.io/csit/master/report/vpp_performance_tests/packet_throughput_graphs/ip6.html>`_
+
+
+Trending Throughput Graphs
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+These are some of the trending packet throughput graphs from the CSIT `trending dashboard <https://docs.fd.io/csit/master/trending/introduction/index.html>`_. **Please note** that performance in the trending graphs will change on a nightly basis, in line with the software development cycle:
+
+* `L2 Ethernet Switching Trending <https://docs.fd.io/csit/master/trending/trending/l2.html>`_
+* `IPv4 Routing Trending <https://docs.fd.io/csit/master/trending/trending/ip4.html>`_
+* `IPv6 Routing Trending <https://docs.fd.io/csit/master/trending/trending/ip6.html>`_
diff --git a/docs/whatisvpp/scalar-vs-vector-packet-processing.rst b/docs/whatisvpp/scalar-vs-vector-packet-processing.rst
new file mode 100644
index 00000000000..ffa54a3f306
--- /dev/null
+++ b/docs/whatisvpp/scalar-vs-vector-packet-processing.rst
@@ -0,0 +1,69 @@
+.. _scalar_vector:
+
+==================================
+Scalar vs Vector packet processing
+==================================
+
+FD.io VPP is developed using vector packet processing, as opposed to
+scalar packet processing.
+
+Vector packet processing is a common approach among high-performance packet
+processing applications such as FD.io VPP and `DPDK <https://en.wikipedia.org/wiki/Data_Plane_Development_Kit>`_.
+The scalar-based approach tends to be favoured by network stacks that
+don't necessarily have strict performance requirements.
+
+**Scalar Packet Processing**
+
+A scalar packet processing network stack typically processes one packet at a
+time: an interrupt handling function takes a single packet from a Network
+Interface, and processes it through a set of functions: fooA calls fooB calls
+fooC and so on.
+
+.. code-block:: none
+
+   +---> fooA(packet1) +---> fooB(packet1) +---> fooC(packet1)
+   +---> fooA(packet2) +---> fooB(packet2) +---> fooC(packet2)
+   ...
+   +---> fooA(packet3) +---> fooB(packet3) +---> fooC(packet3)
+
+Scalar packet processing is simple, but inefficient in these ways:
+
+* When the code path length exceeds the size of the microprocessor's instruction
+  cache (I-cache), `thrashing
+  <https://en.wikipedia.org/wiki/Thrashing_(computer_science)>`_ occurs as the
+  microprocessor is continually loading new instructions. In this model, each
+  packet incurs an identical set of I-cache misses.
+* The associated deep call stack will also add load-store-unit pressure as
+  stack locals fall out of the microprocessor's Layer 1 Data Cache (D-cache).
+
+**Vector Packet Processing**
+
+In contrast, a vector packet processing network stack processes multiple packets
+at a time, called 'vectors of packets' or simply a 'vector'. An interrupt
+handling function takes the vector of packets from a Network Interface, and
+processes the vector through a set of functions: fooA calls fooB calls fooC and
+so on.
+
+.. code-block:: none
+
+   +---> fooA([packet1, +---> fooB([packet1, +---> fooC([packet1, +--->
+          packet2,             packet2,             packet2,
+          ...                  ...                  ...
+          packet256])          packet256])          packet256])
+
+This approach fixes:
+
+* The I-cache thrashing problem described above, by amortizing the cost of
+  I-cache loads across multiple packets.
+
+* The inefficiencies associated with the deep call stack, by receiving vectors
+  of up to 256 packets at a time from the Network Interface and processing them
+  using a directed graph of nodes. The graph scheduler invokes one node dispatch
+  function at a time, restricting stack depth to a few stack frames.
+
+Further optimizations that this approach enables are pipelining and
+prefetching, to minimize read latency on table data and parallelize the packet
+loads needed to process packets.
+
+Press next for more on Packet Processing Graphs.
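Editor's note: the contrast drawn in the two diagrams above can also be written as code. The C sketch below is illustrative only; fooA/fooB/fooC stand in for the per-packet functions in the diagrams, and the packet type and vector size are assumptions made for the example.

.. code-block:: c

   #include <stdint.h>

   #define VEC_SIZE 256

   typedef struct { uint8_t data[64]; } packet_t;

   static void fooA(packet_t *p) { p->data[0]++; }   /* stand-in work */
   static void fooB(packet_t *p) { p->data[1]++; }
   static void fooC(packet_t *p) { p->data[2]++; }

   /* Scalar model: every packet walks the whole call chain, so the
    * instructions for fooA..fooC are reloaded into the I-cache per packet. */
   static void process_scalar(packet_t *pkts, int n)
   {
       for (int i = 0; i < n; i++) {
           fooA(&pkts[i]);
           fooB(&pkts[i]);
           fooC(&pkts[i]);
       }
   }

   /* Vector model: each stage runs over the whole vector before the next
    * stage starts, amortizing I-cache loads across up to 256 packets and
    * keeping the call stack shallow. */
   static void process_vector(packet_t *pkts, int n)
   {
       for (int i = 0; i < n; i++) fooA(&pkts[i]);
       for (int i = 0; i < n; i++) fooB(&pkts[i]);
       for (int i = 0; i < n; i++) fooC(&pkts[i]);
   }

   int main(void)
   {
       static packet_t vec[VEC_SIZE];
       process_scalar(vec, VEC_SIZE);
       process_vector(vec, VEC_SIZE);
       return 0;
   }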
diff --git a/docs/whatisvpp/supported.rst b/docs/whatisvpp/supported.rst
new file mode 100644
index 00000000000..b201def5e06
--- /dev/null
+++ b/docs/whatisvpp/supported.rst
@@ -0,0 +1,31 @@
+.. _supported:
+
+.. toctree::
+
+Architectures and Operating Systems
+***********************************
+
+The following architectures and operating systems are supported in VPP:
+
+Architectures
+-------------
+
+* The FD.io VPP platform supports:
+
+  * x86/64
+  * ARM-AArch64
+
+Operating Systems and Packaging
+-------------------------------
+
+FD.io VPP supports package installation on the following
+recent LTS operating system releases:
+
+* Operating Systems:
+
+  * Debian
+  * Ubuntu
+  * CentOS
+  * OpenSUSE
+
+For more about VPP performance press next.