4 files changed, 258 insertions, 0 deletions
diff --git a/docs/gettingstarted/troubleshooting/cpuusage.rst b/docs/gettingstarted/troubleshooting/cpuusage.rst
new file mode 100644
index 00000000000..9b4514e128e
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/cpuusage.rst
@@ -0,0 +1,112 @@
+.. _cpuusage:
+
+**************
+CPU Load/Usage
+**************
+
+There are various commands and tools that can help users see FD.io VPP CPU and memory usage at runtime.
+
+Linux top/htop
+==============
+
+The Linux top and htop tools are a decent way to look at FD.io VPP CPU and memory usage, but they only show
+preallocated memory and total CPU usage. These tools can be useful to show which cores VPP is running on.
+
+This is an example of a VPP instance that is running on cores 8 and 9. To get this output, type **top** and then
+type **1** once the tool starts.
+
+.. code-block:: console
+
+    $ top
+
+    top - 11:04:04 up 35 days,  3:16,  5 users,  load average: 2.33, 2.23, 2.16
+    Tasks: 435 total,   2 running, 432 sleeping,   1 stopped,   0 zombie
+    %Cpu0  :  1.0 us,  0.7 sy,  0.0 ni, 98.0 id,  0.0 wa,  0.0 hi,  0.3 si,  0.0 st
+    %Cpu1  :  2.0 us,  0.3 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu2  :  0.7 us,  1.0 sy,  0.0 ni, 98.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu3  :  1.7 us,  0.7 sy,  0.0 ni, 97.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu4  :  2.0 us,  0.7 sy,  0.0 ni, 97.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu5  :  3.0 us,  0.3 sy,  0.0 ni, 96.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu6  :  2.3 us,  0.7 sy,  0.0 ni, 97.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu7  :  2.6 us,  0.3 sy,  0.0 ni, 97.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu8  : 96.0 us,  0.3 sy,  0.0 ni,  3.6 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu9  :100.0 us,  0.0 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    %Cpu10 :  1.0 us,  0.3 sy,  0.0 ni, 98.7 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
+    ....
+
+VPP Memory Usage
+================
+
+For details on VPP memory usage, use the **show memory** command.
+
+This is an example of VPP memory usage on 2 cores.
+
+.. code-block:: console
+
+    # vppctl show memory verbose
+    Thread 0 vpp_main
+    22043 objects, 17878k of 20826k used, 2426k free, 2396k reclaimed, 346k overhead, 1048572k capacity
+      alloc. from small object cache: 22875 hits 39973 attempts (57.23%) replacements 5143
+      alloc. from free-list: 44732 attempts, 26017 hits (58.16%), 528461 considered (per-attempt 11.81)
+      alloc. from vector-expand: 3430
+      allocs: 52324 2027.84 clocks/call
+      frees: 30280 594.38 clocks/call
+    Thread 1 vpp_wk_0
+    22043 objects, 17878k of 20826k used, 2427k free, 2396k reclaimed, 346k overhead, 1048572k capacity
+      alloc. from small object cache: 22881 hits 39984 attempts (57.23%) replacements 5148
+      alloc. from free-list: 44736 attempts, 26021 hits (58.17%), 528465 considered (per-attempt 11.81)
+      alloc. from vector-expand: 3430
+      allocs: 52335 2027.54 clocks/call
+      frees: 30291 594.36 clocks/call
+
+VPP CPU Load
+============
+
+To find the VPP CPU load, or how busy VPP is, use the **show runtime** command.
+
+With at least one interface in polling mode, the VPP CPU utilization is always 100%.
+
+A good indicator of CPU load is **"average vectors/node"**. A higher number means VPP
+is busier but also more efficient. The maximum value is 255 (unless you change VLIB_FRAME_SIZE in the code).
+It indicates how many packets are processed per batch.
+
+If VPP is not loaded, it will likely poll so fast that it gets only one or a few
+packets from the rx queue. This is the case shown below on Thread 1. As load goes up, VPP
+has more work to do, so it polls less frequently, which results in more
+packets waiting in the rx queue. More packets result in more efficient execution of the
+code, so the number of clock cycles per packet goes down. When "average vectors/node" gets
+close to 255, you will likely start observing rx queue tail drops.
+
+.. code-block:: console
+
+    # vppctl show run
+    Thread 0 vpp_main (lcore 8)
+    Time 6152.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
+      vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
+                 Name                 State         Calls          Vectors        Suspends         Clocks       Vectors/Call
+    acl-plugin-fa-cleaner-process  event wait                0               0               1          3.66e4            0.00
+    admin-up-down-process          event wait                0               0               1          2.54e3            0.00
+    ....
+    ---------------
+    Thread 1 vpp_wk_0 (lcore 9)
+    Time 6152.9, average vectors/node 1.00, last 128 main loops 0.00 per node 0.00
+      vector rates in 1.3073e2, out 1.3073e2, drop 6.5009e-4, punt 0.0000e0
+                 Name                 State         Calls          Vectors        Suspends         Clocks       Vectors/Call
+    TenGigabitEthernet86/0/0-outpu   active             804395          804395               0          6.17e2            1.00
+    TenGigabitEthernet86/0/0-tx      active             804395          804395               0          7.29e2            1.00
+    arp-input                        active                  2               2               0          3.82e4            1.00
+    dpdk-input                       polling       24239296364          804398               0          1.59e7            0.00
+    error-drop                       active                  4               4               0          4.65e3            1.00
+    ethernet-input                   active                  2               2               0          1.08e4            1.00
+    interface-output                 active                  1               1               0          3.78e3            1.00
+    ip4-glean                        active                  1               1               0          6.98e4            1.00
+    ip4-icmp-echo-request            active             804394          804394               0          5.02e2            1.00
+    ip4-icmp-input                   active             804394          804394               0          4.63e2            1.00
+    ip4-input-no-checksum            active             804394          804394               0          8.51e2            1.00
+    ip4-load-balance                 active             804394          804394               0          5.46e2            1.00
+    ip4-local                        active             804394          804394               0          5.79e2            1.00
+    ip4-lookup                       active             804394          804394               0          5.71e2            1.00
+    ip4-rewrite                      active             804393          804393               0          5.69e2            1.00
+    ip6-input                        active                  2               2               0          5.72e3            1.00
+    ip6-not-enabled                  active                  2               2               0          1.56e4            1.00
+    unix-epoll-input                 polling            835722               0               0         3.03e-3            0.00
diff --git a/docs/gettingstarted/troubleshooting/index.rst b/docs/gettingstarted/troubleshooting/index.rst
new file mode 100644
index 00000000000..d70c19042c8
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/index.rst
@@ -0,0 +1,14 @@
+.. _troubleshooting:
+
+###############
+Troubleshooting
+###############
+
+This chapter describes some of the many techniques used to troubleshoot and diagnose
+problems with FD.io VPP implementations.
+
+.. toctree::
+
+    cpuusage
+    sanitizer
+    mem
diff --git a/docs/gettingstarted/troubleshooting/mem.rst b/docs/gettingstarted/troubleshooting/mem.rst
new file mode 100644
index 00000000000..630b0af02f3
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/mem.rst
@@ -0,0 +1,87 @@
+.. _memleak:
+
+*****************
+Memory leaks
+*****************
+
+Memory traces
+=============
+
+VPP supports memory traces to help debug (suspected) memory leaks. Each
+allocation/deallocation is instrumented so that the number of allocations and
+the current global allocated size are maintained for each unique allocation
+stack trace.
+
+Looking at a memory trace can help diagnose where memory is (over-)used, and
+comparing memory traces at different points in time can help diagnose if and
+where memory leaks happen.
+
+To enable memory traces on main-heap:
+
+.. code-block:: console
+
+    $ vppctl memory-trace on main-heap
+
+To dump memory traces for analysis:
+
+.. code-block:: console
+
+    $ vppctl show memory-trace on main-heap
+    Thread 0 vpp_main
+      base 0x7fffb6422000, size 1g, locked, unmap-on-destroy, name 'main heap'
+	page stats: page-size 4K, total 262144, mapped 30343, not-mapped 231801
+	  numa 0: 30343 pages, 118.53m bytes
+	total: 1023.99M, used: 115.49M, free: 908.50M, trimmable: 908.48M
+	  free chunks 451 free fastbin blks 0
+	  max total allocated 1023.99M
+
+      Bytes    Count     Sample   Traceback
+     31457440        1 0x7fffbb31ad00 clib_mem_alloc_aligned_at_offset + 0x80
+				      clib_mem_alloc_aligned + 0x26
+				      alloc_aligned_8_8 + 0xe1
+				      clib_bihash_instantiate_8_8 + 0x76
+				      clib_bihash_init2_8_8 + 0x2ec
+				      clib_bihash_init_8_8 + 0x6a
+				      l2fib_table_init + 0x54
+				      set_int_l2_mode + 0x89
+				      int_l3 + 0xb4
+				      vlib_cli_dispatch_sub_commands + 0xeee
+				      vlib_cli_dispatch_sub_commands + 0xc62
+				      vlib_cli_dispatch_sub_commands + 0xc62
+       266768     5222 0x7fffbd79f978 clib_mem_alloc_aligned_at_offset + 0x80
+				      vec_resize_allocate_memory + 0xa8
+				      _vec_resize_inline + 0x240
+				      unix_cli_file_add + 0x83d
+				      unix_cli_listen_read_ready + 0x10b
+				      linux_epoll_input_inline + 0x943
+				      linux_epoll_input + 0x39
+				      dispatch_node + 0x336
+				      vlib_main_or_worker_loop + 0xbf1
+				      vlib_main_loop + 0x1a
+				      vlib_main + 0xae7
+				      thread0 + 0x3e
+    ....
+
+libc memory traces
+==================
+
+Internal VPP memory allocations rely on the VPP main-heap. However, external
+libraries, especially those used in plugins (e.g. the OpenSSL library used by
+the IKEv2 plugin), usually manage memory using the standard libc
+malloc()/free()/... calls. These calls, in turn, make use of the default
+libc heap.
+
+VPP has no knowledge of this heap, so tools such as memory traces cannot be
+used on it.
+
+To enable the use of standard VPP debugging tools, the libvppmem_preload.so
+library replaces the standard libc memory management calls with versions that
+use the VPP main-heap.
+
+To use it, load it with the ``LD_PRELOAD`` mechanism, e.g.:
+
+.. code-block:: console
+
+    ~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libvppmem_preload.so /usr/bin/vpp -c /etc/vpp/startup.conf
+
+You can then use tools such as memory traces as usual.
diff --git a/docs/gettingstarted/troubleshooting/sanitizer.rst b/docs/gettingstarted/troubleshooting/sanitizer.rst
new file mode 100644
index 00000000000..217f5e57182
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/sanitizer.rst
@@ -0,0 +1,45 @@
+.. _sanitizer:
+
+*****************
+Google Sanitizers
+*****************
+
+VPP is instrumented to support `Google Sanitizers <https://github.com/google/sanitizers>`_.
+As of today, only `AddressSanitizer <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_
+is supported, both for GCC and clang.
+
+AddressSanitizer
+================
+
+`AddressSanitizer <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_ (aka ASan) is a memory
+error detector for C/C++. Think of it as Valgrind, but much faster.
+
+In order to use it, VPP must be recompiled with ASan support. It is implemented as a cmake
+build option, so all VPP targets should be supported. For example:
+
+.. code-block:: console
+
+    # build a debug image with ASan support:
+    $ make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+    # build a release image with ASan support:
+    $ make rebuild-release VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+    # build packages in debug mode with ASan support:
+    $ make pkg-deb-debug VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+    # run GBP plugin tests in debug mode with ASan
+    $ make test-debug TEST=test_gbp VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+Once VPP has been built with ASan support, you can use it as usual, including
+under gdb:
+
+.. code-block:: console
+
+    $ gdb --args $PWD/build-root/install-vpp_debug-native/vpp/bin/vpp "unix { interactive }"
+    ....
+