Diffstat (limited to 'docs/troubleshooting')
-rw-r--r-- docs/troubleshooting/cpuusage.rst | 112
-rw-r--r-- docs/troubleshooting/index.rst | 15
-rw-r--r-- docs/troubleshooting/mem.rst | 87
-rw-r--r-- docs/troubleshooting/reportingissues/index.rst | 8
-rw-r--r-- docs/troubleshooting/reportingissues/reportingissues.rst | 284
-rw-r--r-- docs/troubleshooting/sanitizer.rst | 45
6 files changed, 0 insertions, 551 deletions
diff --git a/docs/troubleshooting/cpuusage.rst b/docs/troubleshooting/cpuusage.rst
deleted file mode 100644
index b9b8942a3dd..00000000000
--- a/docs/troubleshooting/cpuusage.rst
+++ /dev/null
@@ -1,112 +0,0 @@
-.. _cpuusage:
-
-**************
-CPU Load/Usage
-**************
-
-There are various commands and tools that can help users see FD.io VPP CPU and memory usage at runtime.
-
-Linux top/htop
-==============
-
-The Linux top and htop tools are a decent way to look at FD.io VPP CPU and memory usage, but they only show
-preallocated memory and total CPU usage. These commands can be useful to show which cores VPP is running on.
-
-This is an example of a VPP instance that is running on cores 8 and 9. To get this output, type **top** and then
-type **1** when the tool starts.
-
-.. code-block:: console
-
- $ top
-
- top - 11:04:04 up 35 days, 3:16, 5 users, load average: 2.33, 2.23, 2.16
- Tasks: 435 total, 2 running, 432 sleeping, 1 stopped, 0 zombie
- %Cpu0 : 1.0 us, 0.7 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
- %Cpu1 : 2.0 us, 0.3 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu2 : 0.7 us, 1.0 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu3 : 1.7 us, 0.7 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu4 : 2.0 us, 0.7 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu5 : 3.0 us, 0.3 sy, 0.0 ni, 96.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu6 : 2.3 us, 0.7 sy, 0.0 ni, 97.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu7 : 2.6 us, 0.3 sy, 0.0 ni, 97.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu8 : 96.0 us, 0.3 sy, 0.0 ni, 3.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu9 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- %Cpu10 : 1.0 us, 0.3 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
- ....
-
-VPP Memory Usage
-================
-
-For details on VPP memory usage you can use the **show memory** command.
-
-This is an example of VPP memory usage on 2 cores.
-
-.. code-block:: console
-
- # vppctl show memory verbose
- Thread 0 vpp_main
- 22043 objects, 17878k of 20826k used, 2426k free, 2396k reclaimed, 346k overhead, 1048572k capacity
- alloc. from small object cache: 22875 hits 39973 attempts (57.23%) replacements 5143
- alloc. from free-list: 44732 attempts, 26017 hits (58.16%), 528461 considered (per-attempt 11.81)
- alloc. from vector-expand: 3430
- allocs: 52324 2027.84 clocks/call
- frees: 30280 594.38 clocks/call
- Thread 1 vpp_wk_0
- 22043 objects, 17878k of 20826k used, 2427k free, 2396k reclaimed, 346k overhead, 1048572k capacity
- alloc. from small object cache: 22881 hits 39984 attempts (57.23%) replacements 5148
- alloc. from free-list: 44736 attempts, 26021 hits (58.17%), 528465 considered (per-attempt 11.81)
- alloc. from vector-expand: 3430
- allocs: 52335 2027.54 clocks/call
- frees: 30291 594.36 clocks/call
-
-VPP CPU Load
-============
-
-To find the VPP CPU load, or how busy VPP is, use the **show runtime** command.
-
-With at least one interface in polling mode, the VPP CPU utilization is always 100%.
-
-A good indicator of CPU load is **"average vectors/node"**. A bigger number means VPP
-is busier but also more efficient. The maximum value is 255 (unless you change VLIB_FRAME_SIZE in code).
-It indicates how many packets are processed in each batch.
-
-If VPP is not loaded, it will likely poll so fast that it gets only one or a few
-packets from the rx queue. This is the case shown below on Thread 1. As load goes up, VPP
-has more work to do, so it polls less frequently, and that results in more
-packets waiting in the rx queue. More packets result in more efficient execution of the
-code, so the number of clock cycles per packet goes down. When "average vectors/node" goes up
-close to 255, you will likely start observing rx queue tail drops.
-
-.. code-block:: console
-
- # vppctl show run
- Thread 0 vpp_main (lcore 8)
- Time 6152.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
- vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
- Name State Calls Vectors Suspends Clocks Vectors/Call
- acl-plugin-fa-cleaner-process event wait 0 0 1 3.66e4 0.00
- admin-up-down-process event wait 0 0 1 2.54e3 0.00
- ....
- ---------------
- Thread 1 vpp_wk_0 (lcore 9)
- Time 6152.9, average vectors/node 1.00, last 128 main loops 0.00 per node 0.00
- vector rates in 1.3073e2, out 1.3073e2, drop 6.5009e-4, punt 0.0000e0
- Name State Calls Vectors Suspends Clocks Vectors/Call
- TenGigabitEthernet86/0/0-outpu active 804395 804395 0 6.17e2 1.00
- TenGigabitEthernet86/0/0-tx active 804395 804395 0 7.29e2 1.00
- arp-input active 2 2 0 3.82e4 1.00
- dpdk-input polling 24239296364 804398 0 1.59e7 0.00
- error-drop active 4 4 0 4.65e3 1.00
- ethernet-input active 2 2 0 1.08e4 1.00
- interface-output active 1 1 0 3.78e3 1.00
- ip4-glean active 1 1 0 6.98e4 1.00
- ip4-icmp-echo-request active 804394 804394 0 5.02e2 1.00
- ip4-icmp-input active 804394 804394 0 4.63e2 1.00
- ip4-input-no-checksum active 804394 804394 0 8.51e2 1.00
- ip4-load-balance active 804394 804394 0 5.46e2 1.00
- ip4-local active 804394 804394 0 5.79e2 1.00
- ip4-lookup active 804394 804394 0 5.71e2 1.00
- ip4-rewrite active 804393 804393 0 5.69e2 1.00
- ip6-input active 2 2 0 5.72e3 1.00
- ip6-not-enabled active 2 2 0 1.56e4 1.00
- unix-epoll-input polling 835722 0 0 3.03e-3 0.00
diff --git a/docs/troubleshooting/index.rst b/docs/troubleshooting/index.rst
deleted file mode 100644
index 5dee98a8029..00000000000
--- a/docs/troubleshooting/index.rst
+++ /dev/null
@@ -1,15 +0,0 @@
-.. _troubleshooting:
-
-###############
-Troubleshooting
-###############
-
-This chapter describes some of the many techniques used to troubleshoot and diagnose
-problems with FD.io VPP implementations.
-
-.. toctree::
-
- reportingissues/index.rst
- cpuusage
- sanitizer
- mem
diff --git a/docs/troubleshooting/mem.rst b/docs/troubleshooting/mem.rst
deleted file mode 100644
index 207b2777c50..00000000000
--- a/docs/troubleshooting/mem.rst
+++ /dev/null
@@ -1,87 +0,0 @@
-.. _memleak:
-
-*****************
-Memory leaks
-*****************
-
-Memory traces
-=============
-
-VPP supports memory traces to help debug (suspected) memory leaks. Each
-allocation/deallocation is instrumented so that the number of allocations and
-current global allocated size is maintained for each unique allocation stack
-trace.
-
-Looking at a memory trace can help diagnose where memory is (over-)used, and
-comparing memory traces at different points in time can help diagnose if and
-where memory leaks happen.
-
-To enable memory traces on main-heap:
-
-.. code-block:: console
-
- $ vppctl memory-trace on main-heap
-
-To dump memory traces for analysis:
-
-.. code-block:: console
-
- $ vppctl show memory-trace on main-heap
- Thread 0 vpp_main
- base 0x7fffb6422000, size 1g, locked, unmap-on-destroy, name 'main heap'
- page stats: page-size 4K, total 262144, mapped 30343, not-mapped 231801
- numa 0: 30343 pages, 118.53m bytes
- total: 1023.99M, used: 115.49M, free: 908.50M, trimmable: 908.48M
- free chunks 451 free fastbin blks 0
- max total allocated 1023.99M
-
- Bytes Count Sample Traceback
- 31457440 1 0x7fffbb31ad00 clib_mem_alloc_aligned_at_offset + 0x80
- clib_mem_alloc_aligned + 0x26
- alloc_aligned_8_8 + 0xe1
- clib_bihash_instantiate_8_8 + 0x76
- clib_bihash_init2_8_8 + 0x2ec
- clib_bihash_init_8_8 + 0x6a
- l2fib_table_init + 0x54
- set_int_l2_mode + 0x89
- int_l3 + 0xb4
- vlib_cli_dispatch_sub_commands + 0xeee
- vlib_cli_dispatch_sub_commands + 0xc62
- vlib_cli_dispatch_sub_commands + 0xc62
- 266768 5222 0x7fffbd79f978 clib_mem_alloc_aligned_at_offset + 0x80
- vec_resize_allocate_memory + 0xa8
- _vec_resize_inline + 0x240
- unix_cli_file_add + 0x83d
- unix_cli_listen_read_ready + 0x10b
- linux_epoll_input_inline + 0x943
- linux_epoll_input + 0x39
- dispatch_node + 0x336
- vlib_main_or_worker_loop + 0xbf1
- vlib_main_loop + 0x1a
- vlib_main + 0xae7
- thread0 + 0x3e
- ....
-
-libc memory traces
-==================
-
-Internal VPP memory allocations rely on the VPP main-heap. However, when using
-external libraries, especially in plugins (e.g. the OpenSSL library used by the IKEv2
-plugin), those external libraries usually manage memory using the standard
-libc malloc()/free()/... calls. These calls, in turn, make use of the default
-libc heap.
-
-VPP has no knowledge of this heap and tools such as memory traces cannot be
-used.
-
-In order to enable the use of standard VPP debugging tools, the
-libvppmem_preload.so library replaces the standard libc memory management
-calls with versions that use the VPP main-heap.
-
-To use it, load it through the `LD_PRELOAD` mechanism, e.g.
-
-.. code-block:: console
-
- ~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libvppmem_preload.so /usr/bin/vpp -c /etc/vpp/startup.conf
-
-You can then use tools such as memory traces as usual.
diff --git a/docs/troubleshooting/reportingissues/index.rst b/docs/troubleshooting/reportingissues/index.rst
deleted file mode 100644
index 4d954ac8746..00000000000
--- a/docs/troubleshooting/reportingissues/index.rst
+++ /dev/null
@@ -1,8 +0,0 @@
-.. _reportingissues:
-
-How to Report an Issue
-======================
-
-.. toctree::
-
- reportingissues
diff --git a/docs/troubleshooting/reportingissues/reportingissues.rst b/docs/troubleshooting/reportingissues/reportingissues.rst
deleted file mode 100644
index 3ccd494d092..00000000000
--- a/docs/troubleshooting/reportingissues/reportingissues.rst
+++ /dev/null
@@ -1,284 +0,0 @@
-.. _reportingbugs:
-
-.. toctree::
-
-Reporting Bugs
-==============
-
-Although every situation is different, this section describes how to
-collect data which will help make efficient use of everyone's time
-when dealing with vpp bugs.
-
-Before you press the Jira button to create a bug report - or email
-vpp-dev@lists.fd.io - please ask yourself whether there's enough
-information for someone else to understand and to reproduce the issue
-given a reasonable amount of effort. **Unicast emails to maintainers,
-committers, and the project PTL are strongly discouraged.**
-
-A good strategy for clear-cut bugs: file a detailed Jira ticket, and
-then send a short description of the issue to vpp-dev@lists.fd.io,
-perhaps from the Jira ticket description. It's fine to send email to
-vpp-dev@lists.fd.io to ask a few questions **before** filing Jira tickets.
-
-Data to include in bug reports
-==============================
-
-Image version and operating environment
----------------------------------------
-
-Please make sure to include the vpp image version and command-line arguments.
-
-.. code-block:: console
-
- $ sudo bash
- # vppctl show version verbose cmdline
- Version: v18.07-rc0~509-gb9124828
- Compiled by: vppuser
- Compile host: vppbuild
- Compile date: Fri Jul 13 09:05:37 EDT 2018
- Compile location: /scratch/vpp-showversion
- Compiler: GCC 7.3.0
- Current PID: 5211
- Command line arguments:
- /scratch/vpp-showversion/build-root/install-vpp_debug-native/vpp/bin/vpp
- unix
- interactive
-
-With respect to the operating environment: if the misbehavior involves a
-specific VM / container / bare-metal environment, please
-describe the environment in detail:
-
-* Linux Distro (e.g. Ubuntu 18.04.2 LTS, CentOS-7, etc.)
-* NIC type(s) (ixgbe, i40e, enic, etc. etc.), vhost-user, tuntap
-* NUMA configuration if applicable
-
-Please note the CPU architecture (x86_64, aarch64), and hardware platform.
-
-When practicable, please report issues against released software, or
-unmodified master/latest software.
-
-"Show" command output
----------------------
-
-Every situation is different. If the issue involves a sequence of
-debug CLI commands, please enable CLI command logging, and send the
-sequence involved. Note that the debug CLI is a developer's tool -
-**no warranty express or implied** - and that we may choose not to fix
-debug CLI bugs.
-
-Please include "show error" [error counter] output. It's often helpful
-to "clear error", send a bit of traffic, then "show error"
-particularly when running vpp on noisy networks.
-
-Please include ip4 / ip6 / mpls FIB contents ("show ip fib", "show ip6
-fib", "show mpls fib", "show mpls tunnel").
-
-Please include "show hardware", "show interface", and "show interface
-address" output.
-
-Here is a consolidated set of commands that are generally useful
-before/after sending traffic. Before sending traffic:
-
-.. code-block:: console
-
- vppctl clear hardware
- vppctl clear interface
- vppctl clear error
- vppctl clear run
-
-Send some traffic and then issue the following commands.
-
-.. code-block:: console
-
- vppctl show version verbose
- vppctl show hardware
- vppctl show interface address
- vppctl show interface
- vppctl show run
- vppctl show error
-
-Here are some protocol specific show commands that may also make
-sense. Only include those features which have been configured.
-
-.. code-block:: console
-
- vppctl show l2fib
- vppctl show bridge-domain
-
- vppctl show ip fib
- vppctl show ip neighbors
-
- vppctl show ip6 fib
- vppctl show ip6 neighbors
-
- vppctl show mpls fib
- vppctl show mpls tunnel
-
-Network Topology
-----------------
-
-Please include a crisp description of the network topology, including
-L2 / IP / MPLS / segment-routing addressing details. If you expect
-folks to reproduce and debug issues, this is a must.
-
-At or above a certain level of topological complexity, it becomes
-problematic to reproduce the original setup.
-
-Packet Tracer Output
---------------------
-
-If you capture packet tracer output which seems relevant, please include it.
-
-.. code-block:: console
-
- vppctl trace add dpdk-input 100 # or similar
-
-Send some traffic, then display the trace:
-
-.. code-block:: console
-
- vppctl show trace
-
-Capturing post-mortem data
-==========================
-
-It should go without saying, but anyhow: **please put post-mortem data
-in obvious, accessible places.** Time wasted trying to acquire
-accounts, credentials, and IP addresses simply delays problem
-resolution.
-
-Please remember to add post-mortem data location information to Jira
-tickets.
-
-Syslog Output
--------------
-
-The vpp signal handler typically writes a certain amount of data in
-/var/log/syslog before exiting. Make sure to check for evidence, e.g.
-via "grep /usr/bin/vpp /var/log/syslog" or similar.
-
-Binary API Trace
-----------------
-
-If the issue involves a sequence of control-plane API messages - even
-a very long sequence - please enable control-plane API
-tracing. Control-plane API post-mortem traces end up in
-/tmp/api_post_mortem.<pid>.
-
-Please remember to put post-mortem binary api traces in accessible
-places.
-
-These API traces are especially helpful in cases where the vpp engine
-is throwing traffic on the floor, e.g. for want of a default route or
-similar.
-
-Make sure to leave the default stanza "... api-trace { on } ... " in
-the vpp startup configuration file /etc/vpp/startup.conf, or to
-include it in the command line arguments passed by orchestration
-software.
-
-Core Files
-----------
-
-Production systems, as well as long-running pre-production soak-test
-systems, **must** arrange to collect core images. There are various
-ways to configure core image capture, including e.g. the Ubuntu
-"corekeeper" package. In a pinch, the following very basic sequence
-will capture usable vpp core files in /tmp/dumps.
-
-.. code-block:: console
-
- # mkdir -p /tmp/dumps
- # sysctl -w debug.exception-trace=1
- # sysctl -w kernel.core_pattern="/tmp/dumps/%e-%t"
- # ulimit -c unlimited
- # echo 2 > /proc/sys/fs/suid_dumpable
-
-If you start VPP from systemd, you also need to edit
-/lib/systemd/system/vpp.service and uncomment the "LimitCORE=infinity"
-line before restarting VPP.
-
-Vpp core files often appear enormous, but they are invariably
-sparse. Gzip compresses them to manageable sizes. A multi-GByte
-corefile often compresses to 10-20 Mbytes.
-
-When decompressing a vpp core file, we suggest using "dd" as shown to
-create a sparse, uncompressed core file:
-
-.. code-block:: console
-
- $ zcat vpp_core.gz | dd conv=sparse of=vpp_core
-
-Please remember to put compressed core files in accessible places.
-
-Make sure to leave the default stanza "... unix { ... full-coredump
-... } ... " in the vpp startup configuration file
-/etc/vpp/startup.conf, or to include it in the command line arguments
-passed by orchestration software.
-
-Core files from Private Images
-==============================
-
-Core files from private images require special handling. If it's
-necessary to go that route, copy the **exact** Debian packages (or
-RPMs) which correspond to the core file to the same public place as
-the core file. A no-excuses-allowed, hard-and-fast requirement.
-
-In particular:
-
-.. code-block:: console
-
- libvppinfra_<version>_<arch>.deb # vppinfra library
- libvppinfra-dev_<version>_<arch>.deb # vppinfra library development pkg
- vpp_<version>_<arch>.deb # the vpp executable
- vpp-dbg_<version>_<arch>.deb # debug symbols
- vpp-dev_<version>_<arch>.deb # vpp development pkg
- vpp-lib_<version>_<arch>.deb # shared libraries
- vpp-plugin-core_<version>_<arch>.deb # core plugins
- vpp-plugin-dpdk_<version>_<arch>.deb # dpdk plugin
-
-For reference, please include git commit-ID, branch, and git repo
-information [for repos other than gerrit.fd.io] in the Jira ticket.
-
-Note that git commit-ids are crypto sums of the head [latest]
-**merged** patch. They say **nothing whatsoever** about local
-workspace modifications, branching, or the git repo in question.
-
-Even given a byte-for-byte identical source tree, it's easy to build
-dramatically different binary artifacts. All it takes is a different
-toolchain version.
-
-
-On-the-fly Core File Compression
---------------------------------
-
-Depending on operational requirements, it's possible to compress
-corefiles as they are generated. Please note that it takes several
-seconds' worth of wall-clock time to compress a vpp core file on the
-fly, during which all packet processing activities are suspended.
-
-To create compressed core files on the fly, create the following
-script, e.g. in /usr/local/bin/compressed_corefiles, owned by root,
-executable:
-
-.. code-block:: console
-
- #!/bin/sh
- exec /bin/gzip -f - >"/tmp/dumps/core-$1.$2.gz"
-
-Adjust the kernel core file pattern as shown:
-
-.. code-block:: console
-
- sysctl -w kernel.core_pattern="|/usr/local/bin/compressed_corefiles %e %t"
-
-Core File Summary
------------------
-
-Bottom line: please follow core file handling instructions to the
-letter. It's not complicated. Simply copy the exact Debian packages or
-RPMs which correspond to core files to accessible locations.
-
-If we go through the setup process only to discover that the image and
-core files don't match, it will simply delay resolution of the issue;
-to say nothing of irritating the person who just wasted their time.
diff --git a/docs/troubleshooting/sanitizer.rst b/docs/troubleshooting/sanitizer.rst
deleted file mode 100644
index 217f5e57182..00000000000
--- a/docs/troubleshooting/sanitizer.rst
+++ /dev/null
@@ -1,45 +0,0 @@
-.. _sanitizer:
-
-*****************
-Google Sanitizers
-*****************
-
-VPP is instrumented to support `Google Sanitizers <https://github.com/google/sanitizers>`_.
-As of today, only `AddressSanitizer <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_
-is supported, both for GCC and clang.
-
-AddressSanitizer
-================
-
-`AddressSanitizer <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_ (aka ASan) is a memory
-error detector for C/C++. Think Valgrind but much faster.
-
-In order to use it, VPP must be recompiled with ASan support. It is implemented as a cmake
-build option, so all VPP targets should be supported. For example:
-
-.. code-block:: console
-
- # build a debug image with ASan support:
- $ make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
- ....
-
- # build a release image with ASan support:
- $ make rebuild-release VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
- ....
-
- # build packages in debug mode with ASan support:
- $ make pkg-deb-debug VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
- ....
-
- # run GBP plugin tests in debug mode with ASan
- $ make test-debug TEST=test_gbp VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
- ....
-
-Once VPP has been built with ASan support, you can use it as usual, including
-under gdb:
-
-.. code-block:: console
-
- $ gdb --args $PWD/build-root/install-vpp_debug-native/vpp/bin/vpp "unix { interactive }"
- ....
-