author | Nathan Skrzypczak <nathan.skrzypczak@gmail.com> | 2021-08-19 11:38:06 +0200 |
---|---|---|
committer | Dave Wallace <dwallacelf@gmail.com> | 2021-10-13 23:22:32 +0000 |
commit | 9ad39c026c8a3c945a7003c4aa4f5cb1d4c80160 (patch) | |
tree | 3cca19635417e28ae381d67ae31c75df2925032d /docs/gettingstarted/troubleshooting | |
parent | f47122e07e1ecd0151902a3cabe46c60a99bee8e (diff) |
docs: better docs, mv doxygen to sphinx
This patch refactors the VPP sphinx docs
in order to make it easier to consume
for external readers as well as VPP developers.
It also makes sphinx the single source
of documentation, which simplifies maintenance
and operation.
Most important updates are:
- reformat the existing documentation as rst
- split RELEASE.md and move it into separate rst files
- remove section 'events'
- remove section 'archive'
- remove section 'related projects'
- remove section 'feature by release'
- remove section 'Various links'
- make (Configuration reference, CLI docs,
developer docs) top level items in the list
- move 'Use Cases' as part of 'About VPP'
- move 'Troubleshooting' as part of 'Getting Started'
- move test framework docs into 'Developer Documentation'
- add a 'Contributing' section for gerrit,
docs and other contributor-related information
- deprecate doxygen and test-docs targets
- redirect the "make doxygen" target to "make docs"
Type: refactor
Change-Id: I552a5645d5b7964d547f99b1336e2ac24e7c209f
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
Diffstat (limited to 'docs/gettingstarted/troubleshooting')
-rw-r--r-- | docs/gettingstarted/troubleshooting/cpuusage.rst | 112 |
-rw-r--r-- | docs/gettingstarted/troubleshooting/index.rst | 14 |
-rw-r--r-- | docs/gettingstarted/troubleshooting/mem.rst | 87 |
-rw-r--r-- | docs/gettingstarted/troubleshooting/sanitizer.rst | 45 |
4 files changed, 258 insertions, 0 deletions
diff --git a/docs/gettingstarted/troubleshooting/cpuusage.rst b/docs/gettingstarted/troubleshooting/cpuusage.rst
new file mode 100644
index 00000000000..9b4514e128e
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/cpuusage.rst
@@ -0,0 +1,112 @@
+.. _cpuusage:
+
+**************
+CPU Load/Usage
+**************
+
+There are various commands and tools that can help users see FD.io VPP CPU and memory usage at runtime.
+
+Linux top/htop
+==============
+
+The Linux top and htop are decent tools to look at FD.io VPP cpu and memory usage, but they will only show
+preallocated memory and total CPU usage. These commands can be useful to show which cores VPP is running on.
+
+This is an example of VPP instance that is running on cores 8 and 9. For this output type **top** and then
+type **1** when the tool starts.
+
+.. code-block:: console
+
+    $ top
+
+    top - 11:04:04 up 35 days, 3:16, 5 users, load average: 2.33, 2.23, 2.16
+    Tasks: 435 total, 2 running, 432 sleeping, 1 stopped, 0 zombie
+    %Cpu0 : 1.0 us, 0.7 sy, 0.0 ni, 98.0 id, 0.0 wa, 0.0 hi, 0.3 si, 0.0 st
+    %Cpu1 : 2.0 us, 0.3 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu2 : 0.7 us, 1.0 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu3 : 1.7 us, 0.7 sy, 0.0 ni, 97.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu4 : 2.0 us, 0.7 sy, 0.0 ni, 97.4 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu5 : 3.0 us, 0.3 sy, 0.0 ni, 96.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu6 : 2.3 us, 0.7 sy, 0.0 ni, 97.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu7 : 2.6 us, 0.3 sy, 0.0 ni, 97.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu8 : 96.0 us, 0.3 sy, 0.0 ni, 3.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu9 :100.0 us, 0.0 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    %Cpu10 : 1.0 us, 0.3 sy, 0.0 ni, 98.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
+    ....
+
+VPP Memory Usage
+================
+
+For details on VPP memory usage you can use the **show memory** command
+
+This is the example VPP memory usage on 2 cores.
+
+.. code-block:: console
+
+    # vppctl show memory verbose
+    Thread 0 vpp_main
+    22043 objects, 17878k of 20826k used, 2426k free, 2396k reclaimed, 346k overhead, 1048572k capacity
+    alloc. from small object cache: 22875 hits 39973 attempts (57.23%) replacements 5143
+    alloc. from free-list: 44732 attempts, 26017 hits (58.16%), 528461 considered (per-attempt 11.81)
+    alloc. from vector-expand: 3430
+    allocs: 52324 2027.84 clocks/call
+    frees: 30280 594.38 clocks/call
+    Thread 1 vpp_wk_0
+    22043 objects, 17878k of 20826k used, 2427k free, 2396k reclaimed, 346k overhead, 1048572k capacity
+    alloc. from small object cache: 22881 hits 39984 attempts (57.23%) replacements 5148
+    alloc. from free-list: 44736 attempts, 26021 hits (58.17%), 528465 considered (per-attempt 11.81)
+    alloc. from vector-expand: 3430
+    allocs: 52335 2027.54 clocks/call
+    frees: 30291 594.36 clocks/call
+
+VPP CPU Load
+============
+
+To find the VPP CPU load or how busy VPP is use the **show runtime** command.
+
+With at least one interface in polling mode, the VPP CPU utilization is always 100%.
+
+A good indicator of CPU load is **"average vectors/node"**. A bigger number means VPP
+is more busy but also more efficient. The Maximum value is 255 (unless you change VLIB_FRAME_SIZE in code).
+It basically means how many packets are processed in batch.
+
+If VPP is not loaded it will likely poll so fast that it will just get one or few
+packets from the rx queue. This is the case shown below on Thread 1. As load goes up vpp
+will have more work to do, so it will poll less frequently, and that will result in more
+packets waiting in rx queue. More packets will result in more efficient execution of the
+code so number of clock cycles / packet will go down. When "average vectors/node" goes up
+close to 255, you will likely start observing rx queue tail drops.
+
+.. code-block:: console
+
+    # vppctl show run
+    Thread 0 vpp_main (lcore 8)
+    Time 6152.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00
+    vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0
+    Name State Calls Vectors Suspends Clocks Vectors/Call
+    acl-plugin-fa-cleaner-process event wait 0 0 1 3.66e4 0.00
+    admin-up-down-process event wait 0 0 1 2.54e3 0.00
+    ....
+    ---------------
+    Thread 1 vpp_wk_0 (lcore 9)
+    Time 6152.9, average vectors/node 1.00, last 128 main loops 0.00 per node 0.00
+    vector rates in 1.3073e2, out 1.3073e2, drop 6.5009e-4, punt 0.0000e0
+    Name State Calls Vectors Suspends Clocks Vectors/Call
+    TenGigabitEthernet86/0/0-outpu active 804395 804395 0 6.17e2 1.00
+    TenGigabitEthernet86/0/0-tx active 804395 804395 0 7.29e2 1.00
+    arp-input active 2 2 0 3.82e4 1.00
+    dpdk-input polling 24239296364 804398 0 1.59e7 0.00
+    error-drop active 4 4 0 4.65e3 1.00
+    ethernet-input active 2 2 0 1.08e4 1.00
+    interface-output active 1 1 0 3.78e3 1.00
+    ip4-glean active 1 1 0 6.98e4 1.00
+    ip4-icmp-echo-request active 804394 804394 0 5.02e2 1.00
+    ip4-icmp-input active 804394 804394 0 4.63e2 1.00
+    ip4-input-no-checksum active 804394 804394 0 8.51e2 1.00
+    ip4-load-balance active 804394 804394 0 5.46e2 1.00
+    ip4-local active 804394 804394 0 5.79e2 1.00
+    ip4-lookup active 804394 804394 0 5.71e2 1.00
+    ip4-rewrite active 804393 804393 0 5.69e2 1.00
+    ip6-input active 2 2 0 5.72e3 1.00
+    ip6-not-enabled active 2 2 0 1.56e4 1.00
+    unix-epoll-input polling 835722 0 0 3.03e-3 0.00
diff --git a/docs/gettingstarted/troubleshooting/index.rst b/docs/gettingstarted/troubleshooting/index.rst
new file mode 100644
index 00000000000..d70c19042c8
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/index.rst
@@ -0,0 +1,14 @@
+.. _troubleshooting:
+
+###############
+Troubleshooting
+###############
+
+This chapter describes some of the many techniques used to troubleshoot and diagnose
+problem with FD.io VPP implementations.
+
+.. toctree::
+
+    cpuusage
+    sanitizer
+    mem
diff --git a/docs/gettingstarted/troubleshooting/mem.rst b/docs/gettingstarted/troubleshooting/mem.rst
new file mode 100644
index 00000000000..630b0af02f3
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/mem.rst
@@ -0,0 +1,87 @@
+.. _memleak:
+
+*****************
+Memory leaks
+*****************
+
+Memory traces
+=============
+
+VPP supports memory traces to help debug (suspected) memory leaks. Each
+allocation/deallocation is instrumented so that the number of allocations and
+current global allocated size is maintained for each unique allocation stack
+trace.
+
+Looking at a memory trace can help diagnose where memory is (over-)used, and
+comparing memory traces at different point in time can help diagnose if and
+where memory leaks happen.
+
+To enable memory traces on main-heap:
+
+.. code-block:: console
+
+    $ vppctl memory-trace on main-heap
+
+To dump memory traces for analysis:
+
+.. code-block:: console
+
+    $ vppctl show memory-trace on main-heap
+    Thread 0 vpp_main
+    base 0x7fffb6422000, size 1g, locked, unmap-on-destroy, name 'main heap'
+    page stats: page-size 4K, total 262144, mapped 30343, not-mapped 231801
+    numa 0: 30343 pages, 118.53m bytes
+    total: 1023.99M, used: 115.49M, free: 908.50M, trimmable: 908.48M
+    free chunks 451 free fastbin blks 0
+    max total allocated 1023.99M
+
+    Bytes Count Sample Traceback
+    31457440 1 0x7fffbb31ad00 clib_mem_alloc_aligned_at_offset + 0x80
+    clib_mem_alloc_aligned + 0x26
+    alloc_aligned_8_8 + 0xe1
+    clib_bihash_instantiate_8_8 + 0x76
+    clib_bihash_init2_8_8 + 0x2ec
+    clib_bihash_init_8_8 + 0x6a
+    l2fib_table_init + 0x54
+    set_int_l2_mode + 0x89
+    int_l3 + 0xb4
+    vlib_cli_dispatch_sub_commands + 0xeee
+    vlib_cli_dispatch_sub_commands + 0xc62
+    vlib_cli_dispatch_sub_commands + 0xc62
+    266768 5222 0x7fffbd79f978 clib_mem_alloc_aligned_at_offset + 0x80
+    vec_resize_allocate_memory + 0xa8
+    _vec_resize_inline + 0x240
+    unix_cli_file_add + 0x83d
+    unix_cli_listen_read_ready + 0x10b
+    linux_epoll_input_inline + 0x943
+    linux_epoll_input + 0x39
+    dispatch_node + 0x336
+    vlib_main_or_worker_loop + 0xbf1
+    vlib_main_loop + 0x1a
+    vlib_main + 0xae7
+    thread0 + 0x3e
+    ....
+
+libc memory traces
+==================
+
+Internal VPP memory allocations rely on VPP main-heap, however when using
+external libraries, esp. in plugins (e.g. OpenSSL library used by the IKEv2
+plugin), those external libraries usually manages memory using the standard
+libc malloc()/free()/... calls. This, in turn, makes use of the default
+libc heap.
+
+VPP has no knowledge of this heap and tools such as memory traces cannot be
+used.
+
+In order to enable the use of standard VPP debugging tools, this library
+replaces standard libc memory management calls with version using VPP
+main-heap.
+
+To use it, you need to use the `LD_PRELOAD` mechanism, e.g.
+
+.. code-block:: console
+
+    ~# LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libvppmem_preload.so /usr/bin/vpp -c /etc/vpp/startup.conf
+
+You can then use tools such as memory traces as usual.
diff --git a/docs/gettingstarted/troubleshooting/sanitizer.rst b/docs/gettingstarted/troubleshooting/sanitizer.rst
new file mode 100644
index 00000000000..217f5e57182
--- /dev/null
+++ b/docs/gettingstarted/troubleshooting/sanitizer.rst
@@ -0,0 +1,45 @@
+.. _sanitizer:
+
+*****************
+Google Sanitizers
+*****************
+
+VPP is instrumented to support `Google Sanitizers <https://github.com/google/sanitizers>`_.
+As of today, only `AddressSanitizer <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_
+is supported, both for GCC and clang.
+
+AddressSanitizer
+================
+
+`AddressSanitizer <https://github.com/google/sanitizers/wiki/AddressSanitizer>`_ (aka ASan) is a memory
+error detector for C/C++. Think Valgrind but much faster.
+
+In order to use it, VPP must be recompiled with ASan support. It is implemented as a cmake
+build option, so all VPP targets should be supported. For example:
+
+.. code-block:: console
+
+    # build a debug image with ASan support:
+    $ make rebuild VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+    # build a release image with ASan support:
+    $ make rebuild-release VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+    # build packages in debug mode with ASan support:
+    $ make pkg-deb-debug VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+    # run GBP plugin tests in debug mode with ASan
+    $ make test-debug TEST=test_gbp VPP_EXTRA_CMAKE_ARGS=-DVPP_ENABLE_SANITIZE_ADDR=ON
+    ....
+
+Once VPP has been built with ASan support you can use it as usual including
+under gdb:
+
+.. code-block:: console
+
+    $ gdb --args $PWD/build-root/install-vpp_debug-native/vpp/bin/vpp "unix { interactive }"
+    ....
+