Diffstat (limited to 'docs/usecases')
71 files changed, 5788 insertions, 5230 deletions
diff --git a/docs/usecases/2_vpp.md b/docs/usecases/2_vpp.md deleted file mode 100644 index d5f92818903..00000000000 --- a/docs/usecases/2_vpp.md +++ /dev/null @@ -1,128 +0,0 @@ -How to connect VPP instances using IKEv2 -======================================== - -This section describes how to initiate IKEv2 session between two VPP instances -using Linux veth interfaces and namespaces. - - -Create veth interfaces and namespaces and configure it: - -``` -sudo ip link add ifresp type veth peer name ifinit -sudo ip link set dev ifresp up -sudo ip link set dev ifinit up - -sudo ip netns add clientns -sudo ip netns add serverns -sudo ip link add veth_client type veth peer name client -sudo ip link add veth_server type veth peer name server -sudo ip link set dev veth_client up netns clientns -sudo ip link set dev veth_server up netns serverns - -sudo ip netns exec clientns \ - bash -c " - ip link set dev lo up - ip addr add 192.168.5.2/24 dev veth_client - ip addr add fec5::2/16 dev veth_client - ip route add 192.168.3.0/24 via 192.168.5.1 - ip route add fec3::0/16 via fec5::1 - " - -sudo ip netns exec serverns \ - bash -c " - ip link set dev lo up - ip addr add 192.168.3.2/24 dev veth_server - ip addr add fec3::2/16 dev veth_server - ip route add 192.168.5.0/24 via 192.168.3.1 - ip route add fec5::0/16 via fec3::1 - " -``` - -Run responder VPP: - -``` -sudo /usr/bin/vpp unix { \ - cli-listen /tmp/vpp_resp.sock \ - gid $(id -g) } \ - api-segment { prefix vpp } \ - plugins { plugin dpdk_plugin.so { disable } } -``` - -Configure the responder - - -``` -create host-interface name ifresp -set interface ip addr host-ifresp 192.168.10.2/24 -set interface state host-ifresp up - -create host-interface name server -set interface ip addr host-server 192.168.3.1/24 -set interface state host-server up - -ikev2 profile add pr1 -ikev2 profile set pr1 auth shared-key-mic string Vpp123 -ikev2 profile set pr1 id local ipv4 192.168.10.2 -ikev2 profile set pr1 id remote ipv4 192.168.10.1 - 
-ikev2 profile set pr1 traffic-selector local ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 -ikev2 profile set pr1 traffic-selector remote ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 protocol 0 - -create ipip tunnel src 192.168.10.2 dst 192.168.10.1 -ikev2 profile set pr1 tunnel ipip0 -ip route add 192.168.5.0/24 via 192.168.10.1 ipip0 -set interface unnumbered ipip0 use host-ifresp -``` - -Run initiator VPP: - -``` -sudo /usr/bin/vpp unix { \ - cli-listen /tmp/vpp_init.sock \ - gid $(id -g) } \ - api-segment { prefix vpp } \ - plugins { plugin dpdk_plugin.so { disable } } -``` - -Configure initiator: -``` -create host-interface name ifinit -set interface ip addr host-ifinit 192.168.10.1/24 -set interface state host-ifinit up - -create host-interface name client -set interface ip addr host-client 192.168.5.1/24 -set interface state host-client up - -ikev2 profile add pr1 -ikev2 profile set pr1 auth shared-key-mic string Vpp123 -ikev2 profile set pr1 id local ipv4 192.168.10.1 -ikev2 profile set pr1 id remote ipv4 192.168.10.2 - -ikev2 profile set pr1 traffic-selector remote ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 -ikev2 profile set pr1 traffic-selector local ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 protocol 0 - -ikev2 profile set pr1 responder host-ifinit 192.168.10.2 -ikev2 profile set pr1 ike-crypto-alg aes-gcm-16 256 ike-dh modp-2048 -ikev2 profile set pr1 esp-crypto-alg aes-gcm-16 256 - -create ipip tunnel src 192.168.10.1 dst 192.168.10.2 -ikev2 profile set pr1 tunnel ipip0 -ip route add 192.168.3.0/24 via 192.168.10.2 ipip0 -set interface unnumbered ipip0 use host-ifinit -``` - -Initiate the IKEv2 connection: - -``` -vpp# ikev2 initiate sa-init pr1 -``` - -Responder's and initiator's private networks are now connected with IPSEC tunnel: - -``` -$ sudo ip netns exec clientns ping 192.168.3.1 -PING 192.168.3.1 (192.168.3.1) 56(84) bytes of data. 
-64 bytes from 192.168.3.1: icmp_seq=1 ttl=63 time=1.64 ms -64 bytes from 192.168.3.1: icmp_seq=2 ttl=63 time=7.24 ms -``` diff --git a/docs/usecases/acls.rst b/docs/usecases/acls.rst index 0350af2d969..2dcb1f3b8dc 100644 --- a/docs/usecases/acls.rst +++ b/docs/usecases/acls.rst @@ -1,7 +1,7 @@ .. _aclwithvpp: -Access Control Lists (ACLs) with FD.io VPP -========================================== +Access Control Lists with VPP +============================= This section is overview of the options available to implement ACLs in FD.io VPP. As there are a number of way's to address ACL-like functionality, @@ -106,36 +106,36 @@ Test Case: 10ge2p1x520-ethip4udp-ip4base-iacl1sl-10kflows-ndrpdr .. code-block:: console - DUT1: - Thread 0 vpp_main (lcore 1) - Time 3.8, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 - vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - acl-plugin-fa-cleaner-process any wait 0 0 14 1.29e3 0.00 - acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 9.18e2 0.00 - api-rx-from-ring active 0 0 52 8.96e4 0.00 - dpdk-process any wait 0 0 1 1.35e4 0.00 - fib-walk any wait 0 0 2 2.69e3 0.00 - ip6-icmp-neighbor-discovery-ev any wait 0 0 4 1.32e3 0.00 - lisp-retry-service any wait 0 0 2 2.90e3 0.00 - unix-epoll-input polling 7037 0 0 1.25e6 0.00 - vpe-oam-process any wait 0 0 2 2.28e3 0.00 - - Thread 1 vpp_wk_0 (lcore 2) - Time 3.8, average vectors/node 249.02, last 128 main loops 32.00 per node 273.07 - vector rates in 6.1118e6, out 6.1118e6, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - TenGigabitEtherneta/0/0-output active 47106 11721472 0 9.47e0 248.83 - TenGigabitEtherneta/0/0-tx active 47106 11721472 0 4.22e1 248.83 - TenGigabitEtherneta/0/1-output active 47106 11721472 0 1.02e1 248.83 - TenGigabitEtherneta/0/1-tx active 47106 11721472 0 4.18e1 248.83 - acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 1.39e3 0.00 - 
acl-plugin-in-ip4-fa active 94107 23442944 0 1.75e2 249.11 - dpdk-input polling 47106 23442944 0 4.64e1 497.66 - ethernet-input active 94212 23442944 0 1.55e1 248.83 - ip4-input-no-checksum active 94107 23442944 0 3.23e1 249.11 - ip4-lookup active 94107 23442944 0 2.91e1 249.11 - ip4-rewrite active 94107 23442944 0 2.48e1 249.11 + DUT1: + Thread 0 vpp_main (lcore 1) + Time 3.8, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 + vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + acl-plugin-fa-cleaner-process any wait 0 0 14 1.29e3 0.00 + acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 9.18e2 0.00 + api-rx-from-ring active 0 0 52 8.96e4 0.00 + dpdk-process any wait 0 0 1 1.35e4 0.00 + fib-walk any wait 0 0 2 2.69e3 0.00 + ip6-icmp-neighbor-discovery-ev any wait 0 0 4 1.32e3 0.00 + lisp-retry-service any wait 0 0 2 2.90e3 0.00 + unix-epoll-input polling 7037 0 0 1.25e6 0.00 + vpe-oam-process any wait 0 0 2 2.28e3 0.00 + + Thread 1 vpp_wk_0 (lcore 2) + Time 3.8, average vectors/node 249.02, last 128 main loops 32.00 per node 273.07 + vector rates in 6.1118e6, out 6.1118e6, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + TenGigabitEtherneta/0/0-output active 47106 11721472 0 9.47e0 248.83 + TenGigabitEtherneta/0/0-tx active 47106 11721472 0 4.22e1 248.83 + TenGigabitEtherneta/0/1-output active 47106 11721472 0 1.02e1 248.83 + TenGigabitEtherneta/0/1-tx active 47106 11721472 0 4.18e1 248.83 + acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 1.39e3 0.00 + acl-plugin-in-ip4-fa active 94107 23442944 0 1.75e2 249.11 + dpdk-input polling 47106 23442944 0 4.64e1 497.66 + ethernet-input active 94212 23442944 0 1.55e1 248.83 + ip4-input-no-checksum active 94107 23442944 0 3.23e1 249.11 + ip4-lookup active 94107 23442944 0 2.91e1 249.11 + ip4-rewrite active 94107 23442944 0 2.48e1 249.11 unix-epoll-input polling 46 0 0 1.54e3 0.00 Input/Stateful @@ 
-146,41 +146,41 @@ Test Case: 64b-1t1c-ethip4udp-ip4base-iacl1sf-10kflows-ndrpdr .. code-block:: console - DUT1: - Thread 0 vpp_main (lcore 1) - Time 3.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 - vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - acl-plugin-fa-cleaner-process any wait 0 0 16 1.40e3 0.00 - acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 8.97e2 0.00 - api-rx-from-ring active 0 0 52 7.12e4 0.00 - dpdk-process any wait 0 0 1 1.69e4 0.00 - fib-walk any wait 0 0 2 2.55e3 0.00 - ip4-reassembly-expire-walk any wait 0 0 1 1.27e4 0.00 - ip6-icmp-neighbor-discovery-ev any wait 0 0 4 1.09e3 0.00 - ip6-reassembly-expire-walk any wait 0 0 1 2.57e3 0.00 - lisp-retry-service any wait 0 0 2 1.18e4 0.00 - statseg-collector-process time wait 0 0 1 6.38e3 0.00 - unix-epoll-input polling 6320 0 0 1.41e6 0.00 - vpe-oam-process any wait 0 0 2 7.53e3 0.00 - - Thread 1 vpp_wk_0 (lcore 2) - Time 3.9, average vectors/node 252.74, last 128 main loops 32.00 per node 273.07 - vector rates in 7.5833e6, out 7.5833e6, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - TenGigabitEtherneta/0/0-output active 58325 14738944 0 9.41e0 252.70 - TenGigabitEtherneta/0/0-tx active 58325 14738944 0 4.32e1 252.70 - TenGigabitEtherneta/0/1-output active 58323 14738944 0 1.02e1 252.71 - TenGigabitEtherneta/0/1-tx active 58323 14738944 0 4.31e1 252.71 - acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 1.62e3 0.00 - acl-plugin-in-ip4-fa active 116628 29477888 0 1.01e2 252.75 - dpdk-input polling 58325 29477888 0 4.63e1 505.41 - ethernet-input active 116648 29477888 0 1.53e1 252.71 - ip4-input-no-checksum active 116628 29477888 0 3.21e1 252.75 - ip4-lookup active 116628 29477888 0 2.90e1 252.75 - ip4-rewrite active 116628 29477888 0 2.48e1 252.75 - unix-epoll-input polling 57 0 0 2.39e3 0.00 - + DUT1: + Thread 0 vpp_main (lcore 1) + Time 3.9, average 
vectors/node 0.00, last 128 main loops 0.00 per node 0.00 + vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + acl-plugin-fa-cleaner-process any wait 0 0 16 1.40e3 0.00 + acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 8.97e2 0.00 + api-rx-from-ring active 0 0 52 7.12e4 0.00 + dpdk-process any wait 0 0 1 1.69e4 0.00 + fib-walk any wait 0 0 2 2.55e3 0.00 + ip4-reassembly-expire-walk any wait 0 0 1 1.27e4 0.00 + ip6-icmp-neighbor-discovery-ev any wait 0 0 4 1.09e3 0.00 + ip6-reassembly-expire-walk any wait 0 0 1 2.57e3 0.00 + lisp-retry-service any wait 0 0 2 1.18e4 0.00 + statseg-collector-process time wait 0 0 1 6.38e3 0.00 + unix-epoll-input polling 6320 0 0 1.41e6 0.00 + vpe-oam-process any wait 0 0 2 7.53e3 0.00 + + Thread 1 vpp_wk_0 (lcore 2) + Time 3.9, average vectors/node 252.74, last 128 main loops 32.00 per node 273.07 + vector rates in 7.5833e6, out 7.5833e6, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + TenGigabitEtherneta/0/0-output active 58325 14738944 0 9.41e0 252.70 + TenGigabitEtherneta/0/0-tx active 58325 14738944 0 4.32e1 252.70 + TenGigabitEtherneta/0/1-output active 58323 14738944 0 1.02e1 252.71 + TenGigabitEtherneta/0/1-tx active 58323 14738944 0 4.31e1 252.71 + acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 1.62e3 0.00 + acl-plugin-in-ip4-fa active 116628 29477888 0 1.01e2 252.75 + dpdk-input polling 58325 29477888 0 4.63e1 505.41 + ethernet-input active 116648 29477888 0 1.53e1 252.71 + ip4-input-no-checksum active 116628 29477888 0 3.21e1 252.75 + ip4-lookup active 116628 29477888 0 2.90e1 252.75 + ip4-rewrite active 116628 29477888 0 2.48e1 252.75 + unix-epoll-input polling 57 0 0 2.39e3 0.00 + Output/Stateless ~~~~~~~~~~~~~~~~ @@ -189,39 +189,39 @@ Test Case: 64b-1t1c-ethip4udp-ip4base-oacl10sl-10kflows-ndrpdr .. 
code-block:: console - DUT1: - Thread 0 vpp_main (lcore 1) - Time 3.8, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 - vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - acl-plugin-fa-cleaner-process any wait 0 0 14 1.43e3 0.00 - acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 9.23e2 0.00 - api-rx-from-ring active 0 0 52 8.01e4 0.00 - dpdk-process any wait 0 0 1 1.59e6 0.00 - fib-walk any wait 0 0 2 6.81e3 0.00 - ip6-icmp-neighbor-discovery-ev any wait 0 0 4 2.81e3 0.00 - lisp-retry-service any wait 0 0 2 3.64e3 0.00 - unix-epoll-input polling 4842 0 0 1.81e6 0.00 - vpe-oam-process any wait 0 0 1 2.24e4 0.00 - - Thread 1 vpp_wk_0 (lcore 2) - Time 3.8, average vectors/node 249.29, last 128 main loops 36.00 per node 271.06 - vector rates in 5.9196e6, out 5.9196e6, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - TenGigabitEtherneta/0/0-output active 45595 11363584 0 9.22e0 249.23 - TenGigabitEtherneta/0/0-tx active 45595 11363584 0 4.25e1 249.23 - TenGigabitEtherneta/0/1-output active 45594 11363584 0 9.75e0 249.23 - TenGigabitEtherneta/0/1-tx active 45594 11363584 0 4.21e1 249.23 - acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 1.28e3 0.00 - acl-plugin-out-ip4-fa active 91155 22727168 0 1.78e2 249.32 - dpdk-input polling 45595 22727168 0 4.64e1 498.46 - ethernet-input active 91189 22727168 0 1.56e1 249.23 - interface-output active 91155 22727168 0 1.13e1 249.32 - ip4-input-no-checksum active 91155 22727168 0 1.95e1 249.32 - ip4-lookup active 91155 22727168 0 2.88e1 249.32 - ip4-rewrite active 91155 22727168 0 3.53e1 249.32 - unix-epoll-input polling 44 0 0 1.53e3 0.00 - + DUT1: + Thread 0 vpp_main (lcore 1) + Time 3.8, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 + vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + 
acl-plugin-fa-cleaner-process any wait 0 0 14 1.43e3 0.00 + acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 9.23e2 0.00 + api-rx-from-ring active 0 0 52 8.01e4 0.00 + dpdk-process any wait 0 0 1 1.59e6 0.00 + fib-walk any wait 0 0 2 6.81e3 0.00 + ip6-icmp-neighbor-discovery-ev any wait 0 0 4 2.81e3 0.00 + lisp-retry-service any wait 0 0 2 3.64e3 0.00 + unix-epoll-input polling 4842 0 0 1.81e6 0.00 + vpe-oam-process any wait 0 0 1 2.24e4 0.00 + + Thread 1 vpp_wk_0 (lcore 2) + Time 3.8, average vectors/node 249.29, last 128 main loops 36.00 per node 271.06 + vector rates in 5.9196e6, out 5.9196e6, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + TenGigabitEtherneta/0/0-output active 45595 11363584 0 9.22e0 249.23 + TenGigabitEtherneta/0/0-tx active 45595 11363584 0 4.25e1 249.23 + TenGigabitEtherneta/0/1-output active 45594 11363584 0 9.75e0 249.23 + TenGigabitEtherneta/0/1-tx active 45594 11363584 0 4.21e1 249.23 + acl-plugin-fa-worker-cleaner-pinterrupt wa 7 0 0 1.28e3 0.00 + acl-plugin-out-ip4-fa active 91155 22727168 0 1.78e2 249.32 + dpdk-input polling 45595 22727168 0 4.64e1 498.46 + ethernet-input active 91189 22727168 0 1.56e1 249.23 + interface-output active 91155 22727168 0 1.13e1 249.32 + ip4-input-no-checksum active 91155 22727168 0 1.95e1 249.32 + ip4-lookup active 91155 22727168 0 2.88e1 249.32 + ip4-rewrite active 91155 22727168 0 3.53e1 249.32 + unix-epoll-input polling 44 0 0 1.53e3 0.00 + Output/Stateful ~~~~~~~~~~~~~~~ @@ -230,42 +230,42 @@ Test Case: 64b-1t1c-ethip4udp-ip4base-oacl10sf-10kflows-ndrpdr .. 
code-block:: console - DUT1: - Thread 0 vpp_main (lcore 1) - Time 3.8, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 - vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - acl-plugin-fa-cleaner-process any wait 0 0 16 1.47e3 0.00 - acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 8.51e2 0.00 - api-rx-from-ring active 0 0 50 7.24e4 0.00 - dpdk-process any wait 0 0 2 1.93e4 0.00 - fib-walk any wait 0 0 2 2.02e3 0.00 - ip4-reassembly-expire-walk any wait 0 0 1 3.96e3 0.00 - ip6-icmp-neighbor-discovery-ev any wait 0 0 4 9.84e2 0.00 - ip6-reassembly-expire-walk any wait 0 0 1 3.76e3 0.00 - lisp-retry-service any wait 0 0 2 1.49e4 0.00 - statseg-collector-process time wait 0 0 1 4.98e3 0.00 - unix-epoll-input polling 5653 0 0 1.55e6 0.00 - vpe-oam-process any wait 0 0 2 1.90e3 0.00 - - Thread 1 vpp_wk_0 (lcore 2) - Time 3.8, average vectors/node 250.85, last 128 main loops 36.00 per node 271.06 - vector rates in 7.2686e6, out 7.2686e6, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - TenGigabitEtherneta/0/0-output active 55639 13930752 0 9.33e0 250.38 - TenGigabitEtherneta/0/0-tx active 55639 13930752 0 4.27e1 250.38 - TenGigabitEtherneta/0/1-output active 55636 13930758 0 9.81e0 250.39 - TenGigabitEtherneta/0/1-tx active 55636 13930758 0 4.33e1 250.39 - acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 1.62e3 0.00 - acl-plugin-out-ip4-fa active 110988 27861510 0 1.04e2 251.03 - dpdk-input polling 55639 27861510 0 4.62e1 500.76 - ethernet-input active 111275 27861510 0 1.55e1 250.38 - interface-output active 110988 27861510 0 1.21e1 251.03 - ip4-input-no-checksum active 110988 27861510 0 1.95e1 251.03 - ip4-lookup active 110988 27861510 0 2.89e1 251.03 - ip4-rewrite active 110988 27861510 0 3.55e1 251.03 - unix-epoll-input polling 54 0 0 2.43e3 0.00 - + DUT1: + Thread 0 vpp_main (lcore 1) + Time 3.8, average vectors/node 0.00, last 128 main 
loops 0.00 per node 0.00 + vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + acl-plugin-fa-cleaner-process any wait 0 0 16 1.47e3 0.00 + acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 8.51e2 0.00 + api-rx-from-ring active 0 0 50 7.24e4 0.00 + dpdk-process any wait 0 0 2 1.93e4 0.00 + fib-walk any wait 0 0 2 2.02e3 0.00 + ip4-reassembly-expire-walk any wait 0 0 1 3.96e3 0.00 + ip6-icmp-neighbor-discovery-ev any wait 0 0 4 9.84e2 0.00 + ip6-reassembly-expire-walk any wait 0 0 1 3.76e3 0.00 + lisp-retry-service any wait 0 0 2 1.49e4 0.00 + statseg-collector-process time wait 0 0 1 4.98e3 0.00 + unix-epoll-input polling 5653 0 0 1.55e6 0.00 + vpe-oam-process any wait 0 0 2 1.90e3 0.00 + + Thread 1 vpp_wk_0 (lcore 2) + Time 3.8, average vectors/node 250.85, last 128 main loops 36.00 per node 271.06 + vector rates in 7.2686e6, out 7.2686e6, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + TenGigabitEtherneta/0/0-output active 55639 13930752 0 9.33e0 250.38 + TenGigabitEtherneta/0/0-tx active 55639 13930752 0 4.27e1 250.38 + TenGigabitEtherneta/0/1-output active 55636 13930758 0 9.81e0 250.39 + TenGigabitEtherneta/0/1-tx active 55636 13930758 0 4.33e1 250.39 + acl-plugin-fa-worker-cleaner-pinterrupt wa 8 0 0 1.62e3 0.00 + acl-plugin-out-ip4-fa active 110988 27861510 0 1.04e2 251.03 + dpdk-input polling 55639 27861510 0 4.62e1 500.76 + ethernet-input active 111275 27861510 0 1.55e1 250.38 + interface-output active 110988 27861510 0 1.21e1 251.03 + ip4-input-no-checksum active 110988 27861510 0 1.95e1 251.03 + ip4-lookup active 110988 27861510 0 2.89e1 251.03 + ip4-rewrite active 110988 27861510 0 3.55e1 251.03 + unix-epoll-input polling 54 0 0 2.43e3 0.00 + Performance ----------- @@ -297,21 +297,21 @@ Stateful .. 
code-block:: console - $ sudo vppctl ip_add_del_route 20.20.20.0/24 via 1.1.1.2 sw_if_index 1 resolve-attempts 10 count 1 - $ sudo vppctl acl_add_replace ipv4 permit src 30.30.30.1/32 dst 40.40.40.1/32 sport 1000 dport 1000, ipv4 permit+reflect src 10.10.10.0/24, ipv4 permit+reflect src 20.20.20.0/24 - $ sudo vppctl acl_interface_set_acl_list sw_if_index 2 input 0 - $ sudo vppctl acl_interface_set_acl_list sw_if_index 1 input 0 - + $ sudo vppctl ip_add_del_route 20.20.20.0/24 via 1.1.1.2 sw_if_index 1 resolve-attempts 10 count 1 + $ sudo vppctl acl_add_replace ipv4 permit src 30.30.30.1/32 dst 40.40.40.1/32 sport 1000 dport 1000, ipv4 permit+reflect src 10.10.10.0/24, ipv4 permit+reflect src 20.20.20.0/24 + $ sudo vppctl acl_interface_set_acl_list sw_if_index 2 input 0 + $ sudo vppctl acl_interface_set_acl_list sw_if_index 1 input 0 + Stateless ~~~~~~~~~ .. code-block:: console - $ sudo vppctl ip_add_del_route 20.20.20.0/24 via 1.1.1.2 sw_if_index 1 resolve-attempts 10 count 1 - $ sudo vppctl acl_add_replace ipv4 permit src 30.30.30.1/32 dst 40.40.40.1/32 sport 1000 dport 1000, ipv4 permit src 10.10.10.0/24, ipv4 permit src 20.20.20.0/24 - $ sudo vppctl acl_interface_set_acl_list sw_if_index 2 input 0 + $ sudo vppctl ip_add_del_route 20.20.20.0/24 via 1.1.1.2 sw_if_index 1 resolve-attempts 10 count 1 + $ sudo vppctl acl_add_replace ipv4 permit src 30.30.30.1/32 dst 40.40.40.1/32 sport 1000 dport 1000, ipv4 permit src 10.10.10.0/24, ipv4 permit src 20.20.20.0/24 + $ sudo vppctl acl_interface_set_acl_list sw_if_index 2 input 0 $ sudo vppctl acl_interface_set_acl_list sw_if_index 1 input 0 - + Links ~~~~~ @@ -348,40 +348,40 @@ Note: the double-pass of the ip4-lookup and ip4-rewrite. .. 
code-block:: console - DUT1: - Thread 0 vpp_main (lcore 1) - Time 3.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 - vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - api-rx-from-ring active 0 0 53 4.20e4 0.00 - dpdk-process any wait 0 0 1 1.75e4 0.00 - fib-walk any wait 0 0 2 1.59e3 0.00 - ip4-reassembly-expire-walk any wait 0 0 1 2.20e3 0.00 - ip6-icmp-neighbor-discovery-ev any wait 0 0 4 1.14e3 0.00 - ip6-reassembly-expire-walk any wait 0 0 1 1.50e3 0.00 - lisp-retry-service any wait 0 0 2 2.19e3 0.00 - statseg-collector-process time wait 0 0 1 2.48e3 0.00 - unix-epoll-input polling 2800 0 0 3.15e6 0.00 - vpe-oam-process any wait 0 0 2 7.00e2 0.00 - - Thread 1 vpp_wk_0 (lcore 2) - Time 3.9, average vectors/node 220.84, last 128 main loops 20.87 per node 190.86 - vector rates in 1.0724e7, out 1.0724e7, drop 0.0000e0, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call - TenGigabitEtherneta/0/0-output active 94960 20698112 0 1.03e1 217.97 - TenGigabitEtherneta/0/0-tx active 94960 20698112 0 3.97e1 217.97 - TenGigabitEtherneta/0/1-output active 92238 20698112 0 9.92e0 224.39 - TenGigabitEtherneta/0/1-tx active 92238 20698112 0 4.26e1 224.39 - cop-input active 94960 20698112 0 1.98e1 217.97 - dpdk-input polling 95154 41396224 0 4.58e1 435.04 - ethernet-input active 92238 20698112 0 1.59e1 224.39 - ip4-cop-whitelist active 94960 20698112 0 3.24e1 217.97 - ip4-input active 94960 20698112 0 3.13e1 217.97 - ip4-input-no-checksum active 92238 20698112 0 2.23e1 224.39 - ip4-lookup active 187198 41396224 0 3.08e1 221.14 - ip4-rewrite active 187198 41396224 0 2.47e1 221.14 - unix-epoll-input polling 93 0 0 1.35e3 0.00 - + DUT1: + Thread 0 vpp_main (lcore 1) + Time 3.9, average vectors/node 0.00, last 128 main loops 0.00 per node 0.00 + vector rates in 0.0000e0, out 0.0000e0, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call 
+ api-rx-from-ring active 0 0 53 4.20e4 0.00 + dpdk-process any wait 0 0 1 1.75e4 0.00 + fib-walk any wait 0 0 2 1.59e3 0.00 + ip4-reassembly-expire-walk any wait 0 0 1 2.20e3 0.00 + ip6-icmp-neighbor-discovery-ev any wait 0 0 4 1.14e3 0.00 + ip6-reassembly-expire-walk any wait 0 0 1 1.50e3 0.00 + lisp-retry-service any wait 0 0 2 2.19e3 0.00 + statseg-collector-process time wait 0 0 1 2.48e3 0.00 + unix-epoll-input polling 2800 0 0 3.15e6 0.00 + vpe-oam-process any wait 0 0 2 7.00e2 0.00 + + Thread 1 vpp_wk_0 (lcore 2) + Time 3.9, average vectors/node 220.84, last 128 main loops 20.87 per node 190.86 + vector rates in 1.0724e7, out 1.0724e7, drop 0.0000e0, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + TenGigabitEtherneta/0/0-output active 94960 20698112 0 1.03e1 217.97 + TenGigabitEtherneta/0/0-tx active 94960 20698112 0 3.97e1 217.97 + TenGigabitEtherneta/0/1-output active 92238 20698112 0 9.92e0 224.39 + TenGigabitEtherneta/0/1-tx active 92238 20698112 0 4.26e1 224.39 + cop-input active 94960 20698112 0 1.98e1 217.97 + dpdk-input polling 95154 41396224 0 4.58e1 435.04 + ethernet-input active 92238 20698112 0 1.59e1 224.39 + ip4-cop-whitelist active 94960 20698112 0 3.24e1 217.97 + ip4-input active 94960 20698112 0 3.13e1 217.97 + ip4-input-no-checksum active 92238 20698112 0 2.23e1 224.39 + ip4-lookup active 187198 41396224 0 3.08e1 221.14 + ip4-rewrite active 187198 41396224 0 2.47e1 221.14 + unix-epoll-input polling 93 0 0 1.35e3 0.00 + Performance ~~~~~~~~~~~ @@ -403,12 +403,12 @@ applied to the interface 1. .. 
code-block:: console - $ sudo vppctl ip_add_del_route 10.10.10.0/24 via 1.1.1.1 sw_if_index 2 resolve-attempts 10 count 1 - $ sudo vppctl ip_table_add_del table 1 - $ sudo vppctl ip_add_del_route 20.20.20.0/24 vrf 1 resolve-attempts 10 count 1 local - $ sudo vppctl cop_whitelist_enable_disable sw_if_index 1 ip4 fib-id 1 - $ sudo vppctl cop_interface_enable_disable sw_if_index 1 - + $ sudo vppctl ip_add_del_route 10.10.10.0/24 via 1.1.1.1 sw_if_index 2 resolve-attempts 10 count 1 + $ sudo vppctl ip_table_add_del table 1 + $ sudo vppctl ip_add_del_route 20.20.20.0/24 vrf 1 resolve-attempts 10 count 1 local + $ sudo vppctl cop_whitelist_enable_disable sw_if_index 1 ip4 fib-id 1 + $ sudo vppctl cop_interface_enable_disable sw_if_index 1 + Links ~~~~~ @@ -478,7 +478,7 @@ Match an IPv6…. $ sudo vppctl classify table mask l3 ip6 dst buckets 64 $ sudo vppctl classify session hit-next 0 table-index 0 match l3 ip6 dst 2001:db8:1::2 opaque-index 42 $ sudo vppctl set interface l2 input classify intfc host-s0_s1 ip6-table 0 - + Links ~~~~~ diff --git a/docs/usecases/container_test.md b/docs/usecases/container_test.md deleted file mode 100644 index ad0bc2ea098..00000000000 --- a/docs/usecases/container_test.md +++ /dev/null @@ -1,640 +0,0 @@ -Container-based network simulation -================================== - -The "make test" framework provides a good way to test individual -features. However, when testing several features at once - or -validating nontrivial configurations - it may prove difficult or -impossible to use the unit-test framework. - -This note explains how to set up lxc/lxd, and a 5-container testbed to -test a split-tunnel nat + ikev2 + ipsec + ipv6 prefix-delegation -scenario. - -OS / Distro test results ------------------------- - -This setup has been tested on an Ubuntu 18.04 LTS system. If you're -feeling adventurous, the same scenario also worked on a recent Ubuntu -20.04 "preview" daily build. - -Other distros may work fine, or not at all. 
- -Proxy Server ------------- - -If you need to use a proxy server e.g. from a lab system, you'll -probably need to set HTTP_PROXY, HTTPS_PROXY, http_proxy and -https_proxy in /etc/environment. Directly setting variables in the -environment doesn't work. The lxd snap _daemon_ needs the proxy settings, -not the user interface. - -Something like so: - -``` - HTTP_PROXY=http://my.proxy.server:8080 - HTTPS_PROXY=http://my.proxy.server:4333 - http_proxy=http://my.proxy.server:8080 - https_proxy=http://my.proxy.server:4333 -``` - -Install and configure lxd -------------------------- - -Install the lxd snap. The lxd snap is up to date, as opposed to the -results of "sudo apt-get install lxd". - -``` - # snap install lxd - # lxd init -``` - -"lxd init" asks several questions. With the exception of the storage -pool, take the defaults. To match the configs shown below, create a -storage pool named "vpp." Storage pools of type "zfs" and "files" have -been tested successfully. - -zfs is more space-efficient. "lxc copy" is infinitely faster with -zfs. The path for the zfs storage pool is under /var. Do not replace -it with a symbolic link, unless you want to rebuild all of your -containers from scratch. Ask me how I know that. - -Create three network segments ------------------------------ - -Aka, linux bridges. - -``` - # lxc network create respond - # lxc network create internet - # lxc network create initiate -``` - -We'll explain the test topology in a bit. Stay tuned. - -Set up the default container profile ------------------------------------- - -Execute "lxc profile edit default", and install the following -configuration. Note that the "shared" directory should mount your vpp -workspaces. With that trick, you can edit code from any of the -containers, run vpp without installing it, etc. 
- -``` - config: {} - description: Default LXD profile - devices: - eth0: - name: eth0 - network: lxdbr0 - type: nic - eth1: - name: eth1 - nictype: bridged - parent: internet - type: nic - eth2: - name: eth2 - nictype: bridged - parent: respond - type: nic - eth3: - name: eth3 - nictype: bridged - parent: initiate - type: nic - root: - path: / - pool: vpp - type: disk - shared: - path: /scratch - source: /scratch - type: disk - name: default -``` - -Set up the network configurations ---------------------------------- - -Edit the fake "internet" backbone: - -``` - # lxc network edit internet -``` - -Install the ip addresses shown below, to avoid having to rebuild the vpp -and host configuration: - -``` - config: - ipv4.address: 10.26.68.1/24 - ipv4.dhcp.ranges: 10.26.68.10-10.26.68.50 - ipv4.nat: "true" - ipv6.address: none - ipv6.nat: "false" - description: "" - name: internet - type: bridge - used_by: - managed: true - status: Created - locations: - - none -``` - -Repeat the process with the "respond" and "initiate" networks, using these -configurations: - -### respond network configuration - -``` - config: - ipv4.address: 10.166.14.1/24 - ipv4.dhcp.ranges: 10.166.14.10-10.166.14.50 - ipv4.nat: "true" - ipv6.address: none - ipv6.nat: "false" - description: "" - name: respond - type: bridge - used_by: - managed: true - status: Created - locations: - - none -``` -### initiate network configuration - -``` - config: - ipv4.address: 10.219.188.1/24 - ipv4.dhcp.ranges: 10.219.188.10-10.219.188.50 - ipv4.nat: "true" - ipv6.address: none - ipv6.nat: "false" - description: "" - name: initiate - type: bridge - used_by: - managed: true - status: Created - locations: - - none -``` - -Create a "master" container image ---------------------------------- - -The master container image should be set up so that you can -build vpp, ssh into the container, edit source code, run gdb, etc. - -Make sure that e.g. public key auth ssh works. 
- -``` - # lxd launch ubuntu:18.04 respond - <spew> - # lxc exec respond bash - respond# cd /scratch/my-vpp-workspace - respond# apt-get install make ssh - respond# make install-dep - respond# exit - # lxc stop respond -``` - -Mark the container image privileged. If you forget this step, you'll -trip over a netlink error (-11) aka EAGAIN when you try to roll in the -vpp configurations. - -``` - # lxc config set respond security.privileged "true" -``` - -Duplicate the "master" container image --------------------------------------- - -To avoid having to configure N containers, be sure that the master -container image is fully set up before you help it have children: - -``` - # lxc copy respond respondhost - # lxc copy respond initiate - # lxc copy respond initiatehost - # lxc copy respond dhcpserver # optional, to test ipv6 prefix delegation -``` - -Install handy script --------------------- - -See below for a handly script which executes lxc commands across the -current set of running containers. I call it "lxc-foreach," feel free -to call the script Ishmael if you like. - -Examples: - -``` - $ lxc-foreach start - <issues "lxc start" for each container in the list> -``` - -After a few seconds, use this one to open an ssh connection to each -container. The ssh command parses the output of "lxc info," which -displays container ip addresses. 
- -``` - $ lxc-foreach ssh -``` - -Here's the script: - -``` - #!/bin/bash - - set -u - export containers="respond respondhost initiate initiatehost dhcpserver" - - if [ x$1 = "x" ] ; then - echo missing command - exit 1 - fi - - if [ $1 = "ssh" ] ; then - for c in $containers - do - inet=`lxc info $c | grep eth0 | grep -v inet6 | head -1 | cut -f 3` - if [ x$inet = "x" ] ; then - echo $c not started - else - gnome-terminal --command "/usr/bin/ssh $inet" - fi - done - exit 0 - fi - - for c in $containers - do - echo lxc $1 $c - lxc $1 $c - done - - exit 0 -``` - -Test topology -------------- - -Finally, we're ready to describe a test topology. First, a picture: - -``` - ===+======== management lan/bridge lxdbr0 (dhcp) ===========+=== - | | | - | | | - | | | - v | v - eth0 | eth0 - +------+ eth1 eth1 +------+ - | respond | 10.26.88.100 <= internet bridge => 10.26.88.101 | initiate | - +------+ +------+ - eth2 / bvi0 10.166.14.2 | 10.219.188.2 eth3 / bvi0 - | | | - | ("respond" bridge) | ("initiate" bridge) | - | | | - v | v - eth2 10.166.14.3 | eth3 10.219.188.3 - +----------+ | +----------+ - | respondhost | | | respondhost | - +----------+ | +----------+ - eth0 (management lan) <========+========> eth0 (management lan) -``` - -### Test topology discussion - -This topology is suitable for testing almost any tunnel encap/decap -scenario. The two containers "respondhost" and "initiatehost" are end-stations -connected to two vpp instances running on "respond" and "initiate". - -We leverage the Linux end-station network stacks to generate traffic -of all sorts. - -The so-called "internet" bridge models the public internet. The "respond" and -"initiate" bridges connect vpp instances to local hosts - -End station configs -------------------- - -The end-station Linux configurations set up the eth2 and eth3 ip -addresses shown above, and add tunnel routes to the opposite -end-station networks. 
- -### respondhost configuration - -``` - ifconfig eth2 10.166.14.3/24 up - route add -net 10.219.188.0/24 gw 10.166.14.2 -``` - -### initiatehost configuration - -``` - sudo ifconfig eth3 10.219.188.3/24 up - sudo route add -net 10.166.14.0/24 gw 10.219.188.2 -``` - -VPP configs ------------ - -Split nat44 / ikev2 + ipsec tunneling, with ipv6 prefix delegation in -the "respond" config. - -### respond configuration - -``` - set term pag off - - comment { "internet" } - create host-interface name eth1 - set int ip address host-eth1 10.26.68.100/24 - set int ip6 table host-eth1 0 - set int state host-eth1 up - - comment { default route via initiate } - ip route add 0.0.0.0/0 via 10.26.68.101 - - comment { "respond-private-net" } - create host-interface name eth2 - bvi create instance 0 - set int l2 bridge bvi0 1 bvi - set int ip address bvi0 10.166.14.2/24 - set int state bvi0 up - set int l2 bridge host-eth2 1 - set int state host-eth2 up - - - nat44 add interface address host-eth1 - set interface nat44 in host-eth2 out host-eth1 - nat44 add identity mapping external host-eth1 udp 500 - nat44 add identity mapping external host-eth1 udp 4500 - comment { nat44 untranslated subnet 10.219.188.0/24 } - - comment { responder profile } - ikev2 profile add initiate - ikev2 profile set initiate udp-encap - ikev2 profile set initiate auth rsa-sig cert-file /scratch/setups/respondcert.pem - set ikev2 local key /scratch/setups/initiatekey.pem - ikev2 profile set initiate id local fqdn initiator.my.net - ikev2 profile set initiate id remote fqdn responder.my.net - ikev2 profile set initiate traffic-selector remote ip-range 10.219.188.0 - 10.219.188.255 port-range 0 - 65535 protocol 0 - ikev2 profile set initiate traffic-selector local ip-range 10.166.14.0 - 10.166.14.255 port-range 0 - 65535 protocol 0 - create ipip tunnel src 10.26.68.100 dst 10.26.68.101 - ikev2 profile set initiate tunnel ipip0 - - comment { ipv6 prefix delegation } - ip6 nd address autoconfig host-eth1 
default-route - dhcp6 client host-eth1 - dhcp6 pd client host-eth1 prefix group hgw - set ip6 address bvi0 prefix group hgw ::2/56 - ip6 nd address autoconfig bvi0 default-route - ip6 nd bvi0 ra-interval 5 3 ra-lifetime 180 - - set int mtu packet 1390 ipip0 - set int unnum ipip0 use host-eth1 - ip route add 10.219.188.0/24 via ipip0 -``` - -### initiate configuration - -``` - set term pag off - - comment { "internet" } - create host-interface name eth1 - comment { set dhcp client intfc host-eth1 hostname initiate } - set int ip address host-eth1 10.26.68.101/24 - set int state host-eth1 up - - comment { default route via "internet gateway" } - comment { ip route add 0.0.0.0/0 via 10.26.68.1 } - - comment { "initiate-private-net" } - create host-interface name eth3 - bvi create instance 0 - set int l2 bridge bvi0 1 bvi - set int ip address bvi0 10.219.188.2/24 - set int state bvi0 up - set int l2 bridge host-eth3 1 - set int state host-eth3 up - - nat44 add interface address host-eth1 - set interface nat44 in bvi0 out host-eth1 - nat44 add identity mapping external host-eth1 udp 500 - nat44 add identity mapping external host-eth1 udp 4500 - comment { nat44 untranslated subnet 10.166.14.0/24 } - - comment { initiator profile } - ikev2 profile add respond - ikev2 profile set respond udp-encap - ikev2 profile set respond auth rsa-sig cert-file /scratch/setups/initiatecert.pem - set ikev2 local key /scratch/setups/respondkey.pem - ikev2 profile set respond id local fqdn responder.my.net - ikev2 profile set respond id remote fqdn initiator.my.net - - ikev2 profile set respond traffic-selector remote ip-range 10.166.14.0 - 10.166.14.255 port-range 0 - 65535 protocol 0 - ikev2 profile set respond traffic-selector local ip-range 10.219.188.0 - 10.219.188.255 port-range 0 - 65535 protocol 0 - - ikev2 profile set respond responder host-eth1 10.26.68.100 - ikev2 profile set respond ike-crypto-alg aes-cbc 256 ike-integ-alg sha1-96 ike-dh modp-2048 - ikev2 profile set respond 
esp-crypto-alg aes-cbc 256 esp-integ-alg sha1-96 esp-dh ecp-256 - ikev2 profile set respond sa-lifetime 3600 10 5 0 - - create ipip tunnel src 10.26.68.101 dst 10.26.68.100 - ikev2 profile set respond tunnel ipip0 - ikev2 initiate sa-init respond - - set int mtu packet 1390 ipip0 - set int unnum ipip0 use host-eth1 - ip route add 10.166.14.0/24 via ipip0 -``` - -IKEv2 certificate setup ------------------------ - -In both of the vpp configurations, you'll see "/scratch/setups/xxx.pem" -mentioned. These certificates are used in the ikev2 key exchange. - -Here's how to generate the certificates: - -``` - openssl req -x509 -nodes -newkey rsa:4096 -keyout respondkey.pem -out respondcert.pem -days 3560 - openssl x509 -text -noout -in respondcert.pem - openssl req -x509 -nodes -newkey rsa:4096 -keyout initiatekey.pem -out initiatecert.pem -days 3560 - openssl x509 -text -noout -in initiatecert.pem -``` - -Make sure that the "respond" and "initiate" configurations point to the certificates. - -DHCPv6 server setup -------------------- - -If you need an ipv6 dhcp server to test ipv6 prefix delegation, -create the "dhcpserver" container as shown above. - -Install the "isc-dhcp-server" Debian package: - -``` - sudo apt-get install isc-dhcp-server -``` - -### /etc/dhcp/dhcpd6.conf - -Edit the dhcpv6 configuration and add an ipv6 subnet with prefix -delegation. For example: - -``` - subnet6 2001:db01:0:1::/64 { - range6 2001:db01:0:1::1 2001:db01:0:1::9; - prefix6 2001:db01:0:100:: 2001:db01:0:200::/56; - } -``` - -Add an ipv6 address on eth1, which is connected to the "internet" -bridge, and start the dhcp server. I use the following trivial bash -script, which runs the dhcp6 server in the foreground and produces -dhcp traffic spew: - -``` - #!/bin/bash - ifconfig eth1 inet6 add 2001:db01:0:1::10/64 || true - dhcpd -6 -d -cf /etc/dhcp/dhcpd6.conf -``` - -The "|| true" bit keeps going if eth1 already has the indicated ipv6 -address. 
- -Container / Host Interoperation -------------------------------- - -Host / container interoperation is highly desirable. If the host and a -set of containers don't run the same distro _and distro version_, it's -reasonably likely that the glibc versions won't match. That, in turn, -makes vpp binaries built in one environment fail in the other. - -Trying to install multiple versions of glibc - especially at the host -level - often ends very badly and is _not recommended_. It's not just -glibc, either. The dynamic loader ld-linux-xxx-so.2 is glibc version -specific. - -Fortunately, it's reasonable easy to build lxd container images based on -specific Ubuntu or Debian versions. - -### Create a custom root filesystem image - -First, install the "debootstrap" tool: - -``` - sudo apt-get install debootstrap -``` - -Make a temp directory, and use debootstrap to populate it. In this -example, we create an Ubuntu 20.04 (focal fossa) base image: - -``` - # mkdir /tmp/myroot - # debootstrap focal /tmp/myroot http://archive.ubuntu.com/ubuntu -``` - -To tinker with the base image (if desired): - -``` - # chroot /tmp/myroot - <add packages, etc.> - # exit -``` - -Make a compressed tarball of the base image: - -``` - # tar zcf /tmp/rootfs.tar.gz -C /tmp/myroot . 
-``` - -Create a "metadata.yaml" file which describes the base image: - -``` - architecture: "x86_64" - # To get current date in Unix time, use `date +%s` command - creation_date: 1458040200 - properties: - architecture: "x86_64" - description: "My custom Focal Fossa image" - os: "Ubuntu" - release: "focal" -``` - -Make a compressed tarball of metadata.yaml: - -``` - # tar zcf metadata.tar.gz metadata.yaml -``` - -Import the image into lxc / lxd: - -``` - $ lxc image import metadata.tar.gz rootfd.tar.gz --alias focal-base -``` - -### Create a container which uses the customized base image: - -``` - $ lxc launch focal-base focaltest - $ lxc exec focaltest bash -``` - -The next several steps should be executed in the container, in the -bash shell spun up by "lxc exec..." - -### Configure container networking - -In the container, create /etc/netplan/50-cloud-init.yaml: - -``` - network: - version: 2 - ethernets: - eth0: - dhcp4: true -``` - -Use "cat > /etc/netplan/50-cloud-init.yaml", and cut-'n-paste if your -favorite text editor is AWOL. - -Apply the configuration: - -``` - # netplan apply -``` - -At this point, eth0 should have an ip address, and you should see -a default route with "route -n". - -### Configure apt - -Again, in the container, set up /etc/apt/sources.list via cut-'n-paste -from a recently update "focal fossa" host. 
Something like so: - -``` - deb http://us.archive.ubuntu.com/ubuntu/ focal main restricted - deb http://us.archive.ubuntu.com/ubuntu/ focal-updates main restricted - deb http://us.archive.ubuntu.com/ubuntu/ focal universe - deb http://us.archive.ubuntu.com/ubuntu/ focal-updates universe - deb http://us.archive.ubuntu.com/ubuntu/ focal multiverse - deb http://us.archive.ubuntu.com/ubuntu/ focal-updates multiverse - deb http://us.archive.ubuntu.com/ubuntu/ focal-backports main restricted universe multiverse - deb http://security.ubuntu.com/ubuntu focal-security main restricted - deb http://security.ubuntu.com/ubuntu focal-security universe - deb http://security.ubuntu.com/ubuntu focal-security multiverse -``` - -"apt-get update" and "apt-install" should produce reasonable results. -Suggest "apt-get install make git". - -At this point, you can use the "/scratch" sharepoint (or similar) to -execute "make install-dep install-ext-deps" to set up the container -with the vpp toolchain; proceed as desired. diff --git a/docs/usecases/container_test.rst b/docs/usecases/container_test.rst new file mode 100644 index 00000000000..8ad6285f8ed --- /dev/null +++ b/docs/usecases/container_test.rst @@ -0,0 +1,655 @@ +Simulating networks with VPP +============================ + +The “make test” framework provides a good way to test individual +features. However, when testing several features at once - or validating +nontrivial configurations - it may prove difficult or impossible to use +the unit-test framework. + +This note explains how to set up lxc/lxd, and a 5-container testbed to +test a split-tunnel nat + ikev2 + ipsec + ipv6 prefix-delegation +scenario. + +OS / Distro test results +------------------------ + +This setup has been tested on an Ubuntu 18.04 LTS system. If you’re +feeling adventurous, the same scenario also worked on a recent Ubuntu +20.04 “preview” daily build. + +Other distros may work fine, or not at all. 
+ +Proxy Server +------------ + +If you need to use a proxy server e.g. from a lab system, you’ll +probably need to set HTTP_PROXY, HTTPS_PROXY, http_proxy and https_proxy +in /etc/environment. Directly setting variables in the environment +doesn’t work. The lxd snap *daemon* needs the proxy settings, not the +user interface. + +Something like so: + +:: + + HTTP_PROXY=http://my.proxy.server:8080 + HTTPS_PROXY=http://my.proxy.server:4333 + http_proxy=http://my.proxy.server:8080 + https_proxy=http://my.proxy.server:4333 + +Install and configure lxd +------------------------- + +Install the lxd snap. The lxd snap is up to date, as opposed to the +results of “sudo apt-get install lxd”. + +:: + + # snap install lxd + # lxd init + +“lxd init” asks several questions. With the exception of the storage +pool, take the defaults. To match the configs shown below, create a +storage pool named “vpp.” Storage pools of type “zfs” and “files” have +been tested successfully. + +zfs is more space-efficient. “lxc copy” is infinitely faster with zfs. +The path for the zfs storage pool is under /var. Do not replace it with +a symbolic link, unless you want to rebuild all of your containers from +scratch. Ask me how I know that. + +Create three network segments +----------------------------- + +Aka, linux bridges. + +:: + + # lxc network create respond + # lxc network create internet + # lxc network create initiate + +We’ll explain the test topology in a bit. Stay tuned. + +Set up the default container profile +------------------------------------ + +Execute “lxc profile edit default”, and install the following +configuration. Note that the “shared” directory should mount your vpp +workspaces. With that trick, you can edit code from any of the +containers, run vpp without installing it, etc. 
+ +:: + + config: {} + description: Default LXD profile + devices: + eth0: + name: eth0 + network: lxdbr0 + type: nic + eth1: + name: eth1 + nictype: bridged + parent: internet + type: nic + eth2: + name: eth2 + nictype: bridged + parent: respond + type: nic + eth3: + name: eth3 + nictype: bridged + parent: initiate + type: nic + root: + path: / + pool: vpp + type: disk + shared: + path: /scratch + source: /scratch + type: disk + name: default + +Set up the network configurations +--------------------------------- + +Edit the fake “internet” backbone: + +:: + + # lxc network edit internet + +Install the ip addresses shown below, to avoid having to rebuild the vpp +and host configuration: + +:: + + config: + ipv4.address: 10.26.68.1/24 + ipv4.dhcp.ranges: 10.26.68.10-10.26.68.50 + ipv4.nat: "true" + ipv6.address: none + ipv6.nat: "false" + description: "" + name: internet + type: bridge + used_by: + managed: true + status: Created + locations: + - none + +Repeat the process with the “respond” and “initiate” networks, using +these configurations: + +respond network configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + config: + ipv4.address: 10.166.14.1/24 + ipv4.dhcp.ranges: 10.166.14.10-10.166.14.50 + ipv4.nat: "true" + ipv6.address: none + ipv6.nat: "false" + description: "" + name: respond + type: bridge + used_by: + managed: true + status: Created + locations: + - none + +initiate network configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + config: + ipv4.address: 10.219.188.1/24 + ipv4.dhcp.ranges: 10.219.188.10-10.219.188.50 + ipv4.nat: "true" + ipv6.address: none + ipv6.nat: "false" + description: "" + name: initiate + type: bridge + used_by: + managed: true + status: Created + locations: + - none + +Create a “master” container image +--------------------------------- + +The master container image should be set up so that you can build vpp, +ssh into the container, edit source code, run gdb, etc. + +Make sure that e.g. public key auth ssh works. 
+ +:: + + # lxc launch ubuntu:18.04 respond + <spew> + # lxc exec respond bash + respond# cd /scratch/my-vpp-workspace + respond# apt-get install make ssh + respond# make install-dep + respond# exit + # lxc stop respond + +Mark the container image privileged. If you forget this step, you’ll +trip over a netlink error (-11) aka EAGAIN when you try to roll in the +vpp configurations. + +:: + + # lxc config set respond security.privileged "true" + +Duplicate the “master” container image +-------------------------------------- + +To avoid having to configure N containers, be sure that the master +container image is fully set up before you help it have children: + +:: + + # lxc copy respond respondhost + # lxc copy respond initiate + # lxc copy respond initiatehost + # lxc copy respond dhcpserver # optional, to test ipv6 prefix delegation + +Install handy script +-------------------- + +See below for a handy script which executes lxc commands across the +set of containers listed in the script. I call it “lxc-foreach”; feel free to +call the script Ishmael if you like. + +Examples: + +:: + + $ lxc-foreach start + <issues "lxc start" for each container in the list> + +After a few seconds, use this one to open an ssh connection to each +container. The ssh command parses the output of “lxc info,” which +displays container ip addresses. + +:: + + $ lxc-foreach ssh + +Here’s the script: + +:: + + #!/bin/bash + + set -u + export containers="respond respondhost initiate initiatehost dhcpserver" + + if [ "x${1:-}" = "x" ] ; then + echo missing command + exit 1 + fi + + if [ $1 = "ssh" ] ; then + for c in $containers + do + inet=`lxc info $c | grep eth0 | grep -v inet6 | head -1 | cut -f 3` + if [ x$inet = "x" ] ; then + echo $c not started + else + gnome-terminal --command "/usr/bin/ssh $inet" + fi + done + exit 0 + fi + + for c in $containers + do + echo lxc $1 $c + lxc $1 $c + done + + exit 0 + +Test topology +------------- + +Finally, we’re ready to describe a test topology.
First, a picture: + +:: + + ===+======== management lan/bridge lxdbr0 (dhcp) ===========+=== + | | | + | | | + | | | + v | v + eth0 | eth0 + +------+ eth1 eth1 +------+ + | respond | 10.26.68.100 <= internet bridge => 10.26.68.101 | initiate | + +------+ +------+ + eth2 / bvi0 10.166.14.2 | 10.219.188.2 eth3 / bvi0 + | | | + | ("respond" bridge) | ("initiate" bridge) | + | | | + v | v + eth2 10.166.14.3 | eth3 10.219.188.3 + +----------+ | +----------+ + | respondhost | | | initiatehost | + +----------+ | +----------+ + eth0 (management lan) <========+========> eth0 (management lan) + +Test topology discussion +~~~~~~~~~~~~~~~~~~~~~~~~ + +This topology is suitable for testing almost any tunnel encap/decap +scenario. The two containers “respondhost” and “initiatehost” are +end-stations connected to two vpp instances running on “respond” and +“initiate”. + +We leverage the Linux end-station network stacks to generate traffic of +all sorts. + +The so-called “internet” bridge models the public internet. The +“respond” and “initiate” bridges connect vpp instances to local hosts. + +End station configs +------------------- + +The end-station Linux configurations set up the eth2 and eth3 ip +addresses shown above, and add tunnel routes to the opposite end-station +networks. + +respondhost configuration +~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + ifconfig eth2 10.166.14.3/24 up + route add -net 10.219.188.0/24 gw 10.166.14.2 + +initiatehost configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + sudo ifconfig eth3 10.219.188.3/24 up + sudo route add -net 10.166.14.0/24 gw 10.219.188.2 + +VPP configs +----------- + +Split nat44 / ikev2 + ipsec tunneling, with ipv6 prefix delegation in +the “respond” config.
+ +respond configuration +~~~~~~~~~~~~~~~~~~~~~ + +:: + + set term pag off + + comment { "internet" } + create host-interface name eth1 + set int ip address host-eth1 10.26.68.100/24 + set int ip6 table host-eth1 0 + set int state host-eth1 up + + comment { default route via initiate } + ip route add 0.0.0.0/0 via 10.26.68.101 + + comment { "respond-private-net" } + create host-interface name eth2 + bvi create instance 0 + set int l2 bridge bvi0 1 bvi + set int ip address bvi0 10.166.14.2/24 + set int state bvi0 up + set int l2 bridge host-eth2 1 + set int state host-eth2 up + + + nat44 add interface address host-eth1 + set interface nat44 in host-eth2 out host-eth1 + nat44 add identity mapping external host-eth1 udp 500 + nat44 add identity mapping external host-eth1 udp 4500 + comment { nat44 untranslated subnet 10.219.188.0/24 } + + comment { responder profile } + ikev2 profile add initiate + ikev2 profile set initiate udp-encap + ikev2 profile set initiate auth rsa-sig cert-file /scratch/setups/respondcert.pem + set ikev2 local key /scratch/setups/initiatekey.pem + ikev2 profile set initiate id local fqdn initiator.my.net + ikev2 profile set initiate id remote fqdn responder.my.net + ikev2 profile set initiate traffic-selector remote ip-range 10.219.188.0 - 10.219.188.255 port-range 0 - 65535 protocol 0 + ikev2 profile set initiate traffic-selector local ip-range 10.166.14.0 - 10.166.14.255 port-range 0 - 65535 protocol 0 + create ipip tunnel src 10.26.68.100 dst 10.26.68.101 + ikev2 profile set initiate tunnel ipip0 + + comment { ipv6 prefix delegation } + ip6 nd address autoconfig host-eth1 default-route + dhcp6 client host-eth1 + dhcp6 pd client host-eth1 prefix group hgw + set ip6 address bvi0 prefix group hgw ::2/56 + ip6 nd address autoconfig bvi0 default-route + ip6 nd bvi0 ra-interval 5 3 ra-lifetime 180 + + set int mtu packet 1390 ipip0 + set int unnum ipip0 use host-eth1 + ip route add 10.219.188.0/24 via ipip0 + +initiate configuration 
+~~~~~~~~~~~~~~~~~~~~~~ + +:: + + set term pag off + + comment { "internet" } + create host-interface name eth1 + comment { set dhcp client intfc host-eth1 hostname initiate } + set int ip address host-eth1 10.26.68.101/24 + set int state host-eth1 up + + comment { default route via "internet gateway" } + comment { ip route add 0.0.0.0/0 via 10.26.68.1 } + + comment { "initiate-private-net" } + create host-interface name eth3 + bvi create instance 0 + set int l2 bridge bvi0 1 bvi + set int ip address bvi0 10.219.188.2/24 + set int state bvi0 up + set int l2 bridge host-eth3 1 + set int state host-eth3 up + + nat44 add interface address host-eth1 + set interface nat44 in bvi0 out host-eth1 + nat44 add identity mapping external host-eth1 udp 500 + nat44 add identity mapping external host-eth1 udp 4500 + comment { nat44 untranslated subnet 10.166.14.0/24 } + + comment { initiator profile } + ikev2 profile add respond + ikev2 profile set respond udp-encap + ikev2 profile set respond auth rsa-sig cert-file /scratch/setups/initiatecert.pem + set ikev2 local key /scratch/setups/respondkey.pem + ikev2 profile set respond id local fqdn responder.my.net + ikev2 profile set respond id remote fqdn initiator.my.net + + ikev2 profile set respond traffic-selector remote ip-range 10.166.14.0 - 10.166.14.255 port-range 0 - 65535 protocol 0 + ikev2 profile set respond traffic-selector local ip-range 10.219.188.0 - 10.219.188.255 port-range 0 - 65535 protocol 0 + + ikev2 profile set respond responder host-eth1 10.26.68.100 + ikev2 profile set respond ike-crypto-alg aes-cbc 256 ike-integ-alg sha1-96 ike-dh modp-2048 + ikev2 profile set respond esp-crypto-alg aes-cbc 256 esp-integ-alg sha1-96 esp-dh ecp-256 + ikev2 profile set respond sa-lifetime 3600 10 5 0 + + create ipip tunnel src 10.26.68.101 dst 10.26.68.100 + ikev2 profile set respond tunnel ipip0 + ikev2 initiate sa-init respond + + set int mtu packet 1390 ipip0 + set int unnum ipip0 use host-eth1 + ip route add 10.166.14.0/24 
via ipip0 + +IKEv2 certificate setup +----------------------- + +In both of the vpp configurations, you’ll see “/scratch/setups/xxx.pem” +mentioned. These certificates are used in the ikev2 key exchange. + +Here’s how to generate the certificates: + +:: + + openssl req -x509 -nodes -newkey rsa:4096 -keyout respondkey.pem -out respondcert.pem -days 3560 + openssl x509 -text -noout -in respondcert.pem + openssl req -x509 -nodes -newkey rsa:4096 -keyout initiatekey.pem -out initiatecert.pem -days 3560 + openssl x509 -text -noout -in initiatecert.pem + +Make sure that the “respond” and “initiate” configurations point to the +certificates. + +DHCPv6 server setup +------------------- + +If you need an ipv6 dhcp server to test ipv6 prefix delegation, create +the “dhcpserver” container as shown above. + +Install the “isc-dhcp-server” Debian package: + +:: + + sudo apt-get install isc-dhcp-server + +/etc/dhcp/dhcpd6.conf +~~~~~~~~~~~~~~~~~~~~~ + +Edit the dhcpv6 configuration and add an ipv6 subnet with prefix +delegation. For example: + +:: + + subnet6 2001:db01:0:1::/64 { + range6 2001:db01:0:1::1 2001:db01:0:1::9; + prefix6 2001:db01:0:100:: 2001:db01:0:200::/56; + } + +Add an ipv6 address on eth1, which is connected to the “internet” +bridge, and start the dhcp server. I use the following trivial bash +script, which runs the dhcp6 server in the foreground and produces dhcp +traffic spew: + +:: + + #!/bin/bash + ifconfig eth1 inet6 add 2001:db01:0:1::10/64 || true + dhcpd -6 -d -cf /etc/dhcp/dhcpd6.conf + +The “\|\| true” bit keeps going if eth1 already has the indicated ipv6 +address. + +Container / Host Interoperation +------------------------------- + +Host / container interoperation is highly desirable. If the host and a +set of containers don’t run the same distro *and distro version*, it’s +reasonably likely that the glibc versions won’t match. That, in turn, +makes vpp binaries built in one environment fail in the other. 
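One quick way to spot such a mismatch before moving binaries between environments is to compare the glibc version reported on each side. The following is a minimal sketch and not part of the original setup: the helper names ``glibc_version`` and ``same_glibc`` are hypothetical, and it assumes a glibc-based system where ``getconf GNU_LIBC_VERSION`` is available.

```shell
#!/bin/bash

# Hypothetical helper: print the glibc version of the current
# environment (for example "2.31"). Assumes a glibc-based system.
glibc_version() {
    getconf GNU_LIBC_VERSION | awk '{print $2}'
}

# Hypothetical helper: succeed only if both version strings are identical.
same_glibc() {
    [ "$1" = "$2" ]
}

# Example use on the host, with "respond" as the container name:
#   host_ver=$(glibc_version)
#   cont_ver=$(lxc exec respond -- getconf GNU_LIBC_VERSION | awk '{print $2}')
#   same_glibc "$host_ver" "$cont_ver" || echo "glibc mismatch: $host_ver vs $cont_ver"
```

If the two versions differ, building vpp inside the container (or building a container image that matches the host, as described next) is likely the safer route.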
+ +Trying to install multiple versions of glibc - especially at the host +level - often ends very badly and is *not recommended*. It’s not just +glibc, either. The dynamic loader ld-linux-xxx.so.2 is glibc version +specific. + +Fortunately, it’s reasonably easy to build lxd container images based on +specific Ubuntu or Debian versions. + +Create a custom root filesystem image +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +First, install the “debootstrap” tool: + +:: + + sudo apt-get install debootstrap + +Make a temp directory, and use debootstrap to populate it. In this +example, we create an Ubuntu 20.04 (focal fossa) base image: + +:: + + # mkdir /tmp/myroot + # debootstrap focal /tmp/myroot http://archive.ubuntu.com/ubuntu + +To tinker with the base image (if desired): + +:: + + # chroot /tmp/myroot + <add packages, etc.> + # exit + +Make a compressed tarball of the base image: + +:: + + # tar zcf /tmp/rootfs.tar.gz -C /tmp/myroot . + +Create a “metadata.yaml” file which describes the base image: + +:: + + architecture: "x86_64" + # To get current date in Unix time, use `date +%s` command + creation_date: 1458040200 + properties: + architecture: "x86_64" + description: "My custom Focal Fossa image" + os: "Ubuntu" + release: "focal" + +Make a compressed tarball of metadata.yaml: + +:: + + # tar zcf metadata.tar.gz metadata.yaml + +Import the image into lxc / lxd: + +:: + + $ lxc image import metadata.tar.gz /tmp/rootfs.tar.gz --alias focal-base + +Create a container which uses the customized base image: +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +:: + + $ lxc launch focal-base focaltest + $ lxc exec focaltest bash + +The next several steps should be executed in the container, in the bash +shell spun up by “lxc exec…” + +Configure container networking +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In the container, create /etc/netplan/50-cloud-init.yaml: + +:: + + network: + version: 2 + ethernets: + eth0: + dhcp4: true + +Use “cat > /etc/netplan/50-cloud-init.yaml”,
and cut-’n-paste if your +favorite text editor is AWOL. + +Apply the configuration: + +:: + + # netplan apply + +At this point, eth0 should have an ip address, and you should see a +default route with “route -n”. + +Configure apt +~~~~~~~~~~~~~ + +Again, in the container, set up /etc/apt/sources.list via cut-’n-paste +from a recently updated “focal fossa” host. Something like so: + +:: + + deb http://us.archive.ubuntu.com/ubuntu/ focal main restricted + deb http://us.archive.ubuntu.com/ubuntu/ focal-updates main restricted + deb http://us.archive.ubuntu.com/ubuntu/ focal universe + deb http://us.archive.ubuntu.com/ubuntu/ focal-updates universe + deb http://us.archive.ubuntu.com/ubuntu/ focal multiverse + deb http://us.archive.ubuntu.com/ubuntu/ focal-updates multiverse + deb http://us.archive.ubuntu.com/ubuntu/ focal-backports main restricted universe multiverse + deb http://security.ubuntu.com/ubuntu focal-security main restricted + deb http://security.ubuntu.com/ubuntu focal-security universe + deb http://security.ubuntu.com/ubuntu focal-security multiverse + +“apt-get update” and “apt-get install” should produce reasonable results. +Suggest “apt-get install make git”. + +At this point, you can use the “/scratch” sharepoint (or similar) to +execute “make install-dep install-ext-deps” to set up the container with +the vpp toolchain; proceed as desired. diff --git a/docs/usecases/Routing.rst b/docs/usecases/containers/Routing.rst index 31929d31603..b9d3bc97638 100644 --- a/docs/usecases/Routing.rst +++ b/docs/usecases/containers/Routing.rst @@ -10,7 +10,7 @@ Now for connecting these two linux containers to VPP and pinging between them. Enter container *cone*, and check the current network configuration: ..
code-block:: console - + root@cone:/# ip -o a 1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever 1: lo inet6 ::1/128 scope host \ valid_lft forever preferred_lft forever @@ -43,7 +43,7 @@ Check if the interfaces are down or up: Make sure your loopback interface is up, and assign an IP and gateway to veth_link1. .. code-block:: console - + root@cone:/# ip link set dev lo up root@cone:/# ip addr add 172.16.1.2/24 dev veth_link1 root@cone:/# ip link set dev veth_link1 up @@ -55,7 +55,7 @@ Here, the IP is 172.16.1.2/24 and the gateway is 172.16.1.1. Run some commands to verify the changes: .. code-block:: console - + root@cone:/# ip -o a 1: lo inet 127.0.0.1/8 scope host lo\ valid_lft forever preferred_lft forever 1: lo inet6 ::1/128 scope host \ valid_lft forever preferred_lft forever @@ -78,7 +78,7 @@ Now exit this container and repeat this process with container *ctwo*, except wi After that's done for *both* containers, exit from the container if you're in one: .. code-block:: console - + root@ctwo:/# exit exit root@localhost:~# @@ -86,7 +86,7 @@ After that's done for *both* containers, exit from the container if you're in on In the machine running the containers, run **ip link** to see the host *veth* network interfaces, and their link with their respective *container veth's*. .. code-block:: console - + root@localhost:~# ip link 1: lo: <LOOPBACK> mtu 65536 qdisc noqueue state DOWN mode DEFAULT group default qlen 1 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 @@ -113,29 +113,29 @@ Remember our network interface index 32 in *cone* from this :ref:`note <networkN With VPP in the host machine, show current VPP interfaces: .. code-block:: console - + root@localhost:~# vppctl show inter - Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count - local0 0 down 0/0/0/0 + Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count + local0 0 down 0/0/0/0 Which should only output local0. 
Based on the names of the network interfaces discussed previously, which are specific to my systems, we can create VPP host-interfaces: .. code-block:: console - + root@localhost:~# vppctl create host-interface name vethQL7K0C root@localhost:~# vppctl create host-interface name veth8NA72P Verify they have been set up properly: .. code-block:: console - + root@localhost:~# vppctl show inter - Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count - host-vethQL7K0C 1 down 9000/0/0/0 - host-veth8NA72P 2 down 9000/0/0/0 - local0 0 down 0/0/0/0 + Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count + host-vethQL7K0C 1 down 9000/0/0/0 + host-veth8NA72P 2 down 9000/0/0/0 + local0 0 down 0/0/0/0 Which should output *three network interfaces*, local0, and the other two host network interfaces linked to the container veth's. @@ -143,7 +143,7 @@ Which should output *three network interfaces*, local0, and the other two host n Set their state to up: .. code-block:: console - + root@localhost:~# vppctl set interface state host-vethQL7K0C up root@localhost:~# vppctl set interface state host-veth8NA72P up @@ -152,16 +152,16 @@ Verify they are now up: .. code-block:: console root@localhost:~# vppctl show inter - Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count - host-vethQL7K0C 1 up 9000/0/0/0 - host-veth8NA72P 2 up 9000/0/0/0 - local0 0 down 0/0/0/0 + Name Idx State MTU (L3/IP4/IP6/MPLS) Counter Count + host-vethQL7K0C 1 up 9000/0/0/0 + host-veth8NA72P 2 up 9000/0/0/0 + local0 0 down 0/0/0/0 Add IP addresses for the other end of each veth link: .. code-block:: console - + root@localhost:~# vppctl set interface ip address host-vethQL7K0C 172.16.1.1/24 root@localhost:~# vppctl set interface ip address host-veth8NA72P 172.16.2.1/24 @@ -180,7 +180,7 @@ Verify the addresses are set properly by looking at the L3 table: Or looking at the FIB by doing: .. 
code-block:: console - + root@localhost:~# vppctl show ip fib ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto ] locks:[src:plugin-hi:2, src:default-route:1, ] 0.0.0.0/0 @@ -239,7 +239,7 @@ Or looking at the FIB by doing: At long last you probably want to see some pings: .. code-block:: console - + root@localhost:~# lxc-attach -n cone -- ping -c3 172.16.2.2 PING 172.16.2.2 (172.16.2.2) 56(84) bytes of data. 64 bytes from 172.16.2.2: icmp_seq=1 ttl=63 time=0.102 ms @@ -263,4 +263,4 @@ At long last you probably want to see some pings: Which should send/receive three packets for each command. -This is the end of this guide. Great work! +This is the end of this guide. Great work! diff --git a/docs/usecases/containerCreation.rst b/docs/usecases/containers/containerCreation.rst index 9b2cc126133..bb116883e7d 100644 --- a/docs/usecases/containerCreation.rst +++ b/docs/usecases/containers/containerCreation.rst @@ -29,7 +29,7 @@ Since we want to ping between two containers, we'll need to **add to this file** Look at the contents of *default.conf*, which should initially look like this: .. code-block:: console - + # cat /etc/lxc/default.conf lxc.network.type = veth lxc.network.link = lxcbr0 @@ -40,7 +40,7 @@ As you can see, by default there is one veth interface. Now you will *append to this file* so that each container you create will have an interface for a Linux bridge and an unconsumed second interface. -You can do this by piping *echo* output into *tee*, where each line is separated with a newline character *\\n* as shown below. Alternatively, you can manually add to this file with a text editor such as **vi**, but make sure you have root privileges. +You can do this by piping *echo* output into *tee*, where each line is separated with a newline character *\\n* as shown below. Alternatively, you can manually add to this file with a text editor such as **vi**, but make sure you have root privileges. .. 
code-block:: console @@ -72,7 +72,7 @@ Creates an Ubuntu Xenial container named "cone". If successful, you'll get an output similar to this: .. code-block:: console - + You just created an Ubuntu xenial amd64 (20180625_07:42) container. To enable SSH, run: apt install openssh-server @@ -98,17 +98,17 @@ List your containers to verify they exist: Start the first container: .. code-block:: console - + # lxc-start --name cone And verify its running: .. code-block:: console - + # lxc-ls --fancy - NAME STATE AUTOSTART GROUPS IPV4 IPV6 - cone RUNNING 0 - - - - ctwo STOPPED 0 - - - + NAME STATE AUTOSTART GROUPS IPV4 IPV6 + cone RUNNING 0 - - - + ctwo STOPPED 0 - - - .. note:: diff --git a/docs/usecases/containerSetup.rst b/docs/usecases/containers/containerSetup.rst index e0fd81eebc3..8c458f77cfd 100644 --- a/docs/usecases/containerSetup.rst +++ b/docs/usecases/containers/containerSetup.rst @@ -3,21 +3,21 @@ .. toctree:: Container packages -__________________ +================== Now we can go into container *cone* and install prerequisites such as VPP, and perform some additional commands: To enter our container via the shell, type: .. code-block:: console - + # lxc-attach -n cone root@cone:/# -Run the linux DHCP setup and install VPP: +Run the linux DHCP setup and install VPP: .. code-block:: console - + root@cone:/# resolvconf -d eth0 root@cone:/# dhclient root@cone:/# apt-get install -y wget @@ -29,7 +29,7 @@ Run the linux DHCP setup and install VPP: After this is done, start VPP in this container: .. 
code-block:: console - + root@cone:/# service vpp start Exit this container with the **exit** command (you *may* need to run **exit** twice): diff --git a/docs/usecases/containers.rst b/docs/usecases/containers/index.rst index 65bf2aee5de..65bf2aee5de 100644 --- a/docs/usecases/containers.rst +++ b/docs/usecases/containers/index.rst diff --git a/docs/usecases/contiv/BUG_REPORTS.md b/docs/usecases/contiv/BUG_REPORTS.md deleted file mode 100644 index 5b9c3cf4875..00000000000 --- a/docs/usecases/contiv/BUG_REPORTS.md +++ /dev/null @@ -1,333 +0,0 @@ -# Debugging and Reporting Bugs in Contiv-VPP - -## Bug Report Structure - -- [Deployment description](#describe-deployment): -Briefly describes the deployment, where an issue was spotted, -number of k8s nodes, is DHCP/STN/TAP used. - -- [Logs](#collecting-the-logs): -Attach corresponding logs, at least from the vswitch pods. - -- [VPP config](#inspect-vpp-config): -Attach output of the show commands. - -- [Basic Collection Example](#basic-example) - -### Describe Deployment -Since contiv-vpp can be used with different configurations, it is helpful -to attach the config that was applied. Either attach `values.yaml` passed to the helm chart, -or attach the [corresponding part](https://github.com/contiv/vpp/blob/42b3bfbe8735508667b1e7f1928109a65dfd5261/k8s/contiv-vpp.yaml#L24-L38) from the deployment yaml file. 
- -``` - contiv.yaml: |- - TCPstackDisabled: true - UseTAPInterfaces: true - TAPInterfaceVersion: 2 - NatExternalTraffic: true - MTUSize: 1500 - IPAMConfig: - PodSubnetCIDR: 10.1.0.0/16 - PodNetworkPrefixLen: 24 - PodIfIPCIDR: 10.2.1.0/24 - VPPHostSubnetCIDR: 172.30.0.0/16 - VPPHostNetworkPrefixLen: 24 - NodeInterconnectCIDR: 192.168.16.0/24 - VxlanCIDR: 192.168.30.0/24 - NodeInterconnectDHCP: False -``` - -Information that might be helpful: - - Whether node IPs are statically assigned, or if DHCP is used - - STN is enabled - - Version of TAP interfaces used - - Output of `kubectl get pods -o wide --all-namespaces` - - -### Collecting the Logs - -The most essential thing that needs to be done when debugging and **reporting an issue** -in Contiv-VPP is **collecting the logs from the contiv-vpp vswitch containers**. - -#### a) Collecting Vswitch Logs Using kubectl -In order to collect the logs from individual vswitches in the cluster, connect to the master node -and then find the POD names of the individual vswitch containers: - -``` -$ kubectl get pods --all-namespaces | grep vswitch -kube-system contiv-vswitch-lqxfp 2/2 Running 0 1h -kube-system contiv-vswitch-q6kwt 2/2 Running 0 1h -``` - -Then run the following command, with *pod name* replaced by the actual POD name: -``` -$ kubectl logs <pod name> -n kube-system -c contiv-vswitch -``` - -Redirect the output to a file to save the logs, for example: - -``` -kubectl logs contiv-vswitch-lqxfp -n kube-system -c contiv-vswitch > logs-master.txt -``` - -#### b) Collecting Vswitch Logs Using Docker -If option a) does not work, then you can still collect the same logs using the plain docker -command. 
For that, you need to connect to each individual node in the k8s cluster, and find the container ID of the vswitch container: - -``` -$ docker ps | grep contivvpp/vswitch -b682b5837e52 contivvpp/vswitch "/usr/bin/supervisor…" 2 hours ago Up 2 hours k8s_contiv-vswitch_contiv-vswitch-q6kwt_kube-system_d09b6210-2903-11e8-b6c9-08002723b076_0 -``` - -Now use the ID from the first column to dump the logs into the `logs-master.txt` file: -``` -$ docker logs b682b5837e52 > logs-master.txt -``` - -#### Reviewing the Vswitch Logs - -In order to debug an issue, it is good to start by grepping the logs for the `level=error` string, for example: -``` -$ cat logs-master.txt | grep level=error -``` - -Also, VPP or contiv-agent may crash with some bugs. To check if some process crashed, grep for the string `exit`, for example: -``` -$ cat logs-master.txt | grep exit -2018-03-20 06:03:45,948 INFO exited: vpp (terminated by SIGABRT (core dumped); not expected) -2018-03-20 06:03:48,948 WARN received SIGTERM indicating exit request -``` - -#### Collecting the STN Daemon Logs -In STN (Steal The NIC) deployment scenarios, often need to collect and review the logs -from the STN daemon. This needs to be done on each node: -``` -$ docker logs contiv-stn > logs-stn-master.txt -``` - -#### Collecting Logs in Case of Crash Loop -If the vswitch is crashing in a loop (which can be determined by increasing the number in the `RESTARTS` -column of the `kubectl get pods --all-namespaces` output), the `kubectl logs` or `docker logs` would -give us the logs of the latest incarnation of the vswitch. That might not be the original root cause -of the very first crash, so in order to debug that, we need to disable k8s health check probes to not -restart the vswitch after the very first crash. 
This can be done by commenting-out the `readinessProbe` -and `livenessProbe` in the contiv-vpp deployment YAML: - -```diff -diff --git a/k8s/contiv-vpp.yaml b/k8s/contiv-vpp.yaml -index 3676047..ffa4473 100644 ---- a/k8s/contiv-vpp.yaml -+++ b/k8s/contiv-vpp.yaml -@@ -224,18 +224,18 @@ spec: - ports: - # readiness + liveness probe - - containerPort: 9999 -- readinessProbe: -- httpGet: -- path: /readiness -- port: 9999 -- periodSeconds: 1 -- initialDelaySeconds: 15 -- livenessProbe: -- httpGet: -- path: /liveness -- port: 9999 -- periodSeconds: 1 -- initialDelaySeconds: 60 -+ # readinessProbe: -+ # httpGet: -+ # path: /readiness -+ # port: 9999 -+ # periodSeconds: 1 -+ # initialDelaySeconds: 15 -+ # livenessProbe: -+ # httpGet: -+ # path: /liveness -+ # port: 9999 -+ # periodSeconds: 1 -+ # initialDelaySeconds: 60 - env: - - name: MICROSERVICE_LABEL - valueFrom: -``` - -If VPP is the crashing process, please follow the \[CORE_FILES\](CORE_FILES.html) guide and provide the coredump file. - - -### Inspect VPP Config -Inspect the following areas: -- Configured interfaces (issues related basic node/pod connectivity issues): -``` -vpp# sh int addr -GigabitEthernet0/9/0 (up): - 192.168.16.1/24 -local0 (dn): -loop0 (up): - l2 bridge bd_id 1 bvi shg 0 - 192.168.30.1/24 -tapcli-0 (up): - 172.30.1.1/24 -``` - -- IP forwarding table: -``` -vpp# sh ip fib -ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto ] locks:[src:(nil):2, src:adjacency:3, src:default-route:1, ] -0.0.0.0/0 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:1 buckets:1 uRPF:0 to:[7:552]] - [0] [@0]: dpo-drop ip4 -0.0.0.0/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:2 buckets:1 uRPF:1 to:[0:0]] - [0] [@0]: dpo-drop ip4 - -... -... 
- -255.255.255.255/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:5 buckets:1 uRPF:4 to:[0:0]] - [0] [@0]: dpo-drop ip4 -``` -- ARP Table: -``` -vpp# sh ip arp - Time IP4 Flags Ethernet Interface - 728.6616 192.168.16.2 D 08:00:27:9c:0e:9f GigabitEthernet0/8/0 - 542.7045 192.168.30.2 S 1a:2b:3c:4d:5e:02 loop0 - 1.4241 172.30.1.2 D 86:41:d5:92:fd:24 tapcli-0 - 15.2485 10.1.1.2 SN 00:00:00:00:00:02 tapcli-1 - 739.2339 10.1.1.3 SN 00:00:00:00:00:02 tapcli-2 - 739.4119 10.1.1.4 SN 00:00:00:00:00:02 tapcli-3 -``` -- NAT configuration (issues related to services): -``` -DBGvpp# sh nat44 addresses -NAT44 pool addresses: -192.168.16.10 - tenant VRF independent - 0 busy udp ports - 0 busy tcp ports - 0 busy icmp ports -NAT44 twice-nat pool addresses: -``` -``` -vpp# sh nat44 static mappings -NAT44 static mappings: - tcp local 192.168.42.1:6443 external 10.96.0.1:443 vrf 0 out2in-only - tcp local 192.168.42.1:12379 external 192.168.42.2:32379 vrf 0 out2in-only - tcp local 192.168.42.1:12379 external 192.168.16.2:32379 vrf 0 out2in-only - tcp local 192.168.42.1:12379 external 192.168.42.1:32379 vrf 0 out2in-only - tcp local 192.168.42.1:12379 external 192.168.16.1:32379 vrf 0 out2in-only - tcp local 192.168.42.1:12379 external 10.109.143.39:12379 vrf 0 out2in-only - udp local 10.1.2.2:53 external 10.96.0.10:53 vrf 0 out2in-only - tcp local 10.1.2.2:53 external 10.96.0.10:53 vrf 0 out2in-only -``` -``` -vpp# sh nat44 interfaces -NAT44 interfaces: - loop0 in out - GigabitEthernet0/9/0 out - tapcli-0 in out -``` -``` -vpp# sh nat44 sessions -NAT44 sessions: - 192.168.20.2: 0 dynamic translations, 3 static translations - 10.1.1.3: 0 dynamic translations, 0 static translations - 10.1.1.4: 0 dynamic translations, 0 static translations - 10.1.1.2: 0 dynamic translations, 6 static translations - 10.1.2.18: 0 dynamic translations, 2 static translations -``` -- ACL config (issues related to policies): -``` -vpp# sh acl-plugin acl -``` -- "Steal the NIC (STN)" config 
(issues related to host connectivity when STN is active): -``` -vpp# sh stn rules -- rule_index: 0 - address: 10.1.10.47 - iface: tapcli-0 (2) - next_node: tapcli-0-output (410) -``` -- Errors: -``` -vpp# sh errors -``` -- Vxlan tunnels: -``` -vpp# sh vxlan tunnels -``` -- Vxlan tunnels: -``` -vpp# sh vxlan tunnels -``` -- Hardware interface information: -``` -vpp# sh hardware-interfaces -``` - -### Basic Example - -[contiv-vpp-bug-report.sh][1] is an example of a script that may be a useful starting point to gathering the above information using kubectl. - -Limitations: -- The script does not include STN daemon logs nor does it handle the special - case of a crash loop - -Prerequisites: -- The user specified in the script must have passwordless access to all nodes - in the cluster; on each node in the cluster the user must have passwordless - access to sudo. - -#### Setting up Prerequisites -To enable logging into a node without a password, copy your public key to the following -node: -``` -ssh-copy-id <user-id>@<node-name-or-ip-address> -``` - -To enable running sudo without a password for a given user, enter: -``` -$ sudo visudo -``` - -Append the following entry to run ALL command without a password for a given -user: -``` -<userid> ALL=(ALL) NOPASSWD:ALL -``` - -You can also add user `<user-id>` to group `sudo` and edit the `sudo` -entry as follows: - -``` -# Allow members of group sudo to execute any command -%sudo ALL=(ALL:ALL) NOPASSWD:ALL -``` - -Add user `<user-id>` to group `<group-id>` as follows: -``` -sudo adduser <user-id> <group-id> -``` -or as follows: -``` -usermod -a -G <group-id> <user-id> -``` -#### Working with the Contiv-VPP Vagrant Test Bed -The script can be used to collect data from the [Contiv-VPP test bed created with Vagrant][2]. 
-To collect debug information from this Contiv-VPP test bed, do the -following steps: -* In the directory where you created your vagrant test bed, do: -``` - vagrant ssh-config > vagrant-ssh.conf -``` -* To collect the debug information do: -``` - ./contiv-vpp-bug-report.sh -u vagrant -m k8s-master -f <path-to-your-vagrant-ssh-config-file>/vagrant-ssh.conf -``` - -[1]: https://github.com/contiv/vpp/tree/master/scripts/contiv-vpp-bug-report.sh -[2]: https://github.com/contiv/vpp/blob/master/vagrant/README.md diff --git a/docs/usecases/contiv/BUG_REPORTS.rst b/docs/usecases/contiv/BUG_REPORTS.rst new file mode 100644 index 00000000000..8e55d5b3c8d --- /dev/null +++ b/docs/usecases/contiv/BUG_REPORTS.rst @@ -0,0 +1,401 @@ +Debugging and Reporting Bugs in Contiv-VPP +========================================== + +Bug Report Structure +-------------------- + +- `Deployment description <#describe-deployment>`__: Briefly describes + the deployment, where an issue was spotted, number of k8s nodes, is + DHCP/STN/TAP used. + +- `Logs <#collecting-the-logs>`__: Attach corresponding logs, at least + from the vswitch pods. + +- `VPP config <#inspect-vpp-config>`__: Attach output of the show + commands. + +- `Basic Collection Example <#basic-example>`__ + +Describe Deployment +~~~~~~~~~~~~~~~~~~~ + +Since contiv-vpp can be used with different configurations, it is +helpful to attach the config that was applied. Either attach +``values.yaml`` passed to the helm chart, or attach the `corresponding +part <https://github.com/contiv/vpp/blob/42b3bfbe8735508667b1e7f1928109a65dfd5261/k8s/contiv-vpp.yaml#L24-L38>`__ +from the deployment yaml file. + +.. 
code:: yaml + + contiv.yaml: |- + TCPstackDisabled: true + UseTAPInterfaces: true + TAPInterfaceVersion: 2 + NatExternalTraffic: true + MTUSize: 1500 + IPAMConfig: + PodSubnetCIDR: 10.1.0.0/16 + PodNetworkPrefixLen: 24 + PodIfIPCIDR: 10.2.1.0/24 + VPPHostSubnetCIDR: 172.30.0.0/16 + VPPHostNetworkPrefixLen: 24 + NodeInterconnectCIDR: 192.168.16.0/24 + VxlanCIDR: 192.168.30.0/24 + NodeInterconnectDHCP: False + +Information that might be helpful: - Whether node IPs are statically +assigned, or if DHCP is used - STN is enabled - Version of TAP +interfaces used - Output of +``kubectl get pods -o wide --all-namespaces`` + +Collecting the Logs +~~~~~~~~~~~~~~~~~~~ + +The most essential thing that needs to be done when debugging and +**reporting an issue** in Contiv-VPP is **collecting the logs from the +contiv-vpp vswitch containers**. + +a) Collecting Vswitch Logs Using kubectl +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In order to collect the logs from individual vswitches in the cluster, +connect to the master node and then find the POD names of the individual +vswitch containers: + +:: + + $ kubectl get pods --all-namespaces | grep vswitch + kube-system contiv-vswitch-lqxfp 2/2 Running 0 1h + kube-system contiv-vswitch-q6kwt 2/2 Running 0 1h + +Then run the following command, with *pod name* replaced by the actual +POD name: + +:: + + $ kubectl logs <pod name> -n kube-system -c contiv-vswitch + +Redirect the output to a file to save the logs, for example: + +:: + + kubectl logs contiv-vswitch-lqxfp -n kube-system -c contiv-vswitch > logs-master.txt + +b) Collecting Vswitch Logs Using Docker +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If option a) does not work, then you can still collect the same logs +using the plain docker command. 
For that, you need to connect to each +individual node in the k8s cluster, and find the container ID of the +vswitch container: + +:: + + $ docker ps | grep contivvpp/vswitch + b682b5837e52 contivvpp/vswitch "/usr/bin/supervisor…" 2 hours ago Up 2 hours k8s_contiv-vswitch_contiv-vswitch-q6kwt_kube-system_d09b6210-2903-11e8-b6c9-08002723b076_0 + +Now use the ID from the first column to dump the logs into the +``logs-master.txt`` file: + +:: + + $ docker logs b682b5837e52 > logs-master.txt + +Reviewing the Vswitch Logs +^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In order to debug an issue, it is good to start by grepping the logs for +the ``level=error`` string, for example: + +:: + + $ cat logs-master.txt | grep level=error + +Also, VPP or contiv-agent may crash with some bugs. To check if some +process crashed, grep for the string ``exit``, for example: + +:: + + $ cat logs-master.txt | grep exit + 2018-03-20 06:03:45,948 INFO exited: vpp (terminated by SIGABRT (core dumped); not expected) + 2018-03-20 06:03:48,948 WARN received SIGTERM indicating exit request + +Collecting the STN Daemon Logs +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In STN (Steal The NIC) deployment scenarios, often need to collect and +review the logs from the STN daemon. This needs to be done on each node: + +:: + + $ docker logs contiv-stn > logs-stn-master.txt + +Collecting Logs in Case of Crash Loop +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +If the vswitch is crashing in a loop (which can be determined by +increasing the number in the ``RESTARTS`` column of the +``kubectl get pods --all-namespaces`` output), the ``kubectl logs`` or +``docker logs`` would give us the logs of the latest incarnation of the +vswitch. That might not be the original root cause of the very first +crash, so in order to debug that, we need to disable k8s health check +probes to not restart the vswitch after the very first crash. 
This can +be done by commenting-out the ``readinessProbe`` and ``livenessProbe`` +in the contiv-vpp deployment YAML: + +.. code:: diff + + diff --git a/k8s/contiv-vpp.yaml b/k8s/contiv-vpp.yaml + index 3676047..ffa4473 100644 + --- a/k8s/contiv-vpp.yaml + +++ b/k8s/contiv-vpp.yaml + @@ -224,18 +224,18 @@ spec: + ports: + # readiness + liveness probe + - containerPort: 9999 + - readinessProbe: + - httpGet: + - path: /readiness + - port: 9999 + - periodSeconds: 1 + - initialDelaySeconds: 15 + - livenessProbe: + - httpGet: + - path: /liveness + - port: 9999 + - periodSeconds: 1 + - initialDelaySeconds: 60 + + # readinessProbe: + + # httpGet: + + # path: /readiness + + # port: 9999 + + # periodSeconds: 1 + + # initialDelaySeconds: 15 + + # livenessProbe: + + # httpGet: + + # path: /liveness + + # port: 9999 + + # periodSeconds: 1 + + # initialDelaySeconds: 60 + env: + - name: MICROSERVICE_LABEL + valueFrom: + +If VPP is the crashing process, please follow the +[CORE_FILES](CORE_FILES.html) guide and provide the coredump file. + +Inspect VPP Config +~~~~~~~~~~~~~~~~~~ + +Inspect the following areas: - Configured interfaces (issues related +basic node/pod connectivity issues): + +:: + + vpp# sh int addr + GigabitEthernet0/9/0 (up): + 192.168.16.1/24 + local0 (dn): + loop0 (up): + l2 bridge bd_id 1 bvi shg 0 + 192.168.30.1/24 + tapcli-0 (up): + 172.30.1.1/24 + +- IP forwarding table: + +:: + + vpp# sh ip fib + ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto ] locks:[src:(nil):2, src:adjacency:3, src:default-route:1, ] + 0.0.0.0/0 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:1 buckets:1 uRPF:0 to:[7:552]] + [0] [@0]: dpo-drop ip4 + 0.0.0.0/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:2 buckets:1 uRPF:1 to:[0:0]] + [0] [@0]: dpo-drop ip4 + + ... + ... 
+ + 255.255.255.255/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:5 buckets:1 uRPF:4 to:[0:0]] + [0] [@0]: dpo-drop ip4 + +- ARP Table: + +:: + + vpp# sh ip arp + Time IP4 Flags Ethernet Interface + 728.6616 192.168.16.2 D 08:00:27:9c:0e:9f GigabitEthernet0/8/0 + 542.7045 192.168.30.2 S 1a:2b:3c:4d:5e:02 loop0 + 1.4241 172.30.1.2 D 86:41:d5:92:fd:24 tapcli-0 + 15.2485 10.1.1.2 SN 00:00:00:00:00:02 tapcli-1 + 739.2339 10.1.1.3 SN 00:00:00:00:00:02 tapcli-2 + 739.4119 10.1.1.4 SN 00:00:00:00:00:02 tapcli-3 + +- NAT configuration (issues related to services): + +:: + + DBGvpp# sh nat44 addresses + NAT44 pool addresses: + 192.168.16.10 + tenant VRF independent + 0 busy udp ports + 0 busy tcp ports + 0 busy icmp ports + NAT44 twice-nat pool addresses: + +:: + + vpp# sh nat44 static mappings + NAT44 static mappings: + tcp local 192.168.42.1:6443 external 10.96.0.1:443 vrf 0 out2in-only + tcp local 192.168.42.1:12379 external 192.168.42.2:32379 vrf 0 out2in-only + tcp local 192.168.42.1:12379 external 192.168.16.2:32379 vrf 0 out2in-only + tcp local 192.168.42.1:12379 external 192.168.42.1:32379 vrf 0 out2in-only + tcp local 192.168.42.1:12379 external 192.168.16.1:32379 vrf 0 out2in-only + tcp local 192.168.42.1:12379 external 10.109.143.39:12379 vrf 0 out2in-only + udp local 10.1.2.2:53 external 10.96.0.10:53 vrf 0 out2in-only + tcp local 10.1.2.2:53 external 10.96.0.10:53 vrf 0 out2in-only + +:: + + vpp# sh nat44 interfaces + NAT44 interfaces: + loop0 in out + GigabitEthernet0/9/0 out + tapcli-0 in out + +:: + + vpp# sh nat44 sessions + NAT44 sessions: + 192.168.20.2: 0 dynamic translations, 3 static translations + 10.1.1.3: 0 dynamic translations, 0 static translations + 10.1.1.4: 0 dynamic translations, 0 static translations + 10.1.1.2: 0 dynamic translations, 6 static translations + 10.1.2.18: 0 dynamic translations, 2 static translations + +- ACL config (issues related to policies): + +:: + + vpp# sh acl-plugin acl + +- “Steal the NIC (STN)” 
config (issues related to host connectivity + when STN is active): + +:: + + vpp# sh stn rules + - rule_index: 0 + address: 10.1.10.47 + iface: tapcli-0 (2) + next_node: tapcli-0-output (410) + +- Errors: + +:: + + vpp# sh errors + +- Vxlan tunnels: + +:: + + vpp# sh vxlan tunnels + +- Vxlan tunnels: + +:: + + vpp# sh vxlan tunnels + +- Hardware interface information: + +:: + + vpp# sh hardware-interfaces + +Basic Example +~~~~~~~~~~~~~ + +`contiv-vpp-bug-report.sh <https://github.com/contiv/vpp/tree/master/scripts/contiv-vpp-bug-report.sh>`__ +is an example of a script that may be a useful starting point to +gathering the above information using kubectl. + +Limitations: - The script does not include STN daemon logs nor does it +handle the special case of a crash loop + +Prerequisites: - The user specified in the script must have passwordless +access to all nodes in the cluster; on each node in the cluster the user +must have passwordless access to sudo. + +Setting up Prerequisites +^^^^^^^^^^^^^^^^^^^^^^^^ + +To enable logging into a node without a password, copy your public key +to the following node: + +:: + + ssh-copy-id <user-id>@<node-name-or-ip-address> + +To enable running sudo without a password for a given user, enter: + +:: + + $ sudo visudo + +Append the following entry to run ALL command without a password for a +given user: + +:: + + <userid> ALL=(ALL) NOPASSWD:ALL + +You can also add user ``<user-id>`` to group ``sudo`` and edit the +``sudo`` entry as follows: + +:: + + # Allow members of group sudo to execute any command + %sudo ALL=(ALL:ALL) NOPASSWD:ALL + +Add user ``<user-id>`` to group ``<group-id>`` as follows: + +:: + + sudo adduser <user-id> <group-id> + +or as follows: + +:: + + usermod -a -G <group-id> <user-id> + +Working with the Contiv-VPP Vagrant Test Bed +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The script can be used to collect data from the `Contiv-VPP test bed +created with +Vagrant 
<https://github.com/contiv/vpp/blob/master/vagrant/README.md>`__. +To collect debug information from this Contiv-VPP test bed, do the +following steps: \* In the directory where you created your vagrant test +bed, do: + +:: + + vagrant ssh-config > vagrant-ssh.conf + +- To collect the debug information do: + +:: + + ./contiv-vpp-bug-report.sh -u vagrant -m k8s-master -f <path-to-your-vagrant-ssh-config-file>/vagrant-ssh.conf diff --git a/docs/usecases/contiv/CORE_FILES.md b/docs/usecases/contiv/CORE_FILES.md deleted file mode 100644 index 5d269cd1504..00000000000 --- a/docs/usecases/contiv/CORE_FILES.md +++ /dev/null @@ -1,73 +0,0 @@ -# Capturing VPP core dumps -In order to debug a crash of VPP, it is required to provide a coredump file, which allows backtracing of the VPP issue. The following items are the requirements for capturing a coredump: - -#### 1. Disable k8s Probes to Prevent k8s from Restarting the POD with a Crashed VPP -As described in [BUG_REPORTS.md](BUG_REPORTS.html#collecting-the-logs-in-case-of-crash-loop). - -#### 2. Modify VPP Startup config file -In `/etc/vpp/contiv-vswitch.conf`, add the following lines into the `unix` section: - -``` -unix { - ... - coredump-size unlimited - full-coredump -} -``` - -#### 3. Turn on Coredumps in the Vswitch Container -After re-deploying Contiv-VPP networking, enter bash shell in the vswitch -container (use actual name of the vswitch POD - `contiv-vswitch-7whk7` in this case): -``` -kubectl exec -it contiv-vswitch-7whk7 -n kube-system -c contiv-vswitch bash -``` - -Enable coredumps: -``` -mkdir -p /tmp/dumps -sysctl -w debug.exception-trace=1 -sysctl -w kernel.core_pattern="/tmp/dumps/%e-%t" -ulimit -c unlimited -echo 2 > /proc/sys/fs/suid_dumpable -``` - -#### 4. Let VPP Crash -Now repeat the steps that lead to the VPP crash. You can also force VPP to crash at the point where it is -running (e.g., if it is stuck) by using the SIGQUIT signal: -``` -kill -3 `pidof vpp` -``` - -#### 5. 
Locate and Inspect the Core File -The core file should appear in `/tmp/dumps` in the container: -``` -cd /tmp/dumps -ls -vpp_main-1524124440 -``` - -You can try to backtrace, after installing gdb: -``` -apt-get update && apt-get install gdb -gdb vpp vpp_main-1524124440 -(gdb) bt -``` - -#### 6. Copy the Core File Out of the Container -Finally, copy the core file out of the container. First, while still inside the container, -pack the core file into an archive: - -``` -cd /tmp/dumps -tar cvzf vppdump.tar.gz vpp_main-1524124440 -``` - -Now, on the host, determine the docker ID of the container, and then copy the file out of the host: -``` -docker ps | grep vswitch_contiv -d7aceb2e4876 c43a70ac3d01 "/usr/bin/supervisor…" 25 minutes ago Up 25 minutes k8s_contiv-vswitch_contiv-vswitch-zqzn6_kube-system_9923952f-43a6-11e8-be84-080027de08ea_0 - -docker cp d7aceb2e4876:/tmp/dumps/vppdump.tar.gz . -``` - -Now you are ready to file a bug in [jira.fd.io](https://jira.fd.io/) and attach the core file.
\ No newline at end of file diff --git a/docs/usecases/contiv/CORE_FILES.rst b/docs/usecases/contiv/CORE_FILES.rst new file mode 100644 index 00000000000..188884827dd --- /dev/null +++ b/docs/usecases/contiv/CORE_FILES.rst @@ -0,0 +1,101 @@ +Capturing VPP core dumps +======================== + +In order to debug a crash of VPP, it is required to provide a coredump +file, which allows backtracing of the VPP issue. The following items are +the requirements for capturing a coredump: + +1. Disable k8s Probes to Prevent k8s from Restarting the POD with a Crashed VPP +------------------------------------------------------------------------------- + +As described in +`BUG_REPORTS.md <BUG_REPORTS.html#collecting-the-logs-in-case-of-crash-loop>`__. + +2. Modify VPP Startup config file +--------------------------------- + +In ``/etc/vpp/contiv-vswitch.conf``, add the following lines into the +``unix`` section: + +:: + + unix { + ... + coredump-size unlimited + full-coredump + } + +3. Turn on Coredumps in the Vswitch Container +--------------------------------------------- + +After re-deploying Contiv-VPP networking, enter bash shell in the +vswitch container (use actual name of the vswitch POD - +``contiv-vswitch-7whk7`` in this case): + +:: + + kubectl exec -it contiv-vswitch-7whk7 -n kube-system -c contiv-vswitch bash + +Enable coredumps: + +:: + + mkdir -p /tmp/dumps + sysctl -w debug.exception-trace=1 + sysctl -w kernel.core_pattern="/tmp/dumps/%e-%t" + ulimit -c unlimited + echo 2 > /proc/sys/fs/suid_dumpable + +4. Let VPP Crash +---------------- + +Now repeat the steps that lead to the VPP crash. You can also force VPP +to crash at the point where it is running (e.g., if it is stuck) by +using the SIGQUIT signal: + +:: + + kill -3 `pidof vpp` + +5. 
Locate and Inspect the Core File +----------------------------------- + +The core file should appear in ``/tmp/dumps`` in the container: + +:: + + cd /tmp/dumps + ls + vpp_main-1524124440 + +You can try to backtrace, after installing gdb: + +:: + + apt-get update && apt-get install gdb + gdb vpp vpp_main-1524124440 + (gdb) bt + +6. Copy the Core File Out of the Container +------------------------------------------ + +Finally, copy the core file out of the container. First, while still +inside the container, pack the core file into an archive: + +:: + + cd /tmp/dumps + tar cvzf vppdump.tar.gz vpp_main-1524124440 + +Now, on the host, determine the docker ID of the container, and then +copy the file out of the host: + +:: + + docker ps | grep vswitch_contiv + d7aceb2e4876 c43a70ac3d01 "/usr/bin/supervisor…" 25 minutes ago Up 25 minutes k8s_contiv-vswitch_contiv-vswitch-zqzn6_kube-system_9923952f-43a6-11e8-be84-080027de08ea_0 + + docker cp d7aceb2e4876:/tmp/dumps/vppdump.tar.gz . + +Now you are ready to file a bug in `jira.fd.io <https://jira.fd.io/>`__ +and attach the core file. diff --git a/docs/usecases/contiv/CUSTOM_MGMT_NETWORK.md b/docs/usecases/contiv/CUSTOM_MGMT_NETWORK.md deleted file mode 100644 index bf2937f2016..00000000000 --- a/docs/usecases/contiv/CUSTOM_MGMT_NETWORK.md +++ /dev/null @@ -1,26 +0,0 @@ -### Setting Up a Custom Management Network on Multi-Homed Nodes - -If the interface you use for Kubernetes management traffic (for example, the -IP address used for `kubeadm join`) is not the one that contains the default -route out of the host, then you need to specify the management node IP address in -the Kubelet config file. 
Add the following line to: -(`/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`): -``` -Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --node-ip=<node-management-ip-address>" -``` -#### Example -Consider a 2 node deployment where each node is connected to 2 networks - -`10.0.2.0/24` and `192.168.56.0/24`, and the default route on each node points -to the interface connected to the `10.0.2.0/24` subnet. We want to use subnet -`192.168.56.0/24` for Kubernetes management traffic. Assume the addresses of -nodes connected to `192.168.56.0/24` are `192.168.56.105` and `192.168.56.106`. - -On the `192.168.56.105` node you add the following line to `10-kubeadm.conf`: -``` -Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --node-ip=192.168.56.105" -``` -On the `192.168.56.106` node you add the following line to `10-kubeadm.conf`: -``` -Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --node-ip=192.168.56.106" -``` - diff --git a/docs/usecases/contiv/CUSTOM_MGMT_NETWORK.rst b/docs/usecases/contiv/CUSTOM_MGMT_NETWORK.rst new file mode 100644 index 00000000000..b8cf2e6dd86 --- /dev/null +++ b/docs/usecases/contiv/CUSTOM_MGMT_NETWORK.rst @@ -0,0 +1,36 @@ +Setting Up a Custom Management Network on Multi-Homed Nodes +=========================================================== + +If the interface you use for Kubernetes management traffic (for example, +the IP address used for ``kubeadm join``) is not the one that contains +the default route out of the host, then you need to specify the +management node IP address in the Kubelet config file. 
Add the following +line to: (``/etc/systemd/system/kubelet.service.d/10-kubeadm.conf``): + +:: + + Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --node-ip=<node-management-ip-address>" + +Example +------- + +Consider a 2 node deployment where each node is connected to 2 networks +- ``10.0.2.0/24`` and ``192.168.56.0/24``, and the default route on each +node points to the interface connected to the ``10.0.2.0/24`` subnet. We +want to use subnet ``192.168.56.0/24`` for Kubernetes management +traffic. Assume the addresses of nodes connected to ``192.168.56.0/24`` +are ``192.168.56.105`` and ``192.168.56.106``. + +On the ``192.168.56.105`` node you add the following line to +``10-kubeadm.conf``: + +:: + + Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --node-ip=192.168.56.105" + +On the ``192.168.56.106`` node you add the following line to +``10-kubeadm.conf``: + +:: + + Environment="KUBELET_EXTRA_ARGS=--fail-swap-on=false --node-ip=192.168.56.106" diff --git a/docs/usecases/contiv/K8s_Overview.md b/docs/usecases/contiv/K8s_Overview.md deleted file mode 100644 index 69f144b2953..00000000000 --- a/docs/usecases/contiv/K8s_Overview.md +++ /dev/null @@ -1,109 +0,0 @@ -# Contiv/VPP Kubernetes Network Plugin - - -## Overview - -Kubernetes is a container orchestration system that efficiently manages Docker containers. The Docker containers and container platforms provide many advantages over traditional virtualization. Container isolation is done on the kernel level, which eliminates the need for a guest virtual operating system, and therefore makes containers much more efficient, faster, and lightweight. The containers in Contiv/VPP are referred to as PODs. - -Contiv/VPP is a Kubernetes network plugin that uses [FD.io VPP](https://fd.io/) -to provide network connectivity between PODs in a k8s cluster (k8s is an abbreviated reference for kubernetes). 
-It deploys itself as a set of system PODs in the `kube-system` namespace, -some of them (`contiv-ksr`, `contiv-etcd`) on the master node, and some -of them (`contiv-cni`, `contiv-vswitch`, `contiv-stn`) on each node in the cluster. - -Contiv/VPP is fully integrated with k8s via its components, -and it automatically reprograms itself upon each change in the cluster -via k8s API. - -The main component of the [VPP](https://fd.io/technology/#vpp) solution, which -runs within the `contiv-vswitch` POD on each node in the cluster. The VPP solution also provides -POD-to-POD connectivity across the nodes in the cluster, as well as host-to-POD -and outside-to-POD connectivity. This solution also leverages -VPP's fast data processing that runs completely in userspace, and uses -[DPDK](https://dpdk.org/) for fast access to the network IO layer. - -Kubernetes services and policies are also a part of the VPP configuration, -which means they are fully supported on VPP, without the need of forwarding -packets into the Linux network stack (Kube Proxy), which makes them very -effective and scalable. - - -## Architecture - -Contiv/VPP consists of several components, each of them packed and shipped as -a Docker container. Two of them deploy on Kubernetes master node only: - - - [Contiv KSR](#contiv-ksr) - - [Contiv ETCD](#contiv-etcd) - -The rest of them deploy on all nodes within the k8s cluster (including the master node): - -- [Contiv vSwitch](#contiv-vswitch) -- [Contiv CNI](#contiv-cni) -- [Contiv STN](#contiv-stn-daemon) - - -The following section briefly describes the individual Contiv components, which are displayed -as orange boxes on the picture below: - -![Contiv/VPP Architecture](../../_images/contiv-arch.png) - - -### Contiv KSR -Contiv KSR (Kubernetes State Reflector) is an agent that subscribes to k8s control plane, watches k8s resources and -propagates all relevant cluster-related information into the Contiv ETCD data store. 
-Other Contiv components do not access the k8s API directly, they subscribe to -Contiv ETCD instead. For more information on KSR, read the -[KSR Readme](https://github.com/contiv/vpp/blob/master/cmd/contiv-ksr/README.md). - - -### Contiv ETCD -Contiv/VPP uses its own instance of the ETCD database for storage of k8s cluster-related data -reflected by KSR, which are then accessed by Contiv vSwitch Agents running on -individual nodes. Apart from the data reflected by KSR, ETCD also stores persisted VPP -configuration of individual vswitches (mainly used to restore the operation after restarts), -as well as some more internal metadata. - - -### Contiv vSwitch -vSwitch is the main networking component that provides the connectivity to PODs. -It deploys on each node in the cluster, and consists of two main components packed -into a single Docker container: VPP and Contiv VPP Agent. - -**VPP** is the data plane software that provides the connectivity between PODs, host Linux -network stack, and data-plane NIC interface controlled by VPP: - - PODs are connected to VPP using TAP interfaces wired between VPP, and each POD network namespace. - - host network stack is connected to VPP using another TAP interface connected - to the main (default) network namespace. - - data-plane NIC is controlled directly by VPP using DPDK. Note, this means that - this interface is not visible to the host Linux network stack, and the node either needs another - management interface for k8s control plane communication, or - \[STN (Steal The NIC)\](SINGLE_NIC_SETUP.html) deployment must be applied. - -**Contiv VPP Agent** is the control plane part of the vSwitch container. It is responsible -for configuring the VPP according to the information gained from ETCD, and requests -from Contiv STN. It is based on the [Ligato VPP Agent](https://github.com/ligato/vpp-agent) code with extensions that are related to k8s. 
- -For communication with VPP, it uses VPP binary API messages sent via shared memory using -[GoVPP](https://wiki.fd.io/view/GoVPP). -For connection with Contiv STN, the agent acts as a GRPC server serving CNI requests -forwarded from the Contiv CNI. - -### Contiv CNI -Contiv CNI (Container Network Interface) is a simple binary that implements the -[Container Network Interface](https://github.com/containernetworking/cni) -API and is being executed by Kubelet upon POD creation and deletion. The CNI binary -just packs the request into a GRPC request and forwards it to the Contiv VPP Agent -running on the same node, which then processes it (wires/unwires the container) -and replies with a response, which is then forwarded back to Kubelet. - - -### Contiv STN Daemon -This section discusses how the Contiv \[STN (Steal The NIC)\](SINGLE_NIC_SETUP.html) daemon operation works. As already mentioned, the default setup of Contiv/VPP requires two network interfaces -per node: one controlled by VPP for data facing PODs, and one controlled by the host -network stack for k8s control plane communication. In case that your k8s nodes -do not provide two network interfaces, Contiv/VPP can work in the single NIC setup, -when the interface will be "stolen" from the host network stack just before starting -the VPP and configured with the same IP address on VPP, as well as -on the host-VPP interconnect TAP interface, as it had in the host before it. -For more information on STN setup, read the \[Single NIC Setup README\](./SINGLE_NIC_SETUP.html) diff --git a/docs/usecases/contiv/K8s_Overview.rst b/docs/usecases/contiv/K8s_Overview.rst new file mode 100644 index 00000000000..62e9e10926b --- /dev/null +++ b/docs/usecases/contiv/K8s_Overview.rst @@ -0,0 +1,144 @@ +Contiv/VPP Kubernetes Network Plugin +==================================== + +Overview +-------- + +Kubernetes is a container orchestration system that efficiently manages +Docker containers. 
The Docker containers and container platforms provide +many advantages over traditional virtualization. Container isolation is +done on the kernel level, which eliminates the need for a guest virtual +operating system, and therefore makes containers much more efficient, +faster, and lightweight. The containers in Contiv/VPP are referred to as +PODs. + +Contiv/VPP is a Kubernetes network plugin that uses `FD.io +VPP <https://fd.io/>`__ to provide network connectivity between PODs in +a k8s cluster (k8s is an abbreviated reference for kubernetes). It +deploys itself as a set of system PODs in the ``kube-system`` namespace, +some of them (``contiv-ksr``, ``contiv-etcd``) on the master node, and +some of them (``contiv-cni``, ``contiv-vswitch``, ``contiv-stn``) on +each node in the cluster. + +Contiv/VPP is fully integrated with k8s via its components, and it +automatically reprograms itself upon each change in the cluster via k8s +API. + +The main component of the `VPP <https://fd.io/technology/#vpp>`__ +solution, which runs within the ``contiv-vswitch`` POD on each node in +the cluster. The VPP solution also provides POD-to-POD connectivity +across the nodes in the cluster, as well as host-to-POD and +outside-to-POD connectivity. This solution also leverages VPP’s fast +data processing that runs completely in userspace, and uses +`DPDK <https://dpdk.org/>`__ for fast access to the network IO layer. + +Kubernetes services and policies are also a part of the VPP +configuration, which means they are fully supported on VPP, without the +need of forwarding packets into the Linux network stack (Kube Proxy), +which makes them very effective and scalable. + +Architecture +------------ + +Contiv/VPP consists of several components, each of them packed and +shipped as a Docker container. 
Two of them deploy on Kubernetes master +node only: + +- `Contiv KSR <#contiv-ksr>`__ +- `Contiv ETCD <#contiv-etcd>`__ + +The rest of them deploy on all nodes within the k8s cluster (including +the master node): + +- `Contiv vSwitch <#contiv-vswitch>`__ +- `Contiv CNI <#contiv-cni>`__ +- `Contiv STN <#contiv-stn-daemon>`__ + +The following section briefly describes the individual Contiv +components, which are displayed as orange boxes on the picture below: + +.. figure:: ../../_images/contiv-arch.png + :alt: Contiv/VPP Architecture + + Contiv/VPP Architecture + +Contiv KSR +~~~~~~~~~~ + +Contiv KSR (Kubernetes State Reflector) is an agent that subscribes to +k8s control plane, watches k8s resources and propagates all relevant +cluster-related information into the Contiv ETCD data store. Other +Contiv components do not access the k8s API directly, they subscribe to +Contiv ETCD instead. For more information on KSR, read the `KSR +Readme <https://github.com/contiv/vpp/blob/master/cmd/contiv-ksr/README.md>`__. + +Contiv ETCD +~~~~~~~~~~~ + +Contiv/VPP uses its own instance of the ETCD database for storage of k8s +cluster-related data reflected by KSR, which are then accessed by Contiv +vSwitch Agents running on individual nodes. Apart from the data +reflected by KSR, ETCD also stores persisted VPP configuration of +individual vswitches (mainly used to restore the operation after +restarts), as well as some more internal metadata. + +Contiv vSwitch +~~~~~~~~~~~~~~ + +vSwitch is the main networking component that provides the connectivity +to PODs. It deploys on each node in the cluster, and consists of two +main components packed into a single Docker container: VPP and Contiv +VPP Agent. + +**VPP** is the data plane software that provides the connectivity +between PODs, host Linux network stack, and data-plane NIC interface +controlled by VPP: + +- PODs are connected to VPP using TAP interfaces wired between VPP, and + each POD network namespace. 
+- host network stack is connected to VPP using another TAP interface + connected to the main (default) network namespace. +- data-plane NIC is controlled directly by VPP using DPDK. Note, this + means that this interface is not visible to the host Linux network + stack, and the node either needs another management interface for k8s + control plane communication, or [STN (Steal The + NIC)](SINGLE_NIC_SETUP.html) deployment must be applied. + +**Contiv VPP Agent** is the control plane part of the vSwitch container. +It is responsible for configuring the VPP according to the information +gained from ETCD, and requests from Contiv STN. It is based on the +`Ligato VPP Agent <https://github.com/ligato/vpp-agent>`__ code with +extensions that are related to k8s. + +For communication with VPP, it uses VPP binary API messages sent via +shared memory using `GoVPP <https://wiki.fd.io/view/GoVPP>`__. For +connection with Contiv STN, the agent acts as a GRPC server serving CNI +requests forwarded from the Contiv CNI. + +Contiv CNI +~~~~~~~~~~ + +Contiv CNI (Container Network Interface) is a simple binary that +implements the `Container Network +Interface <https://github.com/containernetworking/cni>`__ API and is +being executed by Kubelet upon POD creation and deletion. The CNI binary +just packs the request into a GRPC request and forwards it to the Contiv +VPP Agent running on the same node, which then processes it +(wires/unwires the container) and replies with a response, which is then +forwarded back to Kubelet. + +Contiv STN Daemon +~~~~~~~~~~~~~~~~~ + +This section discusses how the Contiv [STN (Steal The +NIC)](SINGLE_NIC_SETUP.html) daemon operation works. As already +mentioned, the default setup of Contiv/VPP requires two network +interfaces per node: one controlled by VPP for data facing PODs, and one +controlled by the host network stack for k8s control plane +communication. 
In case that your k8s nodes do not provide two network +interfaces, Contiv/VPP can work in the single NIC setup, when the +interface will be “stolen” from the host network stack just before +starting the VPP and configured with the same IP address on VPP, as well +as on the host-VPP interconnect TAP interface, as it had in the host +before it. For more information on STN setup, read the [Single NIC Setup +README](./SINGLE_NIC_SETUP.html) diff --git a/docs/usecases/contiv/MANUAL_INSTALL.md b/docs/usecases/contiv/MANUAL_INSTALL.md deleted file mode 100644 index 35506db0d16..00000000000 --- a/docs/usecases/contiv/MANUAL_INSTALL.md +++ /dev/null @@ -1,482 +0,0 @@ -# Manual Installation -This document describes how to clone the Contiv repository and then use [kubeadm][1] to manually install Kubernetes -with Contiv-VPP networking on one or more bare metal or VM hosts. - -## Clone the Contiv Repository -To clone the Contiv repository enter the following command: -``` -git clone https://github.com/contiv/vpp/<repository-name> -``` -**Note:** Replace *<repository-name>* with the name you want assigned to your cloned contiv repository. - -The cloned repository has important folders that contain content that are referenced in this Contiv documentation; those folders are noted below: -``` -vpp-contiv2$ ls -build build-root doxygen gmod LICENSE Makefile RELEASE.md src -build-data docs extras INFO.yaml MAINTAINERS README.md sphinx_venv test -``` -## Preparing Your Hosts - -### Host-specific Configurations -- **VmWare VMs**: the vmxnet3 driver is required on each interface that will - be used by VPP. Please see [here][13] for instructions how to install the - vmxnet3 driver on VmWare Fusion. 
- -### Setting up Network Adapter(s) -#### Setting up DPDK -DPDK setup must be completed **on each node** as follows: - -- Load the PCI UIO driver: - ``` - $ sudo modprobe uio_pci_generic - ``` - -- Verify that the PCI UIO driver has loaded successfully: - ``` - $ lsmod | grep uio - uio_pci_generic 16384 0 - uio 20480 1 uio_pci_generic - ``` - - Please note that this driver needs to be loaded upon each server bootup, - so you may want to add `uio_pci_generic` into the `/etc/modules` file, - or a file in the `/etc/modules-load.d/` directory. For example, the - `/etc/modules` file could look as follows: - ``` - # /etc/modules: kernel modules to load at boot time. - # - # This file contains the names of kernel modules that should be loaded - # at boot time, one per line. Lines beginning with "#" are ignored. - uio_pci_generic - ``` -#### Determining Network Adapter PCI Addresses -You need the PCI address of the network interface that VPP will use for the multi-node pod interconnect. On Debian-based -distributions, you can use `lshw`(*): - -``` -$ sudo lshw -class network -businfo -Bus info Device Class Description -==================================================== -pci@0000:00:03.0 ens3 network Virtio network device -pci@0000:00:04.0 ens4 network Virtio network device -``` -**Note:** On CentOS/RedHat/Fedora distributions, `lshw` may not be available by default, install it by issuing the following command: - ``` - yum -y install lshw - ``` - -#### Configuring vswitch to Use Network Adapters -Finally, you need to set up the vswitch to use the network adapters: - -- [Setup on a node with a single NIC][14] -- [Setup a node with multiple NICs][15] - -### Using a Node Setup Script -You can perform the above steps using the [node setup script][17]. - -## Installing Kubernetes with Contiv-VPP CNI plugin -After the nodes you will be using in your K8s cluster are prepared, you can -install the cluster using [kubeadm][1]. 
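The PCI address lookup described under "Determining Network Adapter PCI Addresses" can be scripted. The following is a minimal sketch, not part of Contiv-VPP: the function name `pci_addresses` is hypothetical, and the sample text is the `lshw -class network -businfo` listing shown above. It assumes every row has a device name in the second column; rows without one are skipped.

```python
import re

# Sample text copied from the lshw listing in this document. On a real host it
# would come from running `sudo lshw -class network -businfo`.
SAMPLE = """\
Bus info          Device      Class      Description
====================================================
pci@0000:00:03.0  ens3        network    Virtio network device
pci@0000:00:04.0  ens4        network    Virtio network device
"""


def pci_addresses(businfo_text):
    """Map each network device name to its PCI address (hypothetical helper)."""
    result = {}
    for line in businfo_text.splitlines():
        # Expect: pci@<address>  <device>  network  <description>
        match = re.match(r"pci@(\S+)\s+(\S+)\s+network\b", line)
        if match:
            address, device = match.groups()
            result[device] = address
    return result


for device, address in pci_addresses(SAMPLE).items():
    print(device, address)
```

The address of the interface chosen for VPP is what the single-NIC or multi-NIC setup guides expect in the vswitch configuration.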
- -### (1/4) Installing Kubeadm on Your Hosts -For first-time installation, see [Installing kubeadm][6]. To update an -existing installation, you should do a `apt-get update && apt-get upgrade` -or `yum update` to get the latest version of kubeadm. - -On each host with multiple NICs where the NIC that will be used for Kubernetes -management traffic is not the one pointed to by the default route out of the -host, a [custom management network][12] for Kubernetes must be configured. - -#### Using Kubernetes 1.10 and Above -In K8s 1.10, support for huge pages in a pod has been introduced. For now, this -feature must be either disabled or memory limit must be defined for vswitch container. - -To disable huge pages, perform the following -steps as root: -* Using your favorite editor, disable huge pages in the kubelet configuration - file (`/etc/systemd/system/kubelet.service.d/10-kubeadm.conf` or `/etc/default/kubelet` for version 1.11+): -``` - Environment="KUBELET_EXTRA_ARGS=--feature-gates HugePages=false" -``` -* Restart the kubelet daemon: -``` - systemctl daemon-reload - systemctl restart kubelet -``` - -To define memory limit, append the following snippet to vswitch container in deployment yaml file: -``` - resources: - limits: - hugepages-2Mi: 1024Mi - memory: 1024Mi - -``` -or set `contiv.vswitch.defineMemoryLimits` to `true` in [helm values](https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/README.md). - -### (2/4) Initializing Your Master -Before initializing the master, you may want to [remove][8] any -previously installed K8s components. Then, proceed with master initialization -as described in the [kubeadm manual][3]. Execute the following command as -root: -``` -kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16 -``` -**Note:** `kubeadm init` will autodetect the network interface to advertise -the master on as the interface with the default gateway. If you want to use a -different interface (i.e. 
a custom management network setup), specify the -`--apiserver-advertise-address=<ip-address>` argument to kubeadm init. For -example: -``` -kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16 --apiserver-advertise-address=192.168.56.106 -``` -**Note:** The CIDR specified with the flag `--pod-network-cidr` is used by -kube-proxy, and it **must include** the `PodSubnetCIDR` from the `IPAMConfig` -section in the Contiv-vpp config map in Contiv-vpp's deployment file -[contiv-vpp.yaml](https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/values.yaml). Pods in the host network namespace -are a special case; they share their respective interfaces and IP addresses with -the host. For proxying to work properly it is therefore required for services -with backends running on the host to also **include the node management IP** -within the `--pod-network-cidr` subnet. For example, with the default -`PodSubnetCIDR=10.1.0.0/16` and `PodIfIPCIDR=10.2.1.0/24`, the subnet -`10.3.0.0/16` could be allocated for the management network and -`--pod-network-cidr` could be defined as `10.0.0.0/8`, so as to include IP -addresses of all pods in all network namespaces: -``` -kubeadm init --token-ttl 0 --pod-network-cidr=10.0.0.0/8 --apiserver-advertise-address=10.3.1.1 -``` - -If Kubernetes was initialized successfully, it prints out this message: -``` -Your Kubernetes master has initialized successfully! 
-``` - -After successful initialization, don't forget to set up your .kube directory -as a regular user (as instructed by `kubeadm`): -```bash -mkdir -p $HOME/.kube -sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config -sudo chown $(id -u):$(id -g) $HOME/.kube/config -``` - -### (3/4) Installing the Contiv-VPP Pod Network -If you have already used the Contiv-VPP plugin before, you may need to pull -the most recent Docker images on each node: -``` -bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/pull-images.sh) -``` - -Install the Contiv-VPP network for your cluster as follows: - -- If you do not use the STN feature, install Contiv-vpp as follows: - ``` - kubectl apply -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml - ``` - -- If you use the STN feature, download the `contiv-vpp.yaml` file: - ``` - wget https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml - ``` - Then edit the STN configuration as described [here][16]. Finally, create - the Contiv-vpp deployment from the edited file: - ``` - kubectl apply -f ./contiv-vpp.yaml - ``` - -Beware contiv-etcd data is persisted in `/var/etcd` by default. It has to be cleaned up manually after `kubeadm reset`. -Otherwise outdated data will be loaded by a subsequent deployment. - -You can also generate random subfolder, alternatively: - -``` -curl --silent https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml | sed "s/\/var\/etcd\/contiv-data/\/var\/etcd\/contiv-data\/$RANDOM/g" | kubectl apply -f - -``` - -#### Deployment Verification -After some time, all contiv containers should enter the running state: -``` -root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv -NAME READY STATUS RESTARTS AGE IP NODE -... 
-contiv-etcd-gwc84 1/1 Running 0 14h 192.168.56.106 cvpp -contiv-ksr-5c2vk 1/1 Running 2 14h 192.168.56.106 cvpp -contiv-vswitch-l59nv 2/2 Running 0 14h 192.168.56.106 cvpp -``` -In particular, make sure that the Contiv-VPP pod IP addresses are the same as -the IP address specified in the `--apiserver-advertise-address=<ip-address>` -argument to kubeadm init. - -Verify that the VPP successfully grabbed the network interface specified in -the VPP startup config (`GigabitEthernet0/4/0` in our case): -``` -$ sudo vppctl -vpp# sh inter - Name Idx State Counter Count -GigabitEthernet0/4/0 1 up rx packets 1294 - rx bytes 153850 - tx packets 512 - tx bytes 21896 - drops 962 - ip4 1032 -host-40df9b44c3d42f4 3 up rx packets 126601 - rx bytes 44628849 - tx packets 132155 - tx bytes 27205450 - drops 24 - ip4 126585 - ip6 16 -host-vppv2 2 up rx packets 132162 - rx bytes 27205824 - tx packets 126658 - tx bytes 44634963 - drops 15 - ip4 132147 - ip6 14 -local0 0 down -``` - -You should also see the interface to kube-dns (`host-40df9b44c3d42f4`) and to the -node's IP stack (`host-vppv2`). - -#### Master Isolation (Optional) -By default, your cluster will not schedule pods on the master for security -reasons. If you want to be able to schedule pods on the master, (e.g., for a -single-machine Kubernetes cluster for development), then run: - -``` -kubectl taint nodes --all node-role.kubernetes.io/master- -``` -More details about installing the pod network can be found in the -[kubeadm manual][4]. - -### (4/4) Joining Your Nodes -To add a new node to your cluster, run as root the command that was output -by kubeadm init. For example: -``` -kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash> -``` -More details can be found in the [kubeadm manual][5].
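Before joining nodes, it can help to double-check the `--pod-network-cidr` rule from step 2/4: the value must include `PodSubnetCIDR`, and, when services have backends in the host network namespace, the node management IPs as well. The sketch below uses only Python's standard `ipaddress` module; the function name `cidr_problems` is hypothetical and the addresses are the example values from this document.

```python
import ipaddress


def cidr_problems(pod_network_cidr, pod_subnet_cidr, node_mgmt_ips=()):
    """Return a list of reasons why pod_network_cidr is too narrow (hypothetical helper)."""
    outer = ipaddress.ip_network(pod_network_cidr)
    problems = []
    # PodSubnetCIDR from the Contiv-VPP IPAMConfig must lie inside the CIDR.
    if not ipaddress.ip_network(pod_subnet_cidr).subnet_of(outer):
        problems.append(f"{pod_subnet_cidr} is not inside {pod_network_cidr}")
    # Management IPs matter for services backed by host-network pods.
    for ip in node_mgmt_ips:
        if ipaddress.ip_address(ip) not in outer:
            problems.append(f"node IP {ip} is not inside {pod_network_cidr}")
    return problems


# The 10.0.0.0/8 example covers both the pod subnet and the management IP.
print(cidr_problems("10.0.0.0/8", "10.1.0.0/16", ["10.3.1.1"]))
# A /16 CIDR does not cover a management network on 192.168.56.0/24.
print(cidr_problems("10.1.0.0/16", "10.1.0.0/16", ["192.168.56.106"]))
```

An empty list means the CIDR covers everything passed in. Whether the management IPs need to be covered at all depends on whether any service has backends running on the host network.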
- -#### Deployment Verification -After some time, all contiv containers should enter the running state: -``` -root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv -NAME READY STATUS RESTARTS AGE IP NODE -contiv-etcd-gwc84 1/1 Running 0 14h 192.168.56.106 cvpp -contiv-ksr-5c2vk 1/1 Running 2 14h 192.168.56.106 cvpp -contiv-vswitch-h6759 2/2 Running 0 14h 192.168.56.105 cvpp-slave2 -contiv-vswitch-l59nv 2/2 Running 0 14h 192.168.56.106 cvpp -etcd-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp -kube-apiserver-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp -kube-controller-manager-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp -kube-dns-545bc4bfd4-fr6j9 3/3 Running 0 14h 10.1.134.2 cvpp -kube-proxy-q8sv2 1/1 Running 0 14h 192.168.56.106 cvpp -kube-proxy-s8kv9 1/1 Running 0 14h 192.168.56.105 cvpp-slave2 -kube-scheduler-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp -``` -In particular, verify that a vswitch pod and a kube-proxy pod is running on -each joined node, as shown above. - -On each joined node, verify that the VPP successfully grabbed the network -interface specified in the VPP startup config (`GigabitEthernet0/4/0` in -our case): -``` -$ sudo vppctl -vpp# sh inter - Name Idx State Counter Count -GigabitEthernet0/4/0 1 up -... -``` -From the vpp CLI on a joined node you can also ping kube-dns to verify -node-to-node connectivity. 
For example: -``` -vpp# ping 10.1.134.2 -64 bytes from 10.1.134.2: icmp_seq=1 ttl=64 time=.1557 ms -64 bytes from 10.1.134.2: icmp_seq=2 ttl=64 time=.1339 ms -64 bytes from 10.1.134.2: icmp_seq=3 ttl=64 time=.1295 ms -64 bytes from 10.1.134.2: icmp_seq=4 ttl=64 time=.1714 ms -64 bytes from 10.1.134.2: icmp_seq=5 ttl=64 time=.1317 ms - -Statistics: 5 sent, 5 received, 0% packet loss -``` -### Deploying Example Applications -#### Simple Deployment -You can go ahead and create a simple deployment: -``` -$ kubectl run nginx --image=nginx --replicas=2 -``` - -Use `kubectl describe pod` to get the IP address of a pod, e.g.: -``` -$ kubectl describe pod nginx | grep IP -``` -You should see two ip addresses, for example: -``` -IP: 10.1.1.3 -IP: 10.1.1.4 -``` - -You can check the pods' connectivity in one of the following ways: -* Connect to the VPP debug CLI and ping any pod: -``` - sudo vppctl - vpp# ping 10.1.1.3 -``` -* Start busybox and ping any pod: -``` - kubectl run busybox --rm -ti --image=busybox /bin/sh - If you don't see a command prompt, try pressing enter. - / # - / # ping 10.1.1.3 - -``` -* You should be able to ping any pod from the host: -``` - ping 10.1.1.3 -``` - -#### Deploying Pods on Different Nodes -to enable pod deployment on the master, untaint the master first: -``` -kubectl taint nodes --all node-role.kubernetes.io/master- -``` - -In order to verify inter-node pod connectivity, we need to tell Kubernetes -to deploy one pod on the master node and one POD on the worker. For this, -we can use node selectors. 
- -In your deployment YAMLs, add the `nodeSelector` sections that refer to -preferred node hostnames, e.g.: -``` - nodeSelector: - kubernetes.io/hostname: vm5 -``` - -Example of whole YAML files: -``` -apiVersion: v1 -kind: Pod -metadata: - name: nginx1 -spec: - nodeSelector: - kubernetes.io/hostname: vm5 - containers: - name: nginx - image: nginx -``` - -``` -apiVersion: v1 -kind: Pod -metadata: - name: nginx2 -spec: - nodeSelector: - kubernetes.io/hostname: vm6 - containers: - name: nginx - image: nginx -``` - -After deploying the YAML files, verify they were deployed on different hosts: -``` -$ kubectl get pods -o wide -NAME READY STATUS RESTARTS AGE IP NODE -nginx1 1/1 Running 0 13m 10.1.36.2 vm5 -nginx2 1/1 Running 0 13m 10.1.219.3 vm6 -``` - -Now you can verify the connectivity to both nginx PODs from a busybox POD: -``` -kubectl run busybox --rm -it --image=busybox /bin/sh - -/ # wget 10.1.36.2 -Connecting to 10.1.36.2 (10.1.36.2:80) -index.html 100% |*******************************************************************************************************************************************************************| 612 0:00:00 ETA - -/ # rm index.html - -/ # wget 10.1.219.3 -Connecting to 10.1.219.3 (10.1.219.3:80) -index.html 100% |*******************************************************************************************************************************************************************| 612 0:00:00 ETA -``` - -### Uninstalling Contiv-VPP -To uninstall the network plugin itself, use `kubectl`: -``` -kubectl delete -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml -``` - -### Tearing down Kubernetes -* First, drain the node and make sure that the node is empty before -shutting it down: -``` - kubectl drain <node name> --delete-local-data --force --ignore-daemonsets - kubectl delete node <node name> -``` -* Next, on the node being removed, reset all kubeadm installed state: -``` - rm -rf $HOME/.kube - sudo su - kubeadm reset -``` - -* If
you added environment variable definitions into - `/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`, this would have been a process from the [Custom Management Network file][10], then remove the definitions now. - -### Troubleshooting -Some of the issues that can occur during the installation are: - -- Forgetting to create and initialize the `.kube` directory in your home - directory (As instructed by `kubeadm init --token-ttl 0`). This can manifest - itself as the following error: - ``` - W1017 09:25:43.403159 2233 factory_object_mapping.go:423] Failed to download OpenAPI (Get https://192.168.209.128:6443/swagger-2.0.0.pb-v1: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")), falling back to swagger - Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes") - ``` -- Previous installation lingering on the file system. - `'kubeadm init --token-ttl 0` fails to initialize kubelet with one or more - of the following error messages: - ``` - ... - [kubelet-check] It seems like the kubelet isn't running or healthy. - [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused. - ... - ``` - -If you run into any of the above issues, try to clean up and reinstall as root: -``` -sudo su -rm -rf $HOME/.kube -kubeadm reset -kubeadm init --token-ttl 0 -rm -rf /var/etcd/contiv-data -rm -rf /var/bolt/bolt.db -``` - -## Contiv-specific kubeadm installation on Aarch64 -Supplemental instructions apply when using Contiv-VPP for Aarch64. 
Most -installation steps for Aarch64 are the same as that described earlier in this -chapter, so you should firstly read it before you start the installation on -Aarch64 platform. - -Use the [Aarch64-specific kubeadm install instructions][18] to manually install -Kubernetes with Contiv-VPP networking on one or more bare-metals of Aarch64 platform. - -[1]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/ -[3]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#initializing-your-master -[4]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network -[5]: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#joining-your-nodes -[6]: https://kubernetes.io/docs/setup/independent/install-kubeadm/ -[8]: #tearing-down-kubernetes -[10]: https://github.com/contiv/vpp/blob/master/docs/CUSTOM_MGMT_NETWORK.md#setting-up-a-custom-management-network-on-multi-homed-nodes -[11]: ../vagrant/README.md -[12]: https://github.com/contiv/vpp/tree/master/docs/CUSTOM_MGMT_NETWORK.md -[13]: https://github.com/contiv/vpp/tree/master/docs/VMWARE_FUSION_HOST.md -[14]: https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md -[15]: https://github.com/contiv/vpp/tree/master/docs/MULTI_NIC_SETUP.md -[16]: https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md#configuring-stn-in-contiv-vpp-k8s-deployment-files -[17]: https://github.com/contiv/vpp/tree/master/k8s/README.md#setup-node-sh -[18]: https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md diff --git a/docs/usecases/contiv/MANUAL_INSTALL.rst b/docs/usecases/contiv/MANUAL_INSTALL.rst new file mode 100644 index 00000000000..4e0d0c6c52b --- /dev/null +++ b/docs/usecases/contiv/MANUAL_INSTALL.rst @@ -0,0 +1,609 @@ +Manual Installation +=================== + +This document describes how to clone the Contiv repository and then use +`kubeadm <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/>`__ +to manually 
install Kubernetes with Contiv-VPP networking on one or more +bare metal or VM hosts. + +Clone the Contiv Repository +--------------------------- + +To clone the Contiv repository enter the following command: + +:: + + git clone https://github.com/contiv/vpp/<repository-name> + +**Note:** Replace ``<repository-name>`` with the name you want assigned to your cloned +contiv repository. + +The cloned repository has important folders that contain content that +are referenced in this Contiv documentation; those folders are noted +below: + +:: + + vpp-contiv2$ ls + build build-root doxygen gmod LICENSE Makefile RELEASE.md src + build-data docs extras INFO.yaml MAINTAINERS README.md sphinx_venv test + +Preparing Your Hosts +-------------------- + +Host-specific Configurations +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +- **VmWare VMs**: the vmxnet3 driver is required on each interface that + will be used by VPP. Please see + `here <https://github.com/contiv/vpp/tree/master/docs/VMWARE_FUSION_HOST.md>`__ + for instructions how to install the vmxnet3 driver on VmWare Fusion. + +Setting up Network Adapter(s) +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Setting up DPDK +^^^^^^^^^^^^^^^ + +DPDK setup must be completed **on each node** as follows: + +- Load the PCI UIO driver: + + :: + + $ sudo modprobe uio_pci_generic + +- Verify that the PCI UIO driver has loaded successfully: + + :: + + $ lsmod | grep uio + uio_pci_generic 16384 0 + uio 20480 1 uio_pci_generic + + Please note that this driver needs to be loaded upon each server + bootup, so you may want to add ``uio_pci_generic`` into the + ``/etc/modules`` file, or a file in the ``/etc/modules-load.d/`` + directory. For example, the ``/etc/modules`` file could look as + follows: + + :: + + # /etc/modules: kernel modules to load at boot time. + # + # This file contains the names of kernel modules that should be loaded + # at boot time, one per line. Lines beginning with "#" are ignored. + uio_pci_generic + + ..
rubric:: Determining Network Adapter PCI Addresses + :name: determining-network-adapter-pci-addresses + + You need the PCI address of the network interface that VPP will use + for the multi-node pod interconnect. On Debian-based distributions, + you can use ``lshw``\ (*): + +:: + + $ sudo lshw -class network -businfo + Bus info Device Class Description + ==================================================== + pci@0000:00:03.0 ens3 network Virtio network device + pci@0000:00:04.0 ens4 network Virtio network device + +**Note:** On CentOS/RedHat/Fedora distributions, ``lshw`` may not be +available by default, install it by issuing the following command: +``yum -y install lshw`` + +Configuring vswitch to Use Network Adapters +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Finally, you need to set up the vswitch to use the network adapters: + +- `Setup on a node with a single + NIC <https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md>`__ +- `Setup a node with multiple + NICs <https://github.com/contiv/vpp/tree/master/docs/MULTI_NIC_SETUP.md>`__ + +Using a Node Setup Script +~~~~~~~~~~~~~~~~~~~~~~~~~ + +You can perform the above steps using the `node setup +script <https://github.com/contiv/vpp/tree/master/k8s/README.md#setup-node-sh>`__. + +Installing Kubernetes with Contiv-VPP CNI plugin +------------------------------------------------ + +After the nodes you will be using in your K8s cluster are prepared, you +can install the cluster using +`kubeadm <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/>`__. + +(1/4) Installing Kubeadm on Your Hosts +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +For first-time installation, see `Installing +kubeadm <https://kubernetes.io/docs/setup/independent/install-kubeadm/>`__. +To update an existing installation, you should do a +``apt-get update && apt-get upgrade`` or ``yum update`` to get the +latest version of kubeadm. 
+ +On each host with multiple NICs where the NIC that will be used for +Kubernetes management traffic is not the one pointed to by the default +route out of the host, a `custom management +network <https://github.com/contiv/vpp/tree/master/docs/CUSTOM_MGMT_NETWORK.md>`__ +for Kubernetes must be configured. + +Using Kubernetes 1.10 and Above +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +K8s 1.10 introduced support for huge pages in a pod. For +now, this feature must either be disabled, or a memory limit must be +defined for the vswitch container. + +To disable huge pages, perform the following steps as root: + +- Using +your favorite editor, disable huge pages in the kubelet configuration +file (``/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`` or +``/etc/default/kubelet`` for version 1.11+): + +:: + + Environment="KUBELET_EXTRA_ARGS=--feature-gates HugePages=false" + +- Restart the kubelet daemon: + +:: + + systemctl daemon-reload + systemctl restart kubelet + +To define a memory limit, append the following snippet to the vswitch +container in the deployment YAML file: + +:: + + resources: + limits: + hugepages-2Mi: 1024Mi + memory: 1024Mi + +or set ``contiv.vswitch.defineMemoryLimits`` to ``true`` in `helm +values <https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/README.md>`__. + +(2/4) Initializing Your Master +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Before initializing the master, you may want to +`remove <#tearing-down-kubernetes>`__ any previously installed K8s +components. Then, proceed with master initialization as described in the +`kubeadm +manual <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#initializing-your-master>`__. +Execute the following command as root: + +:: + + kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16 + +**Note:** ``kubeadm init`` will autodetect the network interface to +advertise the master on as the interface with the default gateway. If +you want to use a different interface (i.e. 
a custom management network +setup), specify the ``--apiserver-advertise-address=<ip-address>`` +argument to kubeadm init. For example: + +:: + + kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16 --apiserver-advertise-address=192.168.56.106 + +**Note:** The CIDR specified with the flag ``--pod-network-cidr`` is +used by kube-proxy, and it **must include** the ``PodSubnetCIDR`` from +the ``IPAMConfig`` section in the Contiv-vpp config map in Contiv-vpp’s +deployment file +`contiv-vpp.yaml <https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/values.yaml>`__. +Pods in the host network namespace are a special case; they share their +respective interfaces and IP addresses with the host. For proxying to +work properly it is therefore required for services with backends +running on the host to also **include the node management IP** within +the ``--pod-network-cidr`` subnet. For example, with the default +``PodSubnetCIDR=10.1.0.0/16`` and ``PodIfIPCIDR=10.2.1.0/24``, the +subnet ``10.3.0.0/16`` could be allocated for the management network and +``--pod-network-cidr`` could be defined as ``10.0.0.0/8``, so as to +include IP addresses of all pods in all network namespaces: + +:: + + kubeadm init --token-ttl 0 --pod-network-cidr=10.0.0.0/8 --apiserver-advertise-address=10.3.1.1 + +If Kubernetes was initialized successfully, it prints out this message: + +:: + + Your Kubernetes master has initialized successfully! + +After successful initialization, don’t forget to set up your .kube +directory as a regular user (as instructed by ``kubeadm``): + +.. 
code:: bash + + mkdir -p $HOME/.kube + sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config + sudo chown $(id -u):$(id -g) $HOME/.kube/config + +(3/4) Installing the Contiv-VPP Pod Network +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If you have already used the Contiv-VPP plugin before, you may need to +pull the most recent Docker images on each node: + +:: + + bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/pull-images.sh) + +Install the Contiv-VPP network for your cluster as follows: + +- If you do not use the STN feature, install Contiv-vpp as follows: + + :: + + kubectl apply -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml + +- If you use the STN feature, download the ``contiv-vpp.yaml`` file: + + :: + + wget https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml + + Then edit the STN configuration as described + `here <https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md#configuring-stn-in-contiv-vpp-k8s-deployment-files>`__. + Finally, create the Contiv-vpp deployment from the edited file: + + :: + + kubectl apply -f ./contiv-vpp.yaml + +Beware contiv-etcd data is persisted in ``/var/etcd`` by default. It has +to be cleaned up manually after ``kubeadm reset``. Otherwise outdated +data will be loaded by a subsequent deployment. + +You can also generate random subfolder, alternatively: + +:: + + curl --silent https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml | sed "s/\/var\/etcd\/contiv-data/\/var\/etcd\/contiv-data\/$RANDOM/g" | kubectl apply -f - + +Deployment Verification +^^^^^^^^^^^^^^^^^^^^^^^ + +After some time, all contiv containers should enter the running state: + +:: + + root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv + NAME READY STATUS RESTARTS AGE IP NODE + ... 
+ contiv-etcd-gwc84 1/1 Running 0 14h 192.168.56.106 cvpp + contiv-ksr-5c2vk 1/1 Running 2 14h 192.168.56.106 cvpp + contiv-vswitch-l59nv 2/2 Running 0 14h 192.168.56.106 cvpp + +In particular, make sure that the Contiv-VPP pod IP addresses are the +same as the IP address specified in the +``--apiserver-advertise-address=<ip-address>`` argument to kubeadm init. + +Verify that the VPP successfully grabbed the network interface specified +in the VPP startup config (``GigabitEthernet0/4/0`` in our case): + +:: + + $ sudo vppctl + vpp# sh inter + Name Idx State Counter Count + GigabitEthernet0/4/0 1 up rx packets 1294 + rx bytes 153850 + tx packets 512 + tx bytes 21896 + drops 962 + ip4 1032 + host-40df9b44c3d42f4 3 up rx packets 126601 + rx bytes 44628849 + tx packets 132155 + tx bytes 27205450 + drops 24 + ip4 126585 + ip6 16 + host-vppv2 2 up rx packets 132162 + rx bytes 27205824 + tx packets 126658 + tx bytes 44634963 + drops 15 + ip4 132147 + ip6 14 + local0 0 down + +You should also see the interface to kube-dns (``host-40df9b44c3d42f4``) +and to the node’s IP stack (``host-vppv2``). + +Master Isolation (Optional) +^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +By default, your cluster will not schedule pods on the master for +security reasons. If you want to be able to schedule pods on the master, +(e.g., for a single-machine Kubernetes cluster for development), then +run: + +:: + + kubectl taint nodes --all node-role.kubernetes.io/master- + +More details about installing the pod network can be found in the +`kubeadm +manual <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network>`__. + +(4/4) Joining Your Nodes +~~~~~~~~~~~~~~~~~~~~~~~~ + +To add a new node to your cluster, run as root the command that was +output by kubeadm init. 
For example: + +:: + + kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash> + +More details can be found in the `kubeadm +manual <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#joining-your-nodes>`__. + +.. _deployment-verification-1: + +Deployment Verification +^^^^^^^^^^^^^^^^^^^^^^^ + +After some time, all contiv containers should enter the running state: + +:: + + root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv + NAME READY STATUS RESTARTS AGE IP NODE + contiv-etcd-gwc84 1/1 Running 0 14h 192.168.56.106 cvpp + contiv-ksr-5c2vk 1/1 Running 2 14h 192.168.56.106 cvpp + contiv-vswitch-h6759 2/2 Running 0 14h 192.168.56.105 cvpp-slave2 + contiv-vswitch-l59nv 2/2 Running 0 14h 192.168.56.106 cvpp + etcd-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp + kube-apiserver-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp + kube-controller-manager-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp + kube-dns-545bc4bfd4-fr6j9 3/3 Running 0 14h 10.1.134.2 cvpp + kube-proxy-q8sv2 1/1 Running 0 14h 192.168.56.106 cvpp + kube-proxy-s8kv9 1/1 Running 0 14h 192.168.56.105 cvpp-slave2 + kube-scheduler-cvpp 1/1 Running 0 14h 192.168.56.106 cvpp + +In particular, verify that a vswitch pod and a kube-proxy pod are running +on each joined node, as shown above. + +On each joined node, verify that the VPP successfully grabbed the +network interface specified in the VPP startup config +(``GigabitEthernet0/4/0`` in our case): + +:: + + $ sudo vppctl + vpp# sh inter + Name Idx State Counter Count + GigabitEthernet0/4/0 1 up + ... + +From the vpp CLI on a joined node you can also ping kube-dns to verify +node-to-node connectivity. 
For example: + +:: + + vpp# ping 10.1.134.2 + 64 bytes from 10.1.134.2: icmp_seq=1 ttl=64 time=.1557 ms + 64 bytes from 10.1.134.2: icmp_seq=2 ttl=64 time=.1339 ms + 64 bytes from 10.1.134.2: icmp_seq=3 ttl=64 time=.1295 ms + 64 bytes from 10.1.134.2: icmp_seq=4 ttl=64 time=.1714 ms + 64 bytes from 10.1.134.2: icmp_seq=5 ttl=64 time=.1317 ms + + Statistics: 5 sent, 5 received, 0% packet loss + +Deploying Example Applications +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Simple Deployment +^^^^^^^^^^^^^^^^^ + +You can go ahead and create a simple deployment: + +:: + + $ kubectl run nginx --image=nginx --replicas=2 + +Use ``kubectl describe pod`` to get the IP address of a pod, e.g.: + +:: + + $ kubectl describe pod nginx | grep IP + +You should see two IP addresses, for example: + +:: + + IP: 10.1.1.3 + IP: 10.1.1.4 + +You can check the pods’ connectivity in one of the following ways: + +- Connect to the VPP debug CLI and ping any pod: + +:: + + sudo vppctl + vpp# ping 10.1.1.3 + +- Start busybox and ping any pod: + +:: + + kubectl run busybox --rm -ti --image=busybox /bin/sh + If you don't see a command prompt, try pressing enter. + / # + / # ping 10.1.1.3 + +- Ping any pod from the host: + +:: + + ping 10.1.1.3 + +Deploying Pods on Different Nodes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +To enable pod deployment on the master, untaint the master first: + +:: + + kubectl taint nodes --all node-role.kubernetes.io/master- + +In order to verify inter-node pod connectivity, we need to tell +Kubernetes to deploy one pod on the master node and one pod on the +worker. For this, we can use node selectors. 
+ +In your deployment YAMLs, add the ``nodeSelector`` sections that refer +to preferred node hostnames, e.g.: + +:: + + nodeSelector: + kubernetes.io/hostname: vm5 + +Complete example manifests (YAML): + +:: + + apiVersion: v1 + kind: Pod + metadata: + name: nginx1 + spec: + nodeSelector: + kubernetes.io/hostname: vm5 + containers: + - name: nginx + image: nginx + +:: + + apiVersion: v1 + kind: Pod + metadata: + name: nginx2 + spec: + nodeSelector: + kubernetes.io/hostname: vm6 + containers: + - name: nginx + image: nginx + +After deploying the manifests, verify they were deployed on different hosts: + +:: + + $ kubectl get pods -o wide + NAME READY STATUS RESTARTS AGE IP NODE + nginx1 1/1 Running 0 13m 10.1.36.2 vm5 + nginx2 1/1 Running 0 13m 10.1.219.3 vm6 + +Now you can verify the connectivity to both nginx PODs from a busybox +POD: + +:: + + kubectl run busybox --rm -it --image=busybox /bin/sh + + / # wget 10.1.36.2 + Connecting to 10.1.36.2 (10.1.36.2:80) + index.html 100% |*******************************************************************************************************************************************************************| 612 0:00:00 ETA + + / # rm index.html + + / # wget 10.1.219.3 + Connecting to 10.1.219.3 (10.1.219.3:80) + index.html 100% |*******************************************************************************************************************************************************************| 612 0:00:00 ETA + +Uninstalling Contiv-VPP +~~~~~~~~~~~~~~~~~~~~~~~ + +To uninstall the network plugin itself, use ``kubectl``: + +:: + + kubectl delete -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml + +Tearing down Kubernetes +~~~~~~~~~~~~~~~~~~~~~~~ + +- First, drain the node and make sure that the node is empty before + shutting it down: + +:: + + kubectl drain <node name> --delete-local-data --force --ignore-daemonsets + kubectl delete node <node name> + +- Next, on the node being removed, reset all kubeadm installed state: + +:: + 
+ + rm -rf $HOME/.kube + sudo su + kubeadm reset + +- If you added environment variable definitions into + ``/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`` (a step + from the `Custom Management Network + file <https://github.com/contiv/vpp/blob/master/docs/CUSTOM_MGMT_NETWORK.md#setting-up-a-custom-management-network-on-multi-homed-nodes>`__), + remove those definitions now. + +Troubleshooting +~~~~~~~~~~~~~~~ + +Some of the issues that can occur during the installation are: + +- Forgetting to create and initialize the ``.kube`` directory in your + home directory (as instructed by ``kubeadm init --token-ttl 0``). + This can manifest itself as the following error: + + :: + + W1017 09:25:43.403159 2233 factory_object_mapping.go:423] Failed to download OpenAPI (Get https://192.168.209.128:6443/swagger-2.0.0.pb-v1: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")), falling back to swagger + Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes") + +- Previous installation lingering on the file system. + ``kubeadm init --token-ttl 0`` fails to initialize kubelet with one + or more of the following error messages: + + :: + + ... + [kubelet-check] It seems like the kubelet isn't running or healthy. + [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused. + ... 
+ +If you run into any of the above issues, try to clean up and reinstall +as root: + +:: + + sudo su + rm -rf $HOME/.kube + kubeadm reset + kubeadm init --token-ttl 0 + rm -rf /var/etcd/contiv-data + rm -rf /var/bolt/bolt.db + +Contiv-specific kubeadm installation on Aarch64 +----------------------------------------------- + +Supplemental instructions apply when using Contiv-VPP for Aarch64. Most +installation steps for Aarch64 are the same as that described earlier in +this chapter, so you should firstly read it before you start the +installation on Aarch64 platform. + +Use the `Aarch64-specific kubeadm install +instructions <https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md>`__ +to manually install Kubernetes with Contiv-VPP networking on one or more +bare-metals of Aarch64 platform. diff --git a/docs/usecases/contiv/MULTI_NIC_SETUP.md b/docs/usecases/contiv/MULTI_NIC_SETUP.md deleted file mode 100644 index cacbcbb464b..00000000000 --- a/docs/usecases/contiv/MULTI_NIC_SETUP.md +++ /dev/null @@ -1,21 +0,0 @@ -### Setting Up a Node with Multiple NICs - -* First, configure hardware interfaces in the VPP startup config, as -described [here](https://github.com/contiv/vpp/blob/master/docs/VPP_CONFIG.md#multi-nic-configuration). - -* For each interface owned by Linux, you need to provide individual - configuration for each interface used by VPP in the Node Configuration - for the node in the `contiv-vpp.yaml`. For example, if both `ens3` and - `ens4` are known to Linux, then put the following stanza into the node's - NodeConfig: -``` -... - NodeConfig: - - NodeName: "ubuntu-1" - StealInterface: "ens3" - StealInterface: "ens4" -... -``` - If only `ens3` is known to Linux, you only put a line for `ens3` into the - above NodeConfig. 
- diff --git a/docs/usecases/contiv/MULTI_NIC_SETUP.rst b/docs/usecases/contiv/MULTI_NIC_SETUP.rst new file mode 100644 index 00000000000..57242d18f87 --- /dev/null +++ b/docs/usecases/contiv/MULTI_NIC_SETUP.rst @@ -0,0 +1,24 @@ +Setting Up a Node with Multiple NICs +==================================== + +- First, configure hardware interfaces in the VPP startup config, as + described + `here <https://github.com/contiv/vpp/blob/master/docs/VPP_CONFIG.md#multi-nic-configuration>`__. + +- For each interface owned by Linux, you need to provide individual + configuration for each interface used by VPP in the Node + Configuration for the node in the ``contiv-vpp.yaml``. For example, + if both ``ens3`` and ``ens4`` are known to Linux, then put the + following stanza into the node’s NodeConfig: + +:: + + ... + NodeConfig: + - NodeName: "ubuntu-1" + StealInterface: "ens3" + StealInterface: "ens4" + ... + +If only ``ens3`` is known to Linux, you only put a line for ``ens3`` +into the above NodeConfig. diff --git a/docs/usecases/contiv/NETWORKING.md b/docs/usecases/contiv/NETWORKING.md deleted file mode 100644 index 25ce3ce0410..00000000000 --- a/docs/usecases/contiv/NETWORKING.md +++ /dev/null @@ -1,137 +0,0 @@ -# Contiv/VPP Network Operation - -This document describes the network operation of the Contiv/VPP k8s network plugin. It -elaborates the operation and config options of the Contiv IPAM, as well as -details on how the VPP gets programmed by Contiv/VPP control plane. - -The following picture shows 2-node k8s deployment of Contiv/VPP, with a VXLAN tunnel -established between the nodes to forward inter-node POD traffic. The IPAM options -are depicted on the Node 1, whereas the VPP programming is depicted on the Node 2. - -![Contiv/VPP Architecture](/_images/contiv-networking.png "contiv-networking.png") - -## Contiv/VPP IPAM (IP Address Management) - -IPAM in Contiv/VPP is based on the concept of **Node ID**. 
The Node ID is a number -that uniquely identifies a node in the k8s cluster. The first node is assigned -the ID of 1, the second node 2, etc. If a node leaves the cluster, its -ID is released back to the pool and will be re-used by the next node. - -The Node ID is used to calculate per-node IP subnets for PODs -and other internal subnets that need to be unique on each node. Apart from the Node ID, -the input for IPAM calculations is a set of config knobs, which can be specified -in the `IPAMConfig` section of the \[Contiv/VPP deployment YAML\](../../../k8s/contiv-vpp.yaml): - -- **PodSubnetCIDR** (default `10.1.0.0/16`): each pod gets an IP address assigned -from this range. The size of this range (default `/16`) dictates upper limit of -POD count for the entire k8s cluster (default 65536 PODs). - -- **PodNetworkPrefixLen** (default `24`): per-node dedicated podSubnet range. -From the allocatable range defined in `PodSubnetCIDR`, this value will dictate the -allocation for each node. With the default value (`24`) this indicates that each node -has a `/24` slice of the `PodSubnetCIDR`. The Node ID is used to address the node. -In case of `PodSubnetCIDR = 10.1.0.0/16`, `PodNetworkPrefixLen = 24` and `NodeID = 5`, -the resulting POD subnet for the node would be `10.1.5.0/24`. - -- **PodIfIPCIDR** (default `10.2.1.0/24`): VPP-internal addresses put the VPP interfaces -facing towards the PODs into L3 mode. This IP range will be reused -on each node, thereby it is never externally addressable outside of the node itself. -The only requirement is that this subnet should not collide with any other IPAM subnet. - -- **VPPHostSubnetCIDR** (default `172.30.0.0/16`): used for addressing -the interconnect of VPP with the Linux network stack, within the same node. -Since this subnet needs to be unique on each node, the Node ID is used to determine -the actual subnet used on the node with the combination of `VPPHostNetworkPrefixLen`, `PodSubnetCIDR` and `PodNetworkPrefixLen`. 
- -- **VPPHostNetworkPrefixLen** (default `24`): used to calculate the subnet -for addressing the interconnect of VPP with the Linux network stack, within the same node. -With `VPPHostSubnetCIDR = 172.30.0.0/16`, `VPPHostNetworkPrefixLen = 24` and -`NodeID = 5` the resulting subnet for the node would be `172.30.5.0/24`. - -- **NodeInterconnectCIDR** (default `192.168.16.0/24`): range for the addresses -assigned to the data plane interfaces managed by VPP. Unless DHCP is used -(`NodeInterconnectDHCP = True`), the Contiv/VPP control plane automatically assigns -an IP address from this range to the DPDK-managed ethernet interface bound to VPP -on each node. The actual IP address will be calculated from the Node ID (e.g., with -`NodeInterconnectCIDR = 192.168.16.0/24` and `NodeID = 5`, the resulting IP -address assigned to the ethernet interface on VPP will be `192.168.16.5` ). - -- **NodeInterconnectDHCP** (default `False`): instead of assigning the IPs -for the data plane interfaces, which are managed by VPP from `NodeInterconnectCIDR` by the Contiv/VPP -control plane, DHCP assigns the IP addresses. The DHCP must be running in the network where the data -plane interface is connected, in case `NodeInterconnectDHCP = True`, -`NodeInterconnectCIDR` is ignored. - -- **VxlanCIDR** (default `192.168.30.0/24`): in order to provide inter-node -POD to POD connectivity via any underlay network (not necessarily an L2 network), -Contiv/VPP sets up a VXLAN tunnel overlay between each of the 2 nodes within the cluster. Each node needs its unique IP address of the VXLAN BVI interface. This IP address -is automatically calculated from the Node ID, (e.g., with `VxlanCIDR = 192.168.30.0/24` -and `NodeID = 5`, the resulting IP address assigned to the VXLAN BVI interface will be `192.168.30.5`). - -## VPP Programming -This section describes how the Contiv/VPP control plane programs VPP, based on the -events it receives from k8s. 
This section is not necessarily for understanding -basic Contiv/VPP operation, but is very useful for debugging purposes. - -Contiv/VPP currently uses a single VRF to forward the traffic between PODs on a node, -PODs on different nodes, host network stack, and DPDK-managed dataplane interface. The forwarding -between each of them is purely L3-based, even for cases of communication -between 2 PODs within the same node. - -#### DPDK-Managed Data Interface -In order to allow inter-node communication between PODs on different -nodes and between PODs and outside world, Contiv/VPP uses data-plane interfaces -bound to VPP using DPDK. Each node should have one "main" VPP interface, -which is unbound from the host network stack and bound to VPP. -The Contiv/VPP control plane automatically configures the interface either -via DHCP, or with a statically assigned address (see `NodeInterconnectCIDR` and -`NodeInterconnectDHCP` yaml settings). - -#### PODs on the Same Node -PODs are connected to VPP using virtio-based TAP interfaces created by VPP, -with the POD-end of the interface placed into the POD container network namespace. -Each POD is assigned an IP address from the `PodSubnetCIDR`. The allocated IP -is configured with the prefix length `/32`. Additionally, a static route pointing -towards the VPP is configured in the POD network namespace. -The prefix length `/32` means that all IP traffic will be forwarded to the -default route - VPP. To get rid of unnecessary broadcasts between POD and VPP, -a static ARP entry is configured for the gateway IP in the POD namespace, as well -as for POD IP on VPP. Both ends of the TAP interface have a static (non-default) -MAC address applied. - -#### PODs with hostNetwork=true -PODs with a `hostNetwork=true` attribute are not placed into a separate network namespace, they instead use the main host Linux network namespace; therefore, they are not directly connected to the VPP. 
They rely on the interconnection between the VPP and the host Linux network stack, -which is described in the next paragraph. Note, when these PODs access some service IP, their network communication will be NATed in Linux (by iptables rules programmed by kube-proxy) -as opposed to VPP, which is the case for the PODs connected to VPP directly. - -#### Linux Host Network Stack -In order to interconnect the Linux host network stack with VPP (to allow access -to the cluster resources from the host itself, as well as for the PODs with `hostNetwork=true`), -VPP creates a TAP interface between VPP and the main network namespace. The TAP interface is configured with IP addresses from the `VPPHostSubnetCIDR` range, with `.1` in the latest octet on the VPP side, and `.2` on the host side. The name of the host interface is `vpp1`. The host has static routes pointing to VPP configured with: -- A route to the whole `PodSubnetCIDR` to route traffic targeting PODs towards VPP. -- A route to `ServiceCIDR` (default `10.96.0.0/12`), to route service IP targeted traffic that has not been translated by kube-proxy for some reason towards VPP. -- The host also has a static ARP entry configured for the IP of the VPP-end TAP interface, to get rid of unnecessary broadcasts between the main network namespace and VPP. - -#### VXLANs to Other Nodes -In order to provide inter-node POD to POD connectivity via any underlay network -(not necessarily an L2 network), Contiv/VPP sets up a VXLAN tunnel overlay between -each 2 nodes within the cluster (full mesh). - -All VXLAN tunnels are terminated in one bridge domain on each VPP. The bridge domain -has learning and flooding disabled, the l2fib of the bridge domain contains a static entry for each VXLAN tunnel. Each bridge domain has a BVI interface, which -interconnects the bridge domain with the main VRF (L3 forwarding). This interface needs -a unique IP address, which is assigned from the `VxlanCIDR` as describe above. 
- -The main VRF contains several static routes that point to the BVI IP addresses of other nodes. -For each node, it is a route to PODSubnet and VppHostSubnet of the remote node, as well as a route -to the management IP address of the remote node. For each of these routes, the next hop IP is the -BVI interface IP of the remote node, which goes via the BVI interface of the local node. - -The VXLAN tunnels and the static routes pointing to them are added/deleted on each VPP, -whenever a node is added/deleted in the k8s cluster. - - -#### More Info -Please refer to the \[Packet Flow Dev Guide\](../dev-guide/PACKET_FLOW.html) for more -detailed description of paths traversed by request and response packets -inside Contiv/VPP Kubernetes cluster under different situations.
\ No newline at end of file diff --git a/docs/usecases/contiv/NETWORKING.rst b/docs/usecases/contiv/NETWORKING.rst new file mode 100644 index 00000000000..b6799961c1d --- /dev/null +++ b/docs/usecases/contiv/NETWORKING.rst @@ -0,0 +1,196 @@ +Contiv/VPP Network Operation +============================ + +This document describes the network operation of the Contiv/VPP k8s +network plugin. It elaborates the operation and config options of the +Contiv IPAM, as well as details on how the VPP gets programmed by +Contiv/VPP control plane. + +The following picture shows 2-node k8s deployment of Contiv/VPP, with a +VXLAN tunnel established between the nodes to forward inter-node POD +traffic. The IPAM options are depicted on the Node 1, whereas the VPP +programming is depicted on the Node 2. + +.. figure:: /_images/contiv-networking.png + :alt: contiv-networking.png + + Contiv/VPP Architecture + +Contiv/VPP IPAM (IP Address Management) +--------------------------------------- + +IPAM in Contiv/VPP is based on the concept of **Node ID**. The Node ID +is a number that uniquely identifies a node in the k8s cluster. The +first node is assigned the ID of 1, the second node 2, etc. If a node +leaves the cluster, its ID is released back to the pool and will be +re-used by the next node. + +The Node ID is used to calculate per-node IP subnets for PODs and other +internal subnets that need to be unique on each node. Apart from the +Node ID, the input for IPAM calculations is a set of config knobs, which +can be specified in the ``IPAMConfig`` section of the [Contiv/VPP +deployment YAML](../../../k8s/contiv-vpp.yaml): + +- **PodSubnetCIDR** (default ``10.1.0.0/16``): each pod gets an IP + address assigned from this range. The size of this range (default + ``/16``) dictates upper limit of POD count for the entire k8s cluster + (default 65536 PODs). + +- **PodNetworkPrefixLen** (default ``24``): per-node dedicated + podSubnet range. 
From the allocatable range defined in + ``PodSubnetCIDR``, this value will dictate the allocation for each + node. With the default value (``24``) this indicates that each node + has a ``/24`` slice of the ``PodSubnetCIDR``. The Node ID is used to + address the node. In case of ``PodSubnetCIDR = 10.1.0.0/16``, + ``PodNetworkPrefixLen = 24`` and ``NodeID = 5``, the resulting POD + subnet for the node would be ``10.1.5.0/24``. + +- **PodIfIPCIDR** (default ``10.2.1.0/24``): VPP-internal addresses put + the VPP interfaces facing towards the PODs into L3 mode. This IP + range will be reused on each node, thereby it is never externally + addressable outside of the node itself. The only requirement is that + this subnet should not collide with any other IPAM subnet. + +- **VPPHostSubnetCIDR** (default ``172.30.0.0/16``): used for + addressing the interconnect of VPP with the Linux network stack, + within the same node. Since this subnet needs to be unique on each + node, the Node ID is used to determine the actual subnet used on the + node with the combination of ``VPPHostNetworkPrefixLen``, + ``PodSubnetCIDR`` and ``PodNetworkPrefixLen``. + +- **VPPHostNetworkPrefixLen** (default ``24``): used to calculate the + subnet for addressing the interconnect of VPP with the Linux network + stack, within the same node. With + ``VPPHostSubnetCIDR = 172.30.0.0/16``, + ``VPPHostNetworkPrefixLen = 24`` and ``NodeID = 5`` the resulting + subnet for the node would be ``172.30.5.0/24``. + +- **NodeInterconnectCIDR** (default ``192.168.16.0/24``): range for the + addresses assigned to the data plane interfaces managed by VPP. + Unless DHCP is used (``NodeInterconnectDHCP = True``), the Contiv/VPP + control plane automatically assigns an IP address from this range to + the DPDK-managed ethernet interface bound to VPP on each node. 
The + actual IP address will be calculated from the Node ID (e.g., with + ``NodeInterconnectCIDR = 192.168.16.0/24`` and ``NodeID = 5``, the + resulting IP address assigned to the ethernet interface on VPP will + be ``192.168.16.5``). + +- **NodeInterconnectDHCP** (default ``False``): when enabled, the IPs + for the data plane interfaces managed by VPP are assigned by DHCP + instead of being assigned from ``NodeInterconnectCIDR`` by the + Contiv/VPP control plane. A DHCP server must be running in the network + to which the data plane interface is connected. If + ``NodeInterconnectDHCP = True``, ``NodeInterconnectCIDR`` is ignored. + +- **VxlanCIDR** (default ``192.168.30.0/24``): in order to provide + inter-node POD to POD connectivity via any underlay network (not + necessarily an L2 network), Contiv/VPP sets up a VXLAN tunnel overlay + between each pair of nodes within the cluster. Each node needs a + unique IP address for its VXLAN BVI interface. This IP address is + automatically calculated from the Node ID (e.g., with + ``VxlanCIDR = 192.168.30.0/24`` and ``NodeID = 5``, the resulting IP + address assigned to the VXLAN BVI interface will be + ``192.168.30.5``). + +VPP Programming +--------------- + +This section describes how the Contiv/VPP control plane programs VPP, +based on the events it receives from k8s. This section is not +necessary for understanding basic Contiv/VPP operation, but is very +useful for debugging purposes. + +Contiv/VPP currently uses a single VRF to forward the traffic between +PODs on a node, PODs on different nodes, the host network stack, and the +DPDK-managed dataplane interface. The forwarding between each of them is +purely L3-based, even for cases of communication between 2 PODs within +the same node.
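The per-node address derivations from the IPAM section above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `node_subnet` and `node_address` are hypothetical, and the "Node ID is the subnet index / host offset" rule is inferred from the worked examples in the text (``NodeID = 5``), not taken from the Contiv/VPP source.

```python
import ipaddress


def node_subnet(cidr: str, prefix_len: int, node_id: int):
    """Return the node_id-th subnet of length prefix_len inside cidr.

    Assumption: the Node ID is used directly as the subnet index, as the
    examples in the text suggest; the real control plane may differ.
    """
    base = ipaddress.ip_network(cidr)
    if not base.prefixlen <= prefix_len <= base.max_prefixlen:
        raise ValueError("prefix_len must lie within the base range")
    subnet_count = 2 ** (prefix_len - base.prefixlen)
    if not 0 <= node_id < subnet_count:
        raise ValueError("node_id does not fit into the base range")
    # Number of addresses covered by one per-node subnet.
    subnet_size = 2 ** (base.max_prefixlen - prefix_len)
    start = base.network_address + node_id * subnet_size
    return ipaddress.ip_network((start, prefix_len))


def node_address(cidr: str, node_id: int):
    """Return the host address at offset node_id inside cidr.

    Assumption: the Node ID is used as the host offset, as in the
    NodeInterconnectCIDR and VxlanCIDR examples in the text.
    """
    net = ipaddress.ip_network(cidr)
    # Exclude the network address (offset 0) and the broadcast address.
    if not 0 < node_id < net.num_addresses - 1:
        raise ValueError("node_id does not fit into the range")
    return net.network_address + node_id


if __name__ == "__main__":
    print(node_subnet("10.1.0.0/16", 24, 5))      # 10.1.5.0/24
    print(node_subnet("172.30.0.0/16", 24, 5))    # 172.30.5.0/24
    print(node_address("192.168.16.0/24", 5))     # 192.168.16.5
    print(node_address("192.168.30.0/24", 5))     # 192.168.30.5
```

The four calls reproduce the values quoted in the text for `PodSubnetCIDR`, `VPPHostSubnetCIDR`, `NodeInterconnectCIDR` and `VxlanCIDR` with the default settings.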
+ +DPDK-Managed Data Interface +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In order to allow inter-node communication between PODs on different +nodes and between PODs and the outside world, Contiv/VPP uses data-plane +interfaces bound to VPP using DPDK. Each node should have one “main” VPP +interface, which is unbound from the host network stack and bound to +VPP. The Contiv/VPP control plane automatically configures the interface +either via DHCP, or with a statically assigned address (see +``NodeInterconnectCIDR`` and ``NodeInterconnectDHCP`` yaml settings). + +PODs on the Same Node +~~~~~~~~~~~~~~~~~~~~~ + +PODs are connected to VPP using virtio-based TAP interfaces created by +VPP, with the POD-end of the interface placed into the POD container +network namespace. Each POD is assigned an IP address from the +``PodSubnetCIDR``. The allocated IP is configured with the prefix length +``/32``. Additionally, a static route pointing towards the VPP is +configured in the POD network namespace. The prefix length ``/32`` means +that all IP traffic will be forwarded via the default route, which points to VPP. To get +rid of unnecessary broadcasts between the POD and VPP, a static ARP entry is +configured for the gateway IP in the POD namespace, as well as for the POD +IP on VPP. Both ends of the TAP interface have a static (non-default) +MAC address applied. + +PODs with hostNetwork=true +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +PODs with a ``hostNetwork=true`` attribute are not placed into a +separate network namespace; they instead use the main host Linux network +namespace and are therefore not directly connected to the VPP. They +rely on the interconnection between the VPP and the host Linux network +stack, which is described in the next section. Note that when these PODs +access a service IP, their network communication is NATed in +Linux (by iptables rules programmed by kube-proxy) rather than in VPP, +as is the case for the PODs connected to VPP directly.
+ +Linux Host Network Stack +~~~~~~~~~~~~~~~~~~~~~~~~ + +In order to interconnect the Linux host network stack with VPP (to allow +access to the cluster resources from the host itself, as well as for the +PODs with ``hostNetwork=true``), VPP creates a TAP interface between VPP +and the main network namespace. The TAP interface is configured with IP +addresses from the ``VPPHostSubnetCIDR`` range, with ``.1`` in the +last octet on the VPP side, and ``.2`` on the host side. The name of +the host interface is ``vpp1``. The host has the following static routes +pointing to VPP: + +- A route to the whole ``PodSubnetCIDR`` to route + traffic targeting PODs towards VPP. +- A route to ``ServiceCIDR`` + (default ``10.96.0.0/12``), to route service IP targeted traffic that + has not been translated by kube-proxy for some reason towards VPP. + +The host also has a static ARP entry configured for the IP of the VPP-end +TAP interface, to get rid of unnecessary broadcasts between the main +network namespace and VPP. + +VXLANs to Other Nodes +~~~~~~~~~~~~~~~~~~~~~ + +In order to provide inter-node POD to POD connectivity via any underlay +network (not necessarily an L2 network), Contiv/VPP sets up a VXLAN +tunnel overlay between each pair of nodes within the cluster (full mesh). + +All VXLAN tunnels are terminated in one bridge domain on each VPP. The +bridge domain has learning and flooding disabled; the l2fib of the +bridge domain contains a static entry for each VXLAN tunnel. Each bridge +domain has a BVI interface, which interconnects the bridge domain with +the main VRF (L3 forwarding). This interface needs a unique IP address, +which is assigned from the ``VxlanCIDR`` as described above. + +The main VRF contains several static routes that point to the BVI IP +addresses of other nodes. For each node, there is a route to the PODSubnet and +VppHostSubnet of the remote node, as well as a route to the management +IP address of the remote node.
For each of these routes, the next hop IP +is the BVI interface IP of the remote node, which is reached via the BVI +interface of the local node. + +The VXLAN tunnels and the static routes pointing to them are +added/deleted on each VPP, whenever a node is added/deleted in the k8s +cluster. + +More Info +~~~~~~~~~ + +Please refer to the `Packet Flow Dev +Guide <../dev-guide/PACKET_FLOW.html>`__ for a more detailed description of +the paths traversed by request and response packets inside a Contiv/VPP +Kubernetes cluster under different situations. diff --git a/docs/usecases/contiv/Prometheus.md b/docs/usecases/contiv/Prometheus.md deleted file mode 100644 index ba61be3c739..00000000000 --- a/docs/usecases/contiv/Prometheus.md +++ /dev/null @@ -1,159 +0,0 @@ -# Prometheus Statistics - -Each contiv-agent exposes statistics in Prometheus format at port `9999` by default. -Exposed data is split into two groups: -- `/stats` provides statistics for VPP interfaces managed by contiv-agent - Prometheus data is a set of counters with labels. For each interface, - the following counters are exposed: - * *inPackets* - * *outPackets* - * *inBytes* - * *outBytes* - * *ipv4Packets* - * *ipv6Packets* - * *outErrorPackets* - * *dropPackets* - * *inMissPackets* - * *inNobufPackets* - * *puntPackets* - - Labels let you add additional information to a counter. The *interfaceName* and *node* - labels are specified for all counters. If an interface is associated with a particular - pod, then the *podName* and *podNamespace* labels are also specified for its counters; - otherwise, a placeholder value (`--`) is used (for example, for node interconnect - interfaces). -- `/metrics` provides general go runtime statistics - -To access Prometheus stats of a node you can use `curl localhost:9999/stats` from the node.
The output of contiv-agent running at k8s master node looks similar to the following: - -``` -$ curl localhost:9999/stats -# HELP dropPackets Number of dropped packets for interface -# TYPE dropPackets gauge -dropPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -dropPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 52 -dropPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 9 -dropPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 12 -dropPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP inBytes Number of received bytes for interface -# TYPE inBytes gauge -inBytes{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -inBytes{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 24716 -inBytes{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 726 -inBytes{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 6113 -inBytes{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP inErrorPackets Number of received packets with error for interface -# TYPE inErrorPackets gauge -inErrorPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -inErrorPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 -inErrorPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -inErrorPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 -inErrorPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP inMissPackets Number of missed packets for interface -# TYPE inMissPackets gauge 
-inMissPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -inMissPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 -inMissPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -inMissPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 -inMissPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP inNobufPackets Number of received packets ??? for interface -# TYPE inNobufPackets gauge -inNobufPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -inNobufPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 -inNobufPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -inNobufPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 -inNobufPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP inPackets Number of received packets for interface -# TYPE inPackets gauge -inPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -inPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 97 -inPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 9 -inPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 60 -inPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP ipv4Packets Number of ipv4 packets for interface -# TYPE ipv4Packets gauge -ipv4Packets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -ipv4Packets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 68 
-ipv4Packets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -ipv4Packets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 52 -ipv4Packets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP ipv6Packets Number of ipv6 packets for interface -# TYPE ipv6Packets gauge -ipv6Packets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -ipv6Packets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 26 -ipv6Packets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 9 -ipv6Packets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 8 -ipv6Packets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP outBytes Number of transmitted bytes for interface -# TYPE outBytes gauge -outBytes{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -outBytes{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 5203 -outBytes{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -outBytes{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 17504 -outBytes{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP outErrorPackets Number of transmitted packets with error for interface -# TYPE outErrorPackets gauge -outErrorPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -outErrorPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 -outErrorPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 
-outErrorPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 -outErrorPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP outPackets Number of transmitted packets for interface -# TYPE outPackets gauge -outPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -outPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 49 -outPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -outPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 45 -outPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 -# HELP puntPackets Number of punt packets for interface -# TYPE puntPackets gauge -puntPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 -puntPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 -puntPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 -puntPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 -puntPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 - -``` - - -In order to browse stats in web UI Prometheus, it must be started locally by following the information in -the [Prometheus Getting Started Guide](https://prometheus.io/docs/prometheus/latest/getting_started/). 
- -If you start Prometheus on a node, the following sample config can be used: -```yaml -global: - scrape_interval: 15s - -scrape_configs: - - job_name: 'contiv_stats' - metrics_path: '/stats' - static_configs: - - targets: ['localhost:9999'] - - job_name: 'contiv_agent' - # metrics_path defaults to '/metrics' - static_configs: - - targets: ['localhost:9999'] -``` - -Once Prometheus is started with the specified config, you should be able access its web UI at -`localhost:9090`. -``` -tester@dev:~/Downloads/prometheus-2.1.0.linux-amd64$ ./prometheus --config.file=config.yml -``` - -If security features are enabled for the HTTP endpoint, then the config must be adjusted: -```yaml - - job_name: 'contiv_secured' - - scheme: https - basic_auth: - username: user - password: pass - metrics_path: /stats - tls_config: - insecure_skip_verify: true - # CA certificate to validate API server certificate with. - #[ ca_file: <filename> ] - static_configs: - - targets: ['localhost:9191'] -```
\ No newline at end of file diff --git a/docs/usecases/contiv/Prometheus.rst b/docs/usecases/contiv/Prometheus.rst new file mode 100644 index 00000000000..cb4c1646adb --- /dev/null +++ b/docs/usecases/contiv/Prometheus.rst @@ -0,0 +1,158 @@ +Prometheus Statistics +===================== + +Each contiv-agent exposes statistics in Prometheus format at port +``9999`` by default. Exposed data is split into two groups: + +- ``/stats`` provides statistics for VPP interfaces managed by + contiv-agent. Prometheus data is a set of counters with labels. For + each interface, the following counters are exposed: + + - *inPackets* + - *outPackets* + - *inBytes* + - *outBytes* + - *ipv4Packets* + - *ipv6Packets* + - *outErrorPackets* + - *dropPackets* + - *inMissPackets* + - *inNobufPackets* + - *puntPackets* + + Labels let you add additional information to a counter. The + *interfaceName* and *node* labels are specified for all counters. If an + interface is associated with a particular pod, then the *podName* and + *podNamespace* labels are also specified for its counters; otherwise, a + placeholder value (``--``) is used (for example, for node interconnect + interfaces). + +- ``/metrics`` provides general Go runtime statistics + +To access Prometheus stats of a node you can use +``curl localhost:9999/stats`` from the node.
The output of contiv-agent +running at k8s master node looks similar to the following: + +:: + + $ curl localhost:9999/stats + # HELP dropPackets Number of dropped packets for interface + # TYPE dropPackets gauge + dropPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + dropPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 52 + dropPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 9 + dropPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 12 + dropPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP inBytes Number of received bytes for interface + # TYPE inBytes gauge + inBytes{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + inBytes{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 24716 + inBytes{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 726 + inBytes{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 6113 + inBytes{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP inErrorPackets Number of received packets with error for interface + # TYPE inErrorPackets gauge + inErrorPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + inErrorPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 + inErrorPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + inErrorPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 + inErrorPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP inMissPackets Number of missed packets for interface + # TYPE 
inMissPackets gauge + inMissPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + inMissPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 + inMissPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + inMissPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 + inMissPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP inNobufPackets Number of received packets ??? for interface + # TYPE inNobufPackets gauge + inNobufPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + inNobufPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 + inNobufPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + inNobufPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 + inNobufPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP inPackets Number of received packets for interface + # TYPE inPackets gauge + inPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + inPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 97 + inPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 9 + inPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 60 + inPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP ipv4Packets Number of ipv4 packets for interface + # TYPE ipv4Packets gauge + ipv4Packets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + 
ipv4Packets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 68 + ipv4Packets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + ipv4Packets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 52 + ipv4Packets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP ipv6Packets Number of ipv6 packets for interface + # TYPE ipv6Packets gauge + ipv6Packets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + ipv6Packets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 26 + ipv6Packets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 9 + ipv6Packets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 8 + ipv6Packets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP outBytes Number of transmitted bytes for interface + # TYPE outBytes gauge + outBytes{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + outBytes{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 5203 + outBytes{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + outBytes{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 17504 + outBytes{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP outErrorPackets Number of transmitted packets with error for interface + # TYPE outErrorPackets gauge + outErrorPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + outErrorPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 + 
outErrorPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + outErrorPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 + outErrorPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP outPackets Number of transmitted packets for interface + # TYPE outPackets gauge + outPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + outPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 49 + outPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + outPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 45 + outPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + # HELP puntPackets Number of punt packets for interface + # TYPE puntPackets gauge + puntPackets{interfaceName="GigabitEthernet0/9/0",node="dev",podName="--",podNamespace="--"} 0 + puntPackets{interfaceName="tap-vpp2",node="dev",podName="--",podNamespace="--"} 0 + puntPackets{interfaceName="tap0e6439a7a934336",node="dev",podName="web-667bdcb4d8-pxkfs",podNamespace="default"} 0 + puntPackets{interfaceName="tap5338a3285ad6bd7",node="dev",podName="kube-dns-6f4fd4bdf-rsz9b",podNamespace="kube-system"} 0 + puntPackets{interfaceName="vxlanBVI",node="dev",podName="--",podNamespace="--"} 0 + +In order to browse stats in web UI Prometheus, it must be started +locally by following the information in the `Prometheus Getting Started +Guide <https://prometheus.io/docs/prometheus/latest/getting_started/>`__. + +If you start Prometheus on a node, the following sample config can be +used: + +.. 
code:: yaml + + global: + scrape_interval: 15s + + scrape_configs: + - job_name: 'contiv_stats' + metrics_path: '/stats' + static_configs: + - targets: ['localhost:9999'] + - job_name: 'contiv_agent' + # metrics_path defaults to '/metrics' + static_configs: + - targets: ['localhost:9999'] + +Once Prometheus is started with the specified config, you should be able +to access its web UI at ``localhost:9090``. + +:: + + tester@dev:~/Downloads/prometheus-2.1.0.linux-amd64$ ./prometheus --config.file=config.yml + +If security features are enabled for the HTTP endpoint, then the config +must be adjusted: + +.. code:: yaml + + - job_name: 'contiv_secured' + + scheme: https + basic_auth: + username: user + password: pass + metrics_path: /stats + tls_config: + insecure_skip_verify: true + # CA certificate to validate API server certificate with. + #[ ca_file: <filename> ] + static_configs: + - targets: ['localhost:9191'] diff --git a/docs/usecases/contiv/SECURITY.md b/docs/usecases/contiv/SECURITY.md deleted file mode 100644 index 40c5250e311..00000000000 --- a/docs/usecases/contiv/SECURITY.md +++ /dev/null @@ -1,104 +0,0 @@ -# Security - -There are two types of security that are utilized in Contiv, and are discussed in this section: [HTTP](#http-security) and [ETCD](#etcd-security). - -## HTTP Security - -By default, the access to endpoints (liveness, readiness probe, prometheus stats, ...) served by Contiv-vswitch and -Contiv-ksr is open to anybody. Contiv-vswitch exposes endpoints using port `9999` and contiv-ksr uses `9191`. - -To secure access to the endpoints, the SSL/TLS server certificate and basic auth (username password) can be configured. - -In Contiv-VPP, this can be done using the Helm charts in [k8s/contiv-vpp folder](https://github.com/contiv/vpp/tree/master/k8s/contiv-vpp). - -To generate server certificate the approach described in [ETCD security](#etcd-security) can be leveraged.
- -## ETCD Security - -By default, the access to Contiv-VPP ETCD is open to anybody. ETCD gets deployed -on the master node, on port `12379`, and is exposed using the NodePort service -on port `32379`, on each node. - -To secure access to ETCD, we recommend using the SSL/TLS certificates to authenticate -both the client and server side, and encrypt the communication. In Contiv-VPP, this can be done using the Helm charts in [k8s/contiv-vpp folder](https://github.com/contiv/vpp/tree/master/k8s/contiv-vpp). - -The prerequisite for that is the generation of SSL certificates. - - -### Generate Self-Signed Certificates -In order to secure ETCD, we need to create our own certificate authority, -and then generate the private keys and certificates for both the ETCD server and ETCD clients. - -This guide uses CloudFlare's [cfssl](https://github.com/cloudflare/cfssl) tools to do this job. -It follows the steps described in this [CoreOS guide](https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md). - -Perform the following steps to generate private keys and certificates: - -##### 1. Install cfssl -``` -mkdir ~/bin -curl -s -L -o ~/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 -curl -s -L -o ~/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 -chmod +x ~/bin/{cfssl,cfssljson} -export PATH=$PATH:~/bin -``` - -##### 2. Initialize a Certificate Authority -``` -echo '{"CN":"CA","key":{"algo":"rsa","size":2048}}' | cfssl gencert -initca - | cfssljson -bare ca - -echo '{"signing":{"default":{"expiry":"43800h","usages":["signing","key encipherment","server auth","client auth"]}}}' > ca-config.json -``` - -##### 3. 
Generate Server Key + Certificate -Replace the IP address `10.0.2.15` below with the IP address of your master node: -``` -export ADDRESS=127.0.0.1,10.0.2.15 -export NAME=server -echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME -``` - -##### 4. Generate Client Key + Certificate -``` -export ADDRESS= -export NAME=client -echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME -``` - -The above commands produce the following files that will be needed in order to secure ETCD: - - `ca.pem`: certificate of the certificate authority - - `server.pem`: certificate of the ETCD server - - `server-key.pem`: private key of the ETCD server - - `client.pem`: certificate for the ETCD clients - - `client-key.pem`: private key for the ETCD clients - - -### Distribute Certificates and Generate Contiv-VPP Deployment Yaml -There are two options for distributing the certificates to all nodes in a k8s cluster. -You can either distribute the certificates -[manually](#distribute-certificates-manually), or embed the certificates into the deployment yaml file and -distribute them as [k8s secrets](https://kubernetes.io/docs/concepts/configuration/secret/). - -##### Distribute Certificates Manually -In this case, you need to copy the `ca.pem`, `client.pem` and `client-key.pem` files -into a specific folder (`/var/contiv/etcd-secrets` by default) on each worker node. -On the master node, you also need to add the `server.pem` and `server-key.pem` into that location. - -Then you can generate the Contiv-VPP deployment YAML as follows: -``` -cd k8s -helm template --name my-release contiv-vpp --set etcd.secureTransport=True > contiv-vpp.yaml -``` -Then you can go ahead and deploy Contiv-VPP using this yaml file. 
- -##### Embed the certificates into deployment the yaml and use k8s secret to distribute them {: #Embed-certificates } -In this case, you need to copy all 5 generated files into the folder with helm definitions -(`k8s/contiv-vpp`) and generate the Contiv-VPP deployment YAML as follows: -``` -cd k8s -helm template --name my-release contiv-vpp --set etcd.secureTransport=True --set etcd.secrets.mountFromHost=False > contiv-vpp.yaml -``` -Then just deploy Contiv-VPP using this yaml file. - -Please note that the path of the mount folder with certificates, as well as the certificate -file names can be customized using the config parameters of the Contiv-VPP chart, -as described in [this README](https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/README.md).
\ No newline at end of file diff --git a/docs/usecases/contiv/SECURITY.rst b/docs/usecases/contiv/SECURITY.rst new file mode 100644 index 00000000000..8e8308e8ba5 --- /dev/null +++ b/docs/usecases/contiv/SECURITY.rst @@ -0,0 +1,145 @@ +Security +======== + +Two types of security are used in Contiv and are +discussed in this section: `HTTP <#http-security>`__ and +`ETCD <#etcd-security>`__. + +HTTP Security +------------- + +By default, access to the endpoints (liveness, readiness probe, +prometheus stats, …) served by Contiv-vswitch and Contiv-ksr is open to +anybody. Contiv-vswitch exposes endpoints using port ``9999`` and +contiv-ksr uses ``9191``. + +To secure access to the endpoints, an SSL/TLS server certificate and +basic auth (username and password) can be configured. + +In Contiv-VPP, this can be done using the Helm charts in `k8s/contiv-vpp +folder <https://github.com/contiv/vpp/tree/master/k8s/contiv-vpp>`__. + +To generate the server certificate, the approach described in `ETCD +security <#etcd-security>`__ can be used. + +ETCD Security +------------- + +By default, access to Contiv-VPP ETCD is open to anybody. ETCD gets +deployed on the master node, on port ``12379``, and is exposed using the +NodePort service on port ``32379``, on each node. + +To secure access to ETCD, we recommend using SSL/TLS certificates to +authenticate both the client and server side, and to encrypt the +communication. In Contiv-VPP, this can be done using the Helm charts in +`k8s/contiv-vpp +folder <https://github.com/contiv/vpp/tree/master/k8s/contiv-vpp>`__. + +The prerequisite for that is the generation of SSL certificates. + +Generate Self-Signed Certificates +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In order to secure ETCD, we need to create our own certificate +authority, and then generate the private keys and certificates for both +the ETCD server and ETCD clients.
+ +This guide uses CloudFlare’s +`cfssl <https://github.com/cloudflare/cfssl>`__ tools to do this job. It +follows the steps described in this `CoreOS +guide <https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md>`__. + +Perform the following steps to generate private keys and certificates: + +1. Install cfssl +^^^^^^^^^^^^^^^^ + +:: + + mkdir ~/bin + curl -s -L -o ~/bin/cfssl https://pkg.cfssl.org/R1.2/cfssl_linux-amd64 + curl -s -L -o ~/bin/cfssljson https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64 + chmod +x ~/bin/{cfssl,cfssljson} + export PATH=$PATH:~/bin + +2. Initialize a Certificate Authority +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + echo '{"CN":"CA","key":{"algo":"rsa","size":2048}}' | cfssl gencert -initca - | cfssljson -bare ca - + echo '{"signing":{"default":{"expiry":"43800h","usages":["signing","key encipherment","server auth","client auth"]}}}' > ca-config.json + +3. Generate Server Key + Certificate +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +Replace the IP address ``10.0.2.15`` below with the IP address of your +master node: + +:: + + export ADDRESS=127.0.0.1,10.0.2.15 + export NAME=server + echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME + +4. 
Generate Client Key + Certificate +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +:: + + export ADDRESS= + export NAME=client + echo '{"CN":"'$NAME'","hosts":[""],"key":{"algo":"rsa","size":2048}}' | cfssl gencert -config=ca-config.json -ca=ca.pem -ca-key=ca-key.pem -hostname="$ADDRESS" - | cfssljson -bare $NAME + +The above commands produce the following files that will be needed in +order to secure ETCD: - ``ca.pem``: certificate of the certificate +authority - ``server.pem``: certificate of the ETCD server - +``server-key.pem``: private key of the ETCD server - ``client.pem``: +certificate for the ETCD clients - ``client-key.pem``: private key for +the ETCD clients + +Distribute Certificates and Generate Contiv-VPP Deployment Yaml +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +There are two options for distributing the certificates to all nodes in +a k8s cluster. You can either distribute the certificates +`manually <#distribute-certificates-manually>`__, or embed the +certificates into the deployment yaml file and distribute them as `k8s +secrets <https://kubernetes.io/docs/concepts/configuration/secret/>`__. + +Distribute Certificates Manually +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In this case, you need to copy the ``ca.pem``, ``client.pem`` and +``client-key.pem`` files into a specific folder +(``/var/contiv/etcd-secrets`` by default) on each worker node. On the +master node, you also need to add the ``server.pem`` and +``server-key.pem`` into that location. + +Then you can generate the Contiv-VPP deployment YAML as follows: + +:: + + cd k8s + helm template --name my-release contiv-vpp --set etcd.secureTransport=True > contiv-vpp.yaml + +Then you can go ahead and deploy Contiv-VPP using this yaml file. 
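The manual distribution described in this section can be sketched as a small script. This is only an illustration under stated assumptions: the node names `k8s-master` and `k8s-worker1`, root SSH access, and the `files_for_role` helper are all hypothetical, and the script prints the copy commands instead of running them.

```shell
#!/usr/bin/env bash
# files_for_role <master|worker>: list the certificate files a node needs.
# Hypothetical helper; the file lists follow the text of this section.
files_for_role() {
  local files="ca.pem client.pem client-key.pem"
  if [ "$1" = "master" ]; then
    # The master node additionally needs the ETCD server key and certificate.
    files="$files server.pem server-key.pem"
  fi
  echo "$files"
}

# Default secrets folder named in this section.
SECRETS_DIR=/var/contiv/etcd-secrets

# Print (rather than run) one copy command per node.
# The role:node pairs below are invented for the example.
for entry in master:k8s-master worker:k8s-worker1; do
  role=${entry%%:*}
  node=${entry##*:}
  echo "scp $(files_for_role "$role") root@$node:$SECRETS_DIR/"
done
```

Printing the commands first makes it easy to review which files go to which node before anything is copied.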
+ +Embed the certificates into deployment the yaml and use k8s secret to distribute them {: #Embed-certificates } +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +In this case, you need to copy all 5 generated files into the folder +with helm definitions (``k8s/contiv-vpp``) and generate the Contiv-VPP +deployment YAML as follows: + +:: + + cd k8s + helm template --name my-release contiv-vpp --set etcd.secureTransport=True --set etcd.secrets.mountFromHost=False > contiv-vpp.yaml + +Then just deploy Contiv-VPP using this yaml file. + +Please note that the path of the mount folder with certificates, as well +as the certificate file names can be customized using the config +parameters of the Contiv-VPP chart, as described in `this +README <https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/README.md>`__. diff --git a/docs/usecases/contiv/SINGLE_NIC_SETUP.md b/docs/usecases/contiv/SINGLE_NIC_SETUP.md deleted file mode 100644 index 83dd47d99a6..00000000000 --- a/docs/usecases/contiv/SINGLE_NIC_SETUP.md +++ /dev/null @@ -1,111 +0,0 @@ -### Setting up a Node with a Single NIC - -#### Installing the STN Daemon -The STN (Steal the NIC) daemon must be installed on every node in the cluster that has only -one NIC. The STN daemon installation(*) should be performed before deployment -of the Contiv-VPP plugin. - -\* Docker daemon must be present when installing STN. Also, Docker must be configured to allow shared mount. -On CentOS, this may not be the case by default. You can enable it by following the instructions at -[https://docs.portworx.com/knowledgebase/shared-mount-propagation.html](https://docs.portworx.com/knowledgebase/shared-mount-propagation.html). - - -Run as root (not using sudo): -``` -bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/stn-install.sh) -``` -The install script should output the following: -``` -Installing Contiv STN daemon. 
-Starting contiv-stn Docker container: -550334308f85f05b2690f5cfb5dd945bd9c501ab9d074231f15c14d7098ef212 -``` - -Check that the STN daemon is running: -``` -docker ps -a -CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES -550334308f85 contivvpp/stn "/stn" 33 seconds ago Up 33 seconds contiv-stn -``` - -Check that the STN daemon is operational: -``` -docker logs contiv-stn -``` -The expected logs would look like the following excerpt: -``` -2018/02/23 10:08:34 Starting the STN GRPC server at port 50051 -``` - -For more details, please read the Go documentation for [contiv-stn](https://github.com/contiv/vpp/blob/master/cmd/contiv-stn/doc.go) -and [contiv-init](https://github.com/contiv/vpp/blob/master/cmd/contiv-init/doc.go). - -#### Creating a VPP Interface Configuration -Create the VPP configuration for the hardware interface as described -[here](https://github.com/contiv/vpp/blob/master/docs/VPP_CONFIG.md#single-nic-configuration). - -#### Configuring STN in Contiv-VPP K8s Deployment Files -The STN feature is disabled by default. It needs to be enabled either globally, -or individually for every node in the cluster. - -##### Global Configuration: -Global configuration is used in homogeneous environments where all nodes in -a given cluster have the same hardware configuration, for example only a single -Network Adapter. To enable the STN feature globally, put the `StealFirstNIC: True` -stanza into the \[`contiv-vpp.yaml`\]\[1\] deployment file, for example: -``` -data: - contiv.yaml: |- - TCPstackDisabled: true - ... - StealFirstNIC: True - ... - IPAMConfig: -``` - -Setting `StealFirstNIC` to `True` will tell the STN Daemon on every node in the -cluster to steal the first NIC from the kernel and assign it to VPP. Note that -the Network Adapters on different nodes do not need to be of the same type. You -still need to create the respective vswitch configurations on every node in the -cluster, as shown \[above\](#creating-a-vpp-interface-configuration). 
- -##### Individual Configuration: -Individual configuration is used in heterogeneous environments where each node -in a given cluster may be configured differently. To enable the STN feature -for a specific node in the cluster, put the following stanza into its Node -Configuration in the \[`contiv-vpp.yaml`\]\[1\] deployment file, for example: -``` -... - NodeConfig: - - NodeName: "ubuntu-1" - StealInterface: "enp0s8" - - NodeName: "ubuntu-2" - StealInterface: "enp0s8" -... -``` -Note that you still have to create the vswitch configuration on the node as -shown [here](#creating-a-vpp-interface-configuration). - - - -#### Uninstalling the STN Daemon - -Run as root (not using sudo): -``` -bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/stn-install.sh) --uninstall -``` -The install script should output the following: -``` -Uninstalling Contiv STN daemon. -Stopping contiv-stn Docker container: -contiv-stn -contiv-stn -contiv-stn -``` -Make sure that the STN daemon has been uninstalled: -``` -docker ps -q -f name=contiv-stn -``` -No containers should be listed. - -[1]: ../k8s/contiv-vpp.yaml diff --git a/docs/usecases/contiv/SINGLE_NIC_SETUP.rst b/docs/usecases/contiv/SINGLE_NIC_SETUP.rst new file mode 100644 index 00000000000..43d4c3f5491 --- /dev/null +++ b/docs/usecases/contiv/SINGLE_NIC_SETUP.rst @@ -0,0 +1,140 @@ +Setting up a Node with a Single NIC +=================================== + +Installing the STN Daemon +------------------------- + +The STN (Steal the NIC) daemon must be installed on every node in the +cluster that has only one NIC. The STN daemon installation(*) should be +performed before deployment of the Contiv-VPP plugin. + +\* Docker daemon must be present when installing STN. Also, Docker must +be configured to allow shared mount. On CentOS, this may not be the case +by default. You can enable it by following the instructions at +https://docs.portworx.com/knowledgebase/shared-mount-propagation.html. 
+ +Run as root (not using sudo): + +:: + + bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/stn-install.sh) + +The install script should output the following: + +:: + + Installing Contiv STN daemon. + Starting contiv-stn Docker container: + 550334308f85f05b2690f5cfb5dd945bd9c501ab9d074231f15c14d7098ef212 + +Check that the STN daemon is running: + +:: + + docker ps -a + CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES + 550334308f85 contivvpp/stn "/stn" 33 seconds ago Up 33 seconds contiv-stn + +Check that the STN daemon is operational: + +:: + + docker logs contiv-stn + +The expected logs would look like the following excerpt: + +:: + + 2018/02/23 10:08:34 Starting the STN GRPC server at port 50051 + +For more details, please read the Go documentation for +`contiv-stn <https://github.com/contiv/vpp/blob/master/cmd/contiv-stn/doc.go>`__ +and +`contiv-init <https://github.com/contiv/vpp/blob/master/cmd/contiv-init/doc.go>`__. + +Creating a VPP Interface Configuration +-------------------------------------- + +Create the VPP configuration for the hardware interface as described +`here <https://github.com/contiv/vpp/blob/master/docs/VPP_CONFIG.md#single-nic-configuration>`__. + +Configuring STN in Contiv-VPP K8s Deployment Files +-------------------------------------------------- + +The STN feature is disabled by default. It needs to be enabled either +globally, or individually for every node in the cluster. + +Global Configuration: +~~~~~~~~~~~~~~~~~~~~~ + +Global configuration is used in homogeneous environments where all nodes +in a given cluster have the same hardware configuration, for example +only a single Network Adapter. To enable the STN feature globally, put +the ``StealFirstNIC: True`` stanza into the [``contiv-vpp.yaml``][1] +deployment file, for example: + +:: + + data: + contiv.yaml: |- + TCPstackDisabled: true + ... + StealFirstNIC: True + ... 
+ IPAMConfig: + +Setting ``StealFirstNIC`` to ``True`` will tell the STN Daemon on every +node in the cluster to steal the first NIC from the kernel and assign it +to VPP. Note that the Network Adapters on different nodes do not need to +be of the same type. You still need to create the respective vswitch +configurations on every node in the cluster, as shown +[above](#creating-a-vpp-interface-configuration). + +Individual Configuration: +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Individual configuration is used in heterogeneous environments where +each node in a given cluster may be configured differently. To enable +the STN feature for a specific node in the cluster, put the following +stanza into its Node Configuration in the [``contiv-vpp.yaml``][1] +deployment file, for example: + +:: + + ... + NodeConfig: + - NodeName: "ubuntu-1" + StealInterface: "enp0s8" + - NodeName: "ubuntu-2" + StealInterface: "enp0s8" + ... + +Note that you still have to create the vswitch configuration on the node +as shown `here <#creating-a-vpp-interface-configuration>`__. + +Uninstalling the STN Daemon +--------------------------- + +Run as root (not using sudo): + +:: + + bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/stn-install.sh) --uninstall + +The install script should output the following: + +:: + + Uninstalling Contiv STN daemon. + Stopping contiv-stn Docker container: + contiv-stn + contiv-stn + contiv-stn + +Make sure that the STN daemon has been uninstalled: + +:: + + docker ps -q -f name=contiv-stn + +No containers should be listed. diff --git a/docs/usecases/contiv/VMWARE_FUSION_HOST.md b/docs/usecases/contiv/VMWARE_FUSION_HOST.md deleted file mode 100644 index d4e251c0fcd..00000000000 --- a/docs/usecases/contiv/VMWARE_FUSION_HOST.md +++ /dev/null @@ -1,52 +0,0 @@ -### Preparing a VmWare Fusion Host -The *vmxnet3 driver* is required on a GigE Network Adapter used by VPP. 
On VmWare -Fusion, the default Network Adapter driver is an *Intel 82545EM (e1000)*, and there -is no GUI to change it to *vmxnet3*. The change must be done manually in the VM's -configuration file as follows: - -- Bring up the VM library window: **Window -> Virtual Machine Library** -- Right click on the VM where you want to change the driver: - <*VM-Name*> **-> Show in Finder**. This pops up a new Finder window with a line - for each VM that Fusion knows about. -- Right click on the VM where you want to change the driver: - <*VM-Name*> **-> Show package contents**. This brings up a window with the - contents of the package. -- Open the file <*VM-Name*> **.vmx** with your favorite text editor. -- For each Network Adapter that you want used by VPP, look for the - Network Adapter's driver configuration. For example, for the VM's first - Network Adapter look for: - ``` - ethernet0.virtualDev = "e1000" - ``` - Replace `e1000` with `vmxnet3`: - ``` - ethernet0.virtualDev = "vmxnet3" - ``` -and restart the VM. - -If you replaced the driver on your VM's primary Network Adapter, you will -have to change the primary network interface configuration in Linux. - -First, get the new primary network interface name: -``` -sudo lshw -class network -businfo - -Bus info Device Class Description -======================================================== -pci@0000:03:00.0 ens160 network VMXNET3 Ethernet Controller -``` -Replace the existing primary network interface name in `/etc/network/interfaces` -with the above device name (ens160): -``` -# This file describes the network interfaces available on your system, -# and how to activate them. For more information, see interfaces(5). - -source /etc/network/interfaces.d/* - -# The loopback network interface -auto lo -iface lo inet loopback - -# The primary network interface -auto ens160 -iface ens160 inet dhcp
\ No newline at end of file diff --git a/docs/usecases/contiv/VMWARE_FUSION_HOST.rst b/docs/usecases/contiv/VMWARE_FUSION_HOST.rst new file mode 100644 index 00000000000..6ddc66546c0 --- /dev/null +++ b/docs/usecases/contiv/VMWARE_FUSION_HOST.rst @@ -0,0 +1,66 @@ +Preparing a VmWare Fusion Host +============================== + +The *vmxnet3 driver* is required on a GigE Network Adapter used by VPP. +On VmWare Fusion, the default Network Adapter driver is an *Intel +82545EM (e1000)*, and there is no GUI to change it to *vmxnet3*. The +change must be done manually in the VM’s configuration file as follows: + +- Bring up the VM library window: **Window -> Virtual Machine Library** + +- Right click on the VM where you want to change the driver: + <*VM-Name*> **-> Show in Finder**. This pops up a new Finder window + with a line for each VM that Fusion knows about. + +- Right click on the VM where you want to change the driver: + <*VM-Name*> **-> Show package contents**. This brings up a window + with the contents of the package. + +- Open the file <*VM-Name*> **.vmx** with your favorite text editor. + +- For each Network Adapter that you want used by VPP, look for the + Network Adapter’s driver configuration. For example, for the VM’s + first Network Adapter look for: + + :: + + ethernet0.virtualDev = "e1000" + + Replace ``e1000`` with ``vmxnet3``: + + :: + + ethernet0.virtualDev = "vmxnet3" + + and restart the VM. + +If you replaced the driver on your VM’s primary Network Adapter, you +will have to change the primary network interface configuration in +Linux. 
+ +First, get the new primary network interface name: + +:: + + sudo lshw -class network -businfo + + Bus info Device Class Description + ======================================================== + pci@0000:03:00.0 ens160 network VMXNET3 Ethernet Controller + +Replace the existing primary network interface name in +``/etc/network/interfaces`` with the above device name (ens160): \``\` # +This file describes the network interfaces available on your system, # +and how to activate them. For more information, see interfaces(5). + +source /etc/network/interfaces.d/\* + +The loopback network interface +============================== + +auto lo iface lo inet loopback + +The primary network interface +============================= + +auto ens160 iface ens160 inet dhcp diff --git a/docs/usecases/contiv/VPPTRACE.md b/docs/usecases/contiv/VPPTRACE.md deleted file mode 100644 index c9d2088266b..00000000000 --- a/docs/usecases/contiv/VPPTRACE.md +++ /dev/null @@ -1,95 +0,0 @@ -## Using vpptrace.sh for VPP Packet Tracing - -VPP allows tracing of incoming packets using CLI commands `trace add` and `show trace` -as explained \[here\](VPP_PACKET_TRACING_K8S.html), but it is a rather cumbersome process. - -The buffer for captured packets is limited in size, and once it gets full the tracing stops. The user has to manually clear the buffer content, and then repeat the trace command to resume the packet capture, losing information about all packets received in the meantime. - -Packet filtering exposed via the CLI command `trace filter` is also quite limited in what it can do. Currently there is just one available filter, which allows you to keep only packets that include a certain node in the trace or exclude a certain node in the trace. -It is not possible to filter the traffic by its content (e.g., by the source/destination IP address, protocol, etc.). 
- -Last but not least, it is not possible to trace packets on a selected interface -like `tcpdump`, which allows tracing via the option `-i`. VPP is only able to capture packets -on the *RX side* of selected *devices* (e.g., dpdk, virtio, af-packet). This means -that interfaces based on the same device cannot be traced for incoming packets -individually, but only all at the same time. In Contiv/VPP all pods are connected -with VPP via the same kind of the TAP interface, meaning that it is not possible to -capture packets incoming only from one selected pod. - -Contiv/VPP ships with a simple bash script [vpptrace.sh](https://github.com/contiv/vpp/blob/master/scripts/vpptrace.sh), -which helps alleviate the aforementioned VPP limitations. The script automatically -re-initializes buffers and traces whenever it is close to getting full, in order to -avoid packet loss as much as possible. Next it allows you to filter packets -by the content of the trace. There are two modes of filtering: - - *substring mode* (default): packet trace must contain a given sub-string in order to - be included in the output - - *regex mode*: packet trace must match a given regex in order to be printed - -The script is still limited, in that capture runs only on the RX side of all interfaces that are built on top of selected devices. Using filtering, however, it is possible to limit -*traffic by interface* simply by using the interface name as a substring to match against. - -#### Usage - -Run the script with option `-h` to get the usage printed: -``` -Usage: ./vpptrace.sh [-i <VPP-IF-TYPE>]... [-a <VPP-ADDRESS>] [-r] [-f <REGEXP> / <SUBSTRING>] - -i <VPP-IF-TYPE> : VPP interface *type* to run the packet capture on (e.g., dpdk-input, virtio-input, etc.) 
- - available aliases: - - af-packet-input: afpacket, af-packet, veth - - virtio-input: tap (version determined from the VPP runtime config), tap2, tapv2 - - tapcli-rx: tap (version determined from the VPP config), tap1, tapv1 - - dpdk-input: dpdk, gbe, phys* - - multiple interfaces can be watched at the same time - the option can be repeated with - different values - - default = dpdk + tap - -a <VPP-ADDRESS> : IP address or hostname of the VPP to capture packets from - - not supported if VPP listens on a UNIX domain socket - - default = 127.0.0.1 - -r : apply filter string (passed with -f) as a regexp expression - - by default the filter is NOT treated as regexp - -f : filter string that packet must contain (without -r) or match as regexp (with -r) to be printed - - default is no filtering -``` - -`VPP-IF-TYPE` is a repeated option used to select the set of devices (e.g., virtio, dpdk, etc.) -to capture the incoming traffic. Script provides multiple aliases, which -are much easier to remember than the device names. For `dpdk-input` one can enter -just `dpdk`, or anything starting with `phys`, etc. For TAPs, the script is even -smart enough to find out the TAP version used, which allows to enter just `tap` -as the device name. - -If `VPP-IF-TYPE` is not specified, then the default behaviour is to capture from both -`dpdk` (traffic entering the node from outside) and `tap` (preferred interface type -for pod-VPP and host-VPP interconnection, receiving node-initiated traffic). - -vpptrace.sh can capture packets even from a VPP on a different host, provided that -VPP-CLI listens on a port, and not on a UNIX domain socket (for security reasons IPC -is the default communication link, see `/etc/vpp/contiv-vswitch.conf`). Enter the destination -node IP address via the option `-a`(localhost is the default). - -The capture can be filtered via the `-f` option. The output will include only packets -whose trace matches contain the given expression/sub-string. 
- -Option `-r` enables the regex mode for filtering. - -#### Examples - -- Capture all packets entering VPP via `tapcli-1` interface **AND** all packets - leaving VPP via `tapcli-1` that were sent from a pod, or the host on the *same node* - (sent from tap, not Gbe): -``` -$ vpptrace.sh -i tap -f "tapcli-1" -``` - -- Capture all packets with source or destination IP address 10.1.1.3: -``` -$ vpptrace.sh -i tap -i dpdk -f "10.1.1.3" - -Or just: -$ vpptrace.sh "10.1.1.3" -``` - -- Capture all SYN-ACKs received from outside: -``` -$ vpptrace.sh -i dpdk -f "SYN-ACK" -```
\ No newline at end of file diff --git a/docs/usecases/contiv/VPPTRACE.rst b/docs/usecases/contiv/VPPTRACE.rst new file mode 100644 index 00000000000..a277a57b24f --- /dev/null +++ b/docs/usecases/contiv/VPPTRACE.rst @@ -0,0 +1,120 @@ +Using vpptrace.sh for VPP Packet Tracing +======================================== + +VPP allows tracing of incoming packets using CLI commands ``trace add`` +and ``show trace`` as explained [here](VPP_PACKET_TRACING_K8S.html), but +it is a rather cumbersome process. + +The buffer for captured packets is limited in size, and once it gets +full the tracing stops. The user has to manually clear the buffer +content, and then repeat the trace command to resume the packet capture, +losing information about all packets received in the meantime. + +Packet filtering exposed via the CLI command ``trace filter`` is also +quite limited in what it can do. Currently there is just one available +filter, which allows you to keep only packets that include a certain +node in the trace or exclude a certain node in the trace. It is not +possible to filter the traffic by its content (e.g., by the +source/destination IP address, protocol, etc.). + +Last but not least, it is not possible to trace packets on a selected +interface like ``tcpdump``, which allows tracing via the option ``-i``. +VPP is only able to capture packets on the *RX side* of selected +*devices* (e.g., dpdk, virtio, af-packet). This means that interfaces +based on the same device cannot be traced for incoming packets +individually, but only all at the same time. In Contiv/VPP all pods are +connected with VPP via the same kind of the TAP interface, meaning that +it is not possible to capture packets incoming only from one selected +pod. + +Contiv/VPP ships with a simple bash script +`vpptrace.sh <https://github.com/contiv/vpp/blob/master/scripts/vpptrace.sh>`__, +which helps alleviate the aforementioned VPP limitations. 
The script +automatically re-initializes buffers and traces whenever it is close to +getting full, in order to avoid packet loss as much as possible. Next it +allows you to filter packets by the content of the trace. There are two +modes of filtering: - *substring mode* (default): packet trace must +contain a given sub-string in order to be included in the output - +*regex mode*: packet trace must match a given regex in order to be +printed + +The script is still limited, in that capture runs only on the RX side of +all interfaces that are built on top of selected devices. Using +filtering, however, it is possible to limit *traffic by interface* +simply by using the interface name as a substring to match against. + +Usage +----- + +Run the script with option ``-h`` to get the usage printed: + +:: + + Usage: ./vpptrace.sh [-i <VPP-IF-TYPE>]... [-a <VPP-ADDRESS>] [-r] [-f <REGEXP> / <SUBSTRING>] + -i <VPP-IF-TYPE> : VPP interface *type* to run the packet capture on (e.g., dpdk-input, virtio-input, etc.) + - available aliases: + - af-packet-input: afpacket, af-packet, veth + - virtio-input: tap (version determined from the VPP runtime config), tap2, tapv2 + - tapcli-rx: tap (version determined from the VPP config), tap1, tapv1 + - dpdk-input: dpdk, gbe, phys* + - multiple interfaces can be watched at the same time - the option can be repeated with + different values + - default = dpdk + tap + -a <VPP-ADDRESS> : IP address or hostname of the VPP to capture packets from + - not supported if VPP listens on a UNIX domain socket + - default = 127.0.0.1 + -r : apply filter string (passed with -f) as a regexp expression + - by default the filter is NOT treated as regexp + -f : filter string that packet must contain (without -r) or match as regexp (with -r) to be printed + - default is no filtering + +``VPP-IF-TYPE`` is a repeated option used to select the set of devices +(e.g., virtio, dpdk, etc.) to capture the incoming traffic. 
Script +provides multiple aliases, which are much easier to remember than the +device names. For ``dpdk-input`` one can enter just ``dpdk``, or +anything starting with ``phys``, etc. For TAPs, the script is even smart +enough to find out the TAP version used, which allows to enter just +``tap`` as the device name. + +If ``VPP-IF-TYPE`` is not specified, then the default behaviour is to +capture from both ``dpdk`` (traffic entering the node from outside) and +``tap`` (preferred interface type for pod-VPP and host-VPP +interconnection, receiving node-initiated traffic). + +vpptrace.sh can capture packets even from a VPP on a different host, +provided that VPP-CLI listens on a port, and not on a UNIX domain socket +(for security reasons IPC is the default communication link, see +``/etc/vpp/contiv-vswitch.conf``). Enter the destination node IP address +via the option ``-a``\ (localhost is the default). + +The capture can be filtered via the ``-f`` option. The output will +include only packets whose trace matches contain the given +expression/sub-string. + +Option ``-r`` enables the regex mode for filtering. 
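The difference between the two filter modes can be illustrated with plain `grep` on a few made-up trace lines. This is a sketch of the matching semantics only, not the logic inside `vpptrace.sh`; the sample lines and the `sample_traces` helper are invented for the example.

```shell
#!/usr/bin/env bash
# Hypothetical trace lines, invented for this illustration only.
sample_traces() {
  cat <<'EOF'
tapcli-1: 10.1.1.3 -> 10.1.1.4 TCP SYN
tapcli-2: 10.1.1.5 -> 10.1.1.3 TCP SYN-ACK
GigabitEthernet0/8/0: 10x1y1z3 -> 192.168.16.1 UDP
EOF
}

# Substring mode (comparable to -f without -r): the filter is matched literally.
substring_matches=$(sample_traces | grep -F -c '10.1.1.3')

# Regex mode (comparable to -f with -r): "." matches any character,
# so the third line matches as well.
regex_matches=$(sample_traces | grep -E -c '10.1.1.3')

echo "substring: $substring_matches, regex: $regex_matches"
```

In regex mode, a literal dot in an IP address should therefore be escaped (for example `10\.1\.1\.3`) if only exact addresses are wanted.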
+ +Examples +-------- + +- Capture all packets entering VPP via ``tapcli-1`` interface **AND** + all packets leaving VPP via ``tapcli-1`` that were sent from a pod, + or the host on the *same node* (sent from tap, not Gbe): + +:: + + $ vpptrace.sh -i tap -f "tapcli-1" + + - Capture all packets with source or destination IP address 10.1.1.3: + +:: + + $ vpptrace.sh -i tap -i dpdk -f "10.1.1.3" + + Or just: + $ vpptrace.sh "10.1.1.3" + +- Capture all SYN-ACKs received from outside: + +:: + + $ vpptrace.sh -i dpdk -f "SYN-ACK" diff --git a/docs/usecases/contiv/VPP_CONFIG.md b/docs/usecases/contiv/VPP_CONFIG.md deleted file mode 100644 index 0d0559372cb..00000000000 --- a/docs/usecases/contiv/VPP_CONFIG.md +++ /dev/null @@ -1,153 +0,0 @@ -## Creating VPP Startup Configuration -This document describes how to create the VPP startup configuration -file located at `/etc/vpp/contiv-vswitch.conf`. - -### Hardware Interface Configuration -#### Single-NIC Configuration -You need to configure hardware interfaces for use by VPP. First, find out the PCI address of the host's network interface. 
On -Debian-based distributions, you can use `lshw`: - -``` -sudo lshw -class network -businfo -Bus info Device Class Description -======================================================== -pci@0000:03:00.0 ens160 network VMXNET3 Ethernet Controller -``` - -In our case, it would be the `ens3` interface with the PCI address -`0000:00:03.0` - -Now, add or modify the VPP startup config file (`/etc/vpp/contiv-vswitch.conf`) -to contain the proper PCI address: -``` -unix { - nodaemon - cli-listen /run/vpp/cli.sock - cli-no-pager - coredump-size unlimited - full-coredump - poll-sleep-usec 100 -} -nat { - endpoint-dependent -} -dpdk { - dev 0000:00:03.0 -} -api-trace { - on - nitems 500 -} -``` -#### Multi-NIC Configuration -Similar to the single-NIC configuration, use command *lshw* to find the PCI -addresses of all the NICs in the system, for example: - -``` -$ sudo lshw -class network -businfo -Bus info Device Class Description -==================================================== -pci@0000:00:03.0 ens3 network Virtio network device -pci@0000:00:04.0 ens4 network Virtio network device -``` - -In the example above, `ens3` would be the primary interface and `ens4` would -be the interface that would be used by VPP. The PCI address of the `ens4` -interface would be `0000:00:04.0`. - -Make sure the selected interface is *shut down*, otherwise VPP -will not grab it: -``` -sudo ip link set ens4 down -``` - -Now, add or modify the VPP startup config file in `/etc/vpp/contiv-vswitch.conf` -to contain the proper PCI address: -``` -unix { - nodaemon - cli-listen /run/vpp/cli.sock - cli-no-pager - coredump-size unlimited - full-coredump - poll-sleep-usec 100 -} -nat { - endpoint-dependent -} -dpdk { - dev 0000:00:04.0 -} -api-trace { - on - nitems 500 -} -``` -If assigning multiple NICs to VPP you will need to include each NIC's PCI address -in the dpdk stanza in `/etc/vpp/contiv-vswitch.conf`. 
- -##### Assigning all NICs to VPP -On a multi-NIC node, it is also possible to assign all NICs from the kernel for -use by VPP. First, you need to install the STN daemon, as described [here][1], -since you will want the NICs to revert to the kernel if VPP crashes. - -You also need to configure the NICs in the VPP startup config file -in `/etc/vpp/contiv-vswitch.conf`. For example, to use both the primary and -secondary NIC, in a two-NIC node, your VPP startup config file would look -something like this: - -``` -unix { - nodaemon - cli-listen /run/vpp/cli.sock - cli-no-pager - coredump-size unlimited - full-coredump - poll-sleep-usec 100 -} -nat { - endpoint-dependent -} -dpdk { - dev 0000:00:03.0 - dev 0000:00:04.0 -} -api-trace { - on - nitems 500 -} -``` - -#### Installing `lshw` on CentOS/RedHat/Fedora -Note: On CentOS/RedHat/Fedora distributions, `lshw` may not be available -by default, install it by -``` -sudo yum -y install lshw -``` - -### Power-saving Mode -In regular operation, VPP takes 100% of one CPU core at all times (poll loop). -If high performance and low latency is not required you can "slow-down" -the poll-loop and drastically reduce CPU utilization by adding the following -stanza to the `unix` section of the VPP startup config file: -``` -unix { - ... - poll-sleep-usec 100 - ... -} -``` -The power-saving mode is especially useful in VM-based development environments -running on laptops or less powerful servers. - -### VPP API Trace -To troubleshoot VPP configuration issues in production environments, it is -strongly recommended to configure VPP API trace. This is done by adding the -following stanza to the VPP startup config file: -``` -api-trace { - on - nitems 500 -} -``` -You can set the size of the trace buffer with the <nitems> attribute. 
diff --git a/docs/usecases/contiv/VPP_CONFIG.rst b/docs/usecases/contiv/VPP_CONFIG.rst new file mode 100644 index 00000000000..5dcb6dd5f66 --- /dev/null +++ b/docs/usecases/contiv/VPP_CONFIG.rst @@ -0,0 +1,183 @@ +Creating VPP Startup Configuration +================================== + +This document describes how to create the VPP startup configuration file +located at ``/etc/vpp/contiv-vswitch.conf``. + +Hardware Interface Configuration +-------------------------------- + +Single-NIC Configuration +~~~~~~~~~~~~~~~~~~~~~~~~ + +You need to configure hardware interfaces for use by VPP. First, find +out the PCI address of the host’s network interface. On Debian-based +distributions, you can use ``lshw``: + +:: + + sudo lshw -class network -businfo + Bus info Device Class Description + ======================================================== + pci@0000:03:00.0 ens160 network VMXNET3 Ethernet Controller + +The output above shows ``ens160`` at ``0000:03:00.0``; the configuration below assumes the ``ens3`` interface at PCI address +``0000:00:03.0``, so substitute the address reported on your host. + +Now, add or modify the VPP startup config file +(``/etc/vpp/contiv-vswitch.conf``) to contain the proper PCI address: + +:: + + unix { + nodaemon + cli-listen /run/vpp/cli.sock + cli-no-pager + coredump-size unlimited + full-coredump + poll-sleep-usec 100 + } + nat { + endpoint-dependent + } + dpdk { + dev 0000:00:03.0 + } + api-trace { + on + nitems 500 + } + +Multi-NIC Configuration +~~~~~~~~~~~~~~~~~~~~~~~ + +Similar to the single-NIC configuration, use the ``lshw`` command to find the +PCI addresses of all the NICs in the system, for example: + +:: + + $ sudo lshw -class network -businfo + Bus info Device Class Description + ==================================================== + pci@0000:00:03.0 ens3 network Virtio network device + pci@0000:00:04.0 ens4 network Virtio network device + +In the example above, ``ens3`` is the primary interface and +``ens4`` is the interface to be used by VPP. The PCI +address of the ``ens4`` interface is ``0000:00:04.0``.
+ +Make sure the selected interface is *shut down*, otherwise VPP will not +grab it: + +:: + + sudo ip link set ens4 down + +Now, add or modify the VPP startup config file in +``/etc/vpp/contiv-vswitch.conf`` to contain the proper PCI address: + +:: + + unix { + nodaemon + cli-listen /run/vpp/cli.sock + cli-no-pager + coredump-size unlimited + full-coredump + poll-sleep-usec 100 + } + nat { + endpoint-dependent + } + dpdk { + dev 0000:00:04.0 + } + api-trace { + on + nitems 500 + } + +If you assign multiple NICs to VPP, you need to include each NIC’s +PCI address in the ``dpdk`` stanza in ``/etc/vpp/contiv-vswitch.conf``. + +Assigning all NICs to VPP +^^^^^^^^^^^^^^^^^^^^^^^^^ + +On a multi-NIC node, it is also possible to assign all NICs from the +kernel for use by VPP. First, you need to install the STN daemon, as +described [here][1], since you will want the NICs to revert to the +kernel if VPP crashes. + +You also need to configure the NICs in the VPP startup config file in +``/etc/vpp/contiv-vswitch.conf``. For example, to use both the primary +and secondary NIC on a two-NIC node, your VPP startup config file would +look something like this: + +:: + + unix { + nodaemon + cli-listen /run/vpp/cli.sock + cli-no-pager + coredump-size unlimited + full-coredump + poll-sleep-usec 100 + } + nat { + endpoint-dependent + } + dpdk { + dev 0000:00:03.0 + dev 0000:00:04.0 + } + api-trace { + on + nitems 500 + } + +Installing ``lshw`` on CentOS/RedHat/Fedora +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Note: On CentOS/RedHat/Fedora distributions, ``lshw`` may not be +available by default; install it with: + +:: + + sudo yum -y install lshw + +Power-saving Mode +----------------- + +In regular operation, VPP takes 100% of one CPU core at all times (poll +loop).
If high performance and low latency are not required, you can +“slow down” the poll loop and drastically reduce CPU utilization by +adding the following stanza to the ``unix`` section of the VPP startup +config file: + +:: + + unix { + ... + poll-sleep-usec 100 + ... + } + +The power-saving mode is especially useful in VM-based development +environments running on laptops or less powerful servers. + +VPP API Trace +------------- + +To troubleshoot VPP configuration issues in production environments, it +is strongly recommended to configure VPP API trace. This is done by +adding the following stanza to the VPP startup config file: + +:: + + api-trace { + on + nitems 500 + } + +You can set the size of the trace buffer with the ``nitems`` attribute. + diff --git a/docs/usecases/contiv/VPP_PACKET_TRACING_K8S.md b/docs/usecases/contiv/VPP_PACKET_TRACING_K8S.md deleted file mode 100644 index 740918197e2..00000000000 --- a/docs/usecases/contiv/VPP_PACKET_TRACING_K8S.md +++ /dev/null @@ -1,510 +0,0 @@ -## How to do VPP Packet Tracing in Kubernetes - -This document describes the steps to do *manual* packet tracing (capture) using -VPP in Kubernetes. Contiv/VPP also ships with a simple bash script -[vpptrace.sh](https://github.com/contiv/vpp/blob/master/scripts/vpptrace.sh), -which allows to *continuously* trace and -*filter* packets incoming through a given set of interface types. -Documentation for vpptrace.sh is available [here](https://github.com/contiv/vpp/blob/master/docs/VPPTRACE.md).
- - -More information about VPP packet tracing is in: - -* <https://wiki.fd.io/view/VPP/Command-line_Interface_(CLI)_Guide#packet_tracer> -* <https://wiki.fd.io/view/VPP/How_To_Use_The_Packet_Generator_and_Packet_Tracer> -* <https://wiki.fd.io/view/VPP/Tutorial_Routing_and_Switching> - -#### SSH into the Node -Perform the following commands to SSH into the node: - -``` -cd vpp/vagrant/vagrant-scripts/ -vagrant ssh k8s-worker1 -``` - -#### Check the VPP Graph Nodes (Input and Output Queues) - -The following content shows what is running on VPP, via the `show run` command - -``` -vagrant@k8s-worker1:~$ sudo vppctl - _______ _ _ _____ ___ - __/ __/ _ \ (_)__ | | / / _ \/ _ \ - _/ _// // / / / _ \ | |/ / ___/ ___/ - /_/ /____(_)_/\___/ |___/_/ /_/ - -vpp# show run -Time 1026791.9, average vectors/node 1.12, last 128 main loops 0.00 per node 0.00 - vector rates in 1.6459e-4, out 1.5485e-4, drop 1.3635e-5, punt 0.0000e0 - Name State Calls Vectors Suspends Clocks Vectors/Call -GigabitEthernet0/8/0-output active 56 69 0 1.34e3 1.23 -GigabitEthernet0/8/0-tx active 54 67 0 8.09e5 1.24 -acl-plugin-fa-cleaner-process event wait 0 0 1 2.84e4 0.00 -admin-up-down-process event wait 0 0 1 4.59e3 0.00 -api-rx-from-ring any wait 0 0 3316292 1.24e5 0.00 -arp-input active 3 3 0 2.53e5 1.00 -bfd-process event wait 0 0 1 5.94e3 0.00 -cdp-process any wait 0 0 145916 1.36e4 0.00 -dhcp-client-process any wait 0 0 10268 3.65e4 0.00 -dns-resolver-process any wait 0 0 1027 5.86e4 0.00 -dpdk-input polling 8211032318951 93 0 1.48e13 0.00 -dpdk-ipsec-process done 1 0 0 2.10e5 0.00 -dpdk-process any wait 0 0 342233 9.86e6 0.00 -error-drop active 12 14 0 6.67e3 1.17 -ethernet-input active 60 74 0 5.81e3 1.23 -fib-walk any wait 0 0 513322 1.59e4 0.00 -flow-report-process any wait 0 0 1 1.45e3 0.00 -flowprobe-timer-process any wait 0 0 1 6.34e3 0.00 -ikev2-manager-process any wait 0 0 1026484 1.18e4 0.00 -interface-output active 2 2 0 3.23e3 1.00 -ioam-export-process any wait 0 0 1 1.98e3 0.00 
-ip-route-resolver-process any wait 0 0 10268 3.02e4 0.00 -ip4-arp active 1 1 0 1.49e4 1.00 -ip4-input active 223 248 0 3.39e3 1.11 -ip4-load-balance active 106 132 0 5.34e3 1.25 -ip4-local active 86 92 0 2.46e3 1.07 -ip4-local-end-of-arc active 86 92 0 1.00e3 1.07 -ip4-lookup active 223 248 0 3.31e3 1.11 -ip4-rewrite active 190 222 0 1.92e3 1.17 -ip4-udp-lookup active 86 92 0 3.76e3 1.07 -ip6-drop active 6 7 0 2.29e3 1.17 -ip6-icmp-neighbor-discovery-ev any wait 0 0 1026484 1.13e4 0.00 -ip6-input active 6 7 0 3.33e3 1.17 -l2-flood active 2 2 0 4.42e3 1.00 -l2-fwd active 138 157 0 2.13e3 1.14 -l2-input active 140 159 0 2.41e3 1.14 -l2-learn active 86 92 0 3.64e4 1.07 -l2-output active 54 67 0 3.05e3 1.24 -l2fib-mac-age-scanner-process event wait 0 0 85 5.01e4 0.00 -lisp-retry-service any wait 0 0 513322 1.62e4 0.00 -lldp-process event wait 0 0 1 5.02e4 0.00 -loop0-output active 54 67 0 1.66e3 1.24 -loop0-tx active 54 0 0 2.49e3 0.00 -memif-process event wait 0 0 1 1.70e4 0.00 -nat-det-expire-walk done 1 0 0 3.79e3 0.00 -nat44-classify active 171 183 0 2.49e3 1.07 -nat44-hairpinning active 86 92 0 1.80e3 1.07 -nat44-in2out active 171 183 0 4.45e3 1.07 -nat44-in2out-slowpath active 171 183 0 3.98e3 1.07 -nat44-out2in active 52 65 0 1.28e4 1.25 -nat64-expire-walk any wait 0 0 102677 5.95e4 0.00 -nat64-expire-worker-walk interrupt wa 102676 0 0 7.39e3 0.00 -send-garp-na-process event wait 0 0 1 1.28e3 0.00 -startup-config-process done 1 0 1 4.19e3 0.00 -tapcli-0-output active 1 1 0 6.97e3 1.00 -tapcli-0-tx active 1 1 0 7.32e4 1.00 -tapcli-1-output active 57 63 0 1.66e3 1.11 -tapcli-1-tx active 57 63 0 1.35e5 1.11 -tapcli-2-output active 28 28 0 3.26e3 1.00 -tapcli-2-tx active 28 28 0 4.06e5 1.00 -tapcli-rx interrupt wa 62 76 0 6.58e4 1.23 -udp-ping-process any wait 0 0 1 1.79e4 0.00 -unix-cli-127.0.0.1:43282 active 2 0 455 1.26e15 0.00 -unix-epoll-input polling 8010763239 0 0 8.17e2 0.00 -vhost-user-process any wait 0 0 1 1.96e3 0.00 -vhost-user-send-interrupt-proc any 
wait 0 0 1 3.85e3 0.00 -vpe-link-state-process event wait 0 0 8 9.79e4 0.00 -vpe-oam-process any wait 0 0 503263 1.21e4 0.00 -vxlan-gpe-ioam-export-process any wait 0 0 1 2.91e3 0.00 -vxlan4-encap active 54 67 0 3.55e3 1.24 -vxlan4-input active 86 92 0 3.79e3 1.07 -wildcard-ip4-arp-publisher-pro event wait 0 0 1 6.44e3 0.00 -``` - -`tapcli-rx` above is the node-level input queue for incoming packets into all the pods on the node. There is one `tapcli-rx` input queue for every node. - -The following are the input and output queues for each pod and the node: - -``` -tapcli-0-output -tapcli-0-tx -tapcli-1-output -tapcli-1-tx -tapcli-2-output -tapcli-2-tx -``` - -Each pod and node has two queues, one for rx (`tapcli-X-output`), and one for tx (`tapcli-X-tx`). The above output is with two `nginx` pods in kubernetes. - -#### Clear Existing VPP Packet Trace -Enter the following command: -``` -vpp# clear trace -``` - -#### How to Turn on VPP Packet Tracing -Enter the following commands: - -``` -vpp# trace add <input or output queue name> <number of packets to capture> - -vpp# trace add dpdk-input 1000 - -vpp# trace add tapcli-rx 1000 -``` - -#### Send Traffic to the Pods - -Open another terminal, SSH into the master node, refer the documentation in `vpp/vagrant/README.md` and send traffic to the two `nginx` pods using `wget`. - -``` -cd vpp/vagrant/vagrant-scripts/ -vagrant ssh k8s-master - -vagrant@k8s-master:~$ kubectl get pods -o wide -NAME READY STATUS RESTARTS AGE IP NODE -nginx-8586cf59-768qw 1/1 Running 0 11d 10.1.2.3 k8s-worker1 -nginx-8586cf59-d27h2 1/1 Running 0 11d 10.1.2.2 k8s-worker1 - -vagrant@k8s-master:~$ wget 10.1.2.2 ---2018-02-08 16:46:01-- http://10.1.2.2/ -Connecting to 10.1.2.2:80... connected. -HTTP request sent, awaiting response... 
200 OK -Length: 612 [text/html] -Saving to: ‘index.html’ -index.html 100%[=========================================================>] 612 --.-KB/s in 0.004s -2018-02-08 16:46:01 (162 KB/s) - ‘index.html’ saved [612/612] - -vagrant@k8s-master:~$ wget 10.1.2.3 ---2018-02-08 16:46:02-- http://10.1.2.3/ -Connecting to 10.1.2.3:80... connected. -HTTP request sent, awaiting response... 200 OK -Length: 612 [text/html] -Saving to: ‘index.html.1’ -index.html.1 100%[=========================================================>] 612 --.-KB/s in 0.004s -2018-02-08 16:46:02 (143 KB/s) - ‘index.html.1’ saved [612/612] -``` - -#### Check the Packets Captured by VPP - -Back in the first terminal, check the packets captured by VPP. - -``` -vpp# show trace -... -... -Packet 33 - -21:34:51:476110: tapcli-rx - tapcli-2 -21:34:51:476115: ethernet-input - IP4: 00:00:00:00:00:02 -> 02:fe:72:95:66:c7 -21:34:51:476117: ip4-input - TCP: 10.1.2.3 -> 172.30.1.2 - tos 0x00, ttl 64, length 52, checksum 0x6fb4 - fragment id 0x11ec, flags DONT_FRAGMENT - TCP: 80 -> 58430 - seq. 0x5db741c8 ack 0x709defa7 - flags 0x11 FIN ACK, tcp header: 32 bytes - window 235, checksum 0x55c3 -21:34:51:476118: nat44-out2in - NAT44_OUT2IN: sw_if_index 6, next index 1, session index -1 -21:34:51:476120: ip4-lookup - fib 0 dpo-idx 23 flow hash: 0x00000000 - TCP: 10.1.2.3 -> 172.30.1.2 - tos 0x00, ttl 64, length 52, checksum 0x6fb4 - fragment id 0x11ec, flags DONT_FRAGMENT - TCP: 80 -> 58430 - seq. 0x5db741c8 ack 0x709defa7 - flags 0x11 FIN ACK, tcp header: 32 bytes - window 235, checksum 0x55c3 -21:34:51:476121: ip4-load-balance - fib 0 dpo-idx 23 flow hash: 0x00000000 - TCP: 10.1.2.3 -> 172.30.1.2 - tos 0x00, ttl 64, length 52, checksum 0x6fb4 - fragment id 0x11ec, flags DONT_FRAGMENT - TCP: 80 -> 58430 - seq. 
0x5db741c8 ack 0x709defa7 - flags 0x11 FIN ACK, tcp header: 32 bytes - window 235, checksum 0x55c3 -21:34:51:476122: ip4-rewrite - tx_sw_if_index 3 dpo-idx 5 : ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 flow hash: 0x00000000 - 00000000: 1a2b3c4d5e011a2b3c4d5e0208004500003411ec40003f0670b40a010203ac1e - 00000020: 01020050e43e5db741c8709defa7801100eb55c300000101080a0f4b -21:34:51:476123: loop0-output - loop0 - IP4: 1a:2b:3c:4d:5e:02 -> 1a:2b:3c:4d:5e:01 - TCP: 10.1.2.3 -> 172.30.1.2 - tos 0x00, ttl 63, length 52, checksum 0x70b4 - fragment id 0x11ec, flags DONT_FRAGMENT - TCP: 80 -> 58430 - seq. 0x5db741c8 ack 0x709defa7 - flags 0x11 FIN ACK, tcp header: 32 bytes - window 235, checksum 0x55c3 -21:34:51:476124: l2-input - l2-input: sw_if_index 3 dst 1a:2b:3c:4d:5e:01 src 1a:2b:3c:4d:5e:02 -21:34:51:476125: l2-fwd - l2-fwd: sw_if_index 3 dst 1a:2b:3c:4d:5e:01 src 1a:2b:3c:4d:5e:02 bd_index 1 -21:34:51:476125: l2-output - l2-output: sw_if_index 4 dst 1a:2b:3c:4d:5e:01 src 1a:2b:3c:4d:5e:02 data 08 00 45 00 00 34 11 ec 40 00 3f 06 -21:34:51:476126: vxlan4-encap - VXLAN encap to vxlan_tunnel0 vni 10 -21:34:51:476126: ip4-load-balance - fib 4 dpo-idx 22 flow hash: 0x00000103 - UDP: 192.168.16.2 -> 192.168.16.1 - tos 0x00, ttl 254, length 102, checksum 0x1b33 - fragment id 0x0000 - UDP: 24320 -> 4789 - length 82, checksum 0x0000 -21:34:51:476127: ip4-rewrite - tx_sw_if_index 1 dpo-idx 4 : ipv4 via 192.168.16.1 GigabitEthernet0/8/0: 080027b2610908002733fb6f0800 flow hash: 0x00000103 - 00000000: 080027b2610908002733fb6f08004500006600000000fd111c33c0a81002c0a8 - 00000020: 10015f0012b5005200000800000000000a001a2b3c4d5e011a2b3c4d -21:34:51:476127: GigabitEthernet0/8/0-output - GigabitEthernet0/8/0 - IP4: 08:00:27:33:fb:6f -> 08:00:27:b2:61:09 - UDP: 192.168.16.2 -> 192.168.16.1 - tos 0x00, ttl 253, length 102, checksum 0x1c33 - fragment id 0x0000 - UDP: 24320 -> 4789 - length 82, checksum 0x0000 -21:34:51:476128: GigabitEthernet0/8/0-tx - GigabitEthernet0/8/0 tx 
queue 0 - buffer 0xfa7f: current data -50, length 116, free-list 0, clone-count 0, totlen-nifb 0, trace 0x20 - l2-hdr-offset 0 l3-hdr-offset 14 - PKT MBUF: port 255, nb_segs 1, pkt_len 116 - buf_len 2176, data_len 116, ol_flags 0x0, data_off 78, phys_addr 0x569ea040 - packet_type 0x0 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 - IP4: 08:00:27:33:fb:6f -> 08:00:27:b2:61:09 - UDP: 192.168.16.2 -> 192.168.16.1 - tos 0x00, ttl 253, length 102, checksum 0x1c33 - fragment id 0x0000 - UDP: 24320 -> 4789 - length 82, checksum 0x0000 -``` - -In the above captured packet, we can see: - -* Input queue name `tapcli-rx` -* Pod's IP address `10.1.2.3` -* IP address of the master node `172.30.1.2`, which sent the `wget` traffic to the two pods -* HTTP port `80`, destination port and TCP protocol (`TCP: 80 -> 58430`) -* NAT queue name `nat44-out2in` -* VXLAN VNI ID `VXLAN encap to vxlan_tunnel0 vni 10` -* VXLAN UDP port `4789` -* IP address of `GigabitEthernet0/8/0` interface (`192.168.16.2`) -* Packet on the outgoing queue `GigabitEthernet0/8/0-tx` - -#### Find IP Addresses of GigabitEthernet and the Tap Interfaces -Enter the following commands to find the IP addresses and Tap interfaces: - -``` -vpp# show int address -GigabitEthernet0/8/0 (up): - L3 192.168.16.2/24 -local0 (dn): -loop0 (up): - L2 bridge bd-id 1 idx 1 shg 0 bvi - L3 192.168.30.2/24 -tapcli-0 (up): - L3 172.30.2.1/24 -tapcli-1 (up): - L3 10.2.1.2/32 -tapcli-2 (up): - L3 10.2.1.3/32 -vxlan_tunnel0 (up): - L2 bridge bd-id 1 idx 1 shg 0 -``` - -#### Other Useful VPP CLIs - -Enter the following commands to see additional information about VPP: - -``` -vpp# show int - Name Idx State Counter Count -GigabitEthernet0/8/0 1 up rx packets 138 - rx bytes 18681 - tx packets 100 - tx bytes 29658 - drops 1 - ip4 137 - tx-error 2 -local0 0 down drops 1 -loop0 3 up rx packets 137 - rx bytes 9853 - tx packets 200 - tx bytes 49380 - drops 1 - ip4 136 -tapcli-0 2 up rx packets 8 - rx bytes 600 - tx packets 1 - tx bytes 42 - 
drops 9 - ip6 7 -tapcli-1 5 up rx packets 56 - rx bytes 13746 - tx packets 78 - tx bytes 6733 - drops 1 - ip4 56 -tapcli-2 6 up rx packets 42 - rx bytes 10860 - tx packets 58 - tx bytes 4996 - drops 1 - ip4 42 -vxlan_tunnel0 4 up rx packets 137 - rx bytes 11771 - tx packets 100 - tx bytes 28290 - -vpp# show hardware - Name Idx Link Hardware -GigabitEthernet0/8/0 1 up GigabitEthernet0/8/0 - Ethernet address 08:00:27:33:fb:6f - Intel 82540EM (e1000) - carrier up full duplex speed 1000 mtu 9216 - rx queues 1, rx desc 1024, tx queues 1, tx desc 1024 - cpu socket 0 - - tx frames ok 100 - tx bytes ok 29658 - rx frames ok 138 - rx bytes ok 19233 - extended stats: - rx good packets 138 - tx good packets 100 - rx good bytes 19233 - tx good bytes 29658 -local0 0 down local0 - local -loop0 3 up loop0 - Ethernet address 1a:2b:3c:4d:5e:02 -tapcli-0 2 up tapcli-0 - Ethernet address 02:fe:95:07:df:9c -tapcli-1 5 up tapcli-1 - Ethernet address 02:fe:3f:5f:0f:9a -tapcli-2 6 up tapcli-2 - Ethernet address 02:fe:72:95:66:c7 -vxlan_tunnel0 4 up vxlan_tunnel0 - VXLAN - -vpp# show bridge-domain - BD-ID Index BSN Age(min) Learning U-Forwrd UU-Flood Flooding ARP-Term BVI-Intf - 1 1 1 off on on on on off loop0 - -vpp# show bridge-domain 1 detail - BD-ID Index BSN Age(min) Learning U-Forwrd UU-Flood Flooding ARP-Term BVI-Intf - 1 1 1 off on on on on off loop0 - - Interface If-idx ISN SHG BVI TxFlood VLAN-Tag-Rewrite - loop0 3 3 0 * * none - vxlan_tunnel0 4 1 0 - * none - -vpp# show l2fib verbose - Mac-Address BD-Idx If-Idx BSN-ISN Age(min) static filter bvi Interface-Name - 1a:2b:3c:4d:5e:02 1 3 0/0 - * - * loop0 - 1a:2b:3c:4d:5e:01 1 4 1/1 - - - - vxlan_tunnel0 -L2FIB total/learned entries: 2/1 Last scan time: 0.0000e0sec Learn limit: 4194304 - -vpp# show ip fib -ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto ] locks:[src:(nil):2, src:adjacency:3, src:default-route:1, ] -0.0.0.0/0 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:1 buckets:1 uRPF:21 to:[0:0]] - 
[0] [@5]: ipv4 via 172.30.2.2 tapcli-0: def35b93961902fe9507df9c0800 -0.0.0.0/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:2 buckets:1 uRPF:1 to:[0:0]] - [0] [@0]: dpo-drop ip4 -10.1.1.0/24 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:24 buckets:1 uRPF:29 to:[0:0]] - [0] [@10]: dpo-load-balance: [proto:ip4 index:23 buckets:1 uRPF:28 to:[0:0] via:[98:23234]] - [0] [@5]: ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 -10.1.2.2/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:27 buckets:1 uRPF:12 to:[78:5641]] - [0] [@5]: ipv4 via 10.1.2.2 tapcli-1: 00000000000202fe3f5f0f9a0800 -10.1.2.3/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:29 buckets:1 uRPF:32 to:[58:4184]] - [0] [@5]: ipv4 via 10.1.2.3 tapcli-2: 00000000000202fe729566c70800 -10.2.1.2/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:26 buckets:1 uRPF:31 to:[0:0]] - [0] [@2]: dpo-receive: 10.2.1.2 on tapcli-1 -10.2.1.3/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:28 buckets:1 uRPF:33 to:[0:0]] - [0] [@2]: dpo-receive: 10.2.1.3 on tapcli-2 -172.30.1.0/24 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:25 buckets:1 uRPF:29 to:[98:23234]] - [0] [@10]: dpo-load-balance: [proto:ip4 index:23 buckets:1 uRPF:28 to:[0:0] via:[98:23234]] - [0] [@5]: ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 -172.30.2.0/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:14 buckets:1 uRPF:15 to:[0:0]] - [0] [@0]: dpo-drop ip4 -172.30.2.0/24 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:13 buckets:1 uRPF:14 to:[0:0]] - [0] [@4]: ipv4-glean: tapcli-0 -172.30.2.1/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:16 buckets:1 uRPF:19 to:[0:0]] - [0] [@2]: dpo-receive: 172.30.2.1 on tapcli-0 -172.30.2.2/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:17 buckets:1 uRPF:18 to:[0:0]] - [0] [@5]: ipv4 via 172.30.2.2 tapcli-0: 
def35b93961902fe9507df9c0800 -172.30.2.255/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:15 buckets:1 uRPF:17 to:[0:0]] - [0] [@0]: dpo-drop ip4 -192.168.16.0/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:10 buckets:1 uRPF:9 to:[0:0]] - [0] [@0]: dpo-drop ip4 -192.168.16.1/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:22 buckets:1 uRPF:34 to:[0:0] via:[100:28290]] - [0] [@5]: ipv4 via 192.168.16.1 GigabitEthernet0/8/0: 080027b2610908002733fb6f0800 -192.168.16.0/24 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:9 buckets:1 uRPF:30 to:[0:0]] - [0] [@4]: ipv4-glean: GigabitEthernet0/8/0 -192.168.16.2/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:12 buckets:1 uRPF:13 to:[137:16703]] - [0] [@2]: dpo-receive: 192.168.16.2 on GigabitEthernet0/8/0 -192.168.16.255/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:11 buckets:1 uRPF:11 to:[0:0]] - [0] [@0]: dpo-drop ip4 -192.168.30.0/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:19 buckets:1 uRPF:23 to:[0:0]] - [0] [@0]: dpo-drop ip4 -192.168.30.1/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:23 buckets:1 uRPF:28 to:[0:0] via:[98:23234]] - [0] [@5]: ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 -192.168.30.0/24 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:18 buckets:1 uRPF:22 to:[0:0]] - [0] [@4]: ipv4-glean: loop0 -192.168.30.2/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:21 buckets:1 uRPF:27 to:[0:0]] - [0] [@2]: dpo-receive: 192.168.30.2 on loop0 -192.168.30.255/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:20 buckets:1 uRPF:25 to:[0:0]] - [0] [@0]: dpo-drop ip4 -224.0.0.0/4 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:4 buckets:1 uRPF:3 to:[0:0]] - [0] [@0]: dpo-drop ip4 -240.0.0.0/4 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:3 buckets:1 uRPF:2 to:[0:0]] - [0] [@0]: 
dpo-drop ip4 -255.255.255.255/32 - unicast-ip4-chain - [@0]: dpo-load-balance: [proto:ip4 index:5 buckets:1 uRPF:4 to:[0:0]] - [0] [@0]: dpo-drop ip4 -``` diff --git a/docs/usecases/contiv/VPP_PACKET_TRACING_K8S.rst b/docs/usecases/contiv/VPP_PACKET_TRACING_K8S.rst new file mode 100644 index 00000000000..765d65713e0 --- /dev/null +++ b/docs/usecases/contiv/VPP_PACKET_TRACING_K8S.rst @@ -0,0 +1,535 @@ +How to do VPP Packet Tracing in Kubernetes +========================================== + +This document describes the steps to do *manual* packet tracing +(capture) using VPP in Kubernetes. Contiv/VPP also ships with a simple +bash script +`vpptrace.sh <https://github.com/contiv/vpp/blob/master/scripts/vpptrace.sh>`__, +which allows to *continuously* trace and *filter* packets incoming +through a given set of interface types. Documentation for vpptrace.sh is +available +`here <https://github.com/contiv/vpp/blob/master/docs/VPPTRACE.md>`__. + +More information about VPP packet tracing is in: + +- https://wiki.fd.io/view/VPP/Command-line_Interface_(CLI)_Guide#packet_tracer +- https://wiki.fd.io/view/VPP/How_To_Use_The_Packet_Generator_and_Packet_Tracer +- https://wiki.fd.io/view/VPP/Tutorial_Routing_and_Switching + +SSH into the Node +----------------- + +Perform the following commands to SSH into the node: + +:: + + cd vpp/vagrant/vagrant-scripts/ + vagrant ssh k8s-worker1 + +Check the VPP Graph Nodes (Input and Output Queues) +--------------------------------------------------- + +The following content shows what is running on VPP, via the ``show run`` +command + +:: + + vagrant@k8s-worker1:~$ sudo vppctl + _______ _ _ _____ ___ + __/ __/ _ \ (_)__ | | / / _ \/ _ \ + _/ _// // / / / _ \ | |/ / ___/ ___/ + /_/ /____(_)_/\___/ |___/_/ /_/ + + vpp# show run + Time 1026791.9, average vectors/node 1.12, last 128 main loops 0.00 per node 0.00 + vector rates in 1.6459e-4, out 1.5485e-4, drop 1.3635e-5, punt 0.0000e0 + Name State Calls Vectors Suspends Clocks Vectors/Call + 
GigabitEthernet0/8/0-output active 56 69 0 1.34e3 1.23 + GigabitEthernet0/8/0-tx active 54 67 0 8.09e5 1.24 + acl-plugin-fa-cleaner-process event wait 0 0 1 2.84e4 0.00 + admin-up-down-process event wait 0 0 1 4.59e3 0.00 + api-rx-from-ring any wait 0 0 3316292 1.24e5 0.00 + arp-input active 3 3 0 2.53e5 1.00 + bfd-process event wait 0 0 1 5.94e3 0.00 + cdp-process any wait 0 0 145916 1.36e4 0.00 + dhcp-client-process any wait 0 0 10268 3.65e4 0.00 + dns-resolver-process any wait 0 0 1027 5.86e4 0.00 + dpdk-input polling 8211032318951 93 0 1.48e13 0.00 + dpdk-ipsec-process done 1 0 0 2.10e5 0.00 + dpdk-process any wait 0 0 342233 9.86e6 0.00 + error-drop active 12 14 0 6.67e3 1.17 + ethernet-input active 60 74 0 5.81e3 1.23 + fib-walk any wait 0 0 513322 1.59e4 0.00 + flow-report-process any wait 0 0 1 1.45e3 0.00 + flowprobe-timer-process any wait 0 0 1 6.34e3 0.00 + ikev2-manager-process any wait 0 0 1026484 1.18e4 0.00 + interface-output active 2 2 0 3.23e3 1.00 + ioam-export-process any wait 0 0 1 1.98e3 0.00 + ip-route-resolver-process any wait 0 0 10268 3.02e4 0.00 + ip4-arp active 1 1 0 1.49e4 1.00 + ip4-input active 223 248 0 3.39e3 1.11 + ip4-load-balance active 106 132 0 5.34e3 1.25 + ip4-local active 86 92 0 2.46e3 1.07 + ip4-local-end-of-arc active 86 92 0 1.00e3 1.07 + ip4-lookup active 223 248 0 3.31e3 1.11 + ip4-rewrite active 190 222 0 1.92e3 1.17 + ip4-udp-lookup active 86 92 0 3.76e3 1.07 + ip6-drop active 6 7 0 2.29e3 1.17 + ip6-icmp-neighbor-discovery-ev any wait 0 0 1026484 1.13e4 0.00 + ip6-input active 6 7 0 3.33e3 1.17 + l2-flood active 2 2 0 4.42e3 1.00 + l2-fwd active 138 157 0 2.13e3 1.14 + l2-input active 140 159 0 2.41e3 1.14 + l2-learn active 86 92 0 3.64e4 1.07 + l2-output active 54 67 0 3.05e3 1.24 + l2fib-mac-age-scanner-process event wait 0 0 85 5.01e4 0.00 + lisp-retry-service any wait 0 0 513322 1.62e4 0.00 + lldp-process event wait 0 0 1 5.02e4 0.00 + loop0-output active 54 67 0 1.66e3 1.24 + loop0-tx active 54 0 0 2.49e3 0.00 + 
memif-process event wait 0 0 1 1.70e4 0.00 + nat-det-expire-walk done 1 0 0 3.79e3 0.00 + nat44-classify active 171 183 0 2.49e3 1.07 + nat44-hairpinning active 86 92 0 1.80e3 1.07 + nat44-in2out active 171 183 0 4.45e3 1.07 + nat44-in2out-slowpath active 171 183 0 3.98e3 1.07 + nat44-out2in active 52 65 0 1.28e4 1.25 + nat64-expire-walk any wait 0 0 102677 5.95e4 0.00 + nat64-expire-worker-walk interrupt wa 102676 0 0 7.39e3 0.00 + send-garp-na-process event wait 0 0 1 1.28e3 0.00 + startup-config-process done 1 0 1 4.19e3 0.00 + tapcli-0-output active 1 1 0 6.97e3 1.00 + tapcli-0-tx active 1 1 0 7.32e4 1.00 + tapcli-1-output active 57 63 0 1.66e3 1.11 + tapcli-1-tx active 57 63 0 1.35e5 1.11 + tapcli-2-output active 28 28 0 3.26e3 1.00 + tapcli-2-tx active 28 28 0 4.06e5 1.00 + tapcli-rx interrupt wa 62 76 0 6.58e4 1.23 + udp-ping-process any wait 0 0 1 1.79e4 0.00 + unix-cli-127.0.0.1:43282 active 2 0 455 1.26e15 0.00 + unix-epoll-input polling 8010763239 0 0 8.17e2 0.00 + vhost-user-process any wait 0 0 1 1.96e3 0.00 + vhost-user-send-interrupt-proc any wait 0 0 1 3.85e3 0.00 + vpe-link-state-process event wait 0 0 8 9.79e4 0.00 + vpe-oam-process any wait 0 0 503263 1.21e4 0.00 + vxlan-gpe-ioam-export-process any wait 0 0 1 2.91e3 0.00 + vxlan4-encap active 54 67 0 3.55e3 1.24 + vxlan4-input active 86 92 0 3.79e3 1.07 + wildcard-ip4-arp-publisher-pro event wait 0 0 1 6.44e3 0.00 + +``tapcli-rx`` above is the node-level input queue for incoming packets +into all the pods on the node. There is one ``tapcli-rx`` input queue +for every node. + +The following are the input and output queues for each pod and the node: + +:: + + tapcli-0-output + tapcli-0-tx + tapcli-1-output + tapcli-1-tx + tapcli-2-output + tapcli-2-tx + +Each pod and node has two queues, one for rx (``tapcli-X-output``), and +one for tx (``tapcli-X-tx``). The above output is with two ``nginx`` +pods in kubernetes. 
+ +Clear Existing VPP Packet Trace +------------------------------- + +Enter the following command: + +:: + + vpp# clear trace + +How to Turn on VPP Packet Tracing +--------------------------------- + +Enter the following commands: + +:: + + vpp# trace add <input or output queue name> <number of packets to capture> + + vpp# trace add dpdk-input 1000 + + vpp# trace add tapcli-rx 1000 + +Send Traffic to the Pods +------------------------ + +Open another terminal, SSH into the master node (refer to the documentation +in ``vpp/vagrant/README.md``), and send traffic to the two ``nginx`` pods +using ``wget``. + +:: + + cd vpp/vagrant/vagrant-scripts/ + vagrant ssh k8s-master + + vagrant@k8s-master:~$ kubectl get pods -o wide + NAME READY STATUS RESTARTS AGE IP NODE + nginx-8586cf59-768qw 1/1 Running 0 11d 10.1.2.3 k8s-worker1 + nginx-8586cf59-d27h2 1/1 Running 0 11d 10.1.2.2 k8s-worker1 + + vagrant@k8s-master:~$ wget 10.1.2.2 + --2018-02-08 16:46:01-- http://10.1.2.2/ + Connecting to 10.1.2.2:80... connected. + HTTP request sent, awaiting response... 200 OK + Length: 612 [text/html] + Saving to: ‘index.html’ + index.html 100%[=========================================================>] 612 --.-KB/s in 0.004s + 2018-02-08 16:46:01 (162 KB/s) - ‘index.html’ saved [612/612] + + vagrant@k8s-master:~$ wget 10.1.2.3 + --2018-02-08 16:46:02-- http://10.1.2.3/ + Connecting to 10.1.2.3:80... connected. + HTTP request sent, awaiting response... 200 OK + Length: 612 [text/html] + Saving to: ‘index.html.1’ + index.html.1 100%[=========================================================>] 612 --.-KB/s in 0.004s + 2018-02-08 16:46:02 (143 KB/s) - ‘index.html.1’ saved [612/612] + +Check the Packets Captured by VPP +--------------------------------- + +Back in the first terminal, check the packets captured by VPP. + +:: + + vpp# show trace + ... + ...
+ Packet 33 + + 21:34:51:476110: tapcli-rx + tapcli-2 + 21:34:51:476115: ethernet-input + IP4: 00:00:00:00:00:02 -> 02:fe:72:95:66:c7 + 21:34:51:476117: ip4-input + TCP: 10.1.2.3 -> 172.30.1.2 + tos 0x00, ttl 64, length 52, checksum 0x6fb4 + fragment id 0x11ec, flags DONT_FRAGMENT + TCP: 80 -> 58430 + seq. 0x5db741c8 ack 0x709defa7 + flags 0x11 FIN ACK, tcp header: 32 bytes + window 235, checksum 0x55c3 + 21:34:51:476118: nat44-out2in + NAT44_OUT2IN: sw_if_index 6, next index 1, session index -1 + 21:34:51:476120: ip4-lookup + fib 0 dpo-idx 23 flow hash: 0x00000000 + TCP: 10.1.2.3 -> 172.30.1.2 + tos 0x00, ttl 64, length 52, checksum 0x6fb4 + fragment id 0x11ec, flags DONT_FRAGMENT + TCP: 80 -> 58430 + seq. 0x5db741c8 ack 0x709defa7 + flags 0x11 FIN ACK, tcp header: 32 bytes + window 235, checksum 0x55c3 + 21:34:51:476121: ip4-load-balance + fib 0 dpo-idx 23 flow hash: 0x00000000 + TCP: 10.1.2.3 -> 172.30.1.2 + tos 0x00, ttl 64, length 52, checksum 0x6fb4 + fragment id 0x11ec, flags DONT_FRAGMENT + TCP: 80 -> 58430 + seq. 0x5db741c8 ack 0x709defa7 + flags 0x11 FIN ACK, tcp header: 32 bytes + window 235, checksum 0x55c3 + 21:34:51:476122: ip4-rewrite + tx_sw_if_index 3 dpo-idx 5 : ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 flow hash: 0x00000000 + 00000000: 1a2b3c4d5e011a2b3c4d5e0208004500003411ec40003f0670b40a010203ac1e + 00000020: 01020050e43e5db741c8709defa7801100eb55c300000101080a0f4b + 21:34:51:476123: loop0-output + loop0 + IP4: 1a:2b:3c:4d:5e:02 -> 1a:2b:3c:4d:5e:01 + TCP: 10.1.2.3 -> 172.30.1.2 + tos 0x00, ttl 63, length 52, checksum 0x70b4 + fragment id 0x11ec, flags DONT_FRAGMENT + TCP: 80 -> 58430 + seq. 
0x5db741c8 ack 0x709defa7 + flags 0x11 FIN ACK, tcp header: 32 bytes + window 235, checksum 0x55c3 + 21:34:51:476124: l2-input + l2-input: sw_if_index 3 dst 1a:2b:3c:4d:5e:01 src 1a:2b:3c:4d:5e:02 + 21:34:51:476125: l2-fwd + l2-fwd: sw_if_index 3 dst 1a:2b:3c:4d:5e:01 src 1a:2b:3c:4d:5e:02 bd_index 1 + 21:34:51:476125: l2-output + l2-output: sw_if_index 4 dst 1a:2b:3c:4d:5e:01 src 1a:2b:3c:4d:5e:02 data 08 00 45 00 00 34 11 ec 40 00 3f 06 + 21:34:51:476126: vxlan4-encap + VXLAN encap to vxlan_tunnel0 vni 10 + 21:34:51:476126: ip4-load-balance + fib 4 dpo-idx 22 flow hash: 0x00000103 + UDP: 192.168.16.2 -> 192.168.16.1 + tos 0x00, ttl 254, length 102, checksum 0x1b33 + fragment id 0x0000 + UDP: 24320 -> 4789 + length 82, checksum 0x0000 + 21:34:51:476127: ip4-rewrite + tx_sw_if_index 1 dpo-idx 4 : ipv4 via 192.168.16.1 GigabitEthernet0/8/0: 080027b2610908002733fb6f0800 flow hash: 0x00000103 + 00000000: 080027b2610908002733fb6f08004500006600000000fd111c33c0a81002c0a8 + 00000020: 10015f0012b5005200000800000000000a001a2b3c4d5e011a2b3c4d + 21:34:51:476127: GigabitEthernet0/8/0-output + GigabitEthernet0/8/0 + IP4: 08:00:27:33:fb:6f -> 08:00:27:b2:61:09 + UDP: 192.168.16.2 -> 192.168.16.1 + tos 0x00, ttl 253, length 102, checksum 0x1c33 + fragment id 0x0000 + UDP: 24320 -> 4789 + length 82, checksum 0x0000 + 21:34:51:476128: GigabitEthernet0/8/0-tx + GigabitEthernet0/8/0 tx queue 0 + buffer 0xfa7f: current data -50, length 116, free-list 0, clone-count 0, totlen-nifb 0, trace 0x20 + l2-hdr-offset 0 l3-hdr-offset 14 + PKT MBUF: port 255, nb_segs 1, pkt_len 116 + buf_len 2176, data_len 116, ol_flags 0x0, data_off 78, phys_addr 0x569ea040 + packet_type 0x0 l2_len 0 l3_len 0 outer_l2_len 0 outer_l3_len 0 + IP4: 08:00:27:33:fb:6f -> 08:00:27:b2:61:09 + UDP: 192.168.16.2 -> 192.168.16.1 + tos 0x00, ttl 253, length 102, checksum 0x1c33 + fragment id 0x0000 + UDP: 24320 -> 4789 + length 82, checksum 0x0000 + +In the above captured packet, we can see: + +- Input queue name 
``tapcli-rx`` +- Pod’s IP address ``10.1.2.3`` +- IP address of the master node ``172.30.1.2``, which sent the ``wget`` + traffic to the two pods +- HTTP port ``80``, destination port and TCP protocol + (``TCP: 80 -> 58430``) +- NAT queue name ``nat44-out2in`` +- VXLAN VNI ID ``VXLAN encap to vxlan_tunnel0 vni 10`` +- VXLAN UDP port ``4789`` +- IP address of ``GigabitEthernet0/8/0`` interface (``192.168.16.2``) +- Packet on the outgoing queue ``GigabitEthernet0/8/0-tx`` + +Find IP Addresses of GigabitEthernet and the Tap Interfaces +----------------------------------------------------------- + +Enter the following commands to find the IP addresses and Tap +interfaces: + +:: + + vpp# show int address + GigabitEthernet0/8/0 (up): + L3 192.168.16.2/24 + local0 (dn): + loop0 (up): + L2 bridge bd-id 1 idx 1 shg 0 bvi + L3 192.168.30.2/24 + tapcli-0 (up): + L3 172.30.2.1/24 + tapcli-1 (up): + L3 10.2.1.2/32 + tapcli-2 (up): + L3 10.2.1.3/32 + vxlan_tunnel0 (up): + L2 bridge bd-id 1 idx 1 shg 0 + +Other Useful VPP CLIs +--------------------- + +Enter the following commands to see additional information about VPP: + +:: + + vpp# show int + Name Idx State Counter Count + GigabitEthernet0/8/0 1 up rx packets 138 + rx bytes 18681 + tx packets 100 + tx bytes 29658 + drops 1 + ip4 137 + tx-error 2 + local0 0 down drops 1 + loop0 3 up rx packets 137 + rx bytes 9853 + tx packets 200 + tx bytes 49380 + drops 1 + ip4 136 + tapcli-0 2 up rx packets 8 + rx bytes 600 + tx packets 1 + tx bytes 42 + drops 9 + ip6 7 + tapcli-1 5 up rx packets 56 + rx bytes 13746 + tx packets 78 + tx bytes 6733 + drops 1 + ip4 56 + tapcli-2 6 up rx packets 42 + rx bytes 10860 + tx packets 58 + tx bytes 4996 + drops 1 + ip4 42 + vxlan_tunnel0 4 up rx packets 137 + rx bytes 11771 + tx packets 100 + tx bytes 28290 + + vpp# show hardware + Name Idx Link Hardware + GigabitEthernet0/8/0 1 up GigabitEthernet0/8/0 + Ethernet address 08:00:27:33:fb:6f + Intel 82540EM (e1000) + carrier up full duplex speed 1000 mtu 
9216 + rx queues 1, rx desc 1024, tx queues 1, tx desc 1024 + cpu socket 0 + + tx frames ok 100 + tx bytes ok 29658 + rx frames ok 138 + rx bytes ok 19233 + extended stats: + rx good packets 138 + tx good packets 100 + rx good bytes 19233 + tx good bytes 29658 + local0 0 down local0 + local + loop0 3 up loop0 + Ethernet address 1a:2b:3c:4d:5e:02 + tapcli-0 2 up tapcli-0 + Ethernet address 02:fe:95:07:df:9c + tapcli-1 5 up tapcli-1 + Ethernet address 02:fe:3f:5f:0f:9a + tapcli-2 6 up tapcli-2 + Ethernet address 02:fe:72:95:66:c7 + vxlan_tunnel0 4 up vxlan_tunnel0 + VXLAN + + vpp# show bridge-domain + BD-ID Index BSN Age(min) Learning U-Forwrd UU-Flood Flooding ARP-Term BVI-Intf + 1 1 1 off on on on on off loop0 + + vpp# show bridge-domain 1 detail + BD-ID Index BSN Age(min) Learning U-Forwrd UU-Flood Flooding ARP-Term BVI-Intf + 1 1 1 off on on on on off loop0 + + Interface If-idx ISN SHG BVI TxFlood VLAN-Tag-Rewrite + loop0 3 3 0 * * none + vxlan_tunnel0 4 1 0 - * none + + vpp# show l2fib verbose + Mac-Address BD-Idx If-Idx BSN-ISN Age(min) static filter bvi Interface-Name + 1a:2b:3c:4d:5e:02 1 3 0/0 - * - * loop0 + 1a:2b:3c:4d:5e:01 1 4 1/1 - - - - vxlan_tunnel0 + L2FIB total/learned entries: 2/1 Last scan time: 0.0000e0sec Learn limit: 4194304 + + vpp# show ip fib + ipv4-VRF:0, fib_index:0, flow hash:[src dst sport dport proto ] locks:[src:(nil):2, src:adjacency:3, src:default-route:1, ] + 0.0.0.0/0 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:1 buckets:1 uRPF:21 to:[0:0]] + [0] [@5]: ipv4 via 172.30.2.2 tapcli-0: def35b93961902fe9507df9c0800 + 0.0.0.0/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:2 buckets:1 uRPF:1 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 10.1.1.0/24 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:24 buckets:1 uRPF:29 to:[0:0]] + [0] [@10]: dpo-load-balance: [proto:ip4 index:23 buckets:1 uRPF:28 to:[0:0] via:[98:23234]] + [0] [@5]: ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 + 
10.1.2.2/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:27 buckets:1 uRPF:12 to:[78:5641]] + [0] [@5]: ipv4 via 10.1.2.2 tapcli-1: 00000000000202fe3f5f0f9a0800 + 10.1.2.3/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:29 buckets:1 uRPF:32 to:[58:4184]] + [0] [@5]: ipv4 via 10.1.2.3 tapcli-2: 00000000000202fe729566c70800 + 10.2.1.2/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:26 buckets:1 uRPF:31 to:[0:0]] + [0] [@2]: dpo-receive: 10.2.1.2 on tapcli-1 + 10.2.1.3/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:28 buckets:1 uRPF:33 to:[0:0]] + [0] [@2]: dpo-receive: 10.2.1.3 on tapcli-2 + 172.30.1.0/24 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:25 buckets:1 uRPF:29 to:[98:23234]] + [0] [@10]: dpo-load-balance: [proto:ip4 index:23 buckets:1 uRPF:28 to:[0:0] via:[98:23234]] + [0] [@5]: ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 + 172.30.2.0/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:14 buckets:1 uRPF:15 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 172.30.2.0/24 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:13 buckets:1 uRPF:14 to:[0:0]] + [0] [@4]: ipv4-glean: tapcli-0 + 172.30.2.1/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:16 buckets:1 uRPF:19 to:[0:0]] + [0] [@2]: dpo-receive: 172.30.2.1 on tapcli-0 + 172.30.2.2/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:17 buckets:1 uRPF:18 to:[0:0]] + [0] [@5]: ipv4 via 172.30.2.2 tapcli-0: def35b93961902fe9507df9c0800 + 172.30.2.255/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:15 buckets:1 uRPF:17 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 192.168.16.0/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:10 buckets:1 uRPF:9 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 192.168.16.1/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:22 buckets:1 uRPF:34 to:[0:0] via:[100:28290]] + [0] [@5]: ipv4 via 192.168.16.1 
GigabitEthernet0/8/0: 080027b2610908002733fb6f0800 + 192.168.16.0/24 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:9 buckets:1 uRPF:30 to:[0:0]] + [0] [@4]: ipv4-glean: GigabitEthernet0/8/0 + 192.168.16.2/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:12 buckets:1 uRPF:13 to:[137:16703]] + [0] [@2]: dpo-receive: 192.168.16.2 on GigabitEthernet0/8/0 + 192.168.16.255/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:11 buckets:1 uRPF:11 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 192.168.30.0/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:19 buckets:1 uRPF:23 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 192.168.30.1/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:23 buckets:1 uRPF:28 to:[0:0] via:[98:23234]] + [0] [@5]: ipv4 via 192.168.30.1 loop0: 1a2b3c4d5e011a2b3c4d5e020800 + 192.168.30.0/24 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:18 buckets:1 uRPF:22 to:[0:0]] + [0] [@4]: ipv4-glean: loop0 + 192.168.30.2/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:21 buckets:1 uRPF:27 to:[0:0]] + [0] [@2]: dpo-receive: 192.168.30.2 on loop0 + 192.168.30.255/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:20 buckets:1 uRPF:25 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 224.0.0.0/4 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:4 buckets:1 uRPF:3 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 240.0.0.0/4 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:3 buckets:1 uRPF:2 to:[0:0]] + [0] [@0]: dpo-drop ip4 + 255.255.255.255/32 + unicast-ip4-chain + [@0]: dpo-load-balance: [proto:ip4 index:5 buckets:1 uRPF:4 to:[0:0]] + [0] [@0]: dpo-drop ip4 diff --git a/docs/usecases/contiv/Vagrant.md b/docs/usecases/contiv/Vagrant.md deleted file mode 100644 index a9040a6c1a1..00000000000 --- a/docs/usecases/contiv/Vagrant.md +++ /dev/null @@ -1,250 +0,0 @@ -## Contiv-VPP Vagrant Installation - -### Prerequisites -The following items are 
prerequisites before installing vagrant: -- Vagrant 2.0.1 or later -- Hypervisors: - - VirtualBox 5.2.8 or later - - VMWare Fusion 10.1.0 or later or VmWare Workstation 14 - - For VmWare Fusion, you will need the [Vagrant VmWare Fusion plugin](https://www.vagrantup.com/vmware/index.html) -- Laptop or server with at least 4 CPU cores and 16 Gig of RAM - -### Creating / Shutting Down / Destroying the Cluster -This folder contains the Vagrant file that is used to create a single or multi-node -Kubernetes cluster using Contiv-VPP as a Network Plugin. - -The folder is organized into two subfolders: - - - (config) - contains the files that share cluster information, which are used - during the provisioning stage (master IP address, Certificates, hash-keys). - **CAUTION:** Editing is not recommended! - - (vagrant) - contains scripts that are used for creating, destroying, rebooting - and shutting down the VMs that host the K8s cluster. - -To create and run a K8s cluster with a *contiv-vpp CNI* plugin, run the -`vagrant-start` script, located in the [vagrant folder](https://github.com/contiv/vpp/tree/master/vagrant). The `vagrant-start` -script prompts the user to select the number of worker nodes for the Kubernetes cluster. -Zero (0) worker nodes mean that a single-node cluster (with one Kubernetes master node) will be deployed. - -Next, the user is prompted to select either the *production environment* or the *development environment*. -Instructions on how to build the development *contiv/vpp-vswitch* image can be found below in the -[development environment](#building-and-deploying-the-dev-contiv-vswitch-image) command section. - -The last option asks the user to select either *Without StealTheNIC* or *With StealTheNIC*. -With the *With StealTheNIC* option, the plugin "steals" interfaces owned by Linux and uses their configuration in VPP. 
- -For the production environment, enter the following commands: -``` -| => ./vagrant-start -Please provide the number of workers for the Kubernetes cluster (0-50) or enter [Q/q] to exit: 1 - -Please choose Kubernetes environment: -1) Production -2) Development -3) Quit ---> 1 -You chose Development environment - -Please choose deployment scenario: -1) Without StealTheNIC -2) With StealTheNIC -3) Quit ---> 1 -You chose deployment without StealTheNIC - -Creating a production environment, without STN and 1 worker node(s) -``` - -For the development environment, enter the following commands: -``` -| => ./vagrant-start -Please provide the number of workers for the Kubernetes cluster (0-50) or enter [Q/q] to exit: 1 - -Please choose Kubernetes environment: -1) Production -2) Development -3) Quit ---> 2 -You chose Development environment - -Please choose deployment scenario: -1) Without StealTheNIC -2) With StealTheNIC -3) Quit ---> 1 -You chose deployment without StealTheNIC - -Creating a development environment, without STN and 1 worker node(s) -``` - -To destroy and clean-up the cluster, run the *vagrant-cleanup* script, located -[inside the vagrant folder](https://github.com/contiv/vpp/tree/master/vagrant): -``` -cd vagrant/ -./vagrant-cleanup -``` - -To shutdown the cluster, run the *vagrant-shutdown* script, located [inside the vagrant folder](https://github.com/contiv/vpp/tree/master/vagrant): -``` -cd vagrant/ -./vagrant-shutdown -``` - -- To reboot the cluster, run the *vagrant-reload* script, located [inside the vagrant folder](https://github.com/contiv/vpp/tree/master/vagrant): -``` -cd vagrant/ -./vagrant-reload -``` - -- From a suspended state, or after a reboot of the host machine, the cluster -can be brought up by running the *vagrant-up* script. 
- - -### Building and Deploying the dev-contiv-vswitch Image -If you chose the optional development-environment-deployment option, then follow -these instructions to build a modified *contivvpp/vswitch* image: - -- Make sure changes in the code have been saved. From the k8s-master node, - build the new *contivvpp/vswitch* image (run as sudo): - -``` -vagrant ssh k8s-master -cd /vagrant/config -sudo ./save-dev-image -``` - -- The newly built *contivvpp/vswitch* image is now tagged as *latest*. Verify the -build with `sudo docker images`; the *contivvpp/vswitch* should have been created a few -seconds ago. The new image with all the changes must become available to all -the nodes in the K8s cluster. To make the changes available to all, load the docker image into the running -worker nodes (run as sudo): - -``` -vagrant ssh k8s-worker1 -cd /vagrant/config -sudo ./load-dev-image -``` - -- Verify with `sudo docker images`; the old *contivvpp/vswitch* should now be tagged as -`<none>` and the latest tagged *contivvpp/vswitch* should have been created a -few seconds ago. - -### Exploring the Cluster -Once the cluster is up, perform the following steps: -- Log into the master: -``` -cd vagrant - -vagrant ssh k8s-master - -Welcome to Ubuntu 16.04 LTS (GNU/Linux 4.4.0-21-generic x86_64) - - * Documentation: https://help.ubuntu.com/ -vagrant@k8s-master:~$ -``` -- Verify the Kubernetes/Contiv-VPP installation. 
First, verify the nodes -in the cluster: - -``` -vagrant@k8s-master:~$ kubectl get nodes -o wide - -NAME STATUS ROLES AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME -k8s-master Ready master 22m v1.9.2 <none> Ubuntu 16.04 LTS 4.4.0-21-generic docker://17.12.0-ce -k8s-worker1 Ready <none> 15m v1.9.2 <none> Ubuntu 16.04 LTS 4.4.0-21-generic docker://17.12.0-ce -``` - -- Next, verify that all pods are running correctly: - -``` -vagrant@k8s-master:~$ kubectl get pods -n kube-system -o wide - -NAME READY STATUS RESTARTS AGE IP NODE -contiv-etcd-2ngdc 1/1 Running 0 17m 192.169.1.10 k8s-master -contiv-ksr-x7gsq 1/1 Running 3 17m 192.169.1.10 k8s-master -contiv-vswitch-9bql6 2/2 Running 0 17m 192.169.1.10 k8s-master -contiv-vswitch-hpt2x 2/2 Running 0 10m 192.169.1.11 k8s-worker1 -etcd-k8s-master 1/1 Running 0 16m 192.169.1.10 k8s-master -kube-apiserver-k8s-master 1/1 Running 0 16m 192.169.1.10 k8s-master -kube-controller-manager-k8s-master 1/1 Running 0 15m 192.169.1.10 k8s-master -kube-dns-6f4fd4bdf-62rv4 2/3 CrashLoopBackOff 14 17m 10.1.1.2 k8s-master -kube-proxy-bvr74 1/1 Running 0 10m 192.169.1.11 k8s-worker1 -kube-proxy-v4fzq 1/1 Running 0 17m 192.169.1.10 k8s-master -kube-scheduler-k8s-master 1/1 Running 0 16m 192.169.1.10 k8s-master -``` - -- If you want your pods to be scheduled on both the master and the workers, -you have to untaint the master node: -``` - -``` - -- Check VPP and its interfaces: -``` -vagrant@k8s-master:~$ sudo vppctl - _______ _ _ _____ ___ - __/ __/ _ \ (_)__ | | / / _ \/ _ \ - _/ _// // / / / _ \ | |/ / ___/ ___/ - /_/ /____(_)_/\___/ |___/_/ /_/ - -vpp# sh interface - Name Idx State Counter Count -GigabitEthernet0/8/0 1 up rx packets 14 - rx bytes 3906 - tx packets 18 - tx bytes 2128 - drops 3 - ip4 13 -... - -``` -- Make sure that `GigabitEthernet0/8/0` is listed and that its status is `up`. 
- -- Next, create an example deployment of nginx pods: -``` -vagrant@k8s-master:~$ kubectl run nginx --image=nginx --replicas=2 -deployment "nginx" created -``` -- Check the status of the deployment: - -``` -vagrant@k8s-master:~$ kubectl get deploy -o wide - -NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR -nginx 2 2 2 2 2h nginx nginx run=nginx -``` - -- Verify that the pods in the deployment are up and running: -``` -vagrant@k8s-master:~$ kubectl get pods -o wide - -NAME READY STATUS RESTARTS AGE IP NODE -nginx-8586cf59-6kx2m 1/1 Running 1 1h 10.1.2.3 k8s-worker1 -nginx-8586cf59-j5vf9 1/1 Running 1 1h 10.1.2.2 k8s-worker1 -``` - -- Issue an HTTP GET request to a pod in the deployment: - -``` -vagrant@k8s-master:~$ wget 10.1.2.2 - ---2018-01-19 12:34:08-- http://10.1.2.2/ -Connecting to 10.1.2.2:80... connected. -HTTP request sent, awaiting response... 200 OK -Length: 612 [text/html] -Saving to: ‘index.html.1’ - -index.html.1 100%[=========================================>] 612 --.-KB/s in 0s - -2018-01-19 12:34:08 (1.78 MB/s) - ‘index.html.1’ saved [612/612] -``` - -#### How to SSH into k8s Worker Node -To SSH into k8s Worker Node, perform the following steps: - -``` -cd vagrant - -vagrant status - -vagrant ssh k8s-worker1 -``` diff --git a/docs/usecases/contiv/Vagrant.rst b/docs/usecases/contiv/Vagrant.rst new file mode 100644 index 00000000000..035dd09b88e --- /dev/null +++ b/docs/usecases/contiv/Vagrant.rst @@ -0,0 +1,284 @@ +Contiv-VPP Vagrant Installation +=============================== + +Prerequisites +------------- + +The following items are prerequisites before installing vagrant: - +Vagrant 2.0.1 or later - Hypervisors: - VirtualBox 5.2.8 or later - +VMWare Fusion 10.1.0 or later or VmWare Workstation 14 - For VmWare +Fusion, you will need the `Vagrant VmWare Fusion +plugin <https://www.vagrantup.com/vmware/index.html>`__ - Laptop or +server with at least 4 CPU cores and 16 Gig of RAM + +Creating / Shutting Down / Destroying 
the Cluster +------------------------------------------------- + +This folder contains the Vagrant file that is used to create a single or +multi-node Kubernetes cluster using Contiv-VPP as a Network Plugin. + +The folder is organized into two subfolders: + +- (config) - contains the files that share cluster information, which + are used during the provisioning stage (master IP address, + Certificates, hash-keys). **CAUTION:** Editing is not recommended! +- (vagrant) - contains scripts that are used for creating, destroying, + rebooting and shutting down the VMs that host the K8s cluster. + +To create and run a K8s cluster with a *contiv-vpp CNI* plugin, run the +``vagrant-start`` script, located in the `vagrant +folder <https://github.com/contiv/vpp/tree/master/vagrant>`__. The +``vagrant-start`` script prompts the user to select the number of worker +nodes for the Kubernetes cluster. Zero (0) worker nodes mean that a +single-node cluster (with one Kubernetes master node) will be deployed. + +Next, the user is prompted to select either the *production environment* +or the *development environment*. Instructions on how to build the +development *contiv/vpp-vswitch* image can be found below in the +`development +environment <#building-and-deploying-the-dev-contiv-vswitch-image>`__ +command section. + +The last option asks the user to select either *Without StealTheNIC* or +*With StealTheNIC*. With the *With StealTheNIC* option, the plugin +“steals” interfaces owned by Linux and uses their configuration in VPP. 
+ +For the production environment, enter the following commands: + +:: + + | => ./vagrant-start + Please provide the number of workers for the Kubernetes cluster (0-50) or enter [Q/q] to exit: 1 + + Please choose Kubernetes environment: + 1) Production + 2) Development + 3) Quit + --> 1 + You chose Development environment + + Please choose deployment scenario: + 1) Without StealTheNIC + 2) With StealTheNIC + 3) Quit + --> 1 + You chose deployment without StealTheNIC + + Creating a production environment, without STN and 1 worker node(s) + +For the development environment, enter the following commands: + +:: + + | => ./vagrant-start + Please provide the number of workers for the Kubernetes cluster (0-50) or enter [Q/q] to exit: 1 + + Please choose Kubernetes environment: + 1) Production + 2) Development + 3) Quit + --> 2 + You chose Development environment + + Please choose deployment scenario: + 1) Without StealTheNIC + 2) With StealTheNIC + 3) Quit + --> 1 + You chose deployment without StealTheNIC + + Creating a development environment, without STN and 1 worker node(s) + +To destroy and clean-up the cluster, run the *vagrant-cleanup* script, +located `inside the vagrant +folder <https://github.com/contiv/vpp/tree/master/vagrant>`__: + +:: + + cd vagrant/ + ./vagrant-cleanup + +To shutdown the cluster, run the *vagrant-shutdown* script, located +`inside the vagrant +folder <https://github.com/contiv/vpp/tree/master/vagrant>`__: + +:: + + cd vagrant/ + ./vagrant-shutdown + +- To reboot the cluster, run the *vagrant-reload* script, located + `inside the vagrant + folder <https://github.com/contiv/vpp/tree/master/vagrant>`__: + +:: + + cd vagrant/ + ./vagrant-reload + +- From a suspended state, or after a reboot of the host machine, the + cluster can be brought up by running the *vagrant-up* script. 
+ +Building and Deploying the dev-contiv-vswitch Image +--------------------------------------------------- + +If you chose the optional development-environment-deployment option, +then follow these instructions to build a modified +*contivvpp/vswitch* image: + +- Make sure changes in the code have been saved. From the k8s-master + node, build the new *contivvpp/vswitch* image (run as sudo): + +:: + + vagrant ssh k8s-master + cd /vagrant/config + sudo ./save-dev-image + +- The newly built *contivvpp/vswitch* image is now tagged as *latest*. + Verify the build with ``sudo docker images``; the *contivvpp/vswitch* + should have been created a few seconds ago. The new image with all + the changes must become available to all the nodes in the K8s + cluster. To make the changes available to all, load the docker image + into the running worker nodes (run as sudo): + +:: + + vagrant ssh k8s-worker1 + cd /vagrant/config + sudo ./load-dev-image + +- Verify with ``sudo docker images``; the old *contivvpp/vswitch* + should now be tagged as ``<none>`` and the latest tagged + *contivvpp/vswitch* should have been created a few seconds ago. + +Exploring the Cluster +--------------------- + +Once the cluster is up, perform the following steps: + +- Log into the master: + +:: + + cd vagrant + + vagrant ssh k8s-master + + Welcome to Ubuntu 16.04 LTS (GNU/Linux 4.4.0-21-generic x86_64) + + * Documentation: https://help.ubuntu.com/ + vagrant@k8s-master:~$ + +- Verify the Kubernetes/Contiv-VPP installation. 
First, verify the + nodes in the cluster: + +:: + + vagrant@k8s-master:~$ kubectl get nodes -o wide + + NAME STATUS ROLES AGE VERSION EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME + k8s-master Ready master 22m v1.9.2 <none> Ubuntu 16.04 LTS 4.4.0-21-generic docker://17.12.0-ce + k8s-worker1 Ready <none> 15m v1.9.2 <none> Ubuntu 16.04 LTS 4.4.0-21-generic docker://17.12.0-ce + +- Next, verify that all pods are running correctly: + +:: + + vagrant@k8s-master:~$ kubectl get pods -n kube-system -o wide + + NAME READY STATUS RESTARTS AGE IP NODE + contiv-etcd-2ngdc 1/1 Running 0 17m 192.169.1.10 k8s-master + contiv-ksr-x7gsq 1/1 Running 3 17m 192.169.1.10 k8s-master + contiv-vswitch-9bql6 2/2 Running 0 17m 192.169.1.10 k8s-master + contiv-vswitch-hpt2x 2/2 Running 0 10m 192.169.1.11 k8s-worker1 + etcd-k8s-master 1/1 Running 0 16m 192.169.1.10 k8s-master + kube-apiserver-k8s-master 1/1 Running 0 16m 192.169.1.10 k8s-master + kube-controller-manager-k8s-master 1/1 Running 0 15m 192.169.1.10 k8s-master + kube-dns-6f4fd4bdf-62rv4 2/3 CrashLoopBackOff 14 17m 10.1.1.2 k8s-master + kube-proxy-bvr74 1/1 Running 0 10m 192.169.1.11 k8s-worker1 + kube-proxy-v4fzq 1/1 Running 0 17m 192.169.1.10 k8s-master + kube-scheduler-k8s-master 1/1 Running 0 16m 192.169.1.10 k8s-master + +- If you want your pods to be scheduled on both the master and the + workers, you have to untaint the master node: + +:: + +- Check VPP and its interfaces: + +:: + + vagrant@k8s-master:~$ sudo vppctl + _______ _ _ _____ ___ + __/ __/ _ \ (_)__ | | / / _ \/ _ \ + _/ _// // / / / _ \ | |/ / ___/ ___/ + /_/ /____(_)_/\___/ |___/_/ /_/ + + vpp# sh interface + Name Idx State Counter Count + GigabitEthernet0/8/0 1 up rx packets 14 + rx bytes 3906 + tx packets 18 + tx bytes 2128 + drops 3 + ip4 13 + ... + + +- Make sure that ``GigabitEthernet0/8/0`` is listed and that its status + is ``up``. 
+ +- Next, create an example deployment of nginx pods: + +:: + + vagrant@k8s-master:~$ kubectl run nginx --image=nginx --replicas=2 + deployment "nginx" created + +- Check the status of the deployment: + +:: + + vagrant@k8s-master:~$ kubectl get deploy -o wide + + NAME DESIRED CURRENT UP-TO-DATE AVAILABLE AGE CONTAINERS IMAGES SELECTOR + nginx 2 2 2 2 2h nginx nginx run=nginx + +- Verify that the pods in the deployment are up and running: + +:: + + vagrant@k8s-master:~$ kubectl get pods -o wide + + NAME READY STATUS RESTARTS AGE IP NODE + nginx-8586cf59-6kx2m 1/1 Running 1 1h 10.1.2.3 k8s-worker1 + nginx-8586cf59-j5vf9 1/1 Running 1 1h 10.1.2.2 k8s-worker1 + +- Issue an HTTP GET request to a pod in the deployment: + +:: + + vagrant@k8s-master:~$ wget 10.1.2.2 + + --2018-01-19 12:34:08-- http://10.1.2.2/ + Connecting to 10.1.2.2:80... connected. + HTTP request sent, awaiting response... 200 OK + Length: 612 [text/html] + Saving to: ‘index.html.1’ + + index.html.1 100%[=========================================>] 612 --.-KB/s in 0s + + 2018-01-19 12:34:08 (1.78 MB/s) - ‘index.html.1’ saved [612/612] + +How to SSH into k8s Worker Node +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To SSH into k8s Worker Node, perform the following steps: + +:: + + cd vagrant + + vagrant status + + vagrant ssh k8s-worker1 diff --git a/docs/usecases/contiv/index.rst b/docs/usecases/contiv/index.rst index bc52e6142ca..45d5095e6f8 100644 --- a/docs/usecases/contiv/index.rst +++ b/docs/usecases/contiv/index.rst @@ -1,26 +1,26 @@ -.. _contiv:
-
-##########
-Contiv/VPP
-##########
-
-This section provides the following information about the Contiv function:
-
-.. toctree::
- :maxdepth: 2
-
- K8s_Overview
- SECURITY
- Vagrant
- MANUAL_INSTALL
- VPP_CONFIG
- VMWARE_FUSION_HOST
- NETWORKING
- SINGLE_NIC_SETUP
- MULTI_NIC_SETUP
- CUSTOM_MGMT_NETWORK
- Prometheus
- VPP_PACKET_TRACING_K8S
- VPPTRACE
- CORE_FILES
- BUG_REPORTS
+.. _contiv: + +===================================== +VPP in kubernetes (Contiv/Deprecated) +===================================== + +This section provides the following information about the Contiv function: + +.. toctree:: + :maxdepth: 2 + + K8s_Overview + SECURITY + Vagrant + MANUAL_INSTALL + VPP_CONFIG + VMWARE_FUSION_HOST + NETWORKING + SINGLE_NIC_SETUP + MULTI_NIC_SETUP + CUSTOM_MGMT_NETWORK + Prometheus + VPP_PACKET_TRACING_K8S + VPPTRACE + CORE_FILES + BUG_REPORTS diff --git a/docs/usecases/hgw.md b/docs/usecases/hgw.md deleted file mode 100644 index 0b659e9f818..00000000000 --- a/docs/usecases/hgw.md +++ /dev/null @@ -1,497 +0,0 @@ -Using VPP as a Home Gateway -=========================== - -Vpp running on a small system (with appropriate NICs) makes a fine -home gateway. The resulting system performs far in excess of -requirements: a debug image runs at a vector size of \~1.2 terminating -a 150-mbit down / 10-mbit up cable modem connection. - -At a minimum, install sshd and the isc-dhcp-server. If you prefer, you -can use dnsmasq. - -System configuration files --------------------------- - -/etc/vpp/startup.conf: - - unix { - nodaemon - log /var/log/vpp/vpp.log - full-coredump - cli-listen /run/vpp/cli.sock - startup-config /setup.gate - poll-sleep-usec 100 - gid vpp - } - api-segment { - gid vpp - } - dpdk { - dev 0000:03:00.0 - dev 0000:14:00.0 - etc. - } - - plugins { - ## Disable all plugins, selectively enable specific plugins - ## YMMV, you may wish to enable other plugins (acl, etc.) 
- plugin default { disable } - plugin dpdk_plugin.so { enable } - plugin nat_plugin.so { enable } - ## if you plan to use the time-based MAC filter - plugin mactime_plugin.so { enable } - } - -/etc/dhcp/dhcpd.conf: - - subnet 192.168.1.0 netmask 255.255.255.0 { - range 192.168.1.10 192.168.1.99; - option routers 192.168.1.1; - option domain-name-servers 8.8.8.8; - } - -If you decide to enable the vpp dns name resolver, substitute -192.168.1.2 for 8.8.8.8 in the dhcp server configuration. - -/etc/default/isc-dhcp-server: - - # On which interfaces should the DHCP server (dhcpd) serve DHCP requests? - # Separate multiple interfaces with spaces, e.g. "eth0 eth1". - INTERFACESv4="lstack" - INTERFACESv6="" - -/etc/ssh/sshd\_config: - - # What ports, IPs and protocols we listen for - Port <REDACTED-high-number-port> - # Change to no to disable tunnelled clear text passwords - PasswordAuthentication no - -For your own comfort and safety, do NOT allow password authentication -and do not answer ssh requests on port 22. Experience shows several hack -attempts per hour on port 22, but none (ever) on random high-number -ports. - -Systemd configuration ---------------------- - -In a typical home-gateway use-case, vpp owns the one-and-only WAN link -with a prayer of reaching the public internet. Simple things like -updating distro software require use of the \"lstack\" interface -created above, and configuring a plausible upstream DNS name resolver. - -Configure /etc/systemd/resolved.conf as follows. - -/etc/systemd/resolved.conf: - - [Resolve] - DNS=8.8.8.8 - #FallbackDNS= - #Domains= - #LLMNR=no - #MulticastDNS=no - #DNSSEC=no - #Cache=yes - #DNSStubListener=yes - -Netplan configuration ---------------------- - -If you want to configure a static IP address on one of your home-gateway -Ethernet ports on Ubuntu 18.04, you\'ll need to configure netplan. -Netplan is relatively new. Both it and the network manager GUI can be -cranky. 
In the configuration shown below, -s/enp4s0/\<your-interface\>/\... - -/etc/netplan-01-netcfg.yaml: - - # This file describes the network interfaces available on your system - # For more information, see netplan(5). - network: - version: 2 - renderer: networkd - ethernets: - enp4s0: - dhcp4: no - addresses: [192.168.2.254/24] - gateway4: 192.168.2.100 - nameservers: - search: [my.local] - addresses: [8.8.8.8] - -/etc/systemd/network-10.enp4s0.network: - - [Match] - Name=enp4s0 - - [Link] - RequiredForOnline=no - - [Network] - ConfigureWithoutCarrier=true - Address=192.168.2.254/24 - -Note that we\'ve picked an IP address for the home gateway which is on -an independent unrouteable subnet. This is handy for installing (and -possibly reverting) new vpp software. - -VPP Configuration Files ------------------------ - -Here we see a nice use-case for the vpp debug CLI macro expander: - -/setup.gate: - - define HOSTNAME vpp1 - define TRUNK GigabitEthernet3/0/0 - - comment { Specific MAC address yields a constant IP address } - define TRUNK_MACADDR 48:f8:b3:00:01:01 - define BVI_MACADDR 48:f8:b3:01:01:02 - - comment { inside subnet 192.168.<inside_subnet>.0/24 } - define INSIDE_SUBNET 1 - - define INSIDE_PORT1 GigabitEthernet6/0/0 - define INSIDE_PORT2 GigabitEthernet6/0/1 - define INSIDE_PORT3 GigabitEthernet8/0/0 - define INSIDE_PORT4 GigabitEthernet8/0/1 - - comment { feature selections } - define FEATURE_NAT44 comment - define FEATURE_CNAT uncomment - define FEATURE_DNS comment - define FEATURE_IP6 comment - define FEATURE_MACTIME uncomment - - exec /setup.tmpl - -/setup.tmpl: - - show macro - - set int mac address $(TRUNK) $(TRUNK_MACADDR) - set dhcp client intfc $(TRUNK) hostname $(HOSTNAME) - set int state $(TRUNK) up - - bvi create instance 0 - set int mac address bvi0 $(BVI_MACADDR) - set int l2 bridge bvi0 1 bvi - set int ip address bvi0 192.168.$(INSIDE_SUBNET).1/24 - set int state bvi0 up - - set int l2 bridge $(INSIDE_PORT1) 1 - set int state $(INSIDE_PORT1) 
up - set int l2 bridge $(INSIDE_PORT2) 1 - set int state $(INSIDE_PORT2) up - set int l2 bridge $(INSIDE_PORT3) 1 - set int state $(INSIDE_PORT3) up - set int l2 bridge $(INSIDE_PORT4) 1 - set int state $(INSIDE_PORT4) up - - comment { dhcp server and host-stack access } - create tap host-if-name lstack host-ip4-addr 192.168.$(INSIDE_SUBNET).2/24 host-ip4-gw 192.168.$(INSIDE_SUBNET).1 - set int l2 bridge tap0 1 - set int state tap0 up - - service restart isc-dhcp-server - - $(FEATURE_NAT44) { nat44 enable users 50 user-sessions 750 sessions 63000 } - $(FEATURE_NAT44) { nat44 add interface address $(TRUNK) } - $(FEATURE_NAT44) { set interface nat44 in bvi0 out $(TRUNK) } - - $(FEATURE_NAT44) { nat44 add static mapping local 192.168.$(INSIDE_SUBNET).2 22432 external $(TRUNK) 22432 tcp } - - $(FEATURE_CNAT) { cnat snat with $(TRUNK) } - $(FEATURE_CNAT) { set interface feature bvi0 ip4-cnat-snat arc ip4-unicast } - $(FEATURE_CNAT) { cnat translation add proto tcp real $(TRUNK) 22432 to -> 192.168.$(INSIDE_SUBNET).2 22432 } - $(FEATURE_CNAT) { $(FEATURE_DNS) { cnat translation add proto udp real $(TRUNK) 53053 to -> 192.168.$(INSIDE_SUBNET).1 53053 } } - - $(FEATURE_DNS) { $(FEATURE_NAT44) { nat44 add identity mapping external $(TRUNK) udp 53053 } } - $(FEATURE_DNS) { bin dns_name_server_add_del 8.8.8.8 } - $(FEATURE_DNS) { bin dns_enable_disable } - - comment { set ct6 inside $(TRUNK) } - comment { set ct6 outside $(TRUNK) } - - $(FEATURE_IP6) { set int ip6 table $(TRUNK) 0 } - $(FEATURE_IP6) { ip6 nd address autoconfig $(TRUNK) default-route } - $(FEATURE_IP6) { dhcp6 client $(TRUNK) } - $(FEATURE_IP6) { dhcp6 pd client $(TRUNK) prefix group hgw } - $(FEATURE_IP6) { set ip6 address bvi0 prefix group hgw ::1/64 } - $(FEATURE_IP6) { ip6 nd address autoconfig bvi0 default-route } - comment { iPhones seem to need lots of RA messages... 
} - $(FEATURE_IP6) { ip6 nd bvi0 ra-managed-config-flag ra-other-config-flag ra-interval 5 3 ra-lifetime 180 } - comment { ip6 nd bvi0 prefix 0::0/0 ra-lifetime 100000 } - - - $(FEATURE_MACTIME) { bin mactime_add_del_range name cisco-vpn mac a8:b4:56:e1:b8:3e allow-static } - $(FEATURE_MACTIME) { bin mactime_add_del_range name old-mac mac <redacted> allow-static } - $(FEATURE_MACTIME) { bin mactime_add_del_range name roku mac <redacted> allow-static } - $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT1) } - $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT2) } - $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT3) } - $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT4) } - -Installing new vpp software ---------------------------- - -If you\'re **sure** that a given set of vpp Debian packages will install -and work properly, you can install them while logged into the gateway -via the lstack / nat path. This procedure is a bit like standing on a -rug and yanking it. If all goes well, a perfect back-flip occurs. If -not, you may wish that you\'d configured a static IP address on a -reserved Ethernet interface as described above. - -Installing a new vpp image via ssh to 192.168.1.2: - - # nohup dpkg -i *.deb >/dev/null 2>&1 & - -Within a few seconds, the inbound ssh connection SHOULD begin to respond -again. If it does not, you\'ll have to debug the issue(s). - -Reasonably Robust Remote Software Installation ----------------------------------------------- - -Here are a couple of scripts which yield a reasonably robust software -installation scheme. 
- -### Build-host script - - #!/bin/bash - - buildroot=/scratch/vpp-workspace/build-root - if [ $1x = "testx" ] ; then - subdir="test" - ipaddr="192.168.2.48" - elif [ $1x = "foox" ] ; then - subdir="foo" - ipaddr="foo.some.net" - elif [ $1x = "barx" ] ; then - subdir="bar" - ipaddr="bar.some.net" - else - subdir="test" - ipaddr="192.168.2.48" - fi - - echo Save current software... - ssh -p 22432 $ipaddr "rm -rf /gate_debians.prev" - ssh -p 22432 $ipaddr "mv /gate_debians /gate_debians.prev" - ssh -p 22432 $ipaddr "mkdir /gate_debians" - echo Copy new software to the gateway... - scp -P 22432 $buildroot/*.deb $ipaddr:/gate_debians - echo Install new software... - ssh -p 22432 $ipaddr "nohup /usr/local/bin/vpp-swupdate > /dev/null 2>&1 &" - - for i in 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 - do - echo Wait for $i seconds... - sleep 1 - done - - echo Try to access the device... - - ssh -p 22432 -o ConnectTimeout=10 $ipaddr "tail -20 /var/log/syslog | grep Ping" - if [ $? == 0 ] ; then - echo Access test OK... - else - echo Access failed, wait for configuration restoration... - for i in 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 - do - echo Wait for $i seconds... - sleep 1 - done - echo Retry access test - ssh -p 22432 -o ConnectTimeout=10 $ipaddr "tail -20 /var/log/syslog | grep Ping" - if [ $? == 0 ] ; then - echo Access test OK, check syslog on the device - exit 1 - else - echo Access test still fails, manual intervention required. - exit 2 - fi - fi - - exit 0 - -### Target script - - #!/bin/bash - - logger "About to update vpp software..." - cd /gate_debians - service vpp stop - sudo dpkg -i *.deb >/dev/null 2>&1 & - sleep 20 - logger "Ping connectivity test..." - for i in 1 2 3 4 5 6 7 8 9 10 - do - ping -4 -c 1 yahoo.com - if [ $? == 0 ] ; then - logger "Ping test OK..." - exit 0 - fi - done - - logger "Ping test NOT OK, restore old software..." 
- rm -rf /gate_debians - mv /gate_debians.prev /gate_debians - cd /gate_debians - nohup sudo dpkg -i *.deb >/dev/null 2>&1 & - sleep 20 - logger "Repeat connectivity test..." - for i in 1 2 3 4 5 6 7 8 9 10 - do - ping -4 -c 1 yahoo.com - if [ $? == 0 ] ; then - logger "Ping test OK after restoring old software..." - exit 0 - fi - done - - logger "Ping test FAIL after restoring software, manual intervention required" - exit 2 - -Note that the target script **requires** that the userid which invokes -it will manage to "sudo dpkg ..." without further authentication. If -you're uncomfortable with the security implications of that -requirement, you'll need to solve the problem a different -way. Strongly suggest configuring sshd as described above to minimize -risk. - - -Testing new software --------------------- - -If you frequently test new home gateway software, it may be handy to set -up a test gateway behind your production gateway. This testing -methodology reduces complaints from family members, to name one benefit. - -Change the inside network (dhcp) subnet from 192.168.1.0/24 to -192.168.3.0/24, change the (dhcp) advertised router to 192.168.3.1, -reconfigure the vpp tap interface addresses onto the 192.168.3.0/24 -subnet, and you should be all set. - -This scenario nats traffic twice: first, from the 192.168.3.0/24 network -onto the 192.168.1.0/24 network. Next, from the 192.168.1.0/24 network -onto the public internet. 
- -Patches -------- - -You\'ll want this addition to src/vpp/vnet/main.c to add the \"service -restart isc-dhcp-server" and \"service restart vpp\" commands: - - #include <sys/types.h> - #include <sys/wait.h> - - static int - mysystem (char *cmd) - { - int rv = 0; - - if (fork()) - wait (&rv); - else - execl("/bin/sh", "sh", "-c", cmd); - - if (rv != 0) - clib_unix_warning ("('%s') child process returned %d", cmd, rv); - return rv; - } - - static clib_error_t * - restart_isc_dhcp_server_command_fn (vlib_main_t * vm, - unformat_input_t * input, - vlib_cli_command_t * cmd) - { - int rv; - - /* Wait a while... */ - vlib_process_suspend (vm, 2.0); - - rv = mysystem("/usr/sbin/service isc-dhcp-server restart"); - - vlib_cli_output (vm, "Restarted the isc-dhcp-server, status %d...", rv); - return 0; - } - - /* *INDENT-OFF* */ - VLIB_CLI_COMMAND (restart_isc_dhcp_server_command, static) = - { - .path = "service restart isc-dhcp-server", - .short_help = "restarts the isc-dhcp-server", - .function = restart_isc_dhcp_server_command_fn, - }; - /* *INDENT-ON* */ - - static clib_error_t * - restart_dora_tunnels_command_fn (vlib_main_t * vm, - unformat_input_t * input, - vlib_cli_command_t * cmd) - { - int rv; - - /* Wait three seconds... 
*/ - vlib_process_suspend (vm, 3.0); - - rv = mysystem ("/usr/sbin/service dora restart"); - - vlib_cli_output (vm, "Restarted the dora tunnel service, status %d...", rv); - return 0; - } - - /* *INDENT-OFF* */ - VLIB_CLI_COMMAND (restart_dora_tunnels_command, static) = - { - .path = "service restart dora", - .short_help = "restarts the dora tunnel service", - .function = restart_dora_tunnels_command_fn, - }; - /* *INDENT-ON* */ - - static clib_error_t * - restart_vpp_service_command_fn (vlib_main_t * vm, - unformat_input_t * input, - vlib_cli_command_t * cmd) - { - (void) mysystem ("/usr/sbin/service vpp restart"); - return 0; - } - - /* *INDENT-OFF* */ - VLIB_CLI_COMMAND (restart_vpp_service_command, static) = - { - .path = "service restart vpp", - .short_help = "restarts the vpp service, be careful what you wish for", - .function = restart_vpp_service_command_fn, - }; - /* *INDENT-ON* */ - -Using the time-based mac filter plugin --------------------------------------- - -If you need to restrict network access for certain devices to specific -daily time ranges, configure the \"mactime\" plugin. Add it to the list -of enabled plugins in /etc/vpp/startup.conf, then enable the feature on -the NAT \"inside\" interfaces: - - bin mactime_enable_disable GigabitEthernet0/14/0 - bin mactime_enable_disable GigabitEthernet0/14/1 - ... - -Create the required src-mac-address rule database. 
There are 4 rule -entry types: - -- allow-static - pass traffic from this mac address -- drop-static - drop traffic from this mac address -- allow-range - pass traffic from this mac address at specific times -- drop-range - drop traffic from this mac address at specific times - -Here are some examples: - - bin mactime_add_del_range name alarm-system mac 00:de:ad:be:ef:00 allow-static - bin mactime_add_del_range name unwelcome mac 00:de:ad:be:ef:01 drop-static - bin mactime_add_del_range name not-during-business-hours mac <mac> drop-range Mon - Fri 7:59 - 18:01 - bin mactime_add_del_range name monday-busines-hours mac <mac> allow-range Mon 7:59 - 18:01 diff --git a/docs/usecases/home_gateway.rst b/docs/usecases/home_gateway.rst new file mode 100644 index 00000000000..90d25cf4a6c --- /dev/null +++ b/docs/usecases/home_gateway.rst @@ -0,0 +1,520 @@ +VPP as a Home Gateway +===================== + +Vpp running on a small system (with appropriate NICs) makes a fine home +gateway. The resulting system performs far in excess of requirements: a +debug image runs at a vector size of ~1.2 terminating a 150-mbit down / +10-mbit up cable modem connection. + +At a minimum, install sshd and the isc-dhcp-server. If you prefer, you +can use dnsmasq. + +System configuration files +-------------------------- + +/etc/vpp/startup.conf: + +.. code-block:: c + + unix { + nodaemon + log /var/log/vpp/vpp.log + full-coredump + cli-listen /run/vpp/cli.sock + startup-config /setup.gate + poll-sleep-usec 100 + gid vpp + } + api-segment { + gid vpp + } + dpdk { + dev 0000:03:00.0 + dev 0000:14:00.0 + etc. + } + + plugins { + ## Disable all plugins, selectively enable specific plugins + ## YMMV, you may wish to enable other plugins (acl, etc.) + plugin default { disable } + plugin dpdk_plugin.so { enable } + plugin nat_plugin.so { enable } + ## if you plan to use the time-based MAC filter + plugin mactime_plugin.so { enable } + } + +/etc/dhcp/dhcpd.conf: + +.. 
code-block:: c + + subnet 192.168.1.0 netmask 255.255.255.0 { + range 192.168.1.10 192.168.1.99; + option routers 192.168.1.1; + option domain-name-servers 8.8.8.8; + } + +If you decide to enable the vpp dns name resolver, substitute +192.168.1.2 for 8.8.8.8 in the dhcp server configuration. + +/etc/default/isc-dhcp-server: + +.. code-block:: c + + # On which interfaces should the DHCP server (dhcpd) serve DHCP requests? + # Separate multiple interfaces with spaces, e.g. "eth0 eth1". + INTERFACESv4="lstack" + INTERFACESv6="" + +/etc/ssh/sshd_config: + +.. code-block:: c + + # What ports, IPs and protocols we listen for + Port <REDACTED-high-number-port> + # Change to no to disable tunnelled clear text passwords + PasswordAuthentication no + +For your own comfort and safety, do NOT allow password authentication +and do not answer ssh requests on port 22. Experience shows several hack +attempts per hour on port 22, but none (ever) on random high-number +ports. + +Systemd configuration +--------------------- + +In a typical home-gateway use-case, vpp owns the one-and-only WAN link +with a prayer of reaching the public internet. Simple things like +updating distro software require use of the "lstack" interface created +above, and configuring a plausible upstream DNS name resolver. + +Configure /etc/systemd/resolved.conf as follows. + +/etc/systemd/resolved.conf: + +.. code-block:: c + + [Resolve] + DNS=8.8.8.8 + #FallbackDNS= + #Domains= + #LLMNR=no + #MulticastDNS=no + #DNSSEC=no + #Cache=yes + #DNSStubListener=yes + +Netplan configuration +--------------------- + +If you want to configure a static IP address on one of your home-gateway +Ethernet ports on Ubuntu 18.04, you'll need to configure netplan. +Netplan is relatively new. It and the network manager GUI can be +cranky. In the configuration shown below, s/enp4s0/<your-interface>/... + +/etc/netplan-01-netcfg.yaml: + +..
code-block:: c + + # This file describes the network interfaces available on your system + # For more information, see netplan(5). + network: + version: 2 + renderer: networkd + ethernets: + enp4s0: + dhcp4: no + addresses: [192.168.2.254/24] + gateway4: 192.168.2.100 + nameservers: + search: [my.local] + addresses: [8.8.8.8] + +/etc/systemd/network-10.enp4s0.network: + +.. code-block:: c + + [Match] + Name=enp4s0 + + [Link] + RequiredForOnline=no + + [Network] + ConfigureWithoutCarrier=true + Address=192.168.2.254/24 + +Note that we've picked an IP address for the home gateway which is on an +independent unrouteable subnet. This is handy for installing (and +possibly reverting) new vpp software. + +VPP Configuration Files +----------------------- + +Here we see a nice use-case for the vpp debug CLI macro expander: + +/setup.gate: + +.. code-block:: c + + define HOSTNAME vpp1 + define TRUNK GigabitEthernet3/0/0 + + comment { Specific MAC address yields a constant IP address } + define TRUNK_MACADDR 48:f8:b3:00:01:01 + define BVI_MACADDR 48:f8:b3:01:01:02 + + comment { inside subnet 192.168.<inside_subnet>.0/24 } + define INSIDE_SUBNET 1 + + define INSIDE_PORT1 GigabitEthernet6/0/0 + define INSIDE_PORT2 GigabitEthernet6/0/1 + define INSIDE_PORT3 GigabitEthernet8/0/0 + define INSIDE_PORT4 GigabitEthernet8/0/1 + + comment { feature selections } + define FEATURE_NAT44 comment + define FEATURE_CNAT uncomment + define FEATURE_DNS comment + define FEATURE_IP6 comment + define FEATURE_MACTIME uncomment + + exec /setup.tmpl + +/setup.tmpl: + +.. 
code-block:: c + + show macro + + set int mac address $(TRUNK) $(TRUNK_MACADDR) + set dhcp client intfc $(TRUNK) hostname $(HOSTNAME) + set int state $(TRUNK) up + + bvi create instance 0 + set int mac address bvi0 $(BVI_MACADDR) + set int l2 bridge bvi0 1 bvi + set int ip address bvi0 192.168.$(INSIDE_SUBNET).1/24 + set int state bvi0 up + + set int l2 bridge $(INSIDE_PORT1) 1 + set int state $(INSIDE_PORT1) up + set int l2 bridge $(INSIDE_PORT2) 1 + set int state $(INSIDE_PORT2) up + set int l2 bridge $(INSIDE_PORT3) 1 + set int state $(INSIDE_PORT3) up + set int l2 bridge $(INSIDE_PORT4) 1 + set int state $(INSIDE_PORT4) up + + comment { dhcp server and host-stack access } + create tap host-if-name lstack host-ip4-addr 192.168.$(INSIDE_SUBNET).2/24 host-ip4-gw 192.168.$(INSIDE_SUBNET).1 + set int l2 bridge tap0 1 + set int state tap0 up + + service restart isc-dhcp-server + + $(FEATURE_NAT44) { nat44 enable users 50 user-sessions 750 sessions 63000 } + $(FEATURE_NAT44) { nat44 add interface address $(TRUNK) } + $(FEATURE_NAT44) { set interface nat44 in bvi0 out $(TRUNK) } + + $(FEATURE_NAT44) { nat44 add static mapping local 192.168.$(INSIDE_SUBNET).2 22432 external $(TRUNK) 22432 tcp } + + $(FEATURE_CNAT) { cnat snat with $(TRUNK) } + $(FEATURE_CNAT) { set interface feature bvi0 ip4-cnat-snat arc ip4-unicast } + $(FEATURE_CNAT) { cnat translation add proto tcp real $(TRUNK) 22432 to -> 192.168.$(INSIDE_SUBNET).2 22432 } + $(FEATURE_CNAT) { $(FEATURE_DNS) { cnat translation add proto udp real $(TRUNK) 53053 to -> 192.168.$(INSIDE_SUBNET).1 53053 } } + + $(FEATURE_DNS) { $(FEATURE_NAT44) { nat44 add identity mapping external $(TRUNK) udp 53053 } } + $(FEATURE_DNS) { bin dns_name_server_add_del 8.8.8.8 } + $(FEATURE_DNS) { bin dns_enable_disable } + + comment { set ct6 inside $(TRUNK) } + comment { set ct6 outside $(TRUNK) } + + $(FEATURE_IP6) { set int ip6 table $(TRUNK) 0 } + $(FEATURE_IP6) { ip6 nd address autoconfig $(TRUNK) default-route } + $(FEATURE_IP6) { 
dhcp6 client $(TRUNK) } + $(FEATURE_IP6) { dhcp6 pd client $(TRUNK) prefix group hgw } + $(FEATURE_IP6) { set ip6 address bvi0 prefix group hgw ::1/64 } + $(FEATURE_IP6) { ip6 nd address autoconfig bvi0 default-route } + comment { iPhones seem to need lots of RA messages... } + $(FEATURE_IP6) { ip6 nd bvi0 ra-managed-config-flag ra-other-config-flag ra-interval 5 3 ra-lifetime 180 } + comment { ip6 nd bvi0 prefix 0::0/0 ra-lifetime 100000 } + + + $(FEATURE_MACTIME) { bin mactime_add_del_range name cisco-vpn mac a8:b4:56:e1:b8:3e allow-static } + $(FEATURE_MACTIME) { bin mactime_add_del_range name old-mac mac <redacted> allow-static } + $(FEATURE_MACTIME) { bin mactime_add_del_range name roku mac <redacted> allow-static } + $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT1) } + $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT2) } + $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT3) } + $(FEATURE_MACTIME) { bin mactime_enable_disable $(INSIDE_PORT4) } + +Installing new vpp software +--------------------------- + +If you're **sure** that a given set of vpp Debian packages will install +and work properly, you can install them while logged into the gateway +via the lstack / nat path. This procedure is a bit like standing on a +rug and yanking it. If all goes well, a perfect back-flip occurs. If +not, you may wish that you'd configured a static IP address on a +reserved Ethernet interface as described above. + +Installing a new vpp image via ssh to 192.168.1.2: + +.. code-block:: c + + # nohup dpkg -i *.deb >/dev/null 2>&1 & + +Within a few seconds, the inbound ssh connection SHOULD begin to respond +again. If it does not, you'll have to debug the issue(s). + +Reasonably Robust Remote Software Installation +---------------------------------------------- + +Here are a couple of scripts which yield a reasonably robust software +installation scheme. + +Build-host script +~~~~~~~~~~~~~~~~~ + +.. 
code-block:: c + + #!/bin/bash + + buildroot=/scratch/vpp-workspace/build-root + if [ $1x = "testx" ] ; then + subdir="test" + ipaddr="192.168.2.48" + elif [ $1x = "foox" ] ; then + subdir="foo" + ipaddr="foo.some.net" + elif [ $1x = "barx" ] ; then + subdir="bar" + ipaddr="bar.some.net" + else + subdir="test" + ipaddr="192.168.2.48" + fi + + echo Save current software... + ssh -p 22432 $ipaddr "rm -rf /gate_debians.prev" + ssh -p 22432 $ipaddr "mv /gate_debians /gate_debians.prev" + ssh -p 22432 $ipaddr "mkdir /gate_debians" + echo Copy new software to the gateway... + scp -P 22432 $buildroot/*.deb $ipaddr:/gate_debians + echo Install new software... + ssh -p 22432 $ipaddr "nohup /usr/local/bin/vpp-swupdate > /dev/null 2>&1 &" + + for i in 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 + do + echo Wait for $i seconds... + sleep 1 + done + + echo Try to access the device... + + ssh -p 22432 -o ConnectTimeout=10 $ipaddr "tail -20 /var/log/syslog | grep Ping" + if [ $? == 0 ] ; then + echo Access test OK... + else + echo Access failed, wait for configuration restoration... + for i in 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 + do + echo Wait for $i seconds... + sleep 1 + done + echo Retry access test + ssh -p 22432 -o ConnectTimeout=10 $ipaddr "tail -20 /var/log/syslog | grep Ping" + if [ $? == 0 ] ; then + echo Access test OK, check syslog on the device + exit 1 + else + echo Access test still fails, manual intervention required. + exit 2 + fi + fi + + exit 0 + +Target script +~~~~~~~~~~~~~ + +.. code-block:: c + + #!/bin/bash + + logger "About to update vpp software..." + cd /gate_debians + service vpp stop + sudo dpkg -i *.deb >/dev/null 2>&1 & + sleep 20 + logger "Ping connectivity test..." + for i in 1 2 3 4 5 6 7 8 9 10 + do + ping -4 -c 1 yahoo.com + if [ $? == 0 ] ; then + logger "Ping test OK..." + exit 0 + fi + done + + logger "Ping test NOT OK, restore old software..." 
+ rm -rf /gate_debians + mv /gate_debians.prev /gate_debians + cd /gate_debians + nohup sudo dpkg -i *.deb >/dev/null 2>&1 & + sleep 20 + logger "Repeat connectivity test..." + for i in 1 2 3 4 5 6 7 8 9 10 + do + ping -4 -c 1 yahoo.com + if [ $? == 0 ] ; then + logger "Ping test OK after restoring old software..." + exit 0 + fi + done + + logger "Ping test FAIL after restoring software, manual intervention required" + exit 2 + +Note that the target script **requires** that the user id which invokes +it can run “sudo dpkg …” without further authentication. If +you’re uncomfortable with the security implications of that requirement, +you’ll need to solve the problem a different way. We strongly suggest +configuring sshd as described above to minimize risk. + +Testing new software +-------------------- + +If you frequently test new home gateway software, it may be handy to set +up a test gateway behind your production gateway. This testing +methodology reduces complaints from family members, to name one benefit. + +Change the inside network (dhcp) subnet from 192.168.1.0/24 to +192.168.3.0/24, change the (dhcp) advertised router to 192.168.3.1, +reconfigure the vpp tap interface addresses onto the 192.168.3.0/24 +subnet, and you should be all set. + +This scenario nats traffic twice: first, from the 192.168.3.0/24 network +onto the 192.168.1.0/24 network. Next, from the 192.168.1.0/24 network +onto the public internet. + +Patches +------- + +You'll want this addition to src/vpp/vnet/main.c to add the "service +restart isc-dhcp-server" and "service restart vpp" commands: + +..
code-block:: c + + #include <sys/types.h> + #include <sys/wait.h> + + static int + mysystem (char *cmd) + { + int rv = 0; + + if (fork()) + wait (&rv); + else + execl("/bin/sh", "sh", "-c", cmd, (char *) NULL); + + if (rv != 0) + clib_unix_warning ("('%s') child process returned %d", cmd, rv); + return rv; + } + + static clib_error_t * + restart_isc_dhcp_server_command_fn (vlib_main_t * vm, + unformat_input_t * input, + vlib_cli_command_t * cmd) + { + int rv; + + /* Wait a while... */ + vlib_process_suspend (vm, 2.0); + + rv = mysystem("/usr/sbin/service isc-dhcp-server restart"); + + vlib_cli_output (vm, "Restarted the isc-dhcp-server, status %d...", rv); + return 0; + } + + VLIB_CLI_COMMAND (restart_isc_dhcp_server_command, static) = + { + .path = "service restart isc-dhcp-server", + .short_help = "restarts the isc-dhcp-server", + .function = restart_isc_dhcp_server_command_fn, + }; + + static clib_error_t * + restart_dora_tunnels_command_fn (vlib_main_t * vm, + unformat_input_t * input, + vlib_cli_command_t * cmd) + { + int rv; + + /* Wait three seconds...
*/ + vlib_process_suspend (vm, 3.0); + + rv = mysystem ("/usr/sbin/service dora restart"); + + vlib_cli_output (vm, "Restarted the dora tunnel service, status %d...", rv); + return 0; + } + + VLIB_CLI_COMMAND (restart_dora_tunnels_command, static) = + { + .path = "service restart dora", + .short_help = "restarts the dora tunnel service", + .function = restart_dora_tunnels_command_fn, + }; + + static clib_error_t * + restart_vpp_service_command_fn (vlib_main_t * vm, + unformat_input_t * input, + vlib_cli_command_t * cmd) + { + (void) mysystem ("/usr/sbin/service vpp restart"); + return 0; + } + + VLIB_CLI_COMMAND (restart_vpp_service_command, static) = + { + .path = "service restart vpp", + .short_help = "restarts the vpp service, be careful what you wish for", + .function = restart_vpp_service_command_fn, + }; + +Using the time-based mac filter plugin +-------------------------------------- + +If you need to restrict network access for certain devices to specific +daily time ranges, configure the "mactime" plugin. Add it to the list of +enabled plugins in /etc/vpp/startup.conf, then enable the feature on the +NAT "inside" interfaces: + +.. code-block:: c + + bin mactime_enable_disable GigabitEthernet0/14/0 + bin mactime_enable_disable GigabitEthernet0/14/1 + ... + +Create the required src-mac-address rule database. There are 4 rule +entry types: + +- allow-static - pass traffic from this mac address +- drop-static - drop traffic from this mac address +- allow-range - pass traffic from this mac address at specific times +- drop-range - drop traffic from this mac address at specific times + +Here are some examples: + +.. 
code-block:: c + + bin mactime_add_del_range name alarm-system mac 00:de:ad:be:ef:00 allow-static + bin mactime_add_del_range name unwelcome mac 00:de:ad:be:ef:01 drop-static + bin mactime_add_del_range name not-during-business-hours mac <mac> drop-range Mon - Fri 7:59 - 18:01 + bin mactime_add_del_range name monday-business-hours mac <mac> allow-range Mon 7:59 - 18:01 diff --git a/docs/usecases/ikev2/2_vpp.rst b/docs/usecases/ikev2/2_vpp.rst new file mode 100644 index 00000000000..2c6fe6b88e4 --- /dev/null +++ b/docs/usecases/ikev2/2_vpp.rst @@ -0,0 +1,128 @@ +How to connect VPP instances using IKEv2 +======================================== + +This section describes how to initiate an IKEv2 session between two VPP +instances using Linux veth interfaces and namespaces. + +Create veth interfaces and namespaces and configure them: + +:: + + sudo ip link add ifresp type veth peer name ifinit + sudo ip link set dev ifresp up + sudo ip link set dev ifinit up + + sudo ip netns add clientns + sudo ip netns add serverns + sudo ip link add veth_client type veth peer name client + sudo ip link add veth_server type veth peer name server + sudo ip link set dev veth_client up netns clientns + sudo ip link set dev veth_server up netns serverns + + sudo ip netns exec clientns \ + bash -c " + ip link set dev lo up + ip addr add 192.168.5.2/24 dev veth_client + ip addr add fec5::2/16 dev veth_client + ip route add 192.168.3.0/24 via 192.168.5.1 + ip route add fec3::0/16 via fec5::1 + " + + sudo ip netns exec serverns \ + bash -c " + ip link set dev lo up + ip addr add 192.168.3.2/24 dev veth_server + ip addr add fec3::2/16 dev veth_server + ip route add 192.168.5.0/24 via 192.168.3.1 + ip route add fec5::0/16 via fec3::1 + " + +Run responder VPP: + +:: + + sudo /usr/bin/vpp unix { \ + cli-listen /tmp/vpp_resp.sock \ + gid $(id -g) } \ + api-segment { prefix vpp } \ + plugins { plugin dpdk_plugin.so { disable } } + +Configure the responder + +:: + + create host-interface name ifresp +
set interface ip addr host-ifresp 192.168.10.2/24 + set interface state host-ifresp up + + create host-interface name server + set interface ip addr host-server 192.168.3.1/24 + set interface state host-server up + + ikev2 profile add pr1 + ikev2 profile set pr1 auth shared-key-mic string Vpp123 + ikev2 profile set pr1 id local ipv4 192.168.10.2 + ikev2 profile set pr1 id remote ipv4 192.168.10.1 + + ikev2 profile set pr1 traffic-selector local ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 + ikev2 profile set pr1 traffic-selector remote ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 protocol 0 + + create ipip tunnel src 192.168.10.2 dst 192.168.10.1 + ikev2 profile set pr1 tunnel ipip0 + ip route add 192.168.5.0/24 via 192.168.10.1 ipip0 + set interface unnumbered ipip0 use host-ifresp + +Run initiator VPP: + +:: + + sudo /usr/bin/vpp unix { \ + cli-listen /tmp/vpp_init.sock \ + gid $(id -g) } \ + api-segment { prefix vpp } \ + plugins { plugin dpdk_plugin.so { disable } } + +Configure initiator: + +:: + + create host-interface name ifinit + set interface ip addr host-ifinit 192.168.10.1/24 + set interface state host-ifinit up + + create host-interface name client + set interface ip addr host-client 192.168.5.1/24 + set interface state host-client up + + ikev2 profile add pr1 + ikev2 profile set pr1 auth shared-key-mic string Vpp123 + ikev2 profile set pr1 id local ipv4 192.168.10.1 + ikev2 profile set pr1 id remote ipv4 192.168.10.2 + + ikev2 profile set pr1 traffic-selector remote ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 + ikev2 profile set pr1 traffic-selector local ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 protocol 0 + + ikev2 profile set pr1 responder host-ifinit 192.168.10.2 + ikev2 profile set pr1 ike-crypto-alg aes-gcm-16 256 ike-dh modp-2048 + ikev2 profile set pr1 esp-crypto-alg aes-gcm-16 256 + + create ipip tunnel src 192.168.10.1 dst 192.168.10.2 + ikev2 profile set pr1 
tunnel ipip0 + ip route add 192.168.3.0/24 via 192.168.10.2 ipip0 + set interface unnumbered ipip0 use host-ifinit + +Initiate the IKEv2 connection: + +:: + + vpp# ikev2 initiate sa-init pr1 + +The responder’s and initiator’s private networks are now connected with +an IPsec tunnel: + +:: + + $ sudo ip netns exec clientns ping 192.168.3.1 + PING 192.168.3.1 (192.168.3.1) 56(84) bytes of data. + 64 bytes from 192.168.3.1: icmp_seq=1 ttl=63 time=1.64 ms + 64 bytes from 192.168.3.1: icmp_seq=2 ttl=63 time=7.24 ms diff --git a/docs/usecases/ikev2.rst b/docs/usecases/ikev2/index.rst index 853b22ef738..c9829b41908 100644 --- a/docs/usecases/ikev2.rst +++ b/docs/usecases/ikev2/index.rst @@ -1,7 +1,7 @@ .. _ikev2: -IKEv2 in VPP -============ +IKEv2 with VPP +============== This sections describes some of the ways to establish IKEv2 connection between two VPP instances or VPP and strongSwan. It covers scenarios in diff --git a/docs/usecases/ikev2/vpp_init_sswan_resp.rst b/docs/usecases/ikev2/vpp_init_sswan_resp.rst new file mode 100644 index 00000000000..93862185773 --- /dev/null +++ b/docs/usecases/ikev2/vpp_init_sswan_resp.rst @@ -0,0 +1,202 @@ +VPP as IKEv2 initiator and strongSwan as responder +================================================== + +Prerequisites +------------- + +To keep the examples easy to configure, ``docker`` is required to +pull the strongSwan docker image. The networking is done using Linux veth +interfaces and namespaces.
+ +Setup +----- + +First a topology: + +:: + + 192.168.3.2 192.168.5.2 + + loopback + | + + +----+----+ 192.168.10.2 +-----+----+ + | VPP | |strongSwan| + |initiator+----------------------+responder | + +---------+ +----------+ + 192.168.10.1 + +Create veth interfaces and namespaces and configure them: + +:: + + sudo ip link add gw type veth peer name swanif + sudo ip link set dev gw up + + sudo ip netns add ns + sudo ip link add veth_priv type veth peer name priv + sudo ip link set dev priv up + sudo ip link set dev veth_priv up netns ns + + sudo ip netns exec ns \ + bash -c " + ip link set dev lo up + ip addr add 192.168.3.2/24 dev veth_priv + ip route add 192.168.5.0/24 via 192.168.3.1" + +Create directory with strongswan configs that will be mounted to the +docker container + +:: + + mkdir /tmp/sswan + +Create the ``ipsec.conf`` file in the ``/tmp/sswan`` directory with +following content: + +:: + + config setup + strictcrlpolicy=no + + conn initiator + mobike=no + auto=add + type=tunnel + keyexchange=ikev2 + ike=aes256gcm16-prfsha256-modp2048! + esp=aes256gcm16-esn! 
+ + # local: + leftauth=psk + leftid=@sswan.vpn.example.com + leftsubnet=192.168.5.0/24 + + # remote: (gateway) + rightid=@roadwarrior.vpp + right=192.168.10.2 + rightauth=psk + rightsubnet=192.168.3.0/24 + +``/tmp/sswan/ipsec.secrets`` + +:: + + : PSK 'Vpp123' + +``/tmp/sswan/strongswan.conf`` + +:: + + charon { + load_modular = yes + plugins { + include strongswan.d/charon/*.conf + } + filelog { + /tmp/charon.log { + time_format = %b %e %T + ike_name = yes + append = no + default = 2 + flush_line = yes + } + } + } + include strongswan.d/*.conf + +Start docker container with strongSwan: + +:: + + docker run --name sswan -d --privileged --rm --net=none \ + -v /tmp/sswan:/conf -v /tmp/sswan:/etc/ipsec.d philplckthun/strongswan + +Finish configuration of initiator’s private network: + +:: + + pid=$(docker inspect --format "{{.State.Pid}}" sswan) + sudo ip link set netns $pid dev swanif + + sudo nsenter -t $pid -n ip addr add 192.168.10.1/24 dev swanif + sudo nsenter -t $pid -n ip link set dev swanif up + + sudo nsenter -t $pid -n ip addr add 192.168.5.2/32 dev lo + sudo nsenter -t $pid -n ip link set dev lo up + +Start VPP … + +:: + + sudo /usr/bin/vpp unix { \ + cli-listen /tmp/vpp.sock \ + gid $(id -g) } \ + api-segment { prefix vpp } \ + plugins { plugin dpdk_plugin.so { disable } } + +… and configure it: + +:: + + create host-interface name gw + set interface ip addr host-gw 192.168.10.2/24 + set interface state host-gw up + + create host-interface name priv + set interface ip addr host-priv 192.168.3.1/24 + set interface state host-priv up + + ikev2 profile add pr1 + ikev2 profile set pr1 auth shared-key-mic string Vpp123 + ikev2 profile set pr1 id local fqdn roadwarrior.vpp + ikev2 profile set pr1 id remote fqdn sswan.vpn.example.com + + ikev2 profile set pr1 traffic-selector local ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 + ikev2 profile set pr1 traffic-selector remote ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 
protocol 0 + + ikev2 profile set pr1 responder host-gw 192.168.10.1 + ikev2 profile set pr1 ike-crypto-alg aes-gcm-16 256 ike-dh modp-2048 + ikev2 profile set pr1 esp-crypto-alg aes-gcm-16 256 + + create ipip tunnel src 192.168.10.2 dst 192.168.10.1 + ikev2 profile set pr1 tunnel ipip0 + ip route add 192.168.5.0/24 via 192.168.10.1 ipip0 + set interface unnumbered ipip0 use host-gw + +Initiate the IKEv2 connection: + +:: + + vpp# ikev2 initiate sa-init pr1 + +:: + + vpp# show ikev2 sa details + iip 192.168.10.2 ispi f717b0cbd17e27c3 rip 192.168.10.1 rspi e9b7af7fc9b13361 + encr:aes-gcm-16 prf:hmac-sha2-256 dh-group:modp-2048 + nonce i:eb0354613b268c6372061bbdaab13deca37c8a625b1f65c073d25df2ecfe672e + r:70e1248ac09943047064f6a2135fa2a424778ba03038ab9c4c2af8aba179ed84 + SK_d 96bd4feb59be2edf1930a12a3a5d22e30195ee9f56ea203c5fb6cba5dd2bb80f + SK_e i:00000000: 5b75b9d808c8467fd00a0923c06efee2a4eb1d033c57532e05f9316ed9c56fe9 + 00000020: c4db9114 + r:00000000: 95121b63372d20b83558dc3e209b9affef042816cf071c86a53543677b40c15b + 00000020: f169ab67 + SK_p i:fb40d1114c347ddc3228ba004d4759d58f9c1ae6f1746833f908d39444ef92b1 + r:aa049828240cb242e1d5aa625cd5914dc8f8e980a74de8e06883623d19384902 + identifier (i) id-type fqdn data roadwarrior.vpp + identifier (r) id-type fqdn data sswan.vpn.example.com + child sa 0:encr:aes-gcm-16 esn:yes + spi(i) 9dffd57a spi(r) c4e0ef53 + SK_e i:290c681694f130b33d511335dd257e78721635b7e8aa87930dd77bb1d6dd3f42 + r:0a09fa18cf1cf65c6324df02b46dcc998b84e5397cf911b63e0c096053946c2e + traffic selectors (i):0 type 7 protocol_id 0 addr 192.168.3.0 - 192.168.3.255 port 0 - 65535 + traffic selectors (r):0 type 7 protocol_id 0 addr 192.168.5.0 - 192.168.5.255 port 0 - 65535 + +Now we can generate some traffic between responder’s and initiator’s +private networks and see it works. + +:: + + $ sudo ip netns exec ns ping 192.168.5.2 + PING 192.168.5.2 (192.168.5.2) 56(84) bytes of data. 
+ 64 bytes from 192.168.5.2: icmp_seq=1 ttl=63 time=0.450 ms + 64 bytes from 192.168.5.2: icmp_seq=2 ttl=63 time=0.630 ms diff --git a/docs/usecases/ikev2/vpp_resp_sswan_init.rst b/docs/usecases/ikev2/vpp_resp_sswan_init.rst new file mode 100644 index 00000000000..9f3c7e7cbaf --- /dev/null +++ b/docs/usecases/ikev2/vpp_resp_sswan_init.rst @@ -0,0 +1,203 @@ +VPP as IKEv2 responder and strongSwan as initiator +================================================== + +Prerequisites +------------- + +To make the examples easier to configure, ``docker`` is required in order to +pull the strongSwan docker image. The networking is done using Linux veth +interfaces and namespaces. + +Setup +----- + +First a topology: + +:: + + 192.168.3.2 192.168.5.2 + + loopback + | + + +----+----+ 192.168.10.2 +-----+----+ + | VPP | |initiator | + |responder+----------------------+strongSwan| + +---------+ +----------+ + 192.168.10.1 + +Create veth interfaces and namespaces and configure them: + +:: + + sudo ip link add gw type veth peer name swanif + sudo ip link set dev gw up + + sudo ip netns add ns + sudo ip link add veth_priv type veth peer name priv + sudo ip link set dev priv up + sudo ip link set dev veth_priv up netns ns + + sudo ip netns exec ns \ + bash -c " + ip link set dev lo up + ip addr add 192.168.3.2/24 dev veth_priv + ip route add 192.168.5.0/24 via 192.168.3.1" + +Create a directory with strongSwan configs that will be mounted into the +docker container: + +:: + + mkdir /tmp/sswan + +Create the ``ipsec.conf`` file in the ``/tmp/sswan`` directory with the +following content: + +:: + + config setup + strictcrlpolicy=no + + conn initiator + mobike=no + auto=add + type=tunnel + keyexchange=ikev2 + ike=aes256gcm16-prfsha256-modp2048! + esp=aes256gcm16-esn! 
+ + # local: + leftauth=psk + leftid=@roadwarrior.vpn.example.com + leftsubnet=192.168.5.0/24 + + # remote: (vpp gateway) + rightid=@vpp.home + right=192.168.10.2 + rightauth=psk + rightsubnet=192.168.3.0/24 + +``/tmp/sswan/ipsec.secrets`` + +:: + + : PSK 'Vpp123' + +``/tmp/sswan/strongswan.conf`` + +:: + + charon { + load_modular = yes + plugins { + include strongswan.d/charon/*.conf + } + filelog { + /tmp/charon.log { + time_format = %b %e %T + ike_name = yes + append = no + default = 2 + flush_line = yes + } + } + } + include strongswan.d/*.conf + +Start docker container with strongSwan: + +:: + + docker run --name sswan -d --privileged --rm --net=none \ + -v /tmp/sswan:/conf -v /tmp/sswan:/etc/ipsec.d philplckthun/strongswan + +Finish configuration of initiator’s private network: + +:: + + pid=$(docker inspect --format "{{.State.Pid}}" sswan) + sudo ip link set netns $pid dev swanif + + sudo nsenter -t $pid -n ip addr add 192.168.10.1/24 dev swanif + sudo nsenter -t $pid -n ip link set dev swanif up + + sudo nsenter -t $pid -n ip addr add 192.168.5.2/32 dev lo + sudo nsenter -t $pid -n ip link set dev lo up + +Start VPP … + +:: + + sudo /usr/bin/vpp unix { \ + cli-listen /tmp/vpp.sock \ + gid $(id -g) } \ + api-segment { prefix vpp } \ + plugins { plugin dpdk_plugin.so { disable } } + +… and configure it: + +:: + + create host-interface name gw + set interface ip addr host-gw 192.168.10.2/24 + set interface state host-gw up + + create host-interface name priv + set interface ip addr host-priv 192.168.3.1/24 + set interface state host-priv up + + ikev2 profile add pr1 + ikev2 profile set pr1 auth shared-key-mic string Vpp123 + ikev2 profile set pr1 id local fqdn vpp.home + ikev2 profile set pr1 id remote fqdn roadwarrior.vpn.example.com + + ikev2 profile set pr1 traffic-selector local ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 + ikev2 profile set pr1 traffic-selector remote ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 
protocol 0 + + create ipip tunnel src 192.168.10.2 dst 192.168.10.1 + ikev2 profile set pr1 tunnel ipip0 + ip route add 192.168.5.0/24 via 192.168.10.1 ipip0 + set interface unnumbered ipip0 use host-gw + +Initiate the IKEv2 connection: + +:: + + $ sudo docker exec sswan ipsec up initiator + + ... + CHILD_SA initiator{1} established with SPIs c320c95f_i 213932c2_o and TS 192.168.5.0/24 === 192.168.3.0/24 + connection 'initiator' established successfully + +:: + + vpp# show ikev2 sa details + + iip 192.168.10.1 ispi 7849021d9f655f1b rip 192.168.10.2 rspi 5a9ca7469a035205 + encr:aes-gcm-16 prf:hmac-sha2-256 dh-group:modp-2048 + nonce i:692ce8fd8f1c1934f63bfa2b167c4de2cff25640dffe938cdfe01a5d7f6820e6 + r:3ed84a14ea8526063e5aa762312be225d33e866d7152b9ce23e50f0ededca9e3 + SK_d 9a9b896ed6c35c78134fcd6e966c04868b6ecacf6d5088b4b2aee8b05d30fdda + SK_e i:00000000: 1b1619788d8c812ca5916c07e635bda860f15293099f3bf43e8d88e52074b006 + 00000020: 72c8e3e3 + r:00000000: 89165ceb2cef6a6b3319f437386292d9ef2e96d8bdb21eeb0cb0d3b92733de03 + 00000020: bbc29c50 + SK_p i:fe35fca30985ee75e7c8bc0d7bc04db7a0e1655e997c0f5974c31458826b6fef + r:0dd318662a96a25fcdf4998d8c6e4180c67c03586cf91dab26ed43aeda250272 + identifier (i) id-type fqdn data roadwarrior.vpn.example.com + identifier (r) id-type fqdn data vpp.home + child sa 0:encr:aes-gcm-16 esn:yes + spi(i) c320c95f spi(r) 213932c2 + SK_e i:2a6c9eae9dbed202c0ae6ccc001621aba5bb0b01623d4de4d14fd27bd5185435 + r:15e2913d39f809040ca40a02efd27da298b6de05f67bd8f10210da5e6ae606fb + traffic selectors (i):0 type 7 protocol_id 0 addr 192.168.5.0 - 192.168.5.255 port 0 - 65535 + traffic selectors (r):0 type 7 protocol_id 0 addr 192.168.3.0 - 192.168.3.255 port 0 - 65535 + +Now we can generate some traffic between responder’s and initiator’s +private networks and see it works. + +:: + + $ sudo ip netns exec ns ping 192.168.5.2 + PING 192.168.5.2 (192.168.5.2) 56(84) bytes of data. 
+ 64 bytes from 192.168.5.2: icmp_seq=1 ttl=63 time=1.02 ms + 64 bytes from 192.168.5.2: icmp_seq=2 ttl=63 time=0.599 ms diff --git a/docs/usecases/index.rst b/docs/usecases/index.rst deleted file mode 100644 index d33e9c5fcc7..00000000000 --- a/docs/usecases/index.rst +++ /dev/null @@ -1,24 +0,0 @@ -.. _usecases: - - -Use Cases -========== - -This chapter contains a sample of the many ways FD.io VPP can be used. It is by no means an -extensive list, but should give a sampling of the many features contained in FD.io VPP. - -.. toctree:: - - containers - simpleperf/index.rst - vhost/index.rst - vmxnet3 - acls - vppcloud - hgw - contiv/index.rst - networksim - webapp - container_test - trafficgen - ikev2 diff --git a/docs/usecases/networksim.md b/docs/usecases/networksim.md deleted file mode 100644 index 817ddf82a29..00000000000 --- a/docs/usecases/networksim.md +++ /dev/null @@ -1,90 +0,0 @@ -Network Simulator Plugin -======================== - -Vpp includes a fairly capable network simulator plugin, which can -simulate real-world round-trip times and a configurable network packet -loss rate. It's perfect for evaluating the performance of a TCP stack -under specified delay/bandwidth/loss conditions. - -The "nsim" plugin cross-connects two physical interfaces at layer 2, -introducing the specified delay and network loss -parameters. Reconfiguration on the fly is OK, with the proviso that -packets held in the network simulator scheduling wheel will be lost. - -Configuration -------------- - -Configuration by debug CLI is simple. First, specify the simulator -configuration: unidirectional delay (half of the desired RTT), the -link bandwidth, and the expected average packet size. These parameters -allow the network simulator allocate the right amount of buffering to -produce the requested delay/bandwidth product. 
- -``` - set nsim delay 25.0 ms bandwidth 10 gbit packet-size 128 -``` - -To simulate network packet drops, add either "packets-per-drop <nnnnn>" or -"drop-fraction [0.0 ... 1.0]" parameters: - -``` - set nsim delay 25.0 ms bandwidth 10 gbit packet-size 128 packets-per-drop 10000 -``` -Remember to configure the layer-2 cross-connect: - -``` - nsim enable-disable <interface-1> <interface-2> -``` - -Packet Generator Configuration ------------------------------- - -Here's a unit-test configuration for the vpp packet generator: - -``` - loop cre - set int ip address loop0 11.22.33.1/24 - set int state loop0 up - - loop cre - set int ip address loop1 11.22.34.1/24 - set int state loop1 up - - set nsim delay 1.0 ms bandwidth 10 gbit packet-size 128 packets-per-drop 1000 - nsim enable-disable loop0 loop1 - - packet-generator new { - name s0 - limit 10000 - size 128-128 - interface loop0 - node ethernet-input - data { IP4: 1.2.3 -> 4.5.6 - UDP: 11.22.33.44 -> 11.22.34.44 - UDP: 1234 -> 2345 - incrementing 114 - } - } -``` - -For extra realism, the network simulator drops any specific packet -with the specified probability. In this example, we see that slight -variation from run to run occurs as it should. - -``` - DBGvpp# pa en - DBGvpp# sh err - Count Node Reason - 9991 nsim Packets buffered - 9 nsim Network loss simulation drop packets - 9991 ethernet-input l3 mac mismatch - - DBGvpp# clear err - DBGvpp# pa en - DBGvpp# sh err - sh err - Count Node Reason - 9993 nsim Packets buffered - 7 nsim Network loss simulation drop packets - 9993 ethernet-input l3 mac mismatch -``` diff --git a/docs/usecases/networksim.rst b/docs/usecases/networksim.rst new file mode 100644 index 00000000000..9324c992f00 --- /dev/null +++ b/docs/usecases/networksim.rst @@ -0,0 +1,91 @@ +Generating traffic with VPP +=========================== + +Vpp includes a fairly capable network simulator plugin, which can +simulate real-world round-trip times and a configurable network packet +loss rate. 
It’s perfect for evaluating the performance of a TCP stack +under specified delay/bandwidth/loss conditions. + +The “nsim” plugin cross-connects two physical interfaces at layer 2, +introducing the specified delay and network loss parameters. +Reconfiguration on the fly is OK, with the proviso that packets held in +the network simulator scheduling wheel will be lost. + +Configuration +------------- + +Configuration by debug CLI is simple. First, specify the simulator +configuration: unidirectional delay (half of the desired RTT), the link +bandwidth, and the expected average packet size. These parameters allow +the network simulator to allocate the right amount of buffering to produce +the requested delay/bandwidth product. + +:: + + set nsim delay 25.0 ms bandwidth 10 gbit packet-size 128 + +To simulate network packet drops, add either “packets-per-drop <nnnnn>” or +“drop-fraction [0.0 … 1.0]” parameters: + +:: + + set nsim delay 25.0 ms bandwidth 10 gbit packet-size 128 packets-per-drop 10000 + +Remember to configure the layer-2 cross-connect: + +:: + + nsim enable-disable <interface-1> <interface-2> + +Packet Generator Configuration +------------------------------ + +Here’s a unit-test configuration for the vpp packet generator: + +:: + + loop cre + set int ip address loop0 11.22.33.1/24 + set int state loop0 up + + loop cre + set int ip address loop1 11.22.34.1/24 + set int state loop1 up + + set nsim delay 1.0 ms bandwidth 10 gbit packet-size 128 packets-per-drop 1000 + nsim enable-disable loop0 loop1 + + packet-generator new { + name s0 + limit 10000 + size 128-128 + interface loop0 + node ethernet-input + data { IP4: 1.2.3 -> 4.5.6 + UDP: 11.22.33.44 -> 11.22.34.44 + UDP: 1234 -> 2345 + incrementing 114 + } + } + +For extra realism, the network simulator drops any specific packet with +the specified probability. In this example, we see that slight variation +from run to run occurs as it should. 
+ +:: + + DBGvpp# pa en + DBGvpp# sh err + Count Node Reason + 9991 nsim Packets buffered + 9 nsim Network loss simulation drop packets + 9991 ethernet-input l3 mac mismatch + + DBGvpp# clear err + DBGvpp# pa en + DBGvpp# sh err + sh err + Count Node Reason + 9993 nsim Packets buffered + 7 nsim Network loss simulation drop packets + 9993 ethernet-input l3 mac mismatch diff --git a/docs/usecases/simpleperf/iperf3.rst b/docs/usecases/simpleperf/iperf3.rst index 6f5d345c598..d485a5e8a77 100644 --- a/docs/usecases/simpleperf/iperf3.rst +++ b/docs/usecases/simpleperf/iperf3.rst @@ -60,7 +60,7 @@ Configure the system *csp2s22c03* to have 10.10.1.1 and 10.10.2.1 on the two 40- csp2s22c03$ sudo ip link set dev ens802f0 up csp2s22c03$ sudo ip addr add 10.10.2.1/24 dev ens802f1 csp2s22c03$ sudo ip link set dev ens802f1 up - + List the route table: .. code-block:: console @@ -123,7 +123,7 @@ route for IP packet 10.10.2.0/24: TX packets:1179 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:262230 (262.2 KB) TX bytes:139975 (139.9 KB) - + ens802 Link encap:Ethernet HWaddr 68:05:ca:2e:76:e0 inet addr:10.10.1.2 Bcast:0.0.0.0 Mask:255.255.255.0 inet6 addr: fe80::6a05:caff:fe2e:76e0/64 Scope:Link @@ -132,7 +132,7 @@ route for IP packet 10.10.2.0/24: TX packets:40 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:0 (0.0 B) TX bytes:5480 (5.4 KB) - + lo Link encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host @@ -233,5 +233,5 @@ we start the **iperf3** client to connect to the server: [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-10.00 sec 9.45 GBytes 8.12 Gbits/sec 16474 sender [ 4] 0.00-10.00 sec 9.44 GBytes 8.11 Gbits/sec receiver - + iperf Done. 
diff --git a/docs/usecases/simpleperf/iperf31.rst b/docs/usecases/simpleperf/iperf31.rst index 50abfdf0396..dbd0e72c61b 100644 --- a/docs/usecases/simpleperf/iperf31.rst +++ b/docs/usecases/simpleperf/iperf31.rst @@ -22,38 +22,38 @@ at 82:00.0 and 82:00.1. Use the device’s slots to bind them to the driver uio_ .. code-block:: console csp2s22c03$ ./install-vpp-native/dpdk/sbin/dpdk-devbind -s - + Network devices using DPDK-compatible driver ============================================ <none> - + Network devices using kernel driver =================================== 0000:03:00.0 'Ethernet Controller 10-Gigabit X540-AT2' if=enp3s0f0 drv=ixgbe unused=vfio-pci,uio_pci_generic *Active* 0000:03:00.1 'Ethernet Controller 10-Gigabit X540-AT2' if=enp3s0f1 drv=ixgbe unused=vfio-pci,uio_pci_generic *Active* - 0000:82:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=ens802f0d1,ens802f0 drv=i40e unused=uio_pci_generic - 0000:82:00.1 'Ethernet Controller XL710 for 40GbE QSFP+' if=ens802f1d1,ens802f1 drv=i40e unused=uio_pci_generic - + 0000:82:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' if=ens802f0d1,ens802f0 drv=i40e unused=uio_pci_generic + 0000:82:00.1 'Ethernet Controller XL710 for 40GbE QSFP+' if=ens802f1d1,ens802f1 drv=i40e unused=uio_pci_generic + Other network devices ===================== <none> - + csp2s22c03$ sudo modprobe uio_pci_generic csp2s22c03$ sudo ./install-vpp-native/dpdk/sbin/dpdk-devbind --bind uio_pci_generic 82:00.0 csp2s22c03$ sudo ./install-vpp-native/dpdk/sbin/dpdk-devbind --bind uio_pci_generic 82:00.1 csp2s22c03$ sudo ./install-vpp-native/dpdk/sbin/dpdk-devbind -s - + Network devices using DPDK-compatible driver ============================================ 0000:82:00.0 'Ethernet Controller XL710 for 40GbE QSFP+' drv=uio_pci_generic unused=i40e,vfio-pci 0000:82:00.1 'Ethernet Controller XL710 for 40GbE QSFP+' drv=uio_pci_generic unused=i40e,vfio-pci - + Network devices using kernel driver =================================== 0000:03:00.0 
'Ethernet Controller 10-Gigabit X540-AT2' if=enp3s0f0 drv=ixgbe unused=vfio-pci,uio_pci_generic *Active* 0000:03:00.1 'Ethernet Controller 10-Gigabit X540-AT2' if=enp3s0f1 drv=ixgbe unused=vfio-pci,uio_pci_generic *Active* - + Start the VPP service, and verify that VPP is running: .. code-block:: console @@ -63,7 +63,7 @@ Start the VPP service, and verify that VPP is running: root 105655 1 98 17:34 ? 00:00:02 /usr/bin/vpp -c /etc/vpp/startup.conf :w 105675 105512 0 17:34 pts/4 00:00:00 grep --color=auto vpp - + To access the VPP CLI, issue the command sudo vppctl . From the VPP interface, list all interfaces that are bound to DPDK using the command show interface: @@ -109,7 +109,7 @@ between *net2s22c05* and *csp2s22c04* increases to 20.3 Gbits per second. [ ID] Interval Transfer Bandwidth Retr [ 4] 0.00-10.00 sec 23.7 GBytes 20.3 Gbits/sec 13434 sender [ 4] 0.00-10.00 sec 23.7 GBytes 20.3 Gbits/sec receiver - + iperf Done. The **show run** command displays the graph runtime statistics. Observe that the diff --git a/docs/usecases/simpleperf/trex.rst b/docs/usecases/simpleperf/trex.rst index 996ed156d10..6d38ce52e57 100644 --- a/docs/usecases/simpleperf/trex.rst +++ b/docs/usecases/simpleperf/trex.rst @@ -65,7 +65,7 @@ information on the configuration file, please refer to the `TRex Manual <http:// default_gw: 10.10.2.1 - ip: 10.10.1.2 default_gw: 10.10.1.1 - + Stop the previous VPP session and start it again in order to add a route for new IP addresses 16.0.0.0/8 and 48.0.0.0/8, according to Figure 2. Those IP addresses are needed because TRex generates packets that use these addresses. Refer to the @@ -81,13 +81,13 @@ these traffic templates. 
__/ __/ _ \ (_)__ | | / / _ \/ _ \ _/ _// // / / / _ \ | |/ / ___/ ___/ /_/ /____(_)_/\___/ |___/_/ /_/ - + vpp# sho int Name Idx State Counter Count FortyGigabitEthernet82/0/0 1 down FortyGigabitEthernet82/0/1 2 down local0 0 down - + vpp# vpp# set interface ip address FortyGigabitEthernet82/0/0 10.10.1.1/24 vpp# set interface ip address FortyGigabitEthernet82/0/1 10.10.2.1/24 @@ -109,7 +109,7 @@ configuration file "cap2/dns.yaml". Total-tx-bytes : 166886 bytes Total-tx-sw-bytes : 166716 bytes Total-rx-bytes : 166886 byte - + Total-tx-pkt : 2528 pkts Total-rx-pkt : 2528 pkts Total-sw-tx-pkt : 2526 pkts diff --git a/docs/usecases/simpleperf/trex1.rst b/docs/usecases/simpleperf/trex1.rst index 0daa57f7035..1704b3f13b0 100644 --- a/docs/usecases/simpleperf/trex1.rst +++ b/docs/usecases/simpleperf/trex1.rst @@ -15,7 +15,7 @@ generated using the traffic configuration file "avl/sfr_delay_10_1g.yaml": Total-tx-bytes : 251062132504 bytes Total-tx-sw-bytes : 21426636 bytes Total-rx-bytes : 251040139922 byte - + Total-tx-pkt : 430598064 pkts Total-rx-pkt : 430554755 pkts Total-sw-tx-pkt : 324646 pkts diff --git a/docs/usecases/simpleperf/trex2.rst b/docs/usecases/simpleperf/trex2.rst index 590bfd05629..e1ff98f1dc8 100644 --- a/docs/usecases/simpleperf/trex2.rst +++ b/docs/usecases/simpleperf/trex2.rst @@ -18,36 +18,36 @@ In one of terminals start TRex in stateless mode. Use *Ctrl-C* to stop. 
# cd v2.46/ # ./trex -i - -Per port stats table - ports | 0 | 1 | 2 | 3 + -Per port stats table + ports | 0 | 1 | 2 | 3 ----------------------------------------------------------------------------------------- - opackets | 0 | 0 | 0 | 0 - obytes | 0 | 0 | 0 | 0 - ipackets | 6 | 6 | 5 | 5 - ibytes | 384 | 384 | 320 | 320 - ierrors | 0 | 0 | 0 | 0 - oerrors | 0 | 0 | 0 | 0 - Tx Bw | 0.00 bps | 0.00 bps | 0.00 bps | 0.00 bps - - -Global stats enabled + opackets | 0 | 0 | 0 | 0 + obytes | 0 | 0 | 0 | 0 + ipackets | 6 | 6 | 5 | 5 + ibytes | 384 | 384 | 320 | 320 + ierrors | 0 | 0 | 0 | 0 + oerrors | 0 | 0 | 0 | 0 + Tx Bw | 0.00 bps | 0.00 bps | 0.00 bps | 0.00 bps + + -Global stats enabled Cpu Utilization : 0.0 % - Platform_factor : 1.0 - Total-Tx : 0.00 bps - Total-Rx : 238.30 bps - Total-PPS : 0.00 pps - Total-CPS : 0.00 cps - - Expected-PPS : 0.00 pps - Expected-CPS : 0.00 cps - Expected-BPS : 0.00 bps - - Active-flows : 0 Clients : 0 Socket-util : 0.0000 % - Open-flows : 0 Servers : 0 Socket : 0 Socket/Clients : -nan - drop-rate : 0.00 bps - current time : 21.4 sec - test duration : 0.0 sec + Platform_factor : 1.0 + Total-Tx : 0.00 bps + Total-Rx : 238.30 bps + Total-PPS : 0.00 pps + Total-CPS : 0.00 cps + + Expected-PPS : 0.00 pps + Expected-CPS : 0.00 cps + Expected-BPS : 0.00 bps + + Active-flows : 0 Clients : 0 Socket-util : 0.0000 % + Open-flows : 0 Servers : 0 Socket : 0 Socket/Clients : -nan + drop-rate : 0.00 bps + current time : 21.4 sec + test duration : 0.0 sec *** TRex is shutting down - cause: 'CTRL + C detected' - All cores stopped !! + All cores stopped !! In the other terminal start the TRex console. With this console we will execute the TRex commands. @@ -55,28 +55,28 @@ In the other terminal start the TRex console. 
With this console we will execute # cd v2.46/ # ./trex -console - + Using 'python' as Python interpreter - - + + Connecting to RPC server on localhost:4501 [SUCCESS] - - + + Connecting to publisher server on localhost:4500 [SUCCESS] - - + + Acquiring ports [0, 1, 2, 3]: [SUCCESS] - - + + Server Info: - + Server version: v2.46 @ STL Server mode: Stateless Server CPU: 2 x Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz - Ports count: 4 x 10Gbps @ VMXNET3 Ethernet Controller - + Ports count: 4 x 10Gbps @ VMXNET3 Ethernet Controller + -=TRex Console v3.0=- - + Type 'help' or '?' for supported actions trex> @@ -85,26 +85,26 @@ Start some traffic using the **stl/imix.py** traffic profile. .. code-block:: console trex>start -f ./stl/imix.py -p 0 1 2 3 -m 9475mbps - + Removing all streams from port(s) [0, 1, 2, 3]: [SUCCESS] - - + + Attaching 3 streams to port(s) [0]: [SUCCESS] - - + + Attaching 3 streams to port(s) [1]: [SUCCESS] - - + + Attaching 3 streams to port(s) [2]: [SUCCESS] - - + + Attaching 3 streams to port(s) [3]: [SUCCESS] - - + + Starting traffic on port(s) [0, 1, 2, 3]: [SUCCESS] - + 80.94 [ms] - + trex> The **-f ./stl/imix.py** argument specifies the file that is used to create the @@ -116,77 +116,77 @@ In the other terminal the display shows the statistics related the traffic flows .. 
code-block:: console - -Per port stats table + -Per port stats table ports | 0 | 1 | 2 | 3 ----------------------------------------------------------------------------------------- - opackets | 789907304 | 789894738 | 790017701 | 790017132 - obytes | 285397726750 | 285392406754 | 285406864578 | 285405883070 - ipackets | 1563501970 | 45 | 1563504693 | 44 - ibytes | 564870783050 | 2880 | 564873491682 | 2816 - ierrors | 15728759 | 0 | 15732451 | 0 - oerrors | 0 | 0 | 0 | 0 - Tx Bw | 606.55 Mbps | 606.19 Mbps | 606.25 Mbps | 606.51 Mbps - - -Global stats enabled - Cpu Utilization : 100.0 % 2.4 Gb/core - Platform_factor : 1.0 - Total-Tx : 2.43 Gbps - Total-Rx : 2.40 Gbps - Total-PPS : 841.44 Kpps - Total-CPS : 0.00 cps - - Expected-PPS : 0.00 pps - Expected-CPS : 0.00 cps - Expected-BPS : 0.00 bps - - Active-flows : 0 Clients : 0 Socket-util : 0.0000 % - Open-flows : 0 Servers : 0 Socket : 0 Socket/Clients : -nan - Total_queue_full : 6529970196 - drop-rate : 0.00 bps - current time : 4016.8 sec - test duration : 0.0 sec - + opackets | 789907304 | 789894738 | 790017701 | 790017132 + obytes | 285397726750 | 285392406754 | 285406864578 | 285405883070 + ipackets | 1563501970 | 45 | 1563504693 | 44 + ibytes | 564870783050 | 2880 | 564873491682 | 2816 + ierrors | 15728759 | 0 | 15732451 | 0 + oerrors | 0 | 0 | 0 | 0 + Tx Bw | 606.55 Mbps | 606.19 Mbps | 606.25 Mbps | 606.51 Mbps + + -Global stats enabled + Cpu Utilization : 100.0 % 2.4 Gb/core + Platform_factor : 1.0 + Total-Tx : 2.43 Gbps + Total-Rx : 2.40 Gbps + Total-PPS : 841.44 Kpps + Total-CPS : 0.00 cps + + Expected-PPS : 0.00 pps + Expected-CPS : 0.00 cps + Expected-BPS : 0.00 bps + + Active-flows : 0 Clients : 0 Socket-util : 0.0000 % + Open-flows : 0 Servers : 0 Socket : 0 Socket/Clients : -nan + Total_queue_full : 6529970196 + drop-rate : 0.00 bps + current time : 4016.8 sec + test duration : 0.0 sec + More statistics can be displayed on the TRex console using the **tui** command. .. 
code-block:: console trex>tui - + Global Statistics - - connection : localhost, Port 4501 total_tx_L2 : 2.45 Gb/sec - version : STL @ v2.46 total_tx_L1 : 2.59 Gb/sec - cpu_util. : 99.89% @ 2 cores (1 per port) total_rx : 2.42 Gb/sec - rx_cpu_util. : 4.03% / 837.39 Kpkt/sec total_pps : 846.96 Kpkt/sec - async_util. : 0.05% / 1.76 KB/sec drop_rate : 0 b/sec - queue_full : 42,750,771 pkts - + + connection : localhost, Port 4501 total_tx_L2 : 2.45 Gb/sec + version : STL @ v2.46 total_tx_L1 : 2.59 Gb/sec + cpu_util. : 99.89% @ 2 cores (1 per port) total_rx : 2.42 Gb/sec + rx_cpu_util. : 4.03% / 837.39 Kpkt/sec total_pps : 846.96 Kpkt/sec + async_util. : 0.05% / 1.76 KB/sec drop_rate : 0 b/sec + queue_full : 42,750,771 pkts + Port Statistics - - port | 0 | 1 | 2 | 3 | total + + port | 0 | 1 | 2 | 3 | total -----------+-------------------+-------------------+-------------------+-------------------+------------------ - owner | root | root | root | root | - link | UP | UP | UP | UP | - state | TRANSMITTING | TRANSMITTING | TRANSMITTING | TRANSMITTING | - speed | 10 Gb/s | 10 Gb/s | 10 Gb/s | 10 Gb/s | - CPU util. | 99.89% | 99.89% | 99.89% | 99.89% | - -- | | | | | - Tx bps L2 | 612.76 Mbps | 613.07 Mbps | 612.52 Mbps | 612.77 Mbps | 2.45 Gbps - Tx bps L1 | 646.64 Mbps | 646.96 Mbps | 646.4 Mbps | 646.64 Mbps | 2.59 Gbps - Tx pps | 211.72 Kpps | 211.8 Kpps | 211.73 Kpps | 211.71 Kpps | 846.96 Kpps - Line Util. 
| 6.47 % | 6.47 % | 6.46 % | 6.47 % | - --- | | | | | - Rx bps | 1.21 Gbps | \u25bc\u25bc\u25bc 23.03 bps | 1.21 Gbps | 5.94 bps | 2.42 G bps - Rx pps | 418.59 Kpps | 0.04 pps | 418.77 Kpps | 0.01 pps | 837.36 Kpps - ---- | | | | | - opackets | 5227126 | 5227271 | 5432528 | 5432354 | 21319279 - ipackets | 10526000 | 5 | 10527054 | 4 | 21053063 - obytes | 1890829910 | 1891039152 | 1965259162 | 1965124338 | 7712252562 - ibytes | 3807894454 | 320 | 3808149896 | 256 | 7616044926 - tx-pkts | 5.23 Mpkts | 5.23 Mpkts | 5.43 Mpkts | 5.43 Mpkts | 21.32 Mpkts - rx-pkts | 10.53 Mpkts | 5 pkts | 10.53 Mpkts | 4 pkts | 21.05 Mpkts - tx-bytes | 1.89 GB | 1.89 GB | 1.97 GB | 1.97 GB | 7.71 GB - rx-bytes | 3.81 GB | 320 B | 3.81 GB | 256 B | 7.62 GB - ----- | | | | | - oerrors | 0 | 0 | 0 | 0 | 0 - ierrors | 133,370 | 0 | 132,529 | 0 | 265,899 + owner | root | root | root | root | + link | UP | UP | UP | UP | + state | TRANSMITTING | TRANSMITTING | TRANSMITTING | TRANSMITTING | + speed | 10 Gb/s | 10 Gb/s | 10 Gb/s | 10 Gb/s | + CPU util. | 99.89% | 99.89% | 99.89% | 99.89% | + -- | | | | | + Tx bps L2 | 612.76 Mbps | 613.07 Mbps | 612.52 Mbps | 612.77 Mbps | 2.45 Gbps + Tx bps L1 | 646.64 Mbps | 646.96 Mbps | 646.4 Mbps | 646.64 Mbps | 2.59 Gbps + Tx pps | 211.72 Kpps | 211.8 Kpps | 211.73 Kpps | 211.71 Kpps | 846.96 Kpps + Line Util. 
| 6.47 % | 6.47 % | 6.46 % | 6.47 % | + --- | | | | | + Rx bps | 1.21 Gbps | \u25bc\u25bc\u25bc 23.03 bps | 1.21 Gbps | 5.94 bps | 2.42 G bps + Rx pps | 418.59 Kpps | 0.04 pps | 418.77 Kpps | 0.01 pps | 837.36 Kpps + ---- | | | | | + opackets | 5227126 | 5227271 | 5432528 | 5432354 | 21319279 + ipackets | 10526000 | 5 | 10527054 | 4 | 21053063 + obytes | 1890829910 | 1891039152 | 1965259162 | 1965124338 | 7712252562 + ibytes | 3807894454 | 320 | 3808149896 | 256 | 7616044926 + tx-pkts | 5.23 Mpkts | 5.23 Mpkts | 5.43 Mpkts | 5.43 Mpkts | 21.32 Mpkts + rx-pkts | 10.53 Mpkts | 5 pkts | 10.53 Mpkts | 4 pkts | 21.05 Mpkts + tx-bytes | 1.89 GB | 1.89 GB | 1.97 GB | 1.97 GB | 7.71 GB + rx-bytes | 3.81 GB | 320 B | 3.81 GB | 256 B | 7.62 GB + ----- | | | | | + oerrors | 0 | 0 | 0 | 0 | 0 + ierrors | 133,370 | 0 | 132,529 | 0 | 265,899 diff --git a/docs/usecases/trafficgen.md b/docs/usecases/trafficgen.md deleted file mode 100644 index fe3d4c98904..00000000000 --- a/docs/usecases/trafficgen.md +++ /dev/null @@ -1,105 +0,0 @@ -Vpp Stateless Traffic Generation -================================ - -It's simple to configure vpp as a high-performance stateless traffic -generator. A couple of vpp worker threads running on an older system -can easily generate 20 MPPS' worth of traffic. 
- -In the configurations shown below, we connect a vpp traffic generator -and a vpp UUT using two 40 gigabit ethernet ports on each system: - -``` - +-------------------+ +-------------------+ - | traffic generator | | UUT | - | port 0 | <=======> | port 0 | - | 192.168.40.2/24 | | 192.168.40.1/24 | - +-------------------+ +-------------------+ - - +-------------------+ +-------------------+ - | traffic generator | | UUT | - | port 1 | <=======> | port 1 | - | 192.168.41.2/24 | | 192.168.41.1/24 | - +-------------------+ +-------------------+ -``` - -Traffic Generator Setup Script ------------------------------- - -``` - set int ip address FortyGigabitEthernet2/0/0 192.168.40.2/24 - set int ip address FortyGigabitEthernet2/0/1 192.168.41.2/24 - set int state FortyGigabitEthernet2/0/0 up - set int state FortyGigabitEthernet2/0/1 up - - comment { send traffic to the VPP UUT } - - packet-generator new { - name worker0 - worker 0 - limit 0 - rate 1.2e7 - size 128-128 - tx-interface FortyGigabitEthernet2/0/0 - node FortyGigabitEthernet2/0/0-output - data { IP4: 1.2.40 -> 3cfd.fed0.b6c8 - UDP: 192.168.40.10 -> 192.168.50.10 - UDP: 1234 -> 2345 - incrementing 114 - } - } - - packet-generator new { - name worker1 - worker 1 - limit 0 - rate 1.2e7 - size 128-128 - tx-interface FortyGigabitEthernet2/0/1 - node FortyGigabitEthernet2/0/1-output - data { IP4: 1.2.4 -> 3cfd.fed0.b6c9 - UDP: 192.168.41.10 -> 192.168.51.10 - UDP: 1234 -> 2345 - incrementing 114 - } - } - - comment { delete return traffic on sight } - - ip route add 192.168.50.0/24 via drop - ip route add 192.168.51.0/24 via drop -``` - -Note 1: the destination MAC addresses shown in the configuration (e.g. -3cfd.fed0.b6c8 and 3cfd.fed0.b6c9) **must** match the vpp UUT port MAC -addresses. - -Note 2: this script assumes that /etc/vpp/startup.conf and/or the -command-line in use specifies (at least) two worker threads. 
Uncomment -"workers 2" in the cpu configuration section of /etc/vpp/startup.conf: - -``` - ## Specify a number of workers to be created - ## Workers are pinned to N consecutive CPU cores while skipping "skip-cores" CPU core(s) - ## and main thread's CPU core - workers 2 -``` - -Any plausible packet generator script - including one which replays -pcap captures - can be used. - - -UUT Setup Script ----------------- - -The vpp UUT uses a couple of static routes to forward traffic back to -the traffic generator: - -``` - set int ip address FortyGigabitEthernet2/0/0 192.168.40.1/24 - set int ip address FortyGigabitEthernet2/0/1 192.168.41.1/24 - set int state FortyGigabitEthernet2/0/0 up - set int state FortyGigabitEthernet2/0/1 up - - ip route add 192.168.50.10/32 via 192.168.41.2 - ip route add 192.168.51.10/32 via 192.168.40.2 -``` diff --git a/docs/usecases/trafficgen.rst b/docs/usecases/trafficgen.rst new file mode 100644 index 00000000000..82dba96c171 --- /dev/null +++ b/docs/usecases/trafficgen.rst @@ -0,0 +1,104 @@ +Stateless Traffic Gen with VPP +============================== + +It’s simple to configure vpp as a high-performance stateless traffic +generator. A couple of vpp worker threads running on an older system can +easily generate 20 MPPS’ worth of traffic. 
+ +In the configurations shown below, we connect a vpp traffic generator +and a vpp UUT using two 40 gigabit ethernet ports on each system: + +:: + + +-------------------+ +-------------------+ + | traffic generator | | UUT | + | port 0 | <=======> | port 0 | + | 192.168.40.2/24 | | 192.168.40.1/24 | + +-------------------+ +-------------------+ + + +-------------------+ +-------------------+ + | traffic generator | | UUT | + | port 1 | <=======> | port 1 | + | 192.168.41.2/24 | | 192.168.41.1/24 | + +-------------------+ +-------------------+ + +Traffic Generator Setup Script +------------------------------ + +:: + + set int ip address FortyGigabitEthernet2/0/0 192.168.40.2/24 + set int ip address FortyGigabitEthernet2/0/1 192.168.41.2/24 + set int state FortyGigabitEthernet2/0/0 up + set int state FortyGigabitEthernet2/0/1 up + + comment { send traffic to the VPP UUT } + + packet-generator new { + name worker0 + worker 0 + limit 0 + rate 1.2e7 + size 128-128 + tx-interface FortyGigabitEthernet2/0/0 + node FortyGigabitEthernet2/0/0-output + data { IP4: 1.2.40 -> 3cfd.fed0.b6c8 + UDP: 192.168.40.10 -> 192.168.50.10 + UDP: 1234 -> 2345 + incrementing 114 + } + } + + packet-generator new { + name worker1 + worker 1 + limit 0 + rate 1.2e7 + size 128-128 + tx-interface FortyGigabitEthernet2/0/1 + node FortyGigabitEthernet2/0/1-output + data { IP4: 1.2.4 -> 3cfd.fed0.b6c9 + UDP: 192.168.41.10 -> 192.168.51.10 + UDP: 1234 -> 2345 + incrementing 114 + } + } + + comment { delete return traffic on sight } + + ip route add 192.168.50.0/24 via drop + ip route add 192.168.51.0/24 via drop + +Note 1: the destination MAC addresses shown in the configuration (e.g. +3cfd.fed0.b6c8 and 3cfd.fed0.b6c9) **must** match the vpp UUT port MAC +addresses. + +Note 2: this script assumes that /etc/vpp/startup.conf and/or the +command-line in use specifies (at least) two worker threads. 
Uncomment +“workers 2” in the cpu configuration section of /etc/vpp/startup.conf: + +:: + + ## Specify a number of workers to be created + ## Workers are pinned to N consecutive CPU cores while skipping "skip-cores" CPU core(s) + ## and main thread's CPU core + workers 2 + +Any plausible packet generator script - including one which replays pcap +captures - can be used. + +UUT Setup Script +---------------- + +The vpp UUT uses a couple of static routes to forward traffic back to +the traffic generator: + +:: + + set int ip address FortyGigabitEthernet2/0/0 192.168.40.1/24 + set int ip address FortyGigabitEthernet2/0/1 192.168.41.1/24 + set int state FortyGigabitEthernet2/0/0 up + set int state FortyGigabitEthernet2/0/1 up + + ip route add 192.168.50.10/32 via 192.168.41.2 + ip route add 192.168.51.10/32 via 192.168.40.2 diff --git a/docs/usecases/vhost/index.rst b/docs/usecases/vhost/index.rst index 002ebc17639..dd189b467e0 100644 --- a/docs/usecases/vhost/index.rst +++ b/docs/usecases/vhost/index.rst @@ -1,7 +1,7 @@ .. _vhost: -FD.io VPP with Virtual Machines -=============================== +VPP with Virtual Machines +========================= This chapter will describe how to use FD.io VPP with virtual machines. We describe how to create Vhost port with VPP and how to connect them to VPP. We will also discuss some limitations of Vhost. diff --git a/docs/usecases/vhost/vhost.rst b/docs/usecases/vhost/vhost.rst index f62faade306..e5ff59110b7 100644 --- a/docs/usecases/vhost/vhost.rst +++ b/docs/usecases/vhost/vhost.rst @@ -12,11 +12,11 @@ refer to `virsh man page <https://linux.die.net/man/1/virsh>`_. The image that we use is based on an Ubuntu cloud image downloaded from: `Ubuntu Cloud Images <https://cloud-images.ubuntu.com/xenial/current>`_. -All FD.io VPP commands are being run from a su shell. +All FD.io VPP commands are being run from a su shell. .. _vhosttopo: -Topology +Topology --------- In this case we will use 2 systems. 
One system we will be running standard linux, the other will @@ -45,7 +45,7 @@ this system to verify our connectivity to our VM with ping. __/ __/ _ \ (_)__ | | / / _ \/ _ \ _/ _// // / / / _ \ | |/ / ___/ ___/ /_/ /____(_)_/\___/ |___/_/ /_/ - + vpp# clear interfaces vpp# show int Name Idx State Counter Count @@ -56,7 +56,7 @@ this system to verify our connectivity to our VM with ping. For more information on the interface commands refer to: :ref:`intcommands` -The next step will be to create the virtual port using the :ref:`createvhostuser` command. +The next step will be to create the virtual port using the ``createvhostuser`` command. This command will create the virtual port in VPP and create a linux socket that the VM will use to connect to VPP. @@ -79,7 +79,7 @@ Creating the VPP port: Notice the interface **VirtualEthernet0/0/0**. In this example we created the virtual interface as a client. -We can get more detail on the vhost connection with the :ref:`showvhost` command. +We can get more detail on the vhost connection with the ``showvhost`` command. .. code-block:: console @@ -100,14 +100,14 @@ We can get more detail on the vhost connection with the :ref:`showvhost` command protocol features (0x3) VHOST_USER_PROTOCOL_F_MQ (0) VHOST_USER_PROTOCOL_F_LOG_SHMFD (1) - + socket filename /tmp/vm00.sock type client errno "No such file or directory" - + rx placement: tx placement: spin-lock thread 0 on vring 0 thread 1 on vring 0 - + Memory regions (total 0) Notice **No such file or directory** and **Memory regions (total 0)**. This is because the diff --git a/docs/usecases/vhost/vhost02.rst b/docs/usecases/vhost/vhost02.rst index d316463c15a..17bbafc854f 100644 --- a/docs/usecases/vhost/vhost02.rst +++ b/docs/usecases/vhost/vhost02.rst @@ -3,8 +3,7 @@ Creating the Virtual Machine ---------------------------- -We will now create the virtual machine. We use the "virsh create command". For the complete file we -use refer to :ref:`xmlexample`. 
+We will now create the virtual machine. We use the "virsh create command". It is important to note that in the XML file we specify the socket path that is used to connect to FD.io VPP. @@ -85,26 +84,26 @@ and in the example. protocol features (0x3) VHOST_USER_PROTOCOL_F_MQ (0) VHOST_USER_PROTOCOL_F_LOG_SHMFD (1) - + socket filename /tmp/vm00.sock type client errno "Success" - + rx placement: thread 1 on vring 1, polling tx placement: spin-lock thread 0 on vring 0 thread 1 on vring 0 - + Memory regions (total 2) region fd guest_phys_addr memory_size userspace_addr mmap_offset mmap_addr ====== ===== ================== ================== ================== ================== =============== === 0 31 0x0000000000000000 0x00000000000a0000 0x00007f1db9c00000 0x0000000000000000 0x00007f7db0400 000 1 32 0x00000000000c0000 0x000000000ff40000 0x00007f1db9cc0000 0x00000000000c0000 0x00007f7d94ec0 000 - + Virtqueue 0 (TX) qsz 1024 last_avail_idx 0 last_used_idx 0 avail.flags 0 avail.idx 256 used.flags 1 used.idx 0 kickfd 33 callfd 34 errfd -1 - + Virtqueue 1 (RX) qsz 1024 last_avail_idx 8 last_used_idx 8 avail.flags 0 avail.idx 8 used.flags 1 used.idx 8 diff --git a/docs/usecases/vhost/vhost03.rst b/docs/usecases/vhost/vhost03.rst index ed583349bc6..05d1fd1fb11 100644 --- a/docs/usecases/vhost/vhost03.rst +++ b/docs/usecases/vhost/vhost03.rst @@ -17,7 +17,7 @@ Use the "set interface l2 bridge" command. vpp# show bridge 100 det BD-ID Index BSN Age(min) Learning U-Forwrd UU-Flood Flooding ARP-Term BVI-Intf 100 1 0 off on on on on off N/A - + Interface If-idx ISN SHG BVI TxFlood VLAN-Tag-Rewrite VirtualEthernet0/0/0 3 1 0 - * none TenGigabitEthernet86/0/0 1 1 0 - * none @@ -85,4 +85,4 @@ system as **tx packets**. The reverse is true on the way in. 
tx packets 16 tx bytes 1476 local0 0 down - vpp# + vpp# diff --git a/docs/usecases/vhost/vhost04.rst b/docs/usecases/vhost/vhost04.rst index 256c0b8ffa4..02e05bc66bf 100644 --- a/docs/usecases/vhost/vhost04.rst +++ b/docs/usecases/vhost/vhost04.rst @@ -11,7 +11,7 @@ Destroy the VMs with "virsh destroy" Id Name State ---------------------------------------------------- 65 iperf-server3 running - + cto@tf-ucs-3:~$ virsh destroy iperf-server3 Domain iperf-server3 destroyed diff --git a/docs/usecases/vhost/vhost05.rst b/docs/usecases/vhost/vhost05.rst index 4eba6e17101..c7c80d55831 100644 --- a/docs/usecases/vhost/vhost05.rst +++ b/docs/usecases/vhost/vhost05.rst @@ -11,7 +11,7 @@ VPP performance with vHost is limited by the Qemu vHost driver. FD.io VPP 18.04 shows with 2 threads, 2 cores and a Queue size of 1024 the maximum NDR throughput was about 7.5 Mpps. This is about the limit at this time. -For all the details on the CSIT VM vhost connection refer to the +For all the details on the CSIT VM vhost connection refer to the `CSIT VM vHost performance tests <https://docs.fd.io/csit/rls1804/report/vpp_performance_tests/packet_throughput_graphs/vm_vhost.html>`_. diff --git a/docs/usecases/vmxnet3.rst b/docs/usecases/vmxnet3.rst index 427ebbaf115..3cf730cbdc6 100644 --- a/docs/usecases/vmxnet3.rst +++ b/docs/usecases/vmxnet3.rst @@ -124,10 +124,10 @@ Bind the driver with the following commands: Network devices using DPDK-compatible driver ============================================ <none> - + Network devices using kernel driver =================================== - 0000:03:00.0 'VMXNET3 Ethernet Controller' if=ens160 drv=vmxnet3 unused=vfio-pci,uio_pci_generic + 0000:03:00.0 'VMXNET3 Ethernet Controller' if=ens160 drv=vmxnet3 unused=vfio-pci,uio_pci_generic 0000:0b:00.0 'VMXNET3 Ethernet Controller' drv=vfio-pci unused=vmxnet3,uio_pci_generic 0000:13:00.0 'VMXNET3 Ethernet Controller' drv=vfio-pci unused=vmxnet3,uio_pci_generic ..... 
diff --git a/docs/usecases/vpp_init_sswan_resp.md b/docs/usecases/vpp_init_sswan_resp.md deleted file mode 100644 index 40da137cbd2..00000000000 --- a/docs/usecases/vpp_init_sswan_resp.md +++ /dev/null @@ -1,195 +0,0 @@ -VPP as IKEv2 initiator and strongSwan as responder -================================================== - -Prerequisites -------------- - -To make the examples easier to configure ``docker`` it is required to pull strongSwan docker image. The networking is done using Linux' veth interfaces and namespaces. - -Setup ------ - -First a topology: - -``` -192.168.3.2 192.168.5.2 - + loopback - | + -+----+----+ 192.168.10.2 +-----+----+ -| VPP | |strongSwan| -|initiator+----------------------+responder | -+---------+ +----------+ - 192.168.10.1 -``` - -Create veth interfaces and namespaces and configure them: - -``` -sudo ip link add gw type veth peer name swanif -sudo ip link set dev gw up - -sudo ip netns add ns -sudo ip link add veth_priv type veth peer name priv -sudo ip link set dev priv up -sudo ip link set dev veth_priv up netns ns - -sudo ip netns exec ns \ - bash -c " - ip link set dev lo up - ip addr add 192.168.3.2/24 dev veth_priv - ip route add 192.168.5.0/24 via 192.168.3.1" -``` - - -Create directory with strongswan configs that will be mounted to the docker container -``` -mkdir /tmp/sswan -``` - -Create the ``ipsec.conf`` file in the ``/tmp/sswan`` directory with following content: - -``` -config setup - strictcrlpolicy=no - -conn initiator - mobike=no - auto=add - type=tunnel - keyexchange=ikev2 - ike=aes256gcm16-prfsha256-modp2048! - esp=aes256gcm16-esn! 
- -# local: - leftauth=psk - leftid=@sswan.vpn.example.com - leftsubnet=192.168.5.0/24 - -# remote: (gateway) - rightid=@roadwarrior.vpp - right=192.168.10.2 - rightauth=psk - rightsubnet=192.168.3.0/24 -``` - -``/tmp/sswan/ipsec.secrets`` -``` -: PSK 'Vpp123' -``` - -``/tmp/sswan/strongswan.conf`` -``` -charon { - load_modular = yes - plugins { - include strongswan.d/charon/*.conf - } - filelog { - /tmp/charon.log { - time_format = %b %e %T - ike_name = yes - append = no - default = 2 - flush_line = yes - } - } -} -include strongswan.d/*.conf -``` - -Start docker container with strongSwan: - -``` - docker run --name sswan -d --privileged --rm --net=none \ - -v /tmp/sswan:/conf -v /tmp/sswan:/etc/ipsec.d philplckthun/strongswan -``` - -Finish configuration of initiator's private network: - -``` -pid=$(docker inspect --format "{{.State.Pid}}" sswan) -sudo ip link set netns $pid dev swanif - -sudo nsenter -t $pid -n ip addr add 192.168.10.1/24 dev swanif -sudo nsenter -t $pid -n ip link set dev swanif up - -sudo nsenter -t $pid -n ip addr add 192.168.5.2/32 dev lo -sudo nsenter -t $pid -n ip link set dev lo up -``` - -Start VPP ... - -``` -sudo /usr/bin/vpp unix { \ - cli-listen /tmp/vpp.sock \ - gid $(id -g) } \ - api-segment { prefix vpp } \ - plugins { plugin dpdk_plugin.so { disable } } -``` - -... 
and configure it: - -``` -create host-interface name gw -set interface ip addr host-gw 192.168.10.2/24 -set interface state host-gw up - -create host-interface name priv -set interface ip addr host-priv 192.168.3.1/24 -set interface state host-priv up - -ikev2 profile add pr1 -ikev2 profile set pr1 auth shared-key-mic string Vpp123 -ikev2 profile set pr1 id local fqdn roadwarrior.vpp -ikev2 profile set pr1 id remote fqdn sswan.vpn.example.com - -ikev2 profile set pr1 traffic-selector local ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 -ikev2 profile set pr1 traffic-selector remote ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 protocol 0 - -ikev2 profile set pr1 responder host-gw 192.168.10.1 -ikev2 profile set pr1 ike-crypto-alg aes-gcm-16 256 ike-dh modp-2048 -ikev2 profile set pr1 esp-crypto-alg aes-gcm-16 256 - -create ipip tunnel src 192.168.10.2 dst 192.168.10.1 -ikev2 profile set pr1 tunnel ipip0 -ip route add 192.168.5.0/24 via 192.168.10.1 ipip0 -set interface unnumbered ipip0 use host-gw -``` - -Initiate the IKEv2 connection: - -``` -vpp# ikev2 initiate sa-init pr1 -``` - -``` -vpp# show ikev2 sa details - iip 192.168.10.2 ispi f717b0cbd17e27c3 rip 192.168.10.1 rspi e9b7af7fc9b13361 - encr:aes-gcm-16 prf:hmac-sha2-256 dh-group:modp-2048 - nonce i:eb0354613b268c6372061bbdaab13deca37c8a625b1f65c073d25df2ecfe672e - r:70e1248ac09943047064f6a2135fa2a424778ba03038ab9c4c2af8aba179ed84 - SK_d 96bd4feb59be2edf1930a12a3a5d22e30195ee9f56ea203c5fb6cba5dd2bb80f - SK_e i:00000000: 5b75b9d808c8467fd00a0923c06efee2a4eb1d033c57532e05f9316ed9c56fe9 - 00000020: c4db9114 - r:00000000: 95121b63372d20b83558dc3e209b9affef042816cf071c86a53543677b40c15b - 00000020: f169ab67 - SK_p i:fb40d1114c347ddc3228ba004d4759d58f9c1ae6f1746833f908d39444ef92b1 - r:aa049828240cb242e1d5aa625cd5914dc8f8e980a74de8e06883623d19384902 - identifier (i) id-type fqdn data roadwarrior.vpp - identifier (r) id-type fqdn data sswan.vpn.example.com - child sa 
0:encr:aes-gcm-16 esn:yes - spi(i) 9dffd57a spi(r) c4e0ef53 - SK_e i:290c681694f130b33d511335dd257e78721635b7e8aa87930dd77bb1d6dd3f42 - r:0a09fa18cf1cf65c6324df02b46dcc998b84e5397cf911b63e0c096053946c2e - traffic selectors (i):0 type 7 protocol_id 0 addr 192.168.3.0 - 192.168.3.255 port 0 - 65535 - traffic selectors (r):0 type 7 protocol_id 0 addr 192.168.5.0 - 192.168.5.255 port 0 - 65535 -``` - -Now we can generate some traffic between responder's and initiator's private networks and see it works. - -``` -$ sudo ip netns exec ns ping 192.168.5.2 -PING 192.168.5.2 (192.168.5.2) 56(84) bytes of data. -64 bytes from 192.168.5.2: icmp_seq=1 ttl=63 time=0.450 ms -64 bytes from 192.168.5.2: icmp_seq=2 ttl=63 time=0.630 ms -``` diff --git a/docs/usecases/vpp_resp_sswan_init.md b/docs/usecases/vpp_resp_sswan_init.md deleted file mode 100644 index 613a4b69cf6..00000000000 --- a/docs/usecases/vpp_resp_sswan_init.md +++ /dev/null @@ -1,197 +0,0 @@ -VPP as IKEv2 responder and strongSwan as initiator -================================================== - - -Prerequisites -------------- - -To make the examples easier to configure ``docker`` it is required to pull strongSwan docker image. The networking is done using Linux' veth interfaces and namespaces. 
- -Setup ------ - -First a topology: - -``` -192.168.3.2 192.168.5.2 - + loopback - | + -+----+----+ 192.168.10.2 +-----+----+ -| VPP | |initiator | -|responder+----------------------+strongSwan| -+---------+ +----------+ - 192.168.10.1 -``` - -Create veth interfaces and namespaces and configure them: - -``` -sudo ip link add gw type veth peer name swanif -sudo ip link set dev gw up - -sudo ip netns add ns -sudo ip link add veth_priv type veth peer name priv -sudo ip link set dev priv up -sudo ip link set dev veth_priv up netns ns - -sudo ip netns exec ns \ - bash -c " - ip link set dev lo up - ip addr add 192.168.3.2/24 dev veth_priv - ip route add 192.168.5.0/24 via 192.168.3.1" -``` - - -Create directory with strongswan configs that will be mounted to the docker container -``` -mkdir /tmp/sswan -``` - -Create the ``ipsec.conf`` file in the ``/tmp/sswan`` directory with following content: -``` -config setup - strictcrlpolicy=no - -conn initiator - mobike=no - auto=add - type=tunnel - keyexchange=ikev2 - ike=aes256gcm16-prfsha256-modp2048! - esp=aes256gcm16-esn! 
- - # local: - leftauth=psk - leftid=@roadwarrior.vpn.example.com - leftsubnet=192.168.5.0/24 - - # remote: (vpp gateway) - rightid=@vpp.home - right=192.168.10.2 - rightauth=psk - rightsubnet=192.168.3.0/24 -``` - -``/tmp/sswan/ipsec.secrets`` -``` -: PSK 'Vpp123' -``` - -``/tmp/sswan/strongswan.conf`` -``` -charon { - load_modular = yes - plugins { - include strongswan.d/charon/*.conf - } - filelog { - /tmp/charon.log { - time_format = %b %e %T - ike_name = yes - append = no - default = 2 - flush_line = yes - } - } -} -include strongswan.d/*.conf -``` - -Start docker container with strongSwan: - -``` - docker run --name sswan -d --privileged --rm --net=none \ - -v /tmp/sswan:/conf -v /tmp/sswan:/etc/ipsec.d philplckthun/strongswan -``` - -Finish configuration of initiator's private network: - -``` -pid=$(docker inspect --format "{{.State.Pid}}" sswan) -sudo ip link set netns $pid dev swanif - -sudo nsenter -t $pid -n ip addr add 192.168.10.1/24 dev swanif -sudo nsenter -t $pid -n ip link set dev swanif up - -sudo nsenter -t $pid -n ip addr add 192.168.5.2/32 dev lo -sudo nsenter -t $pid -n ip link set dev lo up -``` - -Start VPP ... - -``` -sudo /usr/bin/vpp unix { \ - cli-listen /tmp/vpp.sock \ - gid $(id -g) } \ - api-segment { prefix vpp } \ - plugins { plugin dpdk_plugin.so { disable } } -``` - -... 
and configure it: - -``` -create host-interface name gw -set interface ip addr host-gw 192.168.10.2/24 -set interface state host-gw up - -create host-interface name priv -set interface ip addr host-priv 192.168.3.1/24 -set interface state host-priv up - -ikev2 profile add pr1 -ikev2 profile set pr1 auth shared-key-mic string Vpp123 -ikev2 profile set pr1 id local fqdn vpp.home -ikev2 profile set pr1 id remote fqdn roadwarrior.vpn.example.com - -ikev2 profile set pr1 traffic-selector local ip-range 192.168.3.0 - 192.168.3.255 port-range 0 - 65535 protocol 0 -ikev2 profile set pr1 traffic-selector remote ip-range 192.168.5.0 - 192.168.5.255 port-range 0 - 65535 protocol 0 - -create ipip tunnel src 192.168.10.2 dst 192.168.10.1 -ikev2 profile set pr1 tunnel ipip0 -ip route add 192.168.5.0/24 via 192.168.10.1 ipip0 -set interface unnumbered ipip0 use host-gw -``` - -Initiate the IKEv2 connection: - -``` -$ sudo docker exec sswan ipsec up initiator - -... -CHILD_SA initiator{1} established with SPIs c320c95f_i 213932c2_o and TS 192.168.5.0/24 === 192.168.3.0/24 -connection 'initiator' established successfully -``` - -``` -vpp# show ikev2 sa details - -iip 192.168.10.1 ispi 7849021d9f655f1b rip 192.168.10.2 rspi 5a9ca7469a035205 - encr:aes-gcm-16 prf:hmac-sha2-256 dh-group:modp-2048 - nonce i:692ce8fd8f1c1934f63bfa2b167c4de2cff25640dffe938cdfe01a5d7f6820e6 - r:3ed84a14ea8526063e5aa762312be225d33e866d7152b9ce23e50f0ededca9e3 - SK_d 9a9b896ed6c35c78134fcd6e966c04868b6ecacf6d5088b4b2aee8b05d30fdda - SK_e i:00000000: 1b1619788d8c812ca5916c07e635bda860f15293099f3bf43e8d88e52074b006 - 00000020: 72c8e3e3 - r:00000000: 89165ceb2cef6a6b3319f437386292d9ef2e96d8bdb21eeb0cb0d3b92733de03 - 00000020: bbc29c50 - SK_p i:fe35fca30985ee75e7c8bc0d7bc04db7a0e1655e997c0f5974c31458826b6fef - r:0dd318662a96a25fcdf4998d8c6e4180c67c03586cf91dab26ed43aeda250272 - identifier (i) id-type fqdn data roadwarrior.vpn.example.com - identifier (r) id-type fqdn data vpp.home - child sa 0:encr:aes-gcm-16 
esn:yes - spi(i) c320c95f spi(r) 213932c2 - SK_e i:2a6c9eae9dbed202c0ae6ccc001621aba5bb0b01623d4de4d14fd27bd5185435 - r:15e2913d39f809040ca40a02efd27da298b6de05f67bd8f10210da5e6ae606fb - traffic selectors (i):0 type 7 protocol_id 0 addr 192.168.5.0 - 192.168.5.255 port 0 - 65535 - traffic selectors (r):0 type 7 protocol_id 0 addr 192.168.3.0 - 192.168.3.255 port 0 - 65535 - -``` - -Now we can generate some traffic between responder's and initiator's private networks and see it works. - -``` -$ sudo ip netns exec ns ping 192.168.5.2 -PING 192.168.5.2 (192.168.5.2) 56(84) bytes of data. -64 bytes from 192.168.5.2: icmp_seq=1 ttl=63 time=1.02 ms -64 bytes from 192.168.5.2: icmp_seq=2 ttl=63 time=0.599 ms -``` diff --git a/docs/usecases/ConnectingVPC.rst b/docs/usecases/vppcloud/ConnectingVPC.rst index f5b2c5dc9ae..f5b2c5dc9ae 100644 --- a/docs/usecases/ConnectingVPC.rst +++ b/docs/usecases/vppcloud/ConnectingVPC.rst diff --git a/docs/usecases/automatingthedeployment.rst b/docs/usecases/vppcloud/automatingthedeployment.rst index 25317e2d0a0..25317e2d0a0 100644 --- a/docs/usecases/automatingthedeployment.rst +++ b/docs/usecases/vppcloud/automatingthedeployment.rst diff --git a/docs/usecases/vppcloud.rst b/docs/usecases/vppcloud/index.rst index 22bb984e1e0..728eb2e9bfc 100644 --- a/docs/usecases/vppcloud.rst +++ b/docs/usecases/vppcloud/index.rst @@ -1,7 +1,7 @@ .. _vppcloud: -VPP inside the Cloud -==================== +VPP in the Cloud +================ This section will cover the VPP deployment inside two different Public Cloud Provider: Amazon AWS and Microsoft Azure. Furthermore, we describe how to interconnect several Public Cloud Regions together with Segment Routing per IPv6 and we show some Performance Evaluation. Finally, we make our Terraform scripts available to the community, which could help in automating the VPP deployment inside the Cloud. 
diff --git a/docs/usecases/vppinaws.rst b/docs/usecases/vppcloud/vppinaws.rst index 468915a5b20..8d5662d2bd8 100644 --- a/docs/usecases/vppinaws.rst +++ b/docs/usecases/vppcloud/vppinaws.rst @@ -3,21 +3,21 @@ .. toctree:: VPP in AWS -___________________ +========== Warning: before starting this guide you should have a minimum knowledge on how `AWS works <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/concepts.html>`_! -First of all, you should log into your Virtual Machine inside AWS (we suggest to create an instance with Ubuntu 16.04 on a m5 type) and download some useful packages to make VPP installation as smooth as possible: +First of all, you should log into your Virtual Machine inside AWS (we suggest to create an instance with Ubuntu 16.04 on a m5 type) and download some useful packages to make VPP installation as smooth as possible: .. code-block:: console - $ sudo apt-get update - $ sudo apt-get upgrade - $ sudo apt-get install build-essential - $ sudo apt-get install python-pip - $ sudo apt-get install libnuma-dev - $ sudo apt-get install make - $ sudo apt install libelf-dev + $ sudo apt-get update + $ sudo apt-get upgrade + $ sudo apt-get install build-essential + $ sudo apt-get install python-pip + $ sudo apt-get install libnuma-dev + $ sudo apt-get install make + $ sudo apt install libelf-dev @@ -30,14 +30,14 @@ Afterwards, types the following commands to install VPP: -In this case we downloaded VPP version 18.07 but actually you can use any VPP version available. Then, you can install VPP with all of its plugins: +In this case we downloaded VPP version 18.07 but actually you can use any VPP version available. Then, you can install VPP with all of its plugins: .. 
code-block:: console - $ sudo apt-get update - $ sudo apt-get install vpp - $ sudo apt-get install vpp-plugins vpp-dbg vpp-dev vpp-api-java vpp-api-python vpp-api-lua + $ sudo apt-get update + $ sudo apt-get install vpp + $ sudo apt-get install vpp-plugins vpp-dbg vpp-dev vpp-api-java vpp-api-python vpp-api-lua diff --git a/docs/usecases/vppinazure.rst b/docs/usecases/vppcloud/vppinazure.rst index f64e2a7e7d1..f1bfa427788 100644 --- a/docs/usecases/vppinazure.rst +++ b/docs/usecases/vppcloud/vppinazure.rst @@ -3,14 +3,14 @@ .. toctree:: VPP in Azure -___________________ +============ Before starting, a few notes: -* in our configuration we use only DPDK 18.02, since with the newer versions, such as DPDK 18.05, we obtained several problems during VPP installation (mostly related with MLX4 PMD Drivers). +* in our configuration we use only DPDK 18.02, since with the newer versions, such as DPDK 18.05, we obtained several problems during VPP installation (mostly related with MLX4 PMD Drivers). * Some of the commands are taken from `Azure’s DPDK page <https://docs.microsoft.com/en-us/azure/virtual-network/setup-dpdk>`_. @@ -20,9 +20,9 @@ Firstly, we install the DPDK dependencies: .. code-block:: console - $ sudo add-apt-repository ppa:canonical-server/dpdk-azure –y - $ sudo apt-get update - $ sudo apt-get install -y librdmacm-dev librdmacm1 build-essential libnuma-dev libmnl-dev + $ sudo add-apt-repository ppa:canonical-server/dpdk-azure –y + $ sudo apt-get update + $ sudo apt-get install -y librdmacm-dev librdmacm1 build-essential libnuma-dev libmnl-dev Then, we download DPDK 18.02: @@ -37,11 +37,11 @@ Inside config/common_base, modify: .. 
code-block:: console - CONFIG_RTE_BUILD_SHARED_LIB=n - CONFIG_RTE_LIBRTE_MLX4_PMD=y - CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS=y - CONFIG_RTE_LIBRTE_TAP_PMD=y - CONFIG_RTE_LIBRTE_FAILSAFE_PMD=y + CONFIG_RTE_BUILD_SHARED_LIB=n + CONFIG_RTE_LIBRTE_MLX4_PMD=y + CONFIG_RTE_LIBRTE_MLX4_DLOPEN_DEPS=y + CONFIG_RTE_LIBRTE_TAP_PMD=y + CONFIG_RTE_LIBRTE_FAILSAFE_PMD=y and then: @@ -67,16 +67,16 @@ After the reboot, we type these commands: .. code-block:: console - $ echo 1024 | sudo tee /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages - $ mkdir /mnt/huge - $ sudo mount -t hugetlbfs nodev /mnt/huge - $ grep Huge /proc/meminfo - $ modprobe -a ib_uverbs - $ cd x86_64-native-linuxapp-gcc/ - $ ls - $ cd lib/ - $ ls - $ sudo cp librte_pmd_mlx4_glue.so.18.02.0 /usr/lib + $ echo 1024 | sudo tee /sys/devices/system/node/node*/hugepages/hugepages-2048kB/nr_hugepages + $ mkdir /mnt/huge + $ sudo mount -t hugetlbfs nodev /mnt/huge + $ grep Huge /proc/meminfo + $ modprobe -a ib_uverbs + $ cd x86_64-native-linuxapp-gcc/ + $ ls + $ cd lib/ + $ ls + $ sudo cp librte_pmd_mlx4_glue.so.18.02.0 /usr/lib **Now we focus on VPP installation:** @@ -88,8 +88,8 @@ Firstly, we download VPP .. code-block:: console - $ git clone https://gerrit.fd.io/r/vpp - $ git checkout v18.07 + $ git clone https://gerrit.fd.io/r/vpp + $ git checkout v18.07 Then, we build VPP, using the external DPDK configuration we previously made: @@ -97,10 +97,10 @@ We modify the path inside the vpp.mk file: .. code-block:: console - $ build-data/platforms/vpp.mk - $ vpp_uses_external_dpdk = yes - $ vpp_dpdk_inc_dir = <PATH_TO_DESTDIR_NAME_FROM_ABOVE>/include/dpdk/ - $ vpp_dpdk_lib_dir =<PATH_TO_DESTDIR_NAME_FROM_ABOVE>/lib + $ build-data/platforms/vpp.mk + $ vpp_uses_external_dpdk = yes + $ vpp_dpdk_inc_dir = <PATH_TO_DESTDIR_NAME_FROM_ABOVE>/include/dpdk/ + $ vpp_dpdk_lib_dir =<PATH_TO_DESTDIR_NAME_FROM_ABOVE>/lib <PATH_TO_DESTDIR_NAME_FROM_ABOVE> is whatever the path used when compiling DPDK above. 
These paths have to be absolute path in order for it to work. @@ -108,34 +108,34 @@ we modify build-data/platforms/vpp.mk to use .. code-block:: console - vpp_uses_dpdk_mlx4_pmd = yes + vpp_uses_dpdk_mlx4_pmd = yes .. code-block:: console - $ make build - $ cd build-root/ - $ make V=0 PLATFORM=vpp TAG=vpp install-deb - $ sudo dpkg -i *.deb + $ make build + $ cd build-root/ + $ make V=0 PLATFORM=vpp TAG=vpp install-deb + $ sudo dpkg -i *.deb Finally, we modify the startup.conf file: .. code-block:: console - $ cd /etc/vpp - $ sudo nano startup.conf + $ cd /etc/vpp + $ sudo nano startup.conf Inside the DPDK block, the following commands: .. code-block:: console - ## Whitelist specific interface by specifying PCI address - dev 000X:00:0X.0 - dev 000X:00:0X.0 - - # Running failsafe - vdev net_vdev_netvsc0,iface=eth1 - vdev net_vdev_netvsc1,iface=eth2 + ## Whitelist specific interface by specifying PCI address + dev 000X:00:0X.0 + dev 000X:00:0X.0 + + # Running failsafe + vdev net_vdev_netvsc0,iface=eth1 + vdev net_vdev_netvsc1,iface=eth2 *Please refer to Azure DPDK document to pick the right iface to use for failsafe vdev.* diff --git a/docs/usecases/webapp.md b/docs/usecases/webapp.md deleted file mode 100644 index 3b1a3b7b5b7..00000000000 --- a/docs/usecases/webapp.md +++ /dev/null @@ -1,274 +0,0 @@ -Building VPP web applications -============================= - -Vpp includes a versatile http/https "static" server plugin. We quote -the word static in the previous sentence because the server is easily -extended. This note describes how to build a Hugo site which includes -both monitoring and control functions. - -Let's assume that we have a vpp data-plane plugin which needs a -monitoring and control web application. Here's how to build one. - -Step 1: Add URL handlers ------------------------- - -Individual URL handlers are pretty straightforward. 
You can -return just about anything you like, but as we work through -the example you'll see why returning data in .json format -tends to work out pretty well. - -``` - static int - handle_get_status (http_builtin_method_type_t reqtype, - u8 * request, http_session_t * hs) - { - my_main_t *mm = &my_main; - u8 *s = 0; - - /* Construct a .json reply */ - s = format (s, "{\"status\": {"); - s = format (s, " \"thing1\": \"%s\",", mm->thing1_value_string); - s = format (s, " \"thing2\": \"%s\",", mm->thing2_value_string); - /* ... etc ... */ - s = format (s, " \"lastthing\": \"%s\"", mm->last_value_string); - s = format (s, "}}"); - - /* And tell the static server plugin how to send the results */ - hs->data = s; - hs->data_offset = 0; - hs->cache_pool_index = ~0; - hs->free_data = 1; /* free s when done with it, in the framework */ - return 0; - } -``` - -Words to the Wise: Chrome has a very nice set of debugging -tools. Select "More Tools -> Developer Tools". Right-hand sidebar -appears with html source code, a javascript debugger, network results -including .json objects, and so on. - -Note: .json object format is **intolerant** of both missing and extra -commas, missing and extra curly-braces. It's easy to waste a -considerable amount of time debugging .json bugs. - -Step 2: Register URL handlers with the server ---------------------------------------------- - -Call http_static_server_register_builtin_handler() as shown. It's -likely but not guaranteed that the static server plugin will be -available. 
- - -``` - int - plugin_url_init (vlib_main_t * vm) - { - void (*fp) (void *, char *, int); - - /* Look up the builtin URL registration handler */ - fp = vlib_get_plugin_symbol ("http_static_plugin.so", - "http_static_server_register_builtin_handler"); - - if (fp == 0) - { - clib_warning ("http_static_plugin.so not loaded..."); - return -1; - } - - (*fp) (handle_get_status, "status.json", HTTP_BUILTIN_METHOD_GET); - (*fp) (handle_get_run, "run.json", HTTP_BUILTIN_METHOD_GET); - (*fp) (handle_get_reset, "reset.json", HTTP_BUILTIN_METHOD_GET); - (*fp) (handle_get_stop, "stop.json", HTTP_BUILTIN_METHOD_GET); - return 0; - } -``` - -Make sure to start the http static server **before** calling -plugin_url_init(...), or the registrations will disappear. - -Step 3: Install Hugo, pick a theme, and create a site ------------------------------------------------------ - -Please refer to the Hugo documentation. - -See [the Hugo Quick Start -Page](https://gohugo.io/getting-started/quick-start). Prebuilt binary -artifacts for many different environments are available on -[the Hugo release page](https://github.com/gohugoio/hugo/releases). - -To pick a theme, visit [the Hugo Theme -site](https://themes.gohugo.io). Decide what you need your site to -look like. Stay away from complex themes unless you're prepared to -spend considerable time tweaking and tuning. - -The "Introduction" theme is a good choice for a simple site, YMMV. - -Step 4: Create a "rawhtml" shortcode ------------------------------------- - -Once you've initialized your new site, create the directory -<site-root>/layouts/shortcodes. Create the file "rawhtml.html" in that -directory, with the following contents: - -``` - <!-- raw html --> - {{.Inner}} -``` -This is a key trick which allows a static Hugo site to include -javascript code. - -Step 5: create Hugo content which interacts with vpp ----------------------------------------------------- - -Now it's time to do some web front-end coding in javascript. 
Of -course, you can create static text, images, etc. as described in the -Hugo documentation. Nothing changes in that respect. - -To include dynamically-generated data in your Hugo pages, splat down -some <div> HTML tags, and define a few buttons: - -``` - {{< rawhtml >}} - <div id="Thing1"></div> - <div id="Thing2"></div> - <div id="Lastthing"></div> - <input type="button" value="Run" onclick="runButtonClick()"> - <input type="button" value="Reset" onclick="resetButtonClick()"> - <input type="button" value="Stop" onclick="stopButtonClick()"> - <div id="Message"></div> - {{< /rawhtml >}} -``` - -Time for some javascript code to interact with vpp: - - {{< rawhtml >}} - <script> - async function getStatusJson() { - pump_url = location.href + "status.json"; - const json = await fetch(pump_url, { - method: 'GET', - mode: 'no-cors', - cache: 'no-cache', - headers: { - 'Content-Type': 'application/json', - }, - }) - .then((response) => response.json()) - .catch(function(error) { - console.log(error); - }); - - return json.status; - }; - - async function sendButton(which) { - my_url = location.href + which + ".json"; - const json = await fetch(my_url, { - method: 'GET', - mode: 'no-cors', - cache: 'no-cache', - headers: { - 'Content-Type': 'application/json', - }, - }) - .then((response) => response.json()) - .catch(function(error) { - console.log(error); - }); - return json.message; - }; - - async function getStatus() { - const status = await getStatusJson(); - - document.getElementById("Thing1").innerHTML = status.thing1; - document.getElementById("Thing2").innerHTML = status.thing2; - document.getElementById("Lastthing").innerHTML = status.lastthing; - }; - - async function runButtonClick() { - const json = await sendButton("run"); - document.getElementById("Message").innerHTML = json.Message; - } - - async function resetButtonClick() { - const json = await sendButton("reset"); - document.getElementById("Message").innerHTML = json.Message; - } - async function 
stopButtonClick() { - const json = await sendButton("stop"); - document.getElementById("Message").innerHTML = json.Message; - } - - getStatus(); - - </script> - {{< /rawhtml >}} - -At this level, javascript coding is pretty simple. Unless you know -exactly what you're doing, please follow the async function / await -pattern shown above. - -Step 6: compile the website ---------------------------- - -At the top of the website workspace, simply type "hugo". The compiled -website lands in the "public" subdirectory. - -You can use the Hugo static server - with suitable stub javascript -code - to see what your site will eventually look like. To start the -hugo static server, type "hugo server". Browse to -"http://localhost:1313". - -Step 7: configure vpp ---------------------- - -In terms of command-line args: you may wish to use poll-sleep-usec 100 -to keep the load average low. Totally appropriate if vpp won't be -processing a lot of packets or handling high-rate http/https traffic. - -``` - unix { - ... - poll-sleep-usec 100 - startup-config ... see below ... - ... - } -``` - -If you wish to provide an https site, configure tls. The simplest tls -configuration uses a built-in test certificate - which will annoy -Chrome / Firefox - but it's sufficient for testing: - -``` - tls { - use-test-cert-in-ca - } -``` - - - - -### vpp startup configuration - -Enable the vpp static server by way of the startup config mentioned above: - -``` - http static server www-root /myhugosite/public uri tcp://0.0.0.0/2345 cache-size 5m fifo-size 8192 -``` - -The www-root must be specified, and must correctly name the compiled -hugo site root. If your Hugo site is located at /myhugosite, specify -"www-root /myhugosite/public" in the "http static server" stanza. The -uri shown above binds to TCP port 2345. - -If you're using https, use a uri like "tls://0.0.0.0/443" instead of -the uri shown above. 
- -You may want to add a Linux host interface to view the full-up site locally: - -``` - create tap host-if-name lstack host-ip4-addr 192.168.10.2/24 - set int ip address tap0 192.168.10.1/24 - set int state tap0 up -``` diff --git a/docs/usecases/webapp.rst b/docs/usecases/webapp.rst new file mode 100644 index 00000000000..f76fd5b6353 --- /dev/null +++ b/docs/usecases/webapp.rst @@ -0,0 +1,279 @@ +Web applications with VPP +========================= + +Vpp includes a versatile http/https “static” server plugin. We quote the +word static in the previous sentence because the server is easily +extended. This note describes how to build a Hugo site which includes +both monitoring and control functions. + +Let’s assume that we have a vpp data-plane plugin which needs a +monitoring and control web application. Here’s how to build one. + +Step 1: Add URL handlers +------------------------ + +Individual URL handlers are pretty straightforward. You can return just +about anything you like, but as we work through the example you’ll see +why returning data in .json format tends to work out pretty well. + +:: + + static int + handle_get_status (http_builtin_method_type_t reqtype, + u8 * request, http_session_t * hs) + { + my_main_t *mm = &my_main; + u8 *s = 0; + + /* Construct a .json reply */ + s = format (s, "{\"status\": {"); + s = format (s, " \"thing1\": \"%s\",", mm->thing1_value_string); + s = format (s, " \"thing2\": \"%s\",", mm->thing2_value_string); + /* ... etc ... */ + s = format (s, " \"lastthing\": \"%s\"", mm->last_value_string); + s = format (s, "}}"); + + /* And tell the static server plugin how to send the results */ + hs->data = s; + hs->data_offset = 0; + hs->cache_pool_index = ~0; + hs->free_data = 1; /* free s when done with it, in the framework */ + return 0; + } + +Words to the Wise: Chrome has a very nice set of debugging tools. Select +“More Tools -> Developer Tools”. 
Right-hand sidebar appears with html +source code, a javascript debugger, network results including .json +objects, and so on. + +Note: .json object format is **intolerant** of both missing and extra +commas, missing and extra curly-braces. It’s easy to waste a +considerable amount of time debugging .json bugs. + +Step 2: Register URL handlers with the server +--------------------------------------------- + +Call http_static_server_register_builtin_handler() as shown. It’s likely +but not guaranteed that the static server plugin will be available. + +:: + + int + plugin_url_init (vlib_main_t * vm) + { + void (*fp) (void *, char *, int); + + /* Look up the builtin URL registration handler */ + fp = vlib_get_plugin_symbol ("http_static_plugin.so", + "http_static_server_register_builtin_handler"); + + if (fp == 0) + { + clib_warning ("http_static_plugin.so not loaded..."); + return -1; + } + + (*fp) (handle_get_status, "status.json", HTTP_BUILTIN_METHOD_GET); + (*fp) (handle_get_run, "run.json", HTTP_BUILTIN_METHOD_GET); + (*fp) (handle_get_reset, "reset.json", HTTP_BUILTIN_METHOD_GET); + (*fp) (handle_get_stop, "stop.json", HTTP_BUILTIN_METHOD_GET); + return 0; + } + +Make sure to start the http static server **before** calling +plugin_url_init(…), or the registrations will disappear. + +Step 3: Install Hugo, pick a theme, and create a site +----------------------------------------------------- + +Please refer to the Hugo documentation. + +See `the Hugo Quick Start +Page <https://gohugo.io/getting-started/quick-start>`__. Prebuilt binary +artifacts for many different environments are available on `the Hugo +release page <https://github.com/gohugoio/hugo/releases>`__. + +To pick a theme, visit `the Hugo Theme +site <https://themes.gohugo.io>`__. Decide what you need your site to +look like. Stay away from complex themes unless you’re prepared to spend +considerable time tweaking and tuning. + +The “Introduction” theme is a good choice for a simple site, YMMV. 
+ +Step 4: Create a “rawhtml” shortcode +------------------------------------ + +Once you’ve initialized your new site, create the directory +/layouts/shortcodes. Create the file “rawhtml.html” in that directory, +with the following contents: + +:: + + <!-- raw html --> + {{.Inner}} + +This is a key trick which allows a static Hugo site to include +javascript code. + +Step 5: create Hugo content which interacts with vpp +---------------------------------------------------- + +Now it’s time to do some web front-end coding in javascript. Of course, +you can create static text, images, etc. as described in the Hugo +documentation. Nothing changes in that respect. + +To include dynamically-generated data in your Hugo pages, splat down +some + +.. raw:: html + + <div> + +HTML tags, and define a few buttons: + +:: + + {{< rawhtml >}} + <div id="Thing1"></div> + <div id="Thing2"></div> + <div id="Lastthing"></div> + <input type="button" value="Run" onclick="runButtonClick()"> + <input type="button" value="Reset" onclick="resetButtonClick()"> + <input type="button" value="Stop" onclick="stopButtonClick()"> + <div id="Message"></div> + {{< /rawhtml >}} + +Time for some javascript code to interact with vpp: + +:: + + {{< rawhtml >}} + <script> + async function getStatusJson() { + pump_url = location.href + "status.json"; + const json = await fetch(pump_url, { + method: 'GET', + mode: 'no-cors', + cache: 'no-cache', + headers: { + 'Content-Type': 'application/json', + }, + }) + .then((response) => response.json()) + .catch(function(error) { + console.log(error); + }); + + return json.status; + }; + + async function sendButton(which) { + my_url = location.href + which + ".json"; + const json = await fetch(my_url, { + method: 'GET', + mode: 'no-cors', + cache: 'no-cache', + headers: { + 'Content-Type': 'application/json', + }, + }) + .then((response) => response.json()) + .catch(function(error) { + console.log(error); + }); + return json.message; + }; + + async function 
getStatus() { + const status = await getStatusJson(); + + document.getElementById("Thing1").innerHTML = status.thing1; + document.getElementById("Thing2").innerHTML = status.thing2; + document.getElementById("Lastthing").innerHTML = status.lastthing; + }; + + async function runButtonClick() { + const json = await sendButton("run"); + document.getElementById("Message").innerHTML = json.Message; + } + + async function resetButtonClick() { + const json = await sendButton("reset"); + document.getElementById("Message").innerHTML = json.Message; + } + async function stopButtonClick() { + const json = await sendButton("stop"); + document.getElementById("Message").innerHTML = json.Message; + } + + getStatus(); + + </script> + {{< /rawhtml >}} + +At this level, javascript coding is pretty simple. Unless you know +exactly what you’re doing, please follow the async function / await +pattern shown above. + +Step 6: compile the website +--------------------------- + +At the top of the website workspace, simply type “hugo”. The compiled +website lands in the “public” subdirectory. + +You can use the Hugo static server - with suitable stub javascript code +- to see what your site will eventually look like. To start the hugo +static server, type “hugo server”. Browse to “http://localhost:1313”. + +Step 7: configure vpp +--------------------- + +In terms of command-line args: you may wish to use poll-sleep-usec 100 +to keep the load average low. Totally appropriate if vpp won’t be +processing a lot of packets or handling high-rate http/https traffic. + +:: + + unix { + ... + poll-sleep-usec 100 + startup-config ... see below ... + ... + } + +If you wish to provide an https site, configure tls. 
The simplest tls +configuration uses a built-in test certificate - which will annoy Chrome +/ Firefox - but it’s sufficient for testing: + +:: + + tls { + use-test-cert-in-ca + } + +vpp startup configuration +~~~~~~~~~~~~~~~~~~~~~~~~~ + +Enable the vpp static server by way of the startup config mentioned +above: + +:: + + http static server www-root /myhugosite/public uri tcp://0.0.0.0/2345 cache-size 5m fifo-size 8192 + +The www-root must be specified, and must correctly name the compiled +hugo site root. If your Hugo site is located at /myhugosite, specify +“www-root /myhugosite/public” in the “http static server” stanza. The +uri shown above binds to TCP port 2345. + +If you’re using https, use a uri like “tls://0.0.0.0/443” instead of the +uri shown above. + +You may want to add a Linux host interface to view the full-up site +locally: + +:: + + create tap host-if-name lstack host-ip4-addr 192.168.10.2/24 + set int ip address tap0 192.168.10.1/24 + set int state tap0 up |
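A note on the javascript in Step 5: sendButton() returns json.message, while the button handlers read json.Message from that result, so the two need to agree on one field name. The sketch below shows one way to keep the reply handling in a single place. It is not the author's code. It assumes the run/reset/stop handlers reply with an object shaped like {"message": "..."}, which the diff does not show, and the helper names (fetchJson, messageFromReply, statusFromReply) and placeholder strings are made up for illustration.

```javascript
// Hypothetical helper: fetch a URL and parse the body as JSON.
// Returns undefined (instead of throwing) when the request or the parse fails.
// fetchImpl is injectable so the helper can be exercised without a server.
async function fetchJson(url, fetchImpl = globalThis.fetch) {
  try {
    const response = await fetchImpl(url, { method: "GET", cache: "no-cache" });
    return await response.json();
  } catch (error) {
    console.log(error);
    return undefined;
  }
}

// Hypothetical helper: text to show in the "Message" div.
// ASSUMPTION: the run/reset/stop handlers answer {"message": "..."}.
function messageFromReply(reply) {
  if (reply && typeof reply.message === "string") {
    return reply.message;
  }
  return "no reply from vpp";
}

// Hypothetical helper: fields to show in the Thing1/Thing2/Lastthing divs.
// The shape matches what handle_get_status formats:
// {"status": {"thing1": ..., "thing2": ..., "lastthing": ...}}
function statusFromReply(reply) {
  const status = (reply && reply.status) || {};
  return {
    thing1: status.thing1 ?? "?",
    thing2: status.thing2 ?? "?",
    lastthing: status.lastthing ?? "?",
  };
}
```

A button handler could then await fetchJson(location.href + "run.json") and assign messageFromReply(...) of the result to the "Message" div, which also avoids touching a property of undefined when the request fails.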