summaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2017-10-11vhost: crash under heavy traffic condition due to memory corruption (VPP-1016)Steven1-2/+33
With heavy traffic, tx code path may crash due to memory corruption Thread 5 "vpp_wk_2" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7fff3995c700 (LWP 2505)] 0x00007ffff73675e8 in vhost_user_if_input (vm=0x7fffb5f5bf9c, vum=0x7ffff7882a40 <vhost_user_main>, vui=0x7fffb65570c4, qid=0, node=0x7fffb6577dac, mode=VNET_HW_INTERFACE_RX_MODE_POLLING) at /home/sluong/vpp-master/vpp/build-data/../src/vnet/devices/virtio/vhost-user.c:1610 1610 bi_current = (vum->cpus[thread_index].rx_buffers) [vum->cpus[thread_index].rx_buffers_len]; (gdb) p vum->cpus[thread_index].rx_buffers_len $2 = 793212607 (gdb) Apparently, some code accidentally wrote the bad value in rx_buffers_len. rx_buffers_len should never be greater than 1024 since that is how many buffers we request each time. After debugging many hours, I discovered that the memory corruption happens in the tx code path right here on line 2176. { vhost_copy_t *cpy = &vum->cpus[thread_index].copy[copy_len]; copy_len++; cpy->len = bytes_left; cpy->len = (cpy->len > buffer_len) ? buffer_len : cpy->len; cpy->dst = buffer_map_addr; cpy->src = (uword) vlib_buffer_get_current (current_b0) + current_b0->current_length - bytes_left; (gdb) p cpy $3 = (vhost_copy_t *) 0x7fffb554077c (gdb) p copy_len $4 = 1025 (gdb) p &vum->cpus[3].rx_buffers_len $8 = (u32 *) 0x7fffb5540784 copy_len is picking up the index entry 1024 before it was incremented. copy array has only 1024 members (0 - 1023 are valid). The assignment here in cpy surely causes memory corruption. It is only discovered later when the memory location that it corrupted is used. The condition for the crash is to transmit jumbo frames under heavy volume. Since ring size is 1024, with one packet taking up one index for frame size (less 2048), it does not cause overflow. With jumbo frames, it requires multiple indices for one packet, it can cause the overflow under heavy traffic. The fix is to do copy out when we have 1000 entries in the array to avoid overflow. Change-Id: Iefbc739b8e80470f1cf13123113f8331ffcd0eb2 Signed-off-by: Steven <sluong@cisco.com> (cherry picked from commit aa5df48cb233b377b5910694e2440a16e5973864)
2017-10-11Deps added to Makefile must also be in spec file.Thomas F Herbert2-1/+8
Deps are required for Fedora too. This patch should be in version 17.10 because it fixes breaking rpm builds in some circumstances. JIRA: VPP-1015 Change-Id: I10807069742cdd6b09a0f34d9d05e9cae4146ec3 Signed-off-by: Thomas F Herbert <therbert@redhat.com>
2017-10-09Fix bug with temporary directory when building rpms.Thomas F Herbert1-1/+1
Fixes bug introduced in commit 5349f94d. JIRA: VPP-1014 Change-Id: Ia18f4c6f5f1124306cce790a36f6de970d186687 Signed-off-by: Thomas F Herbert <therbert@redhat.com>
2017-10-09NAT: fixed ICMP broken translation for GRE tunnel interface (VPP-1008)Matus Fabian3-31/+27
Change-Id: Ie3245b96c511cc30915e70e8c881f445291a38c2 Signed-off-by: Matus Fabian <matfabia@cisco.com>
2017-10-06make test: Create link to failed test dir on timeout. (VPP-1011)Dave Wallace2-1/+7
- Also change default coredump configuration from "coredump-size unlimited" to "full-coredump" Change-Id: Iefedc2636f2d9696b7575b34e91dd7be49f601fa Signed-off-by: Dave Wallace <dwallacelf@gmail.com> (cherry picked from commit 981fadf928dadac683d2f629edf738aa91510af3)
2017-10-06tuntap: Introduce per thread structure to suport multi-threads (VPP-1012)Steven2-65/+99
https://gerrit.fd.io/r/#/c/8551/ decoupled the global variable, namely tm->iovecs from TX and RX. However, to support multi-threads, we have to eliminate the use of this global variable with per thread variable. I notice that rx_buffers must also be per thread variable. So, we introduce per thread struct to contain rx_buffers and iovecs. Each thread will find the per thread struct with thread_index. Change-Id: I61abf2fdace8d722525a382ac72f0d04a173b9ce Signed-off-by: Steven <sluong@cisco.com> (cherry picked from commit 4cd257667406d0500a81323ef91f5c7c8c902b25)
2017-10-05fix buffer allocation for sparse jumbo frames in vhostPierre Pfister1-1/+3
A bug was reported where a jumbo packet would stay in vhost queue forever or until a large enough number of other packets arrived in the queue too. This is due to a bug in vhost input node buffer allocation. The fix is to make sure that vhost always allocates at least enough buffers for one single big packet. '40' is used to account for 65kB frames. Change-Id: I1d293028854165083e30cd798fab9d4140230b78 Signed-off-by: Pierre Pfister <ppfister@cisco.com>
2017-10-05drop python3 dependency (VPP-1010)Klement Sekera4-29/+26
Change-Id: I99c2c1d0d5b96f33efdb58dd3a2897a752e65349 Signed-off-by: Klement Sekera <ksekera@cisco.com>
2017-10-05make test: archive failed test data with build logs. (VPP-1011)Dave Wallace2-17/+32
- Fix invocation of compress_failed.sh - Fix compress_failed to copy compressed results files to $WORKSPACE/archives and return failure exit code. Failed test case data will be copied to logs.fd.io and found in the archives/<make test data dir>-FAILED directory in the build log link in the vpp-verify-master-ubuntu1604 jenkins job page. For example: https://logs.fd.io/production/vex-yul-rot-jenkins-1/vpp-verify-master-ubuntu1604/7353/archives/ Change-Id: Ife9a0737115e69c0a8441e3bb0133af1528d909b Signed-off-by: Dave Wallace <dwallacelf@gmail.com> (cherry picked from commit 25dc16715ee3fc0a600e2f58841173249bfae501)
2017-10-05memif: crash on slave mode (VPP-1006)Steven1-0/+1
Crash was seen on recent image with this BT on top of the stack (gdb) bt full (mif=0x7fffb6226568) at /vpp/build-data/../src/plugins/memif/memif.c:297 ring = 0x0 <<<<<<<<<< i = 0 j = 0 buffer_offset = 65792 r = 0x7fffb5e59f80 alloc = {flags = 1, name = 0x7fffb449f965 "memif region", size = 4260096, numa_node = 0, addr = 0x7fff41dac000, fd = 11, log2_page_size = 12, n_pages = 1041} err = 0x0 __FUNCTION__ = "memif_init_regions_and_queues" The crash happened at this line. ring = memif_get_ring (mif, MEMIF_RING_S2M, i); ring=>head = ring->tail = 0; <===== Please note that the crash is caused by dereferencing NULL rinng. Put breakpoint into the function. I notice that mif->regions[0].shm is not initialized. (gdb) p mif->regions[0].shm $8 = (void *) 0x0 It looks like we forgot to set shm after clib_mem_vm_ext_alloc(). Add the missing cide and the crash is fixed. Change-Id: Ib722a6c241c77acfa8e33962106b57faa50e1ea7 Signed-off-by: Steven <sluong@cisco.com> (cherry picked from commit 9fefa9a697daf0e949ea7a2700ecaf2ba4d1d2cb)
2017-10-04session: fill in bind handle for sock flavored api (VPP-1005)Florin Coras1-1/+6
Change-Id: I492bea060ba5c219ea75e19ebfdad79b1074e04b Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-04Propagate duplicate IF addr add/del error up to API.Jon Loeliger3-8/+30
Identify and complain when the same IP prefix is assigned to two different SW interfaces: vpp# set int ip address TenGigabitEthernet6/0/0 1.2.3.4/32 vpp# set int ip address TenGigabitEthernet6/0/1 1.2.3.4/32 set interface ip address: Prefix 1.2.3.4/32 already found on interface TenGigabitEthernet6/0/0 Change-Id: I1aee1b6a7ddd00d3109a53d8e1b6ce97bf45e372 Signed-off-by: Jon Loeliger <jdl@netgate.com> (cherry picked from commit 35ffa3e8f6b032f6e324234d495f769049d8feea)
2017-10-04Set MAC address needs the HW interface indexNeale Ranns2-2/+6
Change-Id: I7b175d57b85e626aab00221b6dac0498aebcbeae Signed-off-by: Neale Ranns <nranns@cisco.com> (cherry picked from commit d867a7cf6baffcebbf1b6e408272ec22dc55dd68)
2017-10-04Dump of deag/lookup routes has is_drop=1 (VPP-995)Neale Ranns1-0/+2
Change-Id: I58772a83e22885a9ea8a7a981d2bcb4b31a050d2 Signed-off-by: Neale Ranns <nranns@cisco.com> (cherry picked from commit 7b7ba572ab486d57b59c12af521175a6bcd7a52b)
2017-10-03Update L2FIB entry timestamp only if BD aging enabled (VPP-1002)John Lo3-3/+6
Change L2 learning path so it update stale timestamp in MAC entry only if aging is enabled on the BD for the MAC entry. Change-Id: I7babe986ceef3c030d8ef9185076c42b405f7b0f Signed-off-by: John Lo <loj@cisco.com> (cherry picked from commit 5a6508d7269266b4a3ecacdd197ea3514a0c0e28)
2017-10-03ip: fix probing of already resolved destinations (VPP-998)Florin Coras4-10/+28
Change-Id: I3e6276e6829dfee5a7aeae1b4ab4c3d2f2e932a4 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-02L2-FIB:add mac learn events testEyal Bari3-56/+76
fixes an issue where events were not sent if BD doesn't enable mac aging Change-Id: Iddc53cb5c45e560633e6c5cff2731dccfc70ad5b Signed-off-by: Eyal Bari <ebari@cisco.com> (cherry picked from commit 24db0ec78fb651c4c585ebf30e07108240574045)
2017-09-29cdp/lldp: punt for no buffer (VPP-997)Steven2-0/+6
When making a call to vlib_packet_template_get_packet(), it is possible to get back a NULL if the system runs out of buffer. This can happen when there is buffer leaks. But don't crash just because we run out of buffers, just punt. Change-Id: Ie90ea41f3dda6e583d48959cbd18ff124158d7f8 Signed-off-by: Steven <sluong@cisco.com> (cherry picked from commit 0ff5c563d5048991dbd02a3892dccde8305a7e30)
2017-09-28tun/tap: Bad packets sent to kernel via tun/tap interfaceSteven2-22/+29
It was observed that under heavy traffic, VPP accidentally sent traffic with the wrong source and destination to the tun/tap interface. Traffic appears to be sent to the wrong direction. This problem is only seen when worker thread is configured. When worker thread is used, TX and RX may reside in different core. Yet both TX and RX threads are sharing the same global variable, namely iovecs without any mutex or memory barrier protection. This creates a race condition when heavy traffic is blasted to VPP, like 1000 pps. We could create a mutex or memory barrier to ensure atomic memory access. But why bother? It is a lot cheaper to just decouple the iovecs such that TX and RX have their own iovecs. Change-Id: I86a5a19bd8de54d54f32e1f0845bae6a81bbf686 Signed-off-by: Steven <sluong@cisco.com> (cherry picked from commit 4ff586d1c6fc5c40e1548cd6f221a8a7f3ad033b)
2017-09-28General documentation updatesChris Luke16-83/+117
- We now have several developer-focused docs, so create an index page for them. - Rework several docs to fit into the index structure. - Experiment with code highlighting; tweak the CSS slightly to make it slightly nicer to look at. Change-Id: I4185a18f84fa0764745ca7a3148276064a3155c6 Signed-off-by: Chris Luke <chrisy@flirble.org> (cherry picked from commit 64ebb5ff1338140d94c7f9ee72138fe84d89de2e)
2017-09-2717.10 change default branch in gitreviewv17.10-rc1Florin Coras1-0/+1
Change-Id: Icd2cc7e328719b3964dfe344caf8ed9858283661 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-27VPP-990 remove registered handler if control ping failsv18.01-rc0Matej Perina3-0/+14
Change-Id: I5ca5763f0dc0a73cc6f014b855426b7ac180f356 Signed-off-by: Matej Perina <mperina@cisco.com>
2017-09-27LISP: add API handlers for set/get transport protocolFilip Tehlar4-0/+194
Change-Id: Ib675164c475edcdbe3013df7b847adf5e050c53f Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
2017-09-27VLAN support on host(af-packet) interface.Akshaya N1-3/+26
On host interface if a VLAN tagged packet is received, linux kernel removes the VLAN header from packet byte stream and adds metadata in tpacket2_hdr. This patch explicitely checks for the presense of VLAN metadata and adds it in VPP packet. Change-Id: I0ba35c1e98dbc008ce18d032f22f2717d610c1aa Signed-off-by: Akshaya N <akshaya@rtbrick.com>
2017-09-27Update vagrant centos config to CentOS 7.4Dave Wallace3-2/+9
Change-Id: I45c1227b53ba9e57b94f1bc68de939cd3ce9d619 Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
2017-09-27Fix: unnecesary uio binding for Mellanox NICSteve Shin1-1/+3
UIO binding is not required for Mellanox NIC and calling vlib_pci_bind_to_uio() should be skipped. Change-Id: I10ea457bc3c8d4be8117dec51d5bd940ee416a44 Signed-off-by: Steve Shin <jonshin@cisco.com>
2017-09-27Various fixes for issues found by Coverity (VPP-972)Chris Luke5-3/+25
174267: Revisit this string termination issue 174816: Add check for NULL when trace is enabled 177211: Add notation that mutex is not required here 177117: Added check for log2_page_size == 0 and returns an error if so 163697,163698: Added missing sw_if_index validation Change-Id: I5a76fcf6505c785bfb3269e353360031c6a0fd0f Signed-off-by: Chris Luke <chrisy@flirble.org>
2017-09-27acl-plugin: take 2 at VPP-991 fix, this time with a test case which verifies it.Andrew Yourtchenko3-3/+42
The replacement of [] with pool_elt_at_index and subsequent fixing it was incorrect - it was equivalent to &[], since it returns a pointer to the element. I've added VPP-993 previously to create a testcase, so this commit partially fulfills that one as well. Change-Id: I5b15e3ce48316f0429232aacf885e8f7c63d9522 Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2017-09-27make test: clean ext binaries when doing test-wipeKlement Sekera1-0/+1
Change-Id: I9f5212ee670ea91c6b35f1406c256d0687b9c6b5 Signed-off-by: Klement Sekera <ksekera@cisco.com>
2017-09-27Update CSIT tests 1700906 -> 170926Jan Gelety1-1/+1
- update of CSIT operational branch to be used for VPP-patch test Change-Id: If582dc7c5e37bd3cda7ba4858e98fc504e2b7b1e Signed-off-by: Jan Gelety <jgelety@cisco.com>
2017-09-26Fix SUSE dependencies to contemplate both python and python3 scripts.Marco Varlese1-1/+1
Change-Id: Ib677955448833dfeb1291490340f5ea1e417213b Signed-off-by: Marco Varlese <marco.varlese@suse.com>
2017-09-26tcp: update snd_nxt after congestion recoveryFlorin Coras2-9/+7
Change-Id: I2cf4c4850b9c3c093a7dce0cec89b9f710f69393 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-26Add thread-safe event signaller, use RPC where requiredDave Barach5-5/+73
Update ping code to use the new function Change-Id: Ieb753b23f8402cbe5667c22747896784c8ece937 Signed-off-by: Florin Coras <fcoras@cisco.com> Signed-off-by: Dave Barach <dave@barachs.net>
2017-09-26checkstyle: ignore old clang-format (centos)Klement Sekera1-2/+9
Change-Id: Iecf35bd9fd760856e32eb1c0c9542ffbed472379 Signed-off-by: Klement Sekera <ksekera@cisco.com>
2017-09-26make test: don't recompile ext if not neededKlement Sekera1-4/+4
Skip recompilation of test binaries from test/ext if these are up-to-date. This speeds up repeated test runs. Change-Id: I96dbfafc372398e3d858d8419219ef35c47bd0f3 Signed-off-by: Klement Sekera <ksekera@cisco.com>
2017-09-26jvpp: lowering verbosity level for jvpp testsMatej Perina2-56/+55
Change-Id: Ie38dad209cce6d546379b4a5e449b34fbcadf171 Signed-off-by: Matej Perina <mperina@cisco.com>
2017-09-26NAT: remove worker_by_in lookup hash table (VPP-992)Matus Fabian4-97/+30
Change-Id: I3873d3e411bf93cac82e73a0b8e3b22563aaf217 Signed-off-by: Matus Fabian <matfabia@cisco.com>
2017-09-26acl-plugin: test: move the API calls to vpp_papi_provider.pyAndrew Yourtchenko4-152/+79
Change-Id: I1d3818027b8a1fcb1ec12016e3476b5c22a2d5a5 Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2017-09-26Memory overwritten when using unformat %u (VPP-987)Aequitas3-19/+19
Change-Id: I7d8f807fb502d61688aa1dee25fa4edcbeb32f41 Signed-off-by: Aequitas <wang.junqi@zte.com.cn>
2017-09-25Fix Ubuntu java dependency regression.Dave Wallace1-0/+1
- introduced by e6f3b467 "Fix for ssl dependency on debian 9" Change-Id: If41e517b2a55d2028ade6671f407831cfcf205c4 Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
2017-09-25Vagrant fails if Vagrantfile is a symlink on Windows 10.Dave Wallace4-117/+131
- Revert Vagrantfile symlink to the default - Update README and env.sh Change-Id: Ib1a557b897e0217b162c31118a4c265769dd1760 Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
2017-09-25Refactor multi-host socket_test.sh for bare-metal.Dave Wallace2-18/+51
Change-Id: I4fcde6652e0c66315a453250c6e02cd32176833d Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
2017-09-25tcp: do not sample rtt for retransmitted segmentsFlorin Coras3-81/+100
Change-Id: I365c31607332a944ef498369881332b515894ed7 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-25acl-plugin: use vec_elt_at_index rather than pool_elt_at_index to access ↵Andrew Yourtchenko1-2/+2
vector elements bb7f0f644 aimed to fix the coverity issue has incorrectly replaced the previous [] access with pool_elt_at_index(), for an element of a vector, with predictably interesting result. VPP-991 has uncovered the issue. Change-Id: Ifd3fb70332d3fdd1c4ff8570372f394913f7b6c8 Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2017-09-25Fix usage string for vatJerome Tollet1-1/+2
Change-Id: Idad65cbb3765500a66f1097126076a2c5fdb4f1b Signed-off-by: Jerome Tollet <jtollet@cisco.com>
2017-09-25Fix sending GARP/NA on Bonded Interface Active/Backup Link Up/DownJohn Lo5-72/+101
For bonded interface in Active/Backup mode (mode 1), we need to send a GARP/NA packet, if IP address is present, on slave link state change to up or down to help with route convergence. The callback from DPDK happens in a separate thread so we need to make sure RPC call is used to signal the send_garp_na process in the main thread. Also need to fix DPDK polling so the slave links are not polled. Change-Id: If5fd8ea2d28c54dd28726ac403ad366386ce9651 Signed-off-by: John Lo <loj@cisco.com>
2017-09-25Add binary API documentationDave Barach4-71/+475
Change-Id: Id1a5da12b13d87bacfa81094f471b95db40c39be Signed-off-by: Dave Barach <dave@barachs.net>
2017-09-25NAT: session number limitation to avoid running out of memory crash (VPP-984)Matus Fabian4-38/+120
Change-Id: I7f18f8c4ba609d96950dc1f833feb967d4a099b7 Signed-off-by: Matus Fabian <matfabia@cisco.com>
2017-09-23libmemif: Jumbo frames data/buffer length fixJakub Grajciar2-83/+195
Change-Id: Icadf1c28b4ab712a210a8e037200ab29d6c53fe4 Signed-off-by: Jakub Grajciar <grajciar.jakub@gmail.com>
2017-09-22openSUSE build fixMarco Varlese2-4/+6
* Fixed package dependency * Fixed bash unary operation error Change-Id: I782dda8ffd807931241fa6034c110f5fedbeca8e Signed-off-by: Marco Varlese <marco.varlese@suse.com>