path: root/src/vnet/devices
Age | Commit message | Author | Files | Lines
2019-07-31 | devices interface tests: vhost GSO support | Steven Luong | 7 | -19/+296

Add gso option in create vhost interface to support GSO and checksum
offload.

Tested with the following startup options in qemu:
csum=on,gso=on,guest_csum=on,guest_tso4=on,guest_tso6=on,guest_ufo=on,
host_tso4=on,host_tso6=on,host_ufo=on

Type: feature
Change-Id: I9ba1ee33677a694c4a0dfe66e745b098995902b8
Signed-off-by: Steven Luong <sluong@cisco.com>
2019-07-30 | tap: fix segv when host-if-name is not given | Mohsin Kazmi | 1 | -8/+10

Type: fix
Fixes: c30d87e6139c64eceade54972715b402c625763d
Change-Id: I86b606b18ff6a30709b7aff089fd5dd00103bd7f
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-07-24 | tap: print the interface name on cli when created | Mohsin Kazmi | 2 | -0/+6

Type: feature
Change-Id: If11f00574322c35c1780c31d5f7b47d30e083e35
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-07-23 | api: binary api cleanup | Dave Barach | 4 | -10/+18

Multiple API message handlers call vnet_get_sup_hw_interface(...) without
checking the inbound sw_if_index. This can cause a pool_elt_at_index
ASSERT in a debug image, and major disorder in a production image.

Given that a number of places are coded as follows, add an
"api_visible_or_null" variant of vnet_get_sup_hw_interface, which returns
NULL given an invalid sw_if_index, or a hidden sw interface:

  - hw = vnet_get_sup_hw_interface (vnm, sw_if_index);
  + hw = vnet_get_sup_hw_interface_api_visible_or_null (vnm, sw_if_index);
    if (hw == NULL || memif_device_class.index != hw->dev_class_index)
      return clib_error_return (0, "not a memif interface");

Rename two existing xxx_safe functions -> xxx_or_null to make it obvious
what they return.

Type: fix
Change-Id: I29996e8d0768fd9e0c5495bd91ff8bedcf2c5697
Signed-off-by: Dave Barach <dave@barachs.net>
2019-07-23 | devices: vhost handling VHOST_USER_SET_FEATURES | Steven Luong | 1 | -0/+1

Some combinations of new qemu (2.11) and old dpdk (16.10) may send
VHOST_USER_SET_FEATURES at the end of the protocol exchange, by which time
the vhost interface has already been declared up and ready. Unfortunately,
processing VHOST_USER_SET_FEATURES causes the interface to go down. It is
not clear whether that is correct or needed. Because there are no
additional messages thereafter, the hardware interface stays down.

The fix is to check the interface again at the end of processing
VHOST_USER_SET_FEATURES. If it is up and ready, we bring back the hardware
interface.

Type: fix
Change-Id: I490cd03820deacbd8b44d8f2cb38c26349dbe3b2
Signed-off-by: Steven Luong <sluong@cisco.com>
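A minimal sketch of the re-check described above. vnet_hw_interface_set_flags
and VNET_HW_INTERFACE_FLAG_LINK_UP are real VPP APIs; the vui->is_ready and
vui->admin_up fields are illustrative names, not the literal patch:

  /* At the end of VHOST_USER_SET_FEATURES processing: if the interface
     is still up and ready, restore the hw link that message handling
     took down (field names illustrative). */
  if (vui->is_ready && vui->admin_up)
    vnet_hw_interface_set_flags (vnm, vui->hw_if_index,
                                 VNET_HW_INTERFACE_FLAG_LINK_UP);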
2019-07-18 | tap: fix memory errors with create/delete API | Benoît Ganne | 1 | -7/+9

CLI allocates vectors consumed by tap_create_if(), whereas the API passes
null-terminated C-strings allocated on the API segment. Do not try to be
too clever here, and just allocate our own private copies.

Type: fix
Fixes: 8d879e1a6bac47240a232893e914815f781fd4bf
Ticket: VPP-1724
Change-Id: I3ccdb8e0fcd4cb9be414af9f38cf6c33931a1db7
Signed-off-by: Benoît Ganne <bganne@cisco.com>
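A minimal sketch of the private-copy approach. Building a vector from a
C-string with format() is the standard VPP idiom; the mp->host_if_name
field name is illustrative:

  /* API handler: duplicate the caller-owned C-string into a vector the
     tap layer owns (the CLI path hands us a vector we can copy the same
     way). */
  args.host_if_name = format (0, "%s", mp->host_if_name);

  /* Delete path: free what we allocated. */
  vec_free (args.host_if_name);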
2019-07-18 | vlib: convert frame_index into real pointers | Andreas Schultz | 1 | -1/+1

The fast path almost always has to deal with the real pointers. Deriving
the frame pointer from a frame_index requires a load of the 32bit
frame_index from memory, another 64bit load of the heap base pointer and
some calculations. Let's store the full pointer instead and do a single
64bit load only.

This helps avoiding problems when the heap is grown and frames are
allocated below vm->heap_aligned_base.

Type: refactor
Change-Id: Ifa6e6e984aafe1e2755bff80f0a4dfcddee3623c
Signed-off-by: Andreas Schultz <andreas.schultz@travelping.com>
Signed-off-by: Dave Barach <dave@barachs.net>
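A rough sketch of the before/after access pattern; the FRAME_ALIGN
constant and the nf field names are illustrative approximations, not the
literal VPP code:

  /* Before (illustrative): every access re-derives the pointer from a
     32-bit index -- a load of the index, a load of the heap base, and
     some arithmetic. */
  vlib_frame_t *f = (vlib_frame_t *)
    (vm->heap_aligned_base + (uword) nf->frame_index * FRAME_ALIGN);

  /* After (illustrative): the full pointer is stored, so a single
     64-bit load suffices, and frames allocated below
     vm->heap_aligned_base are no longer a problem. */
  vlib_frame_t *f = nf->frame;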
2019-06-29 | devices: virtio pci leaking spinlock | Steven Luong | 1 | -0/+1

Memory is dirt cheap. But there is no need to throw it away.

Type: fix
Change-Id: I155130ab3c435b1c04d7c0e9f54795b8de9383d9
Signed-off-by: Steven Luong <sluong@cisco.com>
2019-06-28 | tap: fix memory errors in create/delete | Benoît Ganne | 1 | -1/+5

If the host interface name is not specified at creation, host_if_name was
wrongly set to a stack-allocated variable. Make sure it always points to a
heap-allocated vector. At deletion time, we must free all allocated
vectors.

Type: fix
Change-Id: I17751f38e95097998d51225fdccbf3ce3c365593
Signed-off-by: Benoît Ganne <bganne@cisco.com>
2019-06-20 | tap: fix the total length of packet for stats byte | Mohsin Kazmi | 1 | -3/+3

Type: fix
Fixes: 8389fb9
Change-Id: I31076db78507736631609146d4cca28597aca704
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-06-20 | tap: add support to configure tap interface host MTU size | Mohsin Kazmi | 8 | -2/+78

This patch adds support to configure the host MTU size using the API, CLI,
or startup.conf.

Type: feature
Change-Id: I8ab087d82dbe7dedc498825c1a3ea3fcb2cce030
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-05-28 | tap: crash in multi-thread environment | Mohsin Kazmi | 3 | -5/+4

In the tap tx routine, virtio_interface_tx_inline, there used to be an
interface spinlock to ensure packets are processed in an orderly fashion:

  clib_spinlock_lock_if_init (&vif->lockp);

When virtio code was introduced in 19.04, that line was changed to

  clib_spinlock_lock_if_init (&vring->lockp);

to accommodate multi-queues. Unfortunately, although the spinlock exists
in the vring, it was never initialized for tap, only for virtio. As a
result, many nasty things can happen when running a tap interface in a
multi-thread environment. Crash is inevitable.

The fix is to initialize vring->lockp for tap and remove vif->lockp as it
is not used anymore.

Change-Id: I82b15d3e9b0fb6add9b9ac49bf602a538946634a
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
(cherry picked from commit c2c89782d34df0dc7197b18b042b4c2464a101ef)
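A minimal sketch of why the bug was silent and what the fix does;
clib_spinlock_init and the *_if_init lock/unlock helpers are real vppinfra
APIs:

  /* At vring setup time -- the piece that was missing for tap: */
  clib_spinlock_init (&vring->lockp);

  /* In tx: the *_if_init variants are deliberate no-ops when the lock
     was never initialized, so the missing init silently removed all
     mutual exclusion instead of failing loudly. */
  clib_spinlock_lock_if_init (&vring->lockp);
  /* ... enqueue descriptors ... */
  clib_spinlock_unlock_if_init (&vring->lockp);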
2019-05-27 | virtio: Add gso support for native virtio driver | Mohsin Kazmi | 5 | -7/+77

Change-Id: I7b735f5a540e8c278bac88245acb3f8c041c49c0
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-05-24 | Tap: Fix the indirect buffers allocation VPP-1660 | Mohsin Kazmi | 4 | -48/+61

Indirect buffers are used to store indirect descriptors to xmit big
packets. This patch moves the indirect buffer allocation from interface
creation to the device node. Now it allocates or deallocates buffers
during tx for chained buffers.

Change-Id: I55cec208a2a7432e12fe9254a7f8ef84a9302bd5
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
(cherry picked from commit 55203e745f5e3f1f6c4dbe99d6eab8dee4d13ea6)
2019-05-16 | init / exit function ordering | Dave Barach | 1 | -6/+6

The vlib init function subsystem now supports a mix of procedural and
formally-specified ordering constraints. We should eliminate procedural
knowledge wherever possible.

The following schemes are *roughly* equivalent:

  static clib_error_t *init_runs_first (vlib_main_t *vm)
  {
    clib_error_t *error;

    ... do some stuff...

    if ((error = vlib_call_init_function (init_runs_next)))
      return error;
    ...
  }
  VLIB_INIT_FUNCTION (init_runs_first);

and

  static clib_error_t *init_runs_first (vlib_main_t *vm)
  {
    ... do some stuff...
  }
  VLIB_INIT_FUNCTION (init_runs_first) =
  {
    .runs_before = VLIB_INITS("init_runs_next"),
  };

The first form will [most likely] call "init_runs_next" on the spot. The
second form means that "init_runs_first" runs before "init_runs_next,"
possibly much earlier in the sequence.

Please DO NOT construct sets of init functions where A before B actually
means A *right before* B. It's not necessary - simply combine A and B -
and it leads to hugely annoying debugging exercises when trying to switch
from ad-hoc procedural ordering constraints to formal ordering
constraints.

Change-Id: I5e4353503bf43b4acb11a45fb33c79a5ade8426c
Signed-off-by: Dave Barach <dave@barachs.net>
2019-05-07 | Fix af_packet issues | jackiechen1985 | 1 | -31/+57

1. Fix af_packet memory leak;
2. Fix close socket twice;
3. Adjust debug log for syscall;
4. Adjust dhcp client output log;

Change-Id: I96bfaef16c4fad80c5da0d9ac602f911fee1670d
Signed-off-by: jackiechen1985 <xiaobo.chen@tieto.com>
2019-05-06 | virtio: refactor ctrl queue support | Mohsin Kazmi | 1 | -22/+32

Change-Id: Ifb16351f39e5eb2cd154e70a1c96243e4842e80d
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-05-01 | virtio: Fix virtio buffer allocation | Mohsin Kazmi | 1 | -1/+1

Change-Id: I0ffb468aef56f5fd223218a83425771595863666
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-05-01 | virtio: remove configurable queue size support | Mohsin Kazmi | 5 | -41/+27

A native virtio device through the legacy driver can't support a
configurable queue size.

Change-Id: I76c446a071bef8a469873010325d830586aa84bd
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-04-25 | tap: Fix the indirect buffer allocation | Mohsin Kazmi | 1 | -1/+1

Change-Id: I73f76c25754f6fb14a49ae47b6404f3cbabbeeb5
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-04-17 | tap: clean-up when linux will delete the tap interface | Mohsin Kazmi | 2 | -0/+43

When a container which has a tap interface attached is deleted, Linux also
deletes the tap interface, leaving the VPP side of the tap behind. This
patch does a clean-up job to remove that VPP side of the tap interface.

To reproduce the behavior:

In VPP:
  create tap
On linux:
  sudo ip netns add ns1
  sudo ip link set dev tap0 netns ns1
  sudo ip netns del ns1

Change-Id: Iaed1700073a9dc64e626c1d0c449f466c143f3ae
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-04-15 | tap: fix the crash | Mohsin Kazmi | 1 | -0/+3

A crash will happen when someone tries to set up a tap interface in the
host namespace without providing a custom name for the host side of the
tap interface. This patch fixes the problem by using the default name in
this case.

Change-Id: Ic1eaea5abd01bc6c766d0e0fcacae29ab7a7ec45
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-04-10 | API: Fix shared memory only action handlers. | Ole Troan | 3 | -70/+1

Some API action handlers called vl_msg_api_send_shmem() directly. That
breaks Unix domain socket API transport. A couple (bond / vhost) also
tried to send a sw_interface_event directly, but did not send the message
to all that had registered interest. That scheme never worked correctly.
Refactored and improved the interface event code.

Change-Id: Idb90edfd8703c6ae593b36b4eeb4d3ed7da5c808
Signed-off-by: Ole Troan <ot@cisco.com>
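For reference, a minimal sketch of the transport-agnostic reply pattern
that replaces direct shared-memory sends. The registration lookup and send
calls are standard VPP API machinery; the rmp reply variable is
illustrative:

  /* Look up the client's registration, whichever transport it uses. */
  vl_api_registration_t *reg =
    vl_api_client_index_to_registration (mp->client_index);
  if (!reg)
    return;                     /* client is gone */
  /* Send via the registration instead of writing to shared memory. */
  vl_api_send_msg (reg, (u8 *) rmp);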
2019-04-08 | fixing typos | Jim Thompson | 4 | -4/+4

Change-Id: I215e1e0208a073db80ec6f87695d734cf40fabe3
Signed-off-by: Jim Thompson <jim@netgate.com>
2019-04-08 | virtio: Fix the coverity warnings | Mohsin Kazmi | 1 | -6/+11

Change-Id: I7c6e4bf2abf08193e54a736510c07eeacd6aebe7
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-04-08 | minor spelling errors (both in comments) | Jim Thompson | 1 | -1/+1

Change-Id: I9282a838738d0ba54255bef347abf4735be29820
Signed-off-by: Jim Thompson <jim@netgate.com>
2019-04-06 | Pipe: fix double count on TX (TX counting is done in interface-output) | Neale Ranns | 1 | -11/+1

Change-Id: I550313a36ae02eb3faa2f1a5e3614f55275a00cf
Signed-off-by: Neale Ranns <nranns@cisco.com>
2019-04-03 | virtio: Add support for multiqueue | Mohsin Kazmi | 7 | -88/+493

Change-Id: Id71ffa77e977651f219ac09d1feef334851209e1
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-03-28 | Add RDMA ibverb driver plugin | Benoît Ganne | 1 | -10/+2

RDMA ibverb is a userspace API to efficiently rx/tx packets. This is an
initial, unoptimized driver targeting Mellanox cards. Next steps should
include batching, multiqueue and additional cards.

Change-Id: I0309c7a543f75f2f9317eaf63ca502ac7a093ef9
Signed-off-by: Benoît Ganne <bganne@cisco.com>
2019-03-28 | Typos. A bunch of typos I've been collecting. | Paul Vinciguerra | 1 | -1/+1

Change-Id: I53ab8d17914e6563110354e4052109ac02bf8f3b
Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
2019-03-15Revert "API: Cleanup APIs interface.api"Ole Trøan3-3/+6
This reverts commit e63325e3ca03c847963863446345e6c80a2c0cfd. Allow time for CSIT to accommodate. Change-Id: I59435e4ab5e05e36a2796c3bf44889b5d4823cc2 Signed-off-by: ot@cisco.com
2019-03-15 | API: Cleanup APIs interface.api | Jakub Grajciar | 3 | -6/+3

Use of consistent API types for interface.api.

Change-Id: Ieb54cebb4ac96b432a3f0b41596718aa2f34885b
Signed-off-by: Jakub Grajciar <jgrajcia@cisco.com>
2019-03-13 | vhost-user: restart vpp may cause vhost to crash | Steven Luong | 1 | -1/+1

Fix a typo in vhost_user_rx_discard_packet which may cause
txvq->last_avail_idx to go wild.

Change-Id: Ifaeb58835dff9b7ea82c061442722f1dcaa5d9a4
Signed-off-by: Steven Luong <sluong@cisco.com>
(cherry picked from commit 39382976701926c1f34191c1311829c15a53cb01)
2019-03-13 | deprecate VLIB_DEVICE_TX_FUNCTION_MULTIARCH | Filip Tehlar | 3 | -21/+15

Change-Id: I8819bcb9e228e7a432f4a7b67b6107f984927cd4
Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
2019-03-04 | devices: migrate old MULTIARCH macros to VLIB_NODE_FN | Filip Tehlar | 3 | -18/+9

Change-Id: I911fb3f1c6351b37580c5dbde6939a549431a92d
Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
2019-02-23 | vhost: potential crash in map_guest_mem using debug image | Steven Luong | 1 | -1/+14

map_guest_mem may be called from a worker thread / the dataplane. It has a
call to vlib_log and may crash inside vlib_log's ASSERT statement:

  /* make sure we are running on the main thread to avoid use in
     dataplane code, for dataplane logging consider use of event-logger */
  ASSERT (vlib_get_thread_index () == 0);

The fix is to convert the vlib_log call in map_guest_mem to the event
logger.

Change-Id: Iaaf6d86782aa8a18d25e0209f22dc31f04668d56
Signed-off-by: Steven Luong <sluong@cisco.com>
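A minimal sketch of the event-logger substitution, using the usual
declare-and-record elog idiom from vppinfra; the message text and data
layout are illustrative, not the literal patch:

  /* Dataplane-safe: record an event instead of calling vlib_log(). */
  ELOG_TYPE_DECLARE (el) =
  {
    .format = "map_guest_mem: map failed, addr %lx",
    .format_args = "i8",
  };
  struct { u64 addr; } *ed;
  ed = ELOG_DATA (&vlib_global_main.elog_main, el);
  ed->addr = (u64) addr;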
2019-02-22 | tapv2: coverity strikes back | Steven Luong | 1 | -5/+4

While https://gerrit.fd.io/r/#/c/16590/ fixed the leaked fd which coverity
reported at that time, a new coverity run reports a similar leaked fd in a
different goto punt path. It would be nice if coverity reported both of
them at the same time. Or perhaps it did and I just missed it.

Anyway, the new fix is to put the close (fd) statement prior to the return
of the tap_create_if routine, which should catch all goto's.

Change-Id: I0a51ed3710e32d5d74c9cd9b5066a667153e2f9d
Signed-off-by: Steven Luong <sluong@cisco.com>
2019-02-22 | Add no-append flag to vlib_frame_t | Damjan Marion | 1 | -0/+1

Change-Id: I01c4f5755d579282773ac227b0bc24f8ddbb2bd1
Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-02-21 | vhost: VPP stalls with vhost performing control plane actions | Steven Luong | 3 | -214/+259

Symptom
-------
With NDR traffic blasting at VPP, bringing up a new VM with a vhost
connection to VPP causes packet drops. I am able to recreate this problem
easily using a simple setup like this:

  TREX -------------- switch ---- VPP

Cause
-----
The reason for the packet drops is vhost holding onto the worker barrier
lock for too long in vhost_user_socket_read(). There are quite a few
system calls inside the routine. At the end of the routine, it
unconditionally calls vhost_user_update_iface_state() for all message
types. vhost_user_update_iface_state() also unconditionally calls
vhost_user_rx_thread_placement() and vhost_user_tx_thread_placement().
vhost_user_rx_thread_placement() scraps all existing cpu/queue mappings
for the interface and creates brand new cpu/queue mappings. This process
is very disruptive and very expensive. In my opinion, this area of code
needs a makeover.

Fixes
-----
* vhost_user_socket_read() is rewritten so that it does not hold onto the
  worker barrier lock for system calls, or at least minimizes the need to
  do so.
* Remove the call to vhost_user_update_iface_state() as a default route at
  the end of vhost_user_socket_read(). Only a couple of message types
  really need to call vhost_user_update_iface_state(); we add the call to
  just those message types.
* Remove vhost_user_rx_thread_placement() and
  vhost_user_tx_thread_placement() from vhost_user_update_iface_state().
  There is no need to repetitively change the cpu/queue mappings.
* vhost_user_rx_thread_placement() is actually quite expensive. It should
  be called only once per queue for the interface. There is no need to
  scrap the existing cpu/queue mappings and create new ones when
  additional queues become active/enabled.
* Create the cpu/queue mappings for the first RX when the interface is
  created. Don't remove the cpu/queue mapping when the interface is
  disconnected. Remove the cpu/queue mapping only when the interface is
  deleted.

The create vhost user interface CLI also has some very expensive system
calls if the command is entered with the optional keyword "server".

As a bonus, this patch makes the create vhost user interface binary-api
and CLI thread safe. It protects the small amount of code which is thread
unsafe.

Change-Id: I4a19cbf7e9cc37ea01286169882e5603e6d7eb77
Signed-off-by: Steven Luong <sluong@cisco.com>
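A minimal sketch of the barrier discipline described under "Fixes".
vlib_worker_thread_barrier_sync/release are the real VPP primitives and
vhost_user_update_iface_state is named in the commit; the surrounding
control flow is simplified:

  /* Blocking syscalls run without the barrier held... */
  n = recvmsg (uf->file_descriptor, &mh, 0);
  /* ... parse the message ... */

  /* ...and the barrier protects only the short shared-state update,
     and only for the message types that actually need it. */
  vlib_worker_thread_barrier_sync (vm);
  vhost_user_update_iface_state (vui);
  vlib_worker_thread_barrier_release (vm);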
2019-02-19 | tap gso: experimental support | Andrew Yourtchenko | 7 | -8/+248

This commit adds a "gso" parameter to the existing "create tap..." CLI,
and a "no-gso" parameter for compatibility with the future, when/if
defaults change.

It makes use of the lowest bit of the "tap_flags" field in the API call in
order to allow creation of GSO interfaces via API as well.

It does the necessary syscalls to enable the GSO and checksum offload
support on the kernel side and sets two flags on the interface: the
virtio-specific virtio_if_t.gso_enabled, and vnet_hw_interface_t.flags &
VNET_HW_INTERFACE_FLAG_SUPPORTS_GSO.

The first one, if enabled, triggers the marking of GSO-encapsulated
packets on ingress with the VNET_BUFFER_F_GSO flag, and setting
vnet_buffer2(b)->gso_size to the desired L4 payload size.

VNET_HW_INTERFACE_FLAG_SUPPORTS_GSO determines the egress packet
processing in interface-output for such packets: when the flag is set,
they are sent out almost as usual (just taking care to set the vnet header
for virtio). When the flag is not enabled (the case for most interfaces),
the egress path performs the re-segmentation such that the L4 payload of
the transmitted packets equals gso_size.

The operations in the datapath are enabled only when there is at least one
GSO-compatible interface in the system - this is done by tracking the
count in interface_main.gso_interface_count. This way the impact of
conditional checks for setups that do not use GSO is minimized.

"show tap" CLI shows the state of the GSO flag on the interface, and the
total count of GSO-enabled interfaces (which is used to enable the
GSO-related processing in the packet path).

This commit lacks IPv6 extension header traversal support of any kind -
the L4 payload is assumed to follow the IPv6 header. Also it performs the
offloads only for TCP (TSO - TCP segmentation offload). The UDP
fragmentation offload (UFO) is not part of it.

For debug purposes it also adds the debug CLI:
"set tap gso {<interface> | sw_if_index <sw_idx>} <enable|disable>"

Change-Id: Ifd562db89adcc2208094b3d1032cee8c307aaef9
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
2019-02-14 | Add -fno-common compile option | Benoît Ganne | 3 | -3/+3

-fno-common makes sure we do not have multiple declarations of the same
global symbol across compilation units. It helps debug nasty linkage bugs
by guaranteeing that all references to a global symbol use the same
underlying object. It also helps avoid benign mistakes such as declaring
enums as global objects instead of types in headers (hence the minor fixes
scattered across the source).

Change-Id: I55c16406dc54ff8a6860238b90ca990fa6b179f1
Signed-off-by: Benoît Ganne <bganne@cisco.com>
2019-02-09 | buffers: fix typo | Damjan Marion | 4 | -7/+6

Change-Id: I4e836244409c98739a13092ee252542a2c5fe259
Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-02-06 | buffers: make buffer data size configurable from startup config | Damjan Marion | 4 | -5/+6

Example:

  buffers { default data-size 1536 }

Change-Id: I5b4436850ca18025c9fdcfc7ed648c2c2732d660
Signed-off-by: Damjan Marion <damarion@cisco.com>
2019-02-06 | virtio: enable msix interrupt mode | Mohsin Kazmi | 4 | -44/+114

Change-Id: Idd560f3afde1dd03bc3d6fbb2070096146865f50
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-02-06 | virtio: Use new buffer optimization | Mohsin Kazmi | 5 | -2/+24

Change-Id: Ifc98373371b967c49a75989eac415ddda1dcf15f
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-01-30 | virtio: fix the device order (legacy or modern) | Mohsin Kazmi | 1 | -3/+3

Change-Id: I60f88d50f062b004e6dea487bd627d303d0a5e75
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-01-29 | virtio: Support legacy and transitional virtio devices | Mohsin Kazmi | 1 | -3/+11

Change-Id: Ib1316482dd7b1ae3c27c7eeb55839ed8af9ca162
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-01-24 | virtio: Minor fixes and header cleanup | Mohsin Kazmi | 5 | -24/+7

Change-Id: I2e5fd45abcd07e9eda6184587889bdcd9613a159
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-01-23 | virtio: Add support for logging | Mohsin Kazmi | 4 | -31/+97

Change-Id: Ieadf0a97379ed8b17241e454895c4e5e195dc52f
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
2019-01-21 | virtio: Native virtio driver | Mohsin Kazmi | 11 | -173/+2069

Change-Id: Id7fccf2f805e578fb05032aeb2b649a74c3c0e56
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>