Age | Commit message (Collapse) | Author | Files | Lines |
|
When I type in 'quit' on the slave instance, the master instance crashes
on this line.
0: /home/sluong/vpp-master/vpp/build-data/../src/vlib/unix/input.c:200 (linux_epoll_input) assertion `! pool_is_free (um->file_pool, _e)' fails
Aborted (core dumped)
Below is the decode from gdb
line_number=0, fmt=0x7f57af6cc9a0 "%s:%d (%s) assertion `%s' fails")
at /home/sluong/vpp-master/vpp/build-data/../src/vppinfra/error.c:143
vm=0x7f57af8e2400 <vlib_global_main>, node=0x7f576d40ad80, frame=0x0)
at /home/sluong/vpp-master/vpp/build-data/../src/vlib/unix/input.c:200
vm=0x7f57af8e2400 <vlib_global_main>, node=0x7f576d40ad80,
type=VLIB_NODE_TYPE_PRE_INPUT, dispatch_state=VLIB_NODE_STATE_POLLING,
frame=0x0, last_time_stamp=1525665215050617)
at /home/sluong/vpp-master/vpp/build-data/../src/vlib/main.c:1016
vm=0x7f57af8e2400 <vlib_global_main>, is_main=1)
at /home/sluong/vpp-master/vpp/build-data/../src/vlib/main.c:1500
I am able to reproduce the problem consistently with the below procedure.
1. Create 3 memif interfaces between slave and master instances.
2. Type 'quit' on the slave. Neither crashes the first time.
3. Bring back the slave. Type 'quit' on the master. Neither crashes.
4. Bring back the master. Type 'quit' on the slave. The master crashes.
There are two places the interrupt line is disconnected and the unix file is
removed via the call unix_file_del ()
1. memif_int_fd_read_ready ()
2. memif_disconnect () which is called via multiple places in memif.
When the crash happens, the unix file was removed from memif_disconnect ()
via memif_conn_fd_read_ready () with size of the message == 0 in recvmsg ().
It is noted when the unix file was removed from memif_int_fd_read_ready (),
it never crashes. It is a race condition. However, if I follow the
aformentioned procedure, the crash always happens.
The reason the crash happens when memif_disconnect () removes the unix file
is because there may still be pending input in linux_epoll_input (). When
linux_epoll_input () tries to access the unix file via the line 200
unix_file_t *f = pool_elt_at_index (um->file_pool, i);
it crashes.
We could add code in linux_epoll_input () to avoid the crash if the index
for the particular file_pool is already free. Or we could fix memif to not
remove the unix file in memif_conn_fd_read_ready () when recvmsg () got 0
byte and just postpone the unix file deletion in memif_int_fd_read_ready ()
later after linux_epoll_input () got a chance to run to empty the input
stream.
I choose to fix the problem in the latter approach. I split the function
memif_disconnect () into two parts. For the code path which
memif_conn_fd_read_ready () calls memif_disconnect (), it does not remove the
unix file. All other calls to memif_disconnect () will continue to do what
it uses to do to avoid regression.
Please let me know if I should fix the problem other way.
Change-Id: I8efe2a3d24c6581609bc7b6fe82c2b59c22d8e4b
Signed-off-by: Steven <sluong@cisco.com>
|
|
Migrate memif to use vnet device infra APIs. No new function is added.
Change-Id: I70e440d2ae1e673876365041f31fe78997aceecf
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I72298aaae7d172082ece3a8edea4217c11b28d79
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I220b0b95449f24cc547206e38ab8e10019115ec0
Signed-off-by: Milan Lenco <milan.lenco@pantheon.tech>
|
|
This patch deprecates stack-based thread identification,
Also removes requirement that thread stacks are adjacent.
Finally, possibly annoying for some folks, it renames
all occurences of cpu_index and cpu_number with thread
index. Using word "cpu" is misleading here as thread can
be migrated ti different CPU, and also it is not related
to linux cpu index.
Change-Id: I68cdaf661e701d2336fc953dcb9978d10a70f7c1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I935620798d6fe82b99b6bd564749e20a189b4ae3
Signed-off-by: Milan Lenco <milan.lenco@pantheon.tech>
|
|
Change-Id: I844ec53b55ceaa1e00996f5cf8a018537ea8b481
Signed-off-by: Milan Lenco <milan.lenco@pantheon.tech>
|
|
Change-Id: I86089e9bb604adfc260a111685001be1c897ce53
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I94c06b07a39f07ceba87bf3e7fcfc70e43231e8a
Signed-off-by: Damjan Marion <damarion@cisco.com>
Co-Authored-By: Milan Lenco <Milan.Lenco@pantheon.tech>
|