Age | Commit message (Collapse) | Author | Files | Lines |
|
Type: refactor
Switch from a wrapped byte space to a "continuous" one wherein fifo
chunks are appended to the fifo as more data is enqueued and chunks are
removed as data is dequeued.
The fifo is still subject to a maximum size, i.e., maximum number of
bytes that can be enqueued, so the max number of chunks associated to
the fifo is also constrained.
When enqueueing data, which must fit within the available free space, if
not enough "supporting" chunk memory is available, the fifo asks the
fifo segment for enough chunk memory to ensure that the write can
succeed. To avoid allocating large amounts of small chunks due to small
writes, if possible, the size of the chunks requested is lower capped by
min_alloc.
When dequeuing data, all the chunks that have been completely drained,
i.e., head moved beyond the chunks’ end bytes, are unlinked from the
fifo and returned to the fifo segment. The one exception to this is the
last chunk which is never unlinked.
Change-Id: I98c1dbd9135fb79650365c7e40c29238b96cd4ee
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Avoid tracking with rbtrees all of the chunks associated to a fifo.
Instead, only track chunks when doing out-of-order operations (peek or
ooo enqueue).
Type: refactor
Change-Id: I9f8bd266211746637d98e6a12ffc4b2d6346950a
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Type: fix
- when the fifo is wrapped, and if applicable, insert a new chunk after
the tail-chunk and rebuild the rb_tree.
- make sure that this new algorithm can be applied only when the fifo is
used by a single thread (master-thread of the fifo).
Signed-off-by: Ryujiro Shibuya <ryujiro.shibuya@owmobility.com>
Change-Id: I3fc187bc496ea537ca24381e4abc08d2906c9e03
|
|
Type: fix
Change-Id: Ia362ad821db1fd506e973e1844cc3ec74703cc17
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Type:fix
Change-Id: I39c339abf1b51105ef1bcf3d6f0f4f6ded54f32d
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Type: fix
- make sure that chunks and the rbtree are initialized if fifo segment
allocates multiple chunks for the fifo.
- ensure head/tail chunks are updated on all enqueue/dequeue events,
including when dropping data.
- more unit tests
Also fixes dequeue drop updates of head chunk.
Change-Id: I77f3550bc4e8b4e077f80ea87fe82b83ed013aeb
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Type:fix
Change-Id: I8405bf8d93b4468c54f4f3c5dcd21ef91a6b1048
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Type: fix
This was causing issues in QUIC when an app client & the protocol
app compete for the worker msg_queue. Might not be ideal performance-
wise.
Change-Id: I629892253d5b5d968f31ad1d56f18463e143d6b4
Signed-off-by: Nathan Skrzypczak <nathan.skrzypczak@gmail.com>
|
|
Default chunk is no longer embedded into the fifo and on free is
returned to its respective chunk list.
Change-Id: Ifc5d214eaa6eca44356eb79dd75650fb8569113f
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ie519683bb90aae6fb95f2a09e251cded1890ed41
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
As opposed to growing, this is not a bulk operation, instead dependent
on how the producer/consumer advance head and tail, the fifo will shrink
in one or multiple steps.
Only once the fifo's nitems and size are reduced to their appropriate
values, equal or larger to what was requested, can the fifo chunks be
collected by the owner. Chunk collection must be done with the segment
heap pushed.
Change-Id: Iae407ccf48d85320aa3c1e0304df56c5972c88c1
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
If head/tail are stored as "absolute" values that are normalized to [0,
fifo_size] interval, when fifo is shrunk/grown the consumer and producer
have to independently update to the new fifo size and fix head and tail,
respectively.
If the head and tail are stored as normalized values, under the right
conditions, they don't need to be fixed when fifo size changes.
This reverts one of the changes in gerrit 18223.
Change-Id: I55a908828afe90925cf7c20186a940b25e5805f9
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ie76c69641c8598164d0d00fd498018037258fd86
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
These were introduced with the switch to unbound tail/head size, so they
only affect master. Added unit tests to avoid future surprises.
Change-Id: I83b6c9efbe31d8092ba59b8e2ed46f4da97f35db
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ie96706b4d8bcb32d2d5f065bc765f95f4e9369e7
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I984f347fb465c0c405cef668d8690457e81788e2
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: If23a04623a7138c9f6c98ee9ecfa587396618a60
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- make only the chunk copying (memcpy) code march aware
- cleanup dependencies
Change-Id: I369378264cacfcdaf0823353b957876554eaa17c
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ia56cad89b85b7a99ab4bfb85318a45a71381fb53
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Fifos can use multiple memory chunks for simple read/write operations.
Adding/removing chunks after assignment not yet supported.
Change-Id: I2aceab6aea78059d74e0d3a9993c40d5196d077b
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Problems Addressed:
- Contention of cursize by producer and consumer.
- Reduce the no of modulo operations.
Changes:
- Synchronization between producer and consumer changed from cursize
to head and tail indexes
Implications: reduces the usable size of fifo by 1.
- Using weaker memory ordering C++11 atomics to access head and tail
based on producer and consumer role.
- Head and tail indexes are unsigned 32 bit integers. Additions and
subtraction on them are implicit 32 bit Modulo operation.
- Adding weaker memory ordering variants of max_enq, max_deq, is_empty
and is_full Using them appropriately in all places.
Perfomance improvement (iperf3 via Hoststack):
iperf3 Server: Marvell ThunderX2(AArch64) - iperf3 Client: Skylake(x86)
~6%(256 rxd/txd) - ~11%(2048 rxd/txd)
Change-Id: I1d484e000e437430fdd5a819657d1c6b62443018
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
|
|
Change-Id: I33cd6e44d126c73c1f4c16b2041ea607b4d7f39f
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I7fb5402d4a530b5f2ffd9bb5787632099f4b4189
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I03884b6cde9d4c38ae13d1994fd8d37d44016ef0
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Improves TCP iperf3 performance by ~3% on AArch64.
Change-Id: I1e51bd8403ba45ec6af4c2f96b95e884c1ae0d67
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
|
|
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: I91c9d040fc9b9b63f7109eeaac334c47fb1226cf
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
This is first part of addition of atomic macros with only macros for
__sync builtins.
- Based on earlier patch by Damjan (https://gerrit.fd.io/r/#/c/10729/)
Additionally
- clib_atomic_release macro added and used in the absence
of any memory barrier.
- clib_atomic_bool_cmp_and_swap added
Change-Id: Ie4e48c1e184a652018d1d0d87c4be80ddd180a3b
Original-patch-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
Reviewed-by: Steve Capper <steve.capper@arm.com>
|
|
Change-Id: I0d42a0c71fea7dd669fb1fe5ded7e6e944245c7d
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ifa4fceef7edbe43d444790a624957db0817064de
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I6a4335654882a2ca66d3d465e35e350868242b8d
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I4bd9c9f73499711e04b38d53daa5c917a4285bf5
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Have vcl poll and wait on the event message queues as opposed to
constantly polling the session fifos. This also adds event signaling to
cut through sessions.
On the downside, because we can't wait on multiple condvars, i.e., when
we have multiple message queues because of cut-through registrations, we
do timed waits.
Change-Id: I29ade95dba449659fe46008bb1af502276a7c5fd
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- Cleanup session state after last ack and avoid using a cleanup timer.
- Change session cleanup to free the session as opposed to waiting for
delete notify.
- When in close-wait, postpone sending the fin on close until all
outstanding data has been sent.
- Don't flush rx fifo unless in closed state
Change-Id: Ic2a4f0d5568b65c83f4b55b6c469a7b24b947f39
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
If the caller is the session owning thread or the main thread with a
worker barrier sync (cli/api) add an event to the pending disconnects
vector in the session node and entirely avoid using the event queue.
Useful for bursts of disconnects (like an app detach).
If disconnects come from a processes, be willing to retry enqueueing the
disconnect to the event queue multiple times.
Change-Id: Ieece1f1091b713f94c41c703b6e805bc8498816a
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- rework the function to declutter and avoid building more than one tx
frame
- add dual loop although benefits in my tests seem to be minimal
- improve tcp/udp echo external apps. They have slightly better
throughput than internal echo apps.
- udp bugfixes
Change-Id: Iea4a245b1b1bb407a7f403dedcce2664a49f774b
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- adds session layer support for datagram based protocols
- updates udp to work in pure connectionless and datagram mode. The
existing connected mode is now 'accessible' for apps as a dummy UDPC,
as in, connected udp, protocol.
- updates udp_echo, echo client, echo server code to work in datagram
mode.
Change-Id: I2960c0d2d246cb166005f545794ec31fe0d546dd
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
It consists of two main parts. First, add an application transport type
whereby applications can offer transport to other applications. For
instance, a tls app can offer transport services to other applications.
And second, a tls transport app that leverages the mbedtls library for
tls protocol implementation.
Change-Id: I616996c6e6539a9e2368fab8a1ac874d7c5d9838
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I25238debb7081b4467aec4620dfdef33fbef3295
Signed-off-by: Chris Luke <chrisy@flirble.org>
|
|
- set session state as closed on session manager delete
- enable retransmit as opposed to persist timer after persist timer completes
- properly discard buffer chain bytes when new data overlaps ooo
segments
- don't use rxt bytes in snd space estimate used on tx path
Change-Id: Id9cab686e532e5fe70c775d5440260e8eb890a9f
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- Round up requested fifo size to the next power of two
- Maintain per-segment power-of-two freelists
- Allocate fifos in chunks, to amortize alignment overhead
- Detach builtin test client application after each run
so we can use different fifo sizes each time
- Be more suspicious of session / application indices
Useful prep work for dynamically resizing fifos. As far as the svm
fifo code is concerned, it's OK to set fifo->nitems anywhere in the
interval: [0, 1<<(fifo->freelist_index) + FIFO_SEGMENT_MIN_FIFO_SIZE]
It's unlikely that setting nitems below the path MTU will work out
very well...
Change-Id: Idad73a027dfb7412056cb02988b77e300fa7e8a7
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
- Clean up internal API client registration
- Add proxy server
- Add a reference count to the svm fifo
Change-Id: I5ace1c85497062ed412d26ae76a9e6741af1e984
Signed-off-by: Dave Barach <dave@barachs.net>
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- Fix rx sack option parsing
- Add session sack scoreboard tracing and replaying
- Add svm fifo tracing and replaying
- Scoreboard/svm fifo ooo segment reception fixes
- Improved overall debugging
Change-Id: Ieae07eba355e66f5935253232bb00f2dfb7ece00
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- Data structure preallocation.
- Input state machine fixes for mid-stream 3-way handshake retries.
- Batch connections in the builtin_client
- Multiple private fifo segment support
- Fix elog simultaneous event type registration
- Fix sacks when segment hole is added after highest sacked
- Add "accepting" session state for sessions pending accept
- Add ssvm non-recursive locking
- Estimate RTT for syn-ack
- Don't init fifo pointers. We're using relative offsets for ooo
segments
- CLI to dump individual session
Change-Id: Ie0598563fd246537bafba4feed7985478ea1d415
Signed-off-by: Dave Barach <dbarach@cisco.com>
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- multiarch on svm fifo
- avoid ip lookup on tx
Change-Id: Iab0d85204a710979417bca1d692cc47877131203
Signed-off-by: Florin Coras <fcoras@cisco.com>
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
- limit minimum rto per connection
- cleanup sack scoreboard
- switched svm fifo out-of-order data handling from absolute offsets to
relative offsets.
- improve cwnd handling when using sacks
- add cc event debug stats
- improved uri tcp test client/server: bugfixes and added half-duplex mode
- expanded builtin client/server
- updated uri socket client/server code to work in half-duplex
- ensure session node unsets fifo event for empty fifo
- fix session detach
Change-Id: Ia446972340e32a65e0694ee2844355167d0c170d
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- refactor existing congestion control code (RFC 6582/5681). Handling of ack
feedback now consists of: ack parsing, cc event detection, event handling,
congestion control update
- extend sack scoreboard to support sack based retransmissions
- basic implementation of Eifel detection algorithm (RFC 3522) for
detecting spurious retransmissions
- actually initialize the per-thread frame freelist hash tables
- increase worker stack size to 2mb
- fix session queue node out-of-buffer handling
- ensure that the local buffer cache vec_len matches reality
- avoid 2x spurious event requeues when short of buffers
- count out-of-buffer events
- make the builtin server thread-safe
- fix bihash template threading issue: need to paint -1 across uninitialized
working_copy_length vector elements (via rebase from master)
Change-Id: I646cb9f1add9a67d08f4a87badbcb117980ebfc4
Signed-off-by: Florin Coras <fcoras@cisco.com>
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
Also improves builtin client code.
Change-Id: I8bca1aa632028f95c373726efb0abf2ee0eff414
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
- Improve svm fifo handling of out-of-order segments
- Ensure tsval_recent is updated only if rcv_las falls withing the
segments's sequence space
- Avoid directly dropping old ACKs
- Improve debugging
Change-Id: I88dbe2394a0ad7eb389a4cc12d013a13733953aa
Signed-off-by: Florin Coras <fcoras@cisco.com>
|