summaryrefslogtreecommitdiffstats
path: root/src/vnet/tcp/tcp.c
AgeCommit message (Collapse)AuthorFilesLines
2018-12-13session/tcp: support tx flush markFlorin Coras1-0/+11
For tcp this means that the last enqueued data goes out with a psh bit set. Change-Id: I29d357ecae6f02e748b59a7b799150ec73d14ba2 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-12-05session/tcp: postpone cleanup on resetFlorin Coras1-1/+7
Change-Id: I45fd7538853f84c6c8bf804cc20acbc9601db3ba Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-14Remove c-11 memcpy checks from perf-critical codeDave Barach1-7/+7
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1 Signed-off-by: Dave Barach <dave@barachs.net>
2018-11-13tcp: cubic fast convergenceFlorin Coras1-0/+25
Change-Id: I3a15960fe346763faf13e8728ce36c2f3bf7b05a Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-12tcp: handle disconnects after enq notificationsFlorin Coras1-0/+2
Make sure that we notify the app of the data enqueued in the burst before notifying of disconnect. Change-Id: I7747a5cbb4c6bc9132007f849c24ce04b7841273 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-09tcp: basic cubic implementationFlorin Coras1-1/+20
Because the code is not optimized, newreno is still the default congestion control algorithm. Change-Id: I7061cc80c5a75fa8e8265901fae4ea2888e35173 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-08tcp: pacer and mrtt estimation improvementsFlorin Coras1-3/+4
- update pacer once per burst - better estimate initial rtt - compute smoothed average for higher precision rtt estimate Change-Id: I06d41a98784cdf861bedfbee2e7d0afc0d0154ef Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-07tcp: consume incoming buffers instead of reusingFlorin Coras1-4/+11
Instead of reusing buffers for acking, consume all buffers and program output for (dup)ack generation. This implicitly fixes the drop counters that were artificially inflated by both data and feedback traffic. Moreover, the patch also significantly reduces the ack traffic as we now only generate an ack per frame, unless duplicate acks need to be sent. Because of the reduced feedback traffic, a sender's rx path and a receiver's tx path are now significantly less loaded. In particular, a sender can overwhelm a 40Gbps NIC and generate tx drop bursts for low rtts. Consequently, tx pacing is now enforced by default. Change-Id: I619c29a8945bf26c093f8f9e197e3c6d5d43868e Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-06tcp: dequeue acked only once per burstFlorin Coras1-3/+5
Avoid dequeuing acked bytes more than once per burst for a connection. Although the fifos do not use locks, size decrements are atomic, so they rely on locked instructions. Change-Id: Id65f4ea40b2c10057461402dfd0393034e6472d5 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-05tcp: send unsent data in fast recoveryFlorin Coras1-32/+19
Allows sending of unsent data in fast recovery and consolidates logic in tcp, instead of splitting it between tcp fast retransmit and tcp output path called by the session layer. Change-Id: I9b12cdf2aa2ac50b9f25e46856fed037163501fe Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-02tcp: coverity fixesFlorin Coras1-7/+7
Change-Id: Ib15d629c5fde7849bfa3307f42659e920eb0f463 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-02session: measure dispatch period only if under loadFlorin Coras1-1/+12
Also reset pacer on tcp retransmit timeout Change-Id: I5a9edee4c00d1d169248d79587a9b10437c2bd87 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-02tcp: minimize use of tlsFlorin Coras1-30/+20
Also propagate tcp worker context instead of retrieving it multiple times. Change-Id: I7b273b981826b37783566d0172a64cd6957f3b33 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-01tcp: fast retransmit pacingFlorin Coras1-4/+12
Force pacing for fast retransmit to avoid bursts of retransmitted packets. Change-Id: I2ff42c328899b36322c4de557b1f7d853dba8fe2 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-28session: extend connect api for internal appsFlorin Coras1-7/+1
Change-Id: Ie4c5cfc4c97acb321a46b4df589dc44de1b616ba Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-25vnet/tcp/tcp.c: address a corner case.Paul Vinciguerra1-4/+7
Avoid possible null pointer dereference Change-Id: If8023edb43aaf037234f4a7b5f191cb23b09c74d Signed-off-by: Paul Vinciguerra <pvinci@vinciconsulting.com>
2018-10-25session/tcp: improve cliFlorin Coras1-37/+46
Change-Id: I91c9d040fc9b9b63f7109eeaac334c47fb1226cf Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-25tcp/session: add tx pacerFlorin Coras1-5/+39
Adds tx pacing infrastructure for transport protocols that want to use it. Particularly useful for connections with non-negligible rtt and constrained network throughput as it avoids large tx bursts that lead to local interface tx or network drops. By default the pacer is disabled. To enabled it for tcp, add tx-pacing to tcp's startup conf. We are still slightly inefficient in the handling of incoming packets in established state so the pacer slightly affect maximum throughput in low lacency scenarios. Change-Id: Id445b2ffcd64cce015f75b773f7d722faa0f7ca9 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-23tcp: fast retransmit improvementsFlorin Coras1-12/+4
Patch is too large to be ported to 18.10 just days before release. - handle fast retransmits outside of established node and limit the retransmit burst size to avoid tx losses and worsening congestion. - in the absance of a tx pacer, use slow start after fast retransmit exists - add fast retransmit heuristic that re-retries sending the first segment if everything else fails - fine tuning Change-Id: I84a2ab8fbba8b97f1d2b26584dc11a1e2c33c8d2 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-23c11 safe string handling supportDave Barach1-11/+11
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab Signed-off-by: Dave Barach <dave@barachs.net>
2018-10-18tcp: fix cleanup of non established connections (VPP-1463)Florin Coras1-1/+4
- fix delete of connection in syn-received - fix delete of half-open connection Change-Id: I72ff4b60406a2762d998328c52f41adea40d2c1b Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-17tcp: avoid sack processing when not needed (VPP-1460)Florin Coras1-2/+4
Change-Id: If81ee34e1f1e929de1a5b758ddb9aede4002e858 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-02tcp: fix close wait timeout with no finFlorin Coras1-0/+4
Change-Id: Icba9b0dc6dcb4b72288f966728201812d8d12144 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-09-27tcp: use scaled window for new connectsFlorin Coras1-3/+9
Change-Id: Idf83fce8ca176e57b323e3741034e3223f1d195a Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-09-25tcp: add option to cfg max rx fifo sizeFlorin Coras1-5/+8
Change-Id: Icff3d688506e7658330db004c58bcfcac273fcec Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-26tcp/session: tx optimizationsFlorin Coras1-1/+1
- cache and reuse tcp options and rcv_wnd for session layer tx bursts - avoid reading/setting total_length_not_including_first_buffer. It's part of a buffer's second cache line so it comes at a "cost". Change-Id: Id18219c2f7e07cf4c63ee74f9cdd9e5918904036 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-26tcp: cleanup functionsFlorin Coras1-37/+69
- sprinkle statics for functions - move some inlines from header files to corresponding .c files - replace some always_inlines with statics where inlining is not performance critical Change-Id: I371dbf63431ce7e27e4ebbbdd844a9546a1f1849 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-20tcp: add per worker ctx structureFlorin Coras1-14/+5
Change-Id: I28d3c31bdc4255a4ca223d80bcf44709fb39f4ed Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-19tcp: optimize tcp outputFlorin Coras1-1/+1
Change-Id: Idf17a0633a1618b12c22b1119e40c2e9d3192df9 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-11tcp: cleanup connection/session fixesFlorin Coras1-13/+16
- Cleanup session state after last ack and avoid using a cleanup timer. - Change session cleanup to free the session as opposed to waiting for delete notify. - When in close-wait, postpone sending the fin on close until all outstanding data has been sent. - Don't flush rx fifo unless in closed state Change-Id: Ic2a4f0d5568b65c83f4b55b6c469a7b24b947f39 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-26tcp: loss recovery improvements/fixesFlorin Coras1-3/+3
- fix newreno cwnd computation - reset snd_una_max on entering recovery - accept acks beyond snd_nxt but less than snd_congestion when in recovery - avoid entering fast recovery multiple times when using sacks - avoid as much as possible sending small segments when doing fast retransmit - more event logging Change-Id: I19dd151d7704e39d4eae06de3a26f5e124875366 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-25tcp: handle acks in close waitFlorin Coras1-1/+1
Thanks to Ning Li <muziding001@163.com> for reporting. Change-Id: I758bc6760ec5a9ec688172bc162a1873f96ab4f3 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-21tcp: unlock link-local adjacencies on connection cleanupFlorin Coras1-0/+47
Change-Id: I37705fb572045f42be4c2dabbd8460c8f8872167 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-03-02session: first approximation implementation of tlsFlorin Coras1-0/+2
It consists of two main parts. First, add an application transport type whereby applications can offer transport to other applications. For instance, a tls app can offer transport services to other applications. And second, a tls transport app that leverages the mbedtls library for tls protocol implementation. Change-Id: I616996c6e6539a9e2368fab8a1ac874d7c5d9838 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-12-11session: generalize handling of network transportsFlorin Coras1-4/+17
- compute session type out of transport and network protos - make session, session lookup and session queue code network protocol agnostic This does not update the session layer to support non-ip network layer protocols Change-Id: Ifc2f92845e158b649d59462eb7d51c12af536691 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-11-29session: fix preallocation of local endpoint tableFlorin Coras1-13/+0
Change-Id: I67a73e31bda9e497859297fcc1765e880572884a Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-11-28tcp: fix retransmissions under buffer shortageFlorin Coras1-2/+5
- add debugging scaffolding for simulating buffer shortage Change-Id: Ice519d74f9c4e4094c4586c548185135b7bb5f2d Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-11-16tcp: register with ip for header parsing by defaultFlorin Coras1-9/+12
Change-Id: I4e420bcc9241b03e179a939911059c0cc3704a51 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-16udp: refactor udp codeFlorin Coras1-218/+59
Change-Id: I44d5c9df7c49b8d4d5677c6d319033b2da3e6b80 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-12tcp: do not format sb if not established (VPP-1018)Florin Coras1-2/+3
Change-Id: I011dda118f37cb31a37dda270027612d0af57ca0 Signed-off-by: Florin Coras <fcoras@cisco.com> (cherry picked from commit 87f141172212b7568f519653ab32ebd1b5d34344)
2017-10-10session: add support for application namespacingFlorin Coras1-56/+53
Applications are now provided the option to select the namespace they are to be attached to and the scope of their attachement. Application namespaces are meant to: 1) constrain the scope of communication through the network by association with source interfaces and/or fib tables that provide the source ips to be used and limit the scope of routing 2) provide a namespace local scope to session layer communication, as opposed to the global scope provided by 1). That is, sessions can be established without assistance from transport and network layers. Albeit, zero/local-host ip addresses must still be provided in session establishment messages due to existing application idiosyncrasies. This mode of communication uses shared-memory fifos (cut-through sessions) exclusively. If applications request no namespace, they are assigned to the default one, which at its turn uses the default fib. Applications can request access to both local and global scopes for a namespace. If no scope is specified, session layer defaults to the global one. When a sw_if_index is provided for a namespace, zero-ip (INADDR_ANY) binds are converted to binds to the requested interface. Change-Id: Ia0f660bbf7eec7f89673f75b4821fc7c3d58e3d1 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-03tcp: updates to connection closing procedure (VPP-996)Florin Coras1-3/+34
- add separate TIME_WAIT time constant - fix output node for TIME_WAIT acks - ensure snd_nxt is snd_una_max after retransmitting fin - debugging improvements Change-Id: Ic947153346979853f2526824b229126e47aead86 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-20session: store tep port in net orderFlorin Coras1-2/+2
Change-Id: Ie3a99f09f44ec081d9b88a213bdb8d987fb462de Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-20TCP: fix "tcp src-address" command with IPv6Yoann Desmouceaux1-1/+1
When given a single IPv6 address, the "tcp src-address" command incorrectly infers the end of the range by copying sizeof(ip4_address_t) bytes from the given address. Change-Id: I100d5c6674d3a3980b8c018588988bdd32ff7269 Signed-off-by: Yoann Desmouceaux <ydesmouc@cisco.com>
2017-09-20tcp: add option to punt trafficPierre Pfister1-0/+33
Until now, if the stack didn't find a connection for a packet, it sent back a reset. With the punt option enabled, packets are now enqueued to error-punt where they can be handed off to the host os. Change-Id: I12dea8694b8bd24c92b0d601412928aa7b8046cb Signed-off-by: Florin Coras <fcoras@cisco.com> Signed-off-by: Pierre Pfister <ppfister@cisco.com>
2017-09-19session/tcp: improve preallocated segment handlingFlorin Coras1-2/+5
- add preallocated segment flag - don't remove pre-allocated segments except if application detaches - when preallocating fifos in multiple segments, completely fill a segment before moving to the next - detach server application from segment-managers when deleting app - batch syn/syn-ack/fin (re)transmissions - loosen up close-wait and time-wait times Change-Id: I412f53ce601cc83b3acc26aeffd7fa2d52d73b03 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-12tcp: horizontal scaling improvmentsFlorin Coras1-1/+25
- do not scale syn-ack window - fix the max number of outstanding syns in builtin client - fix syn-sent ack validation to use modulo arithmetic - improve retransmit timer handler - fix output buffer allocator leakeage - improved debugging Change-Id: Iac3bc0eadf7d0b494a93e22d210a3153b61b3273 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-09-01Add fixed-size, preallocated pool supportDave Barach1-12/+12
Simply call pool_init_fixed(...) before using the pool. Note that fixed, preallocated pools live in individually-mmap'ed address segments, except for the free element bitmap. A large fixed pool can exceed 4gb. Fix tcp buffer allocator leak, remove broken assert Change-Id: I4421082e12a77c41c6e20f7747f3150dcd01fc26 Signed-off-by: Dave Barach <dave@barachs.net>
2017-08-29session: segment manager improvementsFlorin Coras1-1/+5
- cleanup connects segment manager even if first - fix segment manager allocation for listen sessions - improve handling of process private segments (mheaps/main heap) - added segment manager cli Change-Id: Ic2ca97c3622ab2286d5fb5772aeb57680e64f769 Signed-off-by: Florin Coras <fcoras@cisco.com> Signed-off-by: Dave Wallace <dwallacelf@gmail.com>
2017-08-25TCP horizontal scalingDave Barach1-2/+21
- Remove frame handoff support machinery. We haven't used it in a long time. - Configuration support for the local endpoints bihash table - Drop lookup failure packets in tcp46_syn_sent Change-Id: Icd51e6785f74661c741e76fac23d21c4cc998d17 Signed-off-by: Dave Barach <dave@barachs.net>