aboutsummaryrefslogtreecommitdiffstats
path: root/src/vnet/tcp/tcp_input.c
AgeCommit message (Collapse)AuthorFilesLines
2018-11-14Remove c-11 memcpy checks from perf-critical codeDave Barach1-11/+14
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1 Signed-off-by: Dave Barach <dave@barachs.net>
2018-11-12tcp: handle disconnects after enq notificationsFlorin Coras1-14/+50
Make sure that we notify the app of the data enqueued in the burst before notifying of disconnect. Change-Id: I7747a5cbb4c6bc9132007f849c24ce04b7841273 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-09tcp: basic cubic implementationFlorin Coras1-3/+8
Because the code is not optimized, newreno is still the default congestion control algorithm. Change-Id: I7061cc80c5a75fa8e8265901fae4ea2888e35173 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-08tcp: pacer and mrtt estimation improvementsFlorin Coras1-11/+51
- update pacer once per burst - better estimate initial rtt - compute smoothed average for higher precision rtt estimate Change-Id: I06d41a98784cdf861bedfbee2e7d0afc0d0154ef Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-07tcp: consume incoming buffers instead of reusingFlorin Coras1-755/+628
Instead of reusing buffers for acking, consume all buffers and program output for (dup)ack generation. This implicitly fixes the drop counters that were artificially inflated by both data and feedback traffic. Moreover, the patch also significantly reduces the ack traffic as we now only generate an ack per frame, unless duplicate acks need to be sent. Because of the reduced feedback traffic, a sender's rx path and a receiver's tx path are now significantly less loaded. In particular, a sender can overwhelm a 40Gbps NIC and generate tx drop bursts for low rtts. Consequently, tx pacing is now enforced by default. Change-Id: I619c29a8945bf26c093f8f9e197e3c6d5d43868e Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-06tcp: dequeue acked only once per burstFlorin Coras1-23/+54
Avoid dequeuing acked bytes more than once per burst for a connection. Although the fifos do not use locks, size decrements are atomic, so they rely on locked instructions. Change-Id: Id65f4ea40b2c10057461402dfd0393034e6472d5 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-05tcp: send unsent data in fast recoveryFlorin Coras1-11/+18
Allows sending of unsent data in fast recovery and consolidates logic in tcp, instead of splitting it between tcp fast retransmit and tcp output path called by the session layer. Change-Id: I9b12cdf2aa2ac50b9f25e46856fed037163501fe Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-02tcp: coverity fixesFlorin Coras1-1/+1
Change-Id: Ib15d629c5fde7849bfa3307f42659e920eb0f463 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-02session: measure dispatch period only if under loadFlorin Coras1-7/+5
Also reset pacer on tcp retransmit timeout Change-Id: I5a9edee4c00d1d169248d79587a9b10437c2bd87 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-02tcp: minimize use of tlsFlorin Coras1-16/+16
Also propagate tcp worker context instead of retrieving it multiple times. Change-Id: I7b273b981826b37783566d0172a64cd6957f3b33 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-11-01tcp: fast retransmit pacingFlorin Coras1-40/+47
Force pacing for fast retransmit to avoid bursts of retransmitted packets. Change-Id: I2ff42c328899b36322c4de557b1f7d853dba8fe2 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-25tcp/session: add tx pacerFlorin Coras1-11/+11
Adds tx pacing infrastructure for transport protocols that want to use it. Particularly useful for connections with non-negligible rtt and constrained network throughput as it avoids large tx bursts that lead to local interface tx or network drops. By default the pacer is disabled. To enabled it for tcp, add tx-pacing to tcp's startup conf. We are still slightly inefficient in the handling of incoming packets in established state so the pacer slightly affect maximum throughput in low lacency scenarios. Change-Id: Id445b2ffcd64cce015f75b773f7d722faa0f7ca9 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-23tcp: fast retransmit improvementsFlorin Coras1-58/+165
Patch is too large to be ported to 18.10 just days before release. - handle fast retransmits outside of established node and limit the retransmit burst size to avoid tx losses and worsening congestion. - in the absance of a tx pacer, use slow start after fast retransmit exists - add fast retransmit heuristic that re-retries sending the first segment if everything else fails - fine tuning Change-Id: I84a2ab8fbba8b97f1d2b26584dc11a1e2c33c8d2 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-23c11 safe string handling supportDave Barach1-2/+2
Change-Id: Ied34720ca5a6e6e717eea4e86003e854031b6eab Signed-off-by: Dave Barach <dave@barachs.net>
2018-10-21tcp: count first lost hole (VPP-1465)Florin Coras1-14/+17
Change-Id: I3ac136e2a10796d8fa86ddb6f0d6cabe5fa749f8 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-18tcp: fix sacks lost bytes counting (VPP-1465)Florin Coras1-16/+28
Change-Id: Ie46b3a81de4ed39b7b40e3879436f7e5a2908d98 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-18tls: fix connection failures/interrupts at scale (VPP-1464)Florin Coras1-3/+7
Change-Id: I0bc4062c1fd3202ee201acb36a2bb14fc6ee1543 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-17tcp: avoid sack processing when not needed (VPP-1460)Florin Coras1-1/+2
Change-Id: If81ee34e1f1e929de1a5b758ddb9aede4002e858 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-10-02tcp: accept fins if in recoveryFlorin Coras1-1/+3
Change-Id: I0c9c055fcc3d681c4032228a90cc81f484e200f0 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-09-27tcp: use scaled window for new connectsFlorin Coras1-2/+2
Change-Id: Idf83fce8ca176e57b323e3741034e3223f1d195a Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-09-25tcp: accept rst+ack in syn-rcvd stateFlorin Coras1-0/+2
Change-Id: I49da8be88dd033aae1b190e8e2163069ef480442 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-09-21tcp: accept fin-ack in syn-rcvd stateFlorin Coras1-1/+3
Change-Id: Ibe57abc7d2a06be80146ce28edf4c60ecda991eb Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-08-24tcp: fix cc recovery re-entry and persist timer popFlorin Coras1-6/+15
Change-Id: I89e8052f2d2c36dd3de5255c4ee570722dc58227 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-26tcp: avoid doing work in tcp_rcv_sacks for no sacksFlorin Coras1-2/+3
Change-Id: I00a0d7f57dc144d338d5ad45b0a6e3720c32c400 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-26tcp: cleanup functionsFlorin Coras1-19/+93
- sprinkle statics for functions - move some inlines from header files to corresponding .c files - replace some always_inlines with statics where inlining is not performance critical Change-Id: I371dbf63431ce7e27e4ebbbdd844a9546a1f1849 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-21tcp: move tracing out of established loopFlorin Coras1-27/+59
Change-Id: I4a7e4f616ed4a4df82be3112ac4702f7e4e2025c Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-19tcp: optimize tcp outputFlorin Coras1-0/+2
Change-Id: Idf17a0633a1618b12c22b1119e40c2e9d3192df9 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-19tcp: optimize tcp inputFlorin Coras1-134/+204
Change-Id: Ib98cfc93f6c574de5250f251925f7ed4e86f5f6f Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-11tcp: cleanup connection/session fixesFlorin Coras1-8/+16
- Cleanup session state after last ack and avoid using a cleanup timer. - Change session cleanup to free the session as opposed to waiting for delete notify. - When in close-wait, postpone sending the fin on close until all outstanding data has been sent. - Don't flush rx fifo unless in closed state Change-Id: Ic2a4f0d5568b65c83f4b55b6c469a7b24b947f39 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-06-10tcp: fix timer based recovery exit conditionFlorin Coras1-1/+0
Change-Id: I3f36e5760fd2935cc29d22601d4c0a1d2a22ba84 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-26tcp: loss recovery improvements/fixesFlorin Coras1-22/+47
- fix newreno cwnd computation - reset snd_una_max on entering recovery - accept acks beyond snd_nxt but less than snd_congestion when in recovery - avoid entering fast recovery multiple times when using sacks - avoid as much as possible sending small segments when doing fast retransmit - more event logging Change-Id: I19dd151d7704e39d4eae06de3a26f5e124875366 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-25tcp: handle acks in close waitFlorin Coras1-0/+1
Thanks to Ning Li <muziding001@163.com> for reporting. Change-Id: I758bc6760ec5a9ec688172bc162a1873f96ab4f3 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-23tcp: cc improvements and fixesFlorin Coras1-18/+19
Change-Id: I6615bb612bcc3f795b5f822ea55209bb30ef35b5 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-05-17tcp: handle link-local addressesFlorin Coras1-0/+2
Change-Id: I9ede6bc861350c7d9e78fa4d96cd584c2816d06f Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-04-30tcp/session: debug improvements/fixesFlorin Coras1-3/+2
Change-Id: I906e58b4f9827a79a6ab673f8fa2e03036c69820 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-04-20tcp: make newreno byte instead of acks dependentFlorin Coras1-0/+1
Should be more resilient to ack losses Change-Id: Icec3b93c1d290dec437fcc4e6fe5171906c9ba8a Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-04-20tcp: improve statsFlorin Coras1-119/+159
Change-Id: I9ab11ba9f958c679112eb22c8db39cb269a29dc7 Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-03-29tcp: fix fib index buffer taggingFlorin Coras1-0/+1
Change-Id: I373cc252df3621d44879b8eca70aed17d7752a2a Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-03-23tcp/session: sprinkle prefetchesFlorin Coras1-0/+8
Change-Id: Idef3c665580c13d72e99f43d16b8b13cc6ab746f Signed-off-by: Florin Coras <fcoras@cisco.com>
2018-01-25session: add support for memfd segmentsFlorin Coras1-2/+3
- update segment manager and session api to work with both flavors of ssvm segments - added generic ssvm slave/master init and del functions - cleanup/refactor tcp_echo - fixed uses of svm fifo pool as vector Change-Id: Ieee8b163faa407da6e77e657a2322de213a9d2a0 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-12-15Fix icmp/udp/tcp punt/drop pathsVijayabhaskar Katamreddy1-10/+14
Send packets to ip4/6_punt/drop nodes instead of error-drop/punt nodes dbarach: clean up an annoying checkstyle issue: indent 2.2.10 (OpenSUSE version) and indent 2.2.11 (Ubuntu / CentOS versions) had an artistic disagreement about ip_frag.c. Change-Id: I660bee28a064af9c6c70371363081e941d1c3a94 Signed-off-by: Vijayabhaskar Katamreddy <vkatamre@cisco.com> Signed-off-by: Dave Barach <dave@barachs.net>
2017-11-28tcp: fix retransmissions under buffer shortageFlorin Coras1-2/+3
- add debugging scaffolding for simulating buffer shortage Change-Id: Ice519d74f9c4e4094c4586c548185135b7bb5f2d Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-11-27tcp: fix proxy connection validationFlorin Coras1-0/+4
Change-Id: Icb0274cd3bcabfab8bdff6dec7440a3a15edfbf1 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-11-20session/tcp: filtering improvementsFlorin Coras1-14/+22
- make allow action explicit (-3) - add session lookup is_filtered return flag that is set if lookup hit a deny filter - change tcp logic to drop filtered packets when punting is enabled Change-Id: Ic38f294424663a4e108439b7571511f46f8e0be1 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-11-09tcp: call accept notify after full connection initFlorin Coras1-9/+9
Change-Id: I69998aa4eb587d80fc61d14bb28a9318a318f9ec Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-28session: rules tablesFlorin Coras1-1/+1
This introduces 5-tuple lookup tables that may be used to implement custom session layer actions at connection establishment time (session layer perspective). The rules table build mask-match-action lookup trees that for a given 5-tuple key return the action for the first longest match. If rules overlap, ordering is established by tuple longest match with the following descending priority: remote ip, local ip, remote port, local port. At this time, the only match action supported is to forward packets to the application identified by the action. Change-Id: Icbade6fac720fa3979820d50cd7d6137f8b635c3 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-16udp: refactor udp codeFlorin Coras1-47/+45
Change-Id: I44d5c9df7c49b8d4d5677c6d319033b2da3e6b80 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-10session: add support for application namespacingFlorin Coras1-38/+50
Applications are now provided the option to select the namespace they are to be attached to and the scope of their attachement. Application namespaces are meant to: 1) constrain the scope of communication through the network by association with source interfaces and/or fib tables that provide the source ips to be used and limit the scope of routing 2) provide a namespace local scope to session layer communication, as opposed to the global scope provided by 1). That is, sessions can be established without assistance from transport and network layers. Albeit, zero/local-host ip addresses must still be provided in session establishment messages due to existing application idiosyncrasies. This mode of communication uses shared-memory fifos (cut-through sessions) exclusively. If applications request no namespace, they are assigned to the default one, which at its turn uses the default fib. Applications can request access to both local and global scopes for a namespace. If no scope is specified, session layer defaults to the global one. When a sw_if_index is provided for a namespace, zero-ip (INADDR_ANY) binds are converted to binds to the requested interface. Change-Id: Ia0f660bbf7eec7f89673f75b4821fc7c3d58e3d1 Signed-off-by: Florin Coras <fcoras@cisco.com>
2017-10-04[aarch64] Fixes CLI crashes on dpaa2 platform.Christophe Fontaine1-1/+1
- always use 'va_args' as pointer in all format_* functions - u32 for all 'indent' params as it's declaration was inconsistent Change-Id: Ic5799309a6b104c9b50fec309cba789c8da99e79 Signed-off-by: Christophe Fontaine <christophe.fontaine@enea.com>
2017-10-03tcp: updates to connection closing procedure (VPP-996)Florin Coras1-8/+12
- add separate TIME_WAIT time constant - fix output node for TIME_WAIT acks - ensure snd_nxt is snd_una_max after retransmitting fin - debugging improvements Change-Id: Ic947153346979853f2526824b229126e47aead86 Signed-off-by: Florin Coras <fcoras@cisco.com>