Age | Commit message | Author | Files | Lines |
|
Instead of reusing buffers for acking, consume all buffers and program
output for (dup)ack generation. This implicitly fixes the drop counters
that were artificially inflated by both data and feedback traffic.
Moreover, the patch also significantly reduces the ack traffic as we now
only generate an ack per frame, unless duplicate acks need to be sent.
Because of the reduced feedback traffic, a sender's rx path and a
receiver's tx path are now significantly less loaded. In particular, a
sender can overwhelm a 40Gbps NIC and generate tx drop bursts for low
rtts. Consequently, tx pacing is now enforced by default.
Change-Id: I619c29a8945bf26c093f8f9e197e3c6d5d43868e
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Optimize the zero byte mask NEON functions below to use fewer intrinsics,
and make their outputs consistent with the functions in vector_sse42.h:
always_inline u32 u64x2_zero_byte_mask (u64x2 input)
always_inline u32 u32x4_zero_byte_mask (u32x4 input)
always_inline u32 u16x8_zero_byte_mask (u16x8 input)
always_inline u32 u8x16_zero_byte_mask (u8x16 input)
always_inline u32 i64x2_zero_byte_mask (i64x2 input)
always_inline u32 i32x4_zero_byte_mask (i32x4 input)
always_inline u32 i16x8_zero_byte_mask (i16x8 input)
always_inline u32 i8x16_zero_byte_mask (i8x16 input)
Change-Id: I7f485915baeb37fa2dd484699b8769e0136f6574
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
Reviewed-by: Sirshak Das <Sirshak.Das@arm.com>
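Note: a minimal scalar sketch of what a "zero byte mask" computes, assuming the usual compare-with-zero plus byte move-mask semantics (bit i is set when byte i of the input is zero). The function name `ref_u8x16_zero_byte_mask` is hypothetical and only models the u8x16 case; it is not the NEON or SSE4.2 code from this change.

```c
#include <stdint.h>

/* Hypothetical scalar reference model (not the VPP implementation):
 * bit i of the result is set when byte i of the 16-byte input is zero. */
static inline uint32_t
ref_u8x16_zero_byte_mask (const uint8_t in[16])
{
  uint32_t mask = 0;
  for (int i = 0; i < 16; i++)
    mask |= (uint32_t) (in[i] == 0) << i;
  return mask;
}
```

A model like this can serve as an oracle when checking that two vector implementations return the same mask for the same input.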
|
|
Change-Id: I30487bd736407378fb5a6d313e4eef12bbb262b8
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Learning GBP endpoints over vxlan-gbp tunnels
Change-Id: I1db9fda5a16802d9ad8b4efd4e475614f3b21502
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
pool (VPP-1485)
Change-Id: Iaa404361eac2a6612dcdaba3f73bae41a35c5446
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Change-Id: I6d6a73ac62f24928fb51e89948b92a1cb9134c40
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
In output.c, we buffer the descriptors and call vmxnet3_reg_write_inline
once, outside the loop. This change improves performance dramatically.
When refilling the ring, there is no need to inform the device unless
the device explicitly requests it (ctrl.update_prod == 1).
Change-Id: I7031d58bff0d249e913d14236d416c91eb6ab94a
Signed-off-by: Steven <sluong@cisco.com>
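Note: a rough sketch of the batching idea, using a fake device that only counts register writes. All names here (`fake_dev_t`, `fake_reg_write`, `tx_burst_batched`, `rx_refill`) are hypothetical stand-ins, not the vmxnet3 plugin code; the `update_prod` parameter mirrors the flag mentioned above.

```c
#include <stdint.h>

/* Hypothetical stand-in for a device with a producer-index register. */
typedef struct
{
  uint32_t prod_reg;     /* last producer index written to the "device" */
  uint32_t n_reg_writes; /* how many register writes were issued */
} fake_dev_t;

static inline void
fake_reg_write (fake_dev_t *d, uint32_t value)
{
  d->prod_reg = value;
  d->n_reg_writes++;
}

/* Fill n descriptors first, then notify the device a single time. */
static inline void
tx_burst_batched (fake_dev_t *d, uint32_t *produce, uint32_t n)
{
  for (uint32_t i = 0; i < n; i++)
    (*produce)++; /* descriptor setup would happen here */
  if (n)
    fake_reg_write (d, *produce);
}

/* Refill the ring; only notify when the device asked for it. */
static inline void
rx_refill (fake_dev_t *d, uint32_t *produce, uint32_t n, int update_prod)
{
  *produce += n;
  if (update_prod)
    fake_reg_write (d, *produce);
}
```

In this model a burst of any size costs one register write, and a refill costs none unless the flag is set.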
|
|
Change-Id: Ifb841312d4a382547153b24903230b407f649e73
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
If no PCI address is specified in the dpdk config, the default is to
automatically put all PCI devices in the whitelist.
For vmxnet3 devices, we want to change this default and exclude them,
that is, put them in the blacklist instead of the whitelist.
Change-Id: I2b7061d6437910eb0e1b16df19a770cab968c602
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I29f20dbaf2c2d735faff297cee552ed648f6f61b
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
(gdb) bt
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) frame 5
293 if (PREDICT_FALSE (rxvq->last_avail_idx == rxvq->avail->idx))
(gdb) p *rxvq
$3 = {cacheline0 = 0x7f290bcadd80 "\377\003", qsz_mask = 1023, last_avail_idx = 0, last_used_idx = 0, n_since_last_int = 0, desc = 0x0, avail = 0x0, used = 0x0, int_deadline = 0, started = 1 '\001', enabled = 1 '\001', log_used = 0 '\000', cacheline1 = 0x7f290bcaddc0 "\377\377\377\377\016", errfd = -1, callfd_idx = 14, kickfd_idx = 19, log_guest_addr = 5151049792, mode = 0}
The crash happens because we dereference the null pointer rxvq->avail,
which is supposed to be derived from the mmap information provided by the driver.
We fixed a similar issue before in
https://gerrit.fd.io/r/#/c/14545/
The cause was that the driver unmaps the memory without doing the disconnect in
an SR-IOV environment. That fix was applied to the RX path. Now the same crash
happens in the TX path, so we just need to apply the same check there.
Change-Id: I7b1dfc96797cb5b52845bc6cec09a8c5d4325280
Signed-off-by: Steven <sluong@cisco.com>
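Note: a minimal sketch of the kind of guard described above, assuming the check simply treats an unmapped ring as empty. The types `fake_avail_t` and `fake_vring_t` and the helper `vring_n_avail` are hypothetical and much simpler than the real vhost-user structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified ring structures for illustration only. */
typedef struct
{
  uint16_t idx;
} fake_avail_t;

typedef struct
{
  uint16_t last_avail_idx;
  fake_avail_t *avail; /* NULL once the driver has unmapped the memory */
} fake_vring_t;

/* Number of available entries, or 0 when the ring memory is not mapped. */
static inline uint16_t
vring_n_avail (const fake_vring_t *vq)
{
  if (vq->avail == NULL)
    return 0;
  return (uint16_t) (vq->avail->idx - vq->last_avail_idx);
}
```

The point is that every path that reads the ring index goes through the NULL check first, whichever direction the traffic flows.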
|
|
Change-Id: I48a92035b58d83420eb3eed3f05a75ba283543c2
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I9375bca5f5136c84d801dbd635929bb1c37d75b4
Signed-off-by: Filip Varga <filip.varga@pantheon.tech>
|
|
Change-Id: I49a5029d256df8f749ee30d19ff7473147b6516f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
The change saves 1.1 clocks per packet on the Intel Atom C3858 platform,
dropping from 2.05e1 to 1.94e1 clocks per packet.
The change saves 0.3 clocks per packet on the Intel Xeon CPU E5-2699 v4 @ 2.20GHz,
dropping from 1.26e1 to 1.23e1 clocks per packet.
Change-Id: I1ede77fb592a797d86940a8abad9ca291a89f1c7
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
|
|
When a netvsc or failsafe DPDK device is used, the DPDK port id does not
match the VPP device instance id. The code that formats the device
name was incorrectly calling DPDK device info using the VPP device
instance id. This causes the VPP interface to be named
"FortyGigabit0/2/0" based on mistakenly finding the PCI device
information of the hidden DPDK port id for the VF device.
Change-Id: I9366232f4b2087076bdcc1a58bf228007c24c084
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
|
|
The pmd device type shown by 'show hardware' is wrong when using failsafe
(or the netvsc pmd) because the pmd device type should be based on the VPP
device instance, not the DPDK port id.
Fixes: a059a000f81a ("dpdk: Decoupling the meaning of xd->device_index in dpdk_plugin")
Change-Id: I3880fe674731880c5706a21d8ef3ccf8d569d46d
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
|
|
Avoid dequeuing acked bytes more than once per burst for a connection.
Although the fifos do not use locks, size decrements are atomic, so they
rely on locked instructions.
Change-Id: Id65f4ea40b2c10057461402dfd0393034e6472d5
Signed-off-by: Florin Coras <fcoras@cisco.com>
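Note: a small sketch of the idea, assuming acked bytes are accumulated per connection during a burst and the fifo size is decremented once at the end. The names (`fake_fifo_t`, `fake_conn_t`, `conn_ack_rcvd`, `conn_flush_acked`) are hypothetical and the counter `n_atomic_ops` exists only to make the saving visible.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical fifo with an atomically updated size. */
typedef struct
{
  atomic_uint size;
  uint32_t n_atomic_ops; /* illustration only: counts atomic decrements */
} fake_fifo_t;

typedef struct
{
  fake_fifo_t *tx_fifo;
  uint32_t burst_bytes_acked; /* plain per-connection accumulator */
} fake_conn_t;

/* Called for every ack in the burst: no atomic operation here. */
static inline void
conn_ack_rcvd (fake_conn_t *c, uint32_t bytes)
{
  c->burst_bytes_acked += bytes;
}

/* Called once per connection at the end of the burst. */
static inline void
conn_flush_acked (fake_conn_t *c)
{
  if (c->burst_bytes_acked == 0)
    return;
  atomic_fetch_sub (&c->tx_fifo->size, c->burst_bytes_acked);
  c->tx_fifo->n_atomic_ops++;
  c->burst_bytes_acked = 0;
}
```

With this shape, three acks in one burst cost a single locked instruction instead of three.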
|
|
Using a bitfield struct for the 5-tuple proved fragile from a
performance standpoint: zeroing the entire structure and then
setting its separate pieces increases memory latency. So, move
to using a flags byte.
Also, use direct object copies rather than memcpy.
Change-Id: Iad8faf9de050ff1256e40c950dee212cbd3e5267
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
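Note: a sketch of the flags-byte layout, under the assumption that the former bitfield members become bits in a single byte written with one store. The struct `fake_5tuple_t`, the `FT_FLAG_*` values and both helpers are hypothetical, not the ACL plugin definitions.

```c
#include <stdint.h>

/* Hypothetical flag bits replacing individual bitfield members. */
enum
{
  FT_FLAG_IS_IP6 = 1 << 0,
  FT_FLAG_IS_INPUT = 1 << 1,
  FT_FLAG_L4_VALID = 1 << 2,
};

typedef struct
{
  uint32_t src, dst;
  uint16_t sport, dport;
  uint8_t proto;
  uint8_t flags; /* all boolean state lives in this one byte */
} fake_5tuple_t;

/* Compute the flags in registers, then write the byte once. */
static inline void
fake_5tuple_set_flags (fake_5tuple_t *t, int is_ip6, int is_input,
                       int l4_valid)
{
  t->flags = (uint8_t) ((is_ip6 ? FT_FLAG_IS_IP6 : 0)
                        | (is_input ? FT_FLAG_IS_INPUT : 0)
                        | (l4_valid ? FT_FLAG_L4_VALID : 0));
}

/* Direct object copy: plain struct assignment instead of memcpy. */
static inline void
fake_5tuple_copy (fake_5tuple_t *dst, const fake_5tuple_t *src)
{
  *dst = *src;
}
```

Struct assignment lets the compiler pick the copy sequence for the known size and type.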
|
|
For vxlan_encap, the code touches the memory area before the "data" field
in struct vlib_buffer_t, but this graph node does not yet prefetch that
area into the cache.
After applying the patch, 2~3 cycles per pkt can be saved for vxlan4_encap
on Haswell. It will also benefit the DVN platform.
Change-Id: I26d8c57fb3d2415726be5367117d73eb715e35ad
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
Change-Id: Idf79f261a7590038c1813d3996f4644e3d7e0cbe
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Add atomic swap and store macros with acquire and release ordering
respectively. The variable in question is interrupt_pending, which
is used as a guard variable by input nodes to process the device queue.
Atomic swap is used with acquire ordering because reads or writes that
follow it in program order must not be reordered before the swap.
Atomic store is used with release ordering because the node is added to
the pending list after the store.
Change-Id: I1be49e91a15c58d0bf21ff5ba1bd37d5d7d12f7a
Original-patch-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
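Note: a minimal C11 sketch of the acquire-swap / release-store pairing on a guard flag. It uses `<stdatomic.h>` rather than the macros added by this change, and the names `fake_queue_t`, `queue_set_pending` and `queue_poll_if_pending` are hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical device queue guarded by a pending flag. */
typedef struct
{
  atomic_uint pending;  /* guard flag set by the interrupt side */
  uint32_t n_processed; /* how many times the queue was serviced */
} fake_queue_t;

/* Interrupt side: publish the flag with release ordering. */
static inline void
queue_set_pending (fake_queue_t *q)
{
  atomic_store_explicit (&q->pending, 1, memory_order_release);
}

/* Input node side: swap the flag to 0 with acquire ordering, so the
 * queue accesses below cannot be reordered before the swap. */
static inline int
queue_poll_if_pending (fake_queue_t *q)
{
  if (atomic_exchange_explicit (&q->pending, 0, memory_order_acquire) == 0)
    return 0;
  q->n_processed++; /* queue processing would happen here */
  return 1;
}
```

The swap both reads and clears the flag in one step, so a pending interrupt is serviced exactly once.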
|
|
Change-Id: I8bc3a991f0ede0605d78b51ba609fbe5889513f2
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: Idd4a5f8bab5d39e5f33f5c130601175af70a20d4
Signed-off-by: Filip Varga <filip.varga@pantheon.tech>
|
|
Allows sending of unsent data in fast recovery and consolidates logic in
tcp, instead of splitting it between tcp fast retransmit and tcp output
path called by the session layer.
Change-Id: I9b12cdf2aa2ac50b9f25e46856fed037163501fe
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I209f570634636725ce8fda5f61e900a71227b888
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
|
|
Restore parts of commit d0e812f wiped out by a7564e80.
The full description of the change is in d0e812f.
Change-Id: I632476cb10678a725396462f90f9b0bea9e572fa
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
|
|
Change-Id: Ib15d629c5fde7849bfa3307f42659e920eb0f463
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Refactor most of the ping code to be address-family agnostic,
and add support for chained buffers (which allows sending
payloads bigger than 2K).
Change-Id: I749c302ca2f3390e0d1f84046fc72da5cf13e3ef
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Change-Id: I9b5f7b264f9978e3dd97b2d1eb103b7d10ac3170
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
The failsafe driver is unique because it shares a device with the underlying
PCI device. This confuses name generation. Without this fix, the name
is wrong and multiple devices get created with the same name.
Fixes: 3901a038edf4 ("dpdk: only look at PCI information on PCI devices")
Change-Id: I13796d03baf6c76dafe3667c83bea4a1ae30c48f
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
|
|
Also reset pacer on tcp retransmit timeout
Change-Id: I5a9edee4c00d1d169248d79587a9b10437c2bd87
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Also propagate tcp worker context instead of retrieving it multiple
times.
Change-Id: I7b273b981826b37783566d0172a64cd6957f3b33
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ided6c661edc9e2035fd7b472c312e2380d3f9c0b
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Force pacing for fast retransmit to avoid bursts of retransmitted
packets.
Change-Id: I2ff42c328899b36322c4de557b1f7d853dba8fe2
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I2476e3e916a42b41d1e66bfc1ec4f8c4264c1720
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
Change-Id: I716d025beb8f649060238c2bd388357943643621
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
|
|
Change-Id: Idbc7b61393c6d0e3b8ea950397a89d21b1cf3a42
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
This patch enables the use of this function for enqueuing frames to the next graph node.
Change-Id: I4003110db59870f7106e0d13942d6ff7bc54b46d
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
Reviewed-by: Sirshak Das <Sirshak.Das@arm.com>
Reviewed-by: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
Reviewed-by: Steve Capper <Steve.Capper@arm.com>
|
|
Change-Id: Ib138b6e2eac47acc16e81bc88358ae7947420134
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
If sessions cannot be handled during the current dispatch loop
iteration, ensure that they are first to be handled in the next.
Change-Id: Ifc6215900f8cfd530d4886b58641189f0ccf9bb7
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I66ca0ddea872948507d078e405eb90f9f3a0e897
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Iaecf8c060e1337d8c362ad9a9be2bb9701664397
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: Ia68db22b917e9af1394c00e5a6b3df134bfd1568
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: Ia9b74761ce511d218bb5319c7c9b5e58be3e2e8a
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Fixes debug build crash.
Change-Id: Ia5c5da82beda5992f9e67456af9a4676b9b82722
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ibef46e068cd72415af28920b0146adf48105bf68
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Change-Id: I93c6b7bccd1a1ab71625ae29c99c974581186c4d
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I2eafac4ce810fe53454b729d81161ec80d036db7
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I7531a64d7072d85514ca579827b6ea0e9cef6f08
Signed-off-by: Vijayabhaskar Katamreddy <vkatamre@cisco.com>
|