Age | Commit message (Collapse) | Author | Files | Lines |
|
when pagesize is 1G, this pm->base + (pp->index << pm->def_log2_page_sz) would very soon overrun if creating multiple mempools
add a (uword) to it
Change-Id: If769b99d344cc3f547418a242a7497d044071615
Signed-off-by: Kingwel Xie <kingwel.xie@ericsson.com>
|
|
Change-Id: I3daf8d473aa37b4597d130d19913b782cf7b8511
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I36c2fa33cdc1db9a6af9b48c99e281abd8af1b6e
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Because the code is not optimized, newreno is still the default
congestion control algorithm.
Change-Id: I7061cc80c5a75fa8e8265901fae4ea2888e35173
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
It is not used and just confuses people...
Change-Id: Ic731432a785731271531f183b448e4591a1d2a8b
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Compute the first power of ten which is greater than 0.1% of the clock
rate. Save the result, and use it to round future results. The
previous constant value - 1e7 - didn't work properly on aarch64.
Change-Id: Ic021e3eb1b90c0d4a7d9f1b6425123f0c8b48b0b
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Idd4471a3adf7023e48e85717f00c786b1dde0cca
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Idbce0393fc9e6e8dbb2765ed164ba7f90d1ffccc
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I755525d953605561477eeb2252ef38c60000c70a
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
This patch adds support for mapping the virtual address to physical
address and size of memory allocated.
Change-Id: I7659a1881308e89b215c486fecd7c973076d0773
Signed-off-by: Mohsin Kazmi <sykazmi@cisco.com>
|
|
Change-Id: I294e4f93e925c58765d4692337208fcee7d12886
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
- update pacer once per burst
- better estimate initial rtt
- compute smoothed average for higher precision rtt estimate
Change-Id: I06d41a98784cdf861bedfbee2e7d0afc0d0154ef
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Avoids crash if suspended_process_frames grows.
Change-Id: Id26ef0dd0dd001b997c531c4dec004e7e7989670
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Id69678adb578b323ae18034d1b1fddb7417bcc08
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I6782544d5ee0a66b1a027874b23574416093ca92
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I20192f3a8f4f01f47e775746f6fde7c685f185ee
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Instead of reusing buffers for acking, consume all buffers and program
output for (dup)ack generation. This implicitly fixes the drop counters
that were artificially inflated by both data and feedback traffic.
Moreover, the patch also significantly reduces the ack traffic as we now
only generate an ack per frame, unless duplicate acks need to be sent.
Because of the reduced feedback traffic, a sender's rx path and a
receiver's tx path are now significantly less loaded. In particular, a
sender can overwhelm a 40Gbps NIC and generate tx drop bursts for low
rtts. Consequently, tx pacing is now enforced by default.
Change-Id: I619c29a8945bf26c093f8f9e197e3c6d5d43868e
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Optimize zero byte mask NEON functions below with less intrinsics,
and get their outputs consistent with functions in vector_sse42.h
always_inline u32 u64x2_zero_byte_mask (u64x2 input)
always_inline u32 u32x4_zero_byte_mask (u32x4 input)
always_inline u32 u16x8_zero_byte_mask (u16x8 input)
always_inline u32 u8x16_zero_byte_mask (u8x16 input)
always_inline u32 i64x2_zero_byte_mask (i64x2 input)
always_inline u32 i32x4_zero_byte_mask (i32x4 input)
always_inline u32 i16x8_zero_byte_mask (i16x8 input)
always_inline u32 i8x16_zero_byte_mask (i8x16 input)
Change-Id: I7f485915baeb37fa2dd484699b8769e0136f6574
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
Reviewed-by: Sirshak Das <Sirshak.Das@arm.com>
|
|
Change-Id: I30487bd736407378fb5a6d313e4eef12bbb262b8
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Learning GBP endpoints over vxlan-gbp tunnels
Change-Id: I1db9fda5a16802d9ad8b4efd4e475614f3b21502
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
pool (VPP-1485)
Change-Id: Iaa404361eac2a6612dcdaba3f73bae41a35c5446
Signed-off-by: Matus Fabian <matfabia@cisco.com>
|
|
Change-Id: I6d6a73ac62f24928fb51e89948b92a1cb9134c40
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
In output.c, we buffer the descriptors and call vmxnet3_reg_write_inline
once outside the loop. This change improves the performance dramatically.
When refilling the ring, there is no need to inform the device unless
explicitly specified by the device (ctrl.update_prod == 1)
Change-Id: I7031d58bff0d249e913d14236d416c91eb6ab94a
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: Ifb841312d4a382547153b24903230b407f649e73
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
If no PCI address is specified in dpdk config, the default to automatically
put all PCIs in the whitelist.
For vmxnet3 PCIs, we want to change its default to exclude the vmxnet3 PCIs.
That is to put them in the blacklist instead of whitelist.
Change-Id: I2b7061d6437910eb0e1b16df19a770cab968c602
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I29f20dbaf2c2d735faff297cee552ed648f6f61b
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
(gdb) bt
bt
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) frame 5
frame 5
293 if (PREDICT_FALSE (rxvq->last_avail_idx == rxvq->avail->idx))
(gdb) p *rxvq
p *rxvq
$3 = {cacheline0 = 0x7f290bcadd80 "\377\003", qsz_mask = 1023, last_avail_idx = 0, last_used_idx = 0, n_since_last_int = 0, desc = 0x0, avail = 0x0, used = 0x0, int_deadline = 0, started = 1 '\001', enabled = 1 '\001', log_used = 0 '\000', cacheline1 = 0x7f290bcaddc0 "\377\377\377\377\016", errfd = -1, callfd_idx = 14, kickfd_idx = 19, log_guest_addr = 5151049792, mode = 0}
The crash is because we access the null pointer rxvq->avail,
which is supposed to be derived from the mmap informed by the driver.
We fixed a similar issue before in
https://gerrit.fd.io/r/#/c/14545/
The reason was the driver ummaps the memory without doing the disconnect in
SR-IOV environment. The fixed was applied to the RX path. Now it happens in the
TX path. We just need to apply the same check in the TX path.
Change-Id: I7b1dfc96797cb5b52845bc6cec09a8c5d4325280
Signed-off-by: Steven <sluong@cisco.com>
|
|
Change-Id: I48a92035b58d83420eb3eed3f05a75ba283543c2
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I9375bca5f5136c84d801dbd635929bb1c37d75b4
Signed-off-by: Filip Varga <filip.varga@pantheon.tech>
|
|
Change-Id: I49a5029d256df8f749ee30d19ff7473147b6516f
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
The change can save 1.1 clocks per packet on Intel Atom C3858 platform,
It downgraded from 2.05e1 to 1.94e1 clocks per packet.
The change can save 0.3 clocks per packet on Intel Xeon CPU E5-2699 v4 @ 2.20GHz,
It downgraded from 1.26e1 to 1.23e1 clocks per packet.
Change-Id: I1ede77fb592a797d86940a8abad9ca291a89f1c7
Signed-off-by: Yulong Pei <yulong.pei@intel.com>
|
|
When netvsc or failsafe DPDK device is used, the DPDK port id does not
match the VPP device instance id. The code that formats the device
name was incorrectly calling DPDK device info using the VPP device
instance id. This causes the VPP interface to be named
"FortyGigabit0/2/0" based on mistakenly finding the PCI device
information of the hidden DPDK port id for the VF device.
Change-Id: I9366232f4b2087076bdcc1a58bf228007c24c084
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
|
|
The pmd device type show with 'show hardware' is wrong if using failsafe
(or netvsc pmd) because the pmd device type should be based of the VPP device
instance, not the DPDK port id.
Fixes: a059a000f81a ("dpdk: Decoupling the meaning of xd->device_index in dpdk_plugin")
Change-Id: I3880fe674731880c5706a21d8ef3ccf8d569d46d
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
|
|
Avoid dequeuing acked bytes more than once per burst for a connection.
Although the fifos do not use locks, size decrements are atomic, so they
rely on locked instructions.
Change-Id: Id65f4ea40b2c10057461402dfd0393034e6472d5
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Using bitfield struct for 5tuple proved to be fragile from
the performance standpoint - the zeroizing of the entire
structure and then setting the separate pieces of it
triggers increased memory latency. So, move to using
flags byte.
Also, use the direct object copies rather than memcpy.
Change-Id: Iad8faf9de050ff1256e40c950dee212cbd3e5267
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
For vxlan_encap, code will touch memory area before the field "data"
in struct vlib_buffer_t, however so far it is not prefetched in cache
yet for this graph node.
After applying the patch, 2~3 cycles per pkt for vxlan4_encap can be
saved on Haswell. It will bring a lot of benefits on DVN platform too.
Change-Id: I26d8c57fb3d2415726be5367117d73eb715e35ad
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
Change-Id: Idf79f261a7590038c1813d3996f4644e3d7e0cbe
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Add atomic swap and store macro with acquire and release ordering
respectively. Variable in question is interupt_pending variable which
is used as guard variable by input nodes to process the device queue.
Atomic Swap is used with Acquire ordering as writes or reads following
this in program order should not be reordered before the swap.
Atomic Store is used with Release ordering, as post store the node is
added to pending list.
Change-Id: I1be49e91a15c58d0bf21ff5ba1bd37d5d7d12f7a
Original-patch-by: Damjan Marion <damarion@cisco.com>
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
Reviewed-by: Ola Liljedahl <ola.liljedahl@arm.com>
|
|
Change-Id: I8bc3a991f0ede0605d78b51ba609fbe5889513f2
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: Idd4a5f8bab5d39e5f33f5c130601175af70a20d4
Signed-off-by: Filip Varga <filip.varga@pantheon.tech>
|
|
Allows sending of unsent data in fast recovery and consolidates logic in
tcp, instead of splitting it between tcp fast retransmit and tcp output
path called by the session layer.
Change-Id: I9b12cdf2aa2ac50b9f25e46856fed037163501fe
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: I209f570634636725ce8fda5f61e900a71227b888
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
|
|
Restore parts of commit d0e812f wiped out by a7564e80
The full description of the change is in d0e812f
Change-Id: I632476cb10678a725396462f90f9b0bea9e572fa
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
|
|
Change-Id: Ib15d629c5fde7849bfa3307f42659e920eb0f463
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Refactor most of the ping code to be address-family agnostic,
and add support for chained buffers (thus, sending
the payloads bigger than 2K).
Change-Id: I749c302ca2f3390e0d1f84046fc72da5cf13e3ef
Signed-off-by: Andrew Yourtchenko <ayourtch@gmail.com>
|
|
Change-Id: I9b5f7b264f9978e3dd97b2d1eb103b7d10ac3170
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
The failsafe driver is unique because it shares device with underlying
pci device. This confuses name generation. Without this fix, the name
is wrong and multiple devices get created with same name.
Fixes: 3901a038edf4 ("dpdk: only look at PCI information on PCI devices")
Change-Id: I13796d03baf6c76dafe3667c83bea4a1ae30c48f
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
|
|
Also reset pacer on tcp retransmit timeout
Change-Id: I5a9edee4c00d1d169248d79587a9b10437c2bd87
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Also propagate tcp worker context instead of retrieving it multiple
times.
Change-Id: I7b273b981826b37783566d0172a64cd6957f3b33
Signed-off-by: Florin Coras <fcoras@cisco.com>
|
|
Change-Id: Ided6c661edc9e2035fd7b472c312e2380d3f9c0b
Signed-off-by: Eyal Bari <ebari@cisco.com>
|