Age | Commit message (Collapse) | Author | Files | Lines |
|
vlib_get_buffers can save about 1.2 clocks per packet for vxlan encap
graph node on Skylake.
Type: improvement
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Change-Id: I9cad3211883de117c1b84324e8dfad38879de2d2
|
|
Type: fix
Several tunnels encapsulation use udp as outer header and udp src port
is set by inner header flow hash, such as gtpu, geneve, vxlan, vxlan-gbd
Since flow hash of inner header is already been calculated, keeping it
to vnet_buffere[b]->ip.flow_hash should save load-balance node work to
select ECMP uplinks.
Change-Id: I0e4e2b27178f4fcc5785e221d6d1f3e8747d0d59
Signed-off-by: Shawn Ji <xiaji@tethrnet.com>
|
|
Fix UDP over IP6 checksum offload setup for VXLAN and VXLAN-GBP.
Type: fix
Signed-off-by: John Lo <loj@cisco.com>
Change-Id: If110467a68234d8eed941869a2a03735f339dc33
|
|
Using memcpy instead of complex specific copy logic. This simplify
the implementation and also improve perf slightly.
Also move adjacency data from tail to head of buffer, which improves
cache locality (header and data share the same cacheline)
Finally, fix VxLAN which used to workaround vnet_rewrite logic.
Change-Id: I770ddad9846f7ee505aa99ad417e6a61d5cbbefa
Signed-off-by: Benoît Ganne <bganne@cisco.com>
|
|
Change-Id: Ide23bb3d82024118214902850821a8184fe65dfc
Signed-off-by: Filip Tehlar <ftehlar@cisco.com>
|
|
Change-Id: Id4f37f5d4a03160572954a416efa1ef9b3d79ad1
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
For vxlan_encap, code will touch memory area before the field "data"
in struct vlib_buffer_t, however so far it is not prefetched in cache
yet for this graph node.
After applying the patch, 2~3 cycles per pkt for vxlan4_encap can be
saved on Haswell. It will bring a lot of benefits on DVN platform too.
Change-Id: I26d8c57fb3d2415726be5367117d73eb715e35ad
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
1. For vxlan, prefetching one cacheline is enough.
2. Reduce vlib_increment_combined_counter functtion calling
if possible.
Change-Id: If3a72ac40c8988caaf0d5915b695f86d799f15a9
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
|
|
moving the rewrite into the tunnel struct
Change-Id: Iec74b48e13456d32957e826cffb5ea35a8ebd1a0
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Currently for VXLAN IPv4.
Change-Id: Id4b8bc0d9f6ab043810e4d1b9f28e01c27ce0660
Signed-off-by: Igor Mikhailov (imichail) <imichail@cisco.com>
|
|
+refactor decap loop to remove repetitions and goto's
slightly improves performance in scale (3k-4k tunnels) tests (7-9 clocks)
slightly deteriorates performance in single tunnel tests (3-4 clocks)
Change-Id: I1a64ed0279c00481b61a162296c5a30f58bf29c4
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Change-Id: I62a2a6524b72115a4239fbd7dc9ac8fdc35e20ed
Signed-off-by: John Lo <loj@cisco.com>
|
|
unified some code from IPv4/6 pathes
replaced unrolled rewrite copy with simple assignment
refactored stats handling
was not tested for performance
Change-Id: I00aeb9dd5b72584e6606e1a076e5c8270389aaa4
Signed-off-by: Eyal Bari <ebari@cisco.com>
|
|
Checksum offload is implemented in VXLAN encap over both IPv4 and
IPv6. It is enabled, however, only for VXLAN over IPv6 because UDP
checksum is needed only for IPv6 and optional for IPv4.
Change-Id: Ib879f4f6da7346ba5e079d321c1dfd630f5058b8
Signed-off-by: John Lo <loj@cisco.com>
|
|
This patch deprecates stack-based thread identification,
Also removes requirement that thread stacks are adjacent.
Finally, possibly annoying for some folks, it renames
all occurences of cpu_index and cpu_number with thread
index. Using word "cpu" is misleading here as thread can
be migrated ti different CPU, and also it is not related
to linux cpu index.
Change-Id: I68cdaf661e701d2336fc953dcb9978d10a70f7c1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I7b51f88292e057c6443b12224486f2d0c9f8ae23
Signed-off-by: Damjan Marion <damarion@cisco.com>
|