Age | Commit message (Collapse) | Author | Files | Lines |
|
Too many prefetches within loop unrollings induce bottleneck and
performance degradation on some CPUs which have less cache line fill
buffers, e.g, Arm Cortex-A72.
Apply dual loop unrolling and tune prefetches manually to resolve
hot-spot with prefetch instructions.
It saves about 11.5% cycles with ip4_input node on Cortex-A72 CPUs.
Type: feature
Change-Id: I1ac9eb21061a804af2a414b420217fbcda3689c9
Signed-off-by: Lijian Zhang <Lijian.Zhang@arm.com>
|
|
VPP graph dispatch trace record description:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Major Version | Minor Version | NStrings | ProtoHint |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer index (big endian) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
+ VPP graph node name ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Metadata ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Opaque ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Buffer Opaque 2 ... ... | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| VPP ASCII packet trace (if NStrings > 4) | NULL octet |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Packet data (up to 16K) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Graph dispatch records comprise a version stamp, an indication of how
many NULL-terminated strings will follow the record header, and a
protocol hint.
The buffer index allows downstream consumers of these data to easily
filter/track single packets as they traverse the forwarding
graph. FWIW, the 32-bit buffer index is stored in big endian format.
As of this writing, major version = 1, minor version = 0. Nstrings
will be either 4 or 5.
Here is the current set of protocol hints:
typedef enum
{
VLIB_NODE_PROTO_HINT_NONE = 0,
VLIB_NODE_PROTO_HINT_ETHERNET,
VLIB_NODE_PROTO_HINT_IP4,
VLIB_NODE_PROTO_HINT_IP6,
VLIB_NODE_PROTO_HINT_TCP,
VLIB_NODE_PROTO_HINT_UDP,
VLIB_NODE_N_PROTO_HINTS,
} vlib_node_proto_hint_t;
Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet
data SHOULD be 0x60, and should begin an ipv6 packet header.
Change-Id: Idf310bad80cc0e4207394c80f18db5f77c378741
Signed-off-by: Dave Barach <dave@barachs.net>
|
|
Change-Id: Iba750a41262cc028ad0363fff78cc219e4a33538
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
There are two reasons to modify the existing code ip4_input_inline.
1. For many tunnel decap cases, inner ip header or its part is possible
in the second cacheline, not first cacheline only after the field "data",
and this will cause data cache miss once the second cacheline is needed
to access. e.g vxlan-gpe.
2. For most of cases, "data" is the starting address of ethernet
header, not IP header. The existing code causes misunderstanding
from code readability perspective.
Change-Id: I43e119b899dbde95803bccbac54259729fd2cddf
Signed-off-by: Zhiyong Yang <zhiyong.yang@intel.com>
Signed-off-by: Yuwei Zhang <yuwei1.zhang@intel.com>
|
|
Change-Id: Ic7f7af983d5b6d756748023aa0c650f53e9285cf
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
This significantly reduces need for
...
in multiarch code. Simply constructor macros will jost create static unused
entry if CLIB_MARCH_VARIANT is defined and that will be optimized out by
compiler.
Change-Id: I17d1c4ac0c903adcfadaa4a07de1b854c7ab14ac
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: I810d834c407bd404d5f0544cdec0674f0bb92d31
Signed-off-by: Dave Barach <dave@barachs.net>
Signed-off-by: Dave Barach <dbarach@cisco.com>
|
|
each edge/arc from these nodes must be the same.
Change-Id: Id5dace61bca0af71ad1df98583425226e81fd0de
Signed-off-by: Neale Ranns <neale.ranns@cisco.com>
|
|
It is cheaper to get thread index from vlib_main_t if available...
Change-Id: I4582e160d06d9d7fccdc54271912f0635da79b50
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
and a new ip4-options node, inserted between ip4-input and ip4-punt,
that checks for IP-router-alert option + IGMP combination and sends
the packet to the ip4-local. This is required because some IGMP
packets are sent to the group address and not the all-routers address.
All IGMP packets are sent with the router alert option.
Change-Id: I01f478d4d98ac9f806e0bcba0f6da6e4e7d26e2a
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: Ibab5e27277f618ceb2d543b9d6a1a5f191e7d1db
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Gain is around 6 clocks per packet (22 to 16).
Change-Id: Ia6f4293ea9062368a9a6b235c650591dbc0707d0
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
Change-Id: Ic5dcadd13c88b8a5e7896dab82404509c081614a
Signed-off-by: Klement Sekera <ksekera@cisco.com>
|
|
Change-Id: I7aafdecd6f370411138e6ab67b2ff72cda6e0666
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
This patch deprecates stack-based thread identification,
Also removes requirement that thread stacks are adjacent.
Finally, possibly annoying for some folks, it renames
all occurences of cpu_index and cpu_number with thread
index. Using word "cpu" is misleading here as thread can
be migrated ti different CPU, and also it is not related
to linux cpu index.
Change-Id: I68cdaf661e701d2336fc953dcb9978d10a70f7c1
Signed-off-by: Damjan Marion <damarion@cisco.com>
|
|
- IPv[46] mfib tables with support for (*,G/m), (*,G) and (S,G) exact and longest prefix match
- Replication represented via a new replicate DPO.
- RPF configuration and data-plane checking
- data-plane signals sent to listening control planes.
The functions of multicast forwarding entries differ from their unicast conterparts, so we introduce a new mfib_table_t and mfib_entry_t objects. However, we re-use the fib_path_list to resolve and build the entry's output list. the fib_path_list provides the service to construct a replicate DPO for multicast.
'make tests' is added to with two new suites; TEST=mfib, this is invocation of the CLI command 'test mfib' which deals with many path add/remove, flag set/unset scenarios, TEST=ip-mcast, data-plane forwarding tests.
Updated applications to use the new MIFB functions;
- IPv6 NS/RA.
- DHCPv6
unit tests for these are undated accordingly.
Change-Id: I49ec37b01f1b170335a5697541c8fd30e6d3a961
Signed-off-by: Neale Ranns <nranns@cisco.com>
|
|
Change-Id: I7b51f88292e057c6443b12224486f2d0c9f8ae23
Signed-off-by: Damjan Marion <damarion@cisco.com>
|