1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
|
Buffer Metadata
===============
Each vlib_buffer_t (packet buffer) carries buffer metadata which
describes the current packet-processing state. The underlying
techniques have been used for decades, across multiple packet
processing environments.
We will examine vpp buffer metadata in some detail, but folks who need
to manipulate and/or extend the scheme should expect to do a certain
level of code inspection.
Vlib (Vector library) primary buffer metadata
----------------------------------------------
The first 64 octets of each vlib_buffer_t carries the primary buffer
metadata. See .../src/vlib/buffer.h for full details.
Important fields:
* i16 current_data: the signed offset in data[], pre_data[] that we
are currently processing. If negative current header points into
the pre-data (rewrite space) area.
* u16 current_length: nBytes between current_data and the end of this buffer.
* u32 flags: Buffer flag bits. Heavily used, not many bits left
* src/vlib/buffer.h flag bits
* VLIB_BUFFER_IS_TRACED: buffer is traced
* VLIB_BUFFER_NEXT_PRESENT: buffer has multiple chunks
* VLIB_BUFFER_TOTAL_LENGTH_VALID: total_length_not_including_first_buffer is valid (see below)
* src/vnet/buffer.h flag bits
* VNET_BUFFER_F_L4_CHECKSUM_COMPUTED: tcp/udp checksum has been computed
* VNET_BUFFER_F_L4_CHECKSUM_CORRECT: tcp/udp checksum is correct
* VNET_BUFFER_F_VLAN_2_DEEP: two vlan tags present
* VNET_BUFFER_F_VLAN_1_DEEP: one vlan tag present
* VNET_BUFFER_F_SPAN_CLONE: packet has already been cloned (span feature)
* VNET_BUFFER_F_LOOP_COUNTER_VALID: packet look-up loop count valid
* VNET_BUFFER_F_LOCALLY_ORIGINATED: packet built by vpp
* VNET_BUFFER_F_IS_IP4: packet is ipv4, for checksum offload
* VNET_BUFFER_F_IS_IP6: packet is ipv6, for checksum offload
* VNET_BUFFER_F_OFFLOAD_IP_CKSUM: hardware ip checksum offload requested
* VNET_BUFFER_F_OFFLOAD_TCP_CKSUM: hardware tcp checksum offload requested
* VNET_BUFFER_F_OFFLOAD_UDP_CKSUM: hardware udp checksum offload requested
* VNET_BUFFER_F_IS_NATED: natted packet, skip input checks
* VNET_BUFFER_F_L2_HDR_OFFSET_VALID: L2 header offset valid
* VNET_BUFFER_F_L3_HDR_OFFSET_VALID: L3 header offset valid
* VNET_BUFFER_F_L4_HDR_OFFSET_VALID: L4 header offset valid
* VNET_BUFFER_F_FLOW_REPORT: packet is an ipfix packet
* VNET_BUFFER_F_IS_DVR: packet to be reinjected into the l2 output path
* VNET_BUFFER_F_QOS_DATA_VALID: QoS data valid in vnet_buffer_opaque2
* VNET_BUFFER_F_GSO: generic segmentation offload requested
* VNET_BUFFER_F_AVAIL1: available bit
* VNET_BUFFER_F_AVAIL2: available bit
* VNET_BUFFER_F_AVAIL3: available bit
* VNET_BUFFER_F_AVAIL4: available bit
* VNET_BUFFER_F_AVAIL5: available bit
* VNET_BUFFER_F_AVAIL6: available bit
* VNET_BUFFER_F_AVAIL7: available bit
* u32 flow_id: generic flow identifier
* u8 ref_count: buffer reference / clone count (e.g. for span replication)
* u8 buffer_pool_index: buffer pool index which owns this buffer
* vlib_error_t (u16) error: error code for buffers enqueued to error handler
* u32 next_buffer: buffer index of next buffer in chain. Only valid if VLIB_BUFFER_NEXT_PRESENT is set
* union
* u32 current_config_index: current index on feature arc
* u32 punt_reason: reason code once packet punted. Mutually exclusive with current_config_index
* u32 opaque[10]: primary vnet-layer opaque data (see below)
* END of first cache line / data initialized by the buffer allocator
* u32 trace_index: buffer's index in the packet trace subsystem
* u32 total_length_not_including_first_buffer: see VLIB_BUFFER_TOTAL_LENGTH_VALID above
* u32 opaque2[14]: secondary vnet-layer opaque data (see below)
* u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]: rewrite space, often used to prepend tunnel encapsulations
* u8 data[0]: buffer data received from the wire. Ordinarily, hardware devices use b->data[0] as the DMA target but there are exceptions. Do not write code which blindly assumes that packet data starts in b->data[0]. Use vlib_buffer_get_current(...).
Vnet (network stack) primary buffer metadata
--------------------------------------------
Vnet primary buffer metadata occupies space reserved in the vlib
opaque field shown above, and has the type name
vnet_buffer_opaque_t. Ordinarily accessed using the vnet_buffer(b)
macro. See ../src/vnet/buffer.h for full details.
Important fields:
* u32 sw_if_index[2]: RX and TX interface handles. At the ip lookup
stage, vnet_buffer(b)->sw_if_index[VLIB_TX] is interpreted as a FIB
index.
* i16 l2_hdr_offset: offset from b->data[0] of the packet L2 header.
Valid only if b->flags & VNET_BUFFER_F_L2_HDR_OFFSET_VALID is set
* i16 l3_hdr_offset: offset from b->data[0] of the packet L3 header.
Valid only if b->flags & VNET_BUFFER_F_L3_HDR_OFFSET_VALID is set
* i16 l4_hdr_offset: offset from b->data[0] of the packet L4 header.
Valid only if b->flags & VNET_BUFFER_F_L4_HDR_OFFSET_VALID is set
* u8 feature_arc_index: feature arc that the packet is currently traversing
* union
* ip
* u32 adj_index[2]: adjacency from dest IP lookup in [VLIB_TX], adjacency
from source ip lookup in [VLIB_RX], set to ~0 until source lookup done
* union
* generic fields
* ICMP fields
* reassembly fields
* mpls fields
* l2 bridging fields, only valid in the L2 path
* l2tpv3 fields
* l2 classify fields
* vnet policer fields
* MAP fields
* MAP-T fields
* ip fragmentation fields
* COP (whitelist/blacklist filter) fields
* LISP fields
* TCP fields
* connection index
* sequence numbers
* header and data offsets
* data length
* flags
* SCTP fields
* NAT fields
* u32 unused[6]
Vnet (network stack) secondary buffer metadata
-----------------------------------------------
Vnet primary buffer metadata occupies space reserved in the vlib
opaque2 field shown above, and has the type name
vnet_buffer_opaque2_t. Ordinarily accessed using the vnet_buffer2(b)
macro. See ../src/vnet/buffer.h for full details.
Important fields:
* qos fields
* u8 bits
* u8 source
* u8 loop_counter: used to detect and report internal forwarding loops
* group-based policy fields
* u8 flags
* u16 sclass: the packet's source class
* u16 gso_size: L4 payload size, persists all the way to
interface-output in case GSO is not enabled
* u16 gso_l4_hdr_sz: size of the L4 protocol header
* union
* packet trajectory tracer (largely deprecated)
* u16 *trajectory_trace; only #if VLIB_BUFFER_TRACE_TRAJECTORY > 0
* packet generator
* u64 pg_replay_timestamp: timestamp for replayed pcap trace packets
* u32 unused[8]
Buffer Metadata Extensions
==========================
Plugin developers may wish to extend either the primary or secondary
vnet buffer opaque unions. Please perform a
manual live variable analysis, otherwise nodes which use shared buffer metadata space may break things.
It's not OK to add plugin or proprietary metadata to the core vpp
engine header files named above. Instead, proceed as follows. The
example concerns the vnet primary buffer opaque union
vlib_buffer_opaque_t. It's a very simple variation to use the vnet
secondary buffer opaque union vlib_buffer_opaque2_t.
In a plugin header file:
```
/* Add arbitrary buffer metadata */
#include <vnet/buffer.h>
typedef struct
{
u32 my_stuff[6];
} my_buffer_opaque_t;
STATIC_ASSERT (sizeof (my_buffer_opaque_t) <=
STRUCT_SIZE_OF (vnet_buffer_opaque_t, unused),
"Custom meta-data too large for vnet_buffer_opaque_t");
#define my_buffer_opaque(b) \
((my_buffer_opaque_t *)((u8 *)((b)->opaque) + STRUCT_OFFSET_OF (vnet_buffer_opaque_t, unused)))
```
To set data in the custom buffer opaque type given a vlib_buffer_t *b:
```
my_buffer_opaque (b)->my_stuff[2] = 123;
```
To read data from the custom buffer opaque type:
```
stuff0 = my_buffer_opaque (b)->my_stuff[2];
```
|