aboutsummaryrefslogtreecommitdiffstats
path: root/docs/developer/corearchitecture/buffer_metadata.rst
blob: 545c31f3041fd082ea46703d0cf925ae99ab6eb5 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
Buffer Metadata
===============

Each vlib_buffer_t (packet buffer) carries buffer metadata which
describes the current packet-processing state. The underlying techniques
have been used for decades, across multiple packet processing
environments.

We will examine vpp buffer metadata in some detail, but folks who need
to manipulate and/or extend the scheme should expect to do a certain
level of code inspection.

Vlib (Vector library) primary buffer metadata
---------------------------------------------

The first 64 octets of each vlib_buffer_t carries the primary buffer
metadata. See …/src/vlib/buffer.h for full details.

Important fields:

-  i16 current_data: the signed offset in data[], pre_data[] that we are
   currently processing. If negative current header points into the
   pre-data (rewrite space) area.
-  u16 current_length: nBytes between current_data and the end of this
   buffer.
-  u32 flags: Buffer flag bits. Heavily used, not many bits left

   -  src/vlib/buffer.h flag bits

      -  VLIB_BUFFER_IS_TRACED: buffer is traced
      -  VLIB_BUFFER_NEXT_PRESENT: buffer has multiple chunks
      -  VLIB_BUFFER_TOTAL_LENGTH_VALID:
         total_length_not_including_first_buffer is valid (see below)

   -  src/vnet/buffer.h flag bits

      -  VNET_BUFFER_F_L4_CHECKSUM_COMPUTED: tcp/udp checksum has been
         computed
      -  VNET_BUFFER_F_L4_CHECKSUM_CORRECT: tcp/udp checksum is correct
      -  VNET_BUFFER_F_VLAN_2_DEEP: two vlan tags present
      -  VNET_BUFFER_F_VLAN_1_DEEP: one vlan tag present
      -  VNET_BUFFER_F_SPAN_CLONE: packet has already been cloned (span
         feature)
      -  VNET_BUFFER_F_LOOP_COUNTER_VALID: packet look-up loop count
         valid
      -  VNET_BUFFER_F_LOCALLY_ORIGINATED: packet built by vpp
      -  VNET_BUFFER_F_IS_IP4: packet is ipv4, for checksum offload
      -  VNET_BUFFER_F_IS_IP6: packet is ipv6, for checksum offload
      -  VNET_BUFFER_F_OFFLOAD_IP_CKSUM: hardware ip checksum offload
         requested
      -  VNET_BUFFER_F_OFFLOAD_TCP_CKSUM: hardware tcp checksum offload
         requested
      -  VNET_BUFFER_F_OFFLOAD_UDP_CKSUM: hardware udp checksum offload
         requested
      -  VNET_BUFFER_F_IS_NATED: natted packet, skip input checks
      -  VNET_BUFFER_F_L2_HDR_OFFSET_VALID: L2 header offset valid
      -  VNET_BUFFER_F_L3_HDR_OFFSET_VALID: L3 header offset valid
      -  VNET_BUFFER_F_L4_HDR_OFFSET_VALID: L4 header offset valid
      -  VNET_BUFFER_F_FLOW_REPORT: packet is an ipfix packet
      -  VNET_BUFFER_F_IS_DVR: packet to be reinjected into the l2
         output path
      -  VNET_BUFFER_F_QOS_DATA_VALID: QoS data valid in
         vnet_buffer_opaque2
      -  VNET_BUFFER_F_GSO: generic segmentation offload requested
      -  VNET_BUFFER_F_AVAIL1: available bit
      -  VNET_BUFFER_F_AVAIL2: available bit
      -  VNET_BUFFER_F_AVAIL3: available bit
      -  VNET_BUFFER_F_AVAIL4: available bit
      -  VNET_BUFFER_F_AVAIL5: available bit
      -  VNET_BUFFER_F_AVAIL6: available bit
      -  VNET_BUFFER_F_AVAIL7: available bit

-  u32 flow_id: generic flow identifier
-  u8 ref_count: buffer reference / clone count (e.g. for span
   replication)
-  u8 buffer_pool_index: buffer pool index which owns this buffer
-  vlib_error_t (u16) error: error code for buffers enqueued to error
   handler
-  u32 next_buffer: buffer index of next buffer in chain. Only valid if
   VLIB_BUFFER_NEXT_PRESENT is set
-  union

   -  u32 current_config_index: current index on feature arc
   -  u32 punt_reason: reason code once packet punted. Mutually
      exclusive with current_config_index

-  u32 opaque[10]: primary vnet-layer opaque data (see below)
-  END of first cache line / data initialized by the buffer allocator
-  u32 trace_index: buffer’s index in the packet trace subsystem
-  u32 total_length_not_including_first_buffer: see
   VLIB_BUFFER_TOTAL_LENGTH_VALID above
-  u32 opaque2[14]: secondary vnet-layer opaque data (see below)
-  u8 pre_data[VLIB_BUFFER_PRE_DATA_SIZE]: rewrite space, often used to
   prepend tunnel encapsulations
-  u8 data[0]: buffer data received from the wire. Ordinarily, hardware
   devices use b->data[0] as the DMA target but there are exceptions. Do
   not write code which blindly assumes that packet data starts in
   b->data[0]. Use vlib_buffer_get_current(…).

Vnet (network stack) primary buffer metadata
--------------------------------------------

Vnet primary buffer metadata occupies space reserved in the vlib opaque
field shown above, and has the type name vnet_buffer_opaque_t.
Ordinarily accessed using the vnet_buffer(b) macro. See
../src/vnet/buffer.h for full details.

Important fields:

-  u32 sw_if_index[2]: RX and TX interface handles. At the ip lookup
   stage, vnet_buffer(b)->sw_if_index[VLIB_TX] is interpreted as a FIB
   index.
-  i16 l2_hdr_offset: offset from b->data[0] of the packet L2 header.
   Valid only if b->flags & VNET_BUFFER_F_L2_HDR_OFFSET_VALID is set
-  i16 l3_hdr_offset: offset from b->data[0] of the packet L3 header.
   Valid only if b->flags & VNET_BUFFER_F_L3_HDR_OFFSET_VALID is set
-  i16 l4_hdr_offset: offset from b->data[0] of the packet L4 header.
   Valid only if b->flags & VNET_BUFFER_F_L4_HDR_OFFSET_VALID is set
-  u8 feature_arc_index: feature arc that the packet is currently
   traversing
-  union

   -  ip

      -  u32 adj_index[2]: adjacency from dest IP lookup in [VLIB_TX],
         adjacency from source ip lookup in [VLIB_RX], set to ~0 until
         source lookup done
      -  union

         -  generic fields
         -  ICMP fields
         -  reassembly fields

   -  mpls fields
   -  l2 bridging fields, only valid in the L2 path
   -  l2tpv3 fields
   -  l2 classify fields
   -  vnet policer fields
   -  MAP fields
   -  MAP-T fields
   -  ip fragmentation fields
   -  COP (whitelist/blacklist filter) fields
   -  LISP fields
   -  TCP fields

      -  connection index
      -  sequence numbers
      -  header and data offsets
      -  data length
      -  flags

   -  SCTP fields
   -  NAT fields
   -  u32 unused[6]

Vnet (network stack) secondary buffer metadata
----------------------------------------------

Vnet primary buffer metadata occupies space reserved in the vlib opaque2
field shown above, and has the type name vnet_buffer_opaque2_t.
Ordinarily accessed using the vnet_buffer2(b) macro. See
../src/vnet/buffer.h for full details.

Important fields:

-  qos fields

   -  u8 bits
   -  u8 source

-  u8 loop_counter: used to detect and report internal forwarding loops
-  group-based policy fields

   -  u8 flags
   -  u16 sclass: the packet’s source class

-  u16 gso_size: L4 payload size, persists all the way to
   interface-output in case GSO is not enabled
-  u16 gso_l4_hdr_sz: size of the L4 protocol header
-  union

   -  packet trajectory tracer (largely deprecated)

      -  u16 \*trajectory_trace; only #if VLIB_BUFFER_TRACE_TRAJECTORY >
         0

   -  packet generator

      -  u64 pg_replay_timestamp: timestamp for replayed pcap trace
         packets

   -  u32 unused[8]

Buffer Metadata Extensions
--------------------------

Plugin developers may wish to extend either the primary or secondary
vnet buffer opaque unions. Please perform a manual live variable
analysis, otherwise nodes which use shared buffer metadata space may
break things.

It’s not OK to add plugin or proprietary metadata to the core vpp engine
header files named above. Instead, proceed as follows. The example
concerns the vnet primary buffer opaque union vlib_buffer_opaque_t. It’s
a very simple variation to use the vnet secondary buffer opaque union
vlib_buffer_opaque2_t.

In a plugin header file:

::

       /* Add arbitrary buffer metadata */
       #include <vnet/buffer.h>

       typedef struct
       {
         u32 my_stuff[6];
       } my_buffer_opaque_t;

       STATIC_ASSERT (sizeof (my_buffer_opaque_t) <=
                      STRUCT_SIZE_OF (vnet_buffer_opaque_t, unused),
                      "Custom meta-data too large for vnet_buffer_opaque_t");

       #define my_buffer_opaque(b)  \
         ((my_buffer_opaque_t *)((u8 *)((b)->opaque) + STRUCT_OFFSET_OF (vnet_buffer_opaque_t, unused)))

To set data in the custom buffer opaque type given a vlib_buffer_t \*b:

::

       my_buffer_opaque (b)->my_stuff[2] = 123;

To read data from the custom buffer opaque type:

::

       stuff0 = my_buffer_opaque (b)->my_stuff[2];