MAP and Lw4o6
=============
This is a memo intended to contain documentation of the VPP MAP and
Lw4o6 implementations. Everything that is not directly obvious should be
documented here.
MAP-E Virtual Reassembly
------------------------
The MAP-E implementation supports handling of IPv4 fragments as well as
of IPv4-in-IPv6 inner and outer fragments. This is called virtual
reassembly because the fragments are not actually reassembled. Instead,
some metadata is kept about the first fragment and reused for
subsequent fragments.
Fragment caching and handling is not always necessary. It is performed
when:

* An IPv4 fragment is received and the destination IPv4 address is
  shared.
* An IPv6 packet is received with an inner IPv4 fragment, the IPv4
  source address is shared, and ‘security-check fragments’ is on.
* An IPv6 fragment is received.
There are 3 dedicated nodes:

* ip4-map-reass
* ip6-map-ip4-reass
* ip6-map-ip6-reass
ip4-map sends all fragments to ip4-map-reass. ip6-map sends all
inner-fragments to ip6-map-ip4-reass. ip6-map sends all outer-fragments
to ip6-map-ip6-reass.
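As an illustration of the kind of check that decides whether a packet is
an IPv4 fragment in the first place, here is a minimal standalone sketch
based on the standard IPv4 header fields (the type and helper names are
illustrative, not the actual VPP definitions)::

   #include <stdint.h>
   #include <arpa/inet.h>

   /* Minimal IPv4 header layout; only the fragmentation fields matter here. */
   typedef struct
   {
     uint8_t  ip_version_and_header_length;
     uint8_t  tos;
     uint16_t length;
     uint16_t fragment_id;
     uint16_t flags_and_fragment_offset; /* 3 flag bits + 13-bit offset */
     uint8_t  ttl;
     uint8_t  protocol;
     uint16_t checksum;
     uint32_t src_address;
     uint32_t dst_address;
   } example_ip4_header_t;

   /* A packet is an IPv4 fragment when the More-Fragments bit is set or
    * the fragment offset is non-zero; such packets are the ones handed
    * over to ip4-map-reass. */
   static inline int
   example_ip4_is_fragment (const example_ip4_header_t *ip)
   {
     uint16_t f = ntohs (ip->flags_and_fragment_offset);
     return (f & 0x2000) != 0 || (f & 0x1fff) != 0;
   }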
IPv4 (resp. IPv6) virtual reassembly makes use of a hash table in order
to store IPv4 (resp. IPv6) reassembly structures. The hash-key is based
on the IPv4-src:IPv4-dst:Frag-ID:Protocol tuple (resp.
IPv6-src:IPv6-dst:Frag-ID tuple, as the protocol is IPv4-in-IPv6).
Therefore, each packet reassembly makes use of exactly one reassembly
structure. When such a structure is allocated, it is timestamped with
the current time. Finally, those structures are capable of storing a
limited number of buffer indexes.
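For illustration, the lookup keys could be laid out roughly as follows;
the type and field names are assumptions made for this sketch, not the
actual VPP definitions::

   #include <stdint.h>

   /* Illustrative IPv4 reassembly key:
    * IPv4-src : IPv4-dst : Frag-ID : Protocol. */
   typedef struct
   {
     uint32_t src;          /* IPv4 source address */
     uint32_t dst;          /* IPv4 destination address */
     uint16_t fragment_id;  /* IPv4 Identification field */
     uint8_t  protocol;     /* IPv4 Protocol field */
   } example_map_ip4_reass_key_t;

   /* Illustrative IPv6 reassembly key: the protocol is implicitly
    * IPv4-in-IPv6, so only the addresses and fragment ID are needed. */
   typedef struct
   {
     uint8_t  src[16];      /* IPv6 source address */
     uint8_t  dst[16];      /* IPv6 destination address */
     uint32_t fragment_id;  /* Fragment header Identification field */
   } example_map_ip6_reass_key_t;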
An IPv4 (resp. IPv6) reassembly structure can cache up to
MAP_IP4_REASS_MAX_FRAGMENTS_PER_REASSEMBLY (resp.
MAP_IP6_REASS_MAX_FRAGMENTS_PER_REASSEMBLY) buffers. Buffers are cached
until the first fragment is received.
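Putting the two previous points together, an IPv4 reassembly structure
could look roughly like this (an illustrative sketch; the constant's
value and all names are assumptions, not the actual VPP definitions)::

   #include <stdint.h>

   /* Assumed example value for the per-reassembly fragment limit. */
   #define EXAMPLE_MAP_IP4_REASS_MAX_FRAGMENTS_PER_REASSEMBLY 5

   typedef struct
   {
     example_map_ip4_reass_key_t key;  /* lookup key (see the sketch above) */
     uint64_t alloc_time_ms;           /* allocation timestamp, used for expiry */
     /* Buffer indexes cached until the first fragment is received;
      * ~0 marks an unused slot. */
     uint32_t fragments[EXAMPLE_MAP_IP4_REASS_MAX_FRAGMENTS_PER_REASSEMBLY];
   } example_map_ip4_reass_t;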
Virtual Reassembly configuration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPv4 and IPv6 virtual reassembly support the following configuration::

   map params reassembly [ip4 | ip6] [lifetime <lifetime-ms>]
     [pool-size <pool-size>] [buffers <buffers>] [ht-ratio <ht-ratio>]
lifetime: The time in milliseconds during which a reassembly structure
is considered valid. The longer the lifetime, the more reliable the
reassembly, but the more likely the pool of reassembly structures is to
be exhausted. The IPv4 standard suggests a lifetime of 15 seconds; IPv6
specifies a lifetime of 60 seconds. Those values are not realistic for
high-throughput cases.
buffers: The upper limit on the number of buffers that may be cached. It
can be used to protect against fragmentation attacks that aim to exhaust
the global buffer pool.
pool-size: The number of reassembly structures that can be allocated. As
each structure can store a small fixed number of fragments, it also sets
an upper bound of ‘pool-size * MAP_IPX_REASS_MAX_FRAGMENTS_PER_REASSEMBLY’
buffers that can be cached in total.
ht-ratio: The number of buckets in the hash table is pool-size *
ht-ratio.
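For example, the following command (with purely illustrative values)
would shorten the IPv4 lifetime to 100 ms, allow 1024 concurrent
reassemblies, and cap the cache at 2048 buffers::

   map params reassembly ip4 lifetime 100 pool-size 1024 buffers 2048 ht-ratio 8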
Any time pool-size or ht-ratio is modified, the hash table is destroyed
and re-created, which means all current state is lost.
Additional considerations
^^^^^^^^^^^^^^^^^^^^^^^^^
Reassembly at a high rate is expensive in terms of buffers. There is a
trade-off between the lifetime and the number of allocated buffers.
Reducing the lifetime helps, but at the cost of losing state for
fragments that arrive far apart in time.
Let:

* R be the rate at which fragments are received (fragments per second)
* F be the number of fragments per packet

Assuming the first fragment is always received last, we should have::

   buffers > lifetime * R / F * (F - 1)
   pool-size > lifetime * R / F
This is a worst case. Receiving the first fragment earlier helps reduce
the number of required buffers. Also, an optimization is implemented
(MAP_IP6_REASS_COUNT_BYTES and MAP_IP4_REASS_COUNT_BYTES) which counts
the number of transmitted bytes and remembers, based on the last
fragment, the total number of bytes that should be transmitted, and
therefore helps reduce ‘pool-size’.
But the formula shows that it is challenging to forward a significant
amount of fragmented packets at high rates. For instance, with a
lifetime of 1 second, a 5 Mpps fragment rate would require buffering up
to 2.5 million fragments.
If you want to do that, be prepared to configure a lot of buffers.
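As a sanity check on those figures, here is a small standalone
calculation of the two bounds. It assumes two fragments per packet,
which is an assumption made for this example only (it reproduces the
2.5 million figure above)::

   #include <stdio.h>

   int
   main (void)
   {
     double lifetime = 1.0;  /* seconds */
     double R = 5e6;         /* fragments received per second */
     double F = 2.0;         /* fragments per packet (assumed) */

     /* Worst case: the first fragment of each packet arrives last. */
     double buffers = lifetime * R / F * (F - 1.0);
     double pool_size = lifetime * R / F;

     printf ("buffers   > %.0f\n", buffers);    /* prints 2500000 */
     printf ("pool-size > %.0f\n", pool_size);  /* prints 2500000 */
     return 0;
   }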