Diffstat (limited to 'docs/gettingstarted/developers/vnet.md')
-rw-r--r-- | docs/gettingstarted/developers/vnet.md | 795 |
1 files changed, 0 insertions, 795 deletions
diff --git a/docs/gettingstarted/developers/vnet.md b/docs/gettingstarted/developers/vnet.md deleted file mode 100644 index 09f81e46643..00000000000 --- a/docs/gettingstarted/developers/vnet.md +++ /dev/null @@ -1,795 +0,0 @@ - -VNET (VPP Network Stack) -======================== - -The files associated with the VPP network stack layer are located in the -*./src/vnet* folder. The Network Stack Layer is basically an -instantiation of the code in the other layers. This layer has a vnet -library that provides vectorized layer-2 and 3 networking graph nodes, a -packet generator, and a packet tracer. - -In terms of building a packet processing application, vnet provides a -platform-independent subgraph to which one connects a couple of -device-driver nodes. - -Typical RX connections include "ethernet-input" \[full software -classification, feeds ipv4-input, ipv6-input, arp-input etc.\] and -"ipv4-input-no-checksum" \[if hardware can classify, perform ipv4 header -checksum\]. - -Effective graph dispatch function coding ----------------------------------------- - -Over the past 15 years, multiple coding styles have emerged: a -single/dual/quad loop coding model (with variations) and a -fully-pipelined coding model. - -Single/dual loops ------------------ - -The single/dual/quad loop model variations conveniently solve problems -where the number of items to process is not known in advance: typical -hardware RX-ring processing. This coding style is also very effective -when a given node will not need to cover a complex set of dependent -reads. 
- -Here is a quad/single loop which can leverage up-to-avx512 SIMD vector -units to convert buffer indices to buffer pointers: - -```c - static uword - simulated_ethernet_interface_tx (vlib_main_t * vm, - vlib_node_runtime_t * - node, vlib_frame_t * frame) - { - u32 n_left_from, *from; - u32 next_index = 0; - u32 n_bytes; - u32 thread_index = vm->thread_index; - vnet_main_t *vnm = vnet_get_main (); - vnet_interface_main_t *im = &vnm->interface_main; - vlib_buffer_t *bufs[VLIB_FRAME_SIZE], **b; - u16 nexts[VLIB_FRAME_SIZE], *next; - - n_left_from = frame->n_vectors; - from = vlib_frame_vector_args (frame); - - /* - * Convert up to VLIB_FRAME_SIZE indices in "from" to - * buffer pointers in bufs[] - */ - vlib_get_buffers (vm, from, bufs, n_left_from); - b = bufs; - next = nexts; - - /* - * While we have at least 4 vector elements (pkts) to process.. - */ - while (n_left_from >= 4) - { - /* Prefetch next quad-loop iteration. */ - if (PREDICT_TRUE (n_left_from >= 8)) - { - vlib_prefetch_buffer_header (b[4], STORE); - vlib_prefetch_buffer_header (b[5], STORE); - vlib_prefetch_buffer_header (b[6], STORE); - vlib_prefetch_buffer_header (b[7], STORE); - } - - /* - * $$$ Process 4x packets right here... - * set next[0..3] to send the packets where they need to go - */ - - do_something_to (b[0]); - do_something_to (b[1]); - do_something_to (b[2]); - do_something_to (b[3]); - - /* Process the next 0..4 packets */ - b += 4; - next += 4; - n_left_from -= 4; - } - /* - * Clean up 0...3 remaining packets at the end of the incoming frame - */ - while (n_left_from > 0) - { - /* - * $$$ Process one packet right here... 
- * set next[0] to send the packet where it needs to go - */ - do_something_to (b[0]); - - /* Process the next packet */ - b += 1; - next += 1; - n_left_from -= 1; - } - - /* - * Send the packets along their respective next-node graph arcs - * Considerable locality of reference is expected, most if not all - * packets in the inbound vector will traverse the same next-node - * arc - */ - vlib_buffer_enqueue_to_next (vm, node, from, nexts, frame->n_vectors); - - return frame->n_vectors; - } -``` - -Given a packet processing task to implement, it pays to scout around -looking for similar tasks, and think about using the same coding -pattern. It is not uncommon to recode a given graph node dispatch function -several times during performance optimization. - -Creating Packets from Scratch ------------------------------ - -At times, it's necessary to create packets from scratch and send -them. Tasks like sending keepalives or actively opening connections -come to mind. It's not difficult, but accurate buffer metadata setup is -required. - -### Allocating Buffers - -Use vlib_buffer_alloc, which allocates a set of buffer indices. For -low-performance applications, it's OK to allocate one buffer at a -time. Note that vlib_buffer_alloc(...) does NOT initialize buffer -metadata. See below. - -In high-performance cases, allocate a vector of buffer indices, -and hand them out from the end of the vector; decrement _vec_len(..) -as buffer indices are allocated. See tcp_alloc_tx_buffers(...) and -tcp_get_free_buffer_index(...) for an example. - -### Buffer Initialization Example - -The following example shows the **main points**, but is not to be -blindly cut-'n-pasted. - -```c - u32 bi0; - vlib_buffer_t *b0; - ip4_header_t *ip; - udp_header_t *udp; - u8 *data_dst; - - /* Allocate a buffer */ - if (vlib_buffer_alloc (vm, &bi0, 1) != 1) - return -1; - - b0 = vlib_get_buffer (vm, bi0); - - /* At this point b0->current_data = 0, b0->current_length = 0 */ - - /* - * Copy data into the buffer. 
This example ASSUMES that data will fit - * in a single buffer, and is e.g. an ip4 packet. - */ - if (have_packet_rewrite) - { - clib_memcpy (b0->data, data, vec_len (data)); - b0->current_length = vec_len (data); - } - else - { - /* OR, build a udp-ip packet (for example) */ - ip = vlib_buffer_get_current (b0); - udp = (udp_header_t *) (ip + 1); - data_dst = (u8 *) (udp + 1); - - ip->ip_version_and_header_length = 0x45; - ip->ttl = 254; - ip->protocol = IP_PROTOCOL_UDP; - ip->length = clib_host_to_net_u16 (sizeof (*ip) + sizeof (*udp) + - vec_len(udp_data)); - ip->src_address.as_u32 = src_address->as_u32; - ip->dst_address.as_u32 = dst_address->as_u32; - udp->src_port = clib_host_to_net_u16 (src_port); - udp->dst_port = clib_host_to_net_u16 (dst_port); - /* The UDP length field covers the UDP header plus the payload */ - udp->length = clib_host_to_net_u16 (sizeof (*udp) + vec_len (udp_data)); - clib_memcpy (data_dst, udp_data, vec_len(udp_data)); - - if (compute_udp_checksum) - { - /* RFC 7011 section 10.3.2. */ - udp->checksum = ip4_tcp_udp_compute_checksum (vm, b0, ip); - if (udp->checksum == 0) - udp->checksum = 0xffff; - } - /* Total bytes written: ip4 header + udp header + payload */ - b0->current_length = sizeof (*ip) + sizeof (*udp) + - vec_len (udp_data); - - } - b0->flags |= VLIB_BUFFER_TOTAL_LENGTH_VALID; - - /* sw_if_index 0 is the "local" interface, which always exists */ - vnet_buffer (b0)->sw_if_index[VLIB_RX] = 0; - - /* Use the default FIB index for tx lookup. Set non-zero to use another fib */ - vnet_buffer (b0)->sw_if_index[VLIB_TX] = 0; - -``` - -If your use-case calls for large packet transmission, use -vlib_buffer_chain_append_data_with_alloc(...) to create the requisite -buffer chain. - -### Enqueueing packets for lookup and transmission - -The simplest way to send a set of packets is to use -vlib_get_frame_to_node(...) to allocate fresh frame(s) to -ip4_lookup_node or ip6_lookup_node, add the constructed buffer -indices, and dispatch the frame using vlib_put_frame_to_node(...). 
- -```c - vlib_frame_t *f; - u32 *to_next; - int i; - - f = vlib_get_frame_to_node (vm, ip4_lookup_node.index); - f->n_vectors = vec_len(buffer_indices_to_send); - to_next = vlib_frame_vector_args (f); - - for (i = 0; i < vec_len (buffer_indices_to_send); i++) - to_next[i] = buffer_indices_to_send[i]; - - vlib_put_frame_to_node (vm, ip4_lookup_node.index, f); -``` - -It is inefficient to allocate and schedule single packet frames. -That's typical in case you need to send one packet per second, but -should **not** occur in a for-loop! - -Packet tracer -------------- - -Vlib includes a frame element \[packet\] trace facility, with a simple -debug CLI interface. The CLI is straightforward: "trace add -input-node-name count" to start capturing packet traces. - -To trace 100 packets on a typical x86\_64 system running the dpdk -plugin: "trace add dpdk-input 100". When using the packet generator: -"trace add pg-input 100" - -To display the packet trace: "show trace" - -Each graph node has the opportunity to capture its own trace data. It is -almost always a good idea to do so. The trace capture APIs are simple. - -The packet capture APIs snapshot binary data, to minimize processing at -capture time. Each participating graph node initialization provides a -vppinfra format-style user function to pretty-print data when required -by the VLIB "show trace" command. - -Set the VLIB node registration ".format\_trace" member to the name of -the per-graph node format function. - -Here's a simple example: - -```c - u8 * my_node_format_trace (u8 * s, va_list * args) - { - vlib_main_t * vm = va_arg (*args, vlib_main_t *); - vlib_node_t * node = va_arg (*args, vlib_node_t *); - my_node_trace_t * t = va_arg (*args, my_node_trace_t *); - - s = format (s, "My trace data was: %d", t-><whatever>); - - return s; - } -``` - -The trace framework hands the per-node format function the data it -captured as the packet whizzed by. The format function pretty-prints the -data as desired. 
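The split between cheap binary capture and deferred formatting can be illustrated outside of VPP. The following is a minimal standalone sketch in plain C, not the vlib trace API: the record layout and the names `my_node_trace_t`, `my_node_trace_capture`, and `my_node_trace_format` are hypothetical, and `snprintf` stands in for the vppinfra `format` function.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-node trace record: small, fixed-size, binary. */
typedef struct
{
  uint32_t sw_if_index;
  uint16_t next_index;
  uint16_t pkt_length;
} my_node_trace_t;

/* Capture time: store a few raw fields, no string formatting at all. */
static void
my_node_trace_capture (my_node_trace_t *t, uint32_t sw_if_index,
		       uint16_t next_index, uint16_t pkt_length)
{
  assert (t != NULL);
  t->sw_if_index = sw_if_index;
  t->next_index = next_index;
  t->pkt_length = pkt_length;
}

/* Display time: pretty-print a previously captured record into s. */
static int
my_node_trace_format (char *s, size_t n, const my_node_trace_t *t)
{
  assert (s != NULL && t != NULL);
  return snprintf (s, n, "my-node: sw_if_index %u, next index %u, length %u",
		   (unsigned) t->sw_if_index, (unsigned) t->next_index,
		   (unsigned) t->pkt_length);
}
```

The capture routine only stores integers, so its cost per packet stays small; the formatting work is paid only when someone asks to see the trace.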
- -Graph Dispatcher Pcap Tracing ------------------------------ - -The vpp graph dispatcher knows how to capture vectors of packets in pcap -format as they're dispatched. The pcap captures are as follows: - -``` - VPP graph dispatch trace record description: - - 0 1 2 3 - 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | Major Version | Minor Version | NStrings | ProtoHint | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | Buffer index (big endian) | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | VPP graph node name ... ... | NULL octet | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | Buffer Metadata ... ... | NULL octet | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | Buffer Opaque ... ... | NULL octet | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | Buffer Opaque 2 ... ... | NULL octet | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | VPP ASCII packet trace (if NStrings > 4) | NULL octet | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ - | Packet data (up to 16K) | - +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ -``` - -Graph dispatch records comprise a version stamp, an indication of how -many NULL-terminated strings will follow the record header and precede -packet data, and a protocol hint. - -The buffer index is an opaque 32-bit cookie which allows consumers of -these data to easily filter/track single packets as they traverse the -forwarding graph. - -Multiple records per packet are normal, and to be expected. Packets -will appear multiple times as they traverse the vpp forwarding -graph. In this way, vpp graph dispatch traces are significantly -different from regular network packet captures from an end-station. -This property complicates stateful packet analysis. 
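For consumers of these records, the layout in the diagram above can be sketched as a small standalone parser. This is an illustrative sketch only, not code from vpp or from the Wireshark dissector: the type `dispatch_record_t`, the function `dispatch_record_parse`, and the `DISPATCH_MAX_STRINGS` limit are hypothetical names chosen for the example. It assumes the four one-octet header fields are followed by the big-endian buffer index, then NStrings NUL-terminated strings, then packet data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DISPATCH_MAX_STRINGS 8	/* arbitrary cap chosen for this sketch */

typedef struct
{
  uint8_t major_version, minor_version, n_strings, proto_hint;
  uint32_t buffer_index;
  const char *strings[DISPATCH_MAX_STRINGS];	/* point into the record */
  const uint8_t *packet_data;
  size_t packet_len;
} dispatch_record_t;

/* Parse one record of len octets at p. Returns 0 on success, -1 if the
 * record is truncated or claims more strings than this sketch accepts. */
static int
dispatch_record_parse (const uint8_t *p, size_t len, dispatch_record_t *r)
{
  size_t off = 8;		/* 4 header octets + 32-bit buffer index */
  unsigned i;

  assert (p != NULL && r != NULL);
  if (len < off)
    return -1;

  r->major_version = p[0];
  r->minor_version = p[1];
  r->n_strings = p[2];
  r->proto_hint = p[3];
  r->buffer_index = ((uint32_t) p[4] << 24) | ((uint32_t) p[5] << 16) |
    ((uint32_t) p[6] << 8) | (uint32_t) p[7];

  if (r->n_strings > DISPATCH_MAX_STRINGS)
    return -1;

  /* Walk the NUL-terminated strings that precede the packet data. */
  for (i = 0; i < r->n_strings; i++)
    {
      const uint8_t *nul = memchr (p + off, 0, len - off);
      if (nul == NULL)
	return -1;
      r->strings[i] = (const char *) (p + off);
      off = (size_t) (nul - p) + 1;
    }

  r->packet_data = p + off;
  r->packet_len = len - off;
  return 0;
}
```

Rejecting a record outright when the string count exceeds the cap is one possible policy; a consumer could equally choose to display what it can.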
- -Restricting stateful analysis to records from a single vpp graph node -such as "ethernet-input" seems likely to improve the situation. - -As of this writing: major version = 1, minor version = 0. Nstrings -SHOULD be 4 or 5. Consumers SHOULD be wary of values less than 4 or -greater than 5. They MAY attempt to display the claimed number of -strings, or they MAY treat the condition as an error. - -Here is the current set of protocol hints: - -```c - typedef enum - { - VLIB_NODE_PROTO_HINT_NONE = 0, - VLIB_NODE_PROTO_HINT_ETHERNET, - VLIB_NODE_PROTO_HINT_IP4, - VLIB_NODE_PROTO_HINT_IP6, - VLIB_NODE_PROTO_HINT_TCP, - VLIB_NODE_PROTO_HINT_UDP, - VLIB_NODE_N_PROTO_HINTS, - } vlib_node_proto_hint_t; -``` - -Example: VLIB_NODE_PROTO_HINT_IP6 means that the first octet of packet -data SHOULD be 0x60, and should begin an ipv6 packet header. - -Downstream consumers of these data SHOULD pay attention to the -protocol hint. They MUST tolerate inaccurate hints, which MAY occur -from time to time. - -### Dispatch Pcap Trace Debug CLI - -To start a dispatch trace capture of up to 10,000 trace records: - -``` - pcap dispatch trace on max 10000 file dispatch.pcap -``` - -To start a dispatch trace which will also include standard vpp packet -tracing for packets which originate in dpdk-input: - -``` - pcap dispatch trace on max 10000 file dispatch.pcap buffer-trace dpdk-input 1000 -``` -To save the pcap trace, e.g. in /tmp/dispatch.pcap: - -``` - pcap dispatch trace off -``` - -### Wireshark dissection of dispatch pcap traces - -It almost goes without saying that we built a companion wireshark -dissector to display these traces. As of this writing, we have -upstreamed the wireshark dissector. - -Since it will be a while before wireshark/master/latest makes it into -all of the popular Linux distros, please see the "How to build a vpp -dispatch trace aware Wireshark" page for build info. - -Here is a sample packet dissection, with some fields omitted for -clarity. 
The point is that the wireshark dissector accurately -displays **all** of the vpp buffer metadata, and the name of the graph -node in question. - -``` - Frame 1: 2216 bytes on wire (17728 bits), 2216 bytes captured (17728 bits) - Encapsulation type: USER 13 (58) - [Protocols in frame: vpp:vpp-metadata:vpp-opaque:vpp-opaque2:eth:ethertype:ip:tcp:data] - VPP Dispatch Trace - BufferIndex: 0x00036663 - NodeName: ethernet-input - VPP Buffer Metadata - Metadata: flags: - Metadata: current_data: 0, current_length: 102 - Metadata: current_config_index: 0, flow_id: 0, next_buffer: 0 - Metadata: error: 0, n_add_refs: 0, buffer_pool_index: 0 - Metadata: trace_index: 0, recycle_count: 0, len_not_first_buf: 0 - Metadata: free_list_index: 0 - Metadata: - VPP Buffer Opaque - Opaque: raw: 00000007 ffffffff 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 - Opaque: sw_if_index[VLIB_RX]: 7, sw_if_index[VLIB_TX]: -1 - Opaque: L2 offset 0, L3 offset 0, L4 offset 0, feature arc index 0 - Opaque: ip.adj_index[VLIB_RX]: 0, ip.adj_index[VLIB_TX]: 0 - Opaque: ip.flow_hash: 0x0, ip.save_protocol: 0x0, ip.fib_index: 0 - Opaque: ip.save_rewrite_length: 0, ip.rpf_id: 0 - Opaque: ip.icmp.type: 0 ip.icmp.code: 0, ip.icmp.data: 0x0 - Opaque: ip.reass.next_index: 0, ip.reass.estimated_mtu: 0 - Opaque: ip.reass.fragment_first: 0 ip.reass.fragment_last: 0 - Opaque: ip.reass.range_first: 0 ip.reass.range_last: 0 - Opaque: ip.reass.next_range_bi: 0x0, ip.reass.ip6_frag_hdr_offset: 0 - Opaque: mpls.ttl: 0, mpls.exp: 0, mpls.first: 0, mpls.save_rewrite_length: 0, mpls.bier.n_bytes: 0 - Opaque: l2.feature_bitmap: 00000000, l2.bd_index: 0, l2.l2_len: 0, l2.shg: 0, l2.l2fib_sn: 0, l2.bd_age: 0 - Opaque: l2.feature_bitmap_input: none configured, L2.feature_bitmap_output: none configured - Opaque: l2t.next_index: 0, l2t.session_index: 0 - Opaque: l2_classify.table_index: 0, l2_classify.opaque_index: 0, l2_classify.hash: 0x0 - Opaque: policer.index: 0 - Opaque: ipsec.flags: 0x0, 
ipsec.sad_index: 0 - Opaque: map.mtu: 0 - Opaque: map_t.v6.saddr: 0x0, map_t.v6.daddr: 0x0, map_t.v6.frag_offset: 0, map_t.v6.l4_offset: 0 - Opaque: map_t.v6.l4_protocol: 0, map_t.checksum_offset: 0, map_t.mtu: 0 - Opaque: ip_frag.mtu: 0, ip_frag.next_index: 0, ip_frag.flags: 0x0 - Opaque: cop.current_config_index: 0 - Opaque: lisp.overlay_afi: 0 - Opaque: tcp.connection_index: 0, tcp.seq_number: 0, tcp.seq_end: 0, tcp.ack_number: 0, tcp.hdr_offset: 0, tcp.data_offset: 0 - Opaque: tcp.data_len: 0, tcp.flags: 0x0 - Opaque: sctp.connection_index: 0, sctp.sid: 0, sctp.ssn: 0, sctp.tsn: 0, sctp.hdr_offset: 0 - Opaque: sctp.data_offset: 0, sctp.data_len: 0, sctp.subconn_idx: 0, sctp.flags: 0x0 - Opaque: snat.flags: 0x0 - Opaque: - VPP Buffer Opaque2 - Opaque2: raw: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 - Opaque2: qos.bits: 0, qos.source: 0 - Opaque2: loop_counter: 0 - Opaque2: gbp.flags: 0, gbp.src_epg: 0 - Opaque2: pg_replay_timestamp: 0 - Opaque2: - Ethernet II, Src: 06:d6:01:41:3b:92 (06:d6:01:41:3b:92), Dst: IntelCor_3d:f6 Transmission Control Protocol, Src Port: 22432, Dst Port: 54084, Seq: 1, Ack: 1, Len: 36 - Source Port: 22432 - Destination Port: 54084 - TCP payload (36 bytes) - Data (36 bytes) - - 0000 cf aa 8b f5 53 14 d4 c7 29 75 3e 56 63 93 9d 11 ....S...)u>Vc... - 0010 e5 f2 92 27 86 56 4c 21 ce c5 23 46 d7 eb ec 0d ...'.VL!..#F.... - 0020 a8 98 36 5a ..6Z - Data: cfaa8bf55314d4c729753e5663939d11e5f2922786564c21… - [Length: 36] -``` - -It's a matter of a couple of mouse-clicks in Wireshark to filter the -trace to a specific buffer index. With that specific kind of filtration, -one can watch a packet walk through the forwarding graph; noting any/all -metadata changes, header checksum changes, and so forth. - -This should be of significant value when developing new vpp graph -nodes. 
If new code mispositions b->current_data, it will be completely -obvious from looking at the dispatch trace in wireshark. - -## pcap rx, tx, and drop tracing - -vpp also supports rx, tx, and drop packet capture in pcap format, -through the "pcap trace" debug CLI command. - -This command is used to start or stop a packet capture, or show the -status of packet capture. Each of "pcap trace rx", "pcap trace tx", -and "pcap trace drop" is implemented. Supply one or more of "rx", -"tx", and "drop" to enable multiple simultaneous capture types. - -These commands have the following optional parameters: - -- <b>rx</b> - trace received packets. - -- <b>tx</b> - trace transmitted packets. - -- <b>drop</b> - trace dropped packets. - -- <b>max _nnnn_</b> - file size, number of packet captures. Once - _nnnn_ packets have been received, the trace buffer is flushed - to the indicated file. Defaults to 1000. Can only be updated if packet - capture is off. - -- <b>max-bytes-per-pkt _nnnn_</b> - maximum number of bytes to trace - on a per-packet basis. Must be >32 and less than 9000. Default value: - 512. - -- <b>filter</b> - Use the pcap rx / tx / drop trace filter, which must - be configured. Use <b>classify filter pcap...</b> to configure the - filter. The filter will only be executed if the per-interface or - any-interface tests fail. - -- <b>intfc _interface_ | _any_</b> - Used to specify a given interface, - or use '<em>any</em>' to run packet capture on all interfaces. - '<em>any</em>' is the default if not provided. Settings from a previous - packet capture are preserved, so '<em>any</em>' can be used to reset - the interface setting. - -- <b>file _filename_</b> - Used to specify the output filename. The - file will be placed in the '<em>/tmp</em>' directory. If _filename_ - already exists, the file will be overwritten. If no filename is - provided, '<em>/tmp/rx.pcap or tx.pcap</em>' will be used, depending - on capture direction. 
Can only be updated when pcap capture is off. - -- <b>status</b> - Displays the current status and configured - attributes associated with a packet capture. If packet capture is in - progress, '<em>status</em>' also will return the number of packets - currently in the buffer. Any additional attributes entered on - command line with a '<em>status</em>' request will be ignored. - -- <b>filter</b> - Capture packets which match the current packet - trace filter set. See next section. Configure the capture filter - first. - -## packet trace capture filtering - -The "classify filter pcap | <interface-name> | trace" debug CLI command -constructs an arbitrary set of packet classifier tables for use with -"pcap rx | tx | drop trace," and with the vpp packet tracer on a -per-interface or system-wide basis. - -Packets which match a rule in the classifier table chain will be -traced. The tables are automatically ordered so that matches in the -most specific table are tried first. - -It's reasonably likely that folks will configure a single table with -one or two matches. As a result, we configure 8 hash buckets and 128K -of match rule space by default. One can override the defaults by -specifying "buckets <nnn>" and "memory-size <xxx>" as desired. - -To build up complex filter chains, repeatedly issue the classify -filter debug CLI command. Each command must specify the desired mask -and match values. If a classifier table with a suitable mask already -exists, the CLI command adds a match rule to the existing table. 
If -not, the CLI command adds a new table and the indicated mask rule. - -### Configure a simple pcap classify filter - -``` - classify filter pcap mask l3 ip4 src match l3 ip4 src 192.168.1.11 - pcap trace rx max 100 filter -``` - -### Configure a simple per-interface capture filter - -``` - classify filter GigabitEthernet3/0/0 mask l3 ip4 src match l3 ip4 src 192.168.1.11 - pcap trace rx max 100 intfc GigabitEthernet3/0/0 -``` - -Note that per-interface capture filters are _always_ applied. - -### Clear per-interface capture filters - -``` - classify filter GigabitEthernet3/0/0 del -``` - -### Configure another fairly simple pcap classify filter - -``` - classify filter pcap mask l3 ip4 src dst match l3 ip4 src 192.168.1.10 dst 192.168.2.10 - pcap trace tx max 100 filter -``` - -### Configure a vpp packet tracer filter - -``` - classify filter trace mask l3 ip4 src dst match l3 ip4 src 192.168.1.10 dst 192.168.2.10 - trace add dpdk-input 100 filter -``` - -### Clear all current classifier filters - -``` - classify filter [pcap | <interface> | trace] del -``` - -### To inspect the classifier tables - -``` - show classify table [verbose] -``` - -The verbose form displays all of the match rules, with hit-counters. - -### Terse description of the "mask <xxx>" syntax: - -``` - l2 src dst proto tag1 tag2 ignore-tag1 ignore-tag2 cos1 cos2 dot1q dot1ad - l3 ip4 <ip4-mask> ip6 <ip6-mask> - <ip4-mask> version hdr_length src[/width] dst[/width] - tos length fragment_id ttl protocol checksum - <ip6-mask> version traffic-class flow-label src dst proto - payload_length hop_limit protocol - l4 tcp <tcp-mask> udp <udp-mask> src_port dst_port - <tcp-mask> src dst # ports - <udp-mask> src_port dst_port -``` - -To construct **matches**, add the values to match after the indicated -keywords in the mask syntax. For example: "... mask l3 ip4 src" -> -"... 
match l3 ip4 src 192.168.1.11" - -## VPP Packet Generator - -We use the VPP packet generator to inject packets into the forwarding -graph. The packet generator can replay pcap traces, and generate packets -out of whole cloth at respectably high performance. - -The VPP pg enables quite a variety of use-cases, ranging from functional -testing of new data-plane nodes to regression testing to performance -tuning. - -## PG setup scripts - -PG setup scripts describe traffic in detail, and leverage vpp debug -CLI mechanisms. It's reasonably unusual to construct a pg setup script -which doesn't include a certain amount of interface and FIB configuration. - -For example: - -``` - loop create - set int ip address loop0 192.168.1.1/24 - set int state loop0 up - - packet-generator new { - name pg0 - limit 100 - rate 1e6 - size 300-300 - interface loop0 - node ethernet-input - data { IP4: 1.2.3 -> 4.5.6 - UDP: 192.168.1.10 - 192.168.1.254 -> 192.168.2.10 - UDP: 1234 -> 2345 - incrementing 286 - } - } -``` - -A packet generator stream definition includes two major sections: -- Stream Parameter Setup -- Packet Data - -### Stream Parameter Setup - -Given the example above, let's look at how to set up stream -parameters: - -- **name pg0** - Name of the stream, in this case "pg0" - -- **limit 100** - Number of packets to send when the stream is -enabled. "limit 0" means send packets continuously. - -- **maxframe \<nnn\>** - Maximum frame size. Handy for injecting -multiple frames no larger than \<nnn\>. Useful for checking dual / -quad loop codes - -- **rate 1e6** - Packet injection rate, in this case 1 MPPS. When not -specified, the packet generator injects packets as fast as possible - -- **size 300-300** - Packet size range, in this case send 300-byte packets - -- **interface loop0** - Packets appear as if they were received on the -specified interface. This datum is used in multiple ways: to select -graph arc feature configuration, to select IP FIBs. Configure -features e.g. 
on loop0 to exercise those features. - -- **tx-interface \<name\>** - Packets will be transmitted on the -indicated interface. Typically required only when injecting packets -into post-IP-rewrite graph nodes. - -- **pcap \<filename\>** - Replay packets from the indicated pcap -capture file. "make test" makes extensive use of this feature: -generate packets using scapy, save them in a .pcap file, then inject -them into the vpp graph via a vpp pg "pcap \<filename\>" stream -definition - -- **worker \<nn\>** - Generate packets for the stream using the -indicated vpp worker thread. The vpp pg generates and injects O(10 -MPPS / core). Use multiple stream definitions and worker threads to -generate and inject enough traffic to easily fill a 40 Gbit pipe with -small packets. - -### Data definition - -Packet generator data definitions make use of a layered implementation -strategy. Networking layers are specified in order, and the notation can -seem a bit counter-intuitive. In the example above, the data -definition stanza constructs a set of L2-L4 header layers, and -uses an incrementing fill pattern to round out the requested 300-byte -packets. - -- **IP4: 1.2.3 -> 4.5.6** - Construct an L2 (MAC) header with the ip4 -ethertype (0x800), src MAC address of 00:01:00:02:00:03 and dst MAC -address of 00:04:00:05:00:06. Mac addresses may be specified in either -_xxxx.xxxx.xxxx_ format or _xx:xx:xx:xx:xx:xx_ format. - -- **UDP: 192.168.1.10 - 192.168.1.254 -> 192.168.2.10** - Construct an -incrementing set of L3 (IPv4) headers for successive packets with -source addresses ranging from .10 to .254. All packets in the stream -have a constant dest address of 192.168.2.10. Set the protocol field -to 17, UDP. - -- **UDP: 1234 -> 2345** - Set the UDP source and destination ports to -1234 and 2345, respectively - -- **incrementing 286** - Insert up to 286 incrementing data bytes. 
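The size arithmetic behind the example stream can be checked with a short standalone sketch. It assumes an untagged 14-byte Ethernet header, a 20-byte IPv4 header without options, and an 8-byte UDP header, so a 300-byte packet has room for 258 fill bytes. The helper names `pg_fill_bytes_needed` and `pg_fill_incrementing` are hypothetical, and the fill pattern starting at 0 and wrapping after 255 is an assumption made for illustration, not a statement about the packet generator's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Header sizes, assuming no VLAN tags and no IPv4 options. */
enum
{
  ETHERNET_HEADER_BYTES = 14,
  IP4_HEADER_BYTES = 20,
  UDP_HEADER_BYTES = 8,
  L2_L4_HEADER_BYTES =
    ETHERNET_HEADER_BYTES + IP4_HEADER_BYTES + UDP_HEADER_BYTES
};

/* Fill bytes needed to pad the L2-L4 headers out to packet_size.
 * Returns 0 when the headers alone already reach packet_size. */
static size_t
pg_fill_bytes_needed (size_t packet_size)
{
  if (packet_size <= L2_L4_HEADER_BYTES)
    return 0;
  return packet_size - L2_L4_HEADER_BYTES;
}

/* Write an incrementing byte pattern (assumed: 0, 1, 2, ... wrapping
 * after 255) into dst. */
static void
pg_fill_incrementing (uint8_t *dst, size_t n_bytes)
{
  size_t i;

  assert (dst != NULL || n_bytes == 0);
  for (i = 0; i < n_bytes; i++)
    dst[i] = (uint8_t) (i & 0xff);
}
```

Under these assumptions, the "incrementing 286" in the example acts as an upper bound, since the 300-byte size limit leaves room for fewer fill bytes than requested.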
- -Obvious variations involve "s/IP4/IP6/" in the above, along with -changing from IPv4 to IPv6 address notation. - -The vpp pg can set any / all IPv4 header fields, including tos, packet -length, mf / df / fragment id and offset, ttl, protocol, checksum, and -src/dst addresses. Take a look at ../src/vnet/ip/ip[46]_pg.c for -details. - -If all else fails, specify the entire packet data in hex: - -- **hex 0xabcd...** - copy hex data verbatim into the packet - -When replaying pcap files ("**pcap \<filename\>**"), do not specify a -data stanza. - -### Diagnosing "packet-generator new" parse failures - -If you want to inject packets into a brand-new graph node, remember -to tell the packet generator debug CLI how to parse the packet -data stanza. - -If the node expects L2 Ethernet MAC headers, specify ".unformat_buffer -= unformat_ethernet_header": - -``` - /* *INDENT-OFF* */ - VLIB_REGISTER_NODE (ethernet_input_node) = - { - <snip> - .unformat_buffer = unformat_ethernet_header, - <snip> - }; -``` - -Beyond that, it may be necessary to set breakpoints in -.../src/vnet/pg/cli.c. Debug image suggested. - -When debugging new nodes, it may be far simpler to directly inject -ethernet frames - and add a corresponding vlib_buffer_advance in the -new node - than to modify the packet generator. - -## Debug CLI - -The descriptions above describe the "packet-generator new" debug CLI in -detail. - -Additional debug CLI commands include: - -``` - vpp# packet-generator enable [<stream-name>] -``` - -which enables the named stream, or all streams. - -``` - vpp# packet-generator disable [<stream-name>] -``` - -disables the named stream, or all streams. - - -``` - vpp# packet-generator delete <stream-name> -``` - -Deletes the named stream. - -``` - vpp# packet-generator configure <stream-name> [limit <nnn>] - [rate <f64-pps>] [size <nn>-<nn>] -``` - -Changes stream parameters without having to recreate the entire stream -definition. 
Note that re-issuing a "packet-generator new" command will -correctly recreate the named stream.