How to get best performance with NICs on Intel platforms
========================================================

This document is a step-by-step guide for getting high performance from DPDK applications on Intel platforms.


Hardware and Memory Requirements
--------------------------------

For best performance use an Intel Xeon class server system such as Ivy Bridge, Haswell or newer.

Ensure that each memory channel has at least one memory DIMM inserted, and that the memory size for each is at least 4GB.
**Note**: this has one of the most direct effects on performance.

You can check the memory configuration using ``dmidecode`` as follows::

      dmidecode -t memory | grep Locator

      Locator: DIMM_A1
      Bank Locator: NODE 1
      Locator: DIMM_A2
      Bank Locator: NODE 1
      Locator: DIMM_B1
      Bank Locator: NODE 1
      Locator: DIMM_B2
      Bank Locator: NODE 1
      ...
      Locator: DIMM_G1
      Bank Locator: NODE 2
      Locator: DIMM_G2
      Bank Locator: NODE 2
      Locator: DIMM_H1
      Bank Locator: NODE 2
      Locator: DIMM_H2
      Bank Locator: NODE 2

The sample output above shows a total of 8 channels, from ``A`` to ``H``, with 2 DIMM slots per channel.

You can also use ``dmidecode`` to determine the memory frequency::

      dmidecode -t memory | grep Speed

      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      ...
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown
      Speed: 2133 MHz
      Configured Clock Speed: 2134 MHz
      Speed: Unknown
      Configured Clock Speed: Unknown

The output shows one slot per channel reporting a speed of 2133 MHz (DDR4) and the other reporting ``Unknown`` (no DIMM fitted).
This aligns with the previous output, which showed two DIMM slots per channel, of which only one is populated.


Network Interface Card Requirements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use a `DPDK supported <http://dpdk.org/doc/nics>`_ high end NIC such as the Intel XL710 40GbE.

Make sure each NIC has been flashed with the latest version of NVM/firmware.
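
If the port is currently bound to its kernel driver, one way to check the running firmware version
is ``ethtool`` (the interface name used here is only an example)::

      ethtool -i enp3s0f0 | grep -i firmware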

Use PCIe Gen3 slots, such as Gen3 ``x8`` or Gen3 ``x16``, because PCIe Gen2 slots don't provide enough bandwidth
for 2 x 10GbE and above.
You can use ``lspci`` to check the speed of a PCI slot using something like the following::

      lspci -s 03:00.1 -vv | grep LnkSta

      LnkSta: Speed 8GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- ...
      LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ ...

When inserting NICs into PCI slots, always check the slot caption, such as CPU0 or CPU1, to see which socket the slot is connected to.

Care should be taken with NUMA.
If you are using 2 or more ports from different NICs, it is best to ensure that these NICs are on the same CPU socket.
An example of how to determine this is shown further below.


BIOS Settings
~~~~~~~~~~~~~

The following are some recommendations on BIOS settings. Different platforms will have different BIOS option names,
so the following is mainly for reference (a sketch of how to cross-check some of these settings from Linux follows the list):

#. Before starting consider resetting all BIOS settings to their default.

#. Disable all power saving options such as: Power performance tuning, CPU P-State, CPU C3 Report and CPU C6 Report.

#. Select **Performance** as the CPU Power and Performance policy.

#. Disable Turbo Boost to ensure the performance scaling increases with the number of cores.

#. Set memory frequency to the highest available number, NOT auto.

#. Disable all virtualization options when you test the physical function of the NIC, and turn on ``VT-d`` if you want to use VFIO.

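After rebooting, a few of these settings can be cross-checked from Linux. The exact sysfs paths
depend on the CPU frequency and idle drivers in use, so treat the following only as a sketch::

      # Current frequency scaling governor of each core (should report "performance").
      cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

      # With the intel_pstate driver, a value of 1 here means Turbo Boost is disabled.
      cat /sys/devices/system/cpu/intel_pstate/no_turbo
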

Linux boot command line
~~~~~~~~~~~~~~~~~~~~~~~

The following are some recommendations on GRUB boot settings (a combined example is sketched after the list):

#. Use the default grub file as a starting point.

#. Reserve 1G huge pages via the grub configuration. For example, to reserve 8 huge pages of 1G size::

      default_hugepagesz=1G hugepagesz=1G hugepages=8

#. Isolate CPU cores which will be used for DPDK. For example::

      isolcpus=2,3,4,5,6,7,8

#. If you want to use VFIO, use the following additional grub parameters::

      iommu=pt intel_iommu=on
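
As a combined sketch only (the file location and update command vary by distribution), these
parameters could be appended to ``GRUB_CMDLINE_LINUX`` in ``/etc/default/grub`` and the grub
configuration regenerated afterwards::

      # /etc/default/grub (excerpt)
      GRUB_CMDLINE_LINUX="default_hugepagesz=1G hugepagesz=1G hugepages=8 isolcpus=2,3,4,5,6,7,8 iommu=pt intel_iommu=on"

      # Regenerate the grub configuration, for example:
      grub2-mkconfig -o /boot/grub2/grub.cfg      # RHEL/Fedora style
      # or: update-grub                           # Debian/Ubuntu style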


Configurations before running DPDK
----------------------------------

1. Build the DPDK target and reserve huge pages.
   See the earlier section on :ref:`linux_gsg_hugepages` for more details.

   The following shell commands may help with building and configuration:

   .. code-block:: console

      # Build DPDK target.
      cd dpdk_folder
      make install T=x86_64-native-linuxapp-gcc -j

      # Get the hugepage size.
      awk '/Hugepagesize/ {print $2}' /proc/meminfo

      # Get the total huge page numbers.
      awk '/HugePages_Total/ {print $2} ' /proc/meminfo

      # Unmount the hugepages.
      umount `awk '/hugetlbfs/ {print $2}' /proc/mounts`

      # Create the hugepage mount folder.
      mkdir -p /mnt/huge

      # Mount to the specific folder.
      mount -t hugetlbfs nodev /mnt/huge
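
   If huge pages were not reserved on the kernel command line, 2MB pages can usually be
   allocated at runtime through sysfs; the count below is only an example:

   .. code-block:: console

      # Allocate 1024 x 2MB huge pages at runtime.
      echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages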

2. Check the CPU layout using the DPDK ``cpu_layout`` utility:

   .. code-block:: console

      cd dpdk_folder

      usertools/cpu_layout.py

   Or run ``lscpu`` to check the cores on each socket.
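
   For example, the NUMA layout alone can be listed with the following command
   (output format may vary slightly between ``lscpu`` versions):

   .. code-block:: console

      lscpu | grep NUMA

   The ``NUMA nodeX CPU(s)`` lines show which core IDs belong to each socket.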

3. Check your NIC id and related socket id:

   .. code-block:: console

      # List all the NICs with PCI address and device IDs.
      lspci -nn | grep Eth

   For example suppose your output was as follows::

      82:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      82:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.0 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]
      85:00.1 Ethernet [0200]: Intel XL710 for 40GbE QSFP+ [8086:1583]

   Check the PCI device related numa node id:

   .. code-block:: console

      cat /sys/bus/pci/devices/0000\:xx\:00.x/numa_node

   Usually ``0x:00.x`` is on socket 0 and ``8x:00.x`` is on socket 1.
   **Note**: To get the best performance, ensure that the core and NICs are in the same socket.
   In the example above ``85:00.0`` is on socket 1 and should be used by cores on socket 1 for the best performance.
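
   For example, using the PCI addresses from the sample output above, the NUMA node of
   each port can be printed in one loop:

   .. code-block:: console

      # Print the NUMA node for each of the ports listed above.
      for addr in 82:00.0 82:00.1 85:00.0 85:00.1 ; do
          echo -n "$addr: " ; cat /sys/bus/pci/devices/0000:$addr/numa_node
      done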

4. Check which kernel drivers need to be loaded and whether there is a need to unbind the network ports from their kernel drivers.
   For more details about DPDK setup and Linux kernel requirements, see :ref:`linux_gsg_compiling_dpdk` and :ref:`linux_gsg_linux_drivers`.
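
   For example, DPDK's ``dpdk-devbind`` utility can show the current bindings and, assuming
   the ``vfio-pci`` module has been loaded (``modprobe vfio-pci``), bind one of the ports
   shown above to it:

   .. code-block:: console

      # Show which driver each network device is currently bound to.
      usertools/dpdk-devbind.py --status

      # Bind a port (PCI address from the example above) to vfio-pci.
      usertools/dpdk-devbind.py --bind=vfio-pci 82:00.0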