Calibration Data - Cascade Lake
-------------------------------

Following sections include sample calibration data measured on the
s32-t27-sut1 server running in one of the Intel Xeon Cascade Lake testbeds
as specified in `FD.io CSIT testbeds - Xeon Cascade Lake`_.

Calibration data obtained from all other servers in the Cascade Lake
testbeds shows the same or similar values.


Linux cmdline
~~~~~~~~~~~~~

::

    $ cat /proc/cmdline
    BOOT_IMAGE=/boot/vmlinuz-4.15.0-72-generic root=UUID=1d03969e-a2a0-41b2-a97e-1cc171b07e88 ro isolcpus=1-23,25-47,49-71,73-95 nohz_full=1-23,25-47,49-71,73-95 rcu_nocbs=1-23,25-47,49-71,73-95 numa_balancing=disable intel_pstate=disable intel_iommu=on iommu=pt nmi_watchdog=0 audit=0 nosoftlockup processor.max_cstate=1 intel_idle.max_cstate=1 hpet=disable tsc=reliable mce=off console=tty0 console=ttyS0,115200n8
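
The ``isolcpus=``, ``nohz_full=`` and ``rcu_nocbs=`` ranges above isolate the
worker cores from the Linux scheduler. A minimal sketch (assuming Python is
available on the SUT; the helper name is illustrative) that expands such a
kernel CPU list, useful for cross-checking against
``/sys/devices/system/cpu/isolated``:

```python
def expand_cpu_ranges(spec: str) -> list[int]:
    """Expand a kernel CPU list like '1-23,25-47' into explicit CPU ids."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus

# The isolcpus= value from the cmdline captured above.
isolated = expand_cpu_ranges("1-23,25-47,49-71,73-95")
print(len(isolated))  # 92 isolated cores; cores 0, 24, 48, 72 remain for housekeeping
```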

Linux uname
~~~~~~~~~~~

::

    $ uname -a
    Linux s32-t27-sut1 4.15.0-72-generic #81-Ubuntu SMP Tue Nov 26 12:20:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux


System-level Core Jitter
~~~~~~~~~~~~~~~~~~~~~~~~

::

    $ sudo taskset -c 3 /home/testuser/pma_tools/jitter/jitter -i 30
    Linux Jitter testing program version 1.9
    Iterations=30
    The program will execute a dummy function 80000 times
    Display is updated every 20000 displayUpdate intervals
    Thread affinity will be set to core_id:7
    Timings are in CPU Core cycles
    Inst_Min:    Minimum Execution time during the display update interval(default is ~1 second)
    Inst_Max:    Maximum Execution time during the display update interval(default is ~1 second)
    Inst_jitter: Jitter in the Execution time during the display update interval. This is the value of interest
    last_Exec:   The Execution time of last iteration just before the display update
    Abs_Min:     Absolute Minimum Execution time since the program started or statistics were reset
    Abs_Max:     Absolute Maximum Execution time since the program started or statistics were reset
    tmp:         Cumulative value calculated by the dummy function
    Interval:    Time interval between the display updates in Core Cycles
    Sample No:   Sample number

    Inst_Min,Inst_Max,Inst_jitter,last_Exec,Abs_min,Abs_max,tmp,Interval,Sample No
    160022,167590,7568,160026,160022,167590,2057568256,3203711852,1
    160022,170628,10606,160024,160022,170628,4079222784,3204010824,2
    160022,169824,9802,160024,160022,170628,1805910016,3203812064,3
    160022,168832,8810,160030,160022,170628,3827564544,3203792594,4
    160022,168248,8226,160026,160022,170628,1554251776,3203765920,5
    160022,167834,7812,160028,160022,170628,3575906304,3203761114,6
    160022,167442,7420,160024,160022,170628,1302593536,3203769250,7
    160022,169120,9098,160028,160022,170628,3324248064,3203853340,8
    160022,170710,10688,160024,160022,170710,1050935296,3203985878,9
    160022,167952,7930,160024,160022,170710,3072589824,3203733756,10
    160022,168314,8292,160030,160022,170710,799277056,3203741152,11
    160022,169672,9650,160024,160022,170710,2820931584,3203739910,12
    160022,168684,8662,160024,160022,170710,547618816,3203727336,13
    160022,168246,8224,160024,160022,170710,2569273344,3203739052,14
    160022,168134,8112,160030,160022,170710,295960576,3203735874,15
    160022,170230,10208,160024,160022,170710,2317615104,3203996356,16
    160022,167190,7168,160024,160022,170710,44302336,3203713628,17
    160022,167304,7282,160024,160022,170710,2065956864,3203717954,18
    160022,167500,7478,160024,160022,170710,4087611392,3203706674,19
    160022,167302,7280,160024,160022,170710,1814298624,3203726452,20
    160022,167266,7244,160024,160022,170710,3835953152,3203702804,21
    160022,167820,7798,160022,160022,170710,1562640384,3203719138,22
    160022,168100,8078,160024,160022,170710,3584294912,3203716636,23
    160022,170408,10386,160024,160022,170710,1310982144,3203946958,24
    160022,167276,7254,160024,160022,170710,3332636672,3203706236,25
    160022,167052,7030,160024,160022,170710,1059323904,3203696444,26
    160022,170322,10300,160024,160022,170710,3080978432,3203747514,27
    160022,167332,7310,160024,160022,170710,807665664,3203716210,28
    160022,167426,7404,160026,160022,170710,2829320192,3203700630,29
    160022,168840,8818,160024,160022,170710,556007424,3203727658,30
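
The value of interest above is ``Inst_jitter``, reported in core cycles. A
minimal sketch of how to reduce a few sample rows from the output above to a
worst-case figure; the 2.5 GHz core frequency used for the cycles-to-time
conversion is an illustrative assumption, not a measured value for this server:

```python
# (Inst_Min, Inst_Max, Inst_jitter) triples sampled from the output above.
rows = [
    (160022, 167590, 7568),
    (160022, 170628, 10606),
    (160022, 170710, 10688),
]

worst = max(r[2] for r in rows)  # worst-case Inst_jitter in core cycles
print(worst)  # 10688 cycles

# Converting cycles to time requires the core frequency; 2.5 GHz is assumed.
freq_hz = 2.5e9
print(worst / freq_hz * 1e9)  # worst-case jitter in nanoseconds
```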


Memory Bandwidth
~~~~~~~~~~~~~~~~

::

    $ sudo /home/testuser/mlc --bandwidth_matrix
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --bandwidth_matrix

    Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes
    Measuring Memory Bandwidths between nodes within system
    Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
    Using all the threads from each core if Hyper-threading is enabled
    Using Read-only traffic type
                    Numa node
    Numa node            0       1
           0        122097.7     51327.9
           1        51309.2      122005.5
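
The bandwidth matrix above quantifies the cost of crossing the inter-socket
link: a quick arithmetic sketch using the measured node 0 values shows local
memory bandwidth is roughly 2.4x the remote bandwidth:

```python
# Measured values from the --bandwidth_matrix output above (MB/sec).
local_bw = 122097.7   # node 0 reading from node 0
remote_bw = 51327.9   # node 0 reading from node 1 (crosses the socket interconnect)

ratio = local_bw / remote_bw
print(round(ratio, 2))  # 2.38
```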

::

    $ sudo /home/testuser/mlc --peak_injection_bandwidth
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --peak_injection_bandwidth

    Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes

    Measuring Peak Injection Memory Bandwidths for the system
    Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
    Using all the threads from each core if Hyper-threading is enabled
    Using traffic with the following read-write ratios
    ALL Reads        :      243159.4
    3:1 Reads-Writes :      219132.5
    2:1 Reads-Writes :      216603.1
    1:1 Reads-Writes :      203713.0
    Stream-triad like:      193790.8

::

    $ sudo /home/testuser/mlc --max_bandwidth
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --max_bandwidth

    Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes

    Measuring Maximum Memory Bandwidths for the system
    Will take several minutes to complete as multiple injection rates will be tried to get the best bandwidth
    Bandwidths are in MB/sec (1 MB/sec = 1,000,000 Bytes/sec)
    Using all the threads from each core if Hyper-threading is enabled
    Using traffic with the following read-write ratios
    ALL Reads        :      244114.27
    3:1 Reads-Writes :      219441.97
    2:1 Reads-Writes :      216603.72
    1:1 Reads-Writes :      203679.09
    Stream-triad like:      214902.80


Memory Latency
~~~~~~~~~~~~~~

::

    $ sudo /home/testuser/mlc --latency_matrix
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --latency_matrix

    Using buffer size of 2000.000MiB
    Measuring idle latencies (in ns)...
                    Numa node
    Numa node            0       1
           0          81.2   130.2
           1         130.2    81.1

::

    $ sudo /home/testuser/mlc --idle_latency
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --idle_latency

    Using buffer size of 2000.000MiB
    Each iteration took 186.1 core clocks ( 80.9    ns)

::

    $ sudo /home/testuser/mlc --loaded_latency
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --loaded_latency

    Using buffer size of 100.000MiB/thread for reads and an additional 100.000MiB/thread for writes

    Measuring Loaded Latencies for the system
    Using all the threads from each core if Hyper-threading is enabled
    Using Read-only traffic type
    Inject  Latency Bandwidth
    Delay   (ns)    MB/sec
    ==========================
     00000  233.86   243421.9
     00002  230.61   243544.1
     00008  232.56   243394.5
     00015  229.52   244076.6
     00050  225.82   244290.6
     00100  161.65   236744.8
     00200  100.63   133844.0
     00300   96.84    90548.2
     00400   95.71    68504.3
     00500   95.68    55139.0
     00700   88.77    39798.4
     01000   84.74    28200.1
     01300   83.08    21915.5
     01700   82.27    16969.3
     02500   81.66    11810.6
     03500   81.98     8662.9
     05000   81.48     6306.8
     09000   81.17     3857.8
     20000   80.19     2179.9
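
The loaded-latency table above shows latency rising from the idle figure
(~80.9 ns measured earlier) toward ~234 ns as the injected delay shrinks and
the memory subsystem saturates. A minimal sketch of that comparison using rows
sampled from the table:

```python
# (inject_delay, latency_ns, bandwidth_MBps) rows sampled from the table above.
rows = [
    (0, 233.86, 243421.9),
    (100, 161.65, 236744.8),
    (200, 100.63, 133844.0),
    (1000, 84.74, 28200.1),
    (20000, 80.19, 2179.9),
]

idle_ns = 80.9           # idle latency measured with --idle_latency above
loaded_ns = rows[0][1]   # latency at full load (zero injected delay)
print(round(loaded_ns / idle_ns, 1))  # 2.9, i.e. ~2.9x the idle latency
```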


L1/L2/LLC Latency
~~~~~~~~~~~~~~~~~

::

    $ sudo /home/testuser/mlc --c2c_latency
    Intel(R) Memory Latency Checker - v3.7
    Command line parameters: --c2c_latency

    Measuring cache-to-cache transfer latency (in ns)...
    Local Socket L2->L2 HIT  latency        55.5
    Local Socket L2->L2 HITM latency        55.6
    Remote Socket L2->L2 HITM latency (data address homed in writer socket)
                            Reader Numa Node
    Writer Numa Node     0       1
                0        -   115.6
                1    115.6       -
    Remote Socket L2->L2 HITM latency (data address homed in reader socket)
                            Reader Numa Node
    Writer Numa Node     0       1
                0        -   178.2
                1    178.4       -

.. include:: ../introduction/test_environment_sut_meltspec_clx.rst