.. _selinux_doc:

SELinux - VPP Custom SELinux Policy
===================================

Overview
--------

Security-enhanced Linux (SELinux) is a security feature in the Linux
kernel. At a very high level, SELinux implements mandatory access
controls (MAC), as opposed to discretionary access control (DAC)
implemented in standard Linux. MAC defines how processes can interact
with other system components (Files, Directories, Other Processes,
Pipes, Sockets, Network Ports). Each system component is assigned a
label, and then the SELinux Policy defines which labels and which
actions on each label a process is able to perform. The VPP Custom
SELinux Policy defines the actions VPP is allowed to perform on which
labels.

The VPP Custom SELinux Policy is intended to be installed on RPM based
platforms (tested on CentOS 7 and RHEL 7). Although SELinux can run on
Debian platforms, it typically is not used there, so the policy is not
currently built for Debian.

The VPP Custom SELinux Policy does not enable or disable SELinux; it only
allows VPP to run when SELinux is enabled. A fresh install of Fedora,
CentOS or RHEL will have SELinux enabled by default. To check the
current SELinux mode on a given system and switch it to ``Enforcing`` if
needed, run:

::

      $ getenforce
      Permissive

      $ sudo setenforce 1

      $ getenforce
      Enforcing

To make the change persistent, modify the following file to set
``SELINUX=enforcing``:

::

      $ sudo vi /etc/selinux/config
      :
      # This file controls the state of SELinux on the system.
      # SELINUX= can take one of these three values:
      #     enforcing - SELinux security policy is enforced.
      #     permissive - SELinux prints warnings instead of enforcing.
      #     disabled - No SELinux policy is loaded.
      SELINUX=enforcing
      :
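
The same edit can be scripted. The following is a minimal sketch, not
part of the VPP packages: ``set_selinux_mode`` is a hypothetical helper
name, and it assumes GNU ``sed`` and a config file that already contains
an uncommented ``SELINUX=`` line. Run it from a root shell, or pass a
copy of the file as the second argument to try it safely.

::

   # Hypothetical helper: rewrite the SELINUX= line in an SELinux config file.
   # Usage: set_selinux_mode MODE [FILE]   (FILE defaults to /etc/selinux/config)
   set_selinux_mode() {
       mode="$1"
       file="${2:-/etc/selinux/config}"
       case "$mode" in
           enforcing|permissive|disabled) ;;
           *) echo "invalid mode: $mode" >&2; return 1 ;;
       esac
       # Only the uncommented SELINUX= line matches; SELINUXTYPE= is untouched.
       sed -i "s/^SELINUX=.*/SELINUX=$mode/" "$file"
   }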

Installation
------------

To install VPP, see the installation instructions on the VPP Wiki
(https://wiki.fd.io/view/VPP/Installing_VPP_binaries_from_packages). The
VPP Custom SELinux Policy is packaged in its own RPM starting in 18.04,
``vpp-selinux-policy-<VERSION>-<RELEASE>.rpm``. It is packaged and
installed along with the other VPP RPMs.

Fresh Install of VPP
~~~~~~~~~~~~~~~~~~~~

If VPP has never been installed on a system, then starting in 18.04, the
VPP Custom SELinux Policy will be installed with the other RPMs and all
the system components managed by VPP will be labeled properly.

Fix SELinux Labels for VPP
~~~~~~~~~~~~~~~~~~~~~~~~~~

In the case where the VPP Custom Policy is being installed for the first
time, either because VPP has been upgraded or packages were removed and
then reinstalled, several directories and files will not be properly
labeled. The labels on these files will need to be fixed for VPP to run
properly with SELinux enabled. After the VPP Custom SELinux Policy is
installed, run the following commands to fix the labels. If VPP is
already running, make sure to restart VPP after the labels are fixed.
This change is persistent for the life of the file. Once the VPP Custom
Policy is installed on the system, subsequent files created by VPP will
be labeled properly. This is only to fix files created by VPP prior to
the VPP Custom Policy being installed.

::

     $ sudo restorecon -Rv /etc/vpp/
     $ sudo restorecon -Rv /usr/lib/vpp_api_test_plugins/
     $ sudo restorecon -Rv /usr/lib/vpp_plugins/
     $ sudo restorecon -Rv /usr/share/vpp/
     $ sudo restorecon -Rv /var/run/vpp/

     $ sudo chcon -t vpp_tmp_t /tmp/vpp_*
     $ sudo chcon -t vpp_var_run_t /var/run/.vpp_*

**NOTE:** Because the VPP APIs allow custom filenames in certain
scenarios, the above commands may not handle all files. Inspect your
system and correct any files that are mislabeled. For example, to verify
all VPP files in ``/tmp/`` are labeled properly, run:

::

     $ sudo ls -alZ /tmp/

For any file not properly labeled with ``vpp_tmp_t``, run:

::

     $ sudo chcon -t vpp_tmp_t /tmp/<filename>
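
Scanning ``ls -alZ`` output by eye gets tedious on a busy system. As a
sketch, the hypothetical filter below reads ``context path`` pairs, such
as those printed by GNU ``stat -c '%C %n'``, and prints the paths whose
SELinux type is not the expected one. It assumes paths without
whitespace.

::

   # Hypothetical helper: print paths whose SELinux type differs from TYPE.
   # Input: one "context path" pair per line on stdin.
   # Usage: stat -c '%C %n' /tmp/vpp_* | mislabeled vpp_tmp_t
   mislabeled() {
       awk -v want="$1" '{
           # A context is user:role:type:level; the type is the third field.
           split($1, ctx, ":")
           if (ctx[3] != want) print $2
       }'
   }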

VPP Files
---------

Recommended Default File Directories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Documentation in the VPP Wiki (https://wiki.fd.io/view/VPP/) and doxygen
generated documentation have examples with files located in certain
directories. Some of the recommended file locations have been moved to
satisfy SELinux. Most of the documentation has been updated, but links
to older documentation still exist and there may have been instances
that were missed. Use the file locations described below to allow
SELinux to properly label the given files.

File locations that have changed:

-  VPP Debug CLI Script Files
-  vHost Sockets
-  VPP Log Files

VPP Debug CLI Script Files
^^^^^^^^^^^^^^^^^^^^^^^^^^

The VPP Debug CLI, ``vppctl``, allows a sequence of CLI commands to be
read from a file and executed. To avoid having to grant VPP access
to all of ``/tmp/`` and possibly ``/home/`` sub-directories, it is
recommended that any VPP Debug CLI script files be placed in a common
directory such as ``/usr/share/vpp/``.

For example:

::

   $ cat /usr/share/vpp/scripts/gigup.txt
   set interface state GigabitEthernet0/8/0 up
   set interface state GigabitEthernet0/9/0 up

To execute:

::

   $ vppctl exec /usr/share/vpp/scripts/gigup.txt

Or

::

   $ vppctl
       _______    _        _   _____  ___
    __/ __/ _ \  (_)__    | | / / _ \/ _ \
    _/ _// // / / / _ \   | |/ / ___/ ___/
    /_/ /____(_)_/\___/   |___/_/  /_/

   vpp# exec /usr/share/vpp/scripts/gigup.txt
   vpp# quit

If the file is not labeled properly, you will see something similar to:

::

   $ vppctl exec /home/<user>/dev/vpp/scripts/vppctl/gigup.txt
   exec: failed to open `/home/<user>/dev/vpp/scripts/vppctl/gigup.txt': Permission denied

   $ ls -alZ
   drwxrwxr-x. <user> <user> unconfined_u:object_r:user_home_t:s0 .
   drwxrwxr-x. <user> <user> unconfined_u:object_r:user_home_t:s0 ..
   -rw-r--r--. <user> <user> unconfined_u:object_r:user_home_t:s0 gigup.txt

Original Documentation
''''''''''''''''''''''

Some of the original documentation showed script files being executed
out of ``/tmp/``. Convenience also may lead to script files being placed
in ``/home/<user>/`` subdirectories. If a file is generated by the VPP
process in ``/tmp/``, for example a trace file or pcap file, it will get
properly labeled with the SELinux label ``vpp_tmp_t``. When a file is
created, unless a rule is in place for the process that created it, the
file will inherit the SELinux label of the parent directory. So if a
user creates a file themselves in ``/tmp/``, it will get the SELinux
label ``tmp_t``, which VPP does not have permission to access. Therefore
it is recommended that script files are located as described above.
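
One way to move an existing script into place is to copy it rather than
move it: a copy is created as a new file and so picks up the label of
the destination directory, whereas ``mv`` typically keeps the label the
file already had. The sketch below assumes GNU ``install``;
``install_vpp_script`` is a hypothetical helper, not something VPP
ships.

::

   # Hypothetical helper: copy a CLI script into a directory labeled for VPP.
   # Usage (as root): install_vpp_script FILE [DEST_DIR]
   install_vpp_script() {
       src="$1"
       dest="${2:-/usr/share/vpp/scripts}/$(basename "$src")"
       # -D creates missing parent directories; the copy is a new file.
       install -D -m 0644 "$src" "$dest" || return 1
       # Where SELinux is active, reset the label to the policy default.
       if command -v selinuxenabled >/dev/null 2>&1 && selinuxenabled; then
           restorecon -v "$dest"
       fi
   }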

vHost Sockets
^^^^^^^^^^^^^

From VPP's perspective, vHost sockets are created in either Server or
Client mode. In Server mode, the socket name is provided to VPP and VPP
creates the socket. In Client mode, the socket name is provided to VPP
and the hypervisor creates the socket. In order for VPP and the
hypervisor to share the socket resource with SELinux enabled, a rule has
been added to the VPP Custom SELinux Policy. This rule allows processes
with the ``svirt_t`` label (the hypervisor) to access sockets with the
``vpp_var_run_t`` label. As such, when SELinux is enabled, vHost sockets
should be created in the directory ``/var/run/vpp/``.

.. _original-documentation-1:

Original Documentation
''''''''''''''''''''''

Some of the original documentation showed vHost sockets being created in
the directory ``/tmp/``. To work properly with SELinux enabled, vHost
sockets should be created as described above.

VPP Log Files
^^^^^^^^^^^^^

The VPP log file location is set by updating the
``/etc/vpp/startup.conf`` file:

::

   vi /etc/vpp/startup.conf
   unix {
   :
     log /var/log/vpp/vpp.log
   :
   }

By moving the log file to ``/var/log/vpp/``, it will get the label
``vpp_log_t``, which indicates that the files are log files so they
benefit from the associated rules (for example granting rights to
logrotate so that it can manipulate them).

.. _original-documentation-2:

Original Documentation
''''''''''''''''''''''

The default ``startup.conf`` file creates the VPP log file in
``/tmp/vpp.log``. If the log file is left in ``/tmp/``, it gets the
label ``vpp_tmp_t``. If it is moved to ``/var/log/vpp/``, it gets the
label ``vpp_log_t``.

Use of Non-default File Directories
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

VPP installs multiple files on the system. Some files have fixed
directory and file names:

-  /etc/bash_completion.d/vppctl_completion
-  /etc/sysctl.d/80-vpp.conf
-  /usr/lib/systemd/system/vpp.service

Other files have default directory and file names, but the defaults can
be overridden:

-  /etc/vpp/startup.conf

   -  Can be changed via the ``/usr/lib/systemd/system/vpp.service`` file
      by changing the -c option on the VPP command line:

::

   ExecStart=/usr/bin/vpp -c /etc/vpp/startup.conf

-  /run/vpp/cli.sock

   -  Can be changed via the ``/etc/vpp/startup.conf`` file by changing
      the cli-listen setting:

::

   unix {
   :
     cli-listen /run/vpp/cli.sock
   :
   }

-  /var/log/vpp/vpp.log

   -  Can be changed via the ``/etc/vpp/startup.conf`` file by changing
      the log setting:

::

   unix {
     :
     log /var/log/vpp/vpp.log
     :
   }

If the directory of any VPP installed files is changed from the default,
ensure that the proper SELinux label is applied. The SELinux label can
be determined by passing the -Z option to many common Linux commands:

::

   ls -alZ /run/vpp/
   drwxr-xr-x. root vpp  system_u:object_r:vpp_var_run_t:s0 .
   drwxr-xr-x. root root system_u:object_r:var_run_t:s0     ..
   srwxrwxr-x. root vpp  system_u:object_r:vpp_var_run_t:s0 cli.sock

VPP SELinux Types
~~~~~~~~~~~~~~~~~

The following SELinux types are created by the VPP Custom SELinux
Policy:

-  ``vpp_t`` - Applied to:

   -  VPP process and spawned threads.

-  ``vpp_config_rw_t`` - Applied to:

   -  ``/etc/vpp/*``

-  ``vpp_tmp_t`` - Applied to:

   -  ``/tmp/*``

-  ``vpp_exec_t`` - Applied to:

   -  ``/usr/bin/*``

-  ``vpp_lib_t`` - Applied to:

   -  ``/usr/lib/vpp_api_test_plugins/*``
   -  ``/usr/lib/vpp_plugins/*``

-  ``vpp_unit_file_t`` - Applied to:

   -  ``/usr/lib/systemd/system/vpp.*``

-  ``vpp_log_t`` - Applied to:

   -  ``/var/log/vpp/*``

-  ``vpp_var_run_t`` - Applied to:

   -  ``/var/run/vpp/*``

Debug SELinux Issues
--------------------

If SELinux issues are suspected, there are a few steps that can be taken
to debug the issue. This section provides a few pointers on those
steps. Any SELinux JIRAs will need this information to properly address
the issue.

Additional SELinux Packages and Setup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First, install the SELinux troubleshooting packages:

::

   $ sudo yum -y install setroubleshoot setroubleshoot-server setools-console
   -- OR --
   $ sudo dnf -y install setroubleshoot setroubleshoot-server setools-console

To enable proper logging, restart auditd:

::

   $ sudo service auditd restart

While debugging issues, it is best to set SELinux to ``Permissive``
mode. In ``Permissive`` mode, SELinux will still detect and flag errors,
but will allow processes to continue normal operation. This allows
multiple errors to be collected at once as opposed to breaking on each
individual error. To set SELinux to ``Permissive`` mode (until next
reboot or it is set back), use:

::

   $ sudo setenforce 0

   $ getenforce
   Permissive

After debugging, to set SELinux back to ``Enforcing`` mode, use:

::

   $ sudo setenforce 1

   $ getenforce
   Enforcing
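
It is easy to forget the second step and leave a system in
``Permissive`` mode. As a sketch, the hypothetical wrapper below
switches to ``Permissive``, runs one command, and then restores
``Enforcing`` only if that was the mode beforehand. It uses nothing
beyond ``getenforce`` and ``setenforce`` and must run from a root shell.

::

   # Hypothetical helper: run a command with SELinux temporarily Permissive.
   # Usage (as root): with_permissive vppctl exec /usr/share/vpp/scripts/gigup.txt
   with_permissive() {
       prev=$(getenforce) || return 1
       setenforce 0 || return 1
       status=0
       "$@" || status=$?
       # Restore Enforcing only if that was the mode on entry.
       if [ "$prev" = "Enforcing" ]; then
           setenforce 1
       fi
       return "$status"
   }

The wrapper returns the exit status of the wrapped command, so it can be
used in scripts that check for failure.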

Debugging
~~~~~~~~~

Once the SELinux troubleshooting packages are installed, perform the
actions that are suspected to be blocked by SELinux. Either ``tail`` the
log during these actions or ``grep`` the log afterwards for SELinux
messages:

::

   sudo tail -f /var/log/messages
   -- OR --
   sudo journalctl -f

Below are some examples of SELinux logs that are generated:

::

   May 14 11:28:34 svr-22 setroubleshoot: SELinux is preventing /usr/bin/vpp from read access on the file hostCreate.txt. For complete SELinux messages run: sealert -l a418f869-f470-4c8a-b8e9-bdd41f2dd60b
   May 14 11:28:34 svr-22 python: SELinux is preventing /usr/bin/vpp from read access on the file hostCreate.txt.#012#012*****  Plugin catchall (100. confidence) suggests   **************************#012#012If you believe that vpp should be allowed read access on the hostCreate.txt file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'vpp_main' --raw | audit2allow -M my-vppmain#012# semodule -i my-vppmain.pp#012
   May 14 11:28:34 svr-22 setroubleshoot: SELinux is preventing /usr/bin/vpp from read access on the file hostCreate.txt. For complete SELinux messages run: sealert -l a418f869-f470-4c8a-b8e9-bdd41f2dd60b
   May 14 11:28:34 svr-22 python: SELinux is preventing /usr/bin/vpp from read access on the file hostCreate.txt.#012#012*****  Plugin catchall (100. confidence) suggests   **************************#012#012If you believe that vpp should be allowed read access on the hostCreate.txt file by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'vpp_main' --raw | audit2allow -M my-vppmain#012# semodule -i my-vppmain.pp#012
   May 14 11:28:37 svr-22 setroubleshoot: SELinux is preventing vpp_main from map access on the packet_socket packet_socket. For complete SELinux messages run: sealert -l ab6667d9-3f14-4dbd-96a0-7a655f7b4eb1
   May 14 11:28:37 svr-22 python: SELinux is preventing vpp_main from map access on the packet_socket packet_socket.#012#012*****  Plugin catchall (100. confidence) suggests   **************************#012#012If you believe that vpp_main should be allowed map access on the packet_socket packet_socket by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'vpp_main' --raw | audit2allow -M my-vppmain#012# semodule -i my-vppmain.pp#012
   May 14 11:28:51 svr-22 setroubleshoot: SELinux is preventing vpp_main from map access on the packet_socket packet_socket. For complete SELinux messages run: sealert -l ab6667d9-3f14-4dbd-96a0-7a655f7b4eb1
   May 14 11:28:51 svr-22 python: SELinux is preventing vpp_main from map access on the packet_socket packet_socket.#012#012*****  Plugin catchall (100. confidence) suggests   **************************#012#012If you believe that vpp_main should be allowed map access on the packet_socket packet_socket by default.#012Then you should report this as a bug.#012You can generate a local policy module to allow this access.#012Do#012allow this access for now by executing:#012# ausearch -c 'vpp_main' --raw | audit2allow -M my-vppmain#012# semodule -i my-vppmain.pp#012
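
Each denial is usually logged several times, as in the sample above. A
small filter can reduce the log to the distinct ``sealert`` IDs. This is
a sketch; ``sealert_ids`` is a hypothetical helper that assumes the
``sealert -l <id>`` wording shown in these messages.

::

   # Hypothetical helper: list the distinct sealert IDs found on stdin.
   # Usage: sudo cat /var/log/messages | sealert_ids
   sealert_ids() {
       grep -o 'sealert -l [0-9a-f-]*' | awk '{ print $3 }' | sort -u
   }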

From the logs above, there are two sets of commands that are recommended
to be run. The first is to run the ``sealert`` command. The second is to
run the ``ausearch | audit2allow`` commands and the ``semodule``
command.

sealert Command
^^^^^^^^^^^^^^^

This ``sealert`` command provides a more detailed output for the given
issue detected.

::

   $ sealert -l a418f869-f470-4c8a-b8e9-bdd41f2dd60b
   SELinux is preventing /usr/bin/vpp from 'read, write' accesses on the chr_file noiommu-0.

   *****  Plugin device (91.4 confidence) suggests   ****************************

   If you want to allow vpp to have read write access on the noiommu-0 chr_file
   Then you need to change the label on noiommu-0 to a type of a similar device.
   Do
   # semanage fcontext -a -t SIMILAR_TYPE 'noiommu-0'
   # restorecon -v 'noiommu-0'

   *****  Plugin catchall (9.59 confidence) suggests   **************************

   If you believe that vpp should be allowed read write access on the noiommu-0 chr_file by default.
   Then you should report this as a bug.
   You can generate a local policy module to allow this access.
   Do
   allow this access for now by executing:
   # ausearch -c 'vpp' --raw | audit2allow -M my-vpp
   # semodule -i my-vpp.pp


   Additional Information:
   Source Context                system_u:system_r:vpp_t:s0
   Target Context                system_u:object_r:device_t:s0
   Target Objects                noiommu-0 [ chr_file ]
   Source                        vpp
   Source Path                   /usr/bin/vpp
   Port                          <Unknown>
   Host                          vpp_centos7_selinux
   Source RPM Packages           vpp-19.01.2-rc0~17_gcfd3086.x86_64
   Target RPM Packages
   Policy RPM                    selinux-policy-3.13.1-229.el7_6.12.noarch
   Selinux Enabled               True
   Policy Type                   targeted
   Enforcing Mode                Permissive
   Host Name                     vpp_centos7_selinux
   Platform                      Linux vpp_centos7_selinux
                                 3.10.0-957.12.1.el7.x86_64 #1 SMP Mon Apr 29
                                 14:59:59 UTC 2019 x86_64 x86_64
   Alert Count                   1
   First Seen                    2019-05-13 18:10:50 EDT
   Last Seen                     2019-05-13 18:10:50 EDT
   Local ID                      a418f869-f470-4c8a-b8e9-bdd41f2dd60b

   Raw Audit Messages
   type=AVC msg=audit(1557785450.964:257): avc:  denied  { read write } for  pid=5273 comm="vpp" name="noiommu-0" dev="devtmpfs" ino=36022 scontext=system_u:system_r:vpp_t:s0 tcontext=system_u:object_r:device_t:s0 tclass=chr_file permissive=1


   type=AVC msg=audit(1557785450.964:257): avc:  denied  { open } for  pid=5273 comm="vpp" path="/dev/vfio/noiommu-0" dev="devtmpfs" ino=36022 scontext=system_u:system_r:vpp_t:s0 tcontext=system_u:object_r:device_t:s0 tclass=chr_file permissive=1


   type=SYSCALL msg=audit(1557785450.964:257): arch=x86_64 syscall=open success=yes exit=ENOTBLK a0=7fb395ffd7f0 a1=2 a2=7fb395ffd803 a3=7fb395ffe2a0 items=0 ppid=1 pid=5273 auid=4294967295 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=993 sgid=0 fsgid=993 tty=(none) ses=4294967295 comm=vpp exe=/usr/bin/vpp subj=system_u:system_r:vpp_t:s0 key=(null)

   Hash: vpp,vpp_t,device_t,chr_file,read,write

In general, this command produces more detail than basic debugging
needs and is mainly useful for tougher issues. Also note that once the
process being tested is restarted, this command loses its context and
will not provide any information:

::

   $ sealert -l a418f869-f470-4c8a-b8e9-bdd41f2dd60b
   Error
   query_alerts error (1003): id (a418f869-f470-4c8a-b8e9-bdd41f2dd60b) not found

ausearch \| audit2allow and semodule Commands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This set of commands is more useful for basic debugging. The
``ausearch | audit2allow`` commands generate a set of files. It may be
worthwhile to run the commands in a temporary subdirectory:

::

   $ mkdir test-01/; cd test-01/

   $ sudo ausearch -c 'vpp_main' --raw | audit2allow -M my-vpp

   $ ls
   my-vpp.pp  my-vpp.te

   $ cat my-vpp.te
   module my-vpp 1.0;

   require {
           type user_home_t;
           type vpp_t;
           class packet_socket map;
           class file { open read };
   }

   #============= vpp_t ==============
   allow vpp_t self:packet_socket map;
   allow vpp_t user_home_t:file { open read };

As shown above, the file ``my-vpp.te`` has been generated. This file
shows possible changes to the SELinux policy that may fix the issue. If
an SELinux policy were being created from scratch, this policy could be
applied using the ``semodule -i my-vpp.pp`` command. However, VPP
already has a policy in place, so these changes need to be incorporated
into the existing policy. The VPP SELinux policy is located in the
following files:

::

   $ ls extras/selinux/
   selinux_doc.md  vpp-custom.fc  vpp-custom.if  vpp-custom.te

In this example, ``map`` needs to be added to the ``packet_socket``
class. If the ``vpp-custom.te`` is examined (prior to this fix), then
one would see that the ``packet_socket`` class is already defined and
just needs to be updated:

::

   $ vi extras/selinux/vpp-custom.te
   :
   allow vpp_t self:process { execmem execstack setsched signal }; # too benevolent
   allow vpp_t self:packet_socket { bind create setopt ioctl };  <---
   allow vpp_t self:tun_socket { create relabelto relabelfrom };
   :

Before blindly applying the changes proposed by the
``ausearch | audit2allow`` commands, try to determine what is being
allowed by the policy and determine if this is desired, or if the code
can be reworked to no longer require the suggested permission. In the
``my-vpp.te`` file from above, it is suggested to allow ``vpp_t``
(i.e. the VPP process) access to all files in the home directory
(``allow vpp_t user_home_t:file { open read };``). This was because a
``vppctl exec`` command was executed calling a script located in the
``/home/<user>/`` directory. Once this script was run from the
``/usr/share/vpp/`` directory as described in a section above, these
permissions were no longer needed.
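
That review can be partly mechanized. The sketch below compares each
``allow`` rule in a generated ``.te`` file against an existing policy
file and reports whether a rule for the same source, target and class
already exists (and so only needs extending) or would be entirely new.
``te_review`` is a hypothetical helper; it does a plain line-by-line
text match and does not understand macros, attributes or rules split
across lines.

::

   # Hypothetical helper: classify generated allow rules against a policy.
   # Usage: te_review my-vpp.te extras/selinux/vpp-custom.te
   te_review() {
       generated="$1"
       policy="$2"
       grep '^allow ' "$generated" | while read -r kw src tgt_class rest; do
           # Look for an existing rule with the same "source target:class".
           if grep -q "^allow $src $tgt_class " "$policy"; then
               echo "extend: allow $src $tgt_class"
           else
               echo "new: allow $src $tgt_class"
           fi
       done
   }

Rules reported as ``new`` deserve the closest look, since they grant
access to a label the policy did not previously touch, such as
``user_home_t`` in the example above.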
class="o">= { .name = "LISP_GPE", .format_device_name = format_lisp_gpe_name, .format_tx_trace = format_lisp_gpe_tx_trace, .tx_function = lisp_gpe_interface_tx, }; /* *INDENT-ON* */ u8 * format_lisp_gpe_header_with_length (u8 * s, va_list * args) { lisp_gpe_header_t *h = va_arg (*args, lisp_gpe_header_t *); u32 max_header_bytes = va_arg (*args, u32); u32 header_bytes; header_bytes = sizeof (h[0]); if (max_header_bytes != 0 && header_bytes > max_header_bytes) return format (s, "lisp-gpe header truncated"); s = format (s, "flags: "); #define _(n,v) if (h->flags & v) s = format (s, "%s ", #n); foreach_lisp_gpe_flag_bit; #undef _ s = format (s, "\n ver_res %d res %d next_protocol %d iid %d(%x)", h->ver_res, h->res, h->next_protocol, clib_net_to_host_u32 (h->iid << 8), clib_net_to_host_u32 (h->iid << 8)); return s; } /* *INDENT-OFF* */ VNET_HW_INTERFACE_CLASS (lisp_gpe_hw_class) = { .name = "LISP_GPE", .format_header = format_lisp_gpe_header_with_length, .build_rewrite = lisp_gpe_build_rewrite, .update_adjacency = lisp_gpe_update_adjacency, }; /* *INDENT-ON* */ typedef struct { u32 dpo_index; } l2_lisp_gpe_tx_trace_t; static u8 * format_l2_lisp_gpe_tx_trace (u8 * s, va_list * args) { CLIB_UNUSED (vlib_main_t * vm) = va_arg (*args, vlib_main_t *); CLIB_UNUSED (vlib_node_t * node) = va_arg (*args, vlib_node_t *); l2_lisp_gpe_tx_trace_t *t = va_arg (*args, l2_lisp_gpe_tx_trace_t *); s = format (s, "L2-LISP-GPE-TX: load-balance %d", t->dpo_index); return s; } /** * @brief LISP-GPE interface TX (encap) function for L2 overlays. * @node l2_lisp_gpe_interface_tx * * The L2 LISP-GPE interface TX (encap) function. * * Uses bridge domain index, source and destination ethernet addresses to * lookup tunnel. If the tunnel is multihomed a flow has is used to determine * the sub-tunnel and therefore the rewrite string to be used to encapsulate * the packets. * * @param[in] vm vlib_main_t corresponding to the current thread. * @param[in] node vlib_node_runtime_t data for this node. 
* @param[in] frame vlib_frame_t whose contents should be dispatched. * * @return number of vectors in frame. */ static uword l2_lisp_gpe_interface_tx (vlib_main_t * vm, vlib_node_runtime_t * node, vlib_frame_t * from_frame) { u32 n_left_from, next_index, *from, *to_next; lisp_gpe_main_t *lgm = &lisp_gpe_main; from = vlib_frame_vector_args (from_frame); n_left_from = from_frame->n_vectors; next_index = node->cached_next_index; while (n_left_from > 0) { u32 n_left_to_next; vlib_get_next_frame (vm, node, next_index, to_next, n_left_to_next); while (n_left_from > 0 && n_left_to_next > 0) { vlib_buffer_t *b0; u32 bi0, lbi0; ethernet_header_t *e0; bi0 = from[0]; to_next[0] = bi0; from += 1; to_next += 1; n_left_from -= 1; n_left_to_next -= 1; b0 = vlib_get_buffer (vm, bi0); e0 = vlib_buffer_get_current (b0); vnet_buffer (b0)->lisp.overlay_afi = LISP_AFI_MAC; /* lookup dst + src mac */ lbi0 = lisp_l2_fib_lookup (lgm, vnet_buffer (b0)->l2.bd_index, e0->src_address, e0->dst_address); vnet_buffer (b0)->ip.adj_index[VLIB_TX] = lbi0; if (PREDICT_FALSE (b0->flags & VLIB_BUFFER_IS_TRACED)) { l2_lisp_gpe_tx_trace_t *tr = vlib_add_trace (vm, node, b0, sizeof (*tr)); tr->dpo_index = lbi0; } vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next, n_left_to_next, bi0, l2_arc_to_lb); } vlib_put_next_frame (vm, node, next_index, n_left_to_next); } return from_frame->n_vectors; } static u8 * format_l2_lisp_gpe_name (u8 * s, va_list * args) { u32 dev_instance = va_arg (*args, u32); return format (s, "l2_lisp_gpe%d", dev_instance); } /* *INDENT-OFF* */ VNET_DEVICE_CLASS (l2_lisp_gpe_device_class,static) = { .name = "L2_LISP_GPE", .format_device_name = format_l2_lisp_gpe_name, .format_tx_trace = format_l2_lisp_gpe_tx_trace, .tx_function = l2_lisp_gpe_interface_tx, }; /* *INDENT-ON* */ typedef struct { u32 dpo_index; } nsh_lisp_gpe_tx_trace_t; u8 * format_nsh_lisp_gpe_tx_trace (u8 * s, va_list * args) { CLIB_UNUSED (vlib_main_t * vm) = va_arg (*args, vlib_main_t *); CLIB_UNUSED 
(vlib_node_t * node) = va_arg (*args, vlib_node_t *); nsh_lisp_gpe_tx_trace_t *t = va_arg (*args, nsh_lisp_gpe_tx_trace_t *); s = format (s, "NSH-GPE-TX: tunnel %d", t->dpo_index); return s; } /** * @brief LISP-GPE interface TX for NSH overlays. * @node nsh_lisp_gpe_interface_tx * * The NSH LISP-GPE interface TX function. * * @param[in] vm vlib_main_t corresponding to the current thread. * @param[in] node vlib_node_runtime_t data for this node. * @param[in] frame vlib_frame_t whose contents should be dispatched. * * @return number of vectors in frame. */ static uword nsh_lisp_gpe_interface_tx (vlib_main_t * vm, vlib_node_runtime_t * node, vlib_frame_t * from_frame) { u32 n_left_from, next_index, *from, *to_next; lisp_gpe_main_t *lgm = &lisp_gpe_main; from = vlib_frame_vector_args (from_frame); n_left_from = from_frame->n_vectors; next_index = node->cached_next_index; while (n_left_from > 0) { u32 n_left_to_next; vlib_get_next_frame (vm, node, next_index, to_next, n_left_to_next); while (n_left_from > 0 && n_left_to_next > 0) { vlib_buffer_t *b0; u32 bi0; u32 *nsh0, next0; const dpo_id_t *dpo0; bi0 = from[0]; to_next[0] = bi0; from += 1; to_next += 1; n_left_from -= 1; n_left_to_next -= 1; b0 = vlib_get_buffer (vm, bi0); nsh0 = vlib_buffer_get_current (b0); vnet_buffer (b0)->lisp.overlay_afi = LISP_AFI_LCAF; /* lookup SPI + SI (second word of the NSH header). 
* NB: Load balancing was done by the control plane */ dpo0 = lisp_nsh_fib_lookup (lgm, nsh0[1]); next0 = dpo0->dpoi_next_node; vnet_buffer (b0)->ip.adj_index[VLIB_TX] = dpo0->dpoi_index; if (PREDICT_FALSE (b0->flags & VLIB_BUFFER_IS_TRACED)) { nsh_lisp_gpe_tx_trace_t *tr = vlib_add_trace (vm, node, b0, sizeof (*tr)); tr->dpo_index = dpo0->dpoi_index; } vlib_validate_buffer_enqueue_x1 (vm, node, next_index, to_next, n_left_to_next, bi0, next0); } vlib_put_next_frame (vm, node, next_index, n_left_to_next); } return from_frame->n_vectors; } static u8 * format_nsh_lisp_gpe_name (u8 * s, va_list * args) { u32 dev_instance = va_arg (*args, u32); return format (s, "nsh_lisp_gpe%d", dev_instance); } /* *INDENT-OFF* */ VNET_DEVICE_CLASS (nsh_lisp_gpe_device_class,static) = { .name = "NSH_LISP_GPE", .format_device_name = format_nsh_lisp_gpe_name, .format_tx_trace = format_nsh_lisp_gpe_tx_trace, .tx_function = nsh_lisp_gpe_interface_tx, }; /* *INDENT-ON* */ static vnet_hw_interface_t * lisp_gpe_create_iface (lisp_gpe_main_t * lgm, u32 vni, u32 dp_table, vnet_device_class_t * dev_class, tunnel_lookup_t * tuns) { u32 flen; u32 hw_if_index = ~0; u8 *new_name; vnet_hw_interface_t *hi; vnet_main_t *vnm = lgm->vnet_main; /* create hw lisp_gpeX iface if needed, otherwise reuse existing */ flen = vec_len (lgm->free_tunnel_hw_if_indices); if (flen > 0) { hw_if_index = lgm->free_tunnel_hw_if_indices[flen - 1]; _vec_len (lgm->free_tunnel_hw_if_indices) -= 1; hi = vnet_get_hw_interface (vnm, hw_if_index); /* rename interface */ new_name = format (0, "%U", dev_class->format_device_name, vni); vec_add1 (new_name, 0); vnet_rename_interface (vnm, hw_if_index, (char *) new_name); vec_free (new_name); /* clear old stats of freed interface before reuse */ vnet_interface_main_t *im = &vnm->interface_main; vnet_interface_counter_lock (im); vlib_zero_combined_counter (&im->combined_sw_if_counters [VNET_INTERFACE_COUNTER_TX], hi->sw_if_index); vlib_zero_combined_counter 
(&im->combined_sw_if_counters [VNET_INTERFACE_COUNTER_RX], hi->sw_if_index); vlib_zero_simple_counter (&im->sw_if_counters [VNET_INTERFACE_COUNTER_DROP], hi->sw_if_index); vnet_interface_counter_unlock (im); } else { hw_if_index = vnet_register_interface (vnm, dev_class->index, vni, lisp_gpe_hw_class.index, 0); hi = vnet_get_hw_interface (vnm, hw_if_index); } hash_set (tuns->hw_if_index_by_dp_table, dp_table, hw_if_index); /* set tunnel termination: post decap, packets are tagged as having been * originated by lisp-gpe interface */ hash_set (tuns->sw_if_index_by_vni, vni, hi->sw_if_index); hash_set (tuns->vni_by_sw_if_index, hi->sw_if_index, vni); return hi; } static void lisp_gpe_remove_iface (lisp_gpe_main_t * lgm, u32 hi_index, u32 dp_table, tunnel_lookup_t * tuns) { vnet_main_t *vnm = lgm->vnet_main; vnet_hw_interface_t *hi; uword *vnip; hi = vnet_get_hw_interface (vnm, hi_index); /* disable interface */ vnet_sw_interface_set_flags (vnm, hi->sw_if_index, 0 /* down */ ); vnet_hw_interface_set_flags (vnm, hi->hw_if_index, 0 /* down */ ); hash_unset (tuns->hw_if_index_by_dp_table, dp_table); vec_add1 (lgm->free_tunnel_hw_if_indices, hi->hw_if_index); /* clean tunnel termination and vni to sw_if_index binding */ vnip = hash_get (tuns->vni_by_sw_if_index, hi->sw_if_index); if (0 == vnip) { clib_warning ("No vni associated to interface %d", hi->sw_if_index); return; } hash_unset (tuns->sw_if_index_by_vni, vnip[0]); hash_unset (tuns->vni_by_sw_if_index, hi->sw_if_index); } static void lisp_gpe_iface_set_table (u32 sw_if_index, u32 table_id) { fib_node_index_t fib_index; fib_index = fib_table_find_or_create_and_lock (FIB_PROTOCOL_IP4, table_id); vec_validate (ip4_main.fib_index_by_sw_if_index, sw_if_index); ip4_main.fib_index_by_sw_if_index[sw_if_index] = fib_index; ip4_sw_interface_enable_disable (sw_if_index, 1); fib_index = fib_table_find_or_create_and_lock (FIB_PROTOCOL_IP6, table_id); vec_validate (ip6_main.fib_index_by_sw_if_index, sw_if_index); 
  ip6_main.fib_index_by_sw_if_index[sw_if_index] = fib_index;
  ip6_sw_interface_enable_disable (sw_if_index, 1);
}

static void
lisp_gpe_tenant_del_default_routes (u32 table_id)
{
  fib_protocol_t proto;

  FOR_EACH_FIB_IP_PROTOCOL (proto)
  {
    fib_prefix_t prefix = {
      .fp_proto = proto,
    };
    u32 fib_index;

    fib_index = fib_table_find (prefix.fp_proto, table_id);
    fib_table_entry_special_remove (fib_index, &prefix, FIB_SOURCE_LISP);
    fib_table_unlock (fib_index, prefix.fp_proto);
  }
}

static void
lisp_gpe_tenant_add_default_routes (u32 table_id)
{
  fib_protocol_t proto;

  FOR_EACH_FIB_IP_PROTOCOL (proto)
  {
    fib_prefix_t prefix = {
      .fp_proto = proto,
    };
    u32 fib_index;

    /*
     * Add a default route that results in a control plane punt DPO
     */
    fib_index = fib_table_find_or_create_and_lock (prefix.fp_proto, table_id);
    fib_table_entry_special_dpo_add (fib_index, &prefix, FIB_SOURCE_LISP,
                                     FIB_ENTRY_FLAG_EXCLUSIVE,
                                     lisp_cp_dpo_get (fib_proto_to_dpo
                                                      (proto)));
  }
}

/**
 * @brief Add LISP-GPE L3 interface.
 *
 * Creates LISP-GPE interface, sets ingress arcs from lisp_gpeX_lookup,
 * installs default routes that attract all traffic with no more specific
 * routes to lgpe-ipx-lookup, sets egress arcs to ipx-lookup, sets
 * the interface in the right vrf and enables it.
 *
 * @param[in]   lgm             Reference to @ref lisp_gpe_main_t.
 * @param[in]   vni             Virtual network identifier.
 * @param[in]   table_id        ID of the VRF table to bind the interface to.
 *
 * @return sw_if_index of the interface, or ~0 on failure.
 */
u32
lisp_gpe_add_l3_iface (lisp_gpe_main_t * lgm, u32 vni, u32 table_id)
{
  vnet_main_t *vnm = lgm->vnet_main;
  tunnel_lookup_t *l3_ifaces = &lgm->l3_ifaces;
  vnet_hw_interface_t *hi;
  uword *hip, *si;

  hip = hash_get (l3_ifaces->hw_if_index_by_dp_table, table_id);

  if (hip)
    {
      clib_warning ("vrf %d already mapped to a vni", table_id);
      return ~0;
    }

  si = hash_get (l3_ifaces->sw_if_index_by_vni, vni);

  if (si)
    {
      clib_warning ("Interface for vni %d already exists", vni);
    }

  /* create lisp iface and populate tunnel tables */
  hi = lisp_gpe_create_iface (lgm, vni, table_id,
                              &lisp_gpe_device_class, l3_ifaces);

  /* insert default routes that point to lisp-cp lookup */
  lisp_gpe_iface_set_table (hi->sw_if_index, table_id);
  lisp_gpe_tenant_add_default_routes (table_id);

  /* enable interface */
  vnet_sw_interface_set_flags (vnm, hi->sw_if_index,
                               VNET_SW_INTERFACE_FLAG_ADMIN_UP);
  vnet_hw_interface_set_flags (vnm, hi->hw_if_index,
                               VNET_HW_INTERFACE_FLAG_LINK_UP);

  return (hi->sw_if_index);
}

void
lisp_gpe_del_l3_iface (lisp_gpe_main_t * lgm, u32 vni, u32 table_id)
{
  vnet_main_t *vnm = lgm->vnet_main;
  tunnel_lookup_t *l3_ifaces = &lgm->l3_ifaces;
  vnet_hw_interface_t *hi;
  uword *hip;

  hip = hash_get (l3_ifaces->hw_if_index_by_dp_table, table_id);

  if (hip == 0)
    {
      clib_warning ("The interface for vrf %d doesn't exist", table_id);
      return;
    }

  hi = vnet_get_hw_interface (vnm, hip[0]);

  lisp_gpe_remove_iface (lgm, hip[0], table_id, &lgm->l3_ifaces);

  /* unset default routes */
  ip4_sw_interface_enable_disable (hi->sw_if_index, 0);
  ip6_sw_interface_enable_disable (hi->sw_if_index, 0);
  lisp_gpe_tenant_del_default_routes (table_id);
}

/**
 * @brief Add LISP-GPE L2 interface.
 *
 * Creates LISP-GPE interface, sets it in L2 mode in the appropriate
 * bridge domain, sets egress arcs and enables it.
 *
 * @param[in]   lgm     Reference to @ref lisp_gpe_main_t.
 * @param[in]   vni     Virtual network identifier.
 * @param[in]   bd_id   Bridge domain ID.
 *
 * @return sw_if_index of the interface, or ~0 on failure.
 */
u32
lisp_gpe_add_l2_iface (lisp_gpe_main_t * lgm, u32 vni, u32 bd_id)
{
  vnet_main_t *vnm = lgm->vnet_main;
  tunnel_lookup_t *l2_ifaces = &lgm->l2_ifaces;
  vnet_hw_interface_t *hi;
  uword *hip, *si;
  u16 bd_index;

  if (bd_id > L2_BD_ID_MAX)
    {
      clib_warning ("bridge domain ID %d exceeds 16M limit", bd_id);
      return ~0;
    }

  bd_index = bd_find_or_add_bd_index (&bd_main, bd_id);
  hip = hash_get (l2_ifaces->hw_if_index_by_dp_table, bd_index);

  if (hip)
    {
      clib_warning ("bridge domain %d already mapped to a vni", bd_id);
      return ~0;
    }

  si = hash_get (l2_ifaces->sw_if_index_by_vni, vni);
  if (si)
    {
      clib_warning ("Interface for vni %d already exists", vni);
      return ~0;
    }

  /* create lisp iface and populate tunnel tables */
  hi = lisp_gpe_create_iface (lgm, vni, bd_index,
                              &l2_lisp_gpe_device_class, &lgm->l2_ifaces);

  /* enable interface */
  vnet_sw_interface_set_flags (vnm, hi->sw_if_index,
                               VNET_SW_INTERFACE_FLAG_ADMIN_UP);
  vnet_hw_interface_set_flags (vnm, hi->hw_if_index,
                               VNET_HW_INTERFACE_FLAG_LINK_UP);

  l2_arc_to_lb = vlib_node_add_named_next (vlib_get_main (),
                                           hi->tx_node_index,
                                           "l2-load-balance");

  /* we're ready. add iface to l2 bridge domain */
  set_int_l2_mode (lgm->vlib_main, vnm, MODE_L2_BRIDGE, hi->sw_if_index,
                   bd_index, 0, 0, 0);

  return (hi->sw_if_index);
}

/**
 * @brief Del LISP-GPE L2 interface.
 *
 * Removes the LISP-GPE interface from its bridge domain by switching it
 * back to L3 mode, then releases the interface for reuse.
 *
 * @param[in]   lgm     Reference to @ref lisp_gpe_main_t.
 * @param[in]   vni     Virtual network identifier.
 * @param[in]   bd_id   Bridge domain ID.
 */
void
lisp_gpe_del_l2_iface (lisp_gpe_main_t * lgm, u32 vni, u32 bd_id)
{
  tunnel_lookup_t *l2_ifaces = &lgm->l2_ifaces;
  vnet_hw_interface_t *hi;

  u32 bd_index = bd_find_index (&bd_main, bd_id);
  ASSERT (bd_index != ~0);
  uword *hip = hash_get (l2_ifaces->hw_if_index_by_dp_table, bd_index);

  if (hip == 0)
    {
      clib_warning ("The interface for bridge domain %d doesn't exist",
                    bd_id);
      return;
    }

  /* Remove interface from bridge ..
     by enabling L3 mode */
  hi = vnet_get_hw_interface (lgm->vnet_main, hip[0]);
  set_int_l2_mode (lgm->vlib_main, lgm->vnet_main, MODE_L3, hi->sw_if_index,
                   0, 0, 0, 0);
  lisp_gpe_remove_iface (lgm, hip[0], bd_index, &lgm->l2_ifaces);
}

/**
 * @brief Add LISP-GPE NSH interface.
 *
 * Creates LISP-GPE interface, sets it in L3 mode.
 *
 * @param[in]   lgm     Reference to @ref lisp_gpe_main_t.
 *
 * @return sw_if_index of the interface, or ~0 on failure.
 */
u32
vnet_lisp_gpe_add_nsh_iface (lisp_gpe_main_t * lgm)
{
  vnet_main_t *vnm = lgm->vnet_main;
  tunnel_lookup_t *nsh_ifaces = &lgm->nsh_ifaces;
  vnet_hw_interface_t *hi;
  uword *hip, *si;

  hip = hash_get (nsh_ifaces->hw_if_index_by_dp_table, 0);

  if (hip)
    {
      clib_warning ("NSH interface 0 already exists");
      return ~0;
    }

  si = hash_get (nsh_ifaces->sw_if_index_by_vni, 0);
  if (si)
    {
      clib_warning ("NSH interface already exists");
      return ~0;
    }

  /* create lisp iface and populate tunnel tables */
  hi = lisp_gpe_create_iface (lgm, 0, 0,
                              &nsh_lisp_gpe_device_class, &lgm->nsh_ifaces);

  /* enable interface */
  vnet_sw_interface_set_flags (vnm, hi->sw_if_index,
                               VNET_SW_INTERFACE_FLAG_ADMIN_UP);
  vnet_hw_interface_set_flags (vnm, hi->hw_if_index,
                               VNET_HW_INTERFACE_FLAG_LINK_UP);

  return (hi->sw_if_index);
}

/**
 * @brief Del LISP-GPE NSH interface.
 *
 */
void
vnet_lisp_gpe_del_nsh_iface (lisp_gpe_main_t * lgm)
{
  tunnel_lookup_t *nsh_ifaces = &lgm->nsh_ifaces;
  uword *hip;

  hip = hash_get (nsh_ifaces->hw_if_index_by_dp_table, 0);

  if (hip == 0)
    {
      clib_warning ("The NSH 0 interface doesn't exist");
      return;
    }

  lisp_gpe_remove_iface (lgm, hip[0], 0, &lgm->nsh_ifaces);
}

static clib_error_t *
lisp_gpe_add_del_iface_command_fn (vlib_main_t * vm, unformat_input_t * input,
                                   vlib_cli_command_t * cmd)
{
  unformat_input_t _line_input, *line_input = &_line_input;
  u8 is_add = 1;
  u32 table_id, vni, bd_id;
  u8 vni_is_set = 0, vrf_is_set = 0, bd_index_is_set = 0;
  u8 nsh_iface = 0;
  clib_error_t *error = NULL;

  if (vnet_lisp_gpe_enable_disable_status () == 0)
    {
      return clib_error_return (0, "LISP is disabled");
    }

  /* Get a line of input. */
  if (!unformat_user (input, unformat_line_input, line_input))
    return 0;

  while (unformat_check_input (line_input) != UNFORMAT_END_OF_INPUT)
    {
      if (unformat (line_input, "add"))
        is_add = 1;
      else if (unformat (line_input, "del"))
        is_add = 0;
      else if (unformat (line_input, "vrf %d", &table_id))
        {
          vrf_is_set = 1;
        }
      else if (unformat (line_input, "vni %d", &vni))
        {
          vni_is_set = 1;
        }
      else if (unformat (line_input, "bd %d", &bd_id))
        {
          bd_index_is_set = 1;
        }
      else if (unformat (line_input, "nsh"))
        {
          nsh_iface = 1;
        }
      else
        {
          error = clib_error_return (0, "parse error: '%U'",
                                     format_unformat_error, line_input);
          goto done;
        }
    }

  if (nsh_iface)
    {
      if (is_add)
        {
          if (~0 == vnet_lisp_gpe_add_nsh_iface (&lisp_gpe_main))
            {
              error = clib_error_return (0, "NSH interface not created");
              goto done;
            }
        }
      else
        {
          vnet_lisp_gpe_del_nsh_iface (&lisp_gpe_main);
        }
      goto done;
    }

  if (vrf_is_set && bd_index_is_set)
    {
      error = clib_error_return
        (0, "Cannot set both vrf and bridge domain index!");
      goto done;
    }

  if (!vni_is_set)
    {
      error = clib_error_return (0, "vni must be set!");
      goto done;
    }

  if (!vrf_is_set && !bd_index_is_set)
    {
      error =
        clib_error_return (0, "vrf or bridge domain index must be set!");
      goto done;
    }

  if (bd_index_is_set)
    {
      if
        (is_add)
        {
          if (~0 == lisp_gpe_tenant_l2_iface_add_or_lock (vni, bd_id))
            {
              error = clib_error_return (0, "L2 interface not created");
              goto done;
            }
        }
      else
        lisp_gpe_tenant_l2_iface_unlock (vni);
    }
  else
    {
      if (is_add)
        {
          if (~0 == lisp_gpe_tenant_l3_iface_add_or_lock (vni, table_id))
            {
              error = clib_error_return (0, "L3 interface not created");
              goto done;
            }
        }
      else
        lisp_gpe_tenant_l3_iface_unlock (vni);
    }

done:
  unformat_free (line_input);

  return error;
}

/* *INDENT-OFF* */
VLIB_CLI_COMMAND (add_del_lisp_gpe_iface_command, static) = {
  .path = "gpe iface",
  .short_help = "gpe iface add/del vni <vni> vrf <vrf>",
  .function = lisp_gpe_add_del_iface_command_fn,
};
/* *INDENT-ON* */

/*
 * fd.io coding-style-patch-verification: ON
 *
 * Local Variables:
 * eval: (c-set-style "gnu")
 * End:
 */