Manual Installation
===================

This document describes how to clone the Contiv repository and then use
`kubeadm <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/>`__
to manually install Kubernetes with Contiv-VPP networking on one or more
bare metal or VM hosts.

Clone the Contiv Repository
---------------------------

To clone the Contiv repository enter the following command:

::

   git clone https://github.com/contiv/vpp <repository-name>

**Note:** Replace ``<repository-name>`` with the directory name you want
assigned to your cloned Contiv repository.

The cloned repository contains several important folders that are
referenced throughout this Contiv documentation; they are listed below:

::

   vpp-contiv2$ ls
   build       build-root  doxygen  gmod       LICENSE      Makefile   RELEASE.md   src
   build-data  docs        extras   INFO.yaml  MAINTAINERS  README.md  sphinx_venv  test

Preparing Your Hosts
--------------------

Host-specific Configurations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-  **VMware VMs**: the vmxnet3 driver is required on each interface that
   will be used by VPP. Please see
   `here <https://github.com/contiv/vpp/tree/master/docs/VMWARE_FUSION_HOST.md>`__
   for instructions on how to install the vmxnet3 driver on VMware Fusion.

Setting up Network Adapter(s)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Setting up DPDK
^^^^^^^^^^^^^^^

DPDK setup must be completed **on each node** as follows:

-  Load the PCI UIO driver:

   ::

      $ sudo modprobe uio_pci_generic

-  Verify that the PCI UIO driver has loaded successfully:

   ::

      $ lsmod | grep uio
      uio_pci_generic        16384  0
      uio                    20480  1 uio_pci_generic

   Please note that this driver needs to be loaded upon each server
   bootup, so you may want to add ``uio_pci_generic`` into the
   ``/etc/modules`` file, or a file in the ``/etc/modules-load.d/``
   directory. For example, the ``/etc/modules`` file could look as
   follows:

   ::

      # /etc/modules: kernel modules to load at boot time.
      #
      # This file contains the names of kernel modules that should be loaded
      # at boot time, one per line. Lines beginning with "#" are ignored.
      uio_pci_generic
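
   If you prefer the ``/etc/modules-load.d/`` approach, a minimal sketch
   (the file name ``uio_pci_generic.conf`` below is only an example) is
   to create a file containing just the module name:

   ::

      # create a systemd modules-load.d entry so the driver loads at boot
      echo uio_pci_generic | sudo tee /etc/modules-load.d/uio_pci_generic.conf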

   .. rubric:: Determining Network Adapter PCI Addresses
      :name: determining-network-adapter-pci-addresses

   You need the PCI address of the network interface that VPP will use
   for the multi-node pod interconnect. On Debian-based distributions,
   you can use ``lshw`` (see the note below for other distributions):

::

   $ sudo lshw -class network -businfo
   Bus info          Device      Class      Description
   ====================================================
   pci@0000:00:03.0  ens3        network    Virtio network device
   pci@0000:00:04.0  ens4        network    Virtio network device

**Note:** On CentOS/RedHat/Fedora distributions, ``lshw`` may not be
available by default; you can install it by issuing the following command:
``yum -y install lshw``

Configuring vswitch to Use Network Adapters
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Finally, you need to set up the vswitch to use the network adapters:

-  `Setup on a node with a single
   NIC <https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md>`__
-  `Setup on a node with multiple
   NICs <https://github.com/contiv/vpp/tree/master/docs/MULTI_NIC_SETUP.md>`__
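
As an illustration only (the linked guides are authoritative), on a
single-NIC node the DPDK section of the VPP startup configuration
typically points at the PCI address determined above. Assuming the
startup config used by the Contiv vswitch lives at
``/etc/vpp/contiv-vswitch.conf``, it could contain:

::

   unix {
      nodaemon
      cli-listen /run/vpp/cli.sock
   }
   dpdk {
      dev 0000:00:04.0
   }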

Using a Node Setup Script
~~~~~~~~~~~~~~~~~~~~~~~~~

You can perform the above steps using the `node setup
script <https://github.com/contiv/vpp/tree/master/k8s/README.md#setup-node-sh>`__.
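
For example, a hedged sketch of invoking the script from a clone of the
``contiv/vpp`` repository (the ``k8s/setup-node.sh`` path is an
assumption based on the README linked above; the script is interactive
and walks you through the interface and DPDK choices):

::

   cd vpp/k8s
   sudo ./setup-node.sh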

Installing Kubernetes with Contiv-VPP CNI plugin
------------------------------------------------

After the nodes you will be using in your K8s cluster are prepared, you
can install the cluster using
`kubeadm <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/>`__.

(1/4) Installing Kubeadm on Your Hosts
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

For first-time installation, see `Installing
kubeadm <https://kubernetes.io/docs/setup/independent/install-kubeadm/>`__.
To update an existing installation, run ``apt-get update && apt-get
upgrade`` or ``yum update`` to get the latest version of kubeadm.

On each host with multiple NICs where the NIC that will be used for
Kubernetes management traffic is not the one pointed to by the default
route out of the host, a `custom management
network <https://github.com/contiv/vpp/tree/master/docs/CUSTOM_MGMT_NETWORK.md>`__
for Kubernetes must be configured.
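
A minimal sketch of what that typically involves (the linked document is
authoritative; ``192.168.56.106`` is only an assumed management IP) is
pointing the kubelet at the management IP of that NIC, for example in
``/etc/systemd/system/kubelet.service.d/10-kubeadm.conf``:

::

   Environment="KUBELET_EXTRA_ARGS=--node-ip=192.168.56.106"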

Using Kubernetes 1.10 and Above
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In K8s 1.10, support for huge pages in a pod has been introduced. For
now, this feature must either be disabled, or a memory limit must be
defined for the vswitch container.

To disable huge pages, perform the following steps as root:

-  Using your favorite editor, disable huge pages in the kubelet
   configuration file
   (``/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`` or
   ``/etc/default/kubelet`` for version 1.11+):

::

     Environment="KUBELET_EXTRA_ARGS=--feature-gates HugePages=false"

-  Restart the kubelet daemon:

::

     systemctl daemon-reload
     systemctl restart kubelet

To define the memory limit, append the following snippet to the vswitch
container in the deployment YAML file:

::

               resources:
                 limits:
                   hugepages-2Mi: 1024Mi
                   memory: 1024Mi

or set ``contiv.vswitch.defineMemoryLimits`` to ``true`` in `helm
values <https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/README.md>`__.
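
For example, a hedged sketch of rendering the deployment with the limit
enabled, assuming you run it from a clone of the ``contiv/vpp``
repository where the chart lives under ``k8s/contiv-vpp``:

::

   helm template --set contiv.vswitch.defineMemoryLimits=true k8s/contiv-vpp > contiv-vpp.yaml
   kubectl apply -f contiv-vpp.yaml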

(2/4) Initializing Your Master
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Before initializing the master, you may want to
`remove <#tearing-down-kubernetes>`__ any previously installed K8s
components. Then, proceed with master initialization as described in the
`kubeadm
manual <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#initializing-your-master>`__.
Execute the following command as root:

::

   kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16

**Note:** ``kubeadm init`` will autodetect the network interface to
advertise the master on as the interface with the default gateway. If
you want to use a different interface (e.g., in a custom management
network setup), specify the ``--apiserver-advertise-address=<ip-address>``
argument to ``kubeadm init``. For example:

::

   kubeadm init --token-ttl 0 --pod-network-cidr=10.1.0.0/16 --apiserver-advertise-address=192.168.56.106

**Note:** The CIDR specified with the flag ``--pod-network-cidr`` is
used by kube-proxy, and it **must include** the ``PodSubnetCIDR`` from
the ``IPAMConfig`` section in the Contiv-vpp config map in Contiv-vpp’s
deployment file
`contiv-vpp.yaml <https://github.com/contiv/vpp/blob/master/k8s/contiv-vpp/values.yaml>`__.
Pods in the host network namespace are a special case; they share their
respective interfaces and IP addresses with the host. For proxying to
work properly it is therefore required for services with backends
running on the host to also **include the node management IP** within
the ``--pod-network-cidr`` subnet. For example, with the default
``PodSubnetCIDR=10.1.0.0/16`` and ``PodIfIPCIDR=10.2.1.0/24``, the
subnet ``10.3.0.0/16`` could be allocated for the management network and
``--pod-network-cidr`` could be defined as ``10.0.0.0/8``, so as to
include IP addresses of all pods in all network namespaces:

::

   kubeadm init --token-ttl 0 --pod-network-cidr=10.0.0.0/8 --apiserver-advertise-address=10.3.1.1

If Kubernetes was initialized successfully, it prints out this message:

::

   Your Kubernetes master has initialized successfully!

After successful initialization, don’t forget to set up your .kube
directory as a regular user (as instructed by ``kubeadm``):

.. code:: bash

   mkdir -p $HOME/.kube
   sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
   sudo chown $(id -u):$(id -g) $HOME/.kube/config

(3/4) Installing the Contiv-VPP Pod Network
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If you have already used the Contiv-VPP plugin before, you may need to
pull the most recent Docker images on each node:

::

   bash <(curl -s https://raw.githubusercontent.com/contiv/vpp/master/k8s/pull-images.sh)

Install the Contiv-VPP network for your cluster as follows:

-  If you do not use the STN feature, install Contiv-vpp as follows:

   ::

      kubectl apply -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml

-  If you use the STN feature, download the ``contiv-vpp.yaml`` file:

   ::

      wget https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml

   Then edit the STN configuration as described
   `here <https://github.com/contiv/vpp/tree/master/docs/SINGLE_NIC_SETUP.md#configuring-stn-in-contiv-vpp-k8s-deployment-files>`__.
   Finally, create the Contiv-vpp deployment from the edited file:

   ::

      kubectl apply -f ./contiv-vpp.yaml

Beware that contiv-etcd data is persisted in ``/var/etcd`` by default.
It has to be cleaned up manually after ``kubeadm reset``; otherwise,
outdated data will be loaded by a subsequent deployment.

Alternatively, you can deploy the data into a randomly named subfolder
each time:

::

   curl --silent https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml | sed "s/\/var\/etcd\/contiv-data/\/var\/etcd\/contiv-data\/$RANDOM/g" | kubectl apply -f -

Deployment Verification
^^^^^^^^^^^^^^^^^^^^^^^

After some time, all contiv containers should enter the running state:

::

   root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv
   NAME                           READY     STATUS    RESTARTS   AGE       IP               NODE
   ...
   contiv-etcd-gwc84              1/1       Running   0          14h       192.168.56.106   cvpp
   contiv-ksr-5c2vk               1/1       Running   2          14h       192.168.56.106   cvpp
   contiv-vswitch-l59nv           2/2       Running   0          14h       192.168.56.106   cvpp

In particular, make sure that the Contiv-VPP pod IP addresses are the
same as the IP address specified in the
``--apiserver-advertise-address=<ip-address>`` argument to kubeadm init.

Verify that the VPP successfully grabbed the network interface specified
in the VPP startup config (``GigabitEthernet0/4/0`` in our case):

::

   $ sudo vppctl
   vpp# sh inter
                 Name               Idx       State          Counter          Count
   GigabitEthernet0/4/0              1         up       rx packets                  1294
                                                        rx bytes                  153850
                                                        tx packets                   512
                                                        tx bytes                   21896
                                                        drops                        962
                                                        ip4                         1032
   host-40df9b44c3d42f4              3         up       rx packets                126601
                                                        rx bytes                44628849
                                                        tx packets                132155
                                                        tx bytes                27205450
                                                        drops                         24
                                                        ip4                       126585
                                                        ip6                           16
   host-vppv2                        2         up       rx packets                132162
                                                        rx bytes                27205824
                                                        tx packets                126658
                                                        tx bytes                44634963
                                                        drops                         15
                                                        ip4                       132147
                                                        ip6                           14
   local0                            0        down

You should also see the interface to kube-dns (``host-40df9b44c3d42f4``)
and to the node’s IP stack (``host-vppv2``).

Master Isolation (Optional)
^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, your cluster will not schedule pods on the master for
security reasons. If you want to be able to schedule pods on the master
(e.g., for a single-machine Kubernetes cluster for development), then
run:

::

   kubectl taint nodes --all node-role.kubernetes.io/master-

More details about installing the pod network can be found in the
`kubeadm
manual <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#pod-network>`__.

(4/4) Joining Your Nodes
~~~~~~~~~~~~~~~~~~~~~~~~

To add a new node to your cluster, run (as root) the command that was
output by ``kubeadm init``. For example:

::

   kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>

More details can be found in the `kubeadm
manual <https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/#joining-your-nodes>`__.

.. _deployment-verification-1:

Deployment Verification
^^^^^^^^^^^^^^^^^^^^^^^

After some time, all contiv containers should enter the running state:

::

   root@cvpp:/home/jan# kubectl get pods -n kube-system -o wide | grep contiv
   NAME                           READY     STATUS    RESTARTS   AGE       IP               NODE
   contiv-etcd-gwc84              1/1       Running   0          14h       192.168.56.106   cvpp
   contiv-ksr-5c2vk               1/1       Running   2          14h       192.168.56.106   cvpp
   contiv-vswitch-h6759           2/2       Running   0          14h       192.168.56.105   cvpp-slave2
   contiv-vswitch-l59nv           2/2       Running   0          14h       192.168.56.106   cvpp
   etcd-cvpp                      1/1       Running   0          14h       192.168.56.106   cvpp
   kube-apiserver-cvpp            1/1       Running   0          14h       192.168.56.106   cvpp
   kube-controller-manager-cvpp   1/1       Running   0          14h       192.168.56.106   cvpp
   kube-dns-545bc4bfd4-fr6j9      3/3       Running   0          14h       10.1.134.2       cvpp
   kube-proxy-q8sv2               1/1       Running   0          14h       192.168.56.106   cvpp
   kube-proxy-s8kv9               1/1       Running   0          14h       192.168.56.105   cvpp-slave2
   kube-scheduler-cvpp            1/1       Running   0          14h       192.168.56.106   cvpp

In particular, verify that a vswitch pod and a kube-proxy pod are
running on each joined node, as shown above.

On each joined node, verify that the VPP successfully grabbed the
network interface specified in the VPP startup config
(``GigabitEthernet0/4/0`` in our case):

::

   $ sudo vppctl
   vpp# sh inter
                 Name               Idx       State          Counter          Count
   GigabitEthernet0/4/0              1         up
   ...

From the vpp CLI on a joined node you can also ping kube-dns to verify
node-to-node connectivity. For example:

::

   vpp# ping 10.1.134.2
   64 bytes from 10.1.134.2: icmp_seq=1 ttl=64 time=.1557 ms
   64 bytes from 10.1.134.2: icmp_seq=2 ttl=64 time=.1339 ms
   64 bytes from 10.1.134.2: icmp_seq=3 ttl=64 time=.1295 ms
   64 bytes from 10.1.134.2: icmp_seq=4 ttl=64 time=.1714 ms
   64 bytes from 10.1.134.2: icmp_seq=5 ttl=64 time=.1317 ms

   Statistics: 5 sent, 5 received, 0% packet loss

Deploying Example Applications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Simple Deployment
^^^^^^^^^^^^^^^^^

You can go ahead and create a simple deployment:

::

   $ kubectl run nginx --image=nginx --replicas=2

Use ``kubectl describe pod`` to get the IP address of a pod, e.g.:

::

   $ kubectl describe pod nginx | grep IP

You should see two IP addresses, for example:

::

   IP:     10.1.1.3
   IP:     10.1.1.4

You can check the pods' connectivity in one of the following ways:

-  Connect to the VPP debug CLI and ping any pod:

::

     sudo vppctl
     vpp# ping 10.1.1.3

-  Start busybox and ping any pod:

::

     kubectl run busybox --rm -ti --image=busybox /bin/sh
     If you don't see a command prompt, try pressing enter.
     / #
     / # ping 10.1.1.3

-  You should be able to ping any pod from the host:

::

     ping 10.1.1.3

Deploying Pods on Different Nodes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To enable pod deployment on the master, untaint the master first:

::

   kubectl taint nodes --all node-role.kubernetes.io/master-

In order to verify inter-node pod connectivity, we need to tell
Kubernetes to deploy one pod on the master node and one pod on the
worker. For this, we can use node selectors.

In your deployment YAMLs, add the ``nodeSelector`` sections that refer
to preferred node hostnames, e.g.:

::

     nodeSelector:
       kubernetes.io/hostname: vm5

Example of the complete YAML definitions:

::

   apiVersion: v1
   kind: Pod
   metadata:
     name: nginx1
   spec:
     nodeSelector:
       kubernetes.io/hostname: vm5
     containers:
       - name: nginx
         image: nginx

::

   apiVersion: v1
   kind: Pod
   metadata:
     name: nginx2
   spec:
     nodeSelector:
       kubernetes.io/hostname: vm6
     containers:
       - name: nginx
         image: nginx

After deploying the YAMLs, verify that the pods were deployed on different hosts:

::

   $ kubectl get pods -o wide
   NAME      READY     STATUS    RESTARTS   AGE       IP           NODE
   nginx1    1/1       Running   0          13m       10.1.36.2    vm5
   nginx2    1/1       Running   0          13m       10.1.219.3   vm6

Now you can verify the connectivity to both nginx pods from a busybox
pod:

::

   kubectl run busybox --rm -it --image=busybox /bin/sh

   / # wget 10.1.36.2
   Connecting to 10.1.36.2 (10.1.36.2:80)
   index.html           100% |*******************************************************************************************************************************************************************|   612   0:00:00 ETA

   / # rm index.html

   / # wget 10.1.219.3
   Connecting to 10.1.219.3 (10.1.219.3:80)
   index.html           100% |*******************************************************************************************************************************************************************|   612   0:00:00 ETA

Uninstalling Contiv-VPP
~~~~~~~~~~~~~~~~~~~~~~~

To uninstall the network plugin itself, use ``kubectl``:

::

   kubectl delete -f https://raw.githubusercontent.com/contiv/vpp/master/k8s/contiv-vpp.yaml
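
If you also want to discard the contiv-etcd state persisted on the nodes
(assuming the default ``/var/etcd`` location noted earlier), remove it
manually on each node:

::

   sudo rm -rf /var/etcd/contiv-data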

Tearing down Kubernetes
~~~~~~~~~~~~~~~~~~~~~~~

-  First, drain the node and make sure that the node is empty before
   shutting it down:

::

     kubectl drain <node name> --delete-local-data --force --ignore-daemonsets
     kubectl delete node <node name>

-  Next, on the node being removed, reset all kubeadm installed state:

::

     rm -rf $HOME/.kube
     sudo su
     kubeadm reset

-  If you previously added environment variable definitions to
   ``/etc/systemd/system/kubelet.service.d/10-kubeadm.conf`` as part of
   setting up a `custom management
   network <https://github.com/contiv/vpp/blob/master/docs/CUSTOM_MGMT_NETWORK.md#setting-up-a-custom-management-network-on-multi-homed-nodes>`__,
   remove those definitions now.

Troubleshooting
~~~~~~~~~~~~~~~

Some of the issues that can occur during the installation are:

-  Forgetting to create and initialize the ``.kube`` directory in your
   home directory (as instructed by ``kubeadm init --token-ttl 0``).
   This can manifest itself as the following error:

   ::

      W1017 09:25:43.403159    2233 factory_object_mapping.go:423] Failed to download OpenAPI (Get https://192.168.209.128:6443/swagger-2.0.0.pb-v1: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")), falling back to swagger
      Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")

-  A previous installation lingering on the file system:
   ``kubeadm init --token-ttl 0`` fails to initialize the kubelet with
   one or more of the following error messages:

   ::

      ...
      [kubelet-check] It seems like the kubelet isn't running or healthy.
      [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10255/healthz' failed with error: Get http://localhost:10255/healthz: dial tcp [::1]:10255: getsockopt: connection refused.
      ...

If you run into any of the above issues, try to clean up and reinstall
as root:

::

   sudo su
   rm -rf $HOME/.kube
   kubeadm reset
   kubeadm init --token-ttl 0
   rm -rf /var/etcd/contiv-data
   rm -rf /var/bolt/bolt.db

Contiv-specific kubeadm installation on Aarch64
-----------------------------------------------

Supplemental instructions apply when using Contiv-VPP on Aarch64. Most
installation steps for Aarch64 are the same as those described earlier
in this chapter, so read them first before you start the installation
on the Aarch64 platform.

Use the `Aarch64-specific kubeadm install
instructions <https://github.com/contiv/vpp/blob/master/docs/arm64/MANUAL_INSTALL_ARM64.md>`__
to manually install Kubernetes with Contiv-VPP networking on one or more
Aarch64 bare-metal hosts or VMs.