Kubernetes has one of the most active open source communities, and it has long been established as the standard for container orchestration. For these and several other reasons, it is essential to know the core components of Kubernetes and how they interact. One of the best-known and most effective ways to acquire this knowledge, and even to study for the CKA (Certified Kubernetes Administrator) certification, is to work through Kubernetes The Hard Way by Kelsey Hightower, configuring a cluster from scratch.
This post gives an overview of the tutorial and notes on the points that most caught our attention while configuring Kubernetes.
The first chapters of the tutorial present how to access the Google Cloud Platform, the cloud provider chosen for creating the infrastructure. But don’t worry, it is possible to use another cloud provider: the tutorial does not use any Google-specific product, just compute, storage and network resources.
Some of the tools used throughout the tutorial are also shown at the beginning, among which we highlight:
CFSSL: A toolkit created by Cloudflare, an American company focused on web infrastructure and security. Cloudflare itself calls it a Swiss Army knife for PKI/TLS. It is written in Go and can be used to sign, verify and bundle TLS certificates. We really recommend this tool for any other situation involving TLS; it is simple to use and powerful.
kubectl: It needs no in-depth introduction. It is the well-known command line tool for controlling Kubernetes clusters, and it uses the information stored in the kubeconfig (by default in $HOME/.kube/config) to operate on the cluster through the API Server.
Next comes the provisioning of the infrastructure for the components that control Kubernetes, also known as the control plane, and of the infrastructure for the worker nodes, which are the nodes that actually run the workloads. Here is the topology proposed in the tutorial:
Note that the network is flat: all containers and nodes can communicate with each other. For this we have a subnet (10.240.0.0/24) with 254 usable host addresses, used for the controller nodes and worker nodes (and for any nodes that may be added in the future). For pod allocation it is also necessary to define a CIDR range, in the tutorial 10.200.0.0/16, which can be split into 256 /24 subnets. The cluster-wide range is passed to the Controller Manager, and each worker node is configured with its own pod subnet, so every pod receives an address within the range defined for its worker node. For instance, worker-1 with pod CIDR 10.200.1.0/24 can place pods between addresses 10.200.1.1 and 10.200.1.254, and worker-2 with pod CIDR 10.200.2.0/24 can place pods between addresses 10.200.2.1 and 10.200.2.254.
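The address arithmetic in this paragraph is easy to check with Python's standard ipaddress module. The snippet below is only a sketch: the two ranges come from the text, while the helper name usable_range is our own and not part of the tutorial. It counts the /24 blocks inside the pod range and prints the first and last usable address of a few networks.

```python
import ipaddress

# Ranges quoted in the text above.
cluster_cidr = ipaddress.ip_network("10.200.0.0/16")  # pod range
node_subnet = ipaddress.ip_network("10.240.0.0/24")   # controller and worker nodes

# Split the pod range into per-node /24 subnets.
pod_subnets = list(cluster_cidr.subnets(new_prefix=24))


def usable_range(network):
    """Return (first, last, count) of the usable host addresses in a network."""
    hosts = list(network.hosts())
    return str(hosts[0]), str(hosts[-1]), len(hosts)


print(len(pod_subnets))              # how many /24 blocks fit in the /16
print(usable_range(node_subnet))     # node subnet
print(usable_range(pod_subnets[1]))  # 10.200.1.0/24, worker-1 in the text
print(usable_range(pod_subnets[2]))  # 10.200.2.0/24, worker-2 in the text
```

Indexing pod_subnets by the worker number works here only because the example CIDRs follow the 10.200.N.0/24 pattern; other layouts would need an explicit mapping.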
Configurations, certificates and more certificates
After the compute resources are created, things get a little tedious, mainly because the next step is provisioning the PKI (Public Key Infrastructure), which is basically an arrangement that links public keys to identities through a Certificate Authority. All of this serves to encrypt traffic and to confirm that the parties involved in the communication are really who they say they are. Anyone who has had to build this infrastructure knows that it takes some work, but in this case the CFSSL tool helps to reduce the complexity. It is extremely important that this is done, so that the communication between (critical) components flows safely and reliably; other authentication methods would not provide the same level of security.
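To make the link between certificates and identities more concrete, here is a small sketch that builds certificate signing requests in the JSON layout that cfssl reads. Kubernetes takes the user name from the certificate's CN field and the groups from its O field, which is why a kubelet certificate uses the system:node:<nodeName> name and the system:nodes group. The helper build_csr and the node name worker-1 are illustrative assumptions, not something the tutorial defines, and a real request would usually carry more name fields (country, locality and so on).

```python
import json


def build_csr(common_name, organization, key_size=2048):
    """Build a CSR document in the JSON layout consumed by cfssl (hypothetical helper)."""
    return {
        "CN": common_name,                         # becomes the Kubernetes user name
        "key": {"algo": "rsa", "size": key_size},
        "names": [{"O": organization}],            # becomes the Kubernetes group
    }


# Identity for the kubelet on a node assumed to be called worker-1.
kubelet_csr = build_csr("system:node:worker-1", "system:nodes")
# Identity for a cluster administrator.
admin_csr = build_csr("admin", "system:masters")

print(json.dumps(kubelet_csr, indent=2))
print(json.dumps(admin_csr, indent=2))
```

Each document would then be saved to a file and handed to cfssl together with the CA certificate and key to produce the signed certificate.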
Another important point of the tutorial is the configuration of the EncryptionConfig, which plays a fundamental role in security by encrypting the information saved in etcd. It is important to note that etcd stores its information in a key/value format, where the maximum request size is 1.5 MiB by default, so that the system does not suffer too much from latency. And if you like your cluster, have a backup policy for etcd’s data.
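As a rough illustration of what such a file contains, the sketch below generates a random 32-byte key and places it in an EncryptionConfig-shaped document that encrypts Secrets with the aescbc provider. The function name new_encryption_config and the key name key1 are assumptions made for the example. Recent Kubernetes documentation describes this object as kind EncryptionConfiguration under apiVersion apiserver.config.k8s.io/v1, so check the fields against the version you run. The output is JSON, which is also valid YAML.

```python
import base64
import json
import secrets


def new_encryption_config(key_name="key1"):
    """Return an EncryptionConfig-style document with a fresh 32-byte key (sketch)."""
    key = base64.b64encode(secrets.token_bytes(32)).decode("ascii")
    return {
        "kind": "EncryptionConfig",
        "apiVersion": "v1",
        "resources": [
            {
                "resources": ["secrets"],
                "providers": [
                    # First provider is used for writes: AES-CBC with our key.
                    {"aescbc": {"keys": [{"name": key_name, "secret": key}]}},
                    # Fallback so data written before encryption can still be read.
                    {"identity": {}},
                ],
            }
        ],
    }


config = new_encryption_config()
print(json.dumps(config, indent=2))
```

The key is generated fresh on every call, so the document should be written once and kept safe; losing it means losing access to the encrypted data.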
Another great configuration helper is covered in chapter five: the Kubernetes configuration files, also known as kubeconfigs. They are YAML files of kind Config that organize information about clusters, users, namespaces and authentication mechanisms, and that can group this information in the contexts section. With this configuration mechanism we can communicate with the API Server of any desired Kubernetes cluster and interact with it through the APIs. The kubectl tool uses this file too, which by default is at the path $HOME/.kube/config.
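The way a context ties a cluster and a user together can be sketched in a few lines. The dictionary below mirrors the top-level layout of a kubeconfig (clusters, users, contexts, current-context); the server address, file names and entry names are placeholders chosen for illustration, and resolve_context is a hypothetical helper, not the logic kubectl actually runs.

```python
import json

# Placeholder values; a real kubeconfig would point at your own cluster and files.
kubeconfig = {
    "apiVersion": "v1",
    "kind": "Config",
    "clusters": [
        {"name": "kubernetes-the-hard-way",
         "cluster": {"server": "https://127.0.0.1:6443",
                     "certificate-authority": "ca.pem"}},
    ],
    "users": [
        {"name": "admin",
         "user": {"client-certificate": "admin.pem",
                  "client-key": "admin-key.pem"}},
    ],
    "contexts": [
        {"name": "default",
         "context": {"cluster": "kubernetes-the-hard-way", "user": "admin"}},
    ],
    "current-context": "default",
}


def resolve_context(config, context_name=None):
    """Return (server, user, namespace) for a context, defaulting to current-context."""
    name = context_name or config["current-context"]
    context = next(c["context"] for c in config["contexts"] if c["name"] == name)
    cluster = next(c["cluster"] for c in config["clusters"]
                   if c["name"] == context["cluster"])
    return cluster["server"], context["user"], context.get("namespace", "default")


print(resolve_context(kubeconfig))
print(json.dumps(kubeconfig, indent=2))  # JSON output is also valid YAML
```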
Controllers and Worker Nodes Components
In this step of the tutorial we have the installation and configuration of all the dependencies and components that make up the base of Kubernetes. The binaries are downloaded to the controller and worker nodes; private and public keys are distributed for authentication and encrypted communication between the components; and the kubeconfigs and systemd services, which carry the parameters for the binaries, are configured. Below is a list of all the installed components:
Controller Nodes Components (Control Plane)
Worker Nodes Components
An interesting point of the kubelet configuration is the need to disable swap on the worker nodes. This is necessary not just because supporting swap is a hard task, but also because the Kubernetes design is focused on performance and on optimizing the use of node resources. We can see this in the definition of Quality of Service for Pods, where the system assigns a different QoS class (Guaranteed, Burstable or BestEffort) depending on the configured resource requests and limits.
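The three QoS classes follow from a simple rule over the CPU and memory settings of a pod's containers, which the sketch below approximates. It is a simplified model rather than the kubelet's code: the function qos_class and the dictionary layout are our own, and quantities are compared as plain strings, so "500m" and "0.5" would be treated as different values.

```python
def qos_class(containers):
    """Classify a pod from its containers' CPU/memory requests and limits.

    Each container is a dict such as {"requests": {...}, "limits": {...}}.
    """
    any_set = False
    guaranteed = True
    for container in containers:
        limits = container.get("limits", {})
        # When a limit is set without a request, the request defaults to the limit.
        requests = {**limits, **container.get("requests", {})}
        for resource in ("cpu", "memory"):
            if resource in requests or resource in limits:
                any_set = True
            # Guaranteed needs a limit on both resources, equal to the request.
            if resource not in limits or requests.get(resource) != limits[resource]:
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{}]))
print(qos_class([{"limits": {"cpu": "500m", "memory": "128Mi"}}]))
print(qos_class([{"requests": {"cpu": "250m"}}]))
```

A pod with no requests or limits at all is the first candidate for eviction under memory pressure, which is one reason predictable memory accounting on the node matters.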
CNI Networking Plugin
Another important element in the architecture is the configuration of the Container Network Interface (CNI) plugin, which is responsible for the connectivity of containers by configuring their network interfaces, and also for removing the allocated resources when the container is deleted.
In chapter eleven of the tutorial, routing rules are created between the worker nodes, so that all worker nodes, and consequently pods and services, have connectivity. The DNS Cluster Add-on is then provisioned, so that components can be reached by name. The installed component is CoreDNS; the kubelet configures each pod to use it as its DNS server, so workloads scheduled on the nodes can resolve services by name.
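The routing rules mentioned above boil down to a table that maps each worker's pod CIDR to that worker's own address as the next hop. The sketch below models that lookup. The node addresses 10.240.0.21 and 10.240.0.22 are hypothetical values inside the node subnet, and next_hop is an illustrative helper, not a command or API from the tutorial.

```python
import ipaddress

# Hypothetical inventory: worker node IP -> pod CIDR routed to that node.
routes = {
    "10.240.0.21": ipaddress.ip_network("10.200.1.0/24"),
    "10.240.0.22": ipaddress.ip_network("10.200.2.0/24"),
}


def next_hop(pod_ip):
    """Return the node IP that should receive traffic for a pod address, or None."""
    address = ipaddress.ip_address(pod_ip)
    for node_ip, pod_cidr in routes.items():
        if address in pod_cidr:
            return node_ip
    return None  # no route: the address belongs to no known worker


print(next_hop("10.200.1.5"))
print(next_hop("10.200.2.17"))
print(next_hop("10.200.9.1"))
```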
It is important to mention that the tutorial does not offer the possibility of using services of type LoadBalancer, because configuring the integration with the cloud provider is out of scope, and the LoadBalancer type needs that integration to create the appropriate component for exposing the service on the provider network.
Finally, the tutorial delivers a working, highly available cluster built from scratch, and the last chapters are dedicated to tests and cleanup. The objective of the tests is to guarantee that data encryption in etcd is working; to test an application deployment; to test access to applications through port-forwarding for inspection; to query the logs of a pod; to access pods via a shell; and to expose an application to the internet through a NodePort service.
But why stop there with the Kubernetes cluster? We have total control to adjust settings, change network ranges, run new tests and, why not, break component configurations to understand behaviors and failures. Undoubtedly, Kubernetes The Hard Way is a fantastic way to understand and evaluate this complex system.