What is Kubernetes?

Oct 3, 2023 • 10 min

Commonly abbreviated as kube or K8s, Kubernetes has emerged in recent years as the reference container orchestration solution for application deployment.

With Kubernetes, you can deploy containerized applications across any type of IT infrastructure and centrally manage the various resources they require: computing, storage, databases, networking, etc. These resources are grouped into a Kubernetes cluster composed of a set of servers.

While this article doesn’t delve into the basics of containers (a prerequisite for using K8s), it presents Kubernetes in broad strokes, its utility for platform operation, and the different types of deployments.

A bit of history on Kubernetes

Kubernetes was originally developed by Google in 2013, and the project was made open-source in 2014.

Kubernetes benefited from 15 years of experience accumulated by Google’s teams in cluster management, drawing lessons from writing and operating an internal tool called Borg, described in a paper published in 2015: Large-scale cluster management at Google with Borg. Borg itself, however, has never been made publicly available. Both projects rely on the concept of containers, Google having been one of the early contributors to containerization technologies through Linux cgroups, introduced in 2006.

In 2015, Google donated the project to the Cloud Native Computing Foundation (CNCF), affiliated with the Linux Foundation. The project thus became independent and vendor-neutral with respect to cloud service providers. Over the years, Kubernetes has gradually established itself as the most widely used orchestration solution and has received contributions from a vast community.

Why Kubernetes?

If you’re deploying a containerized application on just one machine (and if that is suitable in terms of availability), there is no need to use an orchestration solution.

However, as soon as applications become complex, and you aim to ensure high availability of the service across different environments (servers, clouds, etc.), you will quickly need to manage a large number of containers, with varied infrastructure resource requirements.

You will likely face issues such as:

  • Some services must be accessible from the Internet, others only by other containers (potentially with internal/external load-balancing logic): the network configuration becomes complex and must be done manually.
  • Storage must be managed manually (e.g., mounting the right volumes in the right places, etc.).
  • Containers do not natively support horizontal scaling, so launching and distributing various containers across different servers must be done manually.
  • It’s not easy/automatic to react to events, such as increased load, unavailability of a machine or a container, etc.
  • It’s hard to get a list of all the containers related to a service or an application deployed across different servers, which makes day-to-day operation particularly challenging.

Kubernetes is a container orchestration solution that notably addresses these challenges, and many more. Let’s look at this more closely!

The Essentials of Kubernetes

Main Kubernetes Resources

Within Kubernetes, there are various types of resources that describe the nature and state of a cluster:

  • Nodes are the virtual (VM) or physical (“bare-metal”) servers or instances of the Kubernetes cluster.
  • Pods are groups of containers running together on a node. They are the basic deployment unit of Kubernetes. A pod typically contains one application container, plus zero or more sidecars, for example an agent monitoring the application container.
  • Deployments create pods from a user-defined template; among other things, they specify the number of replicas that Kubernetes then instantiates. When you delete a pod created by a Deployment, Kubernetes automatically recreates it. Deployments are intended for long-running processes such as web services or workers (see the manifest sketch after this list).
  • DaemonSets, somewhat like Deployments, enable creating pods, but with the guarantee that there will be one and only one Replica per node.
  • CronJobs simulate the operation of a crontab by creating pods at regular intervals to execute specific tasks. These pods are ephemeral; they stop once the task is completed.
  • Ingresses allow exposing services outside the cluster, typically over HTTP(S), optionally under a given domain name.
  • Secrets enable securely storing and managing sensitive information (like credentials).
  • And many more, such as Namespaces, Services, ConfigMaps, etc.
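
As an illustration, here is a minimal sketch of a Deployment manifest (names and image are illustrative) asking Kubernetes to maintain three replicas of an nginx pod; we will come back to the YAML manifest format below:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

If one of the three pods disappears, Kubernetes automatically creates a replacement to match the declared replica count.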

Less well known but interesting for operating a platform on Kubernetes: it is also possible to add new workload types (beyond Deployments, StatefulSets, DaemonSets, CronJobs, …) that extend their functionality, as offered for example by the OpenKruise project.

Kubernetes Architecture

Kubernetes is a set of components that manage the different parts of a cluster and orchestrate the containers. The components that manage the cluster are collectively called the control plane; they can be replicated to ensure high availability of the service. Orchestrating containers in K8s means scheduling, executing, and stopping containers automatically, according to rules defined by the user through Kubernetes resources.

Kubernetes control plane

This diagram from Lucas Käldström shows the different parts of the Kubernetes control plane (master) and the different parts of the nodes (workers) of a cluster:

  • API Server: The K8s API allows communication with the cluster and configuration of its various resources.
  • etcd: A distributed database that stores the cluster resources.
  • Scheduler: A process whose purpose is to assign pods to nodes.
  • Controller Manager: A process that includes the main control loops of Kubernetes. We will revisit this concept later in this article.
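
On many clusters (depending on the distribution), these control-plane components run as pods in the kube-system namespace, so you can observe them using the kubectl CLI presented later in this article; the output below is purely illustrative:

$ kubectl get pods -n kube-system
NAME                            READY   STATUS    RESTARTS   AGE
etcd-main1                      1/1     Running   0          166d
kube-apiserver-main1            1/1     Running   0          166d
kube-controller-manager-main1   1/1     Running   0          166d
kube-scheduler-main1            1/1     Running   0          166d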

Kubernetes Reconciliation Loops

Kubernetes is a declarative system: the user describes the expected result, and K8s implements it, in contrast to an imperative system where it is up to the user to describe the implementation.
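
To illustrate the difference, here is a sketch using standard kubectl commands (names and image are illustrative):

$ # imperative: you describe the steps to perform, one by one
$ kubectl create deployment nginx --image=nginx:1.14.2
$ kubectl scale deployment nginx --replicas=3
$
$ # declarative: you describe the desired state in a manifest
$ # and let Kubernetes converge towards it
$ kubectl apply -f ./nginx-deployment.yaml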

The declarative nature of Kubernetes involves two interdependent concepts: the current state and the desired state. The current state is the actual state of the Kubernetes cluster at a given time, while the desired state corresponds to the configuration given by the user through the resources discussed above. The task of Kubernetes, and more specifically of its controller manager, is to reconcile these two states by converging the current state towards the desired state. This is done via what are called reconciliation loops.

[Diagram: Kubernetes reconciliation loop]

This approach brings several major benefits to facilitate the operation of a platform under Kubernetes, such as:

  • Self-healing: if a node becomes unavailable for any reason, Kubernetes automatically recreates the pods that were running on it on one or more other available nodes. Similarly, if a container crashes, it is automatically restarted (see the example after this list).
  • High resilience: when an action does not work, such as a request for SSL certificate creation, Kubernetes will automatically retry several times at increasing intervals.
  • Intelligent scheduling: Kubernetes will automatically assign pods to available nodes while distributing the workload among these nodes.
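
For example, self-healing can be observed by deleting a pod managed by a Deployment; the reconciliation loop immediately recreates a replacement to restore the desired replica count (pod names are generated by Kubernetes and purely illustrative here):

$ kubectl delete pod nginx-66b6c48dd5-8w5kp
pod "nginx-66b6c48dd5-8w5kp" deleted
$ kubectl get pods
NAME                     READY   STATUS    RESTARTS   AGE
nginx-66b6c48dd5-qx7vn   1/1     Running   0          5s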

Kubernetes Manifests

All Kubernetes resources are defined through manifests in YAML format (a human-readable data format often used in configuration files) or, more rarely, in JSON, and are accessible via the K8s API.

Here is an example of what the description of a pod looks like:

apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80

YAML manifests are the official definition of the desired state. These manifests can (and should) be versioned in a code repository such as git, which later makes it very easy to set up a staging environment or to implement a DRP (Disaster Recovery Plan).
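
Once applied (with the kubectl tool presented just below), a resource can be read back from the K8s API, and the object returned also carries the current state in its status field. A sketch, assuming the pod manifest above has been saved as ./nginx-pod.yaml (output truncated):

$ kubectl apply -f ./nginx-pod.yaml
pod/nginx created
$ kubectl get pod nginx -o yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  ...
spec:
  ...
status:
  phase: Running
  ...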

Kubectl, the Official CLI of Kubernetes

kubectl is the official command-line tool of Kubernetes. It allows you to manage cluster resources: create pods, apply Deployments, and much more.

Here are a few examples of usage:

$ # list the nodes of the cluster
$ kubectl get nodes
NAME      STATUS   ROLES               AGE    VERSION
main1     Ready    controlplane,etcd   166d   v1.18.18
worker1   Ready    worker              166d   v1.18.18
worker2   Ready    worker              166d   v1.18.18
worker3   Ready    worker              55d    v1.18.18
$
$ # apply the yaml of a Deployment
$ # creates or updates the Deployment if it already exists
$ kubectl apply -f ./nginx-deployment.yaml
deployment.apps/nginx created
$
$ # delete a Deployment via its yaml
$ kubectl delete -f ./nginx-deployment.yaml
deployment "nginx" deleted

Other excellent tools such as Lens (offered by our partner Mirantis) or K9s provide a graphical interface for performing the same actions as kubectl.

Extensions of Kubernetes: Plugins, CRDs, and Operators

While Kubernetes comes packed with a plethora of native features, on a production platform it often needs to be extended in several ways:

  • Infrastructure-Level Extension: To integrate with specialized external “components” such as storage, network, or container runtime. The integration of these components is standardized through specifications like CSI (Container Storage Interface) for storage, CNI (Container Network Interface) for the network, and CRI (Container Runtime Interface) for container execution. For each of these specifications, several implementations already exist, for example, Kube-Router for CNI, containerd for CRI, and OpenEBS for CSI. You can also create your own implementations of these specifications to meet your own needs. This is what we did at Enix with our storage driver SAN iSCSI CSI, which is open-sourced on our GitHub.

  • Functional-Level Extension: To cover specific needs. For example, new types of resources can be created thanks to CRDs (Custom Resource Definitions), which are managed by operators. Concrete example: with the cert-manager operator, you can use the Kubernetes API to obtain and renew TLS certificates via Vault or Let’s Encrypt, in a way that is well integrated with other Kubernetes resources (notably Secrets and Ingresses). Another common use of operators is automating the lifecycle of database clusters: Kubernetes becomes able, for instance, to provision a pair of SQL servers (with primary/secondary replication) and to recover automatically in case of failure (see the sketch after this list).
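
As an illustration of such a custom resource, here is a sketch of a cert-manager Certificate manifest. It assumes cert-manager is installed and that a ClusterIssuer named letsencrypt-prod (a hypothetical name) has already been configured:

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: www-example-com
spec:
  secretName: www-example-com-tls  # the operator stores the issued certificate in this Secret
  dnsNames:
  - www.example.com
  issuerRef:
    name: letsencrypt-prod         # hypothetical ClusterIssuer
    kind: ClusterIssuer

The operator watches for such resources and reconciles them like any native Kubernetes object, obtaining the certificate and renewing it before expiry.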

Kubernetes Distributions

Kubernetes isn’t designed to function alone. When you wish to control all the elements of your cluster, add custom features, and deploy on the infrastructure of your choice, the components can be installed manually. These are known as vanilla Kubernetes clusters.
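
For example (a sketch with placeholder values), a vanilla cluster is often bootstrapped with kubeadm, node by node:

$ # on the first control-plane node
$ sudo kubeadm init --pod-network-cidr=10.244.0.0/16
$
$ # on each worker node, with the token printed by kubeadm init
$ sudo kubeadm join <control-plane-ip>:6443 --token <token> \
    --discovery-token-ca-cert-hash sha256:<hash>

You then still have to choose and install a CNI plugin, storage drivers, monitoring, and so on, which is precisely what distributions package for you.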

Another option exists for deploying Kubernetes on any type of infrastructure, without vendor lock-in and without losing portability (or only minimally): Kubernetes distributions which, much like Linux distributions, bundle the essential tools for operating a cluster.

They can also integrate related features often used alongside a Kubernetes cluster (for example, a Prometheus/Grafana monitoring stack, a GitOps toolchain, or a CI/CD pipeline).

Proven Kubernetes distributions include: Rancher Kubernetes Engine, Talos/Omni, Mirantis Kubernetes Engine, D2iQ DKP, and Rancher K3s for embedded systems and edge computing.

More proprietary distributions, or ones more tightly coupled to the underlying infrastructure, also exist, for instance VMware Tanzu or Red Hat OpenShift.

Managed Kubernetes Services

Alternatively, you can leverage managed Kubernetes services from major Cloud providers.

Getting started is simpler and faster, as the complexity of Kubernetes is largely hidden, and scaling resources is generally easier thanks to the integration between the managed K8s service and the provider’s other services.

However, these benefits do not come without trade-offs: not all needs are natively covered, you lose some of the independence promised by Kubernetes, deployments run on a public cloud and therefore on shared infrastructure, etc.

The main Cloud service providers offer a managed Kubernetes service. Examples include:

  • Amazon Elastic Kubernetes Service (Amazon EKS)
  • Azure Kubernetes Service (Azure AKS)
  • Google Kubernetes Engine (Google GKE)
  • OVHcloud Managed Kubernetes Service
  • Scaleway Kapsule

To conclude this part on Kubernetes distributions and managed services: the choice of a Kubernetes implementation and solution can be complex. It depends on your broader infrastructure/cloud strategy and on specific criteria related to your platform, your containerized applications, and ultimately your business.

Conclusion

Kubernetes brings tangible benefits, both technical and operational, to the management of containerized applications when they are distributed and/or large-scale: load-balancing, scaling, rolling updates during production deployments, etc.

However, Kubernetes is a vast and complex tool with a steep learning curve. To help you approach containers and Kubernetes with serenity, at Enix we offer a comprehensive and well-recognized Kubernetes training (notably thanks to our well-known trainer Jérôme Petazzoni!) on these fantastic technologies. You will discover the concepts covered in this article and many more advanced ones, such as RBAC, health checks, or network policies.

And if you need help deploying and operating your Kubernetes clusters, you can check out our unique approach to Cloud Native Managed Services or contact us to discuss it together!


Do not miss our latest DevOps and Cloud Native blog posts! Follow Enix on Twitter!