Kubernetes Explained: Architecture, Components, and Why It Beats Docker Swarm

The Problem Docker Swarm Could Not Solve

If you have been learning containerization, you probably started with Docker on a single machine. Then you worried: what if this machine goes down? So you added another machine, connected the two, and formed a cluster. That cluster was managed by Docker Swarm, the orchestration tool built into Docker. You do not install it separately. You just run docker swarm init on the manager node and join your worker nodes to it.

Docker Swarm works fine for small applications. It supports self-healing through replicas. Delete a container and Docker Swarm automatically spins up a replacement. But there is a ceiling to what Docker Swarm can handle.

The two big limitations of Docker Swarm:

No built-in auto-scaling
Only basic load balancing (the routing mesh spreads traffic across replicas, but anything more advanced you configure manually)

For small applications with a limited number of users, Docker Swarm is perfectly fine. But when your application grows, when you need real auto-scaling and production-grade load balancing, you need Kubernetes.
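To make "auto-scaling" concrete: the Kubernetes Horizontal Pod Autoscaler documents its core rule as desired replicas = ceil(current replicas × current metric / target metric). The sketch below is only that arithmetic in Python. The function name and the sample numbers are made up for illustration, and the real autoscaler layers tolerances and stabilization windows on top of this rule.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float, target_metric: float) -> int:
    """Core scaling rule documented for the Horizontal Pod Autoscaler."""
    return math.ceil(current_replicas * current_metric / target_metric)


# 4 pods averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60))  # 6
# 4 pods averaging 20% CPU against a 60% target: scale in.
print(desired_replicas(4, 20, 60))  # 2
```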

This is the core reason teams move from Docker Swarm to Kubernetes. If someone asks in an interview which container orchestration tools you know, do not just say Kubernetes. Say Docker Swarm and Kubernetes. Then explain the difference: Docker Swarm is simpler, Docker-native, better for small applications, but lacks auto-scaling and a proper load balancer. Kubernetes supports all of these and is production-ready for large-scale applications.

What Is Kubernetes?

Kubernetes is a powerful, open-source platform for automating the deployment, scaling, and operations of applications running inside containers. It is a container orchestration tool.

Open-source means it is free. You will also hear people call it K8s. Here is why: Kubernetes starts with K and ends with s. Count the letters in between: u-b-e-r-n-e-t-e. That is 8 letters. So K8s.
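If you like to see the counting rule as code, here is a tiny sketch. The function name numeronym is just a label chosen for this example.

```python
def numeronym(word: str) -> str:
    """First letter + number of letters in between + last letter."""
    if len(word) <= 2:
        return word  # nothing in between to count
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("Kubernetes"))  # K8s
```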

The logo of Kubernetes is a ship's steering wheel. A ship carries containers. The steering wheel controls where those containers go. Kubernetes is the steering wheel for your containerized applications.

Here is another analogy. Think of a conductor standing in front of an orchestra. The conductor does not play an instrument. The conductor directs all the musicians with hand movements. Kubernetes is that conductor. The containers are the musicians. That is why we call Kubernetes a container orchestrator.

One critical thing to understand early: Kubernetes does not understand containers directly. Kubernetes only understands Pods. A pod is a wrapper that sits on top of the container. Inside the pod are containers. Inside the containers are your applications. Kubernetes manages pods, not containers directly.

A Brief History of Kubernetes

Before Kubernetes existed, Google had an internal cluster management tool called Borg. Borg was used only inside Google to manage their massive infrastructure. Google then built a more flexible successor called Omega. Both were internal tools never released to the public.

In mid-2014, Google announced Kubernetes as an open-source project that built on the lessons of Borg and Omega. It was initially developed by Google engineers Joe Beda, Brendan Burns, and Craig McLuckie. In 2015, Google donated the project to the newly formed Cloud Native Computing Foundation (CNCF), which maintains it today.

The Cluster: The Foundation of Everything

Whenever someone says Kubernetes, the first word that should come to your mind is cluster. You cannot manage containerized workloads with Kubernetes without a cluster.

A cluster is a group of server machines working together. Inside a Kubernetes cluster, there are two types of nodes:

Master Node, also called the Control Plane
Worker Nodes

All commands you run go to the master node. The master node receives those commands and decides what to do. But the master node does not run your application workloads. All the pods, which contain your containers and applications, live on the worker nodes.

Think of the master node as a manager in a company. A manager does not do the hands-on work. The manager receives requests and distributes tasks to the team. The worker nodes are the team doing the actual work. By default, the master node runs none of your application pods.

When you run a command to create a pod, you run it on the master. The master receives it and tells the appropriate worker node to create the pod. Which worker node gets the pod is decided entirely by the master.

Master Node Components (Control Plane)

The master node has four core components plus one optional component. These are heavily covered in interviews.

kube-apiserver

This is the entry point for all commands sent to the cluster. When you run any kubectl command from your terminal, that command goes to the kube-apiserver first.

Think of it as the receptionist at a hospital. The receptionist receives everyone who walks in, understands what they need, and redirects the request to the right person. The kube-apiserver receives all REST commands from the client, communicates with the user, and passes the work to the appropriate component.

kube-scheduler

Once the kube-apiserver records a request to create a pod, the kube-scheduler picks it up. The scheduler watches the API server for pods that have no node assigned yet and decides which worker node each one should land on.

The scheduler checks which nodes are available and whether each has enough free CPU and memory for the pod, then assigns the new pod accordingly. This is completely automatic. Unless you add placement constraints yourself, you do not pick the worker node. The scheduler does.
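Here is a toy version of that decision, written in Python. It is not the real scheduler. It assumes a pod asks for some CPU and memory, filters out nodes that cannot fit it, and then picks the node with the most CPU left over. The Node class, the pick_node function, and the scoring rule are all hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    name: str
    free_cpu_millicores: int
    free_memory_mib: int


def pick_node(nodes: List[Node], cpu_request: int, memory_request: int) -> Optional[Node]:
    # Filter step: drop nodes that cannot fit the pod's requests.
    feasible = [
        n for n in nodes
        if n.free_cpu_millicores >= cpu_request and n.free_memory_mib >= memory_request
    ]
    if not feasible:
        return None  # no node fits, so the pod would have to wait
    # Score step (made-up rule): prefer the node with the most CPU left over.
    return max(feasible, key=lambda n: n.free_cpu_millicores - cpu_request)


nodes = [
    Node("worker-1", free_cpu_millicores=500, free_memory_mib=1024),
    Node("worker-2", free_cpu_millicores=1500, free_memory_mib=2048),
    Node("worker-3", free_cpu_millicores=2000, free_memory_mib=256),
]

chosen = pick_node(nodes, cpu_request=250, memory_request=512)
print(chosen.name)  # worker-2 (worker-3 has more CPU but too little memory)
```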

etcd

etcd is a key-value store used by Kubernetes to store all cluster data. Think of it as the database of the cluster. Everything about the cluster state lives here: which pods are running, on which nodes, what configurations are applied.

Do not just call it a database in an interview. Say: "etcd is a component in Kubernetes that stores all cluster data in a key-value format."
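To picture what "key-value format" means, here is a small Python sketch that uses a plain dictionary as a stand-in for etcd. The key layout is modeled on the /registry/<resource>/<namespace>/<name> paths Kubernetes uses, but the values are simplified JSON invented for this example, and put and get_prefix are hypothetical helpers, not the etcd client API.

```python
import json

# A plain dict standing in for etcd (illustration only).
store = {}


def put(key: str, value: dict) -> None:
    store[key] = json.dumps(value)


def get_prefix(prefix: str) -> dict:
    """Return every entry whose key starts with the given prefix."""
    return {k: json.loads(v) for k, v in store.items() if k.startswith(prefix)}


put("/registry/pods/default/web-1", {"node": "worker-1", "phase": "Running"})
put("/registry/pods/default/web-2", {"node": "worker-2", "phase": "Running"})
put("/registry/namespaces/dev", {"phase": "Active"})

# "Which pods exist, and on which nodes?" becomes a prefix lookup.
for key, value in sorted(get_prefix("/registry/pods/").items()):
    print(key, value["node"])
```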

kube-controller-manager

The controller manager is responsible for maintaining the state of the cluster. State here means: are all pods running as expected? Are replicas at the correct count? If a pod goes down, the controller manager detects that the actual state does not match the desired state and takes action to fix it.

Everything related to cluster health, replication counts, and ensuring things stay as configured is managed by the kube-controller-manager.
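The idea of comparing desired state with actual state can be sketched in a few lines of Python. This is a simplified model, not the controller manager's code. The reconcile function and the pod names are hypothetical.

```python
from typing import List, Optional, Tuple


def reconcile(desired_replicas: int, running_pods: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Return the actions needed to make the actual state match the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few pods: ask for replacements.
        return [("create", None)] * diff
    if diff < 0:
        # Too many pods: remove the surplus.
        return [("delete", name) for name in running_pods[:-diff]]
    return []  # actual state already matches desired state


# One pod has gone down: 3 wanted, 2 running.
print(reconcile(3, ["web-1", "web-2"]))
# Nothing to do.
print(reconcile(3, ["web-1", "web-2", "web-3"]))
```

A real controller runs this kind of comparison continuously, which is why a deleted pod comes back without you doing anything.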

cloud-controller-manager (Optional)

This component is only needed when running Kubernetes on a cloud provider like AWS. It handles interactions with cloud infrastructure such as provisioning load balancers, managing storage volumes, and handling cloud-specific networking.

Worker Node Components

Worker nodes are where your actual application workloads run. They have their own set of components.

Kubelet

Kubelet is an agent that runs on every worker node. It is the component that communicates with the master node and ensures that the pods assigned to that worker node are actually running.

When the scheduler assigns a pod to a worker node, the kubelet on that node creates and maintains that pod. If a container inside the pod crashes, kubelet ensures it gets restarted.

Container Runtime

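A pod contains containers. To actually run those containers, the worker node needs a container runtime. The container runtime is the tool responsible for running the containers inside the pod. containerd and CRI-O are common examples. Docker Engine was the default for years, but since Kubernetes 1.24 it needs an adapter (cri-dockerd) to be used as the runtime.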

Note the distinction clearly: kubelet ensures pods are running. Container runtime ensures containers inside those pods are running. Both are different components with separate responsibilities.

kube-proxy

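Every pod needs a network identity. kube-proxy handles Service networking on each worker node. It does not assign pod IP addresses (that is the job of the cluster's network plugin). Instead, it maintains the network rules that route traffic sent to a Service to the right pods, on whichever node they run.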

The Command: kubectl

To run any command against a Kubernetes cluster, you use a tool called kubectl. This is your command-line interface to the cluster. Some people pronounce it "kube-control", others say "kube-cuttle" or spell out "kube-C-T-L." All are acceptable.

kubectl get pods
kubectl create -f manifest.yaml
kubectl delete pod my-pod

When you type a kubectl command, it goes to the kube-apiserver. The apiserver processes it and the rest of the components take over from there.
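Under the hood, each of those commands becomes an HTTP request to the kube-apiserver. The Python sketch below maps the three pod commands above to the method and path they would use, assuming the default namespace and assuming manifest.yaml describes a Pod. The pod_request function is a hypothetical helper for illustration. It builds strings only and does not talk to a cluster.

```python
from typing import Optional, Tuple


def pod_request(verb: str, namespace: str = "default", name: Optional[str] = None) -> Tuple[str, str]:
    """Map a kubectl-style pod verb to the HTTP method and API path behind it."""
    base = f"/api/v1/namespaces/{namespace}/pods"
    if verb == "get":
        return ("GET", base if name is None else f"{base}/{name}")
    if verb == "create":
        return ("POST", base)
    if verb == "delete":
        if name is None:
            raise ValueError("delete needs a pod name")
        return ("DELETE", f"{base}/{name}")
    raise ValueError(f"unsupported verb: {verb}")


print(pod_request("get"))                    # kubectl get pods
print(pod_request("create"))                 # kubectl create -f manifest.yaml
print(pod_request("delete", name="my-pod"))  # kubectl delete pod my-pod
```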

Kubernetes Features Beyond Orchestration

Services: Exposing Your Application

You deploy your application inside a pod. That pod is running inside a worker node deep inside the cluster. How does someone on the internet access it? You create a Service.

Services expose your application. The three types you will use most often are:

ClusterIP gives your service an internal IP address within the cluster only. Your application is not accessible from outside. This is useful for internal components like databases that should never be exposed to the internet.

NodePort exposes your application on a fixed port on every node's IP address. You access it as NodeIP:PortNumber from outside the cluster. It works but is not ideal for production because you are handing users raw IP addresses and port numbers.

LoadBalancer is the production-ready option. It provisions a cloud load balancer automatically and gives your application a clean public endpoint. This is what you use in production.
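In a manifest, the difference between these types mostly comes down to the spec.type field. The Python sketch below builds a Service manifest as a dictionary and prints it as JSON, which kubectl accepts alongside YAML. The service_manifest function, the names web and app, and the port numbers are assumptions for this example. 30080 is picked from the default NodePort range of 30000-32767.

```python
import json
from typing import Optional


def service_manifest(name: str, app_label: str, service_type: str = "ClusterIP",
                     port: int = 80, target_port: int = 8080,
                     node_port: Optional[int] = None) -> dict:
    """Build a minimal Service manifest that selects pods labeled app=<app_label>."""
    port_spec = {"port": port, "targetPort": target_port}
    if node_port is not None:
        port_spec["nodePort"] = node_port  # only meaningful for NodePort/LoadBalancer
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": {"app": app_label},
            "ports": [port_spec],
        },
    }


internal = service_manifest("web", "web")  # ClusterIP by default
external = service_manifest("web", "web", "NodePort", node_port=30080)
print(json.dumps(external, indent=2))
```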

Volumes

Pods are ephemeral, meaning temporary. If a pod goes down and gets recreated, the data inside it is gone. To persist data beyond the life of a pod, you use Volumes. This is the same concept as Docker volumes applied at the Kubernetes level. You can attach cloud storage like AWS EBS volumes to your pods.
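One common way to attach persistent storage is through a PersistentVolumeClaim that the pod mounts. The Python sketch below builds such a Pod manifest as a dictionary. The pod_with_volume function, the pod name db, the claim name db-data, and the image are all placeholders chosen for this example, and the claim is assumed to exist already.

```python
import json


def pod_with_volume(pod_name: str, image: str, claim_name: str, mount_path: str) -> dict:
    """Build a Pod manifest that mounts an existing PersistentVolumeClaim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {
            "containers": [{
                "name": "app",
                "image": image,
                # The mount refers to the volume below by name.
                "volumeMounts": [{"name": "data", "mountPath": mount_path}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": claim_name},
            }],
        },
    }


manifest = pod_with_volume("db", "postgres:16", "db-data", "/var/lib/postgresql/data")
print(json.dumps(manifest, indent=2))
```

If the pod is recreated, the new pod mounts the same claim, so the data written under the mount path is still there.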

Namespaces

Imagine a company where developers, testers, and a support team all share the same Kubernetes cluster. The developers do not want their pods mixed with the testers' pods. One solution would be to create separate clusters for each team, but that means spinning up multiple master nodes and multiple sets of worker nodes. That is expensive and wasteful.

Instead, you create Namespaces inside the same cluster. A namespace is a logical boundary within the cluster. Each team gets its own namespace and only sees what is inside it. Access is controlled using RBAC (Role-Based Access Control), the same concept used in tools like Jenkins.
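Here is a small Python model of that idea. It is a sketch, not how Kubernetes implements it. Pods are keyed by namespace and name, so two teams can both have a pod called api, and a made-up access table stands in for RBAC. The users alice and bob and every name in the tables are hypothetical.

```python
from typing import List

# Pods keyed by (namespace, name): the same name can exist in two namespaces.
pods = {
    ("dev", "api"): "Running",
    ("dev", "worker"): "Running",
    ("test", "api"): "Pending",
}

# Hypothetical access table standing in for RBAC role bindings.
allowed_namespaces = {"alice": {"dev"}, "bob": {"test"}}


def list_pods(user: str, namespace: str) -> List[str]:
    if namespace not in allowed_namespaces.get(user, set()):
        raise PermissionError(f"{user} cannot list pods in {namespace}")
    return sorted(name for (ns, name) in pods if ns == namespace)


print(list_pods("alice", "dev"))  # ['api', 'worker']
print(list_pods("bob", "test"))   # ['api']
```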

Ways to Create a Kubernetes Cluster

There are two broad approaches: self-managed and cloud-based.

Self-Managed Clusters

Minikube is the tool to start with. Minikube creates a single-node cluster on your local machine. Everything, the master components and the worker, runs on one machine. This is not for production but is perfect for learning kubectl commands and getting comfortable with Kubernetes without spending on cloud resources.

Kops (Kubernetes Operations) is used when you are ready to build a real multi-node cluster. With Kops, you provision the master node, provision the worker nodes, and manage the cluster yourself. This is how you learn Kubernetes deeply before relying on managed services.

kubeadm is another tool for bootstrapping a Kubernetes cluster on existing machines.

Cloud-Based (Managed) Clusters

EKS (Elastic Kubernetes Service) is AWS's fully managed Kubernetes service. AWS manages the entire control plane. You create the cluster through the AWS console or CLI, connect with kubectl, and start deploying. Azure has AKS, and Google Cloud has GKE.

The reason to learn self-managed first is simple: when AWS manages the control plane, you never see what is happening underneath. Learn it manually, then appreciate what EKS abstracts away.

Docker Swarm vs Kubernetes: The Structural Difference

Here is the architecture story, side by side:

Docker Swarm: Cluster contains Nodes, Nodes contain Containers, Containers contain Applications.

Kubernetes: Cluster contains Nodes, Nodes contain Pods, Pods contain Containers, Containers contain Applications.

The only structural difference is the addition of Pods as a layer between nodes and containers. That extra layer gives the containers in a pod a shared network identity and shared storage, so Kubernetes can schedule, replace, and route traffic to them as a single unit.
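The Kubernetes hierarchy can be written down as nested data. The Python classes below are a hypothetical model for illustration only, not Kubernetes API types. The node, pod, and container names are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str
    image: str


@dataclass
class Pod:
    name: str
    containers: List[Container] = field(default_factory=list)


@dataclass
class Node:
    name: str
    pods: List[Pod] = field(default_factory=list)


@dataclass
class Cluster:
    nodes: List[Node] = field(default_factory=list)


cluster = Cluster(nodes=[
    Node("worker-1", pods=[Pod("web", containers=[Container("app", "nginx")])]),
])

# Walk the layers: Cluster > Node > Pod > Container.
for node in cluster.nodes:
    for pod in node.pods:
        for container in pod.containers:
            print(f"{node.name} > {pod.name} > {container.name} ({container.image})")
```

Remove the Pod class and hang containers directly off Node, and you have the Docker Swarm picture.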

Summary

Kubernetes is a free, open-source container orchestration platform that goes beyond what Docker Swarm offers: auto-scaling, production-grade load balancing, namespacing, and deployment at scale, on top of the self-healing that both tools provide.

Here is everything in one place:

A cluster has a master node (control plane) and multiple worker nodes
Master node components: kube-apiserver, kube-scheduler, etcd, kube-controller-manager, cloud-controller-manager (optional)
Worker node components: kubelet (agent), container runtime, kube-proxy
Pods are the smallest unit Kubernetes manages. One pod, one container is the standard setup
kubectl is the command-line tool you use to communicate with the cluster via the kube-apiserver
Services expose your application: ClusterIP for internal, NodePort for port-based access, LoadBalancer for production
Volumes persist data beyond the life of a pod
Namespaces provide logical isolation inside a single cluster for multiple teams
Start with Minikube for learning, move to Kops for self-managed multi-node clusters, then EKS for cloud-managed
