Kubernetes Explained: Architecture, Components, and Why It Beats Docker Swarm

Passionate about Cloud and DevOps engineering. I write structured technical notes and beginner-friendly articles on AWS, Linux, CI/CD, networking, system architecture and modern software delivery workflows.

The Problem Docker Swarm Could Not Solve

If you have been learning containerization, you probably started with Docker on a single machine. Then you worried: what if this machine goes down? So you added another machine, connected the two, and formed a cluster. That cluster was managed by Docker Swarm, the orchestration tool built into Docker. You do not install it separately. You just run docker swarm init on the manager node and join your worker nodes to it.

Docker Swarm works fine for small applications. It supports self-healing through replicas. Delete a container and Docker Swarm automatically spins up a replacement. But there is a ceiling to what Docker Swarm can handle.

The two big limitations of Docker Swarm:

  • No auto-scaling

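  • No cloud load balancer integration (the built-in routing mesh spreads traffic across replicas, but any external load balancer in front of it you configure manually)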

For small applications with a limited number of users, Docker Swarm is perfectly fine. But when your application grows, when you need real auto-scaling and production-grade load balancing, you need Kubernetes.

This is the core reason teams move from Docker Swarm to Kubernetes. If someone asks in an interview which container orchestration tools you know, do not just say Kubernetes. Say Docker Swarm and Kubernetes. Then explain the difference: Docker Swarm is simpler, Docker-native, and better for small applications, but it lacks built-in auto-scaling and integration with cloud load balancers. Kubernetes supports both and is production-ready for large-scale applications.


What Is Kubernetes?

Kubernetes is a powerful, open-source platform for automating the deployment, scaling, and operations of applications running inside containers. It is a container orchestration tool.

Open-source means it is free. You will also hear people call it K8s. Here is why: Kubernetes starts with K and ends with s. Count the letters in between: u-b-e-r-n-e-t-e. That is 8 letters. So K8s.
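The counting rule above is easy to check in code. This is a small sketch, and the function name numeronym is made up for this example:

```python
def numeronym(word):
    """Abbreviate a word as first letter + number of middle letters + last letter."""
    if len(word) < 3:
        return word  # nothing in the middle to count
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("Kubernetes"))  # K8s
```

The same rule gives i18n for "internationalization", which is where the habit comes from.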

The logo of Kubernetes is a ship's steering wheel. A ship carries containers. The steering wheel controls where those containers go. Kubernetes is the steering wheel for your containerized applications.

Here is another analogy. Think of a music conductor standing in front of an orchestra. The conductor does not play an instrument. The conductor controls all the musicians by directing them with hand movements. Kubernetes is that conductor. The containers are the musicians. That is why we call Kubernetes a container orchestrator.

One critical thing to understand early: Kubernetes does not manage containers directly. The smallest unit it works with is the Pod. A pod is a wrapper around one or more containers. Inside the pod are containers. Inside the containers are your applications. Kubernetes manages pods, not individual containers.
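To make the pod-wraps-container idea concrete, here is a minimal sketch that builds a Pod manifest as a Python dictionary and prints it as JSON, which kubectl accepts alongside YAML. The helper name pod_manifest, the pod name web, and the nginx image are assumptions chosen for illustration:

```python
import json


def pod_manifest(name, image, port):
    """Build a minimal Pod manifest: one pod wrapping one container."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            # The container is declared inside the pod, never on its own.
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "ports": [{"containerPort": port}],
                }
            ]
        },
    }


manifest = pod_manifest("web", "nginx", 80)
print(json.dumps(manifest, indent=2))
```

Notice that there is no way to describe the container without the pod around it. That is the wrapper relationship in practice.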


A Brief History of Kubernetes

Before Kubernetes existed, Google had an internal cluster management tool called Borg. Borg was used only inside Google to manage their massive infrastructure. Google then built a more flexible successor called Omega. Both were internal tools never released to the public.

In mid-2014, Google officially announced Kubernetes, an open-source system built on the lessons of Borg and Omega. It was initially developed by Google engineers Joe Beda, Brendan Burns, and Craig McLuckie. Google later donated the project to the Cloud Native Computing Foundation (CNCF), which maintains it today.


The Cluster: The Foundation of Everything

Whenever someone says Kubernetes, the first word that should come to your mind is cluster. You cannot manage containerized workloads with Kubernetes without a cluster.

A cluster is a group of server machines working together. Inside a Kubernetes cluster, there are two types of nodes:

  • Master Node, also called the Control Plane

  • Worker Nodes

All commands you run go to the master node. The master node receives those commands and decides what to do. But the master node does not run your application workloads. All the pods, which contain your containers and applications, live on the worker nodes.

Think of the master node as a manager in a company. A manager does not do the hands-on work. The manager receives requests and distributes tasks to the team. The worker nodes are the team doing the actual work. By default, no application pods are scheduled on the master node.

When you run a command to create a pod, the command is sent to the master. The master receives it and tells the appropriate worker node to create the pod. Which worker node gets the pod is decided by the master.


Master Node Components (Control Plane)

The master node has four core components plus one optional component. These are heavily covered in interviews.

kube-apiserver

This is the entry point for all commands sent to the cluster. When you run any kubectl command from your terminal, that command goes to the kube-apiserver first.

Think of it as the receptionist at a hospital. The receptionist receives everyone who walks in, understands what they need, and redirects the request to the right person. The kube-apiserver receives all REST commands from the client, communicates with the user, and passes the work to the appropriate component.

kube-scheduler

Once the kube-apiserver receives a command to create a pod, it records the new pod in the cluster state. The kube-scheduler watches the kube-apiserver for pods that have no node assigned yet and decides which worker node each one should land on.

The scheduler checks which nodes are available and how much CPU and memory each node has free, along with any placement rules on the pod, and assigns the new pod accordingly. This is automatic. Unless you add placement rules yourself, you do not pick the worker node. The scheduler does.
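A rough way to picture the scheduler's decision is "filter, then score": drop the nodes that cannot fit the pod, then pick the best of the rest. The sketch below is a heavy simplification under that assumption. The Node class, the pick_node function, and the scoring rule (most free capacity wins) are invented for this example and are not the real scheduler's logic:

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu_millis: int  # free CPU in millicores
    free_mem_mib: int     # free memory in MiB


def pick_node(nodes, cpu_millis, mem_mib):
    """Return the name of the chosen node, or None if no node fits the pod."""
    # Filter: keep only nodes with enough free CPU and memory.
    feasible = [
        n for n in nodes
        if n.free_cpu_millis >= cpu_millis and n.free_mem_mib >= mem_mib
    ]
    if not feasible:
        return None  # the pod would stay unscheduled
    # Score: prefer the node left with the most spare capacity afterwards.
    best = max(
        feasible,
        key=lambda n: (n.free_cpu_millis - cpu_millis, n.free_mem_mib - mem_mib),
    )
    return best.name


nodes = [
    Node("worker-1", 500, 1024),
    Node("worker-2", 2000, 4096),
    Node("worker-3", 1500, 256),
]
print(pick_node(nodes, cpu_millis=250, mem_mib=512))  # worker-2
```

Here worker-3 is filtered out because it does not have enough memory, and worker-2 beats worker-1 because it has more room left over.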

etcd

etcd is a key-value store used by Kubernetes to store all cluster data. Think of it as the database of the cluster. Everything about the cluster state lives here: which pods are running, on which nodes, what configurations are applied.

Do not just call it a database in an interview. Say: "etcd is a component in Kubernetes that stores all cluster data in a key-value format."
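If "key-value format" feels abstract, this toy store shows the shape of the idea. It is not etcd's API. The KeyValueStore class is made up for illustration, the key layout only resembles the /registry/... paths Kubernetes uses, and the real values are full serialized objects rather than small dictionaries:

```python
class KeyValueStore:
    """A toy in-memory stand-in for a key-value store such as etcd."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def list_prefix(self, prefix):
        """Return all keys that start with the given prefix, sorted."""
        return sorted(k for k in self._data if k.startswith(prefix))


store = KeyValueStore()
store.put("/registry/pods/default/web", {"node": "worker-1", "phase": "Running"})
store.put("/registry/pods/default/db", {"node": "worker-2", "phase": "Running"})
store.put("/registry/namespaces/default", {"phase": "Active"})

print(store.list_prefix("/registry/pods/"))
print(store.get("/registry/pods/default/web"))
```

Listing by prefix is what makes a flat key-value store usable as a cluster database: "all pods" is simply every key under the pods prefix.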

kube-controller-manager

The controller manager is responsible for maintaining the state of the cluster. State here means: are all pods running as expected? Are replicas at the correct count? If a pod goes down, the controller manager detects that the actual state does not match the desired state and takes action to fix it.

Everything related to cluster health, replication counts, and ensuring things stay as configured is managed by the kube-controller-manager.
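The "desired state versus actual state" idea can be sketched as a single reconciliation pass. This is a simplified illustration, not controller-manager code. The reconcile function and the web-N pod names are assumptions made for the example:

```python
def reconcile(desired, running, prefix="web"):
    """One reconciliation pass: return the pod names that should exist afterwards.

    desired: how many replicas the user asked for
    running: names of the pods that are actually running right now
    """
    # Too many pods: keep only as many as desired.
    pods = sorted(running)[:desired]
    # Too few pods: create replacements using the lowest unused index.
    index = 0
    while len(pods) < desired:
        candidate = f"{prefix}-{index}"
        if candidate not in pods:
            pods.append(candidate)
        index += 1
    return pods


# web-1 has crashed: two pods are running but three are desired.
print(sorted(reconcile(3, ["web-0", "web-2"])))  # ['web-0', 'web-1', 'web-2']
```

A real controller runs this kind of comparison continuously, which is why a deleted pod comes back without anyone typing a command.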

cloud-controller-manager (Optional)

This component is only needed when running Kubernetes on a cloud provider like AWS. It handles interactions with cloud infrastructure such as provisioning load balancers, managing storage volumes, and handling cloud-specific networking.


Worker Node Components

Worker nodes are where your actual application workloads run. They have their own set of components.

Kubelet

Kubelet is an agent that runs on every worker node. It is the component that communicates with the master node and ensures that the pods assigned to that worker node are actually running.

When the scheduler assigns a pod to a worker node, the kubelet on that node creates and maintains that pod. If a container inside the pod crashes, kubelet ensures it gets restarted.

Container Runtime

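A pod contains containers. To actually run those containers, the worker node needs a container runtime. The container runtime is the tool responsible for running the containers inside the pod. containerd and CRI-O are the common choices today. Docker Engine was the usual runtime in older clusters, but since Kubernetes 1.24 removed dockershim it needs the separate cri-dockerd adapter.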

Note the distinction clearly: kubelet ensures pods are running. Container runtime ensures containers inside those pods are running. Both are different components with separate responsibilities.

kube-proxy

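Every pod needs a network identity. kube-proxy handles Service networking on the worker node. It does not assign pod IP addresses; that is the job of the cluster's network plugin (CNI). kube-proxy maintains the network rules on each node that route traffic sent to a Service to the pods behind it, wherever in the cluster those pods run.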


The Command: kubectl

To run any command against a Kubernetes cluster, you use a tool called kubectl. This is your command-line interface to the cluster. Some people pronounce it "kube-control", others say "kubectl." Both are acceptable.

kubectl get pods
kubectl create -f manifest.yaml
kubectl delete pod my-pod

When you type a kubectl command, it goes to the kube-apiserver. The apiserver processes it and the rest of the components take over from there.
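Under the hood, each kubectl command becomes an HTTP request to the kube-apiserver. The sketch below maps the three pod commands above to the REST method and path they roughly correspond to. The pod_request helper is hypothetical, and it ignores authentication, request bodies, and everything else kubectl handles for you:

```python
def pod_request(action, namespace="default", name=None):
    """Return the (HTTP method, API path) for a basic pod operation."""
    base = f"/api/v1/namespaces/{namespace}/pods"
    if action == "get":
        # Without a name this lists pods; with a name it fetches one pod.
        return ("GET", base if name is None else f"{base}/{name}")
    if action == "create":
        return ("POST", base)
    if action == "delete":
        if name is None:
            raise ValueError("delete needs a pod name")
        return ("DELETE", f"{base}/{name}")
    raise ValueError(f"unsupported action: {action}")


print(pod_request("get"))
print(pod_request("delete", name="my-pod"))
```

This is why the kube-apiserver is described as receiving REST commands: kubectl is a convenient client for an HTTP API.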


Kubernetes Features Beyond Orchestration

Services: Exposing Your Application

You deploy your application inside a pod. That pod is running inside a worker node deep inside the cluster. How does someone on the internet access it? You create a Service.

Services expose your application and come in three main types:

ClusterIP gives your service an internal IP address within the cluster only. Your application is not accessible from outside. This is useful for internal components like databases that should never be exposed to the internet.

NodePort exposes your application on a fixed port (by default in the 30000-32767 range) on every node's IP address. You access it as NodeIP:PortNumber from outside the cluster. It works but is not ideal for production because you are handing users raw IP addresses and port numbers.

LoadBalancer is the production-ready option. It provisions a cloud load balancer automatically and gives your application a clean public endpoint. This is what you use in production.
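The three types differ by a single field in the Service definition. Here is a minimal sketch that builds a Service manifest as a Python dictionary and prints it as JSON. The service_manifest helper, the name web, and the app label used in the selector are assumptions for illustration. The nodePort check reflects the default 30000-32767 range:

```python
import json

SERVICE_TYPES = ("ClusterIP", "NodePort", "LoadBalancer")


def service_manifest(name, service_type, port, target_port, node_port=None):
    """Build a minimal Service manifest that selects pods labeled app=<name>."""
    if service_type not in SERVICE_TYPES:
        raise ValueError(f"unknown service type: {service_type}")
    port_spec = {"port": port, "targetPort": target_port}
    if node_port is not None:
        if service_type == "ClusterIP":
            raise ValueError("ClusterIP services are internal and take no nodePort")
        if not 30000 <= node_port <= 32767:
            raise ValueError("nodePort is outside the default 30000-32767 range")
        port_spec["nodePort"] = node_port
    return {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name},
        "spec": {
            "type": service_type,
            "selector": {"app": name},
            "ports": [port_spec],
        },
    }


print(json.dumps(service_manifest("web", "NodePort", 80, 80, node_port=30080), indent=2))
```

Switching the same Service from NodePort to LoadBalancer is a one-word change in the type field. What happens behind that word depends on your cloud provider.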

Volumes

Pods are ephemeral, meaning temporary. If a pod goes down and gets recreated, the data written inside its containers is gone. To persist data beyond the life of a pod, you use persistent Volumes. This is the same concept as Docker volumes applied at the Kubernetes level. You can attach cloud storage like AWS EBS volumes to your pods.
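In a manifest, a volume shows up in two places: the pod declares it, and the container mounts it. The sketch below builds such a Pod as a Python dictionary. The helper names, the volume name data, the claim name data-claim, and the mount path are all made up for this example, and the claim is assumed to exist already:

```python
import json


def pod_with_volume(name, image, claim_name, mount_path):
    """Build a Pod manifest whose container mounts a persistent volume claim."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Where the storage appears inside the container.
                    "volumeMounts": [{"name": "data", "mountPath": mount_path}],
                }
            ],
            # The storage itself, declared at pod level and referenced by name.
            "volumes": [
                {"name": "data", "persistentVolumeClaim": {"claimName": claim_name}}
            ],
        },
    }


def mounts_are_declared(manifest):
    """Check that every volumeMount refers to a volume declared on the pod."""
    declared = {v["name"] for v in manifest["spec"].get("volumes", [])}
    return all(
        mount["name"] in declared
        for container in manifest["spec"]["containers"]
        for mount in container.get("volumeMounts", [])
    )


pod = pod_with_volume("db", "postgres", "data-claim", "/var/lib/data")
print(json.dumps(pod, indent=2))
print(mounts_are_declared(pod))  # True
```

Because the claim lives outside the pod, a recreated pod that references the same claim sees the same data.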

Namespaces

Imagine a company where developers, testers, and a support team all share the same Kubernetes cluster. The developers do not want their pods mixed with the testers' pods. One solution would be to create separate clusters for each team, but that means spinning up multiple master nodes and multiple sets of worker nodes. That is expensive and wasteful.

Instead, you create Namespaces inside the same cluster. A namespace is a logical boundary within the cluster. Each team gets its own namespace and only sees what is inside it. Access is controlled using RBAC (Role-Based Access Control), the same concept used in tools like Jenkins.
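A toy model helps show what "logical boundary" means. In the sketch below, every pod is stored under a (namespace, name) pair, so two teams can reuse the same pod name without colliding, and listing one namespace never shows another team's pods. The Cluster class and its methods are hypothetical and leave out RBAC entirely:

```python
class Cluster:
    """A toy cluster that scopes pod names by namespace."""

    def __init__(self):
        self._pods = {}

    def create_pod(self, namespace, name):
        key = (namespace, name)
        if key in self._pods:
            raise ValueError(f"pod {name!r} already exists in namespace {namespace!r}")
        self._pods[key] = {"namespace": namespace, "name": name}

    def list_pods(self, namespace):
        """Return only the pod names that live in the given namespace."""
        return sorted(name for (ns, name) in self._pods if ns == namespace)


cluster = Cluster()
cluster.create_pod("dev", "web")
cluster.create_pod("test", "web")   # same name, different namespace: allowed
cluster.create_pod("dev", "api")

print(cluster.list_pods("dev"))   # ['api', 'web']
print(cluster.list_pods("test"))  # ['web']
```

One set of master and worker nodes serves both teams here, which is the cost saving over separate clusters.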


Ways to Create a Kubernetes Cluster

There are two broad approaches: self-managed and cloud-based.

Self-Managed Clusters

Minikube is the tool to start with. Minikube creates a single-node cluster on your local machine. Everything, the master components and the worker, runs on one machine. This is not for production but is perfect for learning kubectl commands and getting comfortable with Kubernetes without spending on cloud resources.

Kops (Kubernetes Operations) is used when you are ready to build a real multi-node cluster. With Kops, you provision the master node, provision the worker nodes, and manage the cluster yourself. This is how you learn Kubernetes deeply before relying on managed services.

kubeadm is another tool for bootstrapping a Kubernetes cluster on existing machines.

Cloud-Based (Managed) Clusters

EKS (Elastic Kubernetes Service) is AWS's fully managed Kubernetes service. AWS manages the entire control plane. You create the cluster through the AWS console or CLI, connect with kubectl, and start deploying. Azure has AKS, and Google Cloud has GKE.

The reason to learn self-managed first is simple: when AWS manages the control plane, you do not see what is happening underneath. Learn it manually, then appreciate what EKS abstracts away.


Docker Swarm vs Kubernetes: The Structural Difference

Here is the architecture story, side by side:

Docker Swarm: Cluster contains Nodes, Nodes contain Containers, Containers contain Applications.

Kubernetes: Cluster contains Nodes, Nodes contain Pods, Pods contain Containers, Containers contain Applications.

The only structural difference is the addition of Pods as a layer between nodes and containers. That extra layer lets Kubernetes schedule, restart, and network a group of containers as a single unit, which is the basis for its flexibility and its service discovery.


Summary

Kubernetes is a free, open-source container orchestration platform that adds what Docker Swarm lacks: auto-scaling, integration with cloud load balancers, and namespaces, on top of self-healing and production-grade deployment at scale.

Here is everything in one place:

  • A cluster has a master node (control plane) and multiple worker nodes

  • Master node components: kube-apiserver, kube-scheduler, etcd, kube-controller-manager, cloud-controller-manager (optional)

  • Worker node components: kubelet (agent), container runtime, kube-proxy

  • Pods are the smallest unit Kubernetes manages. One pod, one container is the standard setup

  • kubectl is the command-line tool you use to communicate with the cluster via the kube-apiserver

  • Services expose your application: ClusterIP for internal, NodePort for port-based access, LoadBalancer for production

  • Volumes persist data beyond the life of a pod

  • Namespaces provide logical isolation inside a single cluster for multiple teams

  • Start with Minikube for learning, move to Kops for self-managed multi-node clusters, then EKS for cloud-managed

Sai Praneeth's Blogs
