Beyond One Server: Solving Docker Scaling with Swarm and Container Networks

This post covers three separate but deeply connected topics. We start by finishing what was started with Docker Hub, pushing all four bank service images to a remote registry so they survive beyond any single machine. From there, we identify a real architectural problem with single-host Docker deployments and introduce Docker Swarm as the solution. Finally, we close with Docker networking, explaining how containers communicate with each other both on the same host and across different hosts. By the end of this article, you will understand how to push and pull images from Docker Hub, how to set up a multi-node Docker Swarm cluster, how to create and scale services across that cluster, what self-healing means in practice, and how Docker networking works under the hood.

Passionate about Cloud and DevOps engineering. I write structured technical notes and beginner-friendly articles on AWS, Linux, CI/CD, networking, system architecture and modern software delivery workflows.

Pushing All Four Images to Docker Hub

We have four images built locally: internet-banking:v1, mobile-banking:v1, insurance:v1, and loans:v1. These images live only on the EC2 host machine right now. The goal is to get them into Docker Hub so they are safe, shareable, and accessible from any machine.

The workflow is always the same: clone code, build images, tag them with your Docker Hub repository path, log in, and push.

Step 1: Build the Images

Build each image from its Dockerfile:

cd internet-banking
docker build -t internet-banking:v1 .
cd ..

cd mobile-banking
docker build -t mobile-banking:v1 .
cd ..

cd insurance
docker build -t insurance:v1 .
cd ..

cd loans
docker build -t loans:v1 .
cd ..

Verify all four are present:

docker images
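
The four build commands above follow one pattern, so they can be collapsed into a loop. This is a minimal sketch, assuming each service directory sits under the current directory and that the directory name doubles as the image name, as in the steps above. The function name build_all is made up for illustration.

```shell
# Build every service image from its own directory.
# Assumes the directory name is also the image name.
build_all() {
  for svc in internet-banking mobile-banking insurance loans; do
    # Passing the directory as the build context avoids the cd / cd .. steps.
    docker build -t "$svc:v1" "./$svc" || return 1
  done
}

# Usage: build_all
```

The loop stops at the first failed build, so a broken Dockerfile in one service does not go unnoticed.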

Step 2: Tag the Images

Before pushing, each image must be tagged with your Docker Hub username and repository name. Docker uses this tag to know where to push the image.

docker tag internet-banking:v1 yourusername/ib-image
docker tag mobile-banking:v1 yourusername/mb-image
docker tag insurance:v1 yourusername/insurance-image
docker tag loans:v1 yourusername/loans-image

One thing worth noting here: if you do not specify a version tag at the end (like :v1), Docker automatically assigns the tag latest. So yourusername/ib-image and yourusername/ib-image:latest are the same thing. Always remember that no tag means latest by default.
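
The defaulting rule can be sketched as a small shell function. This is only an approximation for illustration: Docker's own reference parser handles more cases, and the function name with_default_tag is hypothetical.

```shell
# Print an image reference with the implicit tag made explicit.
# Only the last path component is inspected, so a registry port such as
# localhost:5000/app is not mistaken for a tag.
with_default_tag() {
  case "${1##*/}" in
    *:*|*@*) printf '%s\n' "$1" ;;         # already has a tag or digest
    *)       printf '%s:latest\n' "$1" ;;  # no tag given: Docker assumes latest
  esac
}

# Usage: with_default_tag yourusername/ib-image
```

Calling it with yourusername/ib-image prints yourusername/ib-image:latest, while yourusername/ib-image:v1 is printed unchanged.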

Step 3: Login to Docker Hub

docker login

Enter your Docker Hub username and password when prompted. On successful login, you will see Login Succeeded. Docker stores the credentials locally (by default in ~/.docker/config.json), so you do not have to log in again for subsequent pushes until you run docker logout.

If you try to push without logging in first, Docker Hub rejects the request with an access denied error. Login is mandatory before any push operation.

Step 4: Push All Images

docker push yourusername/ib-image
docker push yourusername/mb-image
docker push yourusername/insurance-image
docker push yourusername/loans-image

Each push uploads the image layer by layer to Docker Hub. Once complete, open your Docker Hub repository in the browser and you will see all four images listed with a "pushed less than a minute ago" timestamp.
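
If you would rather keep the version in the remote tag instead of relying on latest, the tag and push steps can be combined in one loop. This is a sketch of a variation, not the exact commands used above: the DOCKER_USER variable and the push_all function are assumptions, and the local:remote name pairs simply mirror the names in this post.

```shell
# Tag each local image under a Docker Hub repository and push it with an
# explicit :v1 tag. Override DOCKER_USER with your own account name.
DOCKER_USER="${DOCKER_USER:-yourusername}"

push_all() {
  for pair in internet-banking:ib-image mobile-banking:mb-image \
              insurance:insurance-image loans:loans-image; do
    local_name="${pair%%:*}"   # text before the colon: local image name
    repo="${pair##*:}"         # text after the colon: Docker Hub repository
    docker tag "$local_name:v1" "$DOCKER_USER/$repo:v1" || return 1
    docker push "$DOCKER_USER/$repo:v1" || return 1
  done
}

# Usage: push_all
```

Images pushed this way have to be pulled with the same tag, for example docker pull yourusername/ib-image:v1.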

Step 5: Delete Local Images and Pull Back

Let us now verify that Docker Hub is actually storing these correctly by deleting everything locally and pulling them back.

To delete all local images in one command:

docker rmi -f $(docker images -q)

The -q flag on docker images returns only the image IDs. Wrapping it in $() passes all those IDs directly to docker rmi -f. Every image gets force-removed.
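
One edge case: when there are no local images, the inner command returns nothing and docker rmi fails because it was given no arguments. A slightly more defensive version might look like this (the function name remove_all_images is made up for illustration):

```shell
# Force-remove every local image, but skip the call when there is nothing
# to remove.
remove_all_images() {
  # sort -u drops duplicate IDs (one image can carry several tags).
  ids=$(docker images -q | sort -u)
  if [ -z "$ids" ]; then
    echo "no local images to remove"
    return 0
  fi
  # Word splitting is intentional here: each ID becomes its own argument.
  docker rmi -f $ids
}

# Usage: remove_all_images
```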

Run docker images to confirm nothing is left locally.

Now pull one image back from Docker Hub:

docker pull yourusername/ib-image

Docker goes to Docker Hub, downloads the image, and it is available locally again. You can now create a container from it just as if you had built it yourself:

docker run -itd --name container1 -p 80:80 yourusername/ib-image

Navigate to your EC2 IP in a browser. The internet banking application is live. The image came straight from Docker Hub.

This is the purpose of Docker Hub. Images stored there survive machine terminations, restarts, and team handoffs. Developers push, operations teams pull.


AWS Equivalents: ECS and ECR

Everything we have done with Docker so far runs on a single EC2 instance. There is an AWS-native equivalent for everything we have covered.

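Docker on EC2 has an equivalent called ECS (Elastic Container Service). ECS is a fully managed container orchestration service. With the Fargate launch type, it runs your containers without you managing the underlying servers, which makes it the serverless version of what we have been doing manually. With the EC2 launch type, you still manage the instances, but ECS handles scheduling the containers onto them.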

Docker Hub has an equivalent called ECR (Elastic Container Registry). ECR is AWS's managed image registry, the same concept as Docker Hub but within the AWS ecosystem, with tighter integration with IAM permissions and VPC networking.

The mapping is clean:

  • Docker + EC2 is equivalent to ECS

  • Docker Hub is equivalent to ECR

We will cover ECS and ECR in project form later in the course. For now, the important thing is understanding the conceptual mapping.


The Problem with a Single Docker Host

Here is a scenario worth thinking about carefully.

You have one EC2 instance. Docker is installed on it. You have four containers running your banking services. Things are working well.

Now someone accidentally terminates that EC2 instance.

Everything is gone. Not just the containers, but the Docker host itself. Docker Compose can bring containers back up, but only if the machine still exists. If the machine is gone, Docker Compose has nothing to work with.

This is the fundamental problem with single-host deployments. One machine going down takes everything with it. The solution is to run your containers across multiple machines, so if one goes down, the others keep serving traffic.

This multi-machine, multi-container management is called container orchestration.


Container Orchestration and Docker Swarm

Orchestration means managing containers across multiple host machines. You have seen this pattern before in other tools. Jenkins has a master and worker agents. Ansible has a control node and managed nodes. Docker Swarm follows the exact same model.

Docker Swarm is Docker's built-in container orchestration tool. It lets you join multiple Docker host machines into a cluster called a Swarm. One machine acts as the manager node and the others act as worker nodes. The manager distributes work to the workers.

The two main orchestration tools you need to know are Docker Swarm and Kubernetes. Kubernetes is far more widely used in production today, but Docker Swarm is worth knowing because some organizations still use it. Both are valid answers when an interviewer asks what container orchestration tools you have worked with.

One important operational note: Docker Swarm is built directly into the Docker Engine. You do not need to install anything extra to use it. This is different from Docker Compose, which is a separate component (a CLI plugin or a standalone binary) and often has to be installed as its own step. Swarm is already there the moment Docker is installed.


Setting Up a Docker Swarm Cluster

For this setup, we need three machines: one manager and two worker nodes. Launch three EC2 instances and install Docker on all three.

Setting Hostnames

For clarity, set descriptive hostnames on each machine.

On the manager:

sudo hostnamectl set-hostname docker-swarm-master

On worker node 1:

sudo hostnamectl set-hostname worker-node-1

On worker node 2:

sudo hostnamectl set-hostname worker-node-2

Install and Start Docker on Worker Nodes

On both worker nodes:

yum install docker -y
systemctl start docker

The manager node already has Docker running from the previous post.

Initialize the Swarm on the Manager

On the manager node:

docker swarm init

This initializes the Swarm and makes the current machine the manager. The output includes a docker swarm join command with a token. That token is what worker nodes need to join this Swarm.

The output will look something like this:

Swarm initialized: current node is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-xxxx <manager-ip>:2377

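Docker Swarm uses TCP port 2377 for cluster management traffic between manager and worker nodes. This is an interview-worthy fact. Two other ports must also be open between the nodes: 7946 (TCP and UDP) for node-to-node communication and 4789 (UDP) for overlay network traffic. On EC2, allow all three in the security group the instances share, otherwise workers may fail to join or the routing mesh may not work.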

Join Worker Nodes to the Swarm

Copy the full docker swarm join command from the manager's output and run it on each worker node:

docker swarm join --token SWMTKN-1-xxxx <manager-ip>:2377

After running this on both workers, each one responds with: This node joined a swarm as a worker.
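
If the original docker swarm init output has scrolled away, the manager can print the token again with docker swarm join-token. The helper below is a small sketch built on that command. The function name worker_join_command and the example IP are hypothetical; you supply the manager's private IP yourself.

```shell
# Run on the manager: print the join command a worker should execute.
# $1 is the manager's private IP address.
worker_join_command() {
  # -q prints only the token instead of the full help text.
  token=$(docker swarm join-token -q worker) || return 1
  printf 'docker swarm join --token %s %s:2377\n' "$token" "$1"
}

# Usage: worker_join_command 10.0.0.10   (hypothetical manager IP)
```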

Verify the Cluster

Back on the manager node:

docker node ls

This lists all nodes in the Swarm. You will see the manager marked with a * (indicating the current node) and labeled as Leader. Worker node 1 and worker node 2 appear below it with status Ready.

Your cluster is ready.
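
For a scripted check rather than a visual one, docker node ls accepts a --format template. The sketch below counts nodes whose status is Ready. The function name nodes_ready is an assumption made for this example.

```shell
# Run on the manager: succeed only when at least $1 nodes report Ready.
nodes_ready() {
  expected="$1"
  # One status per line, then count the lines that are exactly "Ready".
  ready=$(docker node ls --format '{{.Status}}' | grep -c '^Ready$')
  echo "$ready of $expected nodes ready"
  [ "$ready" -ge "$expected" ]
}

# Usage: nodes_ready 3
```

The exit status makes it usable as a gate in a setup script, for example nodes_ready 3 && echo "cluster ok".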


Creating Services in Docker Swarm

This is where Docker Swarm changes the command you use. In single-host Docker, you create containers with docker run. In Docker Swarm, docker run still works, but it only creates a container on the local machine. It does not use the Swarm at all.

To create a container that is distributed across the cluster, you use docker service create.

docker service create \
  --name internet-banking \
  --replicas 3 \
  -p 81:80 \
  yourusername/ib-image

Breaking this down:

--name internet-banking gives the service a name. A service in Swarm terms is the definition of a workload. Docker Swarm creates and manages containers to fulfill that service.

--replicas 3 tells Docker Swarm to always keep exactly 3 containers running for this service. These 3 containers are distributed across your manager and worker nodes.

-p 81:80 maps port 81 on the host to port 80 inside the containers. Because of Docker Swarm's built-in routing mesh, you can hit port 81 on any node in the cluster and it will reach the service, regardless of which specific node the container is running on.

yourusername/ib-image is the image to use. If it is not present locally, Docker Swarm pulls it from Docker Hub automatically.

After running this command, verify the service:

docker service ls

The output shows the service name, number of replicas (for example, 3/3 meaning 3 running out of 3 desired), the image being used, and the published port.
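
The running/desired pair in the REPLICAS column is easy to check from a script. Here is a minimal sketch that polls until the two numbers match. The function names and the 30 x 2 second timeout are arbitrary choices for illustration, and it assumes the column has the plain N/M form shown above.

```shell
# Succeed when a REPLICAS value such as "3/3" shows running == desired.
replicas_converged() {
  running="${1%%/*}"
  desired="${1##*/}"
  [ "$running" = "$desired" ]
}

# Run on the manager: poll docker service ls until the service converges.
wait_for_service() {
  for i in $(seq 1 30); do
    state=$(docker service ls --filter "name=$1" --format '{{.Replicas}}' | head -n 1)
    if [ -n "$state" ] && replicas_converged "$state"; then
      echo "$1 converged at $state"
      return 0
    fi
    sleep 2
  done
  return 1
}

# Usage: wait_for_service internet-banking
```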

Inspecting Where Containers Are Running

docker service ps internet-banking

This lists each container (called a task in Swarm terminology) along with the specific node it is running on. Swarm's scheduler spreads replicas across the available nodes, so with 3 replicas across 3 nodes you will typically see one container on the manager, one on worker node 1, and one on worker node 2.

You can access the application by going to the IP of any of the three nodes at port 81. Docker Swarm's routing mesh intercepts the request and routes it to an available container.

Checking Service Logs

docker service logs internet-banking

This aggregates logs from all container instances of the service. If something is wrong inside any of the containers, this is where you look.

Inspecting a Service

docker service inspect internet-banking

This returns the complete configuration of the service in JSON format: the image, replica count, port bindings, update configuration, and much more.


Scaling Services in Docker Swarm

One of the most practical things Docker Swarm does well is scaling.

Say traffic on internet banking increases significantly. You need more containers to handle the load. Scale out:

docker service scale internet-banking=15

Within seconds, Docker Swarm creates 12 additional containers and distributes them across all three nodes. Check the result:

docker service ps internet-banking

You will see 15 tasks listed, roughly 5 per node, all in running state. The entire scale-out operation takes about two seconds.
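
To see the per-node distribution as numbers instead of scanning 15 rows, the task list can be grouped by node. This is a sketch using the --filter and --format options of docker service ps. The function name tasks_per_node is made up for illustration.

```shell
# Run on the manager: count running tasks per node for a service.
# Prints one "node count" pair per line.
tasks_per_node() {
  docker service ps "$1" --filter desired-state=running --format '{{.Node}}' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Usage: tasks_per_node internet-banking
```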

To scale back in:

docker service scale internet-banking=5

Docker Swarm removes 10 containers and settles at exactly 5. The distribution across nodes is handled automatically. You do not need to specify which nodes should run which containers.


Rolling Back a Service

Docker Swarm keeps track of the previous state of each service. If you scale out and then need to return to a previous configuration, use rollback:

docker service rollback internet-banking

This returns the service to its previous state. If you had scaled from 5 to 15, rollback brings it back to 5. If from 15 to 5, rollback returns to 15.

This is useful in team environments where multiple engineers might be adjusting a service. If someone makes a change and the result is unexpected, rollback gives you a quick way to revert without needing to know what the previous value was.

Run rollback again and it oscillates back the other way, always toggling between the last two states.
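
Before rolling back, it can help to check what you would be rolling back to. The sketch below assumes the service is in replicated mode and reads the PreviousSpec section from docker service inspect. The function name previous_replicas is hypothetical.

```shell
# Run on the manager: print the replica count a rollback would restore,
# or "none" when the service has no previous spec recorded.
previous_replicas() {
  docker service inspect "$1" --format \
    '{{if .PreviousSpec}}{{.PreviousSpec.Mode.Replicated.Replicas}}{{else}}none{{end}}'
}

# Usage: previous_replicas internet-banking
```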


Self-Healing: Docker Swarm's Most Important Property

Here is the feature that makes Docker Swarm genuinely valuable for high availability.

With docker service scale internet-banking=5, Docker Swarm is now responsible for maintaining 5 running containers for that service at all times. If a container crashes, Docker Swarm detects it and automatically creates a replacement container to bring the count back to 5.

This is called self-healing.

An interviewer might ask: "In Docker Swarm, if a container is deleted or crashes, does it get recreated automatically?" The answer is yes. Docker Swarm continuously monitors the actual state of the cluster against the desired state (the replica count you specified). Any divergence triggers automatic recovery.

In production, you should not manually delete or stop individual Swarm containers. If you do accidentally delete one, Docker Swarm creates a new one immediately to maintain the replica count. The application keeps running throughout.
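
On a test cluster, you can watch self-healing happen by removing one replica on purpose. This is a sketch for experimentation only. It relies on the com.docker.swarm.service.name label that Swarm sets on task containers, and the function name kill_one_replica is made up for this example.

```shell
# Run on any node that hosts a replica: force-remove one local container
# belonging to the service so you can watch Swarm replace it.
kill_one_replica() {
  victim=$(docker ps -q --filter "label=com.docker.swarm.service.name=$1" | head -n 1)
  if [ -z "$victim" ]; then
    echo "no local container for $1 on this node"
    return 1
  fi
  docker rm -f "$victim" >/dev/null && echo "removed $victim"
}

# Usage: kill_one_replica internet-banking
```

Afterwards, docker service ps internet-banking on the manager should show the removed task as failed or shut down, with a new task started in its place.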


Limitations of Docker Swarm

Docker Swarm is useful, but it has real limitations that explain why Kubernetes has become the industry standard.

No auto-scaling. When you ran docker service scale internet-banking=15, you typed that number manually. Docker Swarm has no mechanism to watch CPU or memory usage and automatically adjust the replica count based on load. Scaling in Docker Swarm is always a manual operation.

Limited built-in load balancing. Docker Swarm includes a basic routing mesh that distributes connections across replicas, but it lacks advanced load balancing features such as health-check-based routing, SSL termination, and traffic weighting that production environments often require.

These two limitations are the primary reasons Docker Swarm is not widely used in modern production environments. Kubernetes addresses both of them. It supports Horizontal Pod Autoscaling based on real-time metrics, and it integrates with sophisticated load balancing solutions natively.

Docker Swarm remains worth knowing. Some organizations run it, and understanding its model makes Kubernetes concepts easier to absorb.

Leaving the Swarm

To remove a worker node from the Swarm:

docker swarm leave

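Run this on the worker node you want to remove. The node exits the cluster, although it stays in the manager's docker node ls output with status Down until you run docker node rm <node-name> on the manager. You can re-add it later by running docker swarm join-token worker on the manager to print the join command again, then running that command on the worker.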


Docker Networking

With the Swarm section complete, we now turn to Docker networking. Understanding how containers communicate with each other is essential for building real multi-service applications.

The question networking answers is: how does one container talk to another? The answer depends on whether those containers are on the same host or on different hosts.

To list all networks currently configured in Docker:

docker network ls

The output shows several networks, each with a name and a driver type. The driver is what determines the behavior.

The Four Network Types

Bridge (default for single-host communication)

Bridge is the default network that Docker uses unless you specify otherwise. When two containers are on the same host and you want them to communicate with each other, bridge networking is what you use.

When you run a container without specifying a network, it automatically gets placed on the default bridge network.

Overlay (for multi-host communication)

Overlay networks span multiple Docker hosts. When containers on different machines in a Swarm cluster need to communicate directly with each other, they use an overlay network. This is the network type Docker Swarm uses internally.

Host

The host network makes a container share the host machine's network stack directly. The container does not get its own IP address. It uses the same IP as the EC2 instance. Use this when you need the container to behave exactly like a process running directly on the host network.

None

The none network gives the container no network interface at all. Completely isolated. Use this only when the container intentionally must not communicate over any network.

In day-to-day work, you will primarily use bridge (for single-host container-to-container communication) and overlay (for Swarm or multi-host scenarios).
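
As a concrete illustration of the overlay case, a user-defined overlay network can be created on the manager and a service attached to it. This is a minimal sketch. The network name, service name, and replica count are placeholders passed in by the caller, and create_overlay_service is a hypothetical helper.

```shell
# Run on the manager: create an overlay network, then start a service on it.
# $1 = network name, $2 = service name, $3 = image
create_overlay_service() {
  net="$1"; name="$2"; image="$3"
  # --attachable also lets standalone containers join the network.
  docker network create --driver overlay --attachable "$net" || return 1
  docker service create --name "$name" --replicas 2 --network "$net" "$image"
}

# Usage: create_overlay_service bank-net loans yourusername/loans-image
```

Tasks of services attached to the same overlay network can reach each other even when they run on different nodes.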


Creating and Using Custom Networks

Docker creates a default bridge network automatically, but you can create your own named networks for better isolation and control.

Create a custom network:

docker network create my-network

By default, this creates a bridge-type network named my-network. Verify it:

docker network ls

You will see my-network listed with the bridge driver.

Connecting Containers to a Network

Create two containers:

docker run -itd --name container1 -p 82:80 ubuntu
docker run -itd --name container2 -p 83:80 ubuntu

Check which network container1 is on by default:

docker inspect container1

In the output, look for NetworkMode. It shows bridge, the default.

Now connect both containers to the custom network:

docker network connect my-network container1
docker network connect my-network container2

Both containers are now part of my-network. They can communicate with each other through this network.

Testing Container-to-Container Communication

Find container1's IP address:

docker inspect container1

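Look under NetworkSettings, then Networks, then my-network for the IPAddress field. container1 now has one address per attached network (the default bridge and my-network), so note the one listed under my-network.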
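
Scrolling through the full inspect output works, but a format template can pull out just the address for one network. This is a sketch, and the function name container_ip is made up for illustration.

```shell
# Print a container's IP address on one specific network.
# $1 = container name, $2 = network name
container_ip() {
  # index is needed because network names such as my-network contain a hyphen.
  docker inspect --format \
    "{{(index .NetworkSettings.Networks \"$2\").IPAddress}}" "$1"
}

# Usage: container_ip container1 my-network
```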

Now log into container2 and ping container1:

docker exec -it container2 /bin/bash

Inside the container, the ping utility may not be installed by default. Install it:

apt update
apt install iputils-ping -y

Now ping container1 by its IP address:

ping <container1-ip>

The ping responds successfully. The two containers are communicating over the custom network.
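
User-defined bridge networks such as my-network also give containers automatic DNS resolution by name, which the default bridge network does not. So on my-network you can skip the IP lookup entirely. A small sketch, assuming ping is already installed in the source container (the function name ping_by_name is hypothetical):

```shell
# Ping one container from another by name instead of by IP.
# $1 = container to run ping in, $2 = container name to reach
ping_by_name() {
  docker exec "$1" ping -c 3 "$2"
}

# Usage: ping_by_name container2 container1
```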

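Leave the container's shell:

exit

Because this shell was started with docker exec, exiting it does not stop container2. The Ctrl + P, then Ctrl + Q sequence is meant for detaching from a container's main process (docker attach or docker run -it) without stopping it.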

Disconnecting a Container from a Network

docker network disconnect my-network container2

container2 is now removed from my-network. It can no longer communicate with container1 through that network.

Cleaning Up Unused Networks

docker network prune

This removes all networks that no containers are currently connected to. Useful for cleaning up leftover networks after containers have been removed.

Cleaning Up Everything at Once

docker system prune

This removes all stopped containers, all networks not used by at least one container, all dangling images, and unused build cache in one operation. Add -a to also remove every image that no container is using, and --volumes to include volumes, which are left alone by default. Use it when you want to reclaim space quickly. It is more aggressive than the individual prune commands, so use it carefully in environments where you want to keep certain resources.
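
One cautious pattern is to look at what is reclaimable before pruning. The sketch below runs docker system df first and only prunes after an explicit yes. The function name careful_prune and the prompt wording are assumptions made for this example.

```shell
# Show disk usage first, then prune only after explicit confirmation.
careful_prune() {
  docker system df || return 1
  printf 'Proceed with docker system prune? [y/N] '
  read -r answer
  case "$answer" in
    y|Y) docker system prune -f ;;   # -f skips Docker's own prompt
    *)   echo "skipped" ;;
  esac
}

# Usage: careful_prune
```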


Docker Networking: Quick Reference

docker network ls                                # List all networks
docker network create <name>                     # Create a custom bridge network
docker network connect <network> <container>     # Connect a container to a network
docker network disconnect <network> <container>  # Disconnect a container from a network
docker network inspect <network>                 # Full details about a network
docker network prune                             # Remove all unused networks
docker system prune                              # Remove stopped containers, unused networks, dangling images, build cache

Docker Swarm: Quick Reference

docker swarm init                         # Initialize swarm on the manager node
docker swarm join --token <token> <ip>:2377  # Join a worker node to the swarm
docker node ls                            # List all nodes in the swarm

docker service create \
  --name <service-name> \
  --replicas <count> \
  -p <host-port>:<container-port> \
  <image>                                 # Create a service with replicas

docker service ls                         # List all services
docker service ps <service-name>          # List tasks (containers) for a service
docker service logs <service-name>        # View logs for a service
docker service inspect <service-name>     # Full service details

docker service scale <service-name>=<count>  # Scale a service to N replicas
docker service rollback <service-name>    # Roll back to previous service state

docker swarm leave                        # Remove a worker node from the swarm

Summary

This post covered three interconnected areas of Docker.

Starting with Docker Hub, we pushed all four service images from a local EC2 machine to a remote registry. The workflow is always tag, login, push. Pulling those images back after deleting them locally confirmed that Docker Hub stores them reliably. The AWS equivalents are ECS (Elastic Container Service) for container runtime and ECR (Elastic Container Registry) for image storage.

The problem of single-host fragility led to Docker Swarm. Docker Swarm creates a cluster of multiple Docker hosts with a manager node and worker nodes. Services run across the cluster with a specified number of replicas. The docker service create command is the Swarm equivalent of docker run. Services can be scaled up or down with docker service scale. Docker Swarm's self-healing property ensures that the desired replica count is always maintained automatically, even if containers crash or are deleted.

Docker Swarm's limitations are clear: it has no auto-scaling and limited load balancing compared to Kubernetes. These limitations explain why Kubernetes is the dominant orchestration tool today.

Docker networking rounds out the picture. Bridge networks handle container-to-container communication on the same host. Overlay networks handle communication across different hosts in a Swarm. The host network shares the EC2 machine's network stack with the container. Custom networks give you more control over isolation and naming. Containers in the same network can ping each other by IP, and you can connect or disconnect containers from networks at any time using docker network connect and docker network disconnect.

Sai Praneeth's Blogs
