Understanding Dockerfiles: From Basics to Real-World Deployment

If you have been working with Docker for even a short time, you have probably heard this phrase thrown around: "just write a Dockerfile." But what does that actually mean, and how does it work under the hood? In this article, we are going to walk through Dockerfiles the same way a classroom session would. We will start from the very basics, build up through each component one at a time, and finish with a real-world deployment where you host an actual web application and a Tomcat server running Jenkins inside Docker containers. Before this, you may have created Docker images by spinning up a container, manually installing things inside it, and then running docker commit to snapshot that container into an image. That works, but it is not the right approach in practice. It is manual, error-prone, and not repeatable. The correct and professional way to create images is through a Dockerfile.

Passionate about Cloud and DevOps engineering. I write structured technical notes and beginner-friendly articles on AWS, Linux, CI/CD, networking, system architecture and modern software delivery workflows.

What Is a Dockerfile?

A Dockerfile is a text file that contains a set of instructions for automatically building a Docker image. Think of it as a blueprint. You write down everything your image should contain, and Docker reads that file and builds the image for you.

The key phrase here is automate image creation. That is exactly what a Dockerfile does. Instead of manually going inside a container and running commands one by one, you write all of those instructions into the Dockerfile, and Docker handles the rest.

One convention to remember before anything else: name the file Dockerfile, with a capital D and no extension. Not dockerfile, not docker-file, not Dockerfile.txt. Just Dockerfile. Docker looks for that exact name by default when you run the build command. A file with a different name can still be used, but then you have to point to it explicitly with the -f flag of docker build.


From docker commit to Dockerfile: Why the Shift?

To understand why Dockerfiles matter, let us revisit how we used to create images without one.

The old workflow went like this:

  1. Pull a base image (say, Ubuntu)

  2. Create a container from that image

  3. Log into the container

  4. Manually install all the tools and applications you need

  5. Exit the container

  6. Run docker commit to save the container state as a new image

docker commit <container_id> my-image:v1

This works. But it has real problems. Every time someone else wants to recreate this image, they need to repeat all those manual steps. If something goes wrong, you cannot easily trace what was done. There is no history, no automation, and no repeatability.

The Dockerfile solves all of this. You write everything once, in a file, and anyone with that file can reproduce the exact same image by running a single command.

docker build -t my-image:v1 .

That is the command. docker build. Whenever you have a Dockerfile and you want to create an image from it, you use docker build. Remember that.


Dockerfile Components: The Building Blocks

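Inside a Dockerfile, we have something called components (the official term is instructions). By convention they are written in capital letters. Docker does not strictly require uppercase, but it makes instructions easy to tell apart from their arguments. Let us go through each one.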

FROM

FROM ubuntu

FROM specifies the base image you want to start with, and it is the first instruction in almost every Dockerfile (only comments, parser directives, and ARG may appear before it). Every Docker image is built on top of some base image. That base image could be ubuntu, amazonlinux, alpine, or a more specific image like tomcat or python. You always need to start somewhere, and FROM is how you declare that starting point.
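As a small illustration of declaring a starting point, the sketch below pins the base image to a specific tag instead of relying on the implicit latest. The 24.04 tag is an assumption chosen for the example; any tag published for the image works the same way.

```dockerfile
# Pin the base image so that a rebuild next month starts from the same release.
# "24.04" is an example tag, not something this article prescribes.
FROM ubuntu:24.04
```

Writing FROM ubuntu with no tag is shorthand for FROM ubuntu:latest, which can point to a different release over time.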

RUN

RUN apt update -y
RUN apt install git maven apache2 -y

RUN is used to execute Linux commands during the image creation process. This is critical to understand. When you run docker build, Docker reads the Dockerfile line by line. Every RUN command gets executed at build time, meaning the results of those commands are baked into the image itself.

If you want to install software, update packages, create files, or run any shell command as part of building the image, you use RUN.
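A common way to write the install step is to chain the package index update and the install in a single RUN, so a cached update layer is never paired with a newer install line. The sketch below assumes an Ubuntu base and uses apt-get, the script-friendly counterpart of apt. The package list simply mirrors the one above.

```dockerfile
FROM ubuntu

# One RUN instruction, so update and install always execute together.
# DEBIAN_FRONTEND=noninteractive keeps packages such as tzdata from prompting.
# Removing /var/lib/apt/lists afterwards keeps the resulting layer smaller.
RUN apt-get update \
 && DEBIAN_FRONTEND=noninteractive apt-get install -y git maven apache2 \
 && rm -rf /var/lib/apt/lists/*
```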

CMD

CMD apt install mysql-server -y

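CMD sets the default command that runs when a container starts from the image, not during the image build. If a Dockerfile contains more than one CMD, only the last one takes effect. This distinction is very important and we will come back to it with a full example shortly.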

Think of it this way. RUN prepares the image. CMD is what runs when someone starts a container from that image. The most common use of CMD in real-world Dockerfiles is to start a service. For example, starting Apache or Tomcat when the container launches.

ENTRYPOINT

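ENTRYPOINT also defines what runs when the container starts, but it is harder to override than CMD. Arguments passed to docker run replace CMD, whereas ENTRYPOINT stays in place unless you pass --entrypoint. If both are present and written in exec (JSON array) form, they do not compete: ENTRYPOINT is the executable and CMD supplies its default arguments. It is typically used when you want the container to behave like an executable. We will go deeper into this in future sessions, but it is good to know it exists.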
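To see how ENTRYPOINT and CMD interact, here is a minimal sketch in which ENTRYPOINT fixes the executable and CMD provides an argument that can be replaced at run time. The image name greeter in the comments is made up for the example.

```dockerfile
FROM ubuntu

# ENTRYPOINT is the fixed executable; CMD holds its default argument.
ENTRYPOINT ["echo", "Hello,"]
CMD ["world"]

# Assuming the image is built as "greeter" (a hypothetical name):
#   docker run --rm greeter                      runs: echo Hello, world
#   docker run --rm greeter Docker               runs: echo Hello, Docker
#   docker run --rm --entrypoint ls greeter /    replaces the entrypoint itself
```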

COPY

COPY index.html /var/www/html/

COPY is used to copy files from the build context (the directory you pass to docker build, usually the current directory on your local machine) into the image. If you have application code sitting on your EC2 instance or local system, and you want that code to end up inside your Docker image, you use COPY.

The syntax is simple: COPY <source-on-local> <destination-in-image>.

This is one of the most commonly used instructions in real-world Dockerfiles, because your application code always lives locally and needs to get into the image somehow.

ADD

ADD https://example.com/some-file.tar.gz /tmp/

ADD is similar to COPY, but it can also download files from a URL. If you need to pull a file from the internet and place it inside the image, ADD can do that. It can also automatically extract a local tar archive from the build context into the destination, which COPY does not do. Archives fetched from a URL are not extracted; they are saved as-is.

So the rule of thumb is: use COPY for local files, and reach for ADD only when you need a URL download or automatic extraction of a local archive.
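The difference in archive handling can be sketched as follows. The file site.tar.gz is a hypothetical archive assumed to sit in the build context next to the Dockerfile.

```dockerfile
FROM ubuntu

# site.tar.gz is a hypothetical local archive in the build context.
# ADD unpacks a local tar archive into the destination directory.
ADD site.tar.gz /opt/site/

# COPY transfers the same archive byte for byte, without unpacking it.
COPY site.tar.gz /tmp/
```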

WORKDIR

WORKDIR /app

WORKDIR sets the working directory for the instructions that follow it and for containers started from the image. After setting WORKDIR, any subsequent RUN, CMD, ENTRYPOINT, COPY, or ADD instructions operate relative to that directory, and Docker creates the directory if it does not exist yet. It is the equivalent of doing cd /app inside the container, except it persists as the default location.
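WORKDIR can appear more than once, and a relative path is resolved against the previous value. A minimal sketch, with directory and file names chosen only for illustration:

```dockerfile
FROM ubuntu

WORKDIR /app
# A relative path is resolved against the previous WORKDIR, giving /app/logs.
WORKDIR logs

# Runs in /app/logs, so the file ends up at /app/logs/where.txt.
RUN pwd > where.txt
```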

LABEL

LABEL author="Alex"

LABEL is used to add metadata to the image. You can use it to tag who created the image, what version it is, or any other descriptive information. This does not affect how the image works technically, but it is useful for organization and documentation purposes.

You can inspect this metadata later using:

docker inspect <image-or-container-name>
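Several labels can be attached at once, and the inspect output can be narrowed to just the labels. The version and description values below are placeholders for illustration.

```dockerfile
FROM ubuntu

# Several key/value pairs can share one LABEL instruction.
# The version and description values are made up for this example.
LABEL author="Alex" \
      version="1.0" \
      description="Demo image for the LABEL instruction"

# To print only the labels of a built image (image name is a placeholder):
#   docker inspect --format '{{json .Config.Labels}}' <image-name>
```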

ENV

ENV COURSE="DevOps"
ENV TRAINER="Alex"

ENV is used to set environment variables inside the image and all containers created from it. These variables are accessible from within the running container just like any shell environment variable.

For example, after setting ENV COURSE="DevOps", you could log into the container and run:

echo $COURSE

And it would print DevOps.

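In real-world usage, ENV is extremely powerful. You can pass database hostnames, configuration flags, and similar settings through environment variables. Be careful with credentials, though: a value set with ENV is stored in the image and is visible to anyone who can run docker inspect or docker history on it, so secrets are better supplied when the container starts.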

ARG

ARG is similar to ENV but only available during the build process, not inside the running container. Think of ARG as build-time variables and ENV as runtime variables. We will cover ARG in more depth in coming sessions.
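A minimal sketch of the build-time versus runtime split, assuming a made-up variable named APP_VERSION: ARG receives the value during the build, and ENV carries it into the running container.

```dockerfile
FROM ubuntu

# Build-time variable with a default; APP_VERSION is a hypothetical name.
ARG APP_VERSION=1.0

# Copy the value into ENV so it is still visible inside running containers.
ENV APP_VERSION=${APP_VERSION}

# The variable is usable in later build steps as well.
RUN echo "Building version ${APP_VERSION}" > /build-info.txt

# Override the default at build time:
#   docker build --build-arg APP_VERSION=2.0 -t my-image:v2 .
```

Without the ENV line, the value would be usable in later build instructions but would not exist inside containers started from the image.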

EXPOSE

EXPOSE 80

EXPOSE is used to document which port your application inside the container listens on. If your web server runs on port 80, you declare that with EXPOSE 80. If Tomcat runs on 8080, you use EXPOSE 8080.

It is important to understand that EXPOSE alone does not actually publish the port to the outside world. It is more of documentation within the Dockerfile. To actually make the port accessible from outside, you need to use the -p flag when running the container. But EXPOSE is still important because it tells Docker and anyone reading the Dockerfile which port the application expects to use.
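The relationship between EXPOSE and publishing can be sketched like this. The port numbers on the host side and the image name in the comments are placeholders.

```dockerfile
FROM ubuntu

# Documents the listening ports; nothing is published by these lines alone.
EXPOSE 80
EXPOSE 8080/tcp

# Publishing happens at run time (image name is a placeholder):
#   docker run -d -p 8081:80 <image-name>   host port 8081 -> container port 80
#   docker run -d -P <image-name>           every EXPOSEd port -> a random host port
```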


Building Your First Dockerfile

Let us write an actual Dockerfile now. Open a terminal on your machine and create a file named Dockerfile.

vi Dockerfile

Inside the file, write the following:

FROM ubuntu
RUN apt update -y
RUN apt install git maven tree apache2 -y
RUN touch file1

Save and exit. Now let us build this into an image.

docker build -t rias:version1 .

Breaking this command down:

  • docker build is the command to create an image from a Dockerfile

  • -t rias:version1 tags the image with the name rias and the version version1

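  • . is the build context, here the current directory. Docker looks for the Dockerfile at the root of this directory by default, and COPY can only reach files inside it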

When you run this, Docker will read the Dockerfile and execute each instruction in sequence. You will see output flying by in your terminal as each step completes.

To verify the image was created:

docker images

You should see rias listed there with the tag version1.

Now create a container from this image:

docker run -it --name container1 rias:version1

Once you are inside the container, try:

git --version

Git is there, already installed. Why? Because it was installed during the image build process. When you created a container from that image, git came along with it because it was baked into the image itself.


Understanding Image Layers

When you ran docker build above, you might have noticed the output showing something like 1/4, 2/4, 3/4, 4/4. That is Docker telling you it is processing each instruction in order.

This is the concept of image layers. Instead of creating the entire image in one single shot, Docker builds it layer by layer. Instructions that change the filesystem (RUN, COPY, ADD) add a new layer on top of the previous one, while the others only record metadata.

This design has a practical benefit. If you modify your Dockerfile and rebuild, Docker reuses the cached layers for every instruction up to the first one that changed. From that point on, the changed instruction and everything after it are executed again. That is why rebuilding after a small change near the end of the file is much faster than the first build.
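Because the cache is invalidated from the first changed instruction onward, the order of instructions matters. A minimal sketch, assuming an index.html file in the build context that changes more often than the package list:

```dockerfile
FROM ubuntu

# Rarely changes: this layer stays cached across rebuilds.
RUN apt-get update && apt-get install -y apache2

# Changes often: only this instruction and anything after it are rebuilt
# when index.html is edited.
COPY index.html /var/www/html/
```

Placing the COPY before the RUN would force the package installation to run again after every edit to index.html.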


RUN vs CMD: A Distinction That Matters

This is one of the most important things to understand about Dockerfiles, so let us spend time on it with a proper example.

Write a new Dockerfile like this:

FROM ubuntu
RUN apt update -y
RUN apt install git maven apache2 -y
RUN touch file1
RUN apt install python3 -y
CMD apt install mysql-server -y

Build it:

docker build -t rias:version2 .

When Docker processes this file, everything under RUN gets executed during the build. The image ends up with git, maven, apache2, python3, and that empty file all baked in.

But the CMD line? That does not run during the build. Nothing about MySQL appears in the build output.

Now create a container from this image:

docker run -it --name container2 rias:version2

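The moment the container starts, the CMD fires. MySQL server installation begins right then. You are watching it happen at container startup time, not at image build time. Note that the installation lands in this one container only, not in the image, and once the command finishes the container stops, because the CMD was its main process.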

So to summarize this cleanly:

RUN runs during image creation (build time). Use it to install packages, run setup scripts, create files, anything that should be permanently baked into the image.

CMD runs when a container starts (run time). Use it to start services, kick off your application, or run any process that should begin when the container comes alive.

The reason this separation exists is practical. Imagine you are building an image for a web server. You use RUN to install Apache into the image. But you do not want Apache running while you are building the image. You want Apache to start when someone actually creates and runs a container from that image. So you use CMD to start the Apache service.
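One way to observe the two moments side by side is to record a timestamp in each. This is only a sketch; the file name /built-at.txt is made up for the demonstration.

```dockerfile
FROM ubuntu

# Build time: executed once, and the result is frozen into the image.
RUN date > /built-at.txt

# Start time: executed every time a container starts from the image.
CMD ["sh", "-c", "echo Image built at: $(cat /built-at.txt); echo Container started at: $(date)"]
```

Starting several containers from this image should show the same build timestamp each time, while the start timestamp keeps moving forward.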


Working with COPY and ADD

Let us look at how to get files into your image.

First, create a sample local file to work with:

echo "<h1>Hello from Docker</h1>" > index.html

Now write a Dockerfile that copies this file and also pulls something from the internet:

FROM ubuntu
COPY index.html /tmp/
ADD https://dlcdn.apache.org/tomcat/tomcat-10/v10.1.39/bin/apache-tomcat-10.1.39.tar.gz /tmp/

Build it:

docker build -t rias:version3 .

Create a container and check:

docker run -it --name container3 rias:version3
cd /tmp
ls

You will find both the index.html you copied from your local machine and the Tomcat archive that was downloaded from the internet. The archive is still a .tar.gz file, because ADD does not unpack archives that come from a URL.

That is the practical difference between COPY and ADD in action. COPY works with files already on your local system. ADD can reach out to a URL and fetch the file during the build process.


WORKDIR and LABEL in Practice

Let us now write a Dockerfile that demonstrates WORKDIR and LABEL:

FROM ubuntu
WORKDIR /myapp
COPY index.html .
LABEL author="Alex"

Build it:

docker build -t rias:version4 .

Create a container:

docker run -it --name container4 rias:version4

The moment you enter the container, notice which directory you land in. You are already in /myapp. That is WORKDIR at work. It set the default directory for the container.

If you run ls, you will see index.html there because we copied it to ., which resolves to /myapp thanks to WORKDIR.

Now exit the container and check the label:

docker inspect container4

Scroll through the output and look for the Labels section. You will see "author": "Alex" listed there. That is the metadata we attached to the image using LABEL.


Environment Variables with ENV

Write this Dockerfile:

FROM ubuntu
ENV COURSE="DevOps"
ENV TRAINER="Alex"

Build it:

docker build -t rias:version5 .

Create a container:

docker run -it --name container5 rias:version5

Inside the container:

echo $COURSE
echo $TRAINER

You get DevOps and Alex printed back. These environment variables were set at image build time and are available in every container created from this image.

In real-world applications, this is how you pass configuration into containers. You might set database hostnames, port numbers, application modes, or API endpoints using ENV. It keeps your application flexible without hardcoding values directly into your code.
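A value set with ENV acts as a default that can be replaced when the container is started. The sketch below assumes a made-up variable APP_MODE and a hypothetical image name env-demo.

```dockerfile
FROM ubuntu

# Default value baked into the image; APP_MODE is a hypothetical variable name.
ENV APP_MODE="development"

# sh -c is used so the shell expands $APP_MODE when the container starts.
CMD ["sh", "-c", "echo Running in $APP_MODE mode"]

# Assuming the image is built as "env-demo" (hypothetical name):
#   docker run --rm env-demo                          uses the default
#   docker run --rm -e APP_MODE=production env-demo   overrides it for this container
```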


Real-World Example: Deploying a Website with Apache

Now let us pull everything together into a proper real-world example.

The goal here is to take a web application that lives in a GitHub repository, copy it into a Docker image with Apache installed, and serve it from a running container.

Step 1: Clone the repository

First, install git on your host machine if you have not already, and clone the repository:

apt install git -y
git clone https://github.com/your-username/website.git

You now have a website directory with index.html and supporting files inside it.

Step 2: Write the Dockerfile

Stay in the directory where you ran git clone, so that the website folder sits next to the Dockerfile (do not cd into website), then create the Dockerfile:

FROM ubuntu
RUN apt update -y
RUN apt install apache2 -y
RUN apt install apache2-utils -y
COPY website/ /var/www/html/
RUN service apache2 restart
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-D", "FOREGROUND"]

Let us walk through what each line does here.

FROM ubuntu gives us a clean Ubuntu base.

The two RUN install commands set up Apache and its utility tools. Apache itself (apache2) is the web server. apache2-utils provides additional helper tools that Apache may need.

COPY website/ /var/www/html/ takes all the code from the website folder on your local machine and places it into /var/www/html/ inside the image, which is exactly where Apache looks for web files to serve.

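RUN service apache2 restart starts Apache during image creation, but processes started during a build do not survive into the image, so this line has no lasting effect and can safely be removed. What actually starts Apache in the container is the CMD at the end.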

EXPOSE 80 documents that this container will use port 80.

The final CMD line is important. It starts Apache in foreground mode. This is a critical concept in Docker containers. A container stays alive only as long as its main process is running. If you start Apache in background mode (as a daemon), the command that launched it returns right away, Docker sees the main process exit, and the container stops. By running Apache with -D FOREGROUND, we keep the Apache process in the foreground, which keeps the container alive as long as Apache is running.
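The foreground requirement can be sketched by contrasting two possible CMD lines. The commented-out variant is shown only to illustrate the failure mode described above.

```dockerfile
FROM ubuntu
RUN apt-get update && apt-get install -y apache2
COPY website/ /var/www/html/
EXPOSE 80

# Would stop almost immediately: the service script returns as soon as the
# Apache daemon has been launched in the background, so the main process ends.
# CMD service apache2 start

# Keeps running: Apache itself stays in the foreground as the main process.
CMD ["/usr/sbin/apachectl", "-D", "FOREGROUND"]
```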

Step 3: Build the image

docker build -t first-project:version1 .

Watch the output. You will see Docker working through each step, installing Apache, copying your website files, and finally packaging it all into an image.

Step 4: Create the container with port mapping

docker run -itd --name web-container1 -p 80:80 first-project:version1

Let us unpack the -p 80:80 part.

When you write -p 80:80, the format is <host-port>:<container-port>.

Your container has Apache running on port 80 inside it. But a container is isolated. You cannot reach it from your browser directly. Your EC2 instance (the host machine) sits in between. So you map the host machine's port 80 to the container's port 80. Now when someone hits your EC2 IP on port 80, that traffic routes into the container and reaches Apache.

After the container starts, open a browser and navigate to your EC2 instance's public IP address:

http://<your-ec2-ip>

Your web application loads, served by Apache running inside a Docker container.

This is the fundamental shift from traditional deployment. Previously, you installed Apache directly on the EC2 instance and deployed your application there. Now, Apache and your application both live inside a container, and the EC2 instance simply runs Docker.


Deploying Tomcat with a Jenkins WAR File

Let us take this one step further with another real-world scenario. This time, instead of starting from a blank Ubuntu image, we will use an existing official Tomcat image that already has Tomcat installed.

Step 1: Download the Jenkins WAR file

wget https://get.jenkins.io/war-stable/latest/jenkins.war

You now have jenkins.war on your local machine.

Step 2: Write the Dockerfile

FROM tomcat:latest
ENV CATALINA_HOME=/usr/local/tomcat
RUN rm -rf /usr/local/tomcat/webapps/*
COPY jenkins.war /usr/local/tomcat/webapps/
EXPOSE 8080
CMD ["catalina.sh", "run"]

Let us go through this carefully.

FROM tomcat:latest pulls the official Tomcat image from Docker Hub. This image already has Java and Tomcat installed. We do not need to install them ourselves. This is the power of using specialized base images. You skip all the setup work that someone else has already done.

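ENV CATALINA_HOME=/usr/local/tomcat sets the environment variable pointing to Tomcat's home directory. Catalina is the name of Tomcat's servlet engine, and /usr/local/tomcat is where it lives inside the official Tomcat image. The official image already sets this variable to the same value, so the line is here mainly to make the path explicit.

RUN rm -rf /usr/local/tomcat/webapps/* removes any default web applications that may be bundled with the Tomcat image. Recent official images already ship with an empty webapps directory, so this is mostly a safeguard that guarantees a clean slate before deploying our own application.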

COPY jenkins.war /usr/local/tomcat/webapps/ copies the Jenkins WAR file we downloaded locally into Tomcat's webapps directory. Tomcat automatically picks up any WAR file placed in that directory and deploys it.

EXPOSE 8080 documents that Tomcat runs on port 8080.

CMD ["catalina.sh", "run"] starts Tomcat in the foreground using the Catalina startup script, keeping the container alive.
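Since the official image already defines CATALINA_HOME, the exposed port, and the default command, a shorter variant is possible. The sketch below also pins the base image; the 10.1 tag is an assumption for illustration, and the right choice depends on which Tomcat versions your Jenkins release supports.

```dockerfile
# "10.1" is an example tag; choose one that matches the Jenkins release you deploy.
FROM tomcat:10.1

# CATALINA_HOME and the default command (catalina.sh run) are inherited
# from the official image, so they are not repeated here.
COPY jenkins.war /usr/local/tomcat/webapps/
EXPOSE 8080
```

Pinning a tag instead of using latest means a rebuild months later starts from the same Tomcat line rather than whatever latest points to at that time.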

Step 3: Build the image

docker build -t tomcat-image:version1 .

Step 4: Run the container

docker run -itd --name tomcat-container1 -p 8080:8080 tomcat-image:version1

Here we map the host's port 8080 to the container's port 8080. After the container starts, open your browser and go to:

http://<your-ec2-ip>:8080/jenkins

Jenkins should load up, running inside Tomcat, inside a Docker container, on your EC2 instance.


Quick Reference: When to Use Each Component

Here is a clean summary of all the Dockerfile components we covered and when to reach for each one.

FROM - The first instruction (only ARG, comments, and parser directives may precede it). Declares the base image.

RUN - Executes shell commands during image build time. Use for installing packages, creating files, running setup scripts.

CMD - Runs when a container starts from the image. Use for starting services and applications.

ENTRYPOINT - Sets the fixed executable for the container, with CMD supplying its default arguments. Used when the container should behave like a standalone executable.

COPY - Copies files from your local machine into the image. Use for application code and local configuration files.

ADD - Like COPY, but can also download files from URLs and extract local tar archives. Use it only when you need one of those two features.

WORKDIR - Sets the default working directory inside the container for subsequent instructions.

LABEL - Adds metadata to the image such as author name, version, or description.

ENV - Sets environment variables that are available during build and at runtime inside containers.

ARG - Defines variables available only during the build process.

EXPOSE - Documents which port the container's application listens on.


Summary

What started as a simple question, "what is a Dockerfile?", has taken us all the way through a series of practical Dockerfile examples and two real-world deployments.

The key ideas to walk away with are these.

A Dockerfile automates image creation. You write the instructions once, and Docker repeats them exactly every time. This makes your infrastructure reproducible and shareable.

RUN and CMD are not the same thing. RUN bakes things into the image at build time. CMD fires when a container starts from that image. Getting this distinction right is what separates a working Dockerfile from a broken one.

Port mapping with -p host-port:container-port is how you expose your containerized application to the outside world. The container's internal port and the host's external port do not have to match, but they need to be explicitly mapped.

Using specialized base images like tomcat:latest saves you from writing a lot of boilerplate setup. Start with the right base image and your Dockerfile stays small and focused.

And finally, running your main service in foreground mode inside a container is what keeps the container alive. If your process exits, your container exits with it.

These fundamentals apply to virtually every Dockerfile you will write from here on out. The specific tools and applications will change, but the logic stays the same. Master these building blocks and Dockerfile writing becomes a natural part of how you think about deploying applications.
