Understanding Dockerfiles: From Basics to Real-World Deployment
If you have been working with Docker for even a short time, you have probably heard this phrase thrown around: "just write a Dockerfile." But what does that actually mean, and how does it work under the hood? In this article, we are going to walk through Dockerfiles the same way a classroom session would. We will start from the very basics, build up through each component one at a time, and finish with a real-world deployment where you host an actual web application and a Tomcat server running Jenkins inside Docker containers.
Before this, you may have created Docker images by spinning up a container, manually installing things inside it, and then running docker commit to snapshot that container into an image. That works, but it is not the right approach in practice. It is manual, error-prone, and not repeatable. The correct and professional way to create images is through a Dockerfile.

What Is a Dockerfile?
A Dockerfile is a text file that contains a set of instructions for automatically building a Docker image. Think of it as a blueprint. You write down everything your image should contain, and Docker reads that file and builds the image for you.
The key phrase here is automate image creation. That is exactly what a Dockerfile does. Instead of manually going inside a container and running commands one by one, you write all of those instructions into the Dockerfile, and Docker handles the rest.
One important convention to remember before anything else: name the file Dockerfile, with a capital D and no extension. Not dockerfile, not docker-file, not Dockerfile.txt. Just Dockerfile. This is because Docker looks for that exact name by default when you run the build command. A differently named file still works, but then you have to point to it explicitly with the -f flag, as in docker build -f <path-to-file> .
From docker commit to Dockerfile: Why the Shift?
To understand why Dockerfiles matter, let us revisit how we used to create images without one.
The old workflow went like this:
Pull a base image (say, Ubuntu)
Create a container from that image
Log into the container
Manually install all the tools and applications you need
Exit the container
Run docker commit to save the container state as a new image
docker commit <container_id> my-image:v1
This works. But it has real problems. Every time someone else wants to recreate this image, they need to repeat all those manual steps. If something goes wrong, you cannot easily trace what was done. There is no history, no automation, and no repeatability.
The Dockerfile solves all of this. You write everything once, in a file, and anyone with that file can reproduce the exact same image by running a single command.
docker build -t my-image:v1 .
That is the command. docker build. Whenever you have a Dockerfile and you want to create an image from it, you use docker build. Remember that.
Dockerfile Components: The Building Blocks
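Inside a Dockerfile, we have something called components (also called instructions). By convention these are written in capital letters. Docker does not enforce the case, but uppercase makes instructions easy to tell apart from their arguments. Let us go through each one.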
FROM
FROM ubuntu
FROM is the first instruction in a Dockerfile (only comments, parser directives, and ARG lines may come before it). It specifies the base image you want to start with. Every Docker image is built on top of some base image. That base image could be ubuntu, amazonlinux, alpine, or even a more specific image like tomcat or python. You always need to start somewhere, and FROM is how you declare that starting point.
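As a small illustration, the base image can also be pinned to a specific tag instead of relying on the implicit latest. The tag 24.04 below is an assumption chosen for the example, not something this walkthrough depends on.

```dockerfile
# Pin the base image to a specific tag so rebuilds start from the same release.
# The tag 24.04 is an illustrative choice (assumption); any valid tag works.
FROM ubuntu:24.04
```

Without a tag, FROM ubuntu resolves to ubuntu:latest, which can point to a different release over time.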
RUN
RUN apt update -y
RUN apt install git maven apache2 -y
RUN is used to execute Linux commands during the image creation process. This is critical to understand. When you run docker build, Docker reads the Dockerfile line by line. Every RUN command gets executed at build time, meaning the results of those commands are baked into the image itself.
If you want to install software, update packages, create files, or run any shell command as part of building the image, you use RUN.
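A common variation, sketched here under the assumption that you are on an Ubuntu base image, is to refresh the package index and install the packages in a single RUN. The package list is the one used above. The DEBIAN_FRONTEND setting and the cleanup step are additions for illustration.

```dockerfile
FROM ubuntu:24.04

# Update and install in one RUN so a cached, stale package index
# is never reused by a later install step.
# DEBIAN_FRONTEND=noninteractive avoids interactive prompts during the build.
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y git maven apache2 \
    && rm -rf /var/lib/apt/lists/*
```

Removing /var/lib/apt/lists in the same RUN keeps the downloaded package index out of the resulting layer.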
CMD
CMD apt install mysql-server -y
CMD is executed at container creation time, not during image build. This distinction is very important and we will come back to it with a full example shortly.
Think of it this way. RUN prepares the image. CMD is what runs when someone starts a container from that image. The most common use of CMD in real-world Dockerfiles is to start a service. For example, starting Apache or Tomcat when the container launches.
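CMD can be written in two forms, and it can be overridden when the container is started. The following is a minimal sketch. The echo command is only a stand-in for a real service.

```dockerfile
FROM ubuntu:24.04

# Shell form: Docker wraps the command as /bin/sh -c "echo container started".
# CMD echo "container started"

# Exec form (JSON array): the program is run directly, without a shell.
# Only the last CMD in a Dockerfile takes effect.
CMD ["echo", "container started"]

# docker run <image>         runs the CMD above, prints the message, then exits
# docker run <image> ls /    replaces the CMD with "ls /" for this one run
```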
ENTRYPOINT
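ENTRYPOINT is similar to CMD in that it also runs when a container starts, but the two play different roles. ENTRYPOINT defines the executable the container always runs. If both ENTRYPOINT and CMD are present in a Dockerfile, CMD no longer runs on its own. Instead, its contents are passed to the ENTRYPOINT as default arguments, and any arguments you give to docker run replace them. ENTRYPOINT is typically used when you want the container to behave like an executable. We will go deeper into this in future sessions, but it is good to know it exists.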
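To see how ENTRYPOINT and CMD interact when both are present, here is a minimal sketch using echo as a stand-in executable. Both instructions use the exec (JSON array) form.

```dockerfile
FROM ubuntu:24.04

# ENTRYPOINT is the fixed part of the command.
ENTRYPOINT ["echo", "Hello,"]

# CMD supplies default arguments that are appended to the ENTRYPOINT.
CMD ["world"]

# docker run <image>           prints: Hello, world
# docker run <image> Docker    prints: Hello, Docker
```

Arguments passed after the image name replace CMD, while the ENTRYPOINT stays in place.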
COPY
COPY index.html /var/www/html/
COPY is used to copy files from your local machine into the image. If you have application code sitting on your EC2 instance or local system, and you want that code to end up inside your Docker image, you use COPY.
The syntax is simple: COPY <source-on-local> <destination-in-image>.
This is one of the most commonly used instructions in real-world Dockerfiles, because your application code always lives locally and needs to get into the image somehow.
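The two most common shapes of COPY are copying a single file and copying the contents of a directory. In this sketch, index.html and website/ are assumed to exist in the directory you build from.

```dockerfile
FROM ubuntu:24.04

# Source paths are resolved relative to the build context
# (the directory passed to docker build, usually ".").

# Copy a single file. The destination directory is created if it is missing.
COPY index.html /var/www/html/

# Copy the contents of a directory into the destination directory.
COPY website/ /var/www/html/
```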
ADD
ADD https://example.com/some-file.tar.gz /tmp/
ADD is similar to COPY, but it can also download files from the internet. If you need to pull a file from a URL and place it inside the image, ADD is the command for that. It can also automatically extract a compressed tar archive, which COPY does not do, but only when the archive is a local file. An archive downloaded from a URL is stored as-is and is not extracted.
So the rule of thumb is: use COPY for local files, and use ADD only when you need a URL download or automatic extraction of a local archive.
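The difference in how ADD treats local archives and URLs can be sketched as follows. Here app.tar.gz is a hypothetical archive sitting next to the Dockerfile, and the URL is the placeholder used earlier, so substitute real values before building.

```dockerfile
FROM ubuntu:24.04

# Local tar archive (hypothetical file): ADD unpacks it into /opt/app/.
ADD app.tar.gz /opt/app/

# Remote URL (placeholder): the file is downloaded and stored as-is in /tmp/.
# It is not extracted automatically.
ADD https://example.com/some-file.tar.gz /tmp/
```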
WORKDIR
WORKDIR /app
WORKDIR sets the default working directory inside the container. After setting WORKDIR, any subsequent RUN, CMD, COPY, or ADD instructions will operate relative to that directory. It is the equivalent of doing cd /app inside the container, except it persists as the default location.
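A short sketch of how WORKDIR affects later instructions. The file name where.txt is made up for the example.

```dockerfile
FROM ubuntu:24.04

# Creates /app if it does not exist and makes it the working directory.
WORKDIR /app

# Runs inside /app, so this writes /app/where.txt containing "/app".
RUN pwd > where.txt

# A relative path is resolved against the previous WORKDIR: now /app/logs.
WORKDIR logs

# Writes /app/logs/where.txt containing "/app/logs".
RUN pwd > where.txt
```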
LABEL
LABEL author="Alex"
LABEL is used to add metadata to the image. You can use it to tag who created the image, what version it is, or any other descriptive information. This does not affect how the image works technically, but it is useful for organization and documentation purposes.
You can inspect this metadata later using:
docker inspect <container-name>
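Several labels can be set in one instruction, and docker inspect can print just the labels instead of the full output. In this sketch, the version and description values are invented for illustration.

```dockerfile
FROM ubuntu:24.04

# Several key/value pairs in a single LABEL instruction.
LABEL author="Alex" \
      version="1.0" \
      description="Demo image for the Dockerfile walkthrough"

# Print only the labels of an image or container:
# docker inspect --format '{{ json .Config.Labels }}' <image-or-container>
```

Labels defined by the base image are inherited, so the output may contain more entries than the ones set here.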
ENV
ENV COURSE="DevOps"
ENV TRAINER="Alex"
ENV is used to set environment variables inside the image and all containers created from it. These variables are accessible from within the running container just like any shell environment variable.
For example, after setting ENV COURSE="DevOps", you could log into the container and run:
echo $COURSE
And it would print DevOps.
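In real-world usage, ENV is extremely powerful. You can pass database hostnames, connection settings, configuration flags, and more through environment variables. Be careful with credentials, though: anything set with ENV is stored in the image and is visible to anyone who can inspect it, so secrets are better supplied when the container is started.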
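Values set with ENV act as defaults that can be overridden per container with the -e flag of docker run. Here is a minimal sketch reusing the COURSE and TRAINER variables from above. The CMD is added only to make the values visible.

```dockerfile
FROM ubuntu:24.04

# Default values baked into the image.
ENV COURSE="DevOps" \
    TRAINER="Alex"

# The shell expands the variables when the container starts.
CMD ["sh", "-c", "echo course=$COURSE trainer=$TRAINER"]

# docker run <image>                    prints: course=DevOps trainer=Alex
# docker run -e COURSE=Cloud <image>    prints: course=Cloud trainer=Alex
```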
ARG
ARG is similar to ENV but only available during the build process, not inside the running container. Think of ARG as build-time variables and ENV as runtime variables. We will cover ARG in more depth in coming sessions.
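As a preview, here is a minimal sketch of ARG. The variable name APP_VERSION and the file /build-info.txt are hypothetical and chosen only for illustration.

```dockerfile
FROM ubuntu:24.04

# Build-time variable with a default value.
ARG APP_VERSION=dev

# ARG values can be used by instructions during the build.
RUN echo "building version ${APP_VERSION}" > /build-info.txt

# Copy the value into an ENV if it should also exist inside running containers.
ENV APP_VERSION=${APP_VERSION}

# Override the default at build time:
# docker build --build-arg APP_VERSION=1.2.3 -t my-image:v1 .
```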
EXPOSE
EXPOSE 80
EXPOSE is used to document which port your application inside the container listens on. If your web server runs on port 80, you declare that with EXPOSE 80. If Tomcat runs on 8080, you use EXPOSE 8080.
It is important to understand that EXPOSE alone does not actually publish the port to the outside world. It is more of documentation within the Dockerfile. To actually make the port accessible from outside, you need to use the -p flag when running the container. But EXPOSE is still important because it tells Docker and anyone reading the Dockerfile which port the application expects to use.
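The relationship between EXPOSE and publishing can be sketched like this. The host port 8080 is an arbitrary choice for the example.

```dockerfile
FROM ubuntu:24.04

# Documents that the application inside the container listens on port 80.
EXPOSE 80

# Publishing happens when the container is started:
# docker run -d -p 8080:80 <image>    host port 8080 forwards to container port 80
# docker run -d -P <image>            publishes every EXPOSEd port on a random host port
```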
Building Your First Dockerfile
Let us write an actual Dockerfile now. Open a terminal on your machine and create a file named Dockerfile.
vi Dockerfile
Inside the file, write the following:
FROM ubuntu
RUN apt update -y
RUN apt install git maven tree apache2 -y
RUN touch file1
Save and exit. Now let us build this into an image.
docker build -t rias:version1 .
Breaking this command down:
docker build is the command to create an image from a Dockerfile
-t rias:version1 tags the image with the name rias and the version version1
. is the build context, here the current directory, which is also where Docker looks for the Dockerfile by default
When you run this, Docker will read the Dockerfile and execute each instruction in sequence. You will see output flying by in your terminal as each step completes.
To verify the image was created:
docker images
You should see rias listed there with the tag version1.
Now create a container from this image:
docker run -it --name container1 rias:version1
Once you are inside the container, try:
git -v
Git is there, already installed. Why? Because it was installed during the image build process. When you created a container from that image, git came along with it because it was baked into the image itself.
Understanding Image Layers
When you ran docker build above, you might have noticed the output showing something like 1/4, 2/4, 3/4, 4/4. That is Docker telling you it is processing each instruction in order.
This is the concept of image layers. Instead of creating the entire image in one single shot, Docker builds it layer by layer. Instructions that change the filesystem, such as RUN, COPY, and ADD, each create a new layer on top of the previous one, while instructions such as ENV, LABEL, and CMD only record metadata.
This design has a practical benefit. If you modify your Dockerfile and rebuild, Docker reuses the cached layers up to the first instruction that changed, then re-executes that instruction and everything after it. That is why rebuilding an image after a small change near the end of the file is much faster than the first build.
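Because of this caching behavior, the order of instructions matters. A rough sketch, assuming an index.html file in the build directory: put the slow steps that rarely change first, and the files you edit often last.

```dockerfile
FROM ubuntu:24.04

# Slow step that rarely changes: stays cached across rebuilds.
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y apache2

# Frequently edited file: placed last, so editing index.html only
# invalidates this step, not the package installation above.
COPY index.html /var/www/html/
```

If the COPY came before the RUN, every edit to index.html would force the packages to be installed again.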
RUN vs CMD: A Distinction That Matters
This is one of the most important things to understand about Dockerfiles, so let us spend time on it with a proper example.
Write a new Dockerfile like this:
FROM ubuntu
RUN apt update -y
RUN apt install git maven apache2 -y
RUN touch file1
RUN apt install python3 -y
CMD apt install mysql-server -y
Build it:
docker build -t rias:version2 .
When Docker processes this file, everything under RUN gets executed during the build. The image ends up with git, maven, apache2, python3, and that empty file all baked in.
But the CMD line? That does not run during the build. Nothing about MySQL appears in the build output.
Now create a container from this image:
docker run -it --name container2 rias:version2
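The moment the container starts, the CMD fires. MySQL server installation begins right then. You are watching it happen at container startup time, not at image build time. Notice two things. First, once the installation finishes, the command is done, so the container exits instead of dropping you into a shell. Second, MySQL is installed only inside that one container, not in the image. That is why installing software belongs in RUN, and this CMD is here purely to demonstrate the timing.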
So to summarize this cleanly:
RUN runs during image creation (build time). Use it to install packages, run setup scripts, create files, anything that should be permanently baked into the image.
CMD runs at container creation (startup time). Use it to start services, kick off your application, or run any process that should begin when the container comes alive.
The reason this separation exists is practical. Imagine you are building an image for a web server. You use RUN to install Apache into the image. But you do not want Apache running while you are building the image. You want Apache to start when someone actually creates and runs a container from that image. So you use CMD to start the Apache service.
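One way to see the two moments side by side is to record a timestamp with RUN and print another one with CMD. This is only a demonstration sketch, and the file name /build-time.txt is made up for it.

```dockerfile
FROM ubuntu:24.04

# Executed once, while the image is being built. The result is stored in the image.
RUN date > /build-time.txt

# Executed every time a container starts from the image.
CMD ["sh", "-c", "echo built at: $(cat /build-time.txt); echo started at: $(date)"]
```

Every container started from this image reports the same build time but its own start time.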
Working with COPY and ADD
Let us look at how to get files into your image.
First, create a sample local file to work with:
echo "<h1>Hello from Docker</h1>" > index.html
Now write a Dockerfile that copies this file and also pulls something from the internet:
FROM ubuntu
COPY index.html /tmp/
ADD https://dlcdn.apache.org/tomcat/tomcat-10/v10.1.39/bin/apache-tomcat-10.1.39.tar.gz /tmp/
Build it:
docker build -t rias:version3 .
Create a container and check:
docker run -it --name container3 rias:version3
cd /tmp
ls
You will find both the index.html you copied from your local machine and the Tomcat archive that was downloaded from the internet.
That is the practical difference between COPY and ADD in action. COPY works with files already on your local system. ADD can reach out to a URL and fetch the file during the build process.
WORKDIR and LABEL in Practice
Let us now write a Dockerfile that demonstrates WORKDIR and LABEL:
FROM ubuntu
WORKDIR /myapp
COPY index.html .
LABEL author="Alex"
Build it:
docker build -t rias:version4 .
Create a container:
docker run -it --name container4 rias:version4
The moment you enter the container, notice which directory you land in. You are already in /myapp. That is WORKDIR at work. It set the default directory for the container.
If you run ls, you will see index.html there because we copied it to ., which resolves to /myapp thanks to WORKDIR.
Now exit the container and check the label:
docker inspect container4
Scroll through the output and look for the Labels section. You will see "author": "Alex" listed there. That is the metadata we attached to the image using LABEL.
Environment Variables with ENV
Write this Dockerfile:
FROM ubuntu
ENV COURSE="DevOps"
ENV TRAINER="Alex"
Build it:
docker build -t rias:version5 .
Create a container:
docker run -it --name container5 rias:version5
Inside the container:
echo $COURSE
echo $TRAINER
You get DevOps and Alex printed back. These environment variables were set at image build time and are available in every container created from this image.
In real-world applications, this is how you pass configuration into containers. You might set database hostnames, port numbers, application modes, or API endpoints using ENV. It keeps your application flexible without hardcoding values directly into your code.
Real-World Example: Deploying a Website with Apache
Now let us pull everything together into a proper real-world example.
The goal here is to take a web application that lives in a GitHub repository, copy it into a Docker image with Apache installed, and serve it from a running container.
Step 1: Clone the repository
First, install git on your host machine if you have not already, and clone the repository:
apt install git -y
git clone https://github.com/your-username/website.git
You now have a website directory with index.html and supporting files inside it.
Step 2: Write the Dockerfile
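Stay in the directory that contains the website folder (do not cd into it), because the COPY instruction below refers to website/ relative to the directory you build from. Create the Dockerfile there: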
FROM ubuntu
RUN apt update -y
RUN apt install apache2 -y
RUN apt install apache2-utils -y
COPY website/ /var/www/html/
RUN service apache2 restart
EXPOSE 80
CMD ["/usr/sbin/apachectl", "-D", "FOREGROUND"]
Let us walk through what each line does here.
FROM ubuntu gives us a clean Ubuntu base.
The two RUN install commands set up Apache and its utility tools. Apache itself (apache2) is the web server. apache2-utils provides command-line helpers such as htpasswd and ab.
COPY website/ /var/www/html/ takes all the code from the website folder on your local machine and places it into /var/www/html/ inside the image, which is exactly where Apache looks for web files to serve.
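RUN service apache2 restart restarts Apache during image creation. Keep in mind that a process started by RUN lives only for that build step and is not running in containers created from the image, so this line has no lasting effect and can safely be removed. Apache is actually started by the CMD line at the end.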
EXPOSE 80 documents that this container will use port 80.
The final CMD line is important. It starts Apache in foreground mode. This is a critical concept in Docker containers. A container stays alive only as long as its main process is running. If you start Apache in background mode (as a daemon), the command that launched it returns immediately, Docker sees the container's main process exit, and the container stops. By running Apache with -D FOREGROUND, we keep the Apache process running in the foreground, which keeps the container alive as long as Apache is running.
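For comparison, here is a slightly tightened variant of the same Dockerfile. It is a sketch, not a replacement for the version above: the pinned tag and the merged RUN are assumptions made for illustration, and the build-time restart is left out because the CMD is what starts Apache.

```dockerfile
FROM ubuntu:24.04

# Update the index and install both packages in a single layer.
RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y apache2 apache2-utils \
    && rm -rf /var/lib/apt/lists/*

# Place the site where Apache serves files from.
COPY website/ /var/www/html/

EXPOSE 80

# Keep Apache in the foreground so the container stays alive.
CMD ["/usr/sbin/apachectl", "-D", "FOREGROUND"]
```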
Step 3: Build the image
docker build -t first-project:version1 .
Watch the output. You will see Docker working through each step, installing Apache, copying your website files, and finally packaging it all into an image.
Step 4: Create the container with port mapping
docker run -itd --name web-container1 -p 80:80 first-project:version1
Let us unpack the -p 80:80 part.
When you write -p 80:80, the format is <host-port>:<container-port>.
Your container has Apache running on port 80 inside it. But a container is isolated. You cannot reach it from your browser directly. Your EC2 instance (the host machine) sits in between. So you map the host machine's port 80 to the container's port 80. Now when someone hits your EC2 IP on port 80, that traffic routes into the container and reaches Apache.
After the container starts, open a browser and navigate to your EC2 instance's public IP address:
http://<your-ec2-ip>
Your web application loads, served by Apache running inside a Docker container.
This is the fundamental shift from traditional deployment. Previously, you installed Apache directly on the EC2 instance and deployed your application there. Now, Apache and your application both live inside a container, and the EC2 instance simply runs Docker.
Deploying Tomcat with a Jenkins WAR File
Let us take this one step further with another real-world scenario. This time, instead of starting from a blank Ubuntu image, we will use an existing official Tomcat image that already has Tomcat installed.
Step 1: Download the Jenkins WAR file
wget https://get.jenkins.io/war-stable/latest/jenkins.war
You now have jenkins.war on your local machine.
Step 2: Write the Dockerfile
FROM tomcat:latest
ENV CATALINA_HOME=/usr/local/tomcat
RUN rm -rf /usr/local/tomcat/webapps/*
COPY jenkins.war /usr/local/tomcat/webapps/
EXPOSE 8080
CMD ["catalina.sh", "run"]
Let us go through this carefully.
FROM tomcat:latest pulls the official Tomcat image from Docker Hub. This image already has Java and Tomcat installed. We do not need to install them ourselves. This is the power of using specialized base images. You skip all the setup work that someone else has already done.
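ENV CATALINA_HOME=/usr/local/tomcat sets the environment variable pointing to Tomcat's home directory. Catalina is the name of Tomcat's servlet engine, and /usr/local/tomcat is where it lives inside the official Tomcat image. The official image already sets this variable to the same value, so the line is here mainly to make the path explicit.
RUN rm -rf /usr/local/tomcat/webapps/* removes any web applications already present in Tomcat's webapps directory. Recent official images typically ship with this directory empty (the sample applications are kept in webapps.dist), so on those images the line is just a safeguard. Either way, we want a clean slate before deploying our own application.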
COPY jenkins.war /usr/local/tomcat/webapps/ copies the Jenkins WAR file we downloaded locally into Tomcat's webapps directory. Tomcat automatically picks up any WAR file placed in that directory and deploys it.
EXPOSE 8080 documents that Tomcat runs on port 8080.
CMD ["catalina.sh", "run"] starts Tomcat in the foreground using the Catalina startup script, keeping the container alive.
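Since the base image already does much of the work, the same deployment can be sketched with fewer lines. This assumes the official tomcat image continues to define CATALINA_HOME, expose port 8080, and use catalina.sh run as its default command, so verify that against the image version you pull.

```dockerfile
# Relies on defaults inherited from the official image (assumption):
# CATALINA_HOME=/usr/local/tomcat, EXPOSE 8080, CMD ["catalina.sh", "run"].
FROM tomcat:latest

# Tomcat deploys any WAR file placed in webapps; jenkins.war is served at /jenkins.
COPY jenkins.war /usr/local/tomcat/webapps/
```

The longer version above is still useful for teaching purposes, because it spells out every setting instead of inheriting it.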
Step 3: Build the image
docker build -t tomcat-image:version1 .
Step 4: Run the container
docker run -itd --name tomcat-container1 -p 8080:8080 tomcat-image:version1
Here we map the host's port 8080 to the container's port 8080. After the container starts, open your browser and go to:
http://<your-ec2-ip>:8080/jenkins
Jenkins should load up, running inside Tomcat, inside a Docker container, on your EC2 instance.
Quick Reference: When to Use Each Component
Here is a clean summary of all the Dockerfile components we covered and when to reach for each one.
FROM - The first instruction (only comments, parser directives, and ARG may precede it). Declares the base image.
RUN - Executes shell commands during image build time. Use for installing packages, creating files, running setup scripts.
CMD - Runs when a container starts from the image. Use for starting services and applications.
ENTRYPOINT - Sets the executable the container always runs. When CMD is also present, CMD supplies its default arguments. Used when the container should behave like a standalone executable.
COPY - Copies files from your local machine into the image. Use for application code and local configuration files.
ADD - Like COPY, but can also download files from URLs and extract local tar archives. Use it when you need one of those features.
WORKDIR - Sets the default working directory inside the container for subsequent instructions.
LABEL - Adds metadata to the image such as author name, version, or description.
ENV - Sets environment variables that are available during build and at runtime inside containers.
ARG - Defines variables available only during the build process.
EXPOSE - Documents which port the container's application listens on.
Summary
What started as a simple question, "what is a Dockerfile?", has taken us all the way through seven practical Dockerfile examples, including two real-world deployments.
The key ideas to walk away with are these.
A Dockerfile automates image creation. You write the instructions once, and Docker repeats them exactly every time. This makes your infrastructure reproducible and shareable.
RUN and CMD are not the same thing. RUN bakes things into the image at build time. CMD fires when a container starts from that image. Getting this distinction right is what separates a working Dockerfile from a broken one.
Port mapping with -p host-port:container-port is how you expose your containerized application to the outside world. The container's internal port and the host's external port do not have to match, but they need to be explicitly mapped.
Using specialized base images like tomcat:latest saves you from writing a lot of boilerplate setup. Start with the right base image and your Dockerfile stays small and focused.
And finally, running your main service in foreground mode inside a container is what keeps the container alive. If your process exits, your container exits with it.
These fundamentals apply to virtually every Dockerfile you will write from here on out. The specific tools and applications will change, but the logic stays the same. Master these building blocks and Dockerfile writing becomes a natural part of how you think about deploying applications.






