Day 12 || Docker Multistage Builds: Minimizing Image Size and Maximizing Efficiency

Introduction

Docker is a powerful tool for containerizing and deploying applications, but efficient image management is crucial, especially in production environments. Multistage Docker builds offer a solution to optimize image sizes and improve the overall efficiency of containerized applications.

Multistage Builds

Multistage builds, a feature introduced in Docker 17.05, allow developers to define multiple build stages within a single Dockerfile. Each stage begins with its own FROM instruction, can use a different base image, and performs a specific task, such as building, compiling, or packaging code. The key advantage is that you can separate development and production dependencies: only the files you explicitly copy out of an earlier stage end up in the final image, resulting in smaller and more efficient production images.
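To make the mechanism concrete, here is a minimal sketch of two stages in one Dockerfile. The `alpine:3.19` base image, the stage name `build`, and the file `/artifact.txt` are illustrative assumptions, not part of the example application discussed later.

```dockerfile
# Stage 1: named "build"; anything created here stays in this stage
FROM alpine:3.19 AS build
RUN echo "hello from the build stage" > /artifact.txt

# Stage 2: starts again from a clean base image
FROM alpine:3.19
# Only this one file is carried over from the "build" stage
COPY --from=build /artifact.txt /artifact.txt
CMD ["cat", "/artifact.txt"]
```

The last FROM in the file defines the image that is tagged by default; earlier stages serve only as sources for `COPY --from`.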

The Problem with Single-Stage Builds

In a traditional single-stage Docker build, all the development dependencies, build tools, and source code are included in the final image. While this approach is suitable for development environments, it creates larger and less efficient images for production use. These images may contain unnecessary libraries, debuggers, and development artifacts, increasing the image size and resource utilization.

The Multistage Solution

Multistage builds address the shortcomings of single-stage builds by dividing the build process into distinct stages. Let's consider a Python web application as an example:

Stage 1 (Build Stage):

  • Use a base image with development tools and dependencies.

  • Set up a working directory and copy the necessary files, including requirements.txt.

  • Create a virtual environment and install dependencies.

  • Copy the application code and build any assets.

Stage 2 (Production Stage):

  • Switch to a smaller production-ready base image.

  • Set up the working directory and copy only essential files, including the virtual environment and application code, from the build stage.

  • Expose the necessary ports.

  • Define the command to run the application in a production environment.

Benefits of Multistage Builds

  1. Smaller Image Sizes: By separating development and production dependencies, the production image remains lean and optimized, reducing storage space and download times.

  2. Enhanced Security: Removing development tools and dependencies from the production image minimizes potential attack vectors, enhancing the overall security of the containerized application.

  3. Efficient Resource Usage: Smaller image sizes lead to more efficient use of system resources during deployment, scaling, and orchestration.

  4. Faster Deployment: Smaller images are quicker to deploy and require less bandwidth, improving application deployment times.

The Problem

Imagine you're developing a Python web application using Flask, and you want to containerize it for deployment. You've written your code and created a Dockerfile, but there's a concern: the Docker image contains unnecessary development tools and dependencies, making it larger than it needs to be.

Single-Stage Dockerfile

In a single-stage Dockerfile, you might have something like this:

# Use a Python base image
FROM python:3.8

# Set the working directory
WORKDIR /app

# Copy the application code into the container
COPY . .

# Install application dependencies
RUN pip install -r requirements.txt

# Expose the application's port
EXPOSE 5000

# Run the application
CMD ["python", "app.py"]

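In this single-stage approach, everything in the full python:3.8 base image (compilers, headers, and other build tools) stays in the final image, together with pip's download cache and any unused files picked up by COPY . . The result is a larger image than the application needs at runtime.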

Multistage Dockerfile to the Rescue

Now, let's optimize the Docker image using a multistage approach:

# Stage 1: Build the application
FROM python:3.8 AS builder

# Set the working directory
WORKDIR /app

# Copy only the necessary files for installing dependencies
COPY requirements.txt .

# Install application dependencies in a virtual environment
RUN python -m venv venv
RUN . venv/bin/activate && pip install -r requirements.txt

# Copy the entire application code
COPY . .

# Stage 2: Create the production image
FROM python:3.8-slim

# Set the working directory
WORKDIR /app

# Copy the application code and its virtual environment from the builder stage
# (/app already contains /app/venv, so a single COPY avoids a duplicate layer)
COPY --from=builder /app /app

# Expose the application's port
EXPOSE 5000

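# Run the application with the virtual environment's interpreter,
# so the dependencies installed in the builder stage are importable
CMD ["/app/venv/bin/python", "app.py"]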

In this multistage approach:

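  • Stage 1 (Build Stage): We start from the full Python base image, which includes build tools. requirements.txt is copied first so that the dependency-installation layer can be cached and reused when only the application code changes. We then create a virtual environment, install the required dependencies into it, and copy in the application code. This stage prepares everything the application needs for production.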

  • Stage 2 (Production Stage): We switch to a smaller base image that's optimized for production. We copy the virtual environment and application code from the builder stage. The resulting image contains only the production essentials.
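One possible refinement of the same two stages, offered as a sketch rather than the definitive setup: skip pip's download cache, put the virtual environment on PATH so that plain `python` resolves to it, and run as an unprivileged user. The user name `appuser` is a hypothetical choice; the paths and port follow the example above.

```dockerfile
# Stage 1: install dependencies into a virtual environment
FROM python:3.8 AS builder
WORKDIR /app
COPY requirements.txt .
# --no-cache-dir keeps pip's download cache out of the layer
RUN python -m venv /app/venv \
    && /app/venv/bin/pip install --no-cache-dir -r requirements.txt
COPY . .

# Stage 2: slim runtime image
FROM python:3.8-slim
WORKDIR /app
# Hypothetical unprivileged user; the name "appuser" is an assumption
RUN useradd --create-home appuser
# Copy code and virtual environment in one layer, owned by that user
COPY --from=builder --chown=appuser:appuser /app /app
# Put the virtual environment first on PATH so "python" resolves to it
ENV PATH="/app/venv/bin:$PATH"
USER appuser
EXPOSE 5000
CMD ["python", "app.py"]
```

Because both stages use the same Python version and the same /app/venv path, the copied virtual environment keeps working in the slim image. A virtual environment moved to a different path or Python version generally would not.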

Did you find this article valuable?

Support Aqib Hafeez(DevOps enthusiast) by becoming a sponsor. Any amount is appreciated!