Infrastructure
Docker & Containers
Containers solve a fundamental problem: "It works on my machine but not in production." Docker has become the standard way to package applications so they run consistently everywhere.
What Is a Container?
A container is an isolated process running on your machine with its own filesystem, its own environment variables, its own ports, its own "world." It's lighter weight than a virtual machine—it shares the kernel with the host operating system—but provides similar isolation.
Imagine containers as shipping containers for software. You pack your application with all its dependencies (Node.js, PostgreSQL libraries, fonts, configuration files) into a container. You ship that container to production. On the production server, Docker unpacks the container and runs it. The same container works on your laptop, in staging, in production.
This solves "works on my machine" in two ways. First, the container is identical everywhere. Second, your laptop and the production server both run Docker, so containers are executed the same way in each environment.
Images vs Containers
Images are blueprints. Containers are running instances. Think of an image as a class in programming, and a container as an instance of that class.
An image is a static file containing your application code, runtime environment, libraries, and configuration. You build an image once. Images are stored in registries (Docker Hub, AWS ECR, Google Container Registry). A container is what you get when you run an image. You can run the same image 10 times and get 10 containers.
Containers are ephemeral. You create them, they run, you destroy them. Data stored inside a container is lost when it stops. This is intentional—containers should be stateless. Persistent data lives in databases or external storage.
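When a container does need durable data—a database's data directory, for instance—Docker volumes store it outside the container's filesystem. A sketch (the volume name `pgdata` is arbitrary):

```shell
# A named volume is created on first use and survives container removal
docker run -d -v pgdata:/var/lib/postgresql/data postgres:15
```

Stopping or deleting the container leaves `pgdata` intact; `docker volume rm pgdata` deletes it.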
Dockerfile Basics
A Dockerfile is a recipe for building an image. Each line in a Dockerfile is an instruction:
# Start from official Node.js image
FROM node:18-alpine
# Set working directory inside container
WORKDIR /app
# Copy package.json from host into container
COPY package.json package-lock.json ./
# Install exact versions from the lockfile (reproducible installs)
RUN npm ci
# Copy application code
COPY . .
# Expose port 3000 (documentation, doesn't actually open port)
EXPOSE 3000
# Run application
CMD ["npm", "start"]
Key instructions:
- FROM: Base image to build on. Most applications start with an official runtime (node, python, java).
- WORKDIR: Directory inside the container where subsequent commands run.
- COPY: Copy files from host into the container.
- RUN: Execute a command during image build (installing dependencies).
- EXPOSE: Documents which ports the application uses (doesn't actually open them).
- CMD: Default command to run when the container starts.
Building the image:
docker build -t my-app:1.0 .
This builds an image named `my-app` with tag `1.0` from the Dockerfile in the current directory.
Running a container from the image:
docker run -p 3000:3000 my-app:1.0
This runs the image, mapping port 3000 from the container to port 3000 on your machine. Visit localhost:3000.
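In day-to-day use you typically run containers in the background and inspect them with a few companion commands (the container name `my-app` is arbitrary):

```shell
docker run -d --name my-app -p 3000:3000 my-app:1.0   # run detached
docker ps                                             # list running containers
docker logs -f my-app                                 # follow the app's output
docker exec -it my-app sh                             # open a shell inside it
docker stop my-app && docker rm my-app                # stop and remove
```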
Dockerfile Best Practices
Use alpine images. Alpine is a minimal Linux distribution. `node:18-alpine` is much smaller than `node:18`, resulting in faster builds and smaller images. (Alpine uses musl instead of glibc, which occasionally breaks native modules—test before switching.)
Minimize layers. Each instruction in a Dockerfile creates a layer. Combine related RUN commands: in the example below, deleting the apt package lists only reduces image size if it happens in the same layer as the install—a separate `RUN rm` merely hides files that earlier layers still contain.
# Bad: 3 layers
RUN apt-get update
RUN apt-get install -y curl
RUN rm -rf /var/lib/apt/lists/*
# Good: 1 layer
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
Order instructions for caching. Docker caches layers. If you change your application code, Docker rebuilds from that layer onward. Put stable instructions (base image, runtime) first and volatile instructions (COPY .) later.
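Applied to a Node project, cache-friendly ordering looks like this (a sketch; dependency files are copied and installed before the frequently changing source code):

```dockerfile
# Changes rarely: stays cached
FROM node:18-alpine
WORKDIR /app

# Changes only when dependencies change
COPY package.json package-lock.json ./
RUN npm ci

# Changes on every commit, so it goes last
COPY . .
```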
Use .dockerignore. Like .gitignore, this tells Docker not to copy certain files into the image. Exclude node_modules, .git, logs:
node_modules
.git
.env
logs
Multi-Stage Builds
Multi-stage builds allow you to build images in stages, discarding intermediate stages. This is useful for compiled languages and reduces final image size.
# Stage 1: Build
FROM node:18-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build
# Stage 2: Runtime
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
CMD ["node", "dist/index.js"]
This builds the application in stage 1, then copies only the compiled output and necessary dependencies to stage 2. The final image doesn't include build tools, test files, or source code, keeping it smaller.
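A useful side effect of named stages: `docker build --target` stops at a given stage, for example to run tests in the build environment where dev dependencies still exist (the tag name is illustrative):

```shell
# Build only the builder stage
docker build --target builder -t my-app:test .
# Run the test suite inside it, removing the container afterwards
docker run --rm my-app:test npm test
```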
Docker Compose for Local Development
Real applications have multiple services: web server, database, cache, background job queue. Running each in a separate container manually is tedious.
Docker Compose defines multiple services in a YAML file and runs them together:
version: '3.8'
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://user:password@db:5432/myapp
    depends_on:
      - db
  db:
    image: postgres:15
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: password
      POSTGRES_DB: myapp
Run with:
docker compose up
This starts both the app and the database on a shared network, sets the environment variables, and lets the app reach the database at the hostname `db` (the service name). Docker Compose is the standard way to develop multi-service applications locally.
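Day-to-day Compose usage is a handful of subcommands:

```shell
docker compose up -d                        # start everything in the background
docker compose logs -f app                  # follow one service's logs
docker compose exec db psql -U user myapp   # run a command inside a service
docker compose down                         # stop and remove the containers
docker compose down -v                      # ...and delete volumes (wipes the database)
```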
Container Registries
Once built, images are stored in registries. Docker Hub is the default public registry. You can push images:
docker tag my-app:1.0 myusername/my-app:1.0
docker push myusername/my-app:1.0
Now anyone can pull your image:
docker run myusername/my-app:1.0
Private registries (AWS ECR, Google Container Registry, Azure Container Registry) store images you don't want public. Most companies use private registries for proprietary applications.
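Pushing to a private registry is the same tag-and-push pattern with an authentication step first. A sketch for AWS ECR (the account ID and region are placeholders):

```shell
# Authenticate Docker against the registry (placeholder account/region)
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag with the registry's full hostname, then push
docker tag my-app:1.0 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.0
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:1.0
```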
Running Containers in Production
Simply running Docker on a server is possible but limited. If a container crashes, nothing restarts it unless you set a restart policy. If traffic increases, nothing scales you out to more containers. This is where orchestration tools come in.
For small projects, simple Docker on a VPS works fine. For larger projects, use container orchestration (Kubernetes, Docker Swarm) or serverless platforms that handle deployment for you.
Container Health Checks
Orchestration systems need to know if a container is healthy. Health checks define how to determine if a container is functioning:
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:3000/health || exit 1
This runs `curl -f http://localhost:3000/health` inside the container every 30 seconds. After 3 consecutive failures, the container is marked unhealthy so an orchestrator can restart or replace it. Note that the check command runs inside the container, so `curl` must exist in the image—minimal images like `node:18-alpine` ship BusyBox `wget` rather than curl, in which case use `wget -qO- http://localhost:3000/health` instead.
Restart Policies
When a container crashes, what should happen?
docker run --restart=always my-app:1.0
Options: `no` (the default—don't restart), `always` (always restart), `unless-stopped` (restart unless explicitly stopped), `on-failure[:max-retries]` (restart only on a non-zero exit code, e.g. `--restart=on-failure:5`).
Containers vs Virtual Machines
| | Containers | VMs |
|---|---|---|
| Size | Small (tens of MB typically) | Large (gigabytes) |
| Startup time | Seconds | Minutes |
| Isolation | Process-level isolation | Full OS isolation |
| Resource efficiency | Efficient, lightweight | Resource-hungry |
| Use case | Microservices, modern applications | When full OS isolation needed |
When Containers Are Overkill
Containers are powerful but add complexity. For simple applications, they might be unnecessary:
- Single-developer projects with simple deployment
- Fully managed platforms (Vercel, Render) that handle deployment
- Applications that don't change frequently
- Teams without operations experience
If your deployment process is "push button, done," containers might not add value. If deployment is complex (multiple services, custom configurations), containers shine.
Common Docker Mistakes
Running as root. By default, processes inside a container run as root. This is a security risk. Run as a non-root user whenever possible:
RUN useradd -m appuser
USER appuser
(On Alpine-based images, which use BusyBox, the equivalent is `RUN adduser -D appuser`.)
Storing state in containers. Container data is ephemeral. If your app needs persistent data (logs, uploads), use volumes or external storage.
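For example, a named volume can be mounted over the directory where uploads are written (the volume name and path are illustrative):

```shell
# Files written to /app/uploads now survive container replacement
docker run -d -v uploads:/app/uploads my-app:1.0
```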
Ignoring the build context. `docker build .` sends the entire directory to the Docker daemon before building. Large files slow every build, and sensitive files can leak into images. Exclude both with .dockerignore.
Docker in CI Pipelines
Continuous integration systems often build and test Docker images automatically. Push code → GitHub Actions builds Docker image → tests run in container → image pushed to registry → deployed to production. This ensures every deployment uses a tested, versioned image.
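A minimal sketch of such a pipeline as a GitHub Actions workflow (the file path, job name, and tags are illustrative; pushing to a registry would additionally need credentials stored as secrets):

```yaml
# .github/workflows/docker.yml
name: docker-build
on: push
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Build the image tagged with the commit SHA
      - run: docker build -t my-app:${{ github.sha }} .
      # Run the test suite inside the freshly built image
      - run: docker run --rm my-app:${{ github.sha }} npm test
```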
The Reality
Docker has become the standard in professional development. Most companies use containers in some form. Understanding containers is now a basic requirement for full-stack developers.
Don't over-engineer early. If you're building something simple, a managed platform might be faster than containers. But as applications grow, containerization becomes invaluable for consistency and reproducibility.