Dockerfile & Image Layers
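Each Dockerfile instruction that changes the filesystem (RUN, COPY, ADD) creates a read-only layer. On Linux, Docker typically stacks these layers with overlayfs (the overlay2 storage driver). An image is the ordered stack of all its layers plus its configuration metadata.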
Layers are like stacking transparent sheets. Each RUN = a new sheet. You can draw on a new sheet but never erase a previous one. 'Deleting' just puts a sticky note saying 'ignore this' — the old drawing is still there taking up space.
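Instructions that produce filesystem layers: RUN, COPY, ADD. Instructions that only change image metadata and add no filesystem content: ENV, ARG, LABEL, EXPOSE, CMD, ENTRYPOINT. FROM adds no layer of its own; it brings in the base image's layers. WORKDIR is usually metadata-only, but it can add a small layer when it has to create the directory. Layers are content-addressed (SHA256 digest of the layer content). Layers shared between images are stored once on disk and pulled once. At runtime the container adds a thin writable layer on top (copy-on-write for any writes).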
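To make the split above concrete, here is a minimal annotated sketch. The base image tag, the label value, the paths, and the file app.sh are assumptions chosen for illustration, not taken from any real project.

```dockerfile
# Base image: brings in its own layers, adds no new one.
FROM debian:bookworm-slim

# Metadata only: recorded in the image config, no filesystem content.
ENV APP_HOME=/opt/app
LABEL org.opencontainers.image.title="layer-demo"

# Filesystem layer: everything this command writes ends up in the layer.
RUN mkdir -p /opt/app/data

# Filesystem layer: contains the copied file (app.sh is a hypothetical script).
COPY app.sh /opt/app/app.sh

# Metadata only.
EXPOSE 8080
CMD ["/opt/app/app.sh"]
```

After building, docker history <image> lists one row per instruction; the metadata-only rows should show a size of 0B, while the RUN and COPY rows show the bytes they added.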
overlayfs works with three directories: lowerdir (the read-only image layers), upperdir (the writable container layer), and workdir (scratch space overlayfs uses internally). Reading a file: check upperdir first, then fall through the lowerdir layers from top to bottom. Writing a file: copy it up from lowerdir to upperdir (copy-on-write), then modify the copy. Deleting a file: a whiteout entry is created in upperdir that hides the file below. Layer size optimization: if you install packages in one RUN and delete them in a second RUN, the deletion adds another layer while the original bytes remain in the layer below, so the total image size doesn't shrink. Either combine both steps into one RUN or use a multi-stage build. COPY vs ADD: ADD also auto-extracts local tar archives and accepts remote URLs; prefer COPY unless you need that behavior, because it is more explicit.
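The multi-stage option mentioned above can be sketched as follows. This is a minimal example under stated assumptions: hello.c is a hypothetical source file in the build context, and the base image tag and output paths are illustrative.

```dockerfile
# Stage 1 ("build"): the compiler and sources exist only in this stage.
FROM debian:bookworm-slim AS build
RUN apt-get update \
 && apt-get install -y --no-install-recommends gcc libc6-dev \
 && rm -rf /var/lib/apt/lists/*
WORKDIR /src
# hello.c is a hypothetical source file.
COPY hello.c .
RUN mkdir -p /out && gcc -o /out/hello hello.c

# Stage 2: a fresh base; only the files copied in become part of the final image.
FROM debian:bookworm-slim
COPY --from=build /out/hello /usr/local/bin/hello
CMD ["/usr/local/bin/hello"]
```

The layers of the build stage (compiler, headers, sources) are not part of the final image at all, so there is nothing to delete afterwards.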
Every RUN, COPY, and ADD creates an immutable layer identified by its SHA256 digest. Layers are reused across images and cached during builds. The container adds a writable layer on top using overlayfs copy-on-write. The critical implication: running apt-get install in one RUN and apt-get clean in the next does NOT reduce image size, because the installed bytes live in layer N and the deletion is only a whiteout in layer N+1. Combine them: RUN apt-get update && apt-get install -y pkg && rm -rf /var/lib/apt/lists/*
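A side-by-side sketch of the anti-pattern and the fix described above. The package curl stands in for the placeholder pkg, and the base image tag is an assumption; the anti-pattern is left commented out so the file still builds.

```dockerfile
FROM debian:bookworm-slim

# Anti-pattern (commented out): three layers. The package index written to
# /var/lib/apt/lists by the first RUN stays in that layer; the third RUN
# only hides it with whiteouts.
# RUN apt-get update
# RUN apt-get install -y curl
# RUN apt-get clean && rm -rf /var/lib/apt/lists/*

# Preferred: install and clean up in the same RUN, so the removed files
# are never committed to any layer.
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl \
 && rm -rf /var/lib/apt/lists/*
```

Building both variants and comparing them with docker history <image> is one way to see the difference: in the split version the cleanup row adds almost nothing while the earlier rows keep their full size.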
apt-get clean in a separate RUN layer does NOT shrink the image. Layers are immutable and additive — deletion only adds a whiteout marker. Use single RUN instructions or multi-stage builds to keep images small.