Docker Cleanup Automation — Design

Problem

VPS2 (dev server, 38 GB disk) fills up quickly: 30 GB (79%) is already in use. Main consumers:

Component               Size      Reclaimable
Docker Images           13.8 GB   10.1 GB (73%)
Docker Build Cache      4.9 GB    4.4 GB
GitLab Runner builds    1.3 GB    ~1 GB
Hourly backups (×26)    2.6 GB    Not worth touching

Root cause: the only cleanup CI runs is docker image prune -f, which removes dangling images only. Build cache, old prod tags, and runner workspaces are never cleaned.
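
The figures above can be re-checked on the server with a few standard commands. The following is a minimal sketch, assuming a Linux host with the Docker CLI installed; the helper names disk_used_pct and report_usage are illustrative and not part of the existing setup.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the disk-usage numbers (helper names are hypothetical).

# Print the used percentage of the filesystem holding the given path
# as a bare integer (e.g. 79).
disk_used_pct() {
  # -P forces one output line per filesystem, so column 5 is always "Capacity".
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print Docker's own breakdown plus the size of the runner workspaces.
report_usage() {
  docker system df                               # images, containers, volumes, build cache
  du -sh /home/gitlab-runner/builds 2>/dev/null  # runner workspace path from this document
}
```

Running disk_used_pct / before and after a cleanup gives a quick way to confirm how much space was actually reclaimed.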

Solution: CI Post-Deploy Cleanup

Approach

Add a cleanup stage to .gitlab-ci.yml that runs after deploy (and, on develop, after the E2E tests).

develop pipeline cleanup (cleanup:dev)

Runs after the test:e2e job:

bash
# 1. Prune build cache older than 72h
docker builder prune -a -f --filter until=72h

# 2. Prune dangling images (already in deploy:dev, keep for safety)
docker image prune -f

# 3. Clean runner workspace artifacts (root-owned from Docker)
rm -rf /home/gitlab-runner/builds/*/0/*/partizap-frontend/.nuxt
rm -rf /home/gitlab-runner/builds/*/0/*/partizap-frontend/node_modules
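
The workspace step can also be wrapped in a small function so it can be tried against a scratch directory before it touches the real runner tree. This is only a sketch: the function name clean_workspace is hypothetical, and it assumes the job's user has permission to delete the root-owned files (otherwise rm fails on them).

```shell
#!/usr/bin/env bash
# Sketch: same targets as the rm -rf lines above, with the builds root as a
# parameter. clean_workspace is a hypothetical helper name.
clean_workspace() {
  local root="${1:-/home/gitlab-runner/builds}"
  local dir
  for dir in "$root"/*/0/*/partizap-frontend/.nuxt \
             "$root"/*/0/*/partizap-frontend/node_modules; do
    # An unmatched glob stays a literal string; skip anything that is not a directory.
    [ -d "$dir" ] || continue
    rm -rf -- "$dir"
  done
}
```

Called without an argument it targets the runner path from this document; called with a temporary directory it can be exercised safely.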

prod pipeline cleanup (cleanup:prod)

Runs after deploy:prod:

bash
# 1. Prune build cache older than 72h
docker builder prune -a -f --filter until=72h

# 2. Keep only 5 latest prod version tags
docker images partizap-frontend-prod --format '{{.Tag}}' \
  | grep '^v' | sort -V | head -n -5 \
  | xargs -r -I{} docker rmi partizap-frontend-prod:{}

# 3. Prune dangling images
docker image prune -f
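
The tag-retention pipeline in step 2 relies on sort -V (version sort) and on head -n -N, which drops the last N lines; both are GNU coreutils features. The selection logic can be isolated and checked without Docker, as in this sketch (the function name tags_to_remove is hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: print the version tags that fall outside the newest KEEP versions.
# Reads one tag per line on stdin; tags not starting with "v" (e.g. "latest")
# are ignored. Requires GNU sort and GNU head.
tags_to_remove() {
  local keep="${1:-5}"
  grep '^v' | sort -V | head -n "-${keep}"
}
```

For a dry run, the output can be piped into xargs -r -I{} echo docker rmi partizap-frontend-prod:{} so the commands are printed instead of executed. With fewer than KEEP version tags present, nothing is printed and nothing would be removed.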

What we DON'T clean

  • Playwright image (3.38 GB) — cached intentionally for E2E speed
  • YouTrack image (5.22 GB) — active container
  • node:22-alpine (231 MB) — base CI image
  • Hourly backups (2.6 GB) — rsync hardlinks, incremental cost is low

Expected impact

  • Immediate: ~5-6 GB freed (build cache + old tags)
  • Ongoing: near-zero accumulation (cleanup runs on every develop pipeline and every prod release tag)

CI structure change

yaml
stages:
  - validate
  - release
  - build
  - deploy
  - test-e2e
  - cleanup     # NEW

cleanup:dev:
  stage: cleanup
  tags: [partizap-shell]
  needs: [test:e2e]
  script: ...
  allow_failure: true   # a failed cleanup must not block the pipeline
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

cleanup:prod:
  stage: cleanup
  tags: [partizap-shell]
  needs: [deploy:prod]
  script: ...
  allow_failure: true   # a failed cleanup must not block the pipeline
  rules:
    - if: $CI_COMMIT_TAG =~ /^v/

Risk assessment

  • docker builder prune -a -f --filter until=72h — safe, only removes cache older than 3 days
  • Prod tag cleanup keeps 5 versions — sufficient rollback window
  • Runner workspace cleanup only targets .nuxt and node_modules — recreated on every CI run
  • allow_failure: true on cleanup jobs — failed cleanup shouldn't block pipeline
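
Within a single job, the same "never block the pipeline" idea can be applied per step, so one failing command does not prevent the remaining ones from running. A minimal sketch, assuming a bash shell executor; run_step and the DRY_RUN variable are hypothetical names, not existing CI settings:

```shell
#!/usr/bin/env bash
# Sketch: run one cleanup command, log a failure, and always return success.
# With DRY_RUN=1 the command is printed instead of executed.
DRY_RUN="${DRY_RUN:-0}"

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    printf 'would run: %s\n' "$*"
    return 0
  fi
  # On failure, report to stderr but do not propagate the exit status.
  "$@" || printf 'cleanup step failed (ignored): %s\n' "$*" >&2
  return 0
}
```

For example, DRY_RUN=1 run_step docker builder prune -a -f --filter until=72h prints the command without touching the build cache, which is a cheap way to review a cleanup job before enabling it.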