DaemonSet, Job & CronJob
DaemonSet runs one pod per node. Job runs a task to completion. CronJob runs a Job on a schedule.
DaemonSet = placing one security guard on every floor of a building (one per node). Job = hiring a temp worker to move boxes, who leaves once the work is done. CronJob = the cleaning crew that shows up every Tuesday at 9 PM.
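DaemonSet: exactly one pod on every node (or on every node that matches its node selector or affinity). Use for: log collectors (fluentd, filebeat), monitoring agents (node-exporter), CNI plugins, storage drivers. Pods are added automatically when new nodes join the cluster. Job: runs pods until the desired number of completions succeed; completions sets how many pods must succeed, parallelism sets how many may run at once. Useful for batch processing, database migrations. CronJob: creates Jobs on a cron schedule. concurrencyPolicy decides what happens when the previous run hasn't finished: Allow (default) starts the new Job anyway, Forbid skips the new run, Replace cancels the running Job and starts the new one.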
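The DaemonSet description above can be sketched as a manifest. This is a minimal illustration, not a production configuration: the name, namespace, labels, and image tag are assumptions chosen for the example, and the toleration is only needed if the agent should also run on control-plane nodes.

```yaml
# Hypothetical example: name, namespace, labels, and image tag are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter          # must match the pod template labels below
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      # Restrict the DaemonSet to matching nodes only (here: Linux nodes).
      nodeSelector:
        kubernetes.io/os: linux
      # Allow the pod onto control-plane nodes, which are usually tainted.
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
      containers:
        - name: node-exporter
          image: quay.io/prometheus/node-exporter:v1.8.2   # illustrative tag
          ports:
            - containerPort: 9100
```

There is no replicas field: the number of pods follows the number of matching nodes.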
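DaemonSet pods do not bypass the scheduler. In current Kubernetes versions the default scheduler places them; the DaemonSet controller pins each pod to its target node with a node affinity and automatically adds tolerations (for example for not-ready and unschedulable nodes), so the pods can land on nodes that reject ordinary workloads. They still compete for node capacity like any other pod. DaemonSet update strategies: RollingUpdate (default) and OnDelete (a pod picks up the new template only when you manually delete it). Job failure policy: backoffLimit (retry count, default 6), activeDeadlineSeconds (timeout for the whole Job, which takes precedence over backoffLimit). Job completionMode: NonIndexed (default) vs Indexed (each pod gets a completion index from 0 to completions-1, for sharded work). CronJob history: successfulJobsHistoryLimit and failedJobsHistoryLimit (defaults 3 and 1). CronJob startingDeadlineSeconds: how late a run is allowed to start; if a Job misses its scheduled time by more than this, that run is skipped and counted as missed.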
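The Job fields above fit together roughly as follows. This is a sketch under assumptions: the Job name, image, and command are hypothetical, and the numeric values are arbitrary. JOB_COMPLETION_INDEX is the environment variable Kubernetes sets in each pod of an Indexed Job.

```yaml
# Hypothetical example: name, image, command, and all numbers are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: shard-processor
spec:
  completions: 5              # five shards, indexes 0..4
  parallelism: 2              # at most two pods run at the same time
  completionMode: Indexed     # each pod receives its own completion index
  backoffLimit: 3             # give up after 3 retries
  activeDeadlineSeconds: 600  # fail the whole Job after 10 minutes
  template:
    spec:
      restartPolicy: Never    # Jobs allow only Never or OnFailure
      containers:
        - name: worker
          image: busybox:1.36
          command: ["sh", "-c", "echo processing shard $JOB_COMPLETION_INDEX"]
```

Each pod can use its index to pick a slice of the input, which is what makes Indexed mode suitable for sharded work.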
DaemonSet is for infrastructure agents: you need exactly one per node for things like log collection, metrics scraping, or CNI. Job is for finite tasks such as database migrations and report generation. CronJob wraps a Job template with a cron schedule. Key CronJob gotcha: if the control plane is down at a scheduled time, the missed run can still be started late once it recovers, as long as it is within startingDeadlineSeconds (with no deadline set, there is no cutoff). A late or long-running Job can then overlap with the next scheduled run unless you set concurrencyPolicy: Forbid.
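A CronJob that guards against the overlap described above might look like this. It is a minimal sketch: the name, image, command, and the 300-second deadline are assumptions for illustration, and the schedule mirrors the "every Tuesday at 9 PM" analogy.

```yaml
# Hypothetical example: name, image, command, and deadline are placeholders.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: weekly-report
spec:
  schedule: "0 21 * * 2"          # minute 0, hour 21, every Tuesday
  concurrencyPolicy: Forbid       # skip a run if the previous one is still active
  startingDeadlineSeconds: 300    # a run more than 5 minutes late is skipped
  successfulJobsHistoryLimit: 3   # default value, shown explicitly
  failedJobsHistoryLimit: 1       # default value, shown explicitly
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox:1.36
              command: ["sh", "-c", "echo generating report"]
```

Forbid trades completeness for safety: a run that would overlap is dropped rather than queued, so it suits tasks that must never run twice at once.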
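Finished Jobs and their pods are NOT deleted automatically. The pods remain in Completed state until the Job's TTL (ttlSecondsAfterFinished) expires, a CronJob history limit prunes the owning Job, or you clean up manually. Without cleanup, finished Jobs and pods accumulate, add load on the API server and etcd, and can exhaust object-count quotas.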
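TTL cleanup is a single field on the Job spec. The sketch below assumes a one-hour retention; the Job name, image, and command are hypothetical.

```yaml
# Hypothetical example: name, image, command, and TTL value are placeholders.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
spec:
  ttlSecondsAfterFinished: 3600   # delete the Job and its pods 1 hour after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.com/app/migrate:1.0.0   # hypothetical image
          command: ["./migrate", "up"]           # hypothetical command
```

Leave enough time to read logs before the TTL expires, since the pods are removed together with the Job.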