Controlling when, how, and in what order ArgoCD deploys your resources
The Story
The Midnight Incident
A junior developer runs kubectl edit on a production deployment, bumping replicas from 3 to 10. Nobody notices until the cloud bill arrives. With ArgoCD self-heal, the change would have been reverted within seconds. But the team also needs database migrations to run before app deployments, and old resources should be cleaned up automatically. Let us learn how ArgoCD handles all of this.
Manual vs Auto Sync
Manual Sync
ArgoCD detects drift and shows "OutOfSync" but waits for human approval before applying changes.
Good for production environments
Requires UI click or CLI command
Change review before deploy
Auto Sync
ArgoCD automatically applies changes when it detects drift between Git and the cluster.
Great for dev/staging environments
True GitOps -- push to Git, auto deploy
Combine with self-heal for full automation
Enabling Auto-Sync
spec:
  syncPolicy:
    automated:
      prune: false # Do NOT auto-delete removed resources (safe default)
      selfHeal: false # Do NOT auto-revert manual changes (safe default)
      allowEmpty: false # Fail if Git path produces zero resources
Careful: Enabling automated without prune and selfHeal means ArgoCD will auto-sync new commits from Git but will NOT delete removed resources and will NOT revert manual cluster changes. This is the safest starting point.
Self-Heal: ArgoCD Fights Back
Self-heal automatically reverts any manual changes made directly to the cluster.
Imagine someone runs kubectl scale deployment/api --replicas=10 on production. With self-heal enabled, ArgoCD detects the drift within seconds and scales it back to whatever Git says. The cluster always matches Git -- no exceptions.
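Tip: Self-heal does not wait for the 3-minute Git polling interval. The application controller watches live cluster resources, so drift is detected almost immediately. Repeated self-heal attempts are throttled by a timeout that defaults to 5 seconds (the --self-heal-timeout-seconds flag of the application controller).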
Prune: Automatic Cleanup
Prune automatically deletes Kubernetes resources that were removed from Git.
Without Prune
If you remove a Service from Git, the Service stays in the cluster as an orphan. ArgoCD shows the application as "OutOfSync" but does not delete the Service.
With Prune
If you remove a Service from Git, ArgoCD automatically deletes it from the cluster on the next sync. Clean and tidy.
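For reference, a minimal sketch of an Application with auto-sync, prune, and self-heal all switched on. The application name, repoURL, path, and namespaces are placeholders, not values from this module.

```yaml
# Sketch only: name, repoURL, path, and namespaces are hypothetical.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://example.com/org/repo.git
    targetRevision: HEAD
    path: manifests
  destination:
    server: https://kubernetes.default.svc
    namespace: my-app
  syncPolicy:
    automated:
      prune: true       # delete resources that were removed from Git
      selfHeal: true    # revert manual changes made in the cluster
      allowEmpty: false # refuse to sync when Git yields zero resources
```

Starting with prune and selfHeal set to false and enabling them one at a time is usually the gentler path for an existing cluster.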
Quiz Time
Sync Basics
1. What happens when auto-sync is enabled but prune is false, and you remove a Deployment from Git?
Correct! Without prune enabled, ArgoCD will not delete resources removed from Git. The resource stays in the cluster but may show as out of sync.
2. How quickly does ArgoCD's self-heal detect manual cluster changes?
Correct! Self-heal relies on Kubernetes API watches for near-instant detection (typically within seconds), unlike Git polling, which runs every 3 minutes by default.
3. What does allowEmpty: false protect against?
Correct! If the source path is empty (misconfigured path, broken Helm chart), and prune is enabled, ArgoCD would delete ALL existing resources. allowEmpty: false prevents this catastrophic scenario.
Sync Options
Fine-tune sync behavior with sync options. These can be set for the whole Application (under spec.syncPolicy.syncOptions) or per resource via annotations.
spec:
  syncPolicy:
    syncOptions:
      - CreateNamespace=true # Create ns if it doesn't exist
      - PrunePropagationPolicy=foreground # Wait for dependents
      - PruneLast=true # Prune after all creates/updates
      - Validate=true # Validate manifests before apply
      - ServerSideApply=true # Use server-side apply
      - ApplyOutOfSyncOnly=true # Only apply changed resources
      - RespectIgnoreDifferences=true # Skip ignored fields in sync
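RespectIgnoreDifferences only has an effect when the Application also declares ignoreDifferences. A small sketch, assuming a Deployment whose replica count is owned by an autoscaler (that scenario is an illustration, not something stated in this module):

```yaml
spec:
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jsonPointers:
        - /spec/replicas   # assumed to be managed by an autoscaler
  syncPolicy:
    syncOptions:
      - RespectIgnoreDifferences=true   # leave the ignored field untouched during sync
```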
CreateNamespace
syncOptions:
- CreateNamespace=true
By default, ArgoCD assumes the target namespace already exists. If it does not, the sync fails. CreateNamespace=true tells ArgoCD to create the namespace automatically before syncing resources.
This is one of the most commonly needed sync options, especially when using ApplicationSets to deploy to multiple environments.
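A minimal sketch of how this fits together: the namespace that gets created is the one named in spec.destination.namespace. The namespace name below is a placeholder.

```yaml
spec:
  destination:
    server: https://kubernetes.default.svc
    namespace: payments-staging   # hypothetical; created if it does not exist
  syncPolicy:
    syncOptions:
      - CreateNamespace=true
```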
ServerSideApply
syncOptions:
- ServerSideApply=true
What: Uses Kubernetes Server-Side Apply instead of client-side kubectl apply
Why: Better conflict detection and field ownership tracking
When: Large CRDs (like Istio VirtualService), resources managed by multiple controllers
Benefit: Avoids the "metadata.annotations too long" error on large resources
Real-world fix: If you see "metadata.annotations: Too long" errors during sync, enabling ServerSideApply usually fixes it. It eliminates the kubectl.kubernetes.io/last-applied-configuration annotation.
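If only one oversized resource triggers the error, the option can also be scoped to that resource with an annotation instead of enabling it for the whole application. A fragment, with a hypothetical CRD name and the spec omitted:

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: widgets.example.com   # hypothetical CRD
  annotations:
    argocd.argoproj.io/sync-options: ServerSideApply=true
# spec omitted for brevity
```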
ApplyOutOfSyncOnly
syncOptions:
- ApplyOutOfSyncOnly=true
Without (Default)
ArgoCD applies all resources in the application on every sync, even if they have not changed. Safe but slow for large applications.
With ApplyOutOfSyncOnly
ArgoCD only applies resources that are actually out of sync. Much faster for applications with many resources.
Recommended for applications with 50+ resources to reduce API server load and speed up syncs.
Per-Resource Sync Options
Apply sync options to individual resources using annotations.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  annotations:
    # Skip this resource during prune operations
    argocd.argoproj.io/sync-options: Prune=false
    # Use replace instead of apply (destructive but clean)
    # argocd.argoproj.io/sync-options: Replace=true
    # Skip validation for this resource
    # argocd.argoproj.io/sync-options: Validate=false
Use case: Annotate critical resources like PVCs with Prune=false so they are never accidentally deleted, even if removed from Git.
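The PVC use case might look like the sketch below. The claim name and size are placeholders.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: app-data   # hypothetical
  annotations:
    argocd.argoproj.io/sync-options: Prune=false   # never pruned, even if removed from Git
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```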
Advanced
Sync Waves: Ordered Deployment
Sync waves let you control the order in which resources are applied.
Wave -1: Namespace → Wave 0: ConfigMaps, Secrets → Wave 1: Deployments → Wave 2: Ingress
Resources in the same wave are applied together. ArgoCD waits for all resources in a wave to be healthy before proceeding to the next wave. Waves are processed from lowest to highest number.
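A wave is assigned with the argocd.argoproj.io/sync-wave annotation, whose value must be a quoted string. The sketch below puts a ConfigMap in wave 0 and the Deployment that reads it in wave 1. Resource names and the image are placeholders.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  annotations:
    argocd.argoproj.io/sync-wave: "0"
data:
  LOG_LEVEL: info
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  annotations:
    argocd.argoproj.io/sync-wave: "1"   # applied after wave 0 is healthy
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: example.com/api:1.0.0   # hypothetical image
          envFrom:
            - configMapRef:
                name: app-config
```

Resources without the annotation default to wave 0.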
1. What does the sync option ServerSideApply=true fix?
Correct! Server-side apply eliminates the last-applied-configuration annotation that can exceed the annotation size limit on large CRDs.
2. When using sync waves, when does ArgoCD proceed to the next wave?
Correct! ArgoCD waits for all resources in a wave to report as healthy before moving to the next wave. If a resource in a wave fails, the sync stops.
3. How do you prevent a PVC from being pruned even when prune is globally enabled?
Correct! Per-resource annotations override global sync options. Adding Prune=false to the PVC annotation protects it from deletion while other resources can still be pruned.
Hooks
Resource Hooks
Resource hooks let you run Kubernetes resources (usually Jobs) at specific points during the sync lifecycle.
PreSync → Sync → PostSync
Skip
SyncFail
Hooks are annotated Kubernetes resources. They run at their designated phase and can be cleaned up automatically.
PreSync Hook
Runs before the main sync. Perfect for database migrations, schema changes, or pre-flight checks.
Real-world use: Before deploying a new API version, run a database migration Job. If the migration fails, the sync stops and the new version is never deployed.
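A migration hook along these lines could be declared as a Job annotated with argocd.argoproj.io/hook. The Job name, image, and command are placeholders for whatever migration tool the team uses.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate   # hypothetical
  annotations:
    argocd.argoproj.io/hook: PreSync
spec:
  backoffLimit: 0    # fail fast so a broken migration stops the sync
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: migrate
          image: example.com/db-migrate:1.0.0   # hypothetical image
          command: ["./migrate", "up"]          # hypothetical command
```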
Sync Hook
Runs during the main sync, alongside your application resources.
Hook Delete Policies
Control when hook resources (usually Jobs) are cleaned up.
HookSucceeded
Delete the hook resource after it succeeds. If it fails, it stays for debugging.
HookFailed
Delete the hook resource if it fails. Useful when you only want to keep successful runs.
BeforeHookCreation
Delete the previous hook resource before creating a new one. Prevents name conflicts across syncs. This is the default.
No Policy
If no policy is set, ArgoCD falls back to BeforeHookCreation: the hook resource stays in the cluster after the sync and is deleted only when the next sync recreates it.
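The policy is chosen with the argocd.argoproj.io/hook-delete-policy annotation on the hook resource. A metadata fragment, assuming a PostSync Job with a placeholder name:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: smoke-test   # hypothetical
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/hook-delete-policy: HookSucceeded   # remove the Job once it succeeds
# spec omitted for brevity
```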
Combining Waves and Hooks
# Full deployment pipeline using waves and hooks:
# PreSync, Wave -2: Run DB migration
# PreSync, Wave -1: Verify migration succeeded
# Sync, Wave 0: Deploy ConfigMaps, Secrets
# Sync, Wave 1: Deploy Deployments, Services
# Sync, Wave 2: Deploy Ingress, NetworkPolicies
# PostSync, Wave 0: Run smoke tests
# PostSync, Wave 1: Send Slack notification
Key insight: Waves work within each hook phase. You can have multiple waves within PreSync, multiple within Sync, and multiple within PostSync. This gives you fine-grained ordering control.
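To illustrate, the hook and wave annotations can sit side by side on the same resource. The fragments below sketch the PreSync and PostSync steps of the pipeline above. Job names are placeholders and each spec is omitted.

```yaml
# Fragments only: each Job's spec is omitted, and all names are hypothetical.
apiVersion: batch/v1
kind: Job
metadata:
  name: db-migrate
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/sync-wave: "-2"   # first step within PreSync
---
apiVersion: batch/v1
kind: Job
metadata:
  name: db-verify
  annotations:
    argocd.argoproj.io/hook: PreSync
    argocd.argoproj.io/sync-wave: "-1"   # runs after db-migrate completes
---
apiVersion: batch/v1
kind: Job
metadata:
  name: smoke-test
  annotations:
    argocd.argoproj.io/hook: PostSync
    argocd.argoproj.io/sync-wave: "0"
```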
Quiz Time
Hooks Knowledge
1. Which hook type would you use for running database migrations?
Correct! PreSync is ideal for database migrations because they must complete successfully before the new application version is deployed.
2. What is the default hook delete policy?
Correct! BeforeHookCreation is the default. It deletes the previous hook resource before creating a new one on the next sync, preventing name conflicts.
3. What happens if a PreSync hook (Job) fails?
Correct! If a PreSync hook fails, the entire sync operation stops. The Sync phase never begins, protecting you from deploying an app that depends on a failed migration.
Retry Policies
Configure automatic retries when a sync fails due to transient errors.
spec:
  syncPolicy:
    retry:
      limit: 5 # Max retry attempts (0 = no retries)
      backoff:
        duration: 5s # Initial delay between retries
        factor: 2 # Multiply delay by this factor each retry
        maxDuration: 3m # Maximum delay between retries
# Retry schedule with these settings:
# Attempt 1 (initial sync): immediate
# Attempt 2 (retry 1): after 5s
# Attempt 3 (retry 2): after 10s
# Attempt 4 (retry 3): after 20s
# Attempt 5 (retry 4): after 40s
# Attempt 6 (retry 5): after 80s
# No delay ever exceeds maxDuration (3m)
When to use: Transient errors like API server timeouts, webhook admission failures during rolling updates, or race conditions with CRDs being registered.
Replace vs Apply
# Per-resource annotation to use kubectl replace instead of apply
metadata:
  annotations:
    argocd.argoproj.io/sync-options: Replace=true
    # Or force the resource (delete and recreate)
    # argocd.argoproj.io/sync-options: Force=true
Apply (Default)
Three-way merge patch. Safe, preserves fields set by other controllers. Can fail on immutable fields.
Replace
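Full resource replacement (kubectl replace or create) instead of a patch. Use when Apply fails, for example on a resource too large for client-side apply. For immutable field changes (e.g., Job spec changes), combine it with Force=true so the resource is deleted and recreated.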
Skipping Resources
The Skip annotation (argocd.argoproj.io/hook: Skip) tells ArgoCD to completely ignore this resource during sync operations.
The resource will not be applied, updated, or pruned by ArgoCD
Useful for resources managed by external tools (e.g., Crossplane, external operators)
Also useful for template files that should exist in Git but not be deployed
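Alternative: To keep a resource that exists in the cluster but not in Git from marking the application OutOfSync, annotate it with argocd.argoproj.io/compare-options: IgnoreExtraneous. This affects only the sync status; the resource's health still counts toward the application's health.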
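As a sketch, a manifest that lives in Git but should not be applied could carry the Skip hook annotation like this. The ConfigMap name and contents are placeholders.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-template   # hypothetical
  annotations:
    argocd.argoproj.io/hook: Skip   # ArgoCD does not apply this manifest
data:
  note: kept in Git for reference only
```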
Real-World Pipeline
Complete Deployment Pipeline
-2
PreSync: Database Migration
Job runs Flyway/Liquibase migrations. If it fails, deployment stops.
-1
PreSync: Migration Verification
Job verifies the schema is correct before proceeding.
0
Sync Wave 0: ConfigMaps & Secrets
Configuration deployed first so Deployments can mount them.
1
Sync Wave 1: Deployments & Services
Application pods roll out. ArgoCD waits for healthy before continuing.
2
PostSync: Smoke Tests & Notification
Verify the deployment works, then notify the team on Slack/Teams.
Sync Status Codes
Understanding what you see during and after a sync operation.
Succeeded: All resources synced, healthy, and hooks completed successfully
Failed: One or more resources failed to sync or a hook failed
Running: Sync is in progress -- resources are being applied
Pruned: Resources were deleted because they no longer exist in Git
OutOfSync: Sync needed -- live state differs from desired state
# View sync operation details
argocd app get my-app --show-operation
# View sync history
argocd app history my-app
Quiz Time
Final Review
1. Can sync waves be used within hook phases (e.g., multiple waves in PreSync)?
Correct! Waves work within each hook phase. You can have PreSync wave -2, PreSync wave -1, Sync wave 0, Sync wave 1, PostSync wave 0, etc.
2. When should you use Replace=true instead of the default Apply?
Correct! Replace is needed when you change immutable fields (like Job spec.template). Apply would fail because it tries to patch the resource in place. Note that Replace on its own still updates the existing object; adding Force=true is what deletes and recreates it.
3. With retry backoff settings of duration=5s, factor=2, maxDuration=3m and limit=5, what is the delay before the 4th attempt (the 3rd retry)?
Correct! The backoff increases exponentially: 5s, 10s, 20s, 40s. The delay before the 4th attempt (the 3rd retry) is 20 seconds (5s x 2^2).
Summary
Module 03 Recap
Manual sync waits for approval; auto-sync deploys on Git changes
Self-heal reverts manual cluster changes within seconds
Prune deletes resources removed from Git; protect critical resources with per-resource annotations