Container dies. Data gone. Let's fix that.
Civica Kubernetes Training
"Container dies. Data gone. That is the problem. Every container has a writable filesystem, but it is as permanent as writing on a whiteboard. The moment someone erases it -- restart, crash, reschedule -- everything you wrote is lost."
A Volume is a directory accessible to containers in a Pod. It is defined in the Pod spec and mounted into one or more containers.
Volume Lifecycle
Both containers read/write to the same volume
"emptyDir is like scratch paper on your desk. It appears when you sit down (Pod starts) and gets thrown away when you leave (Pod deleted). But if you just take a coffee break (container restart), the paper stays."
emptyDir can optionally be RAM-backed (medium: Memory).
apiVersion: v1
kind: Pod
metadata:
  name: shared-data
spec:
  containers:
  - name: writer
    image: busybox
    command: ["sh","-c","echo hello > /data/greeting; sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader
    image: busybox
    command: ["sh","-c","cat /data/greeting; sleep 3600"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  volumes:
  - name: scratch
    emptyDir: {}
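The RAM-backed variant mentioned above swaps the empty braces for a medium field. A minimal sketch (sizeLimit is optional; without it the tmpfs can grow up to the node's allocatable memory):

```yaml
# Sketch: RAM-backed emptyDir (tmpfs).
# Fast, but usage counts against the container's memory limit,
# and data is lost when the Pod is deleted.
volumes:
- name: scratch
  emptyDir:
    medium: Memory
    sizeLimit: 64Mi
```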
hostPath mounts a file or directory from the host node's filesystem into a Pod. Useful for system-level workloads but dangerous for regular applications.
hostPath types: Directory, File, DirectoryOrCreate, FileOrCreate
apiVersion: v1
kind: Pod
metadata:
  name: log-reader
spec:
  containers:
  - name: reader
    image: busybox
    command: ["tail","-f","/var/log/syslog"]
    volumeMounts:
    - name: host-logs
      mountPath: /var/log
      readOnly: true
  volumes:
  - name: host-logs
    hostPath:
      path: /var/log
      type: Directory
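If the host path might not exist yet, the DirectoryOrCreate type creates it on first use instead of failing the mount. A sketch (the path here is hypothetical):

```yaml
# Sketch: hostPath that creates the directory if missing.
volumes:
- name: node-cache
  hostPath:
    path: /var/lib/myapp-cache   # hypothetical path for illustration
    type: DirectoryOrCreate
```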
| Type | Lifecycle | Shared Across Pods | Use Case |
|---|---|---|---|
| emptyDir | Dies with Pod | No | Scratch space, sidecar sharing |
| hostPath | Persists on node | Only on same node | System daemons, node-level access |
| PersistentVolume | Independent of Pod | Yes (with access modes) | Databases, file uploads, stateful apps |
| configMap | Cluster lifetime | Yes (read-only) | Configuration files |
| secret | Cluster lifetime | Yes (read-only) | Credentials, TLS certs |
| projected | Varies | No | Combine multiple sources into one mount |
Where does medium: Memory store data? Setting medium: Memory on an emptyDir creates a tmpfs (RAM-backed filesystem). It is very fast, but it counts against the container's memory limit, and data is lost on Pod deletion.
"emptyDir is scratch paper. hostPath ties you to one desk. But what if you need a real filing cabinet -- one that stays put even when you change offices? That is what Persistent Volumes are for."
A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically by a StorageClass.
ReadWriteOnce (RWO) -- one node read/write
ReadOnlyMany (ROX) -- many nodes read-only
ReadWriteMany (RWX) -- many nodes read/write
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-azure-disk
spec:
  capacity:
    storage: 50Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: managed-premium
  azureDisk:
    diskName: my-data-disk
    diskURI: /subscriptions/.../my-data-disk
    kind: Managed
"You do not ask the gym owner to physically hand you locker number 47. You fill out a claim form: 'I need a medium locker, with a lock.' The gym matches you to an available one. That is a PVC."
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
  storageClassName: managed-premium
# Check binding status
$ kubectl get pvc data-claim
NAME STATUS VOLUME CAPACITY
data-claim Bound pv-azure-01 20Gi
Once a PVC is bound to a PV, any Pod in the same namespace can mount it. The Pod references the PVC name -- it never needs to know about the underlying PV or cloud storage details.
The Pod spec needs two pieces: a volumes entry referencing the PVC, and a volumeMounts entry with the mount path.
apiVersion: v1
kind: Pod
metadata:
  name: database
spec:
  containers:
  - name: postgres
    image: postgres:15
    volumeMounts:
    - name: db-storage
      mountPath: /var/lib/postgresql/data
    env:
    - name: POSTGRES_PASSWORD
      value: "secret"
  volumes:
  - name: db-storage
    persistentVolumeClaim:
      claimName: data-claim
PV/PVC Lifecycle
PVC binds to PV → PVC deleted → PV enters Released state
With the deprecated Recycle policy, the volume is scrubbed (rm -rf) and made available again. Use Retain for production databases. With Delete, deleting the PVC also deletes the Azure Disk -- data gone forever.
"Not all lockers are created equal. Some are small and free (Standard HDD). Some are large and climate-controlled (Premium SSD). A StorageClass defines the tier -- you pick which tier you want in your PVC."
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: premium-ssd
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
  kind: Managed
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
AKS comes with several built-in StorageClasses. You can also create custom ones.
| StorageClass | Provisioner | Disk Type | Access Mode | Use Case |
|---|---|---|---|---|
| managed | disk.csi.azure.com | Standard SSD (StandardSSD_LRS) | RWO | General workloads |
| managed-premium | disk.csi.azure.com | Premium SSD (Premium_LRS) | RWO | Production databases |
| managed-csi | disk.csi.azure.com | Standard SSD (CSI) | RWO | Default in newer AKS |
| managed-csi-premium | disk.csi.azure.com | Premium SSD (CSI) | RWO | High-performance |
| azurefile | file.csi.azure.com | Azure Files (Standard) | RWX | Shared file access |
| azurefile-premium | file.csi.azure.com | Azure Files (Premium) | RWX | High-perf shared files |
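For the RWX rows in the table above, a claim against the azurefile class lets Pods on different nodes share the same volume. A minimal sketch (the claim name is hypothetical):

```yaml
# Sketch: ReadWriteMany claim backed by Azure Files.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-uploads   # hypothetical name
spec:
  accessModes:
  - ReadWriteMany        # many nodes, read/write
  storageClassName: azurefile
  resources:
    requests:
      storage: 10Gi
```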
"With dynamic provisioning, you do not need an admin to create storage in advance. You submit your claim, and the system automatically provisions a new disk in Azure, creates a PV, and binds it to your PVC. It is like walking into a hotel and a room is ready before you even check in."
# Just create a PVC -- that's it!
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-data
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: managed-csi-premium
  resources:
    requests:
      storage: 100Gi
# Azure automatically creates:
# 1. A Premium SSD Managed Disk
# 2. A PV object in Kubernetes
# 3. Binds PVC to PV
$ kubectl get pvc,pv
NAME STATUS VOLUME CAPACITY
my-data Bound pvc-a1b2c3 100Gi
By default, PVCs are provisioned immediately. But this can cause problems in multi-zone clusters -- the disk might be created in a different zone than the Pod.
Use WaitForFirstConsumer in multi-zone AKS clusters. Otherwise, you may get a disk in Zone 1 and a Pod in Zone 2, causing a mount failure.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zone-aware-ssd
provisioner: disk.csi.azure.com
parameters:
  skuName: Premium_LRS
volumeBindingMode: WaitForFirstConsumer
Running out of space? You can expand a PVC without downtime (if the StorageClass allows it).
Requires allowVolumeExpansion: true on the StorageClass. You expand by editing spec.resources.requests.storage on the PVC.
# Expand PVC from 20Gi to 50Gi
$ kubectl patch pvc data-claim -p \
'{"spec":{"resources":{"requests":{"storage":"50Gi"}}}}'
# Check status
$ kubectl get pvc data-claim
NAME STATUS CAPACITY
data-claim Bound 50Gi
# For Azure Disk, restart the Pod:
$ kubectl delete pod database
# Pod recreated by Deployment/StatefulSet
# Filesystem resized on mount
Retain keeps the PV and its underlying storage after the PVC is deleted. This prevents accidental data loss. With Delete, deleting the PVC also deletes the Azure Disk -- your data is permanently gone.
With WaitForFirstConsumer, the PV is not created until a Pod needs it. This lets Kubernetes provision the disk in the same availability zone as the scheduled Pod, preventing cross-zone mount failures.
"We have solved the data persistence problem. But applications need more than just disk space. They need configuration (database URLs, feature flags) and secrets (passwords, API keys). Kubernetes has dedicated objects for both."
The bulletin board for app configuration. Non-sensitive key-value pairs and files.
The vault for sensitive data. Base64-encoded (not encrypted by default), with access controls.
"A ConfigMap is like the bulletin board in your office. You pin up configuration that anyone can read: the Wi-Fi password for guests, the lunch menu, office hours. It is not secret -- just information your app needs."
# Create from literal values
$ kubectl create configmap app-config \
--from-literal=DB_HOST=postgres \
--from-literal=LOG_LEVEL=info
# Create from a file
$ kubectl create configmap nginx-conf \
--from-file=nginx.conf
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  DB_HOST: postgres.default.svc.cluster.local
  LOG_LEVEL: info
  MAX_CONNECTIONS: "100"
containers:
- name: app
  image: myapp:latest
  envFrom:
  - configMapRef:
      name: app-config
  # Or specific keys:
  env:
  - name: DATABASE_HOST
    valueFrom:
      configMapKeyRef:
        name: app-config
        key: DB_HOST
containers:
- name: nginx
  image: nginx:latest
  volumeMounts:
  - name: config
    mountPath: /etc/nginx/conf.d
volumes:
- name: config
  configMap:
    name: nginx-conf
# Each key becomes a file
# Key = filename, Value = content
"If ConfigMaps are the bulletin board, Secrets are the locked safe in the manager's office. Passwords, API keys, TLS certificates -- anything you would not want posted on the wall goes here."
Secret types: Opaque, kubernetes.io/tls, kubernetes.io/dockerconfigjson
# Create from literal
$ kubectl create secret generic db-creds \
--from-literal=username=admin \
--from-literal=password=S3cur3P@ss
# Create TLS secret
$ kubectl create secret tls my-tls \
--cert=tls.crt --key=tls.key
apiVersion: v1
kind: Secret
metadata:
  name: db-creds
type: Opaque
data:
  username: YWRtaW4=          # base64
  password: UzNjdXIzUEBzcw==  # base64
containers:
- name: app
  image: myapp:latest
  env:
  - name: DB_USER
    valueFrom:
      secretKeyRef:
        name: db-creds
        key: username
  - name: DB_PASS
    valueFrom:
      secretKeyRef:
        name: db-creds
        key: password
containers:
- name: app
  image: myapp:latest
  volumeMounts:
  - name: creds
    mountPath: /etc/secrets
    readOnly: true
volumes:
- name: creds
  secret:
    secretName: db-creds
# Files: /etc/secrets/username
#        /etc/secrets/password
# Values are auto-decoded from base64
# Azure Key Vault CSI SecretProviderClass
apiVersion: secrets-store.csi.x-k8s.io/v1
kind: SecretProviderClass
metadata:
  name: azure-kv
spec:
  provider: azure
  parameters:
    keyvaultName: my-keyvault
    objects: |
      array:
        - |
          objectName: db-password
          objectType: secret
    tenantId: "your-tenant-id"
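A Pod consumes the SecretProviderClass through a secrets-store CSI volume. A minimal sketch (the Pod name and busybox image are placeholders for illustration):

```yaml
# Sketch: mount Key Vault secrets via the secrets-store CSI driver.
apiVersion: v1
kind: Pod
metadata:
  name: kv-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep","3600"]
    volumeMounts:
    - name: kv-secrets
      mountPath: /mnt/secrets   # each object becomes a file here
      readOnly: true
  volumes:
  - name: kv-secrets
    csi:
      driver: secrets-store.csi.k8s.io
      readOnly: true
      volumeAttributes:
        secretProviderClass: azure-kv
```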
"Kubernetes started with built-in storage drivers for every cloud provider. That quickly became unsustainable. Enter CSI -- the Container Storage Interface. It is a plugin system that lets any storage vendor write a driver that works with any container orchestrator."
disk.csi.azure.com -- Azure Managed Disks
file.csi.azure.com -- Azure Files (SMB/NFS)
blob.csi.azure.com -- Azure Blob Storage (NFS/FUSE)
secrets-store.csi.k8s.io -- Azure Key Vault
CSI Architecture
Volume Snapshots are point-in-time copies of a PV. Like taking a photo of your data at a specific moment -- you can restore from it later.
# Create a snapshot
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: db-snapshot
spec:
  volumeSnapshotClassName: azure-disk-snapshot
  source:
    persistentVolumeClaimName: data-claim
# Restore from snapshot
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: restored-data
spec:
  storageClassName: managed-csi-premium
  dataSource:
    name: db-snapshot
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi
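The volumeSnapshotClassName used when creating the snapshot refers to a VolumeSnapshotClass object. One possible definition, as a sketch (the class name matches the example above; the driver is the Azure Disk CSI driver):

```yaml
# Sketch: snapshot class for Azure Disk CSI volumes.
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: azure-disk-snapshot
driver: disk.csi.azure.com
deletionPolicy: Delete   # snapshots are removed with the VolumeSnapshot object
```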
StatefulSets use volumeClaimTemplates to automatically create a unique PVC for each replica. This is how databases like PostgreSQL and MongoDB get per-replica storage.
PVCs are named data-postgres-0, data-postgres-1, etc.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: postgres
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:15
        volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      storageClassName: managed-csi-premium
      resources:
        requests:
          storage: 50Gi
A Projected Volume combines multiple volume sources into a single mount point. Think of it as a folder that pulls files from different places.
Supported sources: secret, configMap, downwardAPI, serviceAccountToken
volumes:
- name: app-config
  projected:
    sources:
    - configMap:
        name: app-settings
    - secret:
        name: app-secrets
    - downwardAPI:
        items:
        - path: "labels"
          fieldRef:
            fieldPath: metadata.labels
| Symptom | Likely Cause | Fix |
|---|---|---|
| PVC stuck in Pending | No matching PV or StorageClass | Check kubectl describe pvc for events |
| Pod stuck in ContainerCreating | Volume cannot attach/mount | Check kubectl describe pod and node events |
| Multi-attach error | Azure Disk (RWO) used by multiple Pods on different nodes | Use Azure Files (RWX) or ensure single-node access |
| Permission denied on mount | Container runs as non-root but volume is root-owned | Use securityContext.fsGroup in Pod spec |
| Zone mismatch | Disk in zone 1, Pod in zone 2 | Use WaitForFirstConsumer binding mode |
| Secret not found | Secret in wrong namespace or typo in name | Verify with kubectl get secret -n <ns> |
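For the permission-denied row above, setting fsGroup makes Kubernetes change group ownership of mounted volume files so a non-root container can write to them. A sketch (user/group IDs and the image are illustrative):

```yaml
# Sketch: Pod-level fsGroup so a non-root container can write the volume.
spec:
  securityContext:
    fsGroup: 2000          # volume files get group 2000 on mount
  containers:
  - name: app
    image: myapp:latest    # hypothetical image
    securityContext:
      runAsUser: 1000
      runAsGroup: 2000
```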
# Useful debugging commands
$ kubectl describe pvc my-claim # Check PVC events
$ kubectl get events --sort-by=.lastTimestamp # Recent events
$ kubectl describe pod my-pod # Check mount errors
$ kubectl get sc # List StorageClasses
In Module 7, we will build on everything we have learned:
Monitoring, logging, and metrics with Prometheus and Grafana
Liveness, readiness, and startup probes in depth
Advanced troubleshooting techniques for production clusters
kubectl describe pvc my-claim shows the Events section, which contains detailed messages about why the PVC cannot be bound -- such as no matching StorageClass, insufficient capacity, or provisioner errors.
Module 6: Storage & Volumes -- Complete
Questions? Let's discuss before moving to Module 7.