Kafka Integration, Tuning, Observability, and Production Readiness
From running Knative to running it well in production.
Knative offers three levels of Kafka integration: KafkaSource, KafkaChannel, and the Kafka Broker.

KafkaSource imports events from existing Kafka topics into Knative:

```yaml
apiVersion: sources.knative.dev/v1beta1
kind: KafkaSource
metadata:
  name: payment-events
spec:
  consumerGroup: knative-payments
  bootstrapServers:
    - kafka-cluster.kafka.svc.cluster.local:9092
  topics:
    - payments.completed
    - payments.failed
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
  # Optional: where a new consumer group starts reading
  initialOffset: latest   # or "earliest"
```
KafkaChannel provides durable, Kafka-backed channels:

```shell
# Install Kafka Channel
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.0/eventing-kafka-channel-install.yaml
```

```yaml
# Set KafkaChannel as the default channel type
apiVersion: v1
kind: ConfigMap
metadata:
  name: default-ch-webhook
  namespace: knative-eventing
data:
  default-ch-config: |
    clusterDefault:
      apiVersion: messaging.knative.dev/v1beta1
      kind: KafkaChannel
      spec:
        numPartitions: 3
        replicationFactor: 3
```
Key benefit: Events survive pod restarts and broker failures. In-Memory channels do NOT provide this guarantee.
The Kafka Broker replaces the default channel-based Broker with a Kafka-native implementation:

```shell
# Install Kafka Broker
kubectl apply -f https://github.com/knative-extensions/eventing-kafka-broker/releases/download/knative-v1.14.0/eventing-kafka-broker.yaml
```

```yaml
# Create a Kafka Broker
apiVersion: eventing.knative.dev/v1
kind: Broker
metadata:
  name: kafka-broker
  annotations:
    eventing.knative.dev/broker.class: Kafka
spec:
  config:
    apiVersion: v1
    kind: ConfigMap
    name: kafka-broker-config
    namespace: knative-eventing
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kafka-broker-config
  namespace: knative-eventing
data:
  default.topic.partitions: "6"
  default.topic.replication.factor: "3"
  bootstrap.servers: "kafka-cluster:9092"
```
| Feature | KafkaSource | KafkaChannel | Kafka Broker |
|---|---|---|---|
| Direction | Kafka to Knative | Within Knative | Within Knative |
| Use Case | Import external events | Durable channels | Full event hub |
| Ordering | Per partition | Per partition | Per partition |
| Performance | High | High | Highest |
| Complexity | Low | Medium | Medium |
| Durability | Kafka guarantees | Kafka guarantees | Kafka guarantees |
Recommendation: Use Kafka Broker for new deployments. Use KafkaSource when integrating with existing Kafka topics.
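Triggers subscribe to a Kafka-backed Broker exactly as they would to any other Broker. A sketch, assuming a `payment-processor` Knative Service exists:

```yaml
# Route completed-payment events from the Kafka-backed Broker
# ("payment-processor" is an assumed Knative Service)
apiVersion: eventing.knative.dev/v1
kind: Trigger
metadata:
  name: payment-trigger
spec:
  broker: kafka-broker
  filter:
    attributes:
      type: payments.completed
  subscriber:
    ref:
      apiVersion: serving.knative.dev/v1
      kind: Service
      name: payment-processor
```

Because the Broker is Kafka-backed, events that arrive while the subscriber is down are retained by Kafka and delivered once it recovers.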
1. What is the key advantage of using KafkaChannel over InMemoryChannel?
2. What is the difference between KafkaSource and Kafka Broker?
3. What broker class annotation do you use for a Kafka-backed Broker?
Two approaches to create your own event sources:

- SinkBinding: injects a K_SINK environment variable into any existing Deployment (or other PodSpec-able resource); your app posts CloudEvents to that URL.
- ContainerSource: Knative manages the container lifecycle for you, injecting K_SINK and running your container.
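The SinkBinding approach can be sketched as follows; the `event-publisher` Deployment name is an assumption:

```yaml
# Bind an existing Deployment to the default Broker
apiVersion: sources.knative.dev/v1
kind: SinkBinding
metadata:
  name: bind-publisher
spec:
  subject:
    apiVersion: apps/v1
    kind: Deployment
    name: event-publisher
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
```

Once applied, every pod of `event-publisher` receives K_SINK pointing at the Broker's ingress URL.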
```yaml
apiVersion: sources.knative.dev/v1
kind: ContainerSource
metadata:
  name: azure-blob-watcher
spec:
  template:
    spec:
      containers:
        - image: myregistry.azurecr.io/blob-watcher:v1
          env:
            - name: STORAGE_ACCOUNT
              value: "mystorageaccount"
            - name: CONTAINER_NAME
              value: "uploads"
            - name: CONNECTION_STRING
              valueFrom:
                secretKeyRef:
                  name: storage-secret
                  key: connection-string
  sink:
    ref:
      apiVersion: eventing.knative.dev/v1
      kind: Broker
      name: default
```
Your container polls Azure Blob Storage and emits CloudEvents to $K_SINK when new blobs appear.
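The emit step can be sketched in Python using only the standard library; the event type, source, and payload fields below are illustrative choices, not a fixed schema:

```python
import json
import os
import urllib.request
import uuid

def make_cloudevent(event_type: str, source: str, data: dict):
    """Build binary-mode CloudEvent HTTP headers and a JSON body."""
    headers = {
        "Ce-Specversion": "1.0",        # CloudEvents spec version
        "Ce-Type": event_type,          # e.g. com.example.blob.created
        "Ce-Source": source,            # logical origin of the event
        "Ce-Id": str(uuid.uuid4()),     # unique per event
        "Content-Type": "application/json",
    }
    return headers, json.dumps(data).encode("utf-8")

def emit(sink: str, headers: dict, body: bytes) -> int:
    """POST one event to the sink URL Knative injects as K_SINK."""
    req = urllib.request.Request(sink, data=body, headers=headers, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status

if __name__ == "__main__":
    sink = os.environ.get("K_SINK")  # injected by ContainerSource
    if sink:
        headers, body = make_cloudevent(
            "com.example.blob.created",                     # illustrative type
            "/azure-blob-watcher",                          # illustrative source
            {"container": "uploads", "blob": "report.pdf"},
        )
        emit(sink, headers, body)
```

Binary mode puts CloudEvents attributes in `Ce-*` headers and the payload in the body, which is the simplest shape to hand-roll without an SDK.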
```yaml
# config-autoscaler ConfigMap (knative-serving namespace)
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-autoscaler
  namespace: knative-serving
data:
  # Scale-to-zero settings
  enable-scale-to-zero: "true"
  scale-to-zero-grace-period: "30s"
  scale-to-zero-pod-retention-period: "0s"
  # Scaling windows
  stable-window: "60s"                # Window for stable-mode decisions
  panic-window-percentage: "10"       # % of stable window for panic mode
  panic-threshold-percentage: "200"   # Trigger panic at 2x target
  # Scale bounds
  max-scale-up-rate: "1000"           # Max ratio of scale-up per tick
  max-scale-down-rate: "2"            # Max ratio of scale-down per tick
  # Target utilization
  target-burst-capacity: "200"        # Extra capacity for bursts
  activator-capacity: "100"           # Requests the activator can buffer
```
| Normal (Stable Mode) | Panic Mode |
|---|---|
| 60-second window | 6-second window (10% of stable) |
| Gradual scaling | Aggressive scaling |

Panic triggers when observed concurrency exceeds 2x the target (panic-threshold-percentage: "200"). The autoscaler returns to stable mode once traffic stays below the target for a full stable window.
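The panic test reduces to a one-line comparison. A worked example of the arithmetic (a sketch, not the autoscaler's actual source):

```python
def panic_triggered(observed_concurrency: float, target: float,
                    panic_threshold_pct: float = 200.0) -> bool:
    """Panic mode fires when observed concurrency exceeds
    panic-threshold-percentage of the per-pod target."""
    return observed_concurrency > target * panic_threshold_pct / 100.0

# With a per-pod target of 10, panic fires above 20 concurrent requests.
print(panic_triggered(25, 10))  # True
print(panic_triggered(15, 10))  # False
```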
Three pillars of observability for Knative:

- Logging: structured JSON logs from all Knative components, configured via the config-logging ConfigMap.
- Metrics: Prometheus-format metrics from the autoscaler, activator, and queue-proxy.
- Tracing: distributed traces of requests as they cross services, exported to a backend such as Zipkin.
```yaml
# config-observability ConfigMap (knative-serving namespace)
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-observability
  namespace: knative-serving
data:
  # Emit metrics in Prometheus format
  metrics.backend-destination: prometheus
  # Request metrics reporting period
  metrics.reporting-period-seconds: "5"
---
# Tracing is configured separately, in the config-tracing ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-tracing
  namespace: knative-serving
data:
  backend: zipkin
  zipkin-endpoint: "http://zipkin.observability.svc.cluster.local:9411/api/v2/spans"
  # Sample rate (1.0 = trace everything, 0.1 = 10%)
  sample-rate: "0.1"
```
| Metric | What It Tells You |
|---|---|
| revision_request_count | Total requests per revision |
| revision_request_latencies | Response time distribution (p50, p95, p99) |
| revision_app_request_count | Requests reaching your container (excludes queue-proxy overhead) |
| autoscaler_desired_pods | How many pods the autoscaler wants |
| autoscaler_actual_pods | How many pods are actually running |
| activator_request_count | Requests buffered by the activator (cold starts) |
| queue_depth | Requests waiting in queue-proxy |
1. When does the Knative autoscaler enter "panic mode"?
2. Which ConfigMaps control metrics and tracing in Knative Serving?
3. What does the max-scale-down-rate setting control?
```yaml
# config-logging ConfigMap
apiVersion: v1
kind: ConfigMap
metadata:
  name: config-logging
  namespace: knative-serving
data:
  # Log level per Knative component
  loglevel.controller: "info"
  loglevel.autoscaler: "info"   # Set to "debug" for troubleshooting
  loglevel.activator: "info"
  loglevel.webhook: "info"
  loglevel.queueproxy: "info"
  # Structured logging format
  zap-logger-config: |
    {
      "level": "info",
      "development": false,
      "outputPaths": ["stdout"],
      "errorOutputPaths": ["stderr"],
      "encoding": "json",
      "encoderConfig": {
        "timeKey": "ts",
        "levelKey": "level",
        "nameKey": "logger",
        "callerKey": "caller",
        "messageKey": "msg"
      }
    }
```
Istio provides advanced networking features beyond basic Kourier:
```shell
# Install Knative with Istio networking
kubectl apply -f https://github.com/knative/net-istio/releases/download/knative-v1.14.0/net-istio.yaml

# Configure Knative to use Istio
kubectl patch configmap/config-network \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"ingress-class":"istio.ingress.networking.knative.dev"}}'
```
| Factor | Kourier | Istio |
|---|---|---|
| Complexity | Simple | Complex |
| Resource Usage | Lightweight (~50MB) | Heavy (~500MB+) |
| mTLS | No | Yes (automatic) |
| Auth Policies | No | Yes |
| Traffic Mirroring | No | Yes |
| Learning Curve | Low | High |
| Best For | Dev, simple prod | Enterprise, multi-tenant |
Recommendation: Start with Kourier. Move to Istio when you need mTLS, authorization, or are already using Istio for other workloads.
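Istio's automatic mTLS from the table can be enabled mesh-wide with a single resource. A sketch, assuming sidecar injection is enabled for your workload namespaces:

```yaml
# Enforce strict mutual TLS for all sidecar-injected workloads
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT
```

Kourier has no equivalent; this is typically the first concrete reason teams switch.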
```yaml
# Example ResourceQuota for a team namespace
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-alpha-quota
  namespace: team-alpha
spec:
  hard:
    pods: "50"
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
```
| Strategy | Impact | Trade-off |
|---|---|---|
| Set minScale: "1" | Eliminates cold starts | Always-on cost |
| Small container images | Faster image pull | Build complexity |
| Pre-pull with DaemonSet | No image pull delay | Disk space on nodes |
| Fast app startup | Reduces init time | App refactoring |
| Increase target-burst-capacity | More headroom | More idle pods |
| Use scale-down-delay | Prevents premature scale-down | More idle time |
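The first and last strategies in the table map to per-Service annotations. A sketch, assuming a Service named my-api:

```yaml
# Keep one warm pod and delay scale-down after a burst
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/min-scale: "1"          # never scale to zero
        autoscaling.knative.dev/scale-down-delay: "5m"  # hold capacity 5 minutes
```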
```yaml
# Pre-pull images with a DaemonSet
# (selector and template labels are required for apps/v1 DaemonSets)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: image-prepull
spec:
  selector:
    matchLabels:
      app: image-prepull
  template:
    metadata:
      labels:
        app: image-prepull
    spec:
      initContainers:
        - name: prepull
          image: myregistry.azurecr.io/my-api:v1
          command: ["sh", "-c", "exit 0"]
      containers:
        - name: pause
          image: registry.k8s.io/pause:3.9
```
For fast, lightweight endpoints:

```yaml
annotations:
  autoscaling.knative.dev/target: "100"
spec:
  containerConcurrency: 0   # unlimited
```

Result: Fewer pods, each handling many requests.
For heavy processing (ML, image processing):

```yaml
annotations:
  autoscaling.knative.dev/target: "1"
spec:
  containerConcurrency: 1   # one request at a time
```

Result: Many pods, each handling one request.
Tip: Load test to find the right target. Start with target = 70% of what your container can handle at acceptable latency.
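The rule of thumb as arithmetic (the numbers are illustrative):

```python
def starting_target(max_concurrency_at_slo: int, utilization_pct: int = 70) -> int:
    """Initial autoscaling target: ~70% of the concurrency one container
    sustains while still meeting its latency SLO."""
    return max(1, max_concurrency_at_slo * utilization_pct // 100)

# A container that holds 150 concurrent requests at acceptable latency:
print(starting_target(150))  # 105
```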
1. What is the most effective way to completely eliminate cold starts?
2. When should you choose Istio over Kourier for Knative networking?
3. For an ML inference endpoint that takes 5 seconds per request, what is the recommended concurrency setup?
```shell
# Step 1: Check service status
kn service describe my-api
# Look at Conditions:
#   Ready: False
#   ConfigurationsReady: False
#   RoutesReady: True

# Step 2: Check the latest revision
kn revision describe my-api-00003
# Look for:
#   ContainerHealthy: False
#   ResourcesAvailable: False

# Step 3: Check pods
kubectl get pods -l serving.knative.dev/service=my-api
kubectl describe pod my-api-00003-deployment-xxx

# Common causes:
# - Image pull errors (wrong image name, missing credentials)
# - Container crash (check logs: kubectl logs ...)
# - Readiness probe failing
# - Resource limits too low (OOMKilled)
# - Missing ConfigMaps or Secrets
```
```shell
# Revision stuck in "not ready"
kubectl get revisions
# NAME          CONFIG NAME   READY   REASON
# my-api-00003  my-api        False   ContainerMissing

# Check revision details
kubectl get revision my-api-00003 -o yaml | grep -A 10 "conditions"
```

Common REASON values and fixes:
| Reason | Cause | Fix |
|---|---|---|
| ContainerMissing | Image not found | Check image name and registry access |
| ExitCode1 | Container crashes | Check container logs |
| ResourcesUnavailable | Not enough cluster resources | Scale cluster or reduce requests |
| ProgressDeadlineExceeded | Pod took too long to start | Check probes, image size, startup time |
```shell
# Events not being delivered? Systematic check:

# 1. Check Broker status
kubectl get broker default -o yaml
# Is READY: True?

# 2. Check Trigger status
kubectl get triggers -o wide
# Are all triggers READY: True? Is the subscriber URL correct?

# 3. Check Source status
kubectl get pingsource,kafkasource -o wide

# 4. Deploy event-display to see what's arriving
kn service create debug-display \
  --image gcr.io/knative-releases/knative.dev/eventing/cmd/event_display
kn trigger create catch-all --broker default --sink ksvc:debug-display
kubectl logs -l serving.knative.dev/service=debug-display -f

# 5. Check eventing controller logs
kubectl logs -n knative-eventing -l app=eventing-controller --tail=50

# 6. Check dead letter sink for failed deliveries
kubectl logs -l serving.knative.dev/service=dead-letter-handler
```
```shell
# AKS-specific: internal load balancer for Kourier
kubectl annotate svc kourier -n kourier-system \
  service.beta.kubernetes.io/azure-load-balancer-internal="true"
```
```shell
# Create AKS cluster optimized for Knative
az aks create \
  --resource-group myRG \
  --name knative-cluster \
  --node-count 3 \
  --node-vm-size Standard_D4s_v3 \
  --network-plugin azure \
  --network-policy azure \
  --enable-cluster-autoscaler \
  --min-count 3 \
  --max-count 20 \
  --zones 1 2 3 \
  --attach-acr myACR \
  --enable-managed-identity
```

```shell
# Configure DNS for Knative
# Option 1: Use Azure DNS with external-dns
# Option 2: Use nip.io for development
# Option 3: Configure config-domain with your domain
kubectl patch configmap/config-domain \
  --namespace knative-serving \
  --type merge \
  --patch '{"data":{"mycompany.com":""}}'
```
```
Is it an HTTP workload?
├─ No  -> Regular Deployment
└─ Yes -> Is it stateless?
          ├─ No  -> Regular Deployment / StatefulSet
          └─ Yes -> Does it have variable/bursty traffic?
                    ├─ No (steady) -> Either works; Knative adds convenience
                    └─ Yes -> Do you want scale-to-zero?
                              ├─ Yes -> Knative!
                              └─ No  -> Do you want built-in traffic splitting?
                                        ├─ Yes -> Knative!
                                        └─ No  -> Regular Deployment is fine
```
Knative can coexist with regular Deployments: same cluster, same namespace, no conflicts. To convert a Deployment to a Knative Service:

1. Take your container image.
2. Create a Knative Service YAML.
3. Move env vars, Secrets, and ConfigMaps across.
4. Add autoscaling annotations.
5. Deploy and test.
6. Switch DNS / traffic.
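The conversion, sketched for a hypothetical my-api Deployment (image, annotation value, and env var are placeholders):

```yaml
# Knative Service equivalent of a plain my-api Deployment
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: my-api
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/target: "100"
    spec:
      containers:
        - image: myregistry.azurecr.io/my-api:v1
          env:
            - name: LOG_LEVEL
              value: "info"
```

Knative generates the Deployment, Service, and Route objects from this one resource, so the original Deployment YAML can be retired once traffic is switched.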
- Serverless on K8s, no vendor lock-in, Serving + Eventing components
- Autoscaling (KPA), traffic splitting, custom domains, TLS
- CloudEvents, Broker/Trigger, Sources, Sequences, dead letters
- Kafka, tuning, observability, troubleshooting, AKS best practices

Your containers now scale to zero, spring back to life, split traffic, and react to events. Welcome to serverless on Kubernetes.