Setting Up Kubernetes Orchestration for Mobile App Backend
A mobile app with 50,000 active users generates load that varies by a factor of 10–15 over the day: morning peak, lunch, evening. On a single server this means either wasting resources at night or degrading at peak times. Kubernetes solves this through horizontal auto-scaling, rolling updates without downtime, and service isolation.
Basic Architecture for Mobile Backend
Typical set of components:
- API Deployment — stateless service, scales horizontally
- WebSocket Service — separate Deployment with sticky sessions or via Redis Pub/Sub
- Worker Deployment — background task processing (image resizing, push sending)
- PostgreSQL — StatefulSet or managed (RDS, Cloud SQL)
- Redis — StatefulSet or managed (ElastiCache, Memorystore)
- Ingress — nginx-ingress or Traefik with TLS termination
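To make the Ingress piece concrete, here is a minimal sketch of an nginx-ingress resource for the API; the hostname, TLS secret name, and cert-manager issuer are illustrative assumptions, not values from the original setup:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: mobile-api
  namespace: production
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod  # assumes cert-manager is installed
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com        # hypothetical hostname
      secretName: mobile-api-tls # TLS cert stored/renewed by cert-manager
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: mobile-api # Service in front of the API Deployment
                port:
                  number: 80
```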
Deployment and HPA
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mobile-api
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: mobile-api
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0  # Zero-downtime
  template:
    metadata:
      labels:
        app: mobile-api
    spec:
      containers:
        - name: api
          image: ghcr.io/myorg/mobile-api:1.2.3
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-credentials
                  key: url
            - name: REDIS_URL
              valueFrom:
                secretKeyRef:
                  name: redis-credentials
                  key: url
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health/live
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 5"]
---
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mobile-api-hpa
  namespace: production  # must live in the same namespace as the target Deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mobile-api
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```
The 5-second preStop sleep gives the Ingress time to remove the Pod from its endpoints before the container receives SIGTERM and starts closing connections.
Secrets and Confidential Data
APNs .p8 keys, the FCM server key, and JWT secrets belong in Kubernetes Secrets:
```sh
kubectl create secret generic apns-credentials \
  --from-file=AuthKey_XXXXXX.p8 \
  --from-literal=key_id=XXXXXXXXXX \
  --from-literal=team_id=YYYYYYYYYY
```
For production, External Secrets Operator with AWS Secrets Manager or HashiCorp Vault is recommended — then secret rotation doesn't require recreating Kubernetes Secret manually.
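A sketch of what the External Secrets Operator wiring looks like with AWS Secrets Manager; the store name and the remote key path are assumptions for illustration:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: db-credentials
  namespace: production
spec:
  refreshInterval: 1h              # operator re-reads the backend hourly
  secretStoreRef:
    name: aws-secrets-manager      # assumed ClusterSecretStore name
    kind: ClusterSecretStore
  target:
    name: db-credentials           # the Kubernetes Secret the operator creates/updates
  data:
    - secretKey: url               # key inside the Kubernetes Secret
      remoteRef:
        key: prod/mobile-api/database-url  # hypothetical path in Secrets Manager
```

When the value rotates in Secrets Manager, the operator updates the Kubernetes Secret on the next refresh; no manual `kubectl create secret` is needed.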
WebSocket and Sticky Sessions
A WebSocket is a stateful connection: during a rolling update, the old Pod should wait for its active connections to drain. Nginx-ingress configuration:
```yaml
nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
nginx.ingress.kubernetes.io/upstream-hash-by: "$remote_addr"
```
But it's better to make WebSocket service stateless via Redis Pub/Sub: client connects to any pod, messages are routed via Redis channel. Then rolling update is transparent.
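The Pub/Sub routing pattern can be sketched as follows. This uses an in-process EventEmitter as a stand-in for the Redis channel (in production this would be Redis SUBSCRIBE/PUBLISH); the class and function names are illustrative assumptions:

```typescript
import { EventEmitter } from "events";

// Stand-in for a shared Redis Pub/Sub channel that every pod subscribes to.
const bus = new EventEmitter();

// Each pod tracks only the sockets connected to *it*; here we record
// delivered messages per user instead of holding real sockets.
class WsPod {
  local = new Map<string, string[]>();

  constructor() {
    // Every pod receives every published message...
    bus.on("messages", (userId: string, msg: string) => {
      const inbox = this.local.get(userId);
      // ...but only the pod actually holding the user's connection delivers it.
      if (inbox) inbox.push(msg);
    });
  }

  connect(userId: string) {
    this.local.set(userId, []);
  }
}

// Any pod (or a worker) can send to any user without knowing which pod
// holds the socket — it just publishes to the channel.
function send(userId: string, msg: string) {
  bus.emit("messages", userId, msg);
}
```

Because no pod owns routing state, a pod can be terminated during a rolling update and the client simply reconnects to any surviving pod; subsequent messages still reach it through the channel.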
Monitoring and Observability
Prometheus + Grafana is the standard stack for Kubernetes. Key metrics for a mobile backend:
- `http_request_duration_seconds` with p50/p95/p99 percentiles
- `websocket_connections_active`
- `push_notification_delivery_rate`
- `database_pool_size` and `database_query_duration`
Alerts on p99 latency > 2s, error rate > 1%, pod restart count > 3 in 5 minutes.
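The alert thresholds above can be expressed as a PrometheusRule for the Prometheus Operator; this is a sketch, and the `http_requests_total` counter with a `status` label is an assumed metric name not listed in the original set:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mobile-api-alerts
  namespace: production
spec:
  groups:
    - name: mobile-api
      rules:
        - alert: HighP99Latency
          # p99 over the last 5 minutes above 2 seconds
          expr: histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket[5m])) by (le)) > 2
          for: 5m
          labels:
            severity: page
        - alert: HighErrorRate
          # share of 5xx responses above 1%
          expr: sum(rate(http_requests_total{status=~"5.."}[5m])) / sum(rate(http_requests_total[5m])) > 0.01
          for: 5m
          labels:
            severity: page
```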
CI/CD with GitOps
ArgoCD or Flux tracks changes in the Git repository with manifests and applies them to the cluster. CI only builds the image and updates the tag in the manifest via `kustomize edit set image` or Helm values:
```yaml
# .github/workflows/deploy.yml (fragment)
- name: Update image tag
  run: |
    cd k8s/overlays/production
    kustomize edit set image ghcr.io/myorg/mobile-api=ghcr.io/myorg/mobile-api:${{ github.sha }}
    git commit -am "deploy: ${{ github.sha }}"
    git push
```
ArgoCD sees the commit and syncs the cluster.
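The ArgoCD side is a single Application resource pointing at the manifests repo; this is a sketch, and the repo URL is a hypothetical placeholder:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: mobile-api
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/myorg/mobile-api-manifests  # assumed manifests repo
    targetRevision: main
    path: k8s/overlays/production   # matches the path CI edits with kustomize
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true     # delete resources removed from Git
      selfHeal: true  # revert manual drift back to the Git state
```

With `automated` sync, the commit from CI is all it takes: ArgoCD detects the new image tag and rolls the Deployment.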
Process
Audit current infrastructure → design namespace/RBAC structure → write manifests (Deployment, Service, Ingress, HPA) → set up Secrets management → configure monitoring → integrate with CI/CD → load test auto-scaling → write documentation.
Timeline: 5 days for a typical backend on GKE/EKS/AKS. Cost is calculated individually after analyzing architecture and SLA requirements.