Introduction: Why This Topic Matters#
When I first learned Kubernetes, I thought:
"If I can deploy everything with YAML... and even simplify it with Helm... what else is left?"
That assumption holds until you hit real-world systems.
Helm helps you install applications.
But production systems do not just need installation. They need:
- backups
- scaling logic
- failover
- lifecycle management
This is where a Helm-only approach starts breaking down.
Helm gets you to Day 1, which is deployment.
But production lives in Day 2, which is operations.
And that is exactly where Kubernetes Operators come in.
Core Concept: First-Principles Explanation#
At its core, Kubernetes works like this:
You declare a desired state, then Kubernetes tries to make reality match it.
This is powered by controllers.
For example:
- You say: "I want 3 pods."
- Kubernetes creates or replaces pods until 3 are running, and repeats this whenever one fails.
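The "desired state versus reality" idea can be sketched in a few lines of plain Python. This is only an illustration with no real cluster involved: the `reconcile` function and the `pod-N` names are made up for this example, and a real controller would talk to the Kubernetes API instead of returning strings.

```python
# Minimal sketch of a reconciliation step (no real cluster involved).
# `running` is a plain list standing in for the pods that currently exist.

def reconcile(desired_replicas: int, running: list[str]) -> list[str]:
    """Return the actions needed to move `running` toward `desired_replicas`."""
    actions: list[str] = []
    diff = desired_replicas - len(running)
    if diff > 0:
        # Too few pods: create the missing ones.
        actions = [f"create pod-{len(running) + i}" for i in range(diff)]
    elif diff < 0:
        # Too many pods: delete the surplus from the end of the list.
        actions = [f"delete {name}" for name in running[diff:]]
    return actions


if __name__ == "__main__":
    print(reconcile(3, ["pod-0"]))
    print(reconcile(3, ["pod-0", "pod-1", "pod-2"]))
    print(reconcile(3, ["pod-0", "pod-1", "pod-2", "pod-3"]))
```

A controller runs a step like this over and over, so the cluster keeps converging on the declared state even when pods disappear.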
So what is an Operator?#
An Operator is just a custom controller with domain knowledge.
- A controller manages generic resources like Pods and Deployments.
- An operator manages your application logic.
Simple mental model#
- Controller = "Keep things running"
- Operator = "Run my system like an expert would"
Key Components and Building Blocks#
To understand operators, you only need 3 pieces:
1. Custom Resource Definition#
A Custom Resource Definition, or CRD, defines a new type in Kubernetes.
2. Custom Resource#
A Custom Resource, or CR, is an instance of that type. It is your actual configuration.
3. Operator#
The operator contains the controller logic. It watches the resource and acts on it.
Example: your own resource#
```yaml
apiVersion: example.com/v1
kind: DatabaseBackup
metadata:
  name: example-backup  # placeholder name; any valid object name works
spec:
  schedule: "*/5 * * * *"
```

This is not built-in Kubernetes.
You just taught Kubernetes a new concept:
DatabaseBackup
How It Works: Step-by-Step Flow#
Let us simplify the entire flow:
- You define a CRD, so Kubernetes now understands your custom type.
- You create a Custom Resource.
- The operator is running inside the cluster.
- The operator watches the Kubernetes API.
- When your resource changes, the operator reacts.
- The operator performs actions such as scaling, backups, or failover.
Important insight#
Operators do not magically monitor your app directly.
They:
- watch Kubernetes objects via the API
- optionally call external systems when needed
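To make the "watch and react" idea concrete, here is a rough sketch of how an operator might process watch events. The events are hand-written dictionaries that mimic the `ADDED`, `MODIFIED`, and `DELETED` event types of the Kubernetes watch API. The `handle_event` function and the `known` cache are assumptions made for this illustration, not part of any library.

```python
# Sketch: reacting to watch events for DatabaseBackup objects.
# A real operator would receive these events from a watch stream on the
# API server; here they are plain dicts so the example runs anywhere.

def handle_event(event: dict, known: dict) -> str:
    """Update the local view `known` and describe the reaction."""
    kind = event["type"]  # ADDED, MODIFIED or DELETED
    obj = event["object"]
    name = obj["metadata"]["name"]

    if kind == "DELETED":
        known.pop(name, None)
        return f"{name}: clean up"

    schedule = obj["spec"]["schedule"]
    if known.get(name) == schedule:
        # The object changed, but not in a way this operator cares about.
        return f"{name}: nothing to do"

    known[name] = schedule
    return f"{name}: (re)configure backups for '{schedule}'"


if __name__ == "__main__":
    cache: dict = {}
    backup = {"metadata": {"name": "nightly"}, "spec": {"schedule": "*/5 * * * *"}}
    for event_type in ("ADDED", "MODIFIED", "DELETED"):
        print(handle_event({"type": event_type, "object": backup}, cache))
```

The operator only ever sees Kubernetes objects. Anything it knows about the database itself has to come from calls it makes while handling those events.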
Real-World Example: Database Backup Operator#
Let us use something every engineer understands:
Database backups.
Instead of manually:
- writing cron jobs
- managing scripts
- remembering schedules
You define:
```yaml
kind: DatabaseBackup
spec:
  schedule: "*/10 * * * *"
```

And the operator ensures:
- backups run
- retries happen
- failures are handled
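As a rough idea of what "retries happen" and "failures are handled" could look like inside an operator, here is a small retry helper with exponential backoff. It is a sketch under assumptions: `run_with_retries`, its parameters, and the backoff policy are invented for this example, and the `backup` callable stands in for whatever actually performs the backup.

```python
import time
from typing import Callable


def run_with_retries(
    backup: Callable[[], None],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run `backup`, retrying with exponential backoff. Return True on success."""
    for attempt in range(1, attempts + 1):
        try:
            backup()
            return True
        except Exception as exc:  # a real operator would catch narrower errors
            print(f"attempt {attempt} failed: {exc}")
            if attempt < attempts:
                # Wait 1x, 2x, 4x ... the base delay before trying again.
                sleep(base_delay * 2 ** (attempt - 1))
    return False


if __name__ == "__main__":
    state = {"calls": 0}

    def flaky_backup() -> None:
        # Hypothetical backup that fails twice, then succeeds.
        state["calls"] += 1
        if state["calls"] < 3:
            raise RuntimeError("database unreachable")

    print(run_with_retries(flaky_backup, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the helper easy to test. A failed run (return value `False`) is the point where an operator would typically record the failure in the resource's status or raise an alert.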
Code and Implementation#
Example file: databasebackup_crd.yaml#
What it does:
Defines a new Kubernetes resource type called DatabaseBackup.
```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databasebackups.example.com
spec:
  group: example.com
  names:
    kind: DatabaseBackup
    plural: databasebackups
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string
```

Example file: operator.py#
What it does:
Watches DatabaseBackup resources and reacts when one is created.
```python
import kopf


@kopf.on.create('databasebackups')
def handle_backup(spec, **kwargs):
    schedule = spec.get('schedule')
    print(f"Backup triggered with schedule: {schedule}")
```

In real life, this would create a Kubernetes Job, call a backup API, or trigger a workflow.
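One possible version of the "create a Kubernetes Job" option is to have the handler build a `CronJob` manifest from the resource's schedule. The sketch below only builds the manifest as a dictionary. The `build_cronjob` function, the `-backup` name suffix, and the `your-backup-image:latest` image are placeholders made up for this example. Submitting the manifest to the cluster with a Kubernetes client is left out.

```python
# Sketch: turn a DatabaseBackup spec into a batch/v1 CronJob manifest.
# Nothing here talks to a cluster; the function only returns a dict.

def build_cronjob(name: str, schedule: str,
                  image: str = "your-backup-image:latest") -> dict:
    """Build a CronJob manifest that runs a backup container on `schedule`."""
    return {
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": f"{name}-backup"},
        "spec": {
            "schedule": schedule,
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            # Restart the container if the backup fails.
                            "restartPolicy": "OnFailure",
                            "containers": [
                                {"name": "backup", "image": image},
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    manifest = build_cronjob("orders-db", "*/5 * * * *")
    print(manifest["metadata"]["name"], manifest["spec"]["schedule"])
```

Delegating the schedule to a `CronJob` means Kubernetes itself triggers each run, and the operator only has to keep the `CronJob` in line with the custom resource.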
Example file: operator-deployment.yaml#
What it does:
Runs the operator inside the cluster.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backup-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: backup-operator
  template:
    metadata:
      labels:
        app: backup-operator
    spec:
      containers:
        - name: operator
          image: your-operator-image:latest
```

Important:
- It runs as a Pod.
- It is not exposed externally.
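- It only watches the cluster through the Kubernetes API, so its ServiceAccount needs RBAC permissions on DatabaseBackup resources.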
Project and Folder Structure#
```
project-root/
├── infra/
├── platform/
├── services/
├── operator/
│   ├── crds/
│   │   └── databasebackup_crd.yaml
│   ├── operator.py
│   ├── operator-deployment.yaml
│   └── examples/
│       └── backup.yaml
└── ci-cd/
```

Key idea:
- The operator is isolated.
- It acts as an automation layer.
- It works alongside your apps.
How It Runs in Real Life#
```shell
# Run from the operator/ directory
kubectl apply -f crds/databasebackup_crd.yaml
kubectl apply -f operator-deployment.yaml
kubectl apply -f examples/backup.yaml
```

Flow:
- The CRD is registered.
- The operator starts as a pod.
- You create a resource.
- The operator picks up the change through its watch and reacts, usually within moments.
Multi-language ecosystem#
| Language | Library | Docs |
|---|---|---|
| Python | Kopf | https://docs.kopf.dev/ |
| Go | Kubebuilder | https://book.kubebuilder.io/ |
| JavaScript | @dot-i/k8s-operator | https://github.com/dot-i/k8s-operator |
Final Mental Model#
Here is the simplest way to think about it:
- Helm installs your app.
- Kubernetes keeps it running.
- Operator runs it like an expert.
Bonus: Where This Fits in Real Systems#
In real production setups:
- Terraform provisions EKS.
- CI/CD builds and deploys.
- Helm installs apps.
- Operators automate operations.
- GitOps tools like Argo CD keep everything in sync.
Final Thought#
Operators are not "advanced Kubernetes."
They are simply:
"Encoding operational knowledge into code."
And once that clicks, everything else becomes easier.
