Why this note matters
A lot of people learn Kubernetes by memorizing commands:

    kubectl get pods
    kubectl apply -f deployment.yaml
    kubectl rollout restart

That works for getting started, but it does not explain what Kubernetes is actually doing under the hood.
If you want Kubernetes to feel less mysterious, you need a stronger mental model.
This is the one I keep coming back to:
Kubernetes is a desired state engine.
You tell it what you want. The control plane keeps comparing reality to that desired state and keeps acting until they match.
Core idea
Kubernetes is not just a place where containers run. It is a control system that keeps trying to make the cluster match the declared state.
- Desired state
- RBAC
- Scheduling
- Reconciliation
The request flow that explains most of Kubernetes
When a request enters Kubernetes, there is a sequence behind it.
1. Request -> API Server
2. Authentication -> Who are you?
3. Authorization -> What are you allowed to do?
4. Execution -> Scheduler and controllers act

This flow matters because it explains a lot of real-world behavior.
For example:
- why one user can deploy but another cannot
- why a service account can read pods but not secrets
- why a Deployment keeps creating replacement pods
- why a pod can stay Pending even when the Deployment exists
The API server is the front door.
Everything meaningful passes through it.
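The request flow above can be sketched as a toy pipeline. This is a simplified model rather than the real API server: `TOKENS`, `PERMISSIONS`, and `handle_request` are hypothetical names, and a real cluster also runs admission control and persists the object before any controller acts on it.

```python
# Toy model of the request flow: authenticate, authorize, then accept.
# All names and data here are hypothetical, for illustration only.

TOKENS = {"token-alice": "alice", "token-ci": "ci-bot"}  # token -> identity
PERMISSIONS = {
    "alice": {("get", "pods")},
    "ci-bot": {("get", "pods"), ("create", "deployments")},
}


def handle_request(token, verb, resource):
    """Return a (status, message) pair for one API request."""
    user = TOKENS.get(token)
    if user is None:
        return 401, "authentication failed: unknown identity"
    if (verb, resource) not in PERMISSIONS.get(user, set()):
        return 403, f"authorization failed: {user} cannot {verb} {resource}"
    # Only now is the desired state accepted; controllers act on it later.
    status = 201 if verb == "create" else 200
    return status, f"{verb} {resource} accepted for {user}"


print(handle_request("token-ci", "create", "deployments"))
print(handle_request("token-alice", "create", "deployments"))
print(handle_request("bad-token", "get", "pods"))
```

The two failure branches correspond to the two questions in the flow: an unknown identity fails authentication, while a known identity without the right permission fails authorization.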
API server: the entry point
The Kubernetes API server is the central interface for the cluster.
It receives requests from:
- humans using `kubectl`
- CI/CD pipelines
- controllers
- operators
- internal components
When you apply a manifest, you are not directly creating pods yourself.
You are sending a desired state declaration to the API server.
That declaration is validated, stored (in etcd), and then acted on by other control plane components.
This is why Kubernetes feels declarative instead of imperative.
You are usually saying:
This is the state I want.
Not:
Run these exact steps in this exact order forever.
Authentication vs authorization
These two concepts are often mixed together, but they solve different problems.
Authentication: who are you?
Authentication is about identity.
Kubernetes needs to know who is making the request before it can decide what that request is allowed to do.
Common authentication methods include:
- client certificates
- bearer tokens
- OIDC
- cloud provider identity flows
In real-world managed clusters, human access often comes through cloud identity.
That usually looks like this:
- you log in through AWS, Azure, or another identity system
- you receive a token
- the API server validates that token
Human access in practice
For humans, Kubernetes access is often tied to cloud IAM and organization identity.
That is why production access usually feels like:
- identity provider login
- short-lived credential or token
- API server validation
- RBAC check
This is better than using long-lived shared credentials because the identity is clearer and easier to audit.
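A minimal sketch of why short-lived credentials limit exposure, assuming a hypothetical `Credential` record with an expiry time. Real API servers validate signed tokens (for example OIDC tokens) rather than a plain object like this, and the 15 minute lifetime is only an illustrative assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Credential:
    subject: str          # who the credential was issued to
    expires_at: datetime  # from this instant on, the credential is rejected


def issue(subject, now, ttl=timedelta(minutes=15)):
    """Issue a short-lived credential (the 15 minute TTL is an assumption)."""
    return Credential(subject=subject, expires_at=now + ttl)


def validate(cred, now):
    """Return the subject if the credential is still valid, else None."""
    return cred.subject if now < cred.expires_at else None


start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
cred = issue("alice", start)
print(validate(cred, start + timedelta(minutes=5)))   # still inside the TTL
print(validate(cred, start + timedelta(hours=1)))     # past the TTL
```

A leaked credential of this kind stops working on its own once the lifetime passes, which is the property a long-lived shared secret lacks.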
Pod access inside the cluster
Pods also need identity.
Inside Kubernetes, workloads typically use service accounts.
That gives a pod a Kubernetes identity inside the cluster.
If that workload also needs access to cloud resources, that identity can be bridged into the cloud. In AWS, a common example is IRSA (IAM Roles for Service Accounts).
That is a much better model than embedding cloud access keys inside the application.
Security direction
Modern Kubernetes access patterns favor short-lived identity-based credentials for both humans and workloads instead of static secrets.
Authorization: what are you allowed to do?
Once identity is known, Kubernetes moves to authorization.
This is where the question becomes:
What actions can this identity perform?
The most common answer in Kubernetes is RBAC.
RBAC stands for Role-Based Access Control.
It defines permissions through:
- `Role`
- `ClusterRole`
- `RoleBinding`
- `ClusterRoleBinding`
These are normal Kubernetes resources stored through the API server, just like many other objects in the cluster.
Where roles and permissions are defined
RBAC is declarative too.
Roles and bindings are defined as YAML and stored in the cluster.
Role
Namespace-scoped permissions.
ClusterRole
Cluster-wide permissions or reusable permission sets.
RoleBinding
Connects a Role (or a ClusterRole, limited to that namespace) to a user, group, or service account inside a namespace.
ClusterRoleBinding
Connects a ClusterRole to a subject at cluster scope.
Real-world RBAC example
This role allows reading pod information in the dev namespace:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      name: pod-reader
      namespace: dev
    rules:
      - apiGroups: [""]
        resources: ["pods"]
        verbs: ["get", "list", "watch"]

Then bind it to a user:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: RoleBinding
    metadata:
      name: read-pods-binding
      namespace: dev
    subjects:
      - kind: User
        name: alice
    roleRef:
      kind: Role
      name: pod-reader
      apiGroup: rbac.authorization.k8s.io

The result is:
- `alice` can view pods in `dev`
- `alice` cannot automatically modify or delete them
- the API server enforces both identity and permission checks
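One way to picture how the Role and RoleBinding above combine is a toy evaluator. This is a sketch under simplifying assumptions: `is_allowed` is a hypothetical helper, and the real RBAC authorizer also handles groups, service accounts, ClusterRoles, wildcards, and resource names.

```python
# Toy RBAC check mirroring the pod-reader Role and read-pods-binding above.
# `is_allowed` is a hypothetical helper, not a Kubernetes API.

ROLES = {
    ("dev", "pod-reader"): [
        {"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]},
    ],
}
ROLE_BINDINGS = [
    {"namespace": "dev", "role": "pod-reader", "subjects": [("User", "alice")]},
]


def is_allowed(user, verb, resource, namespace, api_group=""):
    """RBAC is additive: allow only if some bound rule grants the request."""
    for binding in ROLE_BINDINGS:
        if binding["namespace"] != namespace:
            continue
        if ("User", user) not in binding["subjects"]:
            continue
        for rule in ROLES.get((namespace, binding["role"]), []):
            if (api_group in rule["apiGroups"]
                    and resource in rule["resources"]
                    and verb in rule["verbs"]):
                return True
    return False  # no matching rule means the request is denied


print(is_allowed("alice", "list", "pods", "dev"))
print(is_allowed("alice", "delete", "pods", "dev"))
print(is_allowed("alice", "list", "pods", "prod"))
```

There is no explicit deny rule in this model. A request is refused simply because nothing grants it, which is why `alice` cannot delete pods or read anything outside `dev`.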
That is the security chain in action:

    Identity verified -> Permissions checked -> Request allowed or denied

Controllers: the part that keeps the cluster honest
Once a request is accepted, Kubernetes still has to make something happen.
This is where controllers become important.
Controllers continuously:
- watch cluster state
- compare desired state with actual state
- act to close the gap
That loop is the heart of Kubernetes.

    Watch -> Compare -> Act -> Repeat

If you create a Deployment with `replicas: 3`, Kubernetes is not done after storing that YAML.
A controller keeps checking whether 3 matching pods are really running.
If only 2 exist, it creates another one. If one crashes, it creates another one. If the version changes, it drives the rollout process.
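The loop described above can be sketched in a few lines. This is a toy model, not how the Deployment or ReplicaSet controllers are implemented: `reconcile` and the generated pod names are hypothetical, and real controllers react to watch events from the API server instead of being called by hand.

```python
# Toy reconciliation loop for a replica count.
# `reconcile` and the pod names are hypothetical, for illustration only.
import itertools

_pod_ids = itertools.count(1)


def reconcile(desired_replicas, running_pods):
    """Compare desired vs actual and return the corrected pod list."""
    pods = list(running_pods)
    while len(pods) < desired_replicas:      # too few: create replacements
        pods.append(f"pod-{next(_pod_ids)}")
    while len(pods) > desired_replicas:      # too many: remove extras
        pods.pop()
    return pods


pods = reconcile(3, [])          # initial creation: three pods
pods.remove(pods[0])             # simulate one pod crashing
pods = reconcile(3, pods)        # the loop notices the gap and closes it
print(len(pods))
```

The function never asks how the gap appeared. It only compares the declared count with what exists, which is why a crash, a manual delete, and a scale-up are all handled by the same code path.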
Core controller responsibilities
Controllers carry a lot of the operational behavior people associate with Kubernetes.
Replication
Controllers ensure the correct number of pods exists.
If a pod disappears, the controller works to restore the declared replica count.
Node health response
Controllers also react when nodes fail.
If a node stops reporting properly:
- it can be marked `NotReady`
- workloads on it become unavailable
- replacement pods can be created on healthy nodes if capacity exists elsewhere
Continuous correction
This is the bigger lesson:
Kubernetes is not a one-time execution engine.
It is a continuously correcting system.
Scheduler: the placement engine
The scheduler is often mentioned alongside controllers, but it does a specific job:
It decides where a pod should run.
It is not about time. It is about placement.
Useful correction
The scheduler does not run your workload itself. It chooses the most suitable node for a pending pod based on cluster state and scheduling rules.
If a pod stays in Pending, that often means the scheduler could not find a valid placement.
What the scheduler looks at
Scheduling is not random.
The scheduler evaluates whether a node is a valid and sensible target based on several factors.
1. Resource availability
At the simplest level, the target node needs enough unreserved capacity for the pod's resource requests. The scheduler compares declared requests against what the node can allocate, not against live usage.
That can include:
- CPU
- memory
- GPU
If no node can satisfy the pod's requests, the pod stays pending until capacity appears.
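The resource check above can be sketched as a simple filter. This is a toy version with hypothetical names (`fits`, `feasible_nodes`) and made-up node data; the real scheduler applies many more filters and then scores the nodes that pass.

```python
# Toy version of the resource-fit filter: a node is feasible only if its
# unreserved capacity covers the pod's requests. Names are hypothetical.

def fits(pod_requests, allocatable, already_requested):
    """True if every requested resource fits in the node's free capacity."""
    return all(
        amount <= allocatable.get(res, 0) - already_requested.get(res, 0)
        for res, amount in pod_requests.items()
    )


def feasible_nodes(pod_requests, nodes):
    """Return the names of nodes that can host the pod."""
    return [
        name for name, node in nodes.items()
        if fits(pod_requests, node["allocatable"], node["requested"])
    ]


nodes = {  # CPU in millicores, memory in MiB
    "node-a": {"allocatable": {"cpu": 2000, "memory": 4096},
               "requested": {"cpu": 1800, "memory": 1024}},
    "node-b": {"allocatable": {"cpu": 4000, "memory": 8192},
               "requested": {"cpu": 1000, "memory": 2048}},
}
print(feasible_nodes({"cpu": 500, "memory": 512}, nodes))
print(feasible_nodes({"cpu": 500, "memory": 512, "gpu": 1}, nodes))
```

An empty result is the toy equivalent of a pod stuck in Pending: in the second call no node offers a GPU, so no node is feasible.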
2. Affinity and anti-affinity
These are placement rules based on relationships with other pods.
They help answer questions like:
- should these workloads stay close together?
- should these replicas be spread apart?
Affinity: place related pods together
Affinity is useful when two workloads benefit from being close.
A common reason is lower latency between tightly coupled services.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: frontend
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: frontend
      template:
        metadata:
          labels:
            app: frontend
        spec:
          affinity:
            podAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - backend
                  topologyKey: "kubernetes.io/hostname"
          containers:
            - name: frontend
              image: nginx

What this means:
- the scheduler looks for pods labeled `app=backend`
- it places each `frontend` pod only on a node that already runs such a pod (the `topologyKey` here is the node hostname)
- `requiredDuringSchedulingIgnoredDuringExecution` makes this a hard rule during scheduling, and it is not re-checked once the pod is running
Anti-affinity: spread similar pods apart
Anti-affinity helps reduce failure blast radius.
That is common for replicated services that should not all land on one node.

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: payment-service
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: payment
      template:
        metadata:
          labels:
            app: payment
        spec:
          affinity:
            podAntiAffinity:
              requiredDuringSchedulingIgnoredDuringExecution:
                - labelSelector:
                    matchExpressions:
                      - key: app
                        operator: In
                        values:
                          - payment
                  topologyKey: "kubernetes.io/hostname"
          containers:
            - name: payment
              image: nginx

The result is simple:
- no two `payment` replicas are placed on the same node; because this is a hard rule, a replica with no eligible node stays Pending
That improves availability when a node fails.
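A toy placement routine shows what the hard anti-affinity rule above implies when nodes run out. The helpers `violates_anti_affinity` and `place` are hypothetical, and the sketch assumes the topology domain is a single node, as with the `kubernetes.io/hostname` key.

```python
# Toy check for required pod anti-affinity with topologyKey = hostname.
# `violates_anti_affinity` and `place` are hypothetical helpers.

def violates_anti_affinity(node, pods_on_node, label_key, label_value):
    """True if the node already runs a pod carrying the given label."""
    return any(p.get(label_key) == label_value for p in pods_on_node.get(node, []))


def place(replicas, nodes, label_key="app", label_value="payment"):
    """Place replicas one per node; return (pods_per_node, pending_count)."""
    pods_on_node = {n: [] for n in nodes}
    pending = 0
    for _ in range(replicas):
        target = next(
            (n for n in nodes
             if not violates_anti_affinity(n, pods_on_node, label_key, label_value)),
            None,
        )
        if target is None:
            pending += 1          # hard rule: no valid node, pod stays Pending
        else:
            pods_on_node[target].append({label_key: label_value})
    return {n: len(p) for n, p in pods_on_node.items()}, pending


print(place(3, ["node-1", "node-2", "node-3"]))
print(place(3, ["node-1", "node-2"]))
```

With three nodes every replica gets its own machine. With two nodes the third replica has nowhere to go, which is the trade-off of a required rule compared with a preferred one.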
3. Taints and tolerations
Taints and tolerations work from a different angle.
Instead of expressing pod-to-pod relationships, they control whether certain pods are allowed onto certain nodes.
Taints: node-side keep-out rules
A taint is applied to the node.
Example:

    kubectl taint nodes node1 key=value:NoSchedule

That tells the scheduler:
Do not place pods here unless they tolerate this taint.
Tolerations: pod-side permission to enter
A toleration does not force a pod onto a node.
It only says:
This pod is allowed to run on a node with a matching taint.
Example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: allowed-pod
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: allowed
      template:
        metadata:
          labels:
            app: allowed
        spec:
          containers:
            - name: nginx
              image: nginx
          tolerations:
            - key: "key"
              operator: "Equal"
              value: "value"
              effect: "NoSchedule"

This pod can be scheduled onto a node tainted like this:

    kubectl taint nodes node1 key=value:NoSchedule

Now compare that with a mismatched toleration:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: blocked-pod
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: blocked
      template:
        metadata:
          labels:
            app: blocked
        spec:
          containers:
            - name: nginx
              image: nginx
          tolerations:
            - key: "key"
              operator: "Equal"
              value: "wrong-value"
              effect: "NoSchedule"

That does not match the taint value, so the pod is still blocked from that node.
A more flexible version uses Exists:

    tolerations:
      - key: "key"
        operator: "Exists"
        effect: "NoSchedule"

That means the pod tolerates any `NoSchedule` taint using that key, regardless of value.
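The matching rules from the three toleration examples above can be sketched as a small function. Field names follow the YAML, but `tolerates` and `can_schedule` are hypothetical helpers, and the sketch covers only the `Equal` and `Exists` operators with the `NoSchedule` effect. It ignores details such as a toleration with an empty key.

```python
# Toy matcher for taints and tolerations (Equal and Exists operators only).
# Field names follow the YAML above; the helper functions are hypothetical.

def tolerates(toleration, taint):
    """True if a single toleration matches a single taint."""
    if toleration.get("effect") and toleration["effect"] != taint["effect"]:
        return False
    if toleration.get("key") != taint["key"]:
        return False
    if toleration.get("operator", "Equal") == "Exists":
        return True                      # any value is accepted
    return toleration.get("value") == taint.get("value")


def can_schedule(pod_tolerations, node_taints):
    """A pod may land on a node only if every NoSchedule taint is tolerated."""
    return all(
        any(tolerates(tol, taint) for tol in pod_tolerations)
        for taint in node_taints
        if taint["effect"] == "NoSchedule"
    )


taint = {"key": "key", "value": "value", "effect": "NoSchedule"}
allowed = [{"key": "key", "operator": "Equal", "value": "value", "effect": "NoSchedule"}]
blocked = [{"key": "key", "operator": "Equal", "value": "wrong-value", "effect": "NoSchedule"}]
flexible = [{"key": "key", "operator": "Exists", "effect": "NoSchedule"}]

print(can_schedule(allowed, [taint]))
print(can_schedule(blocked, [taint]))
print(can_schedule(flexible, [taint]))
print(can_schedule([], [taint]))
```

Note that `can_schedule` only says a node is permitted. It does not steer the pod toward that node, which matches the point that a toleration never forces placement.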
A simple scheduler mental model
If scheduling rules feel abstract, this summary helps:

    Affinity = keep related pods together
    Anti-affinity = spread similar pods apart
    Taints = block pods from a node by default
    Tolerations = allow specific pods onto tainted nodes

Where these rules are defined
| Feature | Where Defined |
|---|---|
| Affinity / Anti-affinity | Deployment or Pod YAML |
| Tolerations | Deployment or Pod YAML |
| Taints | Applied on nodes through infrastructure or kubectl |
Deployments and rolling updates
Deployments are one of the easiest ways to see the control loop in practice.
When you update a Deployment, Kubernetes does not usually destroy everything at once.
Instead, it performs a rolling update.
That generally looks like this:
- Create a new pod
- Wait for it to become healthy
- Remove an old pod
- Repeat until the new version fully replaces the old one
That behavior gives you important benefits:
- lower downtime risk
- safer rollout progression
- simpler rollback path when something goes wrong
In CI/CD, this is powerful because the pipeline usually only needs to submit the new desired state.
The Deployment controller then handles the rollout mechanics.
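The rolling update steps above can be simulated with a toy generator. Stepping one pod at a time is an assumption made for clarity; real Deployments control the pace with `maxSurge` and `maxUnavailable`. The function `rolling_update` and the pod names are hypothetical.

```python
# Toy rolling update: surge one new pod, wait until healthy, retire one old pod.
# One-at-a-time stepping is an assumption; real Deployments tune this with
# maxSurge and maxUnavailable.

def rolling_update(old_pods, new_version, is_healthy=lambda pod: True):
    """Yield the pod list after each step; stall if a new pod is unhealthy."""
    pods = list(old_pods)
    for index, old in enumerate(old_pods):
        new_pod = f"{new_version}-{index}"
        pods.append(new_pod)               # 1. create a new pod
        if not is_healthy(new_pod):        # 2. wait for it to become healthy
            yield list(pods)               #    rollout stalls, old pods keep serving
            return
        pods.remove(old)                   # 3. remove an old pod
        yield list(pods)                   # 4. repeat with the next old pod


steps = list(rolling_update(["v1-0", "v1-1", "v1-2"], "v2"))
for step in steps:
    print(step)

stalled = list(rolling_update(["v1-0", "v1-1"], "v2", is_healthy=lambda pod: False))
print(stalled[-1])
```

In the healthy run the pod count never drops below three. In the stalled run both old pods are still present, which is the property that keeps rollback simple.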
Node health and resilience
Kubernetes reliability depends heavily on node health.
Imagine this scenario:
- A node stops responding
- The cluster detects that problem
- The node may be marked `NotReady`
- Workloads on that node become unavailable
- Replacement pods are scheduled elsewhere if healthy capacity exists
This is why single-node environments are dangerous for important workloads.
If the only node fails, there is nowhere else to place replacements.
Single-node limit
Kubernetes can recreate pods only if there is another healthy node with enough capacity to accept them. Self-healing still depends on available infrastructure.
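The single-node limit can be made concrete with a toy failure model. Everything here is hypothetical: the 40 second grace period is an illustrative assumption rather than a value to rely on, capacity is counted in pods instead of CPU and memory, and the real control plane also waits before evicting pods from an unreachable node.

```python
# Toy model of node failure handling. The 40 second grace period and all
# names are assumptions for illustration, not cluster defaults to rely on.

GRACE_SECONDS = 40


def node_status(last_heartbeat, now):
    """A node is Ready only while its heartbeats are recent."""
    return "Ready" if now - last_heartbeat <= GRACE_SECONDS else "NotReady"


def reschedule(pods_by_node, heartbeats, capacity, now):
    """Move pods off NotReady nodes onto Ready nodes with spare capacity."""
    ready = [n for n in pods_by_node if node_status(heartbeats[n], now) == "Ready"]
    result = {n: list(pods_by_node[n]) for n in ready}
    pending = []
    for node, pods in pods_by_node.items():
        if node in ready:
            continue
        for pod in pods:
            target = next((n for n in ready if len(result[n]) < capacity[n]), None)
            if target is None:
                pending.append(pod)      # nowhere healthy to run
            else:
                result[target].append(pod)
    return result, pending


two_nodes = reschedule(
    {"node-1": ["web-a"], "node-2": ["web-b"]},
    heartbeats={"node-1": 0, "node-2": 95},
    capacity={"node-1": 2, "node-2": 2},
    now=100,
)
one_node = reschedule({"node-1": ["web-a"]}, {"node-1": 0}, {"node-1": 2}, now=100)
print(two_nodes)
print(one_node)
```

With two nodes the pod from the failed node moves to the survivor. With one node it ends up in the pending list, because self-healing has no healthy capacity to use.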
The stronger pattern is:
- multiple nodes
- spread across Availability Zones
- scheduling rules that avoid stacking critical replicas on one machine
That is where Kubernetes moves from demo environment to resilient platform.
Cost vs resilience
There is always a trade-off here.
More nodes usually mean:
- more spare capacity
- better failure tolerance
- higher infrastructure cost
A common platform strategy is to use spare capacity intentionally:
- place lower-priority workloads there
- use `PriorityClass`
- allow preemption when critical workloads need room
This is a good example of real cloud engineering:
you are not only making the system work, you are balancing reliability and cost.
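The preemption idea above can be sketched as a toy decision on a single full node. The function `schedule_with_preemption`, the pod names, and the priority numbers are hypothetical, and capacity is counted in pods for simplicity. Real preemption weighs resource requests and other constraints as well.

```python
# Toy preemption: when a node is full, evict the lowest-priority pod if the
# incoming pod outranks it. Priority numbers and names are hypothetical.

def schedule_with_preemption(node_pods, capacity, new_pod):
    """Return (pods_after, evicted); pods are (name, priority) tuples."""
    pods = list(node_pods)
    if len(pods) < capacity:
        return pods + [new_pod], None            # spare room, nothing evicted
    victim = min(pods, key=lambda p: p[1])       # lowest-priority pod on the node
    if victim[1] >= new_pod[1]:
        return pods, None                        # nothing to preempt, pod not placed
    pods.remove(victim)
    return pods + [new_pod], victim


node = [("batch-report", 100), ("checkout", 1000)]
print(schedule_with_preemption(node, 2, ("payments", 1000)))
print(schedule_with_preemption(node, 2, ("batch-cleanup", 50)))
```

The low-priority batch pod occupies otherwise idle capacity until a critical workload needs the room, which is how spare capacity can be paid for without being wasted.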
Important controller types across the ecosystem
Once you understand the reconciliation model, the wider Kubernetes ecosystem makes much more sense.
Workload controllers
- `Deployment`
- `ReplicaSet`
- `StatefulSet`
- `DaemonSet`
- `Job`
- `CronJob`
Autoscaling
- HPA
- VPA
- Cluster Autoscaler
- Karpenter
- KEDA
Networking
- Ingress controllers
- AWS Load Balancer Controller
- Istio
- Linkerd
Security and policy
- OPA Gatekeeper
- Kyverno
- Falco
Secrets and config
- External Secrets Operator
- cert-manager
Observability
- Prometheus Operator
- Grafana
GitOps and delivery
- Flux
- Argo CD
- Tekton
Infrastructure
- Crossplane
- AWS Controllers for Kubernetes
Backup and chaos
- Velero
- LitmusChaos
These tools can feel unrelated at first, but many of them follow the same core pattern:
- watch a resource
- compare actual state with desired state
- act until they match
That is why the controller model is so important.
The final mental model
If you want one compact map of Kubernetes internals, use this:

    API Server -> entry point
    Auth -> who you are
    RBAC -> what you can do
    Scheduler -> where it runs
    Controllers -> make and keep it running
    Nodes -> where workloads live

This is the model that helps many separate ideas click into one system.
Golden rule
At its core, Kubernetes keeps coming back to the same loop:
Watch. Compare. Act.
That is why it can:
- recreate lost pods
- roll out new versions
- enforce placement rules
- reconcile platform resources continuously
Closing thought
Once you start seeing Kubernetes as a control system rather than just a container runtime, a lot of confusion disappears.
You are no longer only thinking:
- how do I run this Deployment?
You are also thinking:
- how does the request enter the cluster?
- how is identity verified?
- how are permissions enforced?
- who decides placement?
- what keeps the declared state true over time?
That shift is what starts turning Kubernetes from a tool you use into a system you actually understand.
