Featured Win

Tightening EKS ingress health checks to make rollouts boring again

Reduced deployment friction by aligning ALB health checks, readiness behavior, and ingress expectations so rollout failures became faster to diagnose and less disruptive.

  • Debugging Stories
  • AWS
  • Kubernetes
  • EKS
  • Debugging
Date: 2026-04-21
Category: Debugging Stories
Role: Cloud / DevOps Engineer
Proof Type: Production incident fix

Impact

Turned a recurring ingress failure into a repeatable, low-drama diagnosis path with clearer rollout signals and faster recovery.

Situation

A service looked healthy from inside the cluster, but traffic routed through the ALB kept failing health checks during deployment windows. The issue was not a single broken setting. It was drift between ingress assumptions, readiness behavior, and what the load balancer actually expected.

What I changed

I traced the request path from Ingress to target group behavior, then tightened the interfaces between Kubernetes and AWS:

  • aligned the health check path with what the application really exposed
  • checked service-to-pod port mapping against ingress expectations
  • made readiness behavior reflect external availability more accurately
  • verified controller-created target group behavior instead of only reading manifests
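
The first two checks in the list above can be sketched as a small comparison over already-parsed manifests. This is a minimal illustration, not the tooling used in the incident: the function name `find_mismatches` and the dictionary shapes are assumptions, and the only real identifier is the AWS Load Balancer Controller annotation `alb.ingress.kubernetes.io/healthcheck-path`, whose documented default is `/`.

```python
# Hypothetical helper: compares what the ALB will probe with what the pod exposes.
# Inputs are plain dicts, e.g. loaded from `kubectl get ... -o json`.

HEALTHCHECK_PATH = "alb.ingress.kubernetes.io/healthcheck-path"


def find_mismatches(ingress, service, container):
    """Return human-readable mismatches between Ingress, Service, and container."""
    problems = []

    # 1. Health check path: ALB annotation vs. the HTTP readiness probe.
    annotations = ingress.get("metadata", {}).get("annotations", {})
    alb_path = annotations.get(HEALTHCHECK_PATH, "/")  # "/" when the annotation is unset
    probe = container.get("readinessProbe", {}).get("httpGet")
    if probe is None:
        problems.append("container has no HTTP readiness probe")
    elif probe.get("path") != alb_path:
        problems.append(
            f"ALB checks {alb_path!r} but readiness probe checks {probe.get('path')!r}"
        )

    # 2. Port mapping: every Service targetPort must match a container port
    #    by number or by name.
    exposed = set()
    for port in container.get("ports", []):
        exposed.add(port.get("containerPort"))
        if "name" in port:
            exposed.add(port["name"])
    for svc_port in service.get("spec", {}).get("ports", []):
        # targetPort falls back to port when omitted.
        target = svc_port.get("targetPort", svc_port.get("port"))
        if target not in exposed:
            problems.append(
                f"service targetPort {target!r} is not exposed by the container"
            )

    return problems


if __name__ == "__main__":
    ingress = {"metadata": {"annotations": {HEALTHCHECK_PATH: "/healthz"}}}
    service = {"spec": {"ports": [{"port": 80, "targetPort": 80}]}}
    container = {
        "ports": [{"containerPort": 8080}],
        "readinessProbe": {"httpGet": {"path": "/ready", "port": 8080}},
    }
    for problem in find_mismatches(ingress, service, container):
        print(problem)
```

A check like this only covers declared intent. It says nothing about what the controller actually created, which is why the last item in the list still matters.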

Why it worked

The problem stopped looking like "ALB is failing" and started looking like an interface mismatch between declared routing intent and runtime health semantics. Once those boundaries were made explicit, diagnosis got much faster and the rollout path became more predictable.

Operational result

The biggest win was not just the fix but a calmer release path, with fewer ambiguous failures during deployment.

Reusable lesson

In EKS, ingress failures are often coordination failures across multiple control loops rather than one bad YAML field. The fastest path is usually to verify each boundary directly: Ingress, target group behavior, service mapping, pod readiness, and application path exposure.
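
Verifying target group behavior directly, rather than inferring it from manifests, might look like the sketch below. It assumes the response shape of the ELBv2 `DescribeTargetHealth` API as returned by boto3; the function name `summarize_target_health` and the sample data are hypothetical, and the summarizing step is kept free of AWS calls so it can be run without credentials.

```python
from collections import defaultdict


def summarize_target_health(response):
    """Group targets by health state, keeping the reason code for each one.

    `response` is expected to look like the dict returned by
    boto3.client("elbv2").describe_target_health(TargetGroupArn=...).
    """
    summary = defaultdict(list)
    for desc in response.get("TargetHealthDescriptions", []):
        target = desc["Target"]
        health = desc["TargetHealth"]
        label = f'{target["Id"]}:{target.get("Port")}'
        # Reason is absent for healthy targets, so default to None.
        summary[health["State"]].append((label, health.get("Reason")))
    return dict(summary)


if __name__ == "__main__":
    # Illustrative sample only; real data would come from the API call above.
    sample = {
        "TargetHealthDescriptions": [
            {
                "Target": {"Id": "10.0.1.15", "Port": 8080},
                "TargetHealth": {"State": "healthy"},
            },
            {
                "Target": {"Id": "10.0.2.27", "Port": 8080},
                "TargetHealth": {
                    "State": "unhealthy",
                    "Reason": "Target.ResponseCodeMismatch",
                },
            },
        ]
    }
    for state, targets in summarize_target_health(sample).items():
        print(state, targets)
```

The reason code is the useful part during a rollout: a response code mismatch points at the health check path or the application, while a timeout points more toward ports, security groups, or a pod that is not yet listening.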
