
AWS EKS Production Tuning Cheatsheet: The Complete Reference

By Jena

Running Kubernetes on AWS with Elastic Kubernetes Service (EKS) takes deliberate configuration to achieve high availability, scale efficiently, and control operational costs. The default settings are rarely sufficient for high-concurrency production deployments.

This reference sheet covers Karpenter-driven autoscaling, AWS VPC CNI network tuning, EKS Pod Identity setups, and cost-effective node pools.


- **Karpenter Autoscaling**: Replace slow Auto Scaling Groups (ASGs) with Karpenter to provision instances directly from AWS EC2 APIs.
- **Prefix Delegation**: Expand pod capacity per node significantly using AWS VPC CNI prefix delegation.
- **EKS Pod Identities**: Authenticate pods directly to AWS resources using EKS Pod Identity bindings, bypassing complex IRSA OIDC configurations.
- **Spot Instances**: Leverage mixed Spot and On-Demand architectures with Karpenter to reduce monthly compute bills.

Before diving into this cheatsheet, check out my previous deep-dive on Nginx Cheat Sheet: Routing, SSL & Performance Guide to see how we structured these patterns in practice.

Deploying Karpenter for Fast Autoscaling

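Karpenter is a cluster autoscaler that provisions nodes without AWS Auto Scaling Groups (ASGs). It calls the Amazon EC2 API directly and requests instances sized for the pending pods, typically issuing the launch within seconds of pods becoming unschedulable. The new node still needs time to boot and join the cluster before pods can run on it.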

1. Define Karpenter NodePool (nodepool.yaml)

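apiVersion: karpenter.sh/v1beta1 # v1beta1 API (Karpenter 0.32+); Karpenter 1.0 introduces karpenter.sh/v1, which moves or renames some fields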
kind: NodePool
metadata:
  name: general-compute
spec:
  template:
    spec:
      requirements:
        # Allow both capacity types; Karpenter prefers Spot and falls back to On-Demand
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
      nodeClassRef:
        name: default-ec2-nodeclass
  # Limit the maximum aggregate resources Karpenter can provision
  limits:
    cpu: 1000
    memory: 4000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h # Rotate nodes every 30 days

2. Define AWS EC2NodeClass (nodeclass.yaml)

apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default-ec2-nodeclass
spec:
  amiFamily: Bottlerocket # High-security, minimal container operating system
  role: KarpenterNodeRole-meshworld
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: meshworld-eks-cluster
  securityGroupSelectorTerms:
    - tags:
        kubernetes.io/cluster/meshworld-eks-cluster: owned

AWS VPC CNI Network Tuning

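By default, the AWS VPC Container Network Interface (CNI) assigns every pod a secondary private IP address from your VPC subnets. Each instance type supports only a fixed number of network interfaces (ENIs) and addresses per ENI, so this caps pod density per node and can exhaust the node's IP limit quickly. Enable prefix delegation to solve this issue.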

# 1. Enable Prefix Delegation (attaches /28 prefix blocks to each ENI instead of single IPs; requires Nitro-based instances)
kubectl set env daemonset aws-node -n kube-system ENABLE_PREFIX_DELEGATION=true

# 2. Configure a warm IP target so new pods get an address without waiting on an EC2 API call
# Keeps at least 16 unassigned IPs available on each node
kubectl set env daemonset aws-node -n kube-system WARM_IP_TARGET=16

# 3. Verify daemonset logs for confirmation
kubectl logs -n kube-system -l k8s-app=aws-node
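
To see how much headroom prefix delegation adds, the per-node pod ceiling can be estimated from the instance's ENI limits. The following is a minimal Python sketch, not part of the VPC CNI itself. The function name `max_pods` is hypothetical, the m5.large figures (3 ENIs, 10 IPv4 addresses per ENI, 2 vCPUs) are taken from the EC2 instance limits, and the 110/250 ceiling is an assumption based on AWS's published max-pods guidance.

```python
def max_pods(enis: int, ips_per_eni: int, vcpus: int,
             prefix_delegation: bool = False) -> int:
    """Estimate the pod ceiling for one node under the AWS VPC CNI."""
    # One address on each ENI is the ENI's own primary IP and cannot go to a pod.
    slots_per_eni = ips_per_eni - 1
    if prefix_delegation:
        # Each slot holds a /28 prefix (16 addresses) instead of a single IP.
        slots_per_eni *= 16
    # +2 covers host-network pods (aws-node, kube-proxy) that need no pod IP.
    raw = enis * slots_per_eni + 2
    # Assumed ceiling from AWS guidance: 110 pods on smaller nodes, 250 on larger ones.
    ceiling = 250 if vcpus > 30 else 110
    return min(raw, ceiling)


# m5.large: 3 ENIs, 10 IPv4 addresses per ENI, 2 vCPUs
print(max_pods(3, 10, vcpus=2))                          # 29
print(max_pods(3, 10, vcpus=2, prefix_delegation=True))  # 110
```

Without prefix delegation the node is limited by addresses; with it, the recommended ceiling becomes the limit instead. Note that the kubelet's `--max-pods` value on the node must also be raised for the extra addresses to be usable.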

Authenticating with EKS Pod Identities

EKS Pod Identities simplify how applications running inside pods authenticate with AWS services (like S3 or DynamoDB). They avoid the per-cluster OIDC provider setup that IAM Roles for Service Accounts (IRSA) requires.

1. Create a Kubernetes Service Account

apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-s3-reader-sa
  namespace: production

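2. Create the Pod Identity Association

Link the service account to your target IAM Role using the AWS CLI or Terraform. The cluster must be running the EKS Pod Identity Agent add-on (`eks-pod-identity-agent`). Pods that use this service account and start after the association exists receive temporary credentials for the role; pods that were already running must be restarted to pick them up.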

# Link Service Account to your AWS IAM Role
aws eks create-pod-identity-association \
  --cluster-name meshworld-eks-cluster \
  --namespace production \
  --service-account app-s3-reader-sa \
  --role-arn arn:aws:iam::111122223333:role/EksS3ReaderRole
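
The association only works if the IAM role trusts the EKS Pod Identity service principal. As a rough sketch, the Python snippet below builds such a trust policy and prints it as JSON. The function name `pod_identity_trust_policy` is hypothetical; the principal `pods.eks.amazonaws.com` and the two STS actions are the ones EKS Pod Identity expects.

```python
import json


def pod_identity_trust_policy() -> dict:
    """Return a trust policy that lets EKS Pod Identity assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Service principal used by EKS Pod Identity
                "Principal": {"Service": "pods.eks.amazonaws.com"},
                # TagSession lets EKS attach cluster, namespace, and
                # service account tags to the session
                "Action": ["sts:AssumeRole", "sts:TagSession"],
            }
        ],
    }


print(json.dumps(pod_identity_trust_policy(), indent=2))
```

One way to use it is to save the output to a file such as `trust-policy.json` (an illustrative name) and pass it to `aws iam create-role` with `--assume-role-policy-document file://trust-policy.json` when creating `EksS3ReaderRole`.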

Production Pod Disruption Budgets (PDBs)

To guarantee service availability during cluster upgrades or consolidation, always establish Pod Disruption Budgets to restrict voluntary evictions.

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-service-pdb
  namespace: production
spec:
  # Blocks voluntary evictions that would leave fewer than 80% of replicas available
  minAvailable: 80%
  selector:
    matchLabels:
      app: api-service
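
A percentage-based `minAvailable` is rounded up to a whole number of pods, which matters for small deployments. The sketch below is illustrative Python that mirrors this rounding; the function name `allowed_disruptions` is hypothetical, and it assumes every replica is currently healthy unless told otherwise.

```python
from typing import Optional


def allowed_disruptions(replicas: int, min_available_percent: int,
                        healthy: Optional[int] = None) -> int:
    """Estimate how many pods a PDB lets you evict voluntarily right now."""
    if healthy is None:
        healthy = replicas  # assumption: all replicas are currently healthy
    # Kubernetes rounds a percentage up to the next whole pod (integer ceiling).
    required = (replicas * min_available_percent + 99) // 100
    return max(0, healthy - required)


for replicas in (4, 5, 10):
    print(replicas, allowed_disruptions(replicas, 80))
# 4 0
# 5 1
# 10 2
```

With `minAvailable: 80%`, a deployment with fewer than five replicas allows no voluntary evictions at all, which can stall node drains and Karpenter consolidation. Either run at least five replicas or choose a budget that leaves room for one eviction.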