Cross-account TargetGroupBinding

This page is the design deep-dive that complements the project page. It covers the IAM threading, the SG topology, and the controller config that actually makes this work in production.

There are two AWS accounts. Both have VPCs connected via Transit Gateway (the platform is peering-free by policy).

                   Internet
┌────────────────────────────────────────────────────────┐
│ AWS account A: "edge" (e.g. shared-staging)            │
│                                                        │
│   ACM cert         ALB                                 │
│      │              │                                  │
│      └─────────────▶│ Listener :443 → TG-1             │
│                     ▼                                  │
│               TargetGroup-1                            │
│                (type: ip)                              │
│                     ▲                                  │
│                     │ pod IPs registered               │
│                     │ (cross-account)                  │
│                     │                                  │
│        ┌────────────┴───────────┐                      │
│        │ AWS-assumable role for │                      │
│        │ AWS LBC in account B   │                      │
│        └────────────────────────┘                      │
└────────────────────────────────────────────────────────┘
                          (TGW)
┌────────────────────────────────────────────────────────┐
│ AWS account B: "workload" (cluster account)            │
│                                                        │
│   EKS cluster                                          │
│      │                                                 │
│      ▼                                                 │
│   AWS LBC pod ─── Pod Identity ─── IAM role            │
│      │                             in account B        │
│      │                                 │               │
│      │                                 ▼               │
│      │                   sts:AssumeRole into account A │
│      │                                 │               │
│      ▼                                 ▼               │
│   TargetGroupBinding CRD                               │
│     spec:                                              │
│       targetGroupARN: arn:...:A/TG-1                   │
│       serviceRef: my-app-svc                           │
│                                                        │
│   App Service ─── pods (in B’s pod CIDR)               │
│                                                        │
│   ALB health probes → pod IPs (via TGW)                │
└────────────────────────────────────────────────────────┘

Three sets of IAM permissions matter:

1. AWS LBC’s Pod Identity role (account B)

The Pod Identity association binds the AWS LBC ServiceAccount to an IAM role in account B. That role has the LBC’s normal permissions on B’s resources, plus:

{
  "Effect": "Allow",
  "Action": "sts:AssumeRole",
  "Resource": "arn:aws:iam::ACCOUNT_A:role/aws-lbc-cross-account"
}
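
For orientation, the account B side of this wiring could be sketched in CloudFormation-style YAML roughly as follows. This is a sketch under assumptions, not the platform's actual template: the logical resource names and the `kube-system` namespace are illustrative, the role name is taken from the trust-policy principal used for account A, and the LBC's usual permissions on account B's resources are left out for brevity.

```yaml
# Sketch only: account B resources backing the LBC's Pod Identity role.
Resources:
  LbcPodIdentityRole:                  # logical name is illustrative
    Type: AWS::IAM::Role
    Properties:
      RoleName: aws-lbc-pod-identity
      AssumeRolePolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Effect: Allow
            Principal:
              Service: pods.eks.amazonaws.com   # EKS Pod Identity service principal
            Action:
              - sts:AssumeRole
              - sts:TagSession
      Policies:
        - PolicyName: assume-cross-account-role
          PolicyDocument:
            Version: "2012-10-17"
            Statement:
              - Effect: Allow
                Action: sts:AssumeRole
                Resource: arn:aws:iam::ACCOUNT_A:role/aws-lbc-cross-account

  LbcPodIdentityAssociation:           # binds the ServiceAccount to the role
    Type: AWS::EKS::PodIdentityAssociation
    Properties:
      ClusterName: <your-workload-cluster>
      Namespace: kube-system           # assumed install namespace
      ServiceAccount: aws-load-balancer-controller
      RoleArn: !GetAtt LbcPodIdentityRole.Arn
```

The same two resources map directly onto Terraform or any other IaC tool; the important parts are the `pods.eks.amazonaws.com` trust principal and the single `sts:AssumeRole` grant pointing at account A.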

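2. Cross-account role (account A)

In account A, a role exists that account B’s Pod Identity role can assume. Its trust policy names that role as the principal. sts:TagSession is included because Pod Identity sessions carry session tags, and those tags are typically passed along when the role is chained:

{
  "Effect": "Allow",
  "Principal": { "AWS": "arn:aws:iam::ACCOUNT_B:role/aws-lbc-pod-identity" },
  "Action": ["sts:AssumeRole", "sts:TagSession"]
}

Its permissions policy covers only target registration. The Describe* actions do not support resource-level permissions, so they sit in their own statement with "Resource": "*":

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:DescribeTargetHealth",
        "elasticloadbalancing:DescribeTargetGroups"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:RegisterTargets",
        "elasticloadbalancing:DeregisterTargets"
      ],
      "Resource": "arn:aws:elasticloadbalancing:REGION:ACCOUNT_A:targetgroup/*"
    }
  ]
}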

Note: scope the RegisterTargets/DeregisterTargets resource down to specific target-group ARN patterns if you want a tighter blast radius.

3. EC2 / VPC permissions stay in account B

The AWS LBC still needs ec2:DescribeInstances, ec2:DescribeSecurityGroups, etc. — but those are on account B’s own resources and don’t need cross-account threading.

Once IAM is wired, you teach the cluster about the cross-account target group:

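apiVersion: elbv2.k8s.aws/v1beta1
kind: TargetGroupBinding
metadata:
  name: my-app-tgb
  namespace: apps
spec:
  serviceRef:
    name: my-app
    port: 80
  targetGroupARN: arn:aws:elasticloadbalancing:us-east-1:ACCOUNT_A:targetgroup/my-app-tg/abcd1234
  targetType: ip
  ipAddressType: ipv4
  # role in account A that the controller assumes for this binding; the field
  # exists only in newer LBC releases, so check your controller version
  iamRoleArnToAssume: arn:aws:iam::ACCOUNT_A:role/aws-lbc-cross-account
  # health-check config inherits from the TG itself; no override needed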

The AWS LBC reconciles this, calls RegisterTargets against account A’s target group (via the AssumeRole), and the ALB starts seeing pod IPs from account B’s pod CIDR.
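
For completeness, the Service that the binding’s serviceRef points at might look like the sketch below. The pod label and container port are assumptions made for illustration; only the Service name, namespace, and port come from the binding itself.

```yaml
# Sketch: the Service referenced by the TargetGroupBinding's serviceRef.
apiVersion: v1
kind: Service
metadata:
  name: my-app            # must match spec.serviceRef.name
  namespace: apps         # must be the binding's namespace
spec:
  type: ClusterIP         # with targetType: ip, pod IPs are registered directly
  selector:
    app: my-app           # assumed pod label
  ports:
    - port: 80            # must match spec.serviceRef.port
      targetPort: 8080    # assumed container port
```

With ip targets, the controller resolves the Service’s endpoints and registers the pod IP and target port, so a plain ClusterIP Service is enough; no NodePort is involved.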

The ALB in account A has its own SG, and that SG’s outbound rules have to allow egress to the workload pod CIDR in account B. Because SG-to-SG references don’t carry across the TGW-routed accounts in this setup, the rules are written against CIDRs:

resource "aws_security_group_rule" "alb_egress_to_workload_pod_cidr" {
  type              = "egress"
  from_port         = 0
  to_port           = 65535
  protocol          = "tcp"
  cidr_blocks       = [var.workload_account_pod_cidr]
  security_group_id = aws_security_group.alb.id
}

Similarly, the workload-side pod SG (or the security group on the EKS managed node group / Karpenter NodePool) needs to allow ingress from the ALB. An SG has no CIDR of its own, so the source is the CIDRs of the ALB’s subnets; the whole ALB VPC CIDR is sufficient if you don’t need to be that precise.
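
If the workload uses security groups for pods, attaching the pod SG could be sketched as below. This assumes the VPC CNI’s SecurityGroupPolicy resource is in use; the resource name, pod label, and security group ID are hypothetical placeholders, and the ingress rule from the ALB side still has to exist on that group.

```yaml
# Sketch: attach a dedicated pod SG to the app's pods (security groups for pods).
apiVersion: vpcresources.k8s.aws/v1beta1
kind: SecurityGroupPolicy
metadata:
  name: my-app-pod-sg            # hypothetical name
  namespace: apps
spec:
  podSelector:
    matchLabels:
      app: my-app                # assumed pod label
  securityGroups:
    groupIds:
      - sg-0123456789abcdef0     # placeholder: SG that allows ingress from the ALB subnets
```

When pods simply inherit the node security group instead, no such resource is needed and the ingress rule goes on the node group’s SG.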

Without a readiness gate, two independent signals race: the ALB marking the target healthy and Kubernetes marking the pod Ready. A rolling deploy can take down old pods before the new pods are healthy in the ALB, dropping connections.

The AWS LBC ships a pod readiness gate that holds the pod in Ready=false until the ALB has the target marked healthy. Turn it on per-namespace:

apiVersion: v1
kind: Namespace
metadata:
  name: apps
  labels:
    elbv2.k8s.aws/pod-readiness-gate-inject: enabled

Now rolling deploys are health-aware end-to-end.
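
To check that injection worked, it can help to know roughly what an injected pod looks like. The excerpt below is illustrative rather than captured output, and it assumes the condition type is derived from the TargetGroupBinding name (`my-app-tgb` here). The gate is added by a mutating webhook at pod creation, so pods that existed before the namespace was labeled need to be recreated.

```yaml
# Illustrative excerpt of a pod after readiness-gate injection (not a full manifest).
spec:
  readinessGates:
    - conditionType: target-health.elbv2.k8s.aws/my-app-tgb   # assumed naming
status:
  conditions:
    - type: target-health.elbv2.k8s.aws/my-app-tgb
      status: "True"        # set by the controller once the ALB reports the target healthy
    - type: Ready
      status: "True"        # only becomes True after the gate condition is True
```

Running `kubectl get pods -n apps -o wide` shows a READINESS GATES column, which is a quick way to spot pods that are still waiting on the ALB.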

In the AWS LBC Helm values, add:

serviceAccount:
  create: false
  name: aws-load-balancer-controller
  # Pod Identity association is created against the EKS cluster in account B, not here
controller:
  extraArgs:
    - --aws-region=us-east-1
    - --cluster-name=<your-workload-cluster>
    # No controller flag is needed for the cross-account AssumeRole; the role
    # to assume is named per binding on the TargetGroupBinding.

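No special LBC flag is needed for the cross-account reconcile. The TargetGroupBinding carries what the controller needs: targetGroupARN identifies the target group in account A, and, in LBC releases that support it, spec.iamRoleArnToAssume names the account A role to assume. The Pod Identity role only grants permission to call sts:AssumeRole; the SDK does not assume a role on its own just because that permission exists.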

Without the pod readiness gate, a flapping target can trigger the following sequence during a rolling deploy:

  1. Pod boots, registers as target (cross-account)
  2. Pod’s first request is the ALB health check; the app isn’t warm; check fails
  3. ALB marks target unhealthy → pod isn’t sent traffic
  4. Pod is still Ready=true in Kubernetes, so the rolling deploy moves on to the next pod
  5. The old pod has already been terminated → no healthy targets → 503

The readiness gate breaks the cycle: step 4 doesn’t happen because the pod isn’t Kubernetes-Ready until ALB-healthy.
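
The gate only helps if the rollout itself waits on readiness. A minimal Deployment sketch that pairs with it is shown below; the replica count, image, port, and probe path are assumptions for illustration, not values from the platform.

```yaml
# Sketch: rollout settings that let the readiness gate pace the deploy.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: apps                     # namespace carrying the readiness-gate label
spec:
  replicas: 3                         # assumed
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 0               # never remove an old pod before a new one is Ready
      maxSurge: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: app
          image: registry.example.com/my-app:1.0.0   # hypothetical image
          ports:
            - containerPort: 8080                    # assumed port
          readinessProbe:
            httpGet:
              path: /healthz                         # assumed endpoint
              port: 8080
```

With `maxUnavailable: 0`, the Deployment controller waits for each surge pod to report Ready, and with the gate injected that now includes the ALB’s view of the target.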

This pattern was validated end to end in May 2026, with a pilot staging cluster registering into the shared-edge account. The next day it was rolled out across all six staging clusters in the fleet (Pod Identity association, cross-account IAM role, TargetGroupBinding resources, SG egress rules, readiness gates), using the ArgoCD ApplicationSet pattern from GitOps engine to fan out the cluster-side config.
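
A fan-out of the cluster-side config along these lines could be sketched with an ApplicationSet cluster generator. This is not the fleet’s actual definition: the cluster label, repository URL, path, and project are hypothetical, and it assumes the staging clusters are already registered with ArgoCD.

```yaml
# Sketch: one Application per registered staging cluster.
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: cross-account-tgb
  namespace: argocd
spec:
  generators:
    - clusters:
        selector:
          matchLabels:
            env: staging                       # assumed cluster label
  template:
    metadata:
      name: '{{name}}-cross-account-tgb'       # {{name}} is the ArgoCD cluster name
    spec:
      project: default
      source:
        repoURL: https://git.example.com/platform/cluster-config.git   # hypothetical repo
        targetRevision: main
        path: cross-account-tgb                # hypothetical path holding the TGB and namespace label
      destination:
        server: '{{server}}'
        namespace: apps
      syncPolicy:
        automated:
          prune: true
          selfHeal: true
```

The account-side pieces (IAM roles, Pod Identity association, SG rules) are not Kubernetes objects, so they would be rolled out through the IaC pipeline rather than this ApplicationSet.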