How to assume an AWS IAM role from a Service Account in EKS with Terraform

Jacob Lärfors • 5 minutes • 2021-12-07

When working with AWS Elastic Kubernetes Service (EKS) clusters, your pods will likely want to interact with other AWS services and possibly other EKS clusters. In a recent project we were setting up ArgoCD with multiple EKS clusters and our goal was to use Kubernetes Service Accounts to assume an AWS IAM role to authenticate with other EKS clusters. This led to some learning and discovering that we'd like to share with you.


When running workloads in EKS, pods operate under a service account, which allows us to enforce RBAC within a Kubernetes cluster. We are not going to cover that in this post; instead we want to talk about how pods can reach outside the cluster and interact with other AWS services. The AWS documentation for this is fairly good if you want a reference point. There is also a workshop on this topic which might be useful to run through.

In our case, we were trying to communicate across EKS clusters to allow ArgoCD to manage multiple clusters and there is a pretty mammoth GitHub issue with people struggling (and succeeding!) with this. That GitHub issue partly inspired this blog post - if it was an easy topic people would not struggle and a blog would not be necessary ;)

The Plan

The simple breakdown of what we need:

  1. An EKS cluster with an IAM OIDC provider
  2. A Kubernetes Service Account in the EKS cluster
  3. An AWS IAM role which we are going to assume (meaning we can do whatever that role is able to do)
  4. An AWS IAM role policy that allows our Service Account (2.) to assume our AWS IAM role (3.)

We won't bore you with creating an EKS cluster and an IAM OIDC provider... Pick your poison for how you want to do this... We personally use Terraform and the awesome EKS module that has a convenient input enable_irsa which creates the OIDC provider for us.
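If you would rather create the OIDC provider by hand instead of relying on the module, a minimal sketch could look something like the following. It assumes the hashicorp/tls provider is available for fetching the certificate thumbprint, and the resource names (this, eks) are just picked for illustration.

```hcl
# Look up the existing EKS cluster (replace the placeholder with your cluster name)
data "aws_eks_cluster" "this" {
  name = "<cluster-name>"
}

# Fetch the certificate chain of the cluster's OIDC issuer (needs the tls provider)
data "tls_certificate" "eks" {
  url = data.aws_eks_cluster.this.identity[0].oidc[0].issuer
}

# Register the cluster's OIDC issuer as an identity provider in IAM
resource "aws_iam_openid_connect_provider" "eks" {
  url             = data.aws_eks_cluster.this.identity[0].oidc[0].issuer
  client_id_list  = ["sts.amazonaws.com"]
  thumbprint_list = [data.tls_certificate.eks.certificates[0].sha1_fingerprint]
}
```

If the EKS module already creates the provider for you via enable_irsa, you do not need this as well.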

Basic Deployment with Terraform

Before we create the Service Account and the IAM role we need to define the names of these as there's a bit of a cyclic dependency - the Service Account needs to know the role ARN, and the role policy needs to know the Service Account name and namespace (if we want to limit scope, which we do!).

Locals

So let's define some locals to keep things simple and DRY.

# locals.tf

locals {
  k8s_service_account_name      = "iam-role-test"
  k8s_service_account_namespace = "default"

  # Get the EKS OIDC Issuer without https:// prefix
  eks_oidc_issuer = trimprefix(data.aws_eks_cluster.eks.identity[0].oidc[0].issuer, "https://")
}

IAM

And let's define the Terraform code that creates the IAM role with a policy allowing the service account to assume that role.

# iam.tf

#
# Get the caller identity so that we can get the AWS Account ID
#
data "aws_caller_identity" "current" {}

#
# Get the EKS cluster we want to target
#
data "aws_eks_cluster" "eks" {
  name = "<cluster-name>"
}

#
# Create the IAM role that will be assumed by the service account
#
resource "aws_iam_role" "iam_role_test" {
  name               = "iam-role-test"
  assume_role_policy = data.aws_iam_policy_document.iam_role_test.json
}

#
# Create IAM policy allowing the k8s service account to assume the IAM role
#
data "aws_iam_policy_document" "iam_role_test" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type = "Federated"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${local.eks_oidc_issuer}"
      ]
    }

    # Limit the scope so that only our desired service account can assume this role
    condition {
      test     = "StringEquals"
      variable = "${local.eks_oidc_issuer}:sub"
      values = [
        "system:serviceaccount:${local.k8s_service_account_namespace}:${local.k8s_service_account_name}"
      ]
    }
  }
}
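If you want to tighten the trust policy a little further, you can also check the audience of the token. Below is a sketch of the same policy document with an extra aud condition. It assumes the OIDC provider was registered with sts.amazonaws.com as its client ID, and the name iam_role_test_with_aud is made up for illustration. It reuses the locals and the caller identity data source from above.

```hcl
# Variant of the trust policy that checks both the subject and the audience.
# Multiple condition blocks in one statement must all match.
data "aws_iam_policy_document" "iam_role_test_with_aud" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type = "Federated"
      identifiers = [
        "arn:aws:iam::${data.aws_caller_identity.current.account_id}:oidc-provider/${local.eks_oidc_issuer}"
      ]
    }

    # Only our desired service account can assume this role
    condition {
      test     = "StringEquals"
      variable = "${local.eks_oidc_issuer}:sub"
      values = [
        "system:serviceaccount:${local.k8s_service_account_namespace}:${local.k8s_service_account_name}"
      ]
    }

    # Only tokens issued for AWS STS are accepted (assumed audience)
    condition {
      test     = "StringEquals"
      variable = "${local.eks_oidc_issuer}:aud"
      values   = ["sts.amazonaws.com"]
    }
  }
}
```

To use it, you would point assume_role_policy on the role at this document instead of the original one.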

Service Account and Pod

Then we need to create some Kubernetes resources. When working with Terraform it can make a lot of sense to use the Terraform Kubernetes Provider to apply our Kubernetes resources, especially as Terraform knows the ARN of the role and we can reuse our locals. However, if you don't want yet another provider dependency in Terraform you can easily do this with vanilla Kubernetes.

Terraform Kubernetes

NOTE: you will need to configure the Kubernetes provider if you want to do this via Terraform
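One common way to configure the provider is to reuse the EKS cluster data source from iam.tf together with a short-lived token. This is only a sketch of one option, and the data source name eks for the auth token is our own choice; depending on your setup you may prefer exec-based authentication instead.

```hcl
# Fetch a short-lived authentication token for the cluster
data "aws_eks_cluster_auth" "eks" {
  name = data.aws_eks_cluster.eks.name
}

# Point the Kubernetes provider at the EKS cluster
provider "kubernetes" {
  host                   = data.aws_eks_cluster.eks.endpoint
  cluster_ca_certificate = base64decode(data.aws_eks_cluster.eks.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.eks.token
}
```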

# kubernetes.tf

#
# Create the Kubernetes service account which will assume the AWS IAM role
#
resource "kubernetes_service_account" "iam_role_test" {
  metadata {
    name      = local.k8s_service_account_name
    namespace = local.k8s_service_account_namespace
    annotations = {
      # This annotation is needed to tell the service account which IAM role it
      # should assume
      "eks.amazonaws.com/role-arn" = aws_iam_role.iam_role_test.arn
    }
  }
}

#
# Deploy Kubernetes Pod with the Service Account that can assume an AWS IAM role
#
resource "kubernetes_pod" "iam_role_test" {
  metadata {
    name      = "iam-role-test"
    namespace = local.k8s_service_account_namespace
  }

  spec {
    # Reference the resource (rather than the local) so that Terraform creates
    # the service account before the pod
    service_account_name = kubernetes_service_account.iam_role_test.metadata[0].name
    container {
      name  = "iam-role-test"
      image = "amazon/aws-cli:latest"
      # Sleep so that the container stays alive
      # #continuous-sleeping
      command = ["/bin/bash", "-c", "--"]
      args    = ["while true; do sleep 5; done;"]
    }
  }
}

Vanilla Kubernetes

And now the same as above with vanilla Kubernetes YAML.

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: iam-role-test
  namespace: default
  annotations:
    # TODO: replace ACCOUNT_ID with your account id
    eks.amazonaws.com/role-arn: arn:aws:iam::<ACCOUNT_ID>:role/iam-role-test
---
apiVersion: v1
kind: Pod
metadata:
  name: iam-role-test
  namespace: default
spec:
  serviceAccountName: iam-role-test
  containers:
    - name: iam-role-test
      image: amazon/aws-cli:latest
      # Sleep so that the container stays alive
      # #continuous-sleeping
      command: ["/bin/bash", "-c", "--"]
      args: ["while true; do sleep 5; done;"]

Verify the setup

We can describe the pod (i.e. kubectl describe pod iam-role-test) and check the volumes, mounts and environment variables attached to the pod, but seeing as we launched a pod with the AWS CLI, let's just get in there and check! Exec into the running container and execute the aws CLI:

# Exec into the running pod
kubectl exec -ti iam-role-test -- /bin/bash

# Check the AWS Security Token Service identity
bash-4.2# aws sts get-caller-identity
{
    "UserId": "AROA46FON4H773JH4MPJD:botocore-session-1637837863",
    "Account": "889424044543",
    "Arn": "arn:aws:sts::889424044543:assumed-role/iam-role-test/botocore-session-1637837863"
}

# Check the AWS environment variables
bash-4.2# env | grep "AWS_"
AWS_ROLE_ARN=arn:aws:iam::<ACCOUNT_ID>:role/iam-role-test
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
AWS_DEFAULT_REGION=eu-west-1
AWS_REGION=eu-west-1

As you can see, the AWS Security Token Service (STS) confirms that we have successfully assumed the role we wanted to! And if we check our environment variables we can see that these were injected when the pod started. The file that AWS_WEB_IDENTITY_TOKEN_FILE points to is the sensitive part, and it is mounted into the container when it runs.

Alternative Approaches

If we remove the service account from the pod and use the default service account (which exists per namespace), we can see who AWS STS thinks we are:

# Exec into the running pod
kubectl exec -ti iam-role-test -- /bin/bash

# Check the AWS Security Token Service identity
bash-4.2# aws sts get-caller-identity
{
    "UserId": "AROA46FON4H72Q3SPL6SC:i-0d0aff479cf2e2405",
    "Account": "889424044543",
    "Arn": "arn:aws:sts::<ACCOUNT_ID>:assumed-role/<cluster-name>XXXXXXXXX/i-<node-instance-id>"
}

# Check the AWS environment variables
bash-4.2# env | grep "AWS_"
# ... it's empty!

Having created the cluster with the EKS Terraform module, it has created a role for our autoscaling node group... and what we could do is allow this role to assume another role which grants us the access that we might need...
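A rough sketch of that alternative is below. The variables node_role_name (the node group role the module created) and target_role_arn (the role you want to reach) are hypothetical and only here for illustration. The target role's own trust policy would also need to allow the node role as a principal, which is not shown.

```hcl
# Hypothetical inputs, not part of the setup above
variable "node_role_name" {
  type        = string
  description = "Name of the IAM role attached to the EKS node group"
}

variable "target_role_arn" {
  type        = string
  description = "ARN of the IAM role the nodes should be allowed to assume"
}

# Permission for the node role to call sts:AssumeRole on the target role
data "aws_iam_policy_document" "node_assume_target" {
  statement {
    actions   = ["sts:AssumeRole"]
    resources = [var.target_role_arn]
  }
}

# Attach the permission to the node role as an inline policy
resource "aws_iam_role_policy" "node_assume_target" {
  name   = "assume-target-role"
  role   = var.node_role_name
  policy = data.aws_iam_policy_document.node_assume_target.json
}
```

Keep in mind that with this approach every pod on those nodes can use the node role, not just the one you had in mind.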

That said, I find managing Service Accounts in Kubernetes much easier than assigning roles to node groups, and based on searches online it seems to be the recommended approach. I still thought the alternative was worth mentioning here.

Next Steps

Now that you can run a Pod with a service account that can assume an IAM role, just give that IAM role the permissions it needs to do what you want.
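For example, if the pod needed to read from S3, you could attach an AWS managed policy to the role from iam.tf. S3 is purely our assumption for the sake of an example, and the resource name is made up; swap in whatever policy matches your use case.

```hcl
# Attach an AWS managed policy to the role assumed by the service account
resource "aws_iam_role_policy_attachment" "iam_role_test_s3_read" {
  role       = aws_iam_role.iam_role_test.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"
}
```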

In our case, we needed the role to access another EKS cluster and so the role does not need any more policies in AWS, but it needs to be added to the aws-auth ConfigMap that controls the RBAC of the target cluster. But let's not delve into that here, there's already a ton of posts around that :)

To add a cluster in ArgoCD you can either use the argocd CLI, or do it with Kubernetes Secrets. Of course we did it with Kubernetes Secrets, and of course we did that with Terraform after we create the cluster!

#
# Get the target cluster details to use in our secret
#
data "aws_eks_cluster" "target" {
  name = "<cluster_name>"
}

#
# Create a secret that represents a new cluster in ArgoCD.
#
# ArgoCD will use the provided config to connect and configure the target cluster
#
resource "kubernetes_secret" "cluster" {
  metadata {
    name      = "argocd-cluster-name"
    namespace = "argocd"
    labels = {
      # Tell ArgoCD that this secret defines a new cluster
      "argocd.argoproj.io/secret-type" = "cluster"
    }
  }

  data = {
    # Just a display name
    name   = data.aws_eks_cluster.target.id
    server = data.aws_eks_cluster.target.endpoint
    config = jsonencode({
      awsAuthConfig = {
        clusterName = data.aws_eks_cluster.target.id
        # NOTE: roleARN not needed as ArgoCD will already assume the role that
        # has access to the target cluster (added to aws-auth ConfigMap)
      }
      tlsClientConfig = {
        insecure = false
        caData   = data.aws_eks_cluster.target.certificate_authority[0].data
      }
    })
  }

  type = "Opaque"
}

And boom! We can now create EKS clusters with Terraform and register them with ArgoCD using our Service Account that can assume an AWS IAM Role that is added to the target cluster RBAC... Sometimes it's confusing just to write this stuff, but happy Terraform, Kubernetes and AWS'ing (and GitOps'ing with ArgoCD perhaps)!

  1. AWS EKS IAM Roles for Service Accounts (IRSA): https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
  2. Workshop on IAM Roles for Service Accounts (IRSA): https://www.eksworkshop.com/beginner/110_irsa/preparation/
  3. Create AWS IAM OIDC Provider for EKS: https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html
  4. Managing users or IAM roles in EKS: https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
  5. AWS EKS Terraform Module: https://registry.terraform.io/modules/terraform-aws-modules/eks/aws/latest
  6. ArgoCD: https://argo-cd.readthedocs.io/en/stable/
  7. Related GitHub issue on ArgoCD: https://github.com/argoproj/argo-cd/issues/2347

