I recently had a requirement to securely access a couple of Google Cloud APIs as a service account user, where those calls were being made from a Fargate task running on AWS. The until-relatively-recently way to do this was:
- Create a service account in the Google Cloud developer console
- Assign it whatever permissions it needs
- Create a ‘key’ for the account – in essence a long-lived private key used to authenticate as that service account
- Use that key in your Cloud SDK calls from your AWS Fargate instance
This isn’t ideal, because of that long-lived credential in the form of the ‘key’ – it can’t be scoped to require a particular originator and while you can revoke it from the developer console, if the credential leaks you’ve got an infinitely long-lived token usable from anywhere – you’d need to know it had leaked to prevent its use.
Google’s Workload Identity Federation is the new hotness in that regard, and is supported by almost all of the client libraries now. Not the .NET one though, irritatingly, which is why this post from Johannes Passing is, if you need to do this from .NET-land, absolutely the guide to go to.
The new approach is more in line with modern authentication standards and uses federation between AWS and Google Cloud to support generating short-lived, scoped credentials that are used for the actual work and no secrets needing to be shared between the two environments.
The docs are broadly excellent, but I was pleased at how clever the AWS <-> Google Cloud integration is given that there isn’t any AWS-supported explicit identity federation actually happening, in the sense of established protocols (like OIDC, which both clouds support in some fashion).
How it works
On the Google Cloud side, you set up a ‘Workload identity pool’ – in essence a collection of external identities that can be given some access to Google Cloud services. Aside from some basic metadata, a pool has one or more ‘providers’ associated with it. A provider represents an external source of identities, for our example here AWS.
A provider can be parameterised:
- Mappings translate between the incoming assertions from the provider and those of Google Cloud’s IAM system
- Conditions restrict the identities that can use the identity pool via a rich syntax
You can also attach Google service accounts to the pool, allowing those accounts to be impersonated by identities in the pool. You can restrict access to a given service account via conditions, in a very similar way to restricting access to the pool itself.
To get an access token on behalf of the service account, a few things are happening (in the background for most client libraries, and explicitly in the .NET case).
Authenticating with the pool
In AWS land, we authenticate with the Google pool by asking it to exchange a provider-issued token for one that Google’s STS will recognise. For AWS, the required token is (modulo some encoding and formatting) a signed ‘GetCallerIdentity’ request that you might yourself send to the AWS STS.
Our calling code in AWS-land doesn’t finish the call – we don’t need to. Instead, we sign a request and then pass that signed request to Google which makes the call itself. We include in the request (and the fields that are signed over) the URI of the ‘target resource’ on the Google side – the identity pool that we want to authenticate to.
The response from AWS to Google’s call to the STS will include the ARN of the identity for whom credentials on the AWS side are available. If you’re running in ECS or EC2, these will represent the IAM role of the executing task.
We need share nothing secret with Google to do this, and we can’t fake an identity on AWS that we don’t have access to.
- The ARN of the identity returned in the response to GetCallerIdentity includes the AWS account ID and the name of any assumed role – the only thing we could ship to Google is proof of an identity that we already have access to on the AWS side.
- The Google workflow identity pool identifier is signed over in the GetCallerIdentity request, so the token we send to Google can only be used for that specific user pool (and Google can verify that, again with no secrets involved). This means we can’t accidentally ship a token to the wrong pool on the Google side.
- The signature can be verified without access to any secret information by just making the request to the AWS STS. If the signature is valid, Google will receive an identity ARN, and if the payload has been tampered with or is otherwise invalid then the request will fail.
None of the above requires any cooperation between AWS and Google cloud, save for AWS not changing ARN formats and breaking identity pool conditions and mappings.
What happens next?
All being well, the Google STS returns to us a temporary access token that we can then use to generate a real, scoped access token to use with Google APIs. That token can be nice and short lived, restricting the window over which it can be abused should it be leaked.
What about for long-lived processes?
Our tokens can expire in a couple of directions:
- Our AWS credentials can and will expire and get rolled over automatically by AWS (when not using explicit access key IDs and just using the profile we’re assuming from the execution role of the environment)
- Our short-lived Google service account credential can expire
Both are fine and handled the same way – re-run the whole process. Signing a new GetCallerIdentity request is quick, trivial and happens locally on the source machine. And Google just has to make one API call to establish that we’re still who we said we were and offer up a temporary token to exchange for a service account identity.