Setup Azure Container Registry Caching

Introduction

If you’re building container images in ACR using a public base image from Docker Hub (docker.io), you might have run into Docker Hub’s pull rate limits. This happens because an ACR build task (for example, building your image with az acr build) doesn’t run on your machine or your CI runner; it runs inside Azure.

Docker Hub applies rate limits to unauthenticated pulls based on the source IP address, and ACR Tasks run on shared Azure build infrastructure that uses shared outbound IPs. So if lots of people in the same region are pulling images from Docker Hub, everyone is effectively sharing the same quota.

Even outside of builds, if your apps or clusters are pulling images straight from Docker Hub a lot, you can run into the same issue. Using ACR and enabling caching can help alleviate this problem.

General Gist

Broadly speaking, to enable ACR caching we will need to:

  • Obtain a set of credentials for the docker.io registry (a free account is fine for following along)
  • Store these credentials securely in a Key Vault
  • Add a caching rule to our ACR for the image we would like to cache that uses these credentials
  • Update our image references to point to our ACR rather than to Docker Hub

How to

If you’d like a Terraform / OpenTofu configuration that sets up an ACR with caching enabled, you can find that here. I’ll be referring to it throughout the rest of the post.

Creating Docker Hub Credentials

If you don’t already have one, create an account on docker.io here

Then:

  • Click your initials in the top right and then go to Account Settings in the dropdown menu
  • Go to Personal Access Tokens
  • Create a PAT with read access, giving it a descriptive name and an expiry date.
  • Make a note of your PAT (hopefully in your password manager :) ) as it will only be shown once.

Saving the PAT in your Key Vault

resource "azurerm_key_vault_secret" "docker_username" {
  name             = "docker-username"
  value_wo         = var.docker_username
  value_wo_version = 1
  key_vault_id     = azurerm_key_vault.kv.id
  expiration_date  = time_offset.secret_expiration.rfc3339

  depends_on = [azurerm_role_assignment.kv_admin]
}

resource "azurerm_key_vault_secret" "docker_password" {
  name             = "docker-password"
  value_wo         = var.docker_password
  value_wo_version = 1
  key_vault_id     = azurerm_key_vault.kv.id
  expiration_date  = time_offset.secret_expiration.rfc3339

  depends_on = [azurerm_role_assignment.kv_admin]
}

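Here we are saving our Docker Hub credentials in our Key Vault. As an aside, we are using the write-only value_wo argument for our secrets. This means that our creds are not stored in plain text (or at all!) in our state file. Because the value is never kept in state, Terraform can’t detect a change to it on its own; bump value_wo_version whenever you rotate the credentials so the new value is written.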
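The snippet above references a few things it doesn’t define: var.docker_username, var.docker_password, time_offset.secret_expiration and azurerm_role_assignment.kv_admin. Here is a minimal sketch of what those could look like. The 90-day offset, the Key Vault Administrator role and the use of the currently logged-in identity as the principal are my assumptions for illustration, and the linked configuration may define them differently. It also assumes the Key Vault uses Azure RBAC for authorization and that your Terraform / OpenTofu version supports ephemeral variables.

```hcl
# Assumed inputs: the Docker Hub username and the PAT created earlier.
variable "docker_username" {
  type        = string
  description = "Docker Hub username"
  sensitive   = true
  ephemeral   = true # keeps the value out of plan and state files
}

variable "docker_password" {
  type        = string
  description = "Docker Hub personal access token (PAT)"
  sensitive   = true
  ephemeral   = true
}

# Assumed expiry: 90 days after this resource is first created.
# Requires the hashicorp/time provider.
resource "time_offset" "secret_expiration" {
  offset_days = 90
}

# The identity running Terraform (assumed to be the one that writes the secrets).
data "azurerm_client_config" "current" {}

# Assumed role: lets the identity above create secrets in the vault.
resource "azurerm_role_assignment" "kv_admin" {
  scope                = azurerm_key_vault.kv.id
  role_definition_name = "Key Vault Administrator"
  principal_id         = data.azurerm_client_config.current.object_id
}
```

The depends_on in the secret resources then makes sure the role assignment exists before Terraform tries to write the secrets.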

Allow our ACR to connect to Docker Hub

resource "azurerm_container_registry_credential_set" "docker_credentials" {
  name                  = "docker-hub-credentials"
  container_registry_id = azurerm_container_registry.acr.id
  login_server          = "docker.io"

  identity {
    type = "SystemAssigned"
  }

  authentication_credentials {
    username_secret_id = azurerm_key_vault_secret.docker_username.versionless_id
    password_secret_id = azurerm_key_vault_secret.docker_password.versionless_id
  }
}

This one caught me off guard - the credential set needs its own System Assigned Managed Identity; it doesn’t use the one for the ACR itself. This means that we need to grant the identity used by the credential set permission to access the Key Vault:

resource "azurerm_role_assignment" "kv_secrets_user_docker_username" {
  scope                = azurerm_key_vault_secret.docker_username.resource_versionless_id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_container_registry_credential_set.docker_credentials.identity[0].principal_id
}

resource "azurerm_role_assignment" "kv_secrets_user_docker_password" {
  scope                = azurerm_key_vault_secret.docker_password.resource_versionless_id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_container_registry_credential_set.docker_credentials.identity[0].principal_id
}

Here we are granting the Key Vault Secrets User role, which allows the identity used by the credential set to read the username and password secrets. We scoped the role assignments to the individual secrets rather than the entire Key Vault.

Create a cache rule to store the image from Docker Hub in our ACR

resource "azurerm_container_registry_cache_rule" "nginx_cache" {
  name                  = "nginx-cache-rule"
  container_registry_id = azurerm_container_registry.acr.id
  target_repo           = "nginx"
  source_repo           = "docker.io/library/nginx"
  credential_set_id     = azurerm_container_registry_credential_set.docker_credentials.id
}

And here is where the magic happens! We are adding our cache rule to store our image from Docker Hub (in this case nginx) in our ACR so we can pull it from there next time.

The target_repo is what the repo will be called in your ACR, so to pull the nginx image from there, you would use the image: <YOUR_ACR_NAME>.azurecr.io/nginx
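If you want Terraform to hand you that image reference rather than typing it out, something like the output below should work. This is just a sketch: the output name nginx_cached_image is made up for this example, and it assumes the azurerm_container_registry.acr and cache rule resources are named as in the snippets above.

```hcl
# Builds "<YOUR_ACR_NAME>.azurecr.io/nginx" from the registry's login server
# and the cache rule's target_repo.
output "nginx_cached_image" {
  description = "Image reference for nginx served through the ACR cache rule"
  value       = "${azurerm_container_registry.acr.login_server}/${azurerm_container_registry_cache_rule.nginx_cache.target_repo}"
}
```

After an apply, terraform output nginx_cached_image prints the reference you can use in a Dockerfile FROM line or a deployment manifest.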

Testing

To do a quick test, we can try to pull this image. You’ll need Docker and the Azure CLI (az) installed to follow along here.

I’m assuming you have already logged in with az login.

Log in to your ACR:

az acr login -n <YOUR_ACR_NAME>

Pull your cached image:

docker pull <YOUR_ACR_NAME>.azurecr.io/nginx

The first pull will create the repository in your ACR and store the image. Subsequent pulls will use the image stored in ACR rather than the one on Docker Hub.

Updating the cached images

Your cached image is not automatically updated. So, if you pull the latest tag and it is cached, it will not be updated unless you delete and recreate the repository. If you are using versioned tags then that’s not a huge issue, but it is something you need to be aware of.
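One way to keep yourself honest about this is to pin the tag in your configuration and refuse latest outright. The sketch below is an illustration only: the variable name nginx_tag, its default value and the local nginx_image are made up for this example, and it assumes the ACR and cache rule resources are named as in the earlier snippets.

```hcl
variable "nginx_tag" {
  type        = string
  description = "Versioned nginx tag to pull through the cache"
  default     = "1.27.0" # hypothetical example tag, use whichever version you need

  # Reject the floating tag, since the cached copy would go stale.
  validation {
    condition     = var.nginx_tag != "latest"
    error_message = "Pin a versioned tag; a cached latest tag is not refreshed automatically."
  }
}

locals {
  # e.g. "<YOUR_ACR_NAME>.azurecr.io/nginx:1.27.0"
  nginx_image = "${azurerm_container_registry.acr.login_server}/${azurerm_container_registry_cache_rule.nginx_cache.target_repo}:${var.nginx_tag}"
}
```

Moving to a newer version then becomes a deliberate change to nginx_tag, and the first pull of the new tag caches it in your ACR.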