Fix Azure ACR Auth Failure Causing ImagePullBackOff in AKS
AKS pod stuck in ContainerCreating? ACR authentication mismatch is the usual suspect. Here's how to fix it fast.
Quick answer: Run az aks update -n <cluster-name> -g <resource-group> --attach-acr <acr-name> to re-attach your ACR to the cluster. That fixes the root cause 90% of the time.
If your Azure Kubernetes Service pod is stuck in ContainerCreating and its events show ImagePullBackOff, I know the frustration. This tripped me up the first time I migrated a production workload to AKS. The error means Kubernetes can't pull your container image from Azure Container Registry. The usual suspect? ACR authentication slipped: maybe you rotated registry keys, recreated the ACR, or the cluster's managed identity lost its permissions. Or your Kubernetes image pull secret went stale. Let's fix it step by step.
Why This Happens
- Your AKS cluster uses a managed identity or service principal to pull images from ACR. If that identity loses the AcrPull role assignment, pulling fails.
- You created a Kubernetes docker-registry secret manually, but the credentials expired or got corrupted.
- Your pod spec references an image tag that doesn't exist in ACR (e.g., typos or wrong tag).
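To check the first cause directly, here is a minimal sketch that asks Azure whether the cluster's kubelet identity holds AcrPull on the registry. It assumes a cluster that uses managed identity. The name has_acrpull is a hypothetical helper, and its three arguments are your resource group, cluster name, and registry name. It only counts assignments made directly on the registry scope, so a role inherited from the resource group or subscription would be missed.

```shell
# Hypothetical helper: succeeds if the kubelet managed identity of the cluster
# has an AcrPull role assignment scoped to the registry.
# Usage: has_acrpull <resource-group> <cluster-name> <acr-name>
has_acrpull() (
  rg=$1
  cluster=$2
  acr=$3

  # Object ID of the identity the kubelet uses to pull images.
  kubelet_id=$(az aks show -g "$rg" -n "$cluster" \
    --query identityProfile.kubeletidentity.objectId -o tsv) || exit 1
  # Empty on clusters that use a service principal instead.
  [ -n "$kubelet_id" ] || exit 1

  # Full resource ID of the registry, used as the role assignment scope.
  acr_id=$(az acr show --name "$acr" --query id -o tsv) || exit 1

  # Count AcrPull assignments for that identity on that scope.
  count=$(az role assignment list --assignee "$kubelet_id" --scope "$acr_id" \
    --role AcrPull --query "length(@)" -o tsv) || exit 1

  [ "${count:-0}" -gt 0 ]
)
```

Something like has_acrpull <resource-group> <cluster-name> <acr-name> && echo "AcrPull present" || echo "AcrPull missing" tells you whether re-attaching the ACR is likely to help.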
Step-by-Step Fix
- Check the pod event log
  Run:
      kubectl describe pod <pod-name> -n <namespace>
  Look for events like "Failed to pull image" with a 401 or 403 error. If you see "unauthorized: authentication required", you've got an ACR auth issue.
- Verify ACR exists and the image tag is correct
  Use:
      az acr repository show-tags --name <acr-name> --repository <repository> --output table
  Confirm the tag matches your pod spec. A typo like v1.0 instead of 1.0 will trigger ImagePullBackOff.
- Re-attach the ACR to your AKS cluster
  This is the fix I'd try first if you're sure the image tag is correct. Run:
      az aks update -n <cluster-name> -g <resource-group> --attach-acr <acr-name>
  This grants the AKS cluster's kubelet managed identity the AcrPull role. It's idempotent, so no harm running multiple times.
- Delete any stale pull secrets
  If you manually created a pull secret, delete it:
      kubectl delete secret <secret-name> -n <namespace>
  Then restart the deployment:
      kubectl rollout restart deployment <deployment-name> -n <namespace>
  AKS does not regenerate or inject a pull secret for you. Once the ACR is attached, the kubelet authenticates to the registry with the cluster's managed identity, so also remove the matching imagePullSecrets entry from your pod spec.
- Force a pod restart
  After re-attaching, delete the stuck pod:
      kubectl delete pod <pod-name> -n <namespace>
  The ReplicaSet will create a new one that should pull the image successfully.
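The first step, reading the event message to decide which fix applies, can be sketched as a small triage function. This is a rough illustration, not an exhaustive parser: classify_pull_error is a hypothetical name, and the "not found" and "manifest unknown" patterns are my assumption about how the container runtime words a missing tag. The match is coarse, so an image name that happens to contain 401 or 403 would be misread as an auth failure.

```shell
# Hypothetical helper: map a "Failed to pull image" event message
# to a likely cause. Auth patterns are checked first.
classify_pull_error() {
  case "$1" in
    *unauthorized*|*Unauthorized*|*401*|*403*)
      echo "auth" ;;            # re-attach the ACR or refresh the pull secret
    *"not found"*|*"manifest unknown"*)
      echo "missing-image" ;;   # check the repository and tag in ACR
    *)
      echo "unknown" ;;         # look at networking or firewall rules next
  esac
}
```

Paste the message from the Events section of kubectl describe pod as the argument, for example classify_pull_error "unauthorized: authentication required".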
Alternative Fixes (If the Main Fix Fails)
- Manually create an ACR pull secret using admin credentials
  If managed identity isn't your thing (or you're on an older AKS version), you can use the ACR admin account. Enable admin in the Azure portal under ACR > Access keys, then run (the admin username is the registry name):
      kubectl create secret docker-registry acr-secret \
        --docker-server=<acr-name>.azurecr.io \
        --docker-username=<acr-name> \
        --docker-password="$(az acr credential show --name <acr-name> --query "passwords[0].value" -o tsv)" \
        --namespace <namespace>
  Then reference this secret in your pod spec under imagePullSecrets.
- Check if your ACR is behind a firewall or private endpoint
  If your ACR has a firewall or uses private endpoints, the AKS cluster must be in the same (or a peered) VNet or have a service endpoint. To see the restrictions, run:
      az acr show --name <acr-name> --query networkRuleSet
  If needed, add the AKS node subnet to the ACR firewall allow list.
- Regenerate the service principal credentials
  If you're using a service principal instead of managed identity, its secret might have expired. Create a new secret for the service principal, then update the cluster with:
      az aks update-credentials -g <resource-group> -n <cluster-name> --reset-service-principal --service-principal <app-id> --client-secret <new-secret>
  Then re-attach the ACR.
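If you go the manual pull secret route, one way to reference the secret without hand-editing YAML is to patch the Deployment's pod template. The sketch below only builds the patch body. pull_secret_patch is a hypothetical helper name, and acr-secret is the secret name from the command above.

```shell
# Hypothetical helper: print a patch that adds an imagePullSecrets entry
# to a Deployment's pod template.
# Usage: pull_secret_patch <secret-name>
pull_secret_patch() {
  printf '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"%s"}]}}}}\n' "$1"
}

# Example (not run here):
#   kubectl patch deployment <deployment-name> -n <namespace> \
#     -p "$(pull_secret_patch acr-secret)"
```

Patching the pod template triggers a new rollout, so the replacement pods pick up the secret without a separate restart.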
Prevention Tips
- Always use managed identity for ACR access. It's simpler and auto-rotates credentials. Enable it when creating your AKS cluster with --enable-managed-identity.
- Set up AKS cluster auto-upgrade so you don't fall behind on patches that might fix auth issues.
- Use a CI/CD pipeline that tags images with commit hashes or semantic versions, not latest. That way you never accidentally overwrite a tag.
- Monitor pod events with Azure Monitor or a tool like kubectl events to catch ImagePullBackOff before it impacts users.
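For a quick check without a full monitoring setup, here is a minimal sketch that filters pod listings for image pull failures. filter_pull_failures is a hypothetical helper. It assumes the default column layout of kubectl get pods -A --no-headers, where the fourth column is the pod status.

```shell
# Hypothetical helper: read `kubectl get pods -A --no-headers` output on stdin
# and print "namespace/pod status" for pods whose image pull is failing.
filter_pull_failures() {
  # Column 4 is STATUS; the regex also matches init container states
  # such as Init:ImagePullBackOff.
  awk '$4 ~ /ImagePullBackOff|ErrImagePull/ { print $1 "/" $2, $4 }'
}

# Example (not run here):
#   kubectl get pods -A --no-headers | filter_pull_failures
```

Run it from a scheduled job or a CI step and alert when the output is non-empty.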
I've seen this error pop up after AKS node pool upgrades or when someone accidentally deleted the ACR. The --attach-acr flag is your best friend—it re-establishes the trust relationship between the cluster and the registry. If that doesn't work, check the firewall rules or rotate your secrets. You'll be back up in minutes.