DNS-01 troubleshooting

A symptom-driven walkthrough of the things that go wrong with DNS-01: propagation, restricted zones, stale records, NS-glue mismatches, and edge cases on registrar-provided DNS.

01 "Validating" forever

The job sits in validating for several minutes, then fails. Almost always one of:

  • TTL too high on a stale record. A previous _acme-challenge record with TTL 3600 is still cached at recursors. Delete it manually and wait out the TTL, or lower the default TTL on the zone.
  • Wrong nameserver delegation. The registrar's NS records and your DNS host's NS records disagree. Run dig +short NS example.com @8.8.8.8 and compare the output to your provider's UI.
  • NS-glue caching. Recently moved DNS hosts? Some recursors hold the old delegation for as long as its TTL allows, often 24h or more. If you can, check from a recursor that is unlikely to have cached the old NS, for example dig _acme-challenge.api.example.com TXT @1.1.1.1.
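
The delegation check above is easy to get wrong by eye, because dig prints trailing dots and provider UIs often do not. A minimal Python sketch of the comparison follows; the function names and the sample nameservers are illustrative and not part of CertAutoPilot.

```python
def normalize(names):
    """Lower-case each name and drop the trailing dot so sets compare cleanly."""
    return {n.strip().lower().rstrip(".") for n in names if n.strip()}


def compare_ns(resolved, hosted):
    """Return (only_in_dig, only_at_host); two empty lists mean they agree."""
    r, h = normalize(resolved), normalize(hosted)
    return sorted(r - h), sorted(h - r)


# Sample data only: paste the real lines from `dig +short NS example.com @8.8.8.8`
# and the nameservers shown in your DNS provider's UI.
resolved = ["ns1.old-host.example.", "ns2.old-host.example."]
hosted = ["NS1.new-host.example", "ns2.new-host.example"]

stale, missing = compare_ns(resolved, hosted)
print("returned by dig but not listed by the DNS host:", stale)
print("listed by the DNS host but not returned by dig:", missing)
```

If both lists come back empty, the delegation is consistent and a stale record or cached delegation is the more likely cause.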

02 "Insufficient permissions"

The credential cannot create or delete the TXT record. Each provider has different scopes; the DNS providers page lists the minimum required for each. Common mistakes:

  • Cloudflare token scoped to the wrong zone (Specific Zone set to a sibling).
  • Route 53 IAM policy missing route53:GetChange, so the worker can write the record but cannot poll for propagation.
  • Azure service principal scoped to the subscription instead of the zone's resource group.
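
For the Route 53 case, a policy document can be checked offline before it is attached. The sketch below is a rough approximation in Python: the REQUIRED list is an assumption based on what DNS-01 clients commonly need, so confirm it against the DNS providers page. It also ignores Deny statements, Resource scoping, and conditions.

```python
from fnmatch import fnmatchcase

# Assumed minimum for a Route 53 DNS-01 worker; confirm against the DNS providers page.
REQUIRED = [
    "route53:ChangeResourceRecordSets",
    "route53:GetChange",
    "route53:ListHostedZones",
]


def allowed_patterns(policy):
    """Collect Action patterns from every Allow statement in an IAM policy dict."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM permits a single statement object
        statements = [statements]
    patterns = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        patterns.extend([actions] if isinstance(actions, str) else actions)
    return patterns


def missing_actions(policy, required=REQUIRED):
    """Return the required actions that no Allow pattern covers."""
    # IAM action names are case-insensitive and may contain * and ? wildcards.
    patterns = [p.lower() for p in allowed_patterns(policy)]
    return [a for a in required if not any(fnmatchcase(a.lower(), p) for p in patterns)]


# Sample policy with a placeholder hosted zone ID.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "route53:ListHostedZones", "Resource": "*"},
        {"Effect": "Allow", "Action": ["route53:ChangeResourceRecordSets"],
         "Resource": "arn:aws:route53:::hostedzone/EXAMPLEZONEID"},
    ],
}
print(missing_actions(policy))
```

The sample policy reproduces the mistake described above: it can write records but lacks route53:GetChange.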

03 Locked-down production zones

Your security team won't give you write access to example.com. Use CNAME aliasing: have them create a one-time CNAME from _acme-challenge.<name>.example.com to a name in a zone you own (e.g. acme.example.org), and CertAutoPilot writes the challenge TXT record there instead.
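
The record to request from the zone owner can be generated rather than typed by hand. The Python sketch below derives the challenge name (a wildcard such as *.example.com is validated at _acme-challenge.example.com) and formats a zone-file style line. The target labels and the TTL of 300 are hypothetical choices, not a CertAutoPilot convention.

```python
def challenge_name(identifier):
    """Return the DNS-01 record name for a certificate identifier.

    A wildcard such as *.example.com is validated at _acme-challenge.example.com.
    """
    name = identifier.strip().lower().rstrip(".")
    if name.startswith("*."):
        name = name[2:]
    return f"_acme-challenge.{name}."


def cname_line(identifier, target, ttl=300):
    """Format a zone-file style CNAME line pointing the challenge name at target."""
    return f"{challenge_name(identifier)} {ttl} IN CNAME {target.rstrip('.')}."


# The target labels below are hypothetical; use whatever name you control
# in the zone you own.
print(cname_line("api.example.com", "api.acme.example.org"))
print(cname_line("*.example.com", "wild.acme.example.org"))
```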

04 CAA blocks issuance

The CA returns a caa error. Check existing CAA at the apex:

dig +short CAA example.com

If the apex has a CAA record listing only certain CAs (letsencrypt.org, digicert.com), other CAs are blocked. Either add the CA you're using or remove the restriction. CAA is inherited from the closest enclosing name that has a CAA record set, so a record on a subdomain overrides the apex for every name beneath it.
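
The closest-enclosing rule can be sketched in a few lines of Python. This is a simplified model, not a full CAA evaluator: it only considers issue properties, ignores issuewild, parameters, and CNAMEs, and the records dictionary is sample data standing in for real dig output.

```python
def relevant_caa(name, caa_records):
    """Walk from name toward the root and return the first CAA set found.

    caa_records maps a domain to the issuer domains in its `issue` properties.
    Returns (owner, issuers), or (None, None) when no ancestor has CAA.
    """
    labels = name.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in caa_records:
            return candidate, caa_records[candidate]
    return None, None


def ca_allowed(name, ca_domain, caa_records):
    """True when the closest CAA set lists ca_domain, or when no CAA set exists."""
    owner, issuers = relevant_caa(name, caa_records)
    if owner is None:
        return True  # no CAA anywhere up the tree: any CA may issue
    return ca_domain.lower() in {i.lower() for i in issuers}


# Sample data: the apex restricts issuance, and a subdomain overrides it.
records = {
    "example.com": ["letsencrypt.org", "digicert.com"],
    "api.example.com": ["pki.goog"],
}
print(ca_allowed("www.example.com", "letsencrypt.org", records))
print(ca_allowed("v2.api.example.com", "letsencrypt.org", records))
```

In the sample data, names under api.example.com are governed by that subdomain's record rather than the apex, which is why checking only the apex can be misleading.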

05 Public-suffix names

You cannot get a certificate for a name on the Public Suffix List itself (github.io, azurewebsites.net) unless you control its DNS, which by definition you usually don't. If you're seeing this on something custom, register the parent zone and try again.

06 DNSSEC mismatch

If your zone is DNSSEC-signed and the parent has stale DS records, validating recursors will reject responses. Symptoms: SERVFAIL from public resolvers, even though the records exist at the authoritative server. Fix the DS chain at the registrar.
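
One way to confirm this symptom is to run the same query twice against a validating resolver, once normally and once with dig's +cd flag, which asks the resolver to skip DNSSEC validation. The Python sketch below classifies the two outputs; the function names are illustrative, and the sample headers are shortened stand-ins for real dig output.

```python
import re


def dig_status(output):
    """Extract the response code (NOERROR, SERVFAIL, ...) from full dig output."""
    match = re.search(r"status:\s*([A-Z]+)", output)
    return match.group(1) if match else None


def looks_like_dnssec_failure(validated_output, cd_output):
    """SERVFAIL normally but NOERROR with +cd points at a broken DNSSEC chain."""
    return dig_status(validated_output) == "SERVFAIL" and dig_status(cd_output) == "NOERROR"


# Shortened sample headers; paste the full output of
#   dig _acme-challenge.api.example.com TXT @1.1.1.1
#   dig +cd _acme-challenge.api.example.com TXT @1.1.1.1
normal = ";; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 4242"
with_cd = ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4243"
print(looks_like_dnssec_failure(normal, with_cd))
```

If both queries fail, the problem is more likely at the authoritative servers than in the DS chain.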

07 Rate limits

Let's Encrypt enforces 50 certificates per registered domain per week and 5 duplicate certificates (the same set of names) per week. Hitting either returns a rateLimited error. Switch to the staging environment while iterating; on production, issue only what you need.
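
To see how close a domain is to these limits, the two counts can be estimated from your own issuance history. The Python sketch below assumes a plain sliding 7-day window and a history list you supply yourself; the CA's own accounting is authoritative and may differ in detail.

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=7)
PER_DOMAIN_LIMIT = 50   # certificates per registered domain per week
DUPLICATE_LIMIT = 5     # identical name sets per week


def remaining(issued, names, now):
    """Return (domain_left, duplicate_left) for a request covering `names`.

    `issued` is a list of (timestamp, frozenset_of_names) for one registered domain.
    """
    recent = [(ts, n) for ts, n in issued if now - ts < WINDOW]
    duplicates = sum(1 for _, n in recent if n == frozenset(names))
    return PER_DOMAIN_LIMIT - len(recent), DUPLICATE_LIMIT - duplicates


# Sample history: five recent duplicates, one older than the window, one other name.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
api = frozenset({"api.example.com"})
issued = [(now - timedelta(days=d), api) for d in (1, 2, 3, 4, 5)]
issued.append((now - timedelta(days=8), api))
issued.append((now - timedelta(days=1), frozenset({"www.example.com"})))

domain_left, duplicate_left = remaining(issued, ["api.example.com"], now)
print("certificates left for the domain:", domain_left)
print("duplicates left for this name set:", duplicate_left)
```

With the sample history, the duplicate allowance for api.example.com is exhausted while the per-domain allowance is barely touched, which is the usual shape of this failure during repeated test runs.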

08 Where to look

Every issuance job has structured logs visible in the UI: Jobs → click the job → Logs tab. Each step is a separate event with timing. The propagation step shows the resolver and the answer; run the same query with dig on the worker host to rule out network differences.
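
When comparing the logged answer with what the worker host sees, quoting in dig's output tends to get in the way. The Python sketch below parses the output of dig +short TXT and checks whether an expected value is present. The function names and sample values are illustrative, and nothing here assumes a particular CertAutoPilot log format.

```python
import shlex


def txt_values(dig_short_output):
    """Parse `dig +short TXT` output into one string per record.

    dig prints each record on a line as one or more quoted chunks; the chunks
    of a record are concatenated.
    """
    values = []
    for line in dig_short_output.splitlines():
        if line.strip():
            values.append("".join(shlex.split(line)))
    return values


def answer_present(expected, dig_short_output):
    """True when the expected challenge value is among the TXT records seen."""
    return expected in txt_values(dig_short_output)


# Sample output with placeholder values; paste the real output of
#   dig +short TXT _acme-challenge.api.example.com @<resolver from the log>
output = '"stale-value-from-an-earlier-run"\n"value-shown-in-the-job-log"\n'
print(txt_values(output))
print(answer_present("value-shown-in-the-job-log", output))
```

Seeing more than one value, as in the sample, usually points back to a stale record left over from an earlier attempt.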