Configuration

A single YAML config file (/etc/certautopilot/config.yaml on standalone, rendered from the Helm config: block on Kubernetes) holds all settings, and every value is overridable via a CERTAUTOPILOT_* environment variable. Sensitive values (KEK material, JWT secret, MongoDB password, HSM PIN, API-key pepper) live only in env, never on disk.

01 config.yaml: top-level keys

| Key | Purpose |
| --- | --- |
| server | HTTP listener, TLS, trusted proxies, instance name. |
| database | MongoDB connection (URI or host/port + credentials). |
| logging | zerolog level + format. |
| jwt | JWT signer, access / refresh TTLs, issuer. |
| encryption | KEK provider (env / pkcs11) + PKCS#11 module config (key material is env-only). |
| telemetry | OpenTelemetry tracing endpoint + Prometheus metrics toggle. |
| scheduler | Renewal sweep cadence, leader-lock TTL. |
| worker | Max concurrent discovery jobs. |
| api_key | API-key pepper + rate-limit tuning. |

Precedence, highest first: env overrides → config.yaml → defaults. Env wins.

02 server

server:
  host: 0.0.0.0        # bind address
  port: 8181           # listen port
  mode: release        # gin mode: release | debug
  instance_name: ""    # human-friendly identifier (Settings → Cluster)
  trusted_proxies:
    - 127.0.0.0/8
    - ::1/128
  tls:
    enabled: false     # backend-native TLS. Usually false; nginx/ingress terminates
    cert_file: ""
    key_file: ""

On standalone, the backend binds 127.0.0.1:18181 and nginx terminates TLS on 443. On Kubernetes, the chart binds 0.0.0.0:8181 behind a Service / Ingress.

instance_name drives the Settings → Cluster page and the leader-lock owner prefix. When it is empty, the fallback chain is CERTAUTOPILOT_INSTANCE_NAME → POD_NAME → HOSTNAME → os.Hostname(). Values are sanitised to [A-Za-z0-9._-] and truncated to 63 chars; if sanitising changes the value, a warning is logged at startup.
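The fallback chain and sanitising rule can be sketched in Python as follows. This is an illustration only, not the backend's Go code: the function names are hypothetical, and replacing invalid characters with "-" is an assumption (the backend may drop them instead).

```python
import os
import re
import socket

MAX_LEN = 63
_INVALID = re.compile(r"[^A-Za-z0-9._-]")


def sanitise(name):
    """Keep only [A-Za-z0-9._-] and cap the result at 63 characters.

    ASSUMPTION: invalid characters are replaced with '-'.
    """
    return _INVALID.sub("-", name)[:MAX_LEN]


def resolve_instance_name(configured, env=None):
    """Pick the first non-empty candidate, falling back to the OS hostname."""
    env = os.environ if env is None else env
    candidates = [
        configured,
        env.get("CERTAUTOPILOT_INSTANCE_NAME", ""),
        env.get("POD_NAME", ""),
        env.get("HOSTNAME", ""),
    ]
    raw = next((c for c in candidates if c), "") or socket.gethostname()
    return sanitise(raw)
```

For example, an empty instance_name with POD_NAME=cap-0 set resolves to cap-0, and a configured value of prod/eu is rewritten to prod-eu under the replacement assumption.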

trusted_proxies determines which connection sources can set X-Forwarded-For on behalf of the real client. Mis-configured trusted_proxies = spoofable client IPs = broken rate-limiting + audit log. Keep it tight — include only your nginx loopback or ingress controller pod CIDR.
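To see why a loose list is dangerous, here is a minimal Python model of X-Forwarded-For handling. It is a sketch of the general technique (trust the header only from trusted peers, then walk the hops right to left), not necessarily the exact algorithm the backend's HTTP framework uses; client_ip is a hypothetical helper.

```python
import ipaddress


def client_ip(remote_addr, x_forwarded_for, trusted_proxies):
    """Resolve the client IP, honouring X-Forwarded-For only from trusted peers."""
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def is_trusted(addr):
        return any(addr in net for net in networks)

    peer = ipaddress.ip_address(remote_addr)
    hops = [h.strip() for h in x_forwarded_for.split(",") if h.strip()]
    if not is_trusted(peer) or not hops:
        # Untrusted peer (or no header): the socket address is the client.
        return remote_addr
    # Walk right to left; the first hop that is not a trusted proxy is the client.
    for hop in reversed(hops):
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            return remote_addr  # malformed header: fall back to the socket peer
        if not is_trusted(addr):
            return hop
    return hops[0]
```

With trusted_proxies set to the loopback ranges, a header sent directly by an outside client is ignored. With something like 0.0.0.0/0, any caller can claim any source IP, which is exactly the spoofing case described above.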

03 database

database:
  uri: ""                # explicit URI overrides host/port/username/password
  host: 127.0.0.1
  port: 27017
  name: certautopilot
  username: certautopilot
  password: ""           # from env only — CERTAUTOPILOT_DATABASE_PASSWORD

If uri is set, it takes precedence and individual fields are ignored. Use uri for replica sets, TLS certificates, or DNS-SRV style mongodb+srv:// URLs.
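The precedence rule can be sketched like this. The mongo_uri helper is hypothetical, and the way the individual fields are assembled into a URI is an assumption about the backend; the percent-encoding of credentials is standard MongoDB URI practice.

```python
from urllib.parse import quote_plus


def mongo_uri(cfg):
    """Return the explicit uri if set, else build one from the individual fields."""
    if cfg.get("uri"):
        return cfg["uri"]  # explicit URI wins; other fields are ignored
    auth = ""
    if cfg.get("username"):
        auth = quote_plus(cfg["username"])
        if cfg.get("password"):
            # Percent-encode so characters like '@' or '/' cannot break the URI.
            auth += ":" + quote_plus(cfg["password"])
        auth += "@"
    return "mongodb://{}{}:{}/{}".format(auth, cfg["host"], cfg["port"], cfg["name"])
```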

MongoDB 6.0+ required

CertAutoPilot uses $expr / $switch inside update pipelines for atomic state transitions. MongoDB 6.0 or newer is required.

04 JWT & auth tuning

jwt:
  secret: ""                 # from env only — CERTAUTOPILOT_JWT_SECRET
  access_token_ttl: 15m
  refresh_token_ttl: 168h    # 7 days
  issuer: certautopilot
  audience: certautopilot-users

api_key:
  pepper: ""                 # from env only
  pepper_previous: ""        # rotation support
  rate_limit_max_failures: 5 # per-pod failure cap before 429

Don't push access_token_ttl above an hour; it defeats quick revocation. Long refresh_token_ttl is safe because of rotation + reuse-detection — see Auth & RBAC.

05 encryption

encryption:
  # provider: env | pkcs11 — locked at install time, cannot change at runtime
  provider: env
  current_version: 1
  # current_version_override pins THIS process to a specific version,
  # bypassing the keystore. Only for disaster recovery.
  # current_version_override: 1

  # Only relevant when provider == pkcs11:
  pkcs11:
    module_path: /usr/lib/softhsm/libsofthsm2.so
    token_label: certautopilot-prod
    pin_env: CERTAUTOPILOT_ENCRYPTION_PKCS11_PIN
    key_label_prefix: certautopilot-kek-v
    max_sessions: 0          # 0 = library default
    pool_wait_timeout: 0s    # 0 = wait forever
    use_gcm_iv_from_hsm: false   # true for AWS CloudHSM

Provider is install-locked

provider is written into a kek_install MongoDB document at install time and enforced on every startup. Switching between env and pkcs11 on an already-provisioned database is rejected.

Raw KEK material never lives in this file. It is supplied via per-version env vars (CERTAUTOPILOT_ENCRYPTION_ENV_KEK_V{N}, e.g. CERTAUTOPILOT_ENCRYPTION_ENV_KEK_V1). For PKCS#11, the HSM PIN is read from the env var named in pkcs11.pin_env.
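A pre-flight check for the env provider might look like the sketch below. It assumes the shape given in the secrets table (64 hex chars per version, V1 required); load_keks is a hypothetical helper, not part of the product.

```python
import re

_KEK_VAR = re.compile(r"CERTAUTOPILOT_ENCRYPTION_ENV_KEK_V(\d+)")
_HEX64 = re.compile(r"[0-9a-fA-F]{64}")


def load_keks(env):
    """Collect per-version KEKs from an env mapping and validate their shape."""
    keks = {}
    for name, value in env.items():
        match = _KEK_VAR.fullmatch(name)
        if not match:
            continue  # unrelated variable
        if not _HEX64.fullmatch(value):
            raise ValueError(name + ": expected 64 hex chars (32 bytes)")
        keks[int(match.group(1))] = bytes.fromhex(value)
    if 1 not in keks:
        raise ValueError("CERTAUTOPILOT_ENCRYPTION_ENV_KEK_V1 is required")
    return keks
```

Calling load_keks(dict(os.environ)) before starting the service would catch a truncated or mistyped key early.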

06 telemetry · scheduler · worker

telemetry:
  tracing:
    enabled: false
    endpoint: http://localhost:4318   # OTLP HTTP collector
    sample_rate: 0.1                  # 0.0 – 1.0
  metrics:
    enabled: true                     # exposes GET /metrics

scheduler:
  interval: 1h                 # how often the scheduler sweeps for work
  leader_lock_ttl: 90s         # distributed lock lifetime
  heartbeat_interval: 30s      # leader re-asserts the lock this often

worker:
  max_concurrent_discovery: 4  # cap on parallel discovery jobs per worker

Scheduler mode (serve --mode=scheduler or --mode=all) runs a MongoDB-backed distributed lock — only one replica is active at any time. See Observability for how the metrics + traces flow into Prometheus / OTLP.
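How leader_lock_ttl and heartbeat_interval interact can be modelled in memory. This is only a sketch of TTL-based leader election: the real lock is a MongoDB document updated atomically, and the LeaderLock class and its method are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderLock:
    ttl: float = 90.0              # mirrors scheduler.leader_lock_ttl (seconds)
    owner: Optional[str] = None
    expires_at: float = 0.0

    def try_acquire(self, candidate, now):
        """Acquire the lock if free or expired, or renew it if already held."""
        if self.owner in (None, candidate) or now >= self.expires_at:
            self.owner = candidate
            self.expires_at = now + self.ttl
            return True
        return False
```

With the defaults shown above (90s TTL, 30s heartbeat), a leader can miss two consecutive heartbeats before another replica is able to take over.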

07 Environment variable mapping

Every YAML key has an env equivalent. Naming rules:

  • Prefix: CERTAUTOPILOT_.
  • Separator: underscore; nested keys are joined with _.
  • All uppercase.
  • Env wins over file.

| YAML key | Environment variable |
| --- | --- |
| server.port | CERTAUTOPILOT_SERVER_PORT |
| database.uri | CERTAUTOPILOT_DATABASE_URI |
| logging.level | CERTAUTOPILOT_LOGGING_LEVEL |
| jwt.access_token_ttl | CERTAUTOPILOT_JWT_ACCESS_TOKEN_TTL |
| scheduler.interval | CERTAUTOPILOT_SCHEDULER_INTERVAL |
| encryption.pkcs11.module_path | CERTAUTOPILOT_ENCRYPTION_PKCS11_MODULE_PATH |
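The naming rules are mechanical, so they can be written down as a small Python sketch. env_name and lookup are hypothetical helpers that mirror the rules above; they are not the backend's actual config loader.

```python
def env_name(yaml_key):
    """Map a dotted YAML key to its CERTAUTOPILOT_* environment variable."""
    return "CERTAUTOPILOT_" + yaml_key.replace(".", "_").upper()


def lookup(yaml_key, file_cfg, env, default=None):
    """Resolve a setting: env first, then the parsed config file, then the default."""
    name = env_name(yaml_key)
    if name in env:
        return env[name]  # env wins over file
    node = file_cfg
    for part in yaml_key.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node
```

Note that values read from the environment arrive as strings, so a real loader would still need to parse them into durations, integers, and so on.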

08 Secrets: env-only, never on disk

These must never land in config.yaml, Helm values, or shell history

The values below are loaded by systemd via EnvironmentFile= on standalone, or from a Kubernetes Secret on Helm. Anything pasted into a terminal with history enabled should be rotated.

| Variable | Purpose | Shape |
| --- | --- | --- |
| CERTAUTOPILOT_JWT_SECRET | HMAC key for access + refresh JWTs. | ≥ 32 bytes entropy. openssl rand -base64 48. |
| CERTAUTOPILOT_DATABASE_PASSWORD | MongoDB app-user password (when uri is not used). | Any string. |
| CERTAUTOPILOT_DATABASE_URI | Full MongoDB connection URI (carries credentials). | mongodb://user:pass@host/db or mongodb+srv://… |
| CERTAUTOPILOT_API_KEY_PEPPER | Pepper added to SHA-256 of every API key on hash / compare. | 64 hex chars (32 bytes). openssl rand -hex 32. |
| CERTAUTOPILOT_API_KEY_PEPPER_PREVIOUS | Old pepper during rotation; verification only. | Same shape; drop after all keys reissued. |
| CERTAUTOPILOT_ENCRYPTION_ENV_KEK_V{N} | Raw KEK material, one per version (env provider). | 64 hex chars = 32 bytes. V1 required; add V2 before rotation. |
| CERTAUTOPILOT_ENCRYPTION_CURRENT_VERSION | Fresh-install seed for the first KEK version. | Integer. Only consulted when kek_versions is empty (first install). |
| CERTAUTOPILOT_ENCRYPTION_PKCS11_PIN | HSM user PIN (only when provider = pkcs11). | String. CloudHSM uses user:password; other vendors just the PIN. |
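The role of the previous pepper during rotation can be illustrated with a short sketch. How the pepper is combined with the key is not specified here; HMAC-SHA256 keyed with the pepper is an assumption made for the example, and both function names are hypothetical.

```python
import hashlib
import hmac


def hash_api_key(api_key, pepper_hex):
    """Peppered hash of an API key.

    ASSUMPTION: HMAC-SHA256 keyed with the pepper; the backend's exact
    construction may differ.
    """
    pepper = bytes.fromhex(pepper_hex)
    return hmac.new(pepper, api_key.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_api_key(api_key, stored_hash, pepper, pepper_previous=""):
    """Check against the current pepper first, then the previous one if set."""
    for candidate in (pepper, pepper_previous):
        if candidate and hmac.compare_digest(hash_api_key(api_key, candidate), stored_hash):
            return True
    return False
```

Keys hashed under the old pepper keep verifying while CERTAUTOPILOT_API_KEY_PEPPER_PREVIOUS is set, and stop verifying once it is dropped, which is why it should stay in place until all keys are reissued.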

09 TLS: standalone (nginx)

The standalone installer provisions nginx as the TLS terminator. The backend binds 127.0.0.1:18181 plain HTTP; only nginx talks to it.

curl -fsSL https://raw.githubusercontent.com/CloudNativeWorks/certautopilot-archive/main/get.sh \
  | sudo bash -s -- --version=1.4.0 --mongo=local \
    --tls=self-signed \
    --bind-host=0.0.0.0 \
    --port=443 \
    --extra-hostnames=cap.example.com,cap-admin.example.com

  • --tls=self-signed: the installer generates a 10-year cert with SANs for localhost, 127.0.0.1, ::1, --bind-host, and every --extra-hostnames entry.
  • --tls=provided --cert=<path> --key=<path>: use your own cert. Rerun the bootstrap to swap TLS material; other state is preserved.
  • nginx config lands at /etc/nginx/conf.d/certautopilot.conf; TLS material at /etc/certautopilot/tls/.

10 TLS: Kubernetes (ingress)

The Helm chart exposes a NodePort Service by default. Front it with whatever ingress you use.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: certautopilot
  namespace: cap
  annotations:
    nginx.ingress.kubernetes.io/proxy-body-size: "16m"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "120"
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx
  tls:
    - hosts: [ "cap.example.com" ]
      secretName: cap-tls
  rules:
    - host: cap.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: certautopilot
                port: { number: 8181 }

Add the cluster's pod CIDR to server.trusted_proxies so the backend trusts the ingress controller's X-Forwarded-For:

# values.yaml
config:
  server:
    trusted_proxies:
      - 127.0.0.0/8
      - ::1/128
      - 10.244.0.0/16    # cluster pod CIDR

Bootstrapping the first cert

Chicken-and-egg: CertAutoPilot issues certs, but it itself needs one to accept browser logins. Two patterns:

  • External issuance for CertAutoPilot only. Issue the CertAutoPilot-fronting cert via cert-manager + Let's Encrypt (K8s) or one-off certbot (standalone). Once CertAutoPilot is up, have it issue the replacement on the next rotation.
  • Self-signed forever. Use --tls=self-signed and distribute the root via your MDM / config management. Common in air-gapped deployments.

11 Health endpoints

  • /healthz — liveness. Returns 200 once the HTTP server is up.
  • /readyz — readiness. Returns 200 once DB + registry are ready; 503 during startup or if the DB becomes unreachable.
  • Both are unauthenticated. Don't expose them to the public internet through the ingress — use a server.trusted_proxies-restricted ACL or a basic-auth wrapper.

12 License

CertAutoPilot uses Ed25519-signed licenses verified offline against a public key baked into the binary at build time (internal/license.PublicKey). Licenses can also be activated online against the CNW License API (https://license-api.cloudnativeworks.com); both modes coexist.

Plans & cert limits

| Plan | Cert limit | Enterprise features |
| --- | --- | --- |
| free | 1 | none |
| starter | 5 | none |
| advance | 25 | none |
| enterprise | unlimited (0) | ldap, otp_policy, syslog |

Plans are defined in internal/license/features.go. The Enterprise plan is the only tier that enables LDAP / Active Directory federation, the OTP enforcement policy, and Syslog forwarding — those three are gated; everything else (ACME, MSCA, discovery, distribution modules, KEK rotation, approvals, PKCS#11 HSM) ships in every plan.

Activate: Settings → License (org owner role) → paste your license key. The backend verifies signature + expiry, registers fingerprint + activation ID with the License API (when an API key is configured), and caches the validated state in MongoDB. Air-gap deployments use the offline path — same key, no network round-trip.

Enforcement

  • Cert-count limit: issuance blocks once active certs reach the plan's cap. Renewals still run on existing certs even at the cap — existing infrastructure is protected.
  • Feature gates: ldap false → LDAP page disabled (login still works for non-LDAP users); otp_policy false → OTP policy card hidden in Settings → Users; syslog false → Syslog forwarding page disabled. The presence-or-absence of these flags is the entire enterprise-feature surface today.
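The cert-count rule reduces to a small predicate. The sketch below uses the limits from the plan table, with 0 meaning unlimited; the function names are hypothetical and the real check lives in the backend.

```python
# Cert limits per plan, as listed in the plan table (0 = unlimited).
PLAN_CERT_LIMITS = {"free": 1, "starter": 5, "advance": 25, "enterprise": 0}


def may_issue(plan, active_certs):
    """New issuance is blocked once active certs reach the plan's cap."""
    limit = PLAN_CERT_LIMITS[plan]
    return limit == 0 or active_certs < limit


def may_renew(plan, active_certs):
    """Renewals of existing certs are never blocked by the cap."""
    return True
```

On the starter plan, the fifth certificate can still be issued (4 active), the sixth cannot, and renewals of the existing five continue either way.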

Expiry & grace

When the license exp passes, the backend enters a 7-day grace window (license.GracePeriodDuration = 7 * 24h). During grace, enterprise features keep working and the API responds normally; the UI surfaces an expired-banner. After grace, enterprise feature gates close and an admin must upload a renewed license. Cert renewals always continue regardless of license state — production infra is never broken by an expired license.

Renewals NEVER stop during grace

Letting prod infrastructure break because a license date rolled past the weekend is the opposite of what this product is for. Existing cert lifecycle runs; only enterprise features tighten after the grace window.
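The expiry and grace behaviour can be expressed as a small state function. This is a sketch built from the description above (7-day grace window, features stay on during grace); the state names and function names are illustrative, not the backend's.

```python
from datetime import datetime, timedelta, timezone

GRACE = timedelta(days=7)  # mirrors license.GracePeriodDuration


def license_state(expires_at, now):
    """Return (state, grace_remaining_seconds) for a license."""
    if now <= expires_at:
        return ("valid", 0)
    remaining = expires_at + GRACE - now
    if remaining > timedelta(0):
        return ("grace", int(remaining.total_seconds()))
    return ("expired", 0)


def enterprise_features_enabled(state):
    """Enterprise gates stay open while valid or in grace."""
    return state in ("valid", "grace")


def renewals_enabled(state):
    """Cert renewals continue regardless of license state."""
    return True
```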

License status endpoint

GET /api/v1/license/status returns the cached license view (LicenseStatusView in internal/service/license_service.go):

{
  "valid": true,
  "plan": "enterprise",
  "plan_name": "Enterprise",
  "cert_limit": 0,
  "features": { "ldap": true, "otp_policy": true, "syslog": true },
  "expires_at": "2027-04-20T00:00:00Z",
  "in_grace_period": false,
  "grace_remaining_seconds": 0,
  "license_key": "...",
  "fingerprint": "...",
  "activation_id": "...",
  "activated_at": "2026-04-20T09:00:00Z",
  "last_checked_at": "2026-04-28T12:00:00Z",
  "api_key_configured": true,
  "license_mode": "online"
}
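A monitoring job could turn this response into alerts. The sketch below only reads fields shown in the example response; license_warnings is a hypothetical helper, and fetching the JSON (authentication, base URL) is left out.

```python
import json


def license_warnings(status_json):
    """Return human-readable warnings derived from a license status payload."""
    status = json.loads(status_json)
    warnings = []
    if not status.get("valid", False):
        warnings.append("license is not valid")
    if status.get("in_grace_period", False):
        hours = status.get("grace_remaining_seconds", 0) // 3600
        warnings.append("license in grace period, {}h remaining".format(hours))
    return warnings
```

An empty list means nothing needs attention; any entry is worth routing to whoever owns license renewal.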

13 Troubleshooting

"License signature invalid" or "expired" right after upload

Either a corrupted paste (stray whitespace / line breaks) or system-clock drift — the backend refuses tokens > 5 minutes in the past. Check NTP on the host.

"License requires feature X, not supported by this build"

You upgraded across a major feature boundary. Get a license issued against the new build's public key.

Session cookie rejected or login does not persist

Almost always a SameSite mismatch or a non-HTTPS dev setup. Secure cookies are refused over HTTP; behind a reverse proxy, set server.trusted_proxies so the backend correctly identifies the origin.

Rate-limit bypassed / audit log shows wrong IP

trusted_proxies too permissive. Restrict to the exact loopback / pod CIDR your fronting proxy uses.