ci: keyless K8s creds via GitHub OIDC->Vault (retire KUBECONFIG_DEV/PROD) #76

Merged: mattmattox merged 1 commit into main from ci/keyless-scoped-credentials on Jul 2, 2026

@mattmattox

Summary

Converts the Kubernetes deploy path of this pipeline to keyless credentials (GitHub OIDC -> HashiCorp Vault), retiring the two org admin secrets KUBECONFIG_DEV and KUBECONFIG_PROD. No long-lived kubeconfig is stored in GitHub after this.

The Vault + RBAC backend is already provisioned and verified:

  • jwt auth mount github-actions, role gha-website (audience https://github.com/SupportTools)
  • Kubernetes creds path kubernetes-onprem/creds/website mints a 1h token for SA website-ci-deployer, scoped to get/patch only the six ArgoCD Applications supporttools-{mst,dev,qas,tst,stg,prd} in the argocd namespace.
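One way to spot-check that scope from a runner is kubectl auth can-i. The sketch below is an assumption about how such a check could look and is not part of this PR: check_scope is a hypothetical helper, and it assumes a kubeconfig file named kubeconfig built from the minted token.

```shell
# check_scope: hypothetical helper, not from the PR.
# Asks the API server whether the current token may get and patch each of the
# six Applications in the argocd namespace. Stops at the first denied verb.
check_scope() {
  for app_env in mst dev qas tst stg prd; do
    for verb in get patch; do
      kubectl --kubeconfig kubeconfig -n argocd auth can-i "${verb}" \
        "applications.argoproj.io/supporttools-${app_env}" >/dev/null || {
        echo "token cannot ${verb} supporttools-${app_env}" >&2
        return 1
      }
    done
  done
}
```

kubectl auth can-i exits non-zero when the answer is "no", which is what lets the helper fail fast.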

What changed

Applied identically to both K8s deploy jobs (Deploy-NonProd [matrix mst/dev/qas/tst] and Deploy-Prod [matrix stg/prd]):

  • Added permissions: { contents: read, id-token: write } (mint the GitHub OIDC token).
  • Replaced the Setup Kubeconfig step that base64-decoded secrets.KUBECONFIG_DEV / secrets.KUBECONFIG_PROD with two steps:
    1. hashicorp/vault-action@v3 JWT login (path: github-actions, role: gha-website, jwtGithubAudience: https://github.com/SupportTools, exportToken: true).
    2. A run: step that mints the short-lived, namespace-scoped Kubernetes token from kubernetes-onprem/creds/website, fails if the token is empty or null, masks it in the logs, and builds the kubeconfig file targeting https://kubernetes.default.svc:443 with the in-cluster SA CA.
  • Removed all secrets.KUBECONFIG_DEV / secrets.KUBECONFIG_PROD references. The same keyless token serves every env, because all environments run on the single a1-ops-prd cluster (there is no separate dev cluster).
  • Removed the Deploy ArgoCD Project step (kubectl apply -f argocd/project.yaml) from both jobs and left a one-line comment: the AppProject is cluster-services-managed, and the scoped token cannot apply AppProjects. All six Applications already exist, so only the patch branch of the deploy block fires.
  • Downstream deploy + health-poll steps are unchanged — they keep using kubectl --kubeconfig kubeconfig, which now points at the runtime-minted kubeconfig.
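The mint step described above can be sketched roughly as follows. The curl request mirrors the snippet quoted in the review; the function names, the kubeconfig layout, and the CA path (the standard in-cluster service account mount) are assumptions for illustration, not the PR's actual script.

```shell
# Request a short-lived token from the Vault Kubernetes secrets engine.
# Assumes VAULT_ADDR and VAULT_TOKEN were exported by the vault-action step.
mint_token() {
  curl -sf -H "X-Vault-Token: ${VAULT_TOKEN}" \
    -X PUT -d '{"kubernetes_namespace":"arc-runners-supporttools"}' \
    "${VAULT_ADDR}/v1/kubernetes-onprem/creds/website" \
    | jq -r '.data.service_account_token'
}

# Reject an empty token or the literal "null" that jq -r prints for a missing field.
require_token() {
  if [ -z "$1" ] || [ "$1" = "null" ]; then
    echo "Vault did not return a service account token" >&2
    return 1
  fi
}

# Write a kubeconfig that targets the in-cluster API server.
# $1 = token, $2 = output path, $3 = CA bundle path
write_kubeconfig() {
  cat > "$2" <<EOF
apiVersion: v1
kind: Config
clusters:
- name: in-cluster
  cluster:
    server: https://kubernetes.default.svc:443
    certificate-authority: $3
users:
- name: website-ci-deployer
  user:
    token: $1
contexts:
- name: default
  context:
    cluster: in-cluster
    user: website-ci-deployer
current-context: default
EOF
  chmod 600 "$2"
}

# How a workflow step might wire these together (assumed, not from the PR):
#   KUBE_TOKEN=$(mint_token)
#   require_token "${KUBE_TOKEN}"
#   echo "::add-mask::${KUBE_TOKEN}"
#   write_kubeconfig "${KUBE_TOKEN}" kubeconfig /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
```

Keeping the guard separate from the request means a failed curl (which leaves the token empty) and a response without the field (which yields "null") are both caught before anything is written to disk.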

runs-on

No change — both deploy jobs were already self-hosted-linux (in-cluster ARC runner in arc-runners-supporttools), which is required for the keyless kubeconfig (in-cluster apiserver + SA CA) to work.

Not touched

  • The image build/push (Build job, DockerHub supporttools/website, DOCKER_USERNAME/DOCKER_PASSWORD), Helm packaging, and BOT_TOKEN helm-chart push are all left exactly as-is.
  • on: triggers unchanged (workflow_dispatch, push on main, nightly schedule).

Note

This is a draft PR. After merge, the org admin secrets KUBECONFIG_DEV and KUBECONFIG_PROD can be deleted.

@mattmattox mattmattox deployed to development July 2, 2026 20:01 — with GitHub Actions Active

# Keyless: GitHub OIDC -> Vault jwt auth. exportToken makes VAULT_TOKEN available to the mint step.
- name: Vault login (GitHub OIDC)
  uses: hashicorp/vault-action@v3

@mattmattox mattmattox marked this pull request as ready for review July 2, 2026 21:06
@mattmattox mattmattox merged commit 51321cc into main Jul 2, 2026
8 of 11 checks passed
@mattmattox mattmattox deleted the ci/keyless-scoped-credentials branch July 2, 2026 21:07

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 8466102bed

Comment on lines +153 to +156
KUBE_TOKEN=$(curl -sf -H "X-Vault-Token: ${VAULT_TOKEN}" \
  -X PUT -d '{"kubernetes_namespace":"arc-runners-supporttools"}' \
  "${VAULT_ADDR}/v1/kubernetes-onprem/creds/website" \
  | jq -r '.data.service_account_token')

P2: Grant create before keeping the bootstrap branch

This kubeconfig is now sourced from the Vault-issued least-privilege token, but the deploy script still treats a missing supporttools-${ENVIRONMENT} Application as recoverable by piping argocd/${ENVIRONMENT}.yaml into kubectl apply below. The new role is scoped for the existing get/patch path, so if one of the six Applications is deleted or a cluster is rebuilt, this branch now fails with an RBAC error instead of recreating it as the old kubeconfig did; the same pattern exists in Deploy-Prod. Either give the token create on those Applications or replace the branch with a clear precondition failure.
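The second option, a clear precondition failure, could look roughly like the sketch below. It is an assumption about the shape of the deploy block, not code from this PR: deploy_application is a hypothetical helper and the patch payload is whatever the existing script already sends.

```shell
# deploy_application: hypothetical replacement for the apply-or-patch branch.
# $1 = environment (for example dev), $2 = JSON merge patch body
deploy_application() {
  app="supporttools-$1"
  # A failed get usually means the Application is missing, but it can also be
  # an RBAC or connectivity problem; either way the scoped token cannot fix it.
  if ! kubectl --kubeconfig kubeconfig -n argocd \
      get applications.argoproj.io "${app}" >/dev/null 2>&1; then
    echo "Application ${app} is not readable in namespace argocd." >&2
    echo "The scoped CI token cannot create it; restore it out of band and re-run." >&2
    return 1
  fi
  kubectl --kubeconfig kubeconfig -n argocd \
    patch applications.argoproj.io "${app}" --type merge -p "$2"
}
```

Failing with an explicit message keeps the token at get/patch only; granting create instead would restore the old bootstrap behavior at the cost of a wider role.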
