tdz-cloud/argo-bootstrap

Argo Bootstrap

This repository bootstraps a multi-cluster Kubernetes environment with Argo CD, using a metal cluster that hosts vClusters for operations and workloads.

  • Secure Application Publishing: Utilizes Cloudflare Tunnel (provisioned via OpenTofu) to safely expose services and manage DNS for private clusters.

  • Automated Multi-Cluster Bootstrapping: Uses Argo CD ApplicationSet templates to provision multiple clusters and workloads from a single source of truth.

  • Centralized Secret Management: Integrates HashiCorp Vault and ExternalSecrets to securely manage and distribute secrets across multiple clusters.

  • Multi-Tenant Virtualization: Deploys vcluster alongside OpenTofu to provide isolated virtual clusters for multi-tenant workloads.

  • Secure Private Networking: Embeds ZeroTier and Tailscale support to establish secure remote access to private services when required.

Architecture

flowchart LR
  subgraph Operator[Operator workstation]
    Tools["vcluster + kubectl + helm + OpenTofu"]
  end

  Tools --> CloudInit["Cloud init (OpenTofu)"]
  CloudInit --> Cloudflare["Cloudflare DNS + Tunnel"]
  CloudInit --> RepoAccess["GitHub tokens/OAuth"]
  CloudInit --> Metal["Metal Kubernetes cluster"]

  Metal --> Argo["Argo CD"]
  Argo -->|ApplicationSets| MetalApps["Metal apps"]
  Argo -->|ApplicationSets| OpsApps["Operations apps"]
  Argo -->|ApplicationSets| WorkApps["Workloads apps"]

  Metal --> OpsV["vCluster: operations"]
  Metal --> WorkV["vCluster: workloads"]

  Vault["Vault"] -->|ExternalSecrets| OpsApps
  Vault -->|ExternalSecrets| WorkApps

  VPN["ZeroTier/Tailscale"] --> Metal
  Cloudflare --> Metal

Project layout

  • root/: bootstrap charts, configs, secrets, cloud-init, and vault-init
  • metal/: base cluster infrastructure (ingress, security, observability, kube-infra)
  • operations/: operations vCluster workloads (ingress, security, observability)
  • workloads/: application workloads (media, idp, services, observability)

Prerequisites

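Tools (on the operator workstation, as shown in the architecture diagram):

  • vcluster CLI
  • kubectl
  • helm
  • OpenTofu (tofu)
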
Accounts and access:

  • Cloudflare account with API key and zone/account ID
  • GitHub PAT and OAuth app credentials for Argo CD SSO
  • Vault token with permissions to configure Kubernetes auth
  • ZeroTier or Tailscale (optional, for private routes)
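
A quick preflight can confirm that the command-line tools used in the steps below are installed. The sketch below is one possible check and is not part of this repository; the function name missing_tools is hypothetical, and the tool list you pass to it is an assumption based on the commands this README runs (vcluster, kubectl, helm, tofu).

```shell
# Print each requested tool that is not found on PATH.
# Returns success only when nothing is missing.
missing_tools() {
  missing_count=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      missing_count=$((missing_count + 1))
    fi
  done
  [ "$missing_count" -eq 0 ]
}
```

For example, missing_tools vcluster kubectl helm tofu prints nothing and returns success when all four are available, and prints one name per line otherwise.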

Steps

0. Configure inputs

  1. Create OpenTofu inputs for cloud-init:

    cp root/cloud-init/terraform.tfvars.example root/cloud-init/terraform.tfvars

    Update values in root/cloud-init/terraform.tfvars, including:

    • Cloudflare zone, account ID, email, and API key
    • GitHub token and OAuth client secret
    • github_repo_url/github_username if you are using a fork
    • kube_context and private_ip
    • ZeroTier/Tailscale toggles and tokens if enabled

  2. Update Argo CD settings in root/argocd/values.yaml (domain, GitHub org, OAuth client ID).

  3. Update repository settings in root/bootstrap/values.yaml (repo URL and target revision).

  4. Prepare Vault inputs for root/vault-init:

    • Export TF_VAR_vault_token and optionally TF_VAR_vault_address, or
    • Create root/vault-init/terraform.tfvars with vault_token (and vault_address if needed).
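
OpenTofu reads an environment variable named TF_VAR_<name> as the value of the input variable <name>, which is why exporting TF_VAR_vault_token works without a tfvars file. The helper below is a minimal sketch of the environment-variable option; the function name export_vault_inputs is hypothetical, and the token and address values in the usage note are placeholders.

```shell
# Export Vault inputs as TF_VAR_* environment variables for OpenTofu.
# $1: Vault token (required). $2: Vault address (optional).
export_vault_inputs() {
  if [ -z "${1:-}" ]; then
    echo "usage: export_vault_inputs <vault_token> [vault_address]" >&2
    return 1
  fi
  export TF_VAR_vault_token="$1"
  # Only set the address when one is given, so the module default still applies.
  if [ -n "${2:-}" ]; then
    export TF_VAR_vault_address="$2"
  fi
}
```

A call such as export_vault_inputs "<token>" "https://vault.example.test" sets both variables for the current shell session. Passing the token on the command line can leave it in shell history, so reading it from a prompt or a secret store may be preferable.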

1. Cloud init

Provision Cloudflare DNS, Cloudflare Tunnel, Argo CD OAuth secret, and the repo token.

cd root/cloud-init
tofu init
tofu apply
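
If you prefer to review changes before they reach Cloudflare and GitHub, one option is to save a plan and apply exactly that plan. This is a sketch of that variation, not the repository's documented flow; the function name tofu_plan_and_apply and the plan file name tfplan are hypothetical.

```shell
# Run init, save a plan to a file, then apply that saved plan.
# $1: directory containing the OpenTofu configuration.
tofu_plan_and_apply() {
  (
    # The subshell keeps the caller's working directory unchanged.
    cd "$1" || exit 1
    tofu init &&
      tofu plan -out=tfplan &&
      tofu apply tfplan
  )
}
```

For this step the call would be tofu_plan_and_apply root/cloud-init from the repository root. A saved plan file can contain sensitive values, so delete it after applying and keep it out of version control.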

2. Install Argo CD

cd root/argocd
./apply.sh

3. Bootstrap Argo CD applications

This creates ApplicationSets for root/, metal/, operations/, and workloads/.

cd root/bootstrap
helm template \
    --namespace argocd \
    --values ./values.yaml \
    bootstrap . \
    | kubectl apply -n argocd -f -
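
To confirm that the bootstrap chart produced ApplicationSets, you can list them in the argocd namespace. The functions below are a minimal sketch; their names are hypothetical, and they assume the current kubectl context points at the metal cluster where Argo CD was installed.

```shell
# List ApplicationSets in the argocd namespace, one resource name per line.
list_applicationsets() {
  kubectl get applicationsets.argoproj.io -n argocd -o name
}

# Succeed only if at least one ApplicationSet exists.
applicationsets_present() {
  [ -n "$(list_applicationsets)" ]
}
```

Running list_applicationsets after the apply should show the ApplicationSets created for root/, metal/, operations/, and workloads/.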

4. Wait for metal, operations, and workloads clusters

Expect errors early on because the operations and workloads clusters cannot access Vault yet. The following applications are expected to fail temporarily:

  • Operations vCluster:
    • external-dns (cannot retrieve token from Vault)
    • cert-manager (cannot retrieve token from Vault)
    • Mimir, Loki, Grafana, Tempo, Grafana Agent (cannot retrieve token from Vault)
  • Workloads vCluster:
    • external-dns (cannot retrieve token from Vault)
    • cert-manager (cannot retrieve token from Vault)
    • Kubecost (cannot connect to Mimir)

The metal cluster should be healthy because it does not require Vault to retrieve tokens.
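
While waiting, it can help to see which Argo CD Applications are not yet healthy, so the expected failures listed above can be told apart from unexpected ones. The sketch below reads the sync and health fields from each Application's status; the function names are hypothetical, and it assumes the Applications live in the argocd namespace of the metal cluster.

```shell
# Print "name<TAB>sync<TAB>health" for every Argo CD Application.
app_status() {
  kubectl get applications.argoproj.io -n argocd \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.sync.status}{"\t"}{.status.health.status}{"\n"}{end}'
}

# Keep only rows whose health column is not "Healthy".
# Rows with an empty health column (not yet assessed) are kept as well.
not_healthy() {
  awk -F '\t' '$3 != "Healthy"'
}
```

Piping one into the other, as in app_status | not_healthy, prints only the applications that still need attention.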

5. Prepare vCluster kubeconfigs for Vault init

cd root/vault-init
vcluster connect metal-workloads \
    -n vcluster \
    --update-current=false \
    --server=https://workloads-vcluster.internal.systemrestartengineering.cloud \
    --kube-config ./workloads-kubeconfig.yaml

vcluster connect metal-operations \
    -n vcluster \
    --update-current=false \
    --server=https://operations-vcluster.internal.systemrestartengineering.cloud \
    --kube-config ./operations-kubeconfig.yaml
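
Before running Vault init, it may be worth checking that each generated kubeconfig can actually reach its vCluster API server. This is a minimal sketch with a hypothetical function name; it only verifies that the file is non-empty and that a simple read request succeeds.

```shell
# Succeed if the kubeconfig file exists, is non-empty, and the API server
# behind it answers a simple read request.
# $1: path to the kubeconfig file.
check_kubeconfig() {
  if [ ! -s "$1" ]; then
    echo "missing or empty kubeconfig: $1" >&2
    return 1
  fi
  kubectl --kubeconfig "$1" get namespaces >/dev/null
}
```

For the files written above, that would be check_kubeconfig ./workloads-kubeconfig.yaml and check_kubeconfig ./operations-kubeconfig.yaml. A failure here usually points at the private route or the --server URL rather than at Vault.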

6. Vault init

cd root/vault-init
tofu init
tofu apply
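
Once Vault init has been applied, the applications that failed in step 4 should start to recover as their secrets become available. One way to watch this is to list the ExternalSecret resources in each vCluster. The function below is a sketch with a hypothetical name; it assumes the kubeconfig files from step 5 and that the External Secrets Operator CRDs are installed in both vClusters.

```shell
# List ExternalSecrets in all namespaces for each kubeconfig given as an argument.
list_external_secrets() {
  for kubeconfig in "$@"; do
    echo "== $kubeconfig"
    kubectl --kubeconfig "$kubeconfig" \
      get externalsecrets.external-secrets.io --all-namespaces || return 1
  done
}
```

From root/vault-init, the call would be list_external_secrets ./operations-kubeconfig.yaml ./workloads-kubeconfig.yaml.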

Access private routes

ZeroTier is used to reach private routes. Applying the OpenTofu configuration in root/cloud-init creates a ZeroTier network (defined in zerotier.tf). Afterwards, install the ZeroTier client and join the network using the generated network ID. If you enable Tailscale instead, provide the auth key in root/cloud-init/terraform.tfvars.
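
Joining a ZeroTier network from a workstation can be sketched as follows. This assumes the zerotier-cli client is installed and that sudo is available; the function names are hypothetical. A ZeroTier network ID is 16 hexadecimal characters, so the sketch validates the ID before calling the client.

```shell
# Succeed if the argument looks like a ZeroTier network ID (16 hex characters).
is_zerotier_network_id() {
  printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{16}$'
}

# Validate the ID, then join the network. zerotier-cli usually needs root.
join_zerotier_network() {
  if ! is_zerotier_network_id "$1"; then
    echo "not a ZeroTier network ID: $1" >&2
    return 1
  fi
  sudo zerotier-cli join "$1"
}
```

Call join_zerotier_network with the network ID generated by OpenTofu. Depending on how the network is configured, the new member may still need to be authorized before traffic flows.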

vCluster

Retrieve the kubeconfig for the workloads cluster:

vcluster connect metal-workloads \
    -n vcluster \
    --update-current=false \
    --server=https://workloads-vcluster.internal.systemrestartengineering.cloud \
    --kube-config ./workloads-kubeconfig.yaml

Retrieve the kubeconfig for the operations cluster:

vcluster connect metal-operations \
    -n vcluster \
    --update-current=false \
    --server=https://operations-vcluster.internal.systemrestartengineering.cloud \
    --kube-config ./operations-kubeconfig.yaml

Known issues

Kubecost uses the "anonymous" tenant when sending data to Mimir

opencost/opencost/pull/1928

External DNS cannot update records for *.internal.systemrestartengineering.cloud

The operations and workloads clusters share the same DNS zone (*.internal.systemrestartengineering.cloud). If external-dns runs in both clusters, they will conflict and neither will be able to update records.

About

Argo CD source of truth for multiple Kubernetes clusters
