Container Orchestration Strategy
Current State
Running: Atlantis, Cloudflare Tunnel (via Ansible + Podman + systemd)
Planned: Zitadel, monitoring stack, more internal tools
Decision: Nomad
| Services | Solution | Complexity |
|---|---|---|
| 1-4 | Ansible + Podman/systemd | Low |
| 5-14 | Nomad (single node) | Medium |
| 15-49 | Nomad (cluster) | Medium |
| 50+ | Kubernetes | High |
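The tiers in the table can be sketched as a small lookup. This is only an illustration: the names `TIERS` and `recommend` are hypothetical, and the sketch assumes each boundary value (5, 15, 50) belongs to the higher tier, which matches the 5+ trigger for Nomad.

```python
from bisect import bisect_right

# (minimum service count, solution, complexity), taken from the table above.
# Assumption: a boundary value (5, 15, 50) belongs to the higher tier.
TIERS = [
    (1, "Ansible + Podman/systemd", "Low"),
    (5, "Nomad (single node)", "Medium"),
    (15, "Nomad (cluster)", "Medium"),
    (50, "Kubernetes", "High"),
]


def recommend(service_count: int) -> tuple[str, str]:
    """Return (solution, complexity) for a given number of services."""
    if service_count < 1:
        raise ValueError("service_count must be at least 1")
    lower_bounds = [tier[0] for tier in TIERS]
    # Pick the last tier whose lower bound is <= service_count.
    index = bisect_right(lower_bounds, service_count) - 1
    _, solution, complexity = TIERS[index]
    return solution, complexity
```

With the two services currently running, `recommend(2)` stays on the Ansible + Podman/systemd tier. The count alone is a rough guide; the other factors in this document (HA needs, client requirements) are not modelled here.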
Why Nomad:
- Lightweight (~50MB RAM vs. 1GB+ for Kubernetes)
- Single binary, works on single node
- Auto-healing, rolling deploys, health checks
- HashiCorp ecosystem (HCL job files align with the OpenTofu configuration language)
- Simple upgrade path to cluster when needed
Options Comparison
| Aspect | Ansible+Podman | Nomad | Kubernetes |
|---|---|---|---|
| Learning curve | Days | Weeks | Months |
| Min RAM | Minimal | ~50MB | 1GB+ |
| Auto-healing | Limited (systemd restarts only) | Yes | Yes |
| Rolling deploys | Manual | Built-in | Built-in |
| Single node | Yes | Yes | Awkward |
| HA cluster | N/A | 3+ nodes | 3+ nodes |
Recommended Path
Current: Ansible + Podman/systemd
↓ (at 5+ services)
Next: Nomad (single node)
↓ (if HA is needed)
Later: Nomad cluster
↓ (at 50+ services or if a client requires it)
Maybe: Kubernetes
Trigger for Nomad: managing 5+ services, or the Ansible playbooks becoming unwieldy.
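The path and its triggers can be read as a simple one-step state machine. The sketch below is illustrative only: `Stage`, `next_stage`, and the flag names `needs_ha`, `client_requires_k8s`, and `playbooks_unwieldy` are hypothetical names introduced here, and the function moves at most one step along the path per call.

```python
from enum import Enum


class Stage(Enum):
    ANSIBLE_PODMAN = "Ansible + Podman/systemd"
    NOMAD_SINGLE = "Nomad (single node)"
    NOMAD_CLUSTER = "Nomad cluster"
    KUBERNETES = "Kubernetes"


def next_stage(
    current: Stage,
    service_count: int,
    needs_ha: bool = False,
    client_requires_k8s: bool = False,
    playbooks_unwieldy: bool = False,
) -> Stage:
    """Return the next stage on the recommended path, or `current` if no trigger fires."""
    if current is Stage.ANSIBLE_PODMAN:
        # Trigger for Nomad: 5+ services or unwieldy playbooks.
        if service_count >= 5 or playbooks_unwieldy:
            return Stage.NOMAD_SINGLE
    elif current is Stage.NOMAD_SINGLE:
        # Move to a cluster only when HA is needed.
        if needs_ha:
            return Stage.NOMAD_CLUSTER
    elif current is Stage.NOMAD_CLUSTER:
        # Kubernetes only at 50+ services or on client demand.
        if service_count >= 50 or client_requires_k8s:
            return Stage.KUBERNETES
    return current
```

For the current state (Atlantis and Cloudflare Tunnel, so two services), `next_stage(Stage.ANSIBLE_PODMAN, 2)` returns `Stage.ANSIBLE_PODMAN`: no migration is triggered yet.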
Related Documentation
- Identity & Access - Authentication architecture
- Ansible & Podman - Current deployment guide
- Kubernetes Deep Dive - K8s evaluation