Merge branch 'feat/v0.5-vm-security' — 7 blocks + 3 shell + 2 Rust + /vm-provision

Workspace Cargo.toml reconciled: all 8 crates (kei-ledger, kei-migrate, kei-changelog, ssh-check, firewall-diff, mock-render, visual-diff, tokens-sync) as members.
Parfii-bot 2026-04-21 21:15:49 +08:00
commit 19850e1a45
30 changed files with 3674 additions and 1 deletions


@ -0,0 +1,50 @@
# DEPLOY — Hetzner Cloud (CX22 / CAX11 + TF + Cloud Firewall)
**Why Hetzner:** cheapest EU VPS with reputable network. CX22 (x86, 2 vCPU / 4 GB / 40 GB) = **€3.79/mo + VAT**; CAX11 (Ampere ARM64, 2 vCPU / 4 GB / 40 GB) = **€3.79/mo + VAT**. Prices verified on <https://www.hetzner.com/cloud/> [VERIFIED 2026-04-21]. Hourly billing caps at the monthly rate — safe to spin down for tests.
**Terraform provider:** `hetznercloud/hcloud` (official). Pin version:
```hcl
terraform {
  required_providers {
    hcloud = { source = "hetznercloud/hcloud", version = "~> 1.49" }
  }
}
provider "hcloud" { token = var.hcloud_token }
```
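Token via env: `export HCLOUD_TOKEN=$(grep ^HCLOUD_TOKEN= ~/.claude/secrets/.env | cut -d= -f2-)` for the `hcloud` CLI. The provider block above reads `var.hcloud_token`, so also `export TF_VAR_hcloud_token="$HCLOUD_TOKEN"` for Terraform. **NEVER commit the token** (RULE 0.8; see `domain-has-secrets.md`).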
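A minimal sketch of the token lookup as a reusable function, assuming the secrets file holds plain `KEY=VALUE` lines. The name `read_env_value` is hypothetical and not part of KeiSeiKit. It anchors the match on `KEY=` and keeps any `=` characters inside the value:

```bash
# Hypothetical helper: print the value of KEY from a KEY=VALUE env file.
# Usage: read_env_value FILE KEY
read_env_value() {
  local file="$1" key="$2"
  # -m1 stops at the first match; cut -f2- keeps any '=' inside the value.
  grep -m1 "^${key}=" "$file" | cut -d= -f2-
}

# Intended use (not executed here):
#   export HCLOUD_TOKEN="$(read_env_value ~/.claude/secrets/.env HCLOUD_TOKEN)"
```

The function prints nothing when the key is absent, so callers should check for an empty result before exporting.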
**Minimal `hcloud_server` resource:**
```hcl
resource "hcloud_server" "node" {
  name         = "kei-${var.env}-${var.role}"
  image        = "debian-12"
  server_type  = var.arch == "arm64" ? "cax11" : "cx22"
  location     = var.location # fsn1 / nbg1 / hel1 / ash / hil / sin
  ssh_keys     = [hcloud_ssh_key.admin.id]
  user_data    = file("${path.module}/cloud-init.yaml")
  firewall_ids = [hcloud_firewall.base.id] # attach the Cloud Firewall at creation
  labels       = { project = "kei", env = var.env }
}
```
`ssh_keys` is **mandatory** in this kit (the provider itself treats it as optional): when at least one key is passed, Hetzner does not generate a root password and e-mail it.
**Cloud Firewall (stateful, IN by default DENY):**
```hcl
resource "hcloud_firewall" "base" {
  name = "kei-base"
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = var.admin_cidrs
  }
  rule {
    direction  = "in"
    protocol   = "icmp"
    source_ips = ["0.0.0.0/0", "::/0"]
  }
  # Add app ports (443, 80) only when an app is deployed behind the node.
}
```
Attach to the server via `firewall_ids = [hcloud_firewall.base.id]`. Cloud Firewall is the FIRST line of defense: it drops traffic before it hits the VM's ufw (see `security-firewall-ufw.md`). Both layers MUST agree.
**Locations:** `fsn1` (Falkenstein DE), `nbg1` (Nürnberg DE), `hel1` (Helsinki FI), `ash` (Ashburn US), `hil` (Hillsboro US), `sin` (Singapore). Pick region closest to users; ARM64 `cax*` available in EU only [VERIFIED 2026-04-21].
**Snapshots + rescue:** `hcloud_snapshot` for golden images; `hcloud server enable-rescue` before SSH lockout recovery. Back up `user_data` and TF state (remote backend: S3-compatible such as R2).
**Primitives provided by KeiSeiKit:**
- `_primitives/provision-hetzner.sh` — wrapper around `hcloud` CLI, idempotent create/destroy, checks existing server by name first.
- Complement with `_primitives/harden-base.sh` run over SSH after first boot.
**Forbidden:** hcloud token in `.tf` or `.tfvars` committed to git; Cloud Firewall with port 22 open to `0.0.0.0/0`; creating servers with `keep_disk = false` then snapshotting (destroys data); using Hetzner Storage Boxes for anything needing low latency (they're SFTP-over-WAN).


@ -0,0 +1,79 @@
# DEPLOY — Generic VPS (provider-agnostic cloud-init + ssh-first-contact)
**Target providers:** DigitalOcean Droplets, Vultr, UpCloud, Linode/Akamai. Each has slightly different Terraform providers + CLIs, but the Day-0 contract is identical: **boot a Debian/Ubuntu image with a cloud-init user-data blob; add one admin SSH key; nothing else.**
**Day-0 cloud-init blob (`cloud-init.yaml`) — universal:**
```yaml
#cloud-config
hostname: kei-${env}-${role}
timezone: UTC
package_update: true
package_upgrade: true
packages:
  - ufw
  - fail2ban
  - unattended-upgrades
  - auditd
  - needrestart
  - curl
  - jq
users:
  - name: keiadmin
    groups: sudo
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ${ADMIN_PUBKEY}
ssh_pwauth: false
disable_root: true
write_files:
  - path: /etc/ssh/sshd_config.d/99-kei.conf
    permissions: '0644'
    content: |
      PasswordAuthentication no
      PermitRootLogin no
      MaxAuthTries 3
      AllowUsers keiadmin
      ClientAliveInterval 120
      ClientAliveCountMax 2
runcmd:
  - [ systemctl, restart, ssh ]
  - [ ufw, default, deny, incoming ]
  - [ ufw, default, allow, outgoing ]
  - [ ufw, allow, 22/tcp ]
  - [ ufw, --force, enable ]
```
The blob is intentionally provider-neutral. Provider-specific bits (private-network bring-up, metadata service quirks) go in a short appendix the provisioner appends. See `_primitives/harden-base.sh` for post-boot hardening re-runs.
**SSH-first-contact (`ssh-first-contact.sh` pattern):**
```bash
# Wait for cloud-init to finish AND sshd to be ready on the new IP.
for i in $(seq 1 60); do
  ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=accept-new \
    "keiadmin@$IP" "cloud-init status --wait" && break
  sleep 5
done
ssh "keiadmin@$IP" "sudo test -f /var/lib/cloud/instance/boot-finished"
```
`StrictHostKeyChecking=accept-new` is OK only for the FIRST contact (TOFU): it records the host key in `~/.ssh/known_hosts` and refuses a key that later changes, and subsequent connects use default strict mode. Never use `StrictHostKeyChecking=no`; it silently accepts a man-in-the-middle.
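The wait loop above falls through silently if all 60 attempts fail. A bounded retry helper that reports exhaustion could look like the sketch below. The name `retry` and its argument order are assumptions for illustration, not part of `ssh-first-contact.sh`:

```bash
# Hypothetical helper: run a command up to MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
# Usage: retry MAX DELAY command [args...]
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while [ "$attempt" -le "$max" ]; do
    "$@" && return 0
    attempt=$((attempt + 1))
    # Sleep only if another attempt follows.
    [ "$attempt" -le "$max" ] && sleep "$delay"
  done
  return 1
}

# Intended use (not executed here):
#   retry 60 5 ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=accept-new \
#     "keiadmin@$IP" "cloud-init status --wait" || exit 1
```

Returning non-zero on exhaustion lets the provisioner stop instead of continuing against a host that never came up.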
**Terraform skeleton (provider-agnostic via vars):**
```hcl
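variable "provider_kind" {} # "digitalocean" | "vultr" | "upcloud" | "linode"
variable "region" {}
variable "size_slug" {}     # provider-specific size id
variable "admin_pubkey" {}  # raw ssh-ed25519 …
variable "env" {}           # rendered into the cloud-init hostname
variable "role" {}          # rendered into the cloud-init hostname
locals {
  # templatefile() fails on any placeholder without a value, so pass all three.
  user_data = templatefile("${path.module}/cloud-init.yaml", {
    ADMIN_PUBKEY = var.admin_pubkey
    env          = var.env
    role         = var.role
  })
}
# ... then a module-per-provider resource that all read `local.user_data`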
```
Keep TF state **local per-env-per-dev by default**; upgrade to remote backend (R2, S3, Terraform Cloud) only when ≥ 2 humans share state.
**Per-provider gotchas (verified 2026-04-21):**
- **DigitalOcean:** Marketplace "Docker" images skip unattended-upgrades — start from plain Debian 12 instead. IPv6 requires `ipv6 = true` on the droplet.
- **Vultr:** `vultr-cli` needs `VULTR_API_KEY`; default firewall is OPEN — attach a firewall group or rely solely on ufw.
- **UpCloud:** IPs rotate on full stop+start unless you request `floating_ip`. Finnish ASN often preferred over Hetzner in RU-routed workloads (see `project-vortex.md`).
- **Linode:** cloud-init runs before disk resize on some plans → `growpart` may need a rerun on first `ssh`.
**Forbidden:** baking the admin private key into an AMI/snapshot; reusing one SSH keypair across envs; letting cloud-init pull scripts from a mutable URL (`curl … | bash` in `runcmd:` — pin to a hash); running `apt-get dist-upgrade -y` in `runcmd` without `needrestart` to surface pending reboots.
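"Pin to a hash" can be sketched as: download to a file, compare its SHA-256 against a value committed alongside the config, and run the script only on a match. The function name `verify_sha256` and the variables in the usage comment are hypothetical:

```bash
# Hypothetical helper: succeed only if FILE's SHA-256 equals EXPECTED (lowercase hex).
# Usage: verify_sha256 FILE EXPECTED
verify_sha256() {
  local file="$1" expected="$2" actual
  actual=$(sha256sum "$file" | cut -d' ' -f1) || return 1
  [ "$actual" = "$expected" ]
}

# Intended use (not executed here; URL and PINNED_SHA256 are placeholders):
#   curl -fsSL "$URL" -o /tmp/script.sh \
#     && verify_sha256 /tmp/script.sh "$PINNED_SHA256" \
#     && bash /tmp/script.sh
```

Unlike `curl … | bash`, nothing executes until the whole file has been fetched and checked.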


@ -0,0 +1,77 @@
# SECURITY — Audit Logging (auditd + journald forwarding)
**Goal:** every privileged action (sudo, ssh login, sensitive file edit) leaves a tamper-evident trail that survives the VM being reimaged.
**Stack:**
- `auditd` — Linux kernel audit framework, writes to `/var/log/audit/audit.log` in human-unfriendly but machine-parseable K/V format.
- `journald` — systemd's binary journal (`/var/log/journal/`), captures stdout/stderr of every service plus syslog stream.
- **Off-box shipping** (optional but recommended) — forward journald to a remote log collector (Loki, Vector, rsyslog+TLS). Local logs are destroyed on reimage.
**Install + enable:**
```
sudo apt install -y auditd audispd-plugins
sudo systemctl enable --now auditd
```
**Reference `/etc/audit/rules.d/99-kei.rules`:**
```
# KeiSeiKit audit baseline — pinned 2026-04-21. Loaded by augenrules on boot.
## 1. SSH events
-w /etc/ssh/sshd_config -p wa -k sshd_config
-w /etc/ssh/sshd_config.d/ -p wa -k sshd_config
-w /root/.ssh/ -p wa -k ssh_keys_root
-w /home/keiadmin/.ssh/ -p wa -k ssh_keys_admin
## 2. Sudo events
-w /etc/sudoers -p wa -k sudoers
-w /etc/sudoers.d/ -p wa -k sudoers
-a always,exit -F arch=b64 -S execve -F euid=0 -F auid>=1000 -F auid!=unset -k sudo_root
## 3. Privilege / identity changes
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/gshadow -p wa -k identity
## 4. Loading / unloading kernel modules
-a always,exit -F arch=b64 -S init_module -S finit_module -S delete_module -k module
## 5. Time changes (detect attempts to skew audit timestamps)
-a always,exit -F arch=b64 -S adjtimex -S settimeofday -S clock_settime -k time
-w /etc/localtime -p wa -k time
## 6. Make the config itself immutable (place LAST)
-e 2
```
`-e 2` locks the ruleset until reboot (tamper-resistant). Load with `sudo augenrules --load && sudo systemctl restart auditd`; once the lock is active, later rule changes only take effect after a reboot. Test with `sudo ausearch -k sshd_config | tail`.
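Because the lock only helps if it is the final directive, a pre-load check on the rules file can catch a misplaced `-e 2`. The sketch below inspects a single file; the function name `last_rule_is_lock` is hypothetical:

```bash
# Hypothetical check: succeed only if the last non-blank, non-comment line
# of an audit rules file is exactly "-e 2".
# Usage: last_rule_is_lock FILE
last_rule_is_lock() {
  local file="$1" last
  # Drop comment lines and blank lines, then keep the final remaining line.
  last=$(grep -Ev '^[[:space:]]*(#|$)' "$file" | tail -n 1)
  [ "$last" = "-e 2" ]
}

# Intended use (not executed here):
#   last_rule_is_lock /etc/audit/rules.d/99-kei.rules || echo "lock is not last" >&2
```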
**Human-readable summaries:** `sudo aureport -au` (auth events), `aureport -m` (module loads), `aureport -k` (keyed rule hits). Use these in incident response; raw `audit.log` is only for ingest pipelines.
**journald tuning — `/etc/systemd/journald.conf.d/99-kei.conf`:**
```
[Journal]
Storage=persistent
Compress=yes
SystemMaxUse=500M
SystemKeepFree=1G
MaxFileSec=1week
ForwardToSyslog=no
```
`Storage=persistent` creates `/var/log/journal/` — without it, `journalctl` history disappears on reboot. `MaxFileSec=1week` rotates weekly; combine with off-box shipping so you don't lose events.
**Off-box shipping patterns:**
- **systemd-journal-upload** — built-in, ships via HTTPS to a `systemd-journal-remote` receiver. Mutual-TLS recommended.
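- **Vector** (<https://vector.dev>): pull from its `journald` source, push to Loki/S3/syslog-TLS. Modern, Rust-native. The source reads the journal by running `journalctl`, so the Vector user needs read access to the journal (e.g. membership of the `systemd-journal` group).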
- **rsyslog → remote** — legacy path; useful if you already operate a syslog collector.
Any choice: use TLS, authenticate the receiver, do NOT push cleartext logs across the internet. Logs often contain secrets even when the app tries not to log them.
**Failure-mode handling:** `auditd` can be configured to panic the kernel when the audit queue fills (failure mode `-f 2`). That is reasonable for high-compliance hosts but DANGEROUS for general VMs. The default `/etc/audit/auditd.conf` has `disk_full_action = SUSPEND` and `disk_error_action = SUSPEND`; keep these, and switch to `HALT` only if a regulatory requirement demands it.
**Verification (skill Phase 5):**
- `sudo auditctl -l` returns the non-empty rule list.
- `systemctl is-active auditd` = `active`.
- `journalctl --disk-usage` shows a non-zero persistent journal.
- (Optional) an off-box log-receiver shows entries within the last N minutes.
**Forbidden:** deleting `/var/log/audit/audit.log` or `/var/log/journal/*` on a live host (breaks chain-of-custody); running with `-e 0` (auditing disabled) or leaving `-e 1` in place of the `-e 2` lock (an attacker with root can then change or flush the rules); shipping logs in cleartext; logging secrets (app-level concern; redact before `logger()`); disabling persistent journald.


@ -0,0 +1,62 @@
# SECURITY — Firewall (ufw default-deny + rate limiting + nftables alt)
**Posture — default-deny-in / allow-out:**
```
ufw default deny incoming
ufw default allow outgoing
ufw default deny routed # do NOT forward unless explicitly routing
ufw limit 22/tcp comment 'ssh (rate-limited: 6 conn / 30s)'
ufw logging medium
ufw --force enable
```
`ufw limit` = per-source-IP brute-force mitigation at the kernel level (iptables `recent` module). Use for SSH — *never* use it for app traffic (false positives on shared-NAT clients).
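To make the 6-connections-in-30-seconds threshold concrete, the sketch below applies a simplified model of the rule to a list of connection timestamps from one source IP. It is an illustration only (the kernel `recent` match has more state than this), and the function name `would_be_limited` is hypothetical:

```bash
# Simplified model of `ufw limit`: succeed (exit 0) if the given connection
# timestamps (seconds, ascending, one source IP) ever contain 6 attempts
# inside a 30-second window.
# Usage: would_be_limited T1 T2 ...
would_be_limited() {
  printf '%s\n' "$@" | awk '
    { t[NR] = $1 }
    END {
      # The 6th attempt trips the limit if it lands <30 s after the 1st of six.
      for (i = 6; i <= NR; i++)
        if (t[i] - t[i - 5] < 30) hit = 1
      exit hit ? 0 : 1
    }'
}
```

For example, six attempts one second apart trip the model, while six attempts ten seconds apart do not, which is why a shared-NAT office behind one IP can hit the limit on app traffic.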
**Layer ordering (read top-down):**
1. **Cloud Firewall** (Hetzner Cloud Firewall / AWS Security Group / DO Firewall) — drops at the provider edge, BEFORE packets hit the VM. Cheapest layer.
2. **ufw** on the VM — defence in depth; also covers provider-firewall misconfigs and private-network paths.
3. **App-level auth** — sshd keys, TLS client certs, app tokens.
Both the Cloud Firewall AND ufw must agree on the port allow-list. A mismatch means "it works from provider console but not from Tailscale" or vice-versa. Use `_primitives/_rust/firewall-diff/` to compare intended rules (YAML) against running `ufw status`.
**Intended-rules YAML schema (`firewall-intent.yaml`):**
```yaml
default:
  incoming: deny
  outgoing: allow
  routed: deny
rules:
  - port: 22
    proto: tcp
    action: limit
    from: any
    comment: "ssh (rate-limited)"
  - port: 443
    proto: tcp
    action: allow
    from: any
    comment: "https / caddy"
  - port: 80
    proto: tcp
    action: allow
    from: any
    comment: "http / acme-http-01"
```
`firewall-diff` parses the live `ufw status numbered` output into the same structure as this file, compares the two, and prints additions/deletions. It exits 0 iff live ≡ intent.
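The comparison itself is a set difference. The sketch below is not the `firewall-diff` implementation; it assumes both sides have already been normalized to one rule per line (for example `22/tcp limit`), and the function name `rules_diff` is hypothetical:

```bash
# Hypothetical sketch: compare two files of normalized rules, one per line.
# Prints "+ rule" for rules only in INTENT, "- rule" for rules only in LIVE.
# Exit 0 iff both rule sets are equal.
# Usage: rules_diff INTENT_FILE LIVE_FILE
rules_diff() {
  local intent="$1" live="$2" a b out
  a=$(mktemp)
  b=$(mktemp)
  sort -u "$intent" > "$a"
  sort -u "$live" > "$b"
  # comm -23: lines unique to the first file; comm -13: unique to the second.
  out=$( { comm -23 "$a" "$b" | sed 's/^/+ /'; comm -13 "$a" "$b" | sed 's/^/- /'; } )
  rm -f "$a" "$b"
  [ -z "$out" ] && return 0
  printf '%s\n' "$out"
  return 1
}
```

Sorting both sides first makes the result independent of rule order, which matters because `ufw status numbered` lists rules in insertion order.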
**Rate limiting patterns:**
- `limit` — built-in; 6 connections / 30 s per IP. Good for SSH.
- Per-app — do it inside the app or a reverse proxy (nginx `limit_req`, Caddy `rate_limit`), not in ufw. Kernel rate-limit doesn't understand HTTP methods.
- ICMP — `ufw default allow outgoing` covers outbound; inbound ICMP should be `allow` (echo) for monitoring, NOT blanket-blocked (blocks path-MTU discovery).
**IPv6:** `/etc/default/ufw` → `IPV6=yes` (default Debian 12). Verify that `ufw status verbose` shows the (v6) rules. Missing IPv6 rules = a trivial bypass on dual-stack VMs.
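A small check for the IPv6 switch, suitable for a verification phase, might look like this. The function name `ufw_ipv6_enabled` is hypothetical, and the file path defaults to the Debian location named above:

```bash
# Hypothetical check: succeed if the ufw defaults file enables IPv6.
# Usage: ufw_ipv6_enabled [FILE]   (FILE defaults to /etc/default/ufw)
ufw_ipv6_enabled() {
  local file="${1:-/etc/default/ufw}"
  # Accept IPV6=yes with or without quotes; commented-out lines do not match.
  grep -Eq '^IPV6="?yes"?[[:space:]]*$' "$file"
}
```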
**Logging:** `ufw logging medium` writes to `/var/log/ufw.log`. Forward to journald (default on systemd) or an off-box log collector. Logging `high` is too chatty for steady state; use it only during incident response.
**nftables alternative (for hosts that have Docker-installed iptables-nft):**
ufw is a thin wrapper over iptables/nftables; on Docker-heavy hosts, Docker's daemon aggressively rewrites iptables and can bypass ufw. Two options:
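1. **Disable Docker's iptables management:** set `"iptables": false` in `/etc/docker/daemon.json` (the same effect as the legacy `DOCKER_OPTS` flag `--iptables=false`) and do NAT yourself (advanced).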
2. **`ufw-docker`** companion (<https://github.com/chaifeng/ufw-docker>, not bundled in Debian — pin a tagged release, review the script BEFORE install).
On non-Docker hosts, ufw is sufficient. On Docker hosts, EITHER isolate (dedicated host + Cloud Firewall only) OR use `ufw-docker` — don't half-configure.
**Forbidden:** `ufw default allow incoming` "temporarily"; `allow from any to any port 22` without `limit`; skipping the IPv6 rule set; letting Docker silently override ufw without disabling its iptables chain; relying on `ufw` as the ONLY layer when a Cloud Firewall is available.


@ -0,0 +1,62 @@
# SECURITY — Patching (unattended-upgrades + needrestart + reboot window)
**Goal:** security patches applied within 24 h of release, service restarts + kernel reboots happen within a declared maintenance window (NOT ad-hoc at 3 AM UTC on a random Tuesday).
**Install:**
```
sudo apt install -y unattended-upgrades needrestart
```
**`/etc/apt/apt.conf.d/50unattended-upgrades` (essential lines, Debian 12 / Ubuntu 22.04+):**
```
Unattended-Upgrade::Origins-Pattern {
"origin=Debian,codename=${distro_codename}-security";
"origin=Debian,codename=${distro_codename}-updates";
};
Unattended-Upgrade::Automatic-Reboot "false";
Unattended-Upgrade::Automatic-Reboot-Time "04:00";
Unattended-Upgrade::Mail "admin@example.com";
Unattended-Upgrade::MailReport "on-change";
```
`Automatic-Reboot "false"` is the SAFE default — an automatic reboot without coordination kills in-flight requests. Pair with `needrestart` to SURFACE reboot requirement, then schedule the window explicitly (below).
**`/etc/apt/apt.conf.d/20auto-upgrades`:**
```
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Periodic::AutocleanInterval "7";
```
Triggers daily via `/lib/systemd/system/apt-daily.timer` + `apt-daily-upgrade.timer`.
**needrestart:** after each upgrade, prints services that loaded old library versions and need restart. `/etc/needrestart/needrestart.conf`:
```
$nrconf{restart} = 'l'; # list only; do NOT auto-restart services
$nrconf{kernelhints} = -1; # suppress "reboot hint" interactive prompt (non-TTY cron)
```
`$nrconf{restart} = 'a'` (auto) is tempting but dangerous: restarting `postgresql` or a stateful app during a migration can corrupt data.
**Reboot window pattern (declared, env-var-driven):**
```bash
# /etc/systemd/system/kei-reboot-window.service + .timer
# Only reboots if /var/run/reboot-required exists AND the current time
# falls inside the declared window.
[Service]
Type=oneshot
EnvironmentFile=/etc/default/kei-reboot-window
ExecStart=/usr/local/bin/kei-reboot-window
# /etc/default/kei-reboot-window
KEI_REBOOT_DOW="Sun" # day-of-week
KEI_REBOOT_HOUR="04" # 24h, UTC
KEI_REBOOT_MIN="15"
KEI_DRAIN_CMD="" # optional pre-reboot drain (e.g. drain a load-balancer slot)
```
`kei-reboot-window` script checks `[ -f /var/run/reboot-required ]`, verifies it is the declared DOW/hour, runs `$KEI_DRAIN_CMD`, then `systemctl reboot`. Commit the script once; reuse the env file per-host.
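The decision logic described above can be sketched as a single function. This is an assumption about how the script might be structured, not the shipped `kei-reboot-window`. It covers the flag file and the DOW/hour match, and leaves `KEI_REBOOT_MIN` to the timer:

```bash
# Sketch of the decision logic only; variable names follow the env file above.
# Succeeds (exit 0) when a reboot should happen now.
# Usage: should_reboot FLAG_FILE NOW_DOW NOW_HOUR
should_reboot() {
  local flag="$1" now_dow="$2" now_hour="$3"
  [ -f "$flag" ] || return 1                              # nothing pending
  [ "$now_dow" = "${KEI_REBOOT_DOW:-Sun}" ] || return 1   # wrong day
  [ "$now_hour" = "${KEI_REBOOT_HOUR:-04}" ] || return 1  # wrong hour
  return 0
}

# Intended call site (not executed here):
#   if should_reboot /var/run/reboot-required \
#        "$(LC_ALL=C date -u +%a)" "$(date -u +%H)"; then
#     [ -n "$KEI_DRAIN_CMD" ] && sh -c "$KEI_DRAIN_CMD"
#     systemctl reboot
#   fi
```

Passing the current day and hour as arguments keeps the function testable without touching the system clock.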
**Provider-specific:**
- **Hetzner Cloud / Vultr / UpCloud / DigitalOcean / Linode** — nothing extra; cloud-init already installs the packages per `deploy-vps-generic.md`.
- **AWS EC2:** `ec2-instance-connect` may briefly reject SSH during a reboot; tolerate this with retries in orchestration.
**Auditability:** `unattended-upgrades` logs to `/var/log/unattended-upgrades/unattended-upgrades.log`. Forward via journald (see `security-audit-logging.md`). Package a short summary in the skill Phase 5 report.
**Forbidden:** `Unattended-Upgrade::Automatic-Reboot "true"` on stateful services; `$nrconf{restart} = 'a'` on a database host; silently skipping the reboot window to "avoid downtime" (real fix: HA, not skipped patches); installing `.deb` packages from third-party repos without pinning + signature verification; disabling the `apt-daily.timer` — disables ALL security updates.


@ -0,0 +1,51 @@
# SECURITY — SSH Hardening (sshd_config.d/99-kei.conf)
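**Rule:** hardening goes into a drop-in under `/etc/ssh/sshd_config.d/`, NEVER by editing `/etc/ssh/sshd_config` directly. The main file ships with distro-owned defaults. On Debian/Ubuntu it `Include`s the drop-in directory at the top, and sshd keeps the FIRST value it reads for each keyword, so drop-ins override the defaults below them and survive package upgrades cleanly. The same rule means a lower-numbered drop-in (e.g. `50-cloud-init.conf` on some cloud images) takes precedence over `99-kei.conf`; confirm the effective values with `sudo sshd -T`.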
**Reference file `/etc/ssh/sshd_config.d/99-kei.conf`:**
```
# KeiSeiKit hardened SSH — pinned 2026-04-21, auditable via ssh-check.
Protocol 2
PasswordAuthentication no
ChallengeResponseAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin prohibit-password
PermitEmptyPasswords no
UsePAM yes
MaxAuthTries 3
MaxSessions 4
LoginGraceTime 20
AllowUsers keiadmin
AllowTcpForwarding no
X11Forwarding no
PermitTunnel no
ClientAliveInterval 120
ClientAliveCountMax 2
LogLevel VERBOSE
# Modern crypto only (OpenSSH ≥ 8.9, default Debian 12 / Ubuntu 22.04+):
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,sntrup761x25519-sha512@openssh.com
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512,rsa-sha2-256
```
Apply with `sshd -t` (config test) before `systemctl reload ssh`. `reload` NOT `restart` — restart kills existing sessions; reload re-reads config while keeping them.
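The test-then-reload order can be enforced in a wrapper so the reload never runs on a broken config. The function name and the `KEI_SSHD_TEST` / `KEI_SSHD_RELOAD` override variables below are hypothetical, added only to make the sketch testable:

```bash
# Hypothetical wrapper: reload sshd only if the config test passes.
# KEI_SSHD_TEST / KEI_SSHD_RELOAD are illustrative overrides, not part of the kit.
reload_sshd_checked() {
  local test_cmd="${KEI_SSHD_TEST:-sshd -t}"
  local reload_cmd="${KEI_SSHD_RELOAD:-systemctl reload ssh}"
  if ! sh -c "$test_cmd"; then
    echo "sshd config test failed; not reloading" >&2
    return 1
  fi
  sh -c "$reload_cmd"
}
```

Keep an existing SSH session open while running this, so a mistake that passes `sshd -t` but locks out new logins can still be reverted.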
**Field-by-field rationale:**
- `PasswordAuthentication no` — passwords are the #1 SSH brute-force vector. Keys only.
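- `PermitRootLogin prohibit-password`: root only via key, never password. `no` blocks even emergency cloud-console rescue paths on some providers; `prohibit-password` is the pragmatic middle. With `AllowUsers keiadmin` in the same file, root is still refused over SSH unless it is added to that list.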
- `MaxAuthTries 3` — reduces per-connection key/password attempts; combine with fail2ban for per-IP bans (separate concern).
- `AllowUsers keiadmin` — whitelist is simpler than group-based DENY and audits trivially. Adding users = explicit edit.
- `LogLevel VERBOSE` — logs the key fingerprint used; without it you can't tell which admin logged in after compromise.
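- `ClientAliveInterval 120` + `ClientAliveCountMax 2`: sshd probes the client every 120 s and drops the connection after 2 unanswered probes, so dead sessions are gone in about 4 minutes. A lost or sleeping laptop doesn't leave an open shell; a connected but idle client keeps answering and is not disconnected by this setting.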
- `AllowTcpForwarding no` / `PermitTunnel no` — disables SSH-as-VPN. Enable per-use-case via `Match User tunneluser` only.
**Modern KEX/Cipher/MAC lists (2026-04-21):**
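- KEX: `sntrup761x25519-sha512@openssh.com` is a post-quantum hybrid (default since OpenSSH 9.0; OpenSSH 9.9 added `mlkem768x25519-sha256`) [VERIFIED https://www.openssh.com/releasenotes.html]; `curve25519-sha256` is the classic ECDH.
- Ciphers: AEAD only (`chacha20-poly1305`, `aes*-gcm`); CBC modes are dropped. Terrapin (CVE-2023-48795) affects `chacha20-poly1305@openssh.com` and CBC with ETM MACs when strict KEX is not negotiated, so keep OpenSSH patched (strict KEX arrived in 9.6 and was backported by major distros).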
- MACs: ETM (Encrypt-Then-MAC) only. Legacy MAC-Then-Encrypt is dropped.
- HostKey: prefer `ssh-ed25519`; keep `rsa-sha2-*` for older client compatibility. Drop `ssh-rsa` (SHA-1, broken).
**Verification (KeiSeiKit primitive):**
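`_primitives/_rust/ssh-check/` parses BOTH `sshd_config` AND every `sshd_config.d/*.conf` (in filename sort order) and reports violations of the matrix above with `file:line` precision. sshd itself keeps the first value it obtains per directive, so conflicts should be resolved first-wins; `sudo sshd -T` prints the effective configuration for cross-checking. Run BEFORE every `systemctl reload ssh` and BEFORE the skill phase-5 verify gate.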
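Per `sshd_config(5)`, sshd uses the first value it obtains for each keyword. The sketch below resolves one keyword across files given in read order; it is not the `ssh-check` implementation, it ignores `Match` blocks and `Include` lines, and the function name `sshd_effective` is hypothetical:

```bash
# Hypothetical helper: print the value sshd would use for KEYWORD, given config
# files in the order sshd reads them. Ignores Match blocks and Include lines.
# Usage: sshd_effective KEYWORD FILE...
sshd_effective() {
  local keyword="$1"
  shift
  # Keywords are case-insensitive; the first occurrence wins, so stop there.
  awk -v kw="$(printf '%s' "$keyword" | tr '[:upper:]' '[:lower:]')" '
    tolower($1) == kw { $1 = ""; sub(/^[ \t]+/, ""); print; exit }
  ' "$@"
}

# Intended use (not executed here):
#   sshd_effective PasswordAuthentication /etc/ssh/sshd_config.d/*.conf /etc/ssh/sshd_config
```

With a `50-cloud-init.conf` that sets `PasswordAuthentication yes`, this prints `yes` even though `99-kei.conf` says `no`, which is exactly the conflict worth catching before a reload.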
**Forbidden:** editing `/etc/ssh/sshd_config` in-place when a drop-in directory exists; `PermitRootLogin yes`; `PasswordAuthentication yes`; accepting any `diffie-hellman-group1-*` / `ssh-rsa` / CBC ciphers; restarting sshd before `sshd -t` passes; relying on fail2ban alone without key-only auth.


@ -0,0 +1,68 @@
# SECURITY — TLS via Caddy (automatic ACME, HTTP-01 / DNS-01)
**Why Caddy:** zero-config TLS. Caddy 2 auto-provisions certificates via Let's Encrypt / ZeroSSL on first request for a domain that resolves to it, auto-renews, and stores state under `/var/lib/caddy/`. Official docs: <https://caddyserver.com/docs/automatic-https> [VERIFIED 2026-04-21].
**One-liner install (Debian/Ubuntu, official repo):**
```
# Pinned to official Cloudsmith repo — NEVER `curl … | bash` a random domain.
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' \
| sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' \
| sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update && sudo apt install -y caddy
```
This installs the `caddy` systemd service owned by `caddy:caddy`. **Never run Caddy as root** — it uses `CAP_NET_BIND_SERVICE` ambient capability to bind low ports.
**Minimal `/etc/caddy/Caddyfile`:**
```
{
    # Global options
    email admin@example.com # ACME account contact (change!)
    # auto_https disable_redirects # uncomment only if fronted by another TLS-terminating proxy
}
api.example.com {
    encode zstd gzip
    log {
        output file /var/log/caddy/api.log {
            roll_size 10mb
            roll_keep 10
        }
    }
    reverse_proxy 127.0.0.1:8080
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains"
        X-Content-Type-Options "nosniff"
        Referrer-Policy "strict-origin-when-cross-origin"
        -Server
    }
}
```
`caddy validate --config /etc/caddy/Caddyfile` BEFORE `systemctl reload caddy`. Reload ≠ restart; reload is zero-downtime.
**ACME challenge choice:**
- **HTTP-01** (default) — Caddy binds port 80, LE connects back, serves challenge. Requires: port 80 open to the internet, DNS pointing to the VM. Works for single-host public services.
- **DNS-01** — Caddy writes a TXT record via DNS provider API, doesn't need port 80 open. **Required for wildcard certs** (`*.example.com`) and for LAN-only hosts. Needs a DNS-provider plugin (e.g. `caddy-dns/cloudflare`) compiled into the binary — use `xcaddy build` or the Cloudsmith `caddy-dns-*` packages.
**DNS-01 with Cloudflare (`caddy-dns/cloudflare`):**
```
*.internal.example.com, internal.example.com {
    tls {
        dns cloudflare {env.CF_API_TOKEN}
    }
    reverse_proxy 127.0.0.1:8080
}
```
`CF_API_TOKEN` — store in `/etc/caddy/caddy.env` (chmod 0640, `caddy:caddy`), load via systemd drop-in `EnvironmentFile=`. Never bake the token into the Caddyfile (RULE 0.8 — see `domain-has-secrets.md`).
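A permission check on the env file can run alongside `caddy validate`. The sketch below assumes GNU `stat` (as on Debian/Ubuntu); the function name `env_file_mode_ok` and the list of accepted modes are illustrative choices:

```bash
# Hypothetical check: succeed only if FILE has mode 0640 or stricter
# (no world access, no group write). Uses GNU stat's -c format option.
# Usage: env_file_mode_ok FILE
env_file_mode_ok() {
  local mode
  mode=$(stat -c '%a' "$1") || return 1
  case "$mode" in
    400|440|600|640) return 0 ;;
    *) return 1 ;;
  esac
}

# Intended use (not executed here):
#   env_file_mode_ok /etc/caddy/caddy.env || echo "caddy.env is too permissive" >&2
```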
**CT log awareness:** every LE cert is published to Certificate Transparency logs. **Any hostname you request a certificate for is publicly searchable** via crt.sh. Use DNS-01 + wildcard for internal services whose names should not leak.
**Firewall interop (see `security-firewall-ufw.md`):** `ufw allow 80,443/tcp` is required for HTTP-01 and for public HTTPS. Do NOT open 80 if using DNS-01 exclusively and not redirecting HTTP→HTTPS publicly; skip the redirect with `auto_https disable_redirects`.
**Hardening:**
- `HSTS` as shown above — 1 year, include subdomains. Add `preload` only after submitting to the HSTS preload list.
- `-Server` header strip — removes Caddy version disclosure.
- Rate limit via `caddy-ratelimit` module (needs `xcaddy build` with the plugin) for per-IP throttling; otherwise rely on cloud/ufw layer.
**Forbidden:** running Caddy as root; embedding DNS/ACME API tokens in the Caddyfile; using `tls internal` (self-signed, ephemeral CA) for anything reachable from outside localhost; skipping `caddy validate` before reload; self-hosting ACME (step-ca is great, but needs its own runbook — out of scope here).

_primitives/_rust/.gitignore

@ -0,0 +1 @@
target/

_primitives/_rust/Cargo.lock (generated)

@ -0,0 +1,608 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "anstream"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "824a212faf96e9acacdbd09febd34438f8f711fb84e09a8916013cd7815ca28d"
dependencies = [
"anstyle",
"anstyle-parse",
"anstyle-query",
"anstyle-wincon",
"colorchoice",
"is_terminal_polyfill",
"utf8parse",
]
[[package]]
name = "anstyle"
version = "1.0.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "940b3a0ca603d1eade50a4846a2afffd5ef57a9feac2c0e2ec2e14f9ead76000"
[[package]]
name = "anstyle-parse"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52ce7f38b242319f7cabaa6813055467063ecdc9d355bbb4ce0c68908cd8130e"
dependencies = [
"utf8parse",
]
[[package]]
name = "anstyle-query"
version = "1.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "40c48f72fd53cd289104fc64099abca73db4166ad86ea0b4341abe65af83dadc"
dependencies = [
"windows-sys",
]
[[package]]
name = "anstyle-wincon"
version = "3.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "291e6a250ff86cd4a820112fb8898808a366d8f9f58ce16d1f538353ad55747d"
dependencies = [
"anstyle",
"once_cell_polyfill",
"windows-sys",
]
[[package]]
name = "anyhow"
version = "1.0.102"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c"
[[package]]
name = "bitflags"
version = "2.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4512299f36f043ab09a583e57bceb5a5aab7a73db1805848e8fef3c9e8c78b3"
[[package]]
name = "cfg-if"
version = "1.0.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
[[package]]
name = "clap"
version = "4.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1ddb117e43bbf7dacf0a4190fef4d345b9bad68dfc649cb349e7d17d28428e51"
dependencies = [
"clap_builder",
"clap_derive",
]
[[package]]
name = "clap_builder"
version = "4.6.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "714a53001bf66416adb0e2ef5ac857140e7dc3a0c48fb28b2f10762fc4b5069f"
dependencies = [
"anstream",
"anstyle",
"clap_lex",
"strsim",
]
[[package]]
name = "clap_derive"
version = "4.6.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f2ce8604710f6733aa641a2b3731eaa1e8b3d9973d5e3565da11800813f997a9"
dependencies = [
"heck",
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "clap_lex"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"
[[package]]
name = "colorchoice"
version = "1.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1d07550c9036bf2ae0c684c4297d503f838287c83c53686d05370d0e139ae570"
[[package]]
name = "equivalent"
version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "877a4ace8713b0bcf2a4e7eec82529c029f1d0619886d18145fea96c3ffe5c0f"
[[package]]
name = "errno"
version = "0.3.14"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb"
dependencies = [
"libc",
"windows-sys",
]
[[package]]
name = "fastrand"
version = "2.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f1f227452a390804cdb637b74a86990f2a7d7ba4b7d5693aac9b4dd6defd8d6"
[[package]]
name = "firewall-diff"
version = "0.1.0"
dependencies = [
"clap",
"serde",
"serde_json",
"serde_yaml",
"tempfile",
]
[[package]]
name = "foldhash"
version = "0.1.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d9c4f5dac5e15c24eb999c26181a6ca40b39fe946cbe4c263c7209467bc83af2"
[[package]]
name = "getrandom"
version = "0.4.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0de51e6874e94e7bf76d726fc5d13ba782deca734ff60d5bb2fb2607c7406555"
dependencies = [
"cfg-if",
"libc",
"r-efi",
"wasip2",
"wasip3",
]
[[package]]
name = "hashbrown"
version = "0.15.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9229cfe53dfd69f0609a49f65461bd93001ea1ef889cd5529dd176593f5338a1"
dependencies = [
"foldhash",
]
[[package]]
name = "hashbrown"
version = "0.17.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4f467dd6dccf739c208452f8014c75c18bb8301b050ad1cfb27153803edb0f51"
[[package]]
name = "heck"
version = "0.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea"
[[package]]
name = "id-arena"
version = "2.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3d3067d79b975e8844ca9eb072e16b31c3c1c36928edf9c6789548c524d0d954"
[[package]]
name = "indexmap"
version = "2.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d466e9454f08e4a911e14806c24e16fba1b4c121d1ea474396f396069cf949d9"
dependencies = [
"equivalent",
"hashbrown 0.17.0",
"serde",
"serde_core",
]
[[package]]
name = "is_terminal_polyfill"
version = "1.70.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a6cb138bb79a146c1bd460005623e142ef0181e3d0219cb493e02f7d08a35695"
[[package]]
name = "itoa"
version = "1.0.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682"
[[package]]
name = "leb128fmt"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "09edd9e8b54e49e587e4f6295a7d29c3ea94d469cb40ab8ca70b288248a81db2"
[[package]]
name = "libc"
version = "0.2.185"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "52ff2c0fe9bc6cb6b14a0592c2ff4fa9ceb83eea9db979b0487cd054946a2b8f"
[[package]]
name = "linux-raw-sys"
version = "0.12.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32a66949e030da00e8c7d4434b251670a91556f4144941d37452769c25d58a53"
[[package]]
name = "log"
version = "0.4.29"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5e5032e24019045c762d3c0f28f5b6b8bbf38563a65908389bf7978758920897"
[[package]]
name = "memchr"
version = "2.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f8ca58f447f06ed17d5fc4043ce1b10dd205e060fb3ce5b979b8ed8e59ff3f79"
[[package]]
name = "once_cell"
version = "1.21.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50"
[[package]]
name = "once_cell_polyfill"
version = "1.70.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe"
[[package]]
name = "prettyplease"
version = "0.2.37"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "479ca8adacdd7ce8f1fb39ce9ecccbfe93a3f1344b3d0d97f20bc0196208f62b"
dependencies = [
"proc-macro2",
"syn",
]
[[package]]
name = "proc-macro2"
version = "1.0.106"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934"
dependencies = [
"unicode-ident",
]
[[package]]
name = "quote"
version = "1.0.45"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924"
dependencies = [
"proc-macro2",
]
[[package]]
name = "r-efi"
version = "6.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f8dcc9c7d52a811697d2151c701e0d08956f92b0e24136cf4cf27b57a6a0d9bf"
[[package]]
name = "rustix"
version = "1.1.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b6fe4565b9518b83ef4f91bb47ce29620ca828bd32cb7e408f0062e9930ba190"
dependencies = [
"bitflags",
"errno",
"libc",
"linux-raw-sys",
"windows-sys",
]
[[package]]
name = "ryu"
version = "1.0.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9774ba4a74de5f7b1c1451ed6cd5285a32eddb5cccb8cc655a4e50009e06477f"
[[package]]
name = "semver"
version = "1.0.28"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a7852d02fc848982e0c167ef163aaff9cd91dc640ba85e263cb1ce46fae51cd"
[[package]]
name = "serde"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
dependencies = [
"serde_core",
"serde_derive",
]
[[package]]
name = "serde_core"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
dependencies = [
"serde_derive",
]
[[package]]
name = "serde_derive"
version = "1.0.228"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "serde_json"
version = "1.0.149"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "83fc039473c5595ace860d8c4fafa220ff474b3fc6bfdb4293327f1a37e94d86"
dependencies = [
"itoa",
"memchr",
"serde",
"serde_core",
"zmij",
]
[[package]]
name = "serde_yaml"
version = "0.9.34+deprecated"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6a8b1a1a2ebf674015cc02edccce75287f1a0130d394307b36743c2f5d504b47"
dependencies = [
"indexmap",
"itoa",
"ryu",
"serde",
"unsafe-libyaml",
]
[[package]]
name = "ssh-check"
version = "0.1.0"
dependencies = [
"clap",
"serde",
"serde_json",
"tempfile",
]
[[package]]
name = "strsim"
version = "0.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7da8b5736845d9f2fcb837ea5d9e2628564b3b043a70948a3f0b778838c5fb4f"
[[package]]
name = "syn"
version = "2.0.117"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "tempfile"
version = "3.27.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32497e9a4c7b38532efcdebeef879707aa9f794296a4f0244f6f69e9bc8574bd"
dependencies = [
"fastrand",
"getrandom",
"once_cell",
"rustix",
"windows-sys",
]
[[package]]
name = "unicode-ident"
version = "1.0.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75"
[[package]]
name = "unicode-xid"
version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ebc1c04c71510c7f702b52b7c350734c9ff1295c464a03335b00bb84fc54f853"
[[package]]
name = "unsafe-libyaml"
version = "0.2.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "673aac59facbab8a9007c7f6108d11f63b603f7cabff99fabf650fea5c32b861"
[[package]]
name = "utf8parse"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
[[package]]
name = "wasip2"
version = "1.0.3+wasi-0.2.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "20064672db26d7cdc89c7798c48a0fdfac8213434a1186e5ef29fd560ae223d6"
dependencies = [
"wit-bindgen 0.57.1",
]
[[package]]
name = "wasip3"
version = "0.4.0+wasi-0.3.0-rc-2026-01-06"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5428f8bf88ea5ddc08faddef2ac4a67e390b88186c703ce6dbd955e1c145aca5"
dependencies = [
"wit-bindgen 0.51.0",
]
[[package]]
name = "wasm-encoder"
version = "0.244.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "990065f2fe63003fe337b932cfb5e3b80e0b4d0f5ff650e6985b1048f62c8319"
dependencies = [
"leb128fmt",
"wasmparser",
]
[[package]]
name = "wasm-metadata"
version = "0.244.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bb0e353e6a2fbdc176932bbaab493762eb1255a7900fe0fea1a2f96c296cc909"
dependencies = [
"anyhow",
"indexmap",
"wasm-encoder",
"wasmparser",
]
[[package]]
name = "wasmparser"
version = "0.244.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "47b807c72e1bac69382b3a6fb3dbe8ea4c0ed87ff5629b8685ae6b9a611028fe"
dependencies = [
"bitflags",
"hashbrown 0.15.5",
"indexmap",
"semver",
]
[[package]]
name = "windows-link"
version = "0.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5"
[[package]]
name = "windows-sys"
version = "0.61.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ae137229bcbd6cdf0f7b80a31df61766145077ddf49416a728b02cb3921ff3fc"
dependencies = [
"windows-link",
]
[[package]]
name = "wit-bindgen"
version = "0.51.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d7249219f66ced02969388cf2bb044a09756a083d0fab1e566056b04d9fbcaa5"
dependencies = [
"wit-bindgen-rust-macro",
]
[[package]]
name = "wit-bindgen"
version = "0.57.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1ebf944e87a7c253233ad6766e082e3cd714b5d03812acc24c318f549614536e"
[[package]]
name = "wit-bindgen-core"
version = "0.51.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ea61de684c3ea68cb082b7a88508a8b27fcc8b797d738bfc99a82facf1d752dc"
dependencies = [
"anyhow",
"heck",
"wit-parser",
]
[[package]]
name = "wit-bindgen-rust"
version = "0.51.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b7c566e0f4b284dd6561c786d9cb0142da491f46a9fbed79ea69cdad5db17f21"
dependencies = [
"anyhow",
"heck",
"indexmap",
"prettyplease",
"syn",
"wasm-metadata",
"wit-bindgen-core",
"wit-component",
]
[[package]]
name = "wit-bindgen-rust-macro"
version = "0.51.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0c0f9bfd77e6a48eccf51359e3ae77140a7f50b1e2ebfe62422d8afdaffab17a"
dependencies = [
"anyhow",
"prettyplease",
"proc-macro2",
"quote",
"syn",
"wit-bindgen-core",
"wit-bindgen-rust",
]
[[package]]
name = "wit-component"
version = "0.244.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9d66ea20e9553b30172b5e831994e35fbde2d165325bec84fc43dbf6f4eb9cb2"
dependencies = [
"anyhow",
"bitflags",
"indexmap",
"log",
"serde",
"serde_derive",
"serde_json",
"wasm-encoder",
"wasm-metadata",
"wasmparser",
"wit-parser",
]
[[package]]
name = "wit-parser"
version = "0.244.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ecc8ac4bc1dc3381b7f59c34f00b67e18f910c2c0f50015669dde7def656a736"
dependencies = [
"anyhow",
"id-arena",
"indexmap",
"log",
"semver",
"serde",
"serde_derive",
"serde_json",
"unicode-xid",
"wasmparser",
]
[[package]]
name = "zmij"
version = "1.0.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"


@ -1,14 +1,25 @@
[workspace]
resolver = "2"
members = ["mock-render", "visual-diff", "tokens-sync", "kei-ledger"]
members = [
"kei-ledger",
"kei-migrate",
"kei-changelog",
"ssh-check",
"firewall-diff",
"mock-render",
"visual-diff",
"tokens-sync",
]
[workspace.package]
edition = "2021"
rust-version = "1.75"
[workspace.dependencies]
clap = { version = "4", features = ["derive"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
serde_yaml = "0.9"
sha2 = "0.10"
image = { version = "0.25", default-features = false, features = ["png"] }


@ -0,0 +1,18 @@
[package]
name = "firewall-diff"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
[[bin]]
name = "firewall-diff"
path = "src/main.rs"
[dependencies]
clap = { workspace = true }
serde = { workspace = true }
serde_yaml = { workspace = true }
serde_json = { workspace = true }
[dev-dependencies]
tempfile = "3"


@ -0,0 +1,193 @@
//! Compare Intent × Live and emit a structured report.
use crate::intent::{Action, Intent, Rule};
use crate::ufw::{Live, LiveRule};
use serde::Serialize;
use std::collections::HashSet;
#[derive(Debug, Clone, Serialize)]
pub struct Report {
pub active_ok: bool,
pub default_mismatches: Vec<String>,
pub missing: Vec<Rule>, // in intent, not in live
pub extra: Vec<LiveRule>, // in live, not in intent
}
impl Report {
pub fn is_clean(&self) -> bool {
self.active_ok
&& self.default_mismatches.is_empty()
&& self.missing.is_empty()
&& self.extra.is_empty()
}
}
pub fn compare(intent: &Intent, live: &Live) -> Report {
let active_ok = live.active;
let mut default_mismatches = Vec::new();
if !matches!(intent.default.incoming, Action::Deny | Action::Reject) {
default_mismatches
.push("intent.default.incoming must be deny/reject for production".to_string());
}
// Build key sets.
let intent_keys: HashSet<String> = intent.rules.iter().map(Rule::key).collect();
let live_keys: HashSet<String> = live.rules.iter().map(LiveRule::key).collect();
let missing: Vec<Rule> = intent
.rules
.iter()
.filter(|r| !live_keys.contains(&r.key()))
.cloned()
.collect();
let extra: Vec<LiveRule> = live
.rules
.iter()
.filter(|r| !intent_keys.contains(&r.key()))
.cloned()
.collect();
Report {
active_ok,
default_mismatches,
missing,
extra,
}
}
pub fn render_human(r: &Report) {
if !r.active_ok {
println!("[FAIL] ufw is not active.");
}
for m in &r.default_mismatches {
println!("[WARN] default: {m}");
}
for m in &r.missing {
println!(
"[MISS] intent rule not live: {}/{} from={} action={:?}",
m.port, m.proto, m.from, m.action
);
}
for e in &r.extra {
println!(
"[EXTRA] live rule not in intent: {}/{} from={} action={:?} family={:?}",
e.port, e.proto, e.from, e.action, e.family
);
}
if r.is_clean() {
println!("firewall-diff: OK — intent ≡ live.");
} else {
println!(
"firewall-diff: {} missing, {} extra, default-issues={}",
r.missing.len(),
r.extra.len(),
r.default_mismatches.len()
);
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::intent::{Action, Defaults, Intent, Rule};
use crate::ufw::{self, Family, Live, LiveRule};
fn intent_fx() -> Intent {
Intent {
default: Defaults {
incoming: Action::Deny,
outgoing: Action::Allow,
routed: Action::Deny,
},
rules: vec![
Rule {
port: 22,
proto: "tcp".into(),
action: Action::Limit,
from: "any".into(),
comment: "ssh".into(),
},
Rule {
port: 443,
proto: "tcp".into(),
action: Action::Allow,
from: "any".into(),
comment: "".into(),
},
],
}
}
fn live_fx(items: &[(u16, &str, Action, &str)]) -> Live {
Live {
active: true,
rules: items
.iter()
.map(|(p, pr, a, f)| LiveRule {
port: *p,
proto: (*pr).into(),
action: a.clone(),
from: (*f).into(),
family: Family::V4,
})
.collect(),
}
}
#[test]
fn exact_match_is_clean() {
let i = intent_fx();
let l = live_fx(&[
(22, "tcp", Action::Limit, "any"),
(443, "tcp", Action::Allow, "any"),
]);
let r = compare(&i, &l);
assert!(r.is_clean(), "{:#?}", r);
}
#[test]
fn missing_rule_surfaced() {
let i = intent_fx();
let l = live_fx(&[(22, "tcp", Action::Limit, "any")]);
let r = compare(&i, &l);
assert_eq!(r.missing.len(), 1);
assert_eq!(r.missing[0].port, 443);
}
#[test]
fn extra_live_rule_surfaced() {
let i = intent_fx();
let l = live_fx(&[
(22, "tcp", Action::Limit, "any"),
(443, "tcp", Action::Allow, "any"),
(8080, "tcp", Action::Allow, "any"),
]);
let r = compare(&i, &l);
assert_eq!(r.extra.len(), 1);
assert_eq!(r.extra[0].port, 8080);
}
#[test]
fn inactive_ufw_fails() {
let i = intent_fx();
let l = Live {
active: false,
rules: vec![],
};
let r = compare(&i, &l);
assert!(!r.is_clean());
assert!(!r.active_ok);
}
#[test]
fn integration_parse_then_diff() {
// Mimic real `ufw status numbered` column padding (double-space gaps).
let text = "Status: active\n\n\
[ 1] 22/tcp                     LIMIT IN    Anywhere\n\
[ 2] 443/tcp                    ALLOW IN    Anywhere\n";
let live = ufw::parse(text).unwrap();
let r = compare(&intent_fx(), &live);
assert!(r.is_clean(), "{:#?}", r);
}
}
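
The comparison above reduces to two set differences over canonical rule keys. A minimal, std-only sketch of that idea follows; `SimpleRule`, `rule`, and `key_diff` are hypothetical names invented for illustration (they are not the crate's `Rule`, `LiveRule`, or `compare`), and the key layout is assumed to mirror the `port/proto::from::action` format used in this commit.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified rule shape (not the crate's `Rule` / `LiveRule`).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SimpleRule {
    pub port: u16,
    pub proto: String,
    pub from: String,
    pub action: String,
}

impl SimpleRule {
    /// Canonical key: lowercased fields, with ufw's "Anywhere" folded into "any".
    pub fn key(&self) -> String {
        let from = if self.from.eq_ignore_ascii_case("anywhere") {
            "any".to_string()
        } else {
            self.from.to_ascii_lowercase()
        };
        format!(
            "{}/{}::{}::{}",
            self.port,
            self.proto.to_ascii_lowercase(),
            from,
            self.action.to_ascii_lowercase()
        )
    }
}

/// Convenience constructor used only by this sketch.
pub fn rule(port: u16, proto: &str, from: &str, action: &str) -> SimpleRule {
    SimpleRule {
        port,
        proto: proto.to_string(),
        from: from.to_string(),
        action: action.to_string(),
    }
}

/// Returns `(missing, extra)`: intent rules with no live counterpart, and
/// live rules with no intent counterpart.
pub fn key_diff(
    intent: &[SimpleRule],
    live: &[SimpleRule],
) -> (Vec<SimpleRule>, Vec<SimpleRule>) {
    let intent_keys: HashSet<String> = intent.iter().map(SimpleRule::key).collect();
    let live_keys: HashSet<String> = live.iter().map(SimpleRule::key).collect();
    let missing: Vec<SimpleRule> = intent
        .iter()
        .filter(|r| !live_keys.contains(&r.key()))
        .cloned()
        .collect();
    let extra: Vec<SimpleRule> = live
        .iter()
        .filter(|r| !intent_keys.contains(&r.key()))
        .cloned()
        .collect();
    (missing, extra)
}
```

Calling `key_diff` with an intent of 22/tcp limit and 443/tcp allow against a live set of 22/tcp LIMIT from Anywhere and 8080/tcp allow should report 443 as missing and 8080 as extra, because the keys are compared after normalisation rather than as raw strings.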


@ -0,0 +1,111 @@
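//! Intent YAML schema + loader. See `_blocks/security-firewall-ufw.md` for
//! the reference format. The top-level `default` block is required; fields
//! omitted inside it or inside a rule fall back to the serde defaults below
//! (incoming/routed: deny, outgoing: allow, proto: tcp, from: any).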
use serde::{Deserialize, Serialize};
use std::fs;
use std::path::Path;
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, Eq)]
#[serde(rename_all = "lowercase")]
pub enum Action {
Allow,
Deny,
Limit,
Reject,
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, Eq)]
pub struct Defaults {
#[serde(default = "default_deny")]
pub incoming: Action,
#[serde(default = "default_allow")]
pub outgoing: Action,
#[serde(default = "default_deny")]
pub routed: Action,
}
fn default_deny() -> Action {
Action::Deny
}
fn default_allow() -> Action {
Action::Allow
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, Eq)]
pub struct Rule {
pub port: u16,
#[serde(default = "default_proto")]
pub proto: String,
pub action: Action,
#[serde(default = "default_from")]
pub from: String,
#[serde(default)]
pub comment: String,
}
fn default_proto() -> String {
"tcp".into()
}
fn default_from() -> String {
"any".into()
}
#[derive(Debug, Clone, Deserialize, Serialize, PartialEq, Eq)]
pub struct Intent {
pub default: Defaults,
#[serde(default)]
pub rules: Vec<Rule>,
}
pub fn load(path: &Path) -> Result<Intent, String> {
let body = fs::read_to_string(path).map_err(|e| format!("read {}: {e}", path.display()))?;
serde_yaml::from_str(&body).map_err(|e| format!("yaml: {e}"))
}
impl Rule {
/// Canonical key used to match against a live rule: port/proto/from/action.
pub fn key(&self) -> String {
format!(
"{}/{}::{}::{}",
self.port,
self.proto.to_ascii_lowercase(),
self.from.to_ascii_lowercase(),
format!("{:?}", self.action).to_ascii_lowercase()
)
}
}
#[cfg(test)]
mod tests {
use super::*;
use std::io::Write;
#[test]
fn load_minimal_intent() {
let dir = tempfile::tempdir().unwrap();
let p = dir.path().join("intent.yaml");
let mut f = fs::File::create(&p).unwrap();
writeln!(
f,
r#"default:
  incoming: deny
  outgoing: allow
  routed: deny
rules:
  - port: 22
    proto: tcp
    action: limit
    from: any
    comment: "ssh"
  - port: 443
    proto: tcp
    action: allow
    from: any
"#
)
.unwrap();
let i = load(&p).unwrap();
assert_eq!(i.default.incoming, Action::Deny);
assert_eq!(i.rules.len(), 2);
assert_eq!(i.rules[0].action, Action::Limit);
assert_eq!(i.rules[1].port, 443);
}
}


@ -0,0 +1,101 @@
//! firewall-diff — compare an intended ufw rule set (YAML) against the
//! running firewall (parsed from `ufw status numbered` output).
//!
//! USAGE
//! firewall-diff --intent firewall-intent.yaml --status-file live.txt
//! ufw status numbered | firewall-diff --intent firewall-intent.yaml --stdin
//! firewall-diff --intent firewall-intent.yaml --status-file live.txt --json
//!
//! The tool does NOT execute `ufw` itself (defensive-only). Feed it the
//! output of `ufw status numbered` or have the skill pipe it in.
//!
//! EXIT
//! 0 intent ≡ live (no diff)
//! 1 usage / parse error
//! 2 differences found (live deviates from intent)
mod diff;
mod intent;
mod ufw;
use clap::Parser;
use std::fs;
use std::io::{self, Read};
use std::path::PathBuf;
use std::process::ExitCode;
#[derive(Parser, Debug)]
#[command(name = "firewall-diff", about = "Diff intended ufw rules (YAML) vs live status.")]
struct Cli {
/// Path to the intent YAML file.
#[arg(long)]
intent: PathBuf,
/// Path to a file holding captured `ufw status numbered` output.
#[arg(long, conflicts_with = "stdin")]
status_file: Option<PathBuf>,
/// Read the ufw status text from stdin (use when piping from the host).
#[arg(long)]
stdin: bool,
/// Emit JSON instead of human text.
#[arg(long)]
json: bool,
}
fn main() -> ExitCode {
let cli = Cli::parse();
let intent = match intent::load(&cli.intent) {
Ok(i) => i,
Err(e) => {
eprintln!("firewall-diff: intent: {e}");
return ExitCode::from(1);
}
};
let status_txt = match (&cli.status_file, cli.stdin) {
(Some(p), false) => match fs::read_to_string(p) {
Ok(s) => s,
Err(e) => {
eprintln!("firewall-diff: read {}: {e}", p.display());
return ExitCode::from(1);
}
},
(None, true) => {
let mut s = String::new();
if let Err(e) = io::stdin().read_to_string(&mut s) {
eprintln!("firewall-diff: stdin: {e}");
return ExitCode::from(1);
}
s
}
_ => {
eprintln!("firewall-diff: need --status-file <path> or --stdin");
return ExitCode::from(1);
}
};
let live = match ufw::parse(&status_txt) {
Ok(l) => l,
Err(e) => {
eprintln!("firewall-diff: parse ufw status: {e}");
return ExitCode::from(1);
}
};
let report = diff::compare(&intent, &live);
if cli.json {
println!("{}", serde_json::to_string_pretty(&report).unwrap_or_default());
} else {
diff::render_human(&report);
}
if report.is_clean() {
ExitCode::SUCCESS
} else {
ExitCode::from(2)
}
}


@ -0,0 +1,173 @@
//! Parse `ufw status numbered` output.
//!
//! Typical shape (Ubuntu 22.04, ufw 0.36):
//!
//!   Status: active
//!
//!        To                         Action      From
//!        --                         ------      ----
//!   [ 1] 22/tcp                     LIMIT IN    Anywhere
//!   [ 2] 443/tcp                    ALLOW IN    Anywhere
//!   [ 3] 22/tcp (v6)                LIMIT IN    Anywhere (v6)
//!
//! We normalise "(v6)" to a separate family tag but leave it out of the
//! match key, which is port/proto/from/action (v6 and v4 rules that agree on
//! those fields are treated as duplicates of the same intent rule, which is
//! usually the desired behaviour for parity checks).
use crate::intent::Action;
use serde::Serialize;
#[derive(Debug, Clone, Serialize, PartialEq, Eq)]
pub struct LiveRule {
pub port: u16,
pub proto: String,
pub action: Action,
pub from: String,
pub family: Family,
}
#[derive(Debug, Clone, Serialize, PartialEq, Eq)]
pub enum Family {
V4,
V6,
}
#[derive(Debug, Clone, Serialize)]
pub struct Live {
pub active: bool,
pub rules: Vec<LiveRule>,
}
pub fn parse(text: &str) -> Result<Live, String> {
let mut active = false;
let mut rules = Vec::new();
for raw in text.lines() {
let line = raw.trim();
if line.is_empty() {
continue;
}
if let Some(rest) = line.strip_prefix("Status:") {
active = rest.trim().eq_ignore_ascii_case("active");
continue;
}
if line.starts_with("To") || line.starts_with("--") {
continue;
}
if let Some(r) = parse_rule(line) {
rules.push(r);
}
}
if text.trim().is_empty() {
return Err("could not detect an `ufw status` block (empty input)".into());
}
Ok(Live { active, rules })
}
/// Parse one numbered rule line. Returns None if the line is not a rule.
fn parse_rule(line: &str) -> Option<LiveRule> {
// Strip leading "[ N]" if present.
let body = if let Some(idx) = line.find(']') {
line[idx + 1..].trim()
} else {
line
};
// Columns: <to>  <ACTION IN|OUT|FWD>  <from>
// ufw pads columns with runs of 2+ spaces, so split on a double space and
// drop the empty pieces; single spaces inside a column ("LIMIT IN") survive.
let parts: Vec<&str> = body.split("  ").map(str::trim).filter(|s| !s.is_empty()).collect();
if parts.len() < 3 {
return None;
}
let to = parts[0];
let action_raw = parts[1];
let from = parts[2];
let (to_clean, family) = if to.contains("(v6)") {
(to.replace("(v6)", "").trim().to_string(), Family::V6)
} else {
(to.to_string(), Family::V4)
};
let (port, proto) = split_port_proto(&to_clean)?;
let action = parse_action(action_raw)?;
Some(LiveRule {
port,
proto,
action,
from: from.replace("(v6)", "").trim().to_string(),
family,
})
}
fn split_port_proto(tok: &str) -> Option<(u16, String)> {
// "22/tcp" | "53" | "443/udp"
if let Some((port_s, proto_s)) = tok.split_once('/') {
Some((port_s.parse().ok()?, proto_s.to_ascii_lowercase()))
} else {
Some((tok.parse().ok()?, "tcp".into()))
}
}
fn parse_action(raw: &str) -> Option<Action> {
let up = raw.to_ascii_uppercase();
if up.starts_with("ALLOW") {
Some(Action::Allow)
} else if up.starts_with("DENY") {
Some(Action::Deny)
} else if up.starts_with("LIMIT") {
Some(Action::Limit)
} else if up.starts_with("REJECT") {
Some(Action::Reject)
} else {
None
}
}
impl LiveRule {
pub fn key(&self) -> String {
let from = if self.from.eq_ignore_ascii_case("Anywhere") {
"any"
} else {
&self.from
};
format!(
"{}/{}::{}::{}",
self.port,
self.proto,
from.to_ascii_lowercase(),
format!("{:?}", self.action).to_ascii_lowercase()
)
}
}
#[cfg(test)]
mod tests {
use super::*;
const SAMPLE: &str = r#"
Status: active

     To                         Action      From
     --                         ------      ----
[ 1] 22/tcp                     LIMIT IN    Anywhere
[ 2] 443/tcp                    ALLOW IN    Anywhere
[ 3] 22/tcp (v6)                LIMIT IN    Anywhere (v6)
"#;
#[test]
fn parses_active_and_rules() {
let l = parse(SAMPLE).unwrap();
assert!(l.active);
assert_eq!(l.rules.len(), 3);
assert_eq!(l.rules[0].port, 22);
assert_eq!(l.rules[0].proto, "tcp");
assert_eq!(l.rules[0].action, Action::Limit);
assert_eq!(l.rules[2].family, Family::V6);
}
#[test]
fn inactive_status_rejects_only_if_no_rules() {
let l = parse("Status: inactive\n").unwrap();
assert!(!l.active);
assert!(l.rules.is_empty());
}
}
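
The column handling in this parser can be illustrated in isolation. The sketch below is std-only and uses a hypothetical helper, `split_columns`, that is not part of this commit; it assumes, as the module comment does, that `ufw status numbered` separates columns with runs of two or more spaces while keeping single spaces inside a column.

```rust
/// Hypothetical helper: split one `ufw status numbered` row into its
/// (to, action, from) columns. Returns `None` for non-rule lines.
pub fn split_columns(line: &str) -> Option<(String, String, String)> {
    // Drop the leading "[ N]" index when present.
    let body = match line.find(']') {
        Some(idx) => line[idx + 1..].trim(),
        None => line.trim(),
    };
    // Columns are padded with runs of two or more spaces; single spaces
    // ("LIMIT IN", "22/tcp (v6)") stay inside their column.
    let cols: Vec<&str> = body
        .split("  ")
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .collect();
    if cols.len() < 3 {
        return None;
    }
    Some((cols[0].to_string(), cols[1].to_string(), cols[2].to_string()))
}
```

Splitting on single whitespace instead would break `LIMIT IN` and `Anywhere (v6)` into separate tokens, which is why the double-space delimiter matters and why captured status text must keep its original padding.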


@ -0,0 +1,17 @@
[package]
name = "ssh-check"
version = "0.1.0"
edition.workspace = true
rust-version.workspace = true
[[bin]]
name = "ssh-check"
path = "src/main.rs"
[dependencies]
clap = { workspace = true }
serde = { workspace = true }
serde_json = { workspace = true }
[dev-dependencies]
tempfile = "3"


@ -0,0 +1,213 @@
//! Evaluate the hardened rule matrix against a merged sshd_config view.
use crate::parse::Merged;
use crate::rules::{Expect, Rule};
use serde::Serialize;
#[derive(Debug, Clone, PartialEq, Eq, Serialize)]
pub enum Severity {
Ok,
Warn,
Fail,
}
impl Severity {
pub fn label(&self) -> &'static str {
match self {
Severity::Ok => "OK",
Severity::Warn => "WARN",
Severity::Fail => "FAIL",
}
}
}
#[derive(Debug, Clone, Serialize)]
pub struct Finding {
pub directive: String,
pub severity: Severity,
pub source: String,
pub note: String,
}
pub fn evaluate(merged: &Merged, rules: &[Rule]) -> Vec<Finding> {
let mut out = Vec::with_capacity(rules.len());
for r in rules {
out.push(eval_rule(merged, r));
}
out
}
fn eval_rule(merged: &Merged, r: &Rule) -> Finding {
let occ = merged.effective.get(r.directive);
match (occ, r.required) {
(None, true) => Finding {
directive: r.directive.into(),
severity: Severity::Fail,
source: "(missing)".into(),
note: format!("required directive absent — {}", r.rationale),
},
(None, false) => Finding {
directive: r.directive.into(),
severity: Severity::Warn,
source: "(missing)".into(),
note: format!("recommended: {}", r.rationale),
},
(Some(o), _) => {
let ok = value_matches(&o.value, &r.expect);
Finding {
directive: r.directive.into(),
severity: if ok { Severity::Ok } else { Severity::Fail },
source: o.source.clone(),
note: if ok {
"ok".into()
} else {
format!("value '{}' violates policy — {}", o.value, r.rationale)
},
}
}
}
}
fn value_matches(value: &str, expect: &Expect) -> bool {
let v = value.trim().to_ascii_lowercase();
match expect {
Expect::Equals(target) => v == target.to_ascii_lowercase(),
Expect::OneOf(list) => list.iter().any(|s| v == s.to_ascii_lowercase()),
Expect::MaxInt(max) => v.parse::<u32>().map(|n| n <= *max).unwrap_or(false),
Expect::ContainsAll(tokens) => tokens.iter().all(|t| v.contains(&t.to_ascii_lowercase())),
Expect::DeniesAny(tokens) => {
let parts: Vec<&str> = v.split(',').map(str::trim).collect();
!tokens
.iter()
.any(|t| parts.iter().any(|p| p == &t.to_ascii_lowercase()))
}
Expect::AllowedUsersSubset(allow) => {
// Usernames are case-sensitive, so compare the raw value rather than
// the lowercased `v`.
let parts: Vec<String> = value
.split_whitespace()
.map(|s| s.to_string())
.collect();
!parts.is_empty() && parts.iter().all(|u| allow.contains(u))
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::parse::{Merged, Occurrence};
use crate::rules::hardened_matrix;
use std::collections::BTreeMap;
fn merged(pairs: &[(&str, &str)]) -> Merged {
let mut m = Merged {
effective: BTreeMap::new(),
all: BTreeMap::new(),
};
for (k, v) in pairs {
let occ = Occurrence {
value: (*v).to_string(),
source: "test:1".into(),
};
m.effective.insert((*k).to_string(), occ.clone());
m.all.insert((*k).to_string(), vec![occ]);
}
m
}
#[test]
fn hardened_baseline_passes() {
let rules = hardened_matrix(&["keiadmin".into()]);
let mg = merged(&[
("passwordauthentication", "no"),
("permitrootlogin", "prohibit-password"),
("maxauthtries", "3"),
("allowusers", "keiadmin"),
("ciphers", "chacha20-poly1305@openssh.com,aes256-gcm@openssh.com"),
("macs", "hmac-sha2-512-etm@openssh.com"),
("hostkeyalgorithms", "ssh-ed25519,rsa-sha2-512"),
]);
let findings = evaluate(&mg, &rules);
let fails: Vec<_> = findings.iter().filter(|f| f.severity == Severity::Fail).collect();
assert!(fails.is_empty(), "unexpected fails: {fails:#?}");
}
#[test]
fn password_auth_yes_fails() {
let rules = hardened_matrix(&["keiadmin".into()]);
let mg = merged(&[
("passwordauthentication", "yes"),
("permitrootlogin", "no"),
("maxauthtries", "3"),
("allowusers", "keiadmin"),
]);
let findings = evaluate(&mg, &rules);
let f = findings
.iter()
.find(|f| f.directive == "passwordauthentication")
.unwrap();
assert_eq!(f.severity, Severity::Fail);
}
#[test]
fn cbc_cipher_fails() {
let rules = hardened_matrix(&["keiadmin".into()]);
let mg = merged(&[
("passwordauthentication", "no"),
("permitrootlogin", "no"),
("maxauthtries", "3"),
("allowusers", "keiadmin"),
("ciphers", "aes256-cbc,chacha20-poly1305@openssh.com"),
]);
let findings = evaluate(&mg, &rules);
let f = findings.iter().find(|f| f.directive == "ciphers").unwrap();
assert_eq!(f.severity, Severity::Fail);
}
#[test]
fn allow_users_not_in_whitelist_fails() {
let rules = hardened_matrix(&["keiadmin".into()]);
let mg = merged(&[
("passwordauthentication", "no"),
("permitrootlogin", "no"),
("maxauthtries", "3"),
("allowusers", "root attacker"),
]);
let findings = evaluate(&mg, &rules);
let f = findings.iter().find(|f| f.directive == "allowusers").unwrap();
assert_eq!(f.severity, Severity::Fail);
}
#[test]
fn missing_required_directive_fails() {
let rules = hardened_matrix(&["keiadmin".into()]);
let mg = merged(&[
("permitrootlogin", "no"),
("maxauthtries", "3"),
("allowusers", "keiadmin"),
]);
let findings = evaluate(&mg, &rules);
let f = findings
.iter()
.find(|f| f.directive == "passwordauthentication")
.unwrap();
assert_eq!(f.severity, Severity::Fail);
assert_eq!(f.source, "(missing)");
}
#[test]
fn maxauthtries_too_high_fails() {
let rules = hardened_matrix(&["keiadmin".into()]);
let mg = merged(&[
("passwordauthentication", "no"),
("permitrootlogin", "no"),
("maxauthtries", "10"),
("allowusers", "keiadmin"),
]);
let findings = evaluate(&mg, &rules);
let f = findings
.iter()
.find(|f| f.directive == "maxauthtries")
.unwrap();
assert_eq!(f.severity, Severity::Fail);
}
}
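
Two of the matchers above, the numeric bound and the deny-list, can be sketched without the rest of the crate. `Policy` and `satisfies` below are hypothetical names for a trimmed-down stand-in for `Expect` and `value_matches`; the banned cipher names in the usage note are examples, not the contents of the real rule matrix.

```rust
/// Hypothetical, trimmed-down stand-in for the `Expect` enum.
pub enum Policy {
    /// Value must parse as an integer no larger than the bound.
    MaxInt(u32),
    /// Comma-separated value must not contain any of these tokens.
    DeniesAny(Vec<&'static str>),
}

/// Returns true when `value` satisfies `policy` (case-insensitive).
pub fn satisfies(value: &str, policy: &Policy) -> bool {
    let v = value.trim().to_ascii_lowercase();
    match policy {
        // Non-numeric input is treated as a violation, not as a pass.
        Policy::MaxInt(max) => v.parse::<u32>().map(|n| n <= *max).unwrap_or(false),
        Policy::DeniesAny(banned) => {
            // Compare whole list entries so that a banned name does not match
            // as a substring of an allowed algorithm.
            let parts: Vec<&str> = v.split(',').map(str::trim).collect();
            !banned
                .iter()
                .any(|b| parts.iter().any(|p| p.eq_ignore_ascii_case(b)))
        }
    }
}
```

For example, `satisfies("10", &Policy::MaxInt(3))` is false, and a `Ciphers` value that lists `aes256-cbc` fails a `DeniesAny` policy that bans it, while a value with only AEAD ciphers passes.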


@ -0,0 +1,102 @@
//! ssh-check — pre-deploy sshd_config linter for KeiSeiKit.
//!
//! Reads /etc/ssh/sshd_config + every /etc/ssh/sshd_config.d/*.conf (or
//! user-supplied paths), merges directives with first-occurrence-wins
//! precedence (as sshd does), and reports violations of the
//! hardened-baseline rule matrix.
//!
//! USAGE
//! ssh-check # default system paths
//! ssh-check --config /etc/ssh/sshd_config --drop-in /etc/ssh/sshd_config.d
//! ssh-check --json # JSON output for CI
//! ssh-check --allow-user admin # allowed user (replaces the default `keiadmin`)
//!
//! EXIT
//! 0 no violations
//! 1 usage / parse error
//! 2 violations found
mod check;
mod parse;
mod rules;
use clap::Parser;
use std::path::PathBuf;
use std::process::ExitCode;
#[derive(Parser, Debug)]
#[command(
name = "ssh-check",
about = "Lint sshd_config + drop-ins against the KeiSeiKit hardened baseline."
)]
struct Cli {
/// Main sshd_config file.
#[arg(long, default_value = "/etc/ssh/sshd_config")]
config: PathBuf,
/// Drop-in directory (sshd_config.d). Pass empty string to skip.
#[arg(long, default_value = "/etc/ssh/sshd_config.d")]
drop_in: PathBuf,
/// Usernames that are acceptable in AllowUsers (repeatable).
#[arg(long = "allow-user")]
allow_user: Vec<String>,
/// Emit JSON instead of human text.
#[arg(long)]
json: bool,
}
fn main() -> ExitCode {
let cli = Cli::parse();
let merged = match parse::load_merged(&cli.config, &cli.drop_in) {
Ok(m) => m,
Err(e) => {
eprintln!("ssh-check: {e}");
return ExitCode::from(1);
}
};
let allow_users: Vec<String> = if cli.allow_user.is_empty() {
vec!["keiadmin".into()]
} else {
cli.allow_user
};
let matrix = rules::hardened_matrix(&allow_users);
let findings = check::evaluate(&merged, &matrix);
if cli.json {
let out = serde_json::to_string_pretty(&findings).unwrap_or_default();
println!("{out}");
} else {
render_human(&findings);
}
if findings.iter().any(|f| f.severity != check::Severity::Ok) {
ExitCode::from(2)
} else {
ExitCode::SUCCESS
}
}
fn render_human(findings: &[check::Finding]) {
let mut bad = 0usize;
for f in findings {
if f.severity == check::Severity::Ok {
continue;
}
bad += 1;
println!(
"[{sev:<5}] {directive:<28} {source} ({note})",
sev = f.severity.label(),
directive = f.directive,
source = f.source,
note = f.note
);
}
if bad == 0 {
println!("ssh-check: OK — hardened baseline satisfied.");
} else {
println!("ssh-check: {bad} violation(s).");
}
}


@ -0,0 +1,127 @@
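//! sshd_config parser: read main file + drop-ins and merge with first-wins
//! precedence, as sshd does (the first occurrence of a directive is the
//! effective one). Files are read main file first, then drop-ins in
//! filename-sort order, and ALL occurrences are kept to report duplicates.
//! Caveat: `Include` lines are not followed, so if the main file pulls in the
//! drop-in directory near its top (as stock Debian/Ubuntu configs do), sshd
//! reads those drop-ins first and its effective value may differ.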
use std::collections::BTreeMap;
use std::fs;
use std::path::{Path, PathBuf};
/// A single directive occurrence: its value plus the `<file>:<line>` it came from.
#[derive(Debug, Clone)]
pub struct Occurrence {
pub value: String,
pub source: String, // "<file>:<line>"
}
/// Merged view: directive name (lowercased) → first-occurrence value +
/// every occurrence for duplicate detection.
#[derive(Debug, Default)]
pub struct Merged {
pub effective: BTreeMap<String, Occurrence>,
pub all: BTreeMap<String, Vec<Occurrence>>,
}
pub fn load_merged(main: &Path, drop_in: &Path) -> Result<Merged, String> {
let mut files: Vec<PathBuf> = Vec::new();
if main.exists() {
files.push(main.to_path_buf());
} else {
return Err(format!("main config not found: {}", main.display()));
}
// Drop-in dir is optional; pass empty path to skip.
if !drop_in.as_os_str().is_empty() && drop_in.is_dir() {
let mut dropins: Vec<PathBuf> = fs::read_dir(drop_in)
.map_err(|e| format!("read {}: {e}", drop_in.display()))?
.filter_map(|e| e.ok().map(|e| e.path()))
.filter(|p| p.extension().map(|s| s == "conf").unwrap_or(false))
.collect();
dropins.sort();
files.extend(dropins);
}
let mut merged = Merged::default();
for path in files {
let body =
fs::read_to_string(&path).map_err(|e| format!("read {}: {e}", path.display()))?;
for (lineno, raw) in body.lines().enumerate() {
if let Some((k, v)) = parse_line(raw) {
let occ = Occurrence {
value: v,
source: format!("{}:{}", path.display(), lineno + 1),
};
merged
.all
.entry(k.clone())
.or_default()
.push(occ.clone());
// First occurrence wins in OpenSSH — do NOT overwrite.
merged.effective.entry(k).or_insert(occ);
}
}
}
Ok(merged)
}
/// Parse one config line. Returns (lowercased_directive, raw_value) or None
/// for comments / blanks / Include (we don't recurse includes by design —
/// the skill wires explicit paths).
fn parse_line(raw: &str) -> Option<(String, String)> {
let stripped = raw.split('#').next().unwrap_or("").trim();
if stripped.is_empty() {
return None;
}
let mut parts = stripped.splitn(2, char::is_whitespace);
let name = parts.next()?.trim().to_ascii_lowercase();
let value = parts.next().unwrap_or("").trim().to_string();
if name == "include" || name == "match" {
return None;
}
Some((name, value))
}
#[cfg(test)]
mod tests {
use super::*;
fn write(dir: &Path, name: &str, body: &str) -> PathBuf {
let p = dir.join(name);
fs::write(&p, body).unwrap();
p
}
#[test]
fn parses_directives_and_ignores_comments() {
let dir = tempfile::tempdir().unwrap();
let main = write(dir.path(), "sshd_config", "# header\nPort 22\nPasswordAuthentication no\n");
let m = load_merged(&main, Path::new("")).unwrap();
assert_eq!(m.effective["port"].value, "22");
assert_eq!(m.effective["passwordauthentication"].value, "no");
}
#[test]
fn drop_in_does_not_override_main_effective_value() {
// OpenSSH: first occurrence wins. Main is read first.
let dir = tempfile::tempdir().unwrap();
let main = write(dir.path(), "sshd_config", "Port 22\n");
let d = dir.path().join("sshd_config.d");
fs::create_dir(&d).unwrap();
write(&d, "99-kei.conf", "Port 2222\n");
let m = load_merged(&main, &d).unwrap();
assert_eq!(m.effective["port"].value, "22");
assert_eq!(m.all["port"].len(), 2, "both occurrences recorded");
}
#[test]
fn include_and_match_are_skipped() {
let dir = tempfile::tempdir().unwrap();
let main = write(
dir.path(),
"sshd_config",
"Include /etc/ssh/foo.d/*.conf\nMatch User root\n\tPasswordAuthentication yes\n",
);
let m = load_merged(&main, Path::new("")).unwrap();
assert!(!m.effective.contains_key("include"));
assert!(!m.effective.contains_key("match"));
}
}


@ -0,0 +1,128 @@
//! Hardened SSH baseline — rule matrix. See block
//! `_blocks/security-ssh-hardening.md` for rationale per directive.
#[derive(Debug, Clone)]
pub enum Expect {
/// Value must equal (case-insensitive) one of the given strings.
OneOf(Vec<&'static str>),
/// Value must equal the given string (case-insensitive).
Equals(&'static str),
/// Value must be a numeric literal ≤ given bound.
MaxInt(u32),
/// Value must contain ALL of the given tokens (comma-split, case-insensitive).
ContainsAll(Vec<&'static str>),
/// Value must NOT contain ANY of the given tokens.
DeniesAny(Vec<&'static str>),
/// Value must be present and non-empty; dynamic equality deferred to check.rs.
AllowedUsersSubset(Vec<String>),
}
#[derive(Debug, Clone)]
pub struct Rule {
pub directive: &'static str,
pub required: bool,
pub expect: Expect,
pub rationale: &'static str,
}
pub fn hardened_matrix(allow_users: &[String]) -> Vec<Rule> {
vec![
Rule {
directive: "passwordauthentication",
required: true,
expect: Expect::Equals("no"),
rationale: "Passwords are the #1 brute-force vector; keys only.",
},
Rule {
directive: "permitrootlogin",
required: true,
expect: Expect::OneOf(vec!["no", "prohibit-password"]),
rationale: "Root via key only (or not at all).",
},
Rule {
directive: "permitemptypasswords",
required: false,
expect: Expect::Equals("no"),
rationale: "Empty passwords never.",
},
Rule {
directive: "challengeresponseauthentication",
required: false,
expect: Expect::Equals("no"),
rationale: "Disables keyboard-interactive fallback.",
},
Rule {
directive: "kbdinteractiveauthentication",
required: false,
expect: Expect::Equals("no"),
rationale: "OpenSSH 8.7+ directive; supersedes ChallengeResponseAuthentication.",
},
Rule {
directive: "maxauthtries",
required: true,
expect: Expect::MaxInt(3),
rationale: "Limits per-connection key attempts; combine with fail2ban.",
},
Rule {
directive: "x11forwarding",
required: false,
expect: Expect::Equals("no"),
rationale: "Not needed on servers; attack surface.",
},
Rule {
directive: "allowtcpforwarding",
required: false,
expect: Expect::OneOf(vec!["no", "local"]),
rationale: "Blocks SSH-as-VPN; enable per Match block if needed.",
},
Rule {
directive: "permittunnel",
required: false,
expect: Expect::Equals("no"),
rationale: "Blocks tun(4) tunnel device.",
},
Rule {
directive: "clientaliveinterval",
required: false,
expect: Expect::MaxInt(300),
rationale: "Idle sessions terminated after a few minutes.",
},
Rule {
directive: "loglevel",
required: false,
expect: Expect::OneOf(vec!["verbose", "debug1", "debug2", "debug3"]),
rationale: "VERBOSE logs key fingerprints for audit.",
},
Rule {
directive: "allowusers",
required: true,
expect: Expect::AllowedUsersSubset(allow_users.to_vec()),
rationale: "Explicit admin whitelist.",
},
Rule {
directive: "ciphers",
required: false,
expect: Expect::DeniesAny(vec![
"aes128-cbc",
"aes192-cbc",
"aes256-cbc",
"3des-cbc",
"blowfish-cbc",
"rijndael-cbc@lysator.liu.se",
]),
rationale: "CBC ciphers vulnerable to Terrapin / padding oracles.",
},
Rule {
directive: "macs",
required: false,
expect: Expect::ContainsAll(vec!["etm"]),
rationale: "ETM (Encrypt-Then-MAC) only; legacy MAC is broken.",
},
Rule {
directive: "hostkeyalgorithms",
required: false,
expect: Expect::DeniesAny(vec!["ssh-rsa", "ssh-dss"]),
rationale: "ssh-rsa = SHA-1 signature, deprecated. Use rsa-sha2-*.",
},
]
}

_primitives/harden-base.sh Executable file

@ -0,0 +1,240 @@
#!/usr/bin/env bash
# harden-base.sh — idempotent Debian/Ubuntu baseline hardening.
# Runs ON THE TARGET VPS (not the local workstation). Ports generic
# patterns from ~/Projects/vortex/control/setup/setup.sh:13-53 — strips
# Vortex-specific Xray/sing-box/Wireguard steps.
#
# USAGE
# curl -fsSL <your-raw-url>/harden-base.sh | sudo bash -s -- [options]
# OR
# scp harden-base.sh keiadmin@host:/tmp/
# ssh keiadmin@host "sudo bash /tmp/harden-base.sh"
#
# OPTIONS
# --admin-user <name> default: keiadmin
# --ssh-port <n> default: 22 (opens in ufw + enforces in sshd drop-in)
# --allow-port <n/proto> repeatable. e.g. --allow-port 443/tcp --allow-port 80/tcp
# --no-caddy skip Caddy install (default: skip — install via its own block)
# --no-reboot default (never reboots; surfaces needrestart hints only)
# --skip <step> repeatable; known steps: apt, ssh, ufw, fail2ban, auditd, unattended
#
# IDEMPOTENCY
# Every step is `test → configure → reload`. Re-run = diff-and-apply.
# Detects the sshd_config.d/99-kei.conf + audit rules files; overwrites with
# known-good content; never destroys /etc/ssh/sshd_config itself.
#
# ENV
# No secrets read. SECRETS SINGLE SOURCE (RULE 0.8) → harden-base does not
# touch tokens/keys.
#
# EXIT
# 0 ok
# 1 usage / platform not supported
# 2 hardening step failed (stderr)
set -euo pipefail
log() { printf '[%s] [harden-base] %s\n' "$(date '+%H:%M:%S')" "$*" >&2; }
# die <message> [exit-code]: log only the message, not the exit code.
die() { log "ERROR: $1"; exit "${2:-2}"; }
# ------------------------------------------------------------------ args
ADMIN_USER="keiadmin"
SSH_PORT="22"
ALLOW_PORTS=()
SKIPS=()
NO_CADDY=1 # default on; Caddy install is its own block
while [ $# -gt 0 ]; do
case "$1" in
--admin-user) ADMIN_USER="$2"; shift 2 ;;
--ssh-port) SSH_PORT="$2"; shift 2 ;;
--allow-port) ALLOW_PORTS+=("$2"); shift 2 ;;
--no-caddy) NO_CADDY=1; shift ;;
--no-reboot) shift ;; # accepted for clarity; we never auto-reboot
--skip) SKIPS+=("$2"); shift 2 ;;
-h|--help) cat <<EOF >&2
harden-base.sh — Debian/Ubuntu baseline hardening.
OPTIONS
--admin-user <name> default: keiadmin
--ssh-port <n> default: 22
--allow-port <n/proto> repeatable (e.g. 443/tcp, 80/tcp)
--no-caddy (default) skip Caddy install
--skip <step> apt|ssh|ufw|fail2ban|auditd|unattended
EOF
exit 0 ;;
*) die "unknown flag '$1'" 1 ;;
esac
done
# ------------------------------------------------------------------ guards
[ "$(id -u)" -eq 0 ] || die "must run as root (sudo)." 1
. /etc/os-release 2>/dev/null || die "cannot read /etc/os-release" 1
case "${ID:-}" in
debian|ubuntu) : ;;
*) die "only Debian/Ubuntu supported (got ID=${ID:-unknown})" 1 ;;
esac
skipped() {
local step="$1"
for s in "${SKIPS[@]}"; do [ "$s" = "$step" ] && return 0; done
return 1
}
# ------------------------------------------------------------------ step: apt
step_apt() {
skipped apt && { log "skip apt"; return; }
log "apt update + base packages…"
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y -qq \
ufw fail2ban unattended-upgrades needrestart auditd audispd-plugins \
curl wget jq ca-certificates openssh-server
}
# ------------------------------------------------------------------ step: admin user
step_admin_user() {
if id "$ADMIN_USER" >/dev/null 2>&1; then
log "user '$ADMIN_USER' exists"
else
log "creating '$ADMIN_USER' (sudo, bash, NOPASSWD)"
useradd -m -s /bin/bash -G sudo "$ADMIN_USER"
install -d -m 0700 -o "$ADMIN_USER" -g "$ADMIN_USER" "/home/$ADMIN_USER/.ssh"
fi
install -d -m 0755 /etc/sudoers.d
cat >/etc/sudoers.d/90-keiadmin <<EOF
$ADMIN_USER ALL=(ALL) NOPASSWD:ALL
EOF
chmod 0440 /etc/sudoers.d/90-keiadmin
visudo -cf /etc/sudoers.d/90-keiadmin >/dev/null
}
# ------------------------------------------------------------------ step: ssh
step_ssh() {
skipped ssh && { log "skip ssh"; return; }
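# AllowUsers below admits only $ADMIN_USER; refuse to continue if that
# account has no authorized key yet, otherwise new logins are locked out.
local home; home=$(getent passwd "$ADMIN_USER" | cut -d: -f6)
[ -s "$home/.ssh/authorized_keys" ] || die "ssh: no authorized_keys for '$ADMIN_USER'; add a key first or pass --skip ssh"
log "ssh: writing /etc/ssh/sshd_config.d/99-kei.conf (port=$SSH_PORT, user=$ADMIN_USER)…"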
install -d -m 0755 /etc/ssh/sshd_config.d
cat >/etc/ssh/sshd_config.d/99-kei.conf <<EOF
# GENERATED by harden-base.sh — idempotent. Edit intent, re-run the script.
Port $SSH_PORT
Protocol 2
PasswordAuthentication no
ChallengeResponseAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin prohibit-password
PermitEmptyPasswords no
UsePAM yes
MaxAuthTries 3
MaxSessions 4
LoginGraceTime 20
AllowUsers $ADMIN_USER
AllowTcpForwarding no
X11Forwarding no
PermitTunnel no
ClientAliveInterval 120
ClientAliveCountMax 2
LogLevel VERBOSE
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,sntrup761x25519-sha512@openssh.com
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512,rsa-sha2-256
EOF
sshd -t
systemctl reload ssh 2>/dev/null || systemctl reload sshd
}
# ------------------------------------------------------------------ step: ufw
step_ufw() {
skipped ufw && { log "skip ufw"; return; }
log "ufw: default-deny-in + allow-out + ssh rate-limit…"
ufw --force reset >/dev/null
ufw default deny incoming
ufw default allow outgoing
ufw default deny routed
ufw limit "$SSH_PORT/tcp" comment "ssh (rate-limited)"
for p in "${ALLOW_PORTS[@]}"; do
log " allow $p"
ufw allow "$p"
done
ufw logging medium
ufw --force enable
}
# ------------------------------------------------------------------ step: fail2ban
step_fail2ban() {
skipped fail2ban && { log "skip fail2ban"; return; }
log "fail2ban: writing /etc/fail2ban/jail.local (sshd jail)…"
cat >/etc/fail2ban/jail.local <<EOF
[DEFAULT]
bantime = 3600
findtime = 600
maxretry = 5
backend = systemd
[sshd]
enabled = true
port = $SSH_PORT
EOF
systemctl enable --now fail2ban
systemctl restart fail2ban
}
# ------------------------------------------------------------------ step: auditd
step_auditd() {
skipped auditd && { log "skip auditd"; return; }
log "auditd: writing /etc/audit/rules.d/99-kei.rules…"
install -d -m 0750 /etc/audit/rules.d
cat >/etc/audit/rules.d/99-kei.rules <<'EOF'
# GENERATED by harden-base.sh — idempotent baseline.
-w /etc/ssh/sshd_config -p wa -k sshd_config
-w /etc/ssh/sshd_config.d/ -p wa -k sshd_config
-w /root/.ssh/ -p wa -k ssh_keys_root
-w /etc/sudoers -p wa -k sudoers
-w /etc/sudoers.d/ -p wa -k sudoers
-a always,exit -F arch=b64 -S execve -F euid=0 -F auid>=1000 -F auid!=unset -k sudo_root
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/gshadow -p wa -k identity
-a always,exit -F arch=b64 -S init_module -S finit_module -S delete_module -k module
-a always,exit -F arch=b64 -S adjtimex -S settimeofday -S clock_settime -k time
-w /etc/localtime -p wa -k time
-e 2
EOF
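# `-e 2` above makes the rules immutable until reboot, so a re-run cannot
# reload them; tolerate that instead of aborting under `set -e`.
augenrules --load >/dev/null 2>&1 || log "auditd: rule reload failed (expected once -e 2 has locked the rules; reboot to apply changes)"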
systemctl enable --now auditd
}
# ------------------------------------------------------------------ step: unattended-upgrades + needrestart
step_unattended() {
skipped unattended && { log "skip unattended"; return; }
log "unattended-upgrades + needrestart (list-only)…"
cat >/etc/apt/apt.conf.d/20auto-upgrades <<'EOF'
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Periodic::AutocleanInterval "7";
EOF
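# apt only parses apt.conf.d files with no extension or a ".conf" extension,
# so the override uses an extension-less name that sorts after the stock
# 50unattended-upgrades.
cat >/etc/apt/apt.conf.d/51unattended-upgrades-kei <<'EOF'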
Unattended-Upgrade::Origins-Pattern {
"origin=Debian,codename=${distro_codename}-security";
"origin=Debian,codename=${distro_codename}-updates";
"origin=Ubuntu,archive=${distro_codename}-security";
};
Unattended-Upgrade::Automatic-Reboot "false";
Unattended-Upgrade::MailReport "on-change";
EOF
# needrestart: list services, suppress TTY prompts (non-TTY cron safe).
if [ -f /etc/needrestart/needrestart.conf ]; then
sed -i "s/^#\?\$nrconf{restart}.*/\$nrconf{restart} = 'l';/" /etc/needrestart/needrestart.conf
sed -i "s/^#\?\$nrconf{kernelhints}.*/\$nrconf{kernelhints} = -1;/" /etc/needrestart/needrestart.conf
fi
}
# ------------------------------------------------------------------ main
log "start: admin=$ADMIN_USER ssh=$SSH_PORT extra-ports=${ALLOW_PORTS[*]:-none}"
step_apt
step_admin_user
step_ssh
step_ufw
step_fail2ban
step_auditd
step_unattended
log "done. Next: verify via _primitives/_rust/ssh-check + firewall-diff."

_primitives/provision-hetzner.sh Executable file

@ -0,0 +1,164 @@
#!/usr/bin/env bash
# provision-hetzner.sh — idempotent Hetzner Cloud server provisioning.
# Wraps the `hcloud` CLI. Install path:
# $HOME/.claude/agents/_primitives/provision-hetzner.sh
#
# USAGE
# provision-hetzner.sh create <name> [--type cx22|cax11] [--location fsn1] \
# [--image debian-12] [--ssh-key <id>] \
# [--firewall <name>] [--user-data <file>]
# provision-hetzner.sh status <name>
# provision-hetzner.sh destroy <name> [--force]
# provision-hetzner.sh list
#
# ENV (RULE 0.8 — secrets single source)
# HCLOUD_TOKEN — Hetzner API token (REQUIRED). Source:
# $(grep ^HCLOUD_TOKEN ~/.claude/secrets/.env | cut -d= -f2)
#
# EXIT
# 0 ok
# 1 usage / missing args / missing deps / unknown command
# 2 hcloud API error (non-idempotent path — inspect stderr)
#
# IDEMPOTENCY
# `create <name>` on an existing server prints its IP + exits 0.
# `destroy <name>` on a missing server exits 0 (nothing to do).
set -euo pipefail
log() { printf '[%s] [provision-hetzner] %s\n' "$(date '+%H:%M:%S')" "$*" >&2; }
# die <message> [exit-code]: log only the message, not the exit code.
die() { log "ERROR: $1"; exit "${2:-2}"; }
check_deps() {
command -v hcloud >/dev/null 2>&1 || \
die "hcloud CLI missing. Install: brew install hcloud (macOS) | https://github.com/hetznercloud/cli/releases" 1
command -v jq >/dev/null 2>&1 || die "jq missing. brew install jq" 1
[ -n "${HCLOUD_TOKEN:-}" ] || die "HCLOUD_TOKEN not set. Source ~/.claude/secrets/.env first." 1
}
# Print server JSON if it exists, empty string otherwise. Never fails.
server_json() {
local name="$1"
hcloud server describe "$name" -o json 2>/dev/null || true
}
cmd_list() {
check_deps
hcloud server list -o 'columns=id,name,status,ipv4,location,server_type,created'
}
cmd_status() {
check_deps
local name="${1:-}"; [ -n "$name" ] || die "status: <name> required" 1
local json; json=$(server_json "$name")
if [ -z "$json" ]; then
echo "absent"
return 0
fi
printf 'name=%s\nstatus=%s\nipv4=%s\nlocation=%s\ntype=%s\n' \
"$(jq -r .name <<<"$json")" \
"$(jq -r .status <<<"$json")" \
"$(jq -r '.public_net.ipv4.ip // "-"' <<<"$json")" \
"$(jq -r .datacenter.location.name <<<"$json")" \
"$(jq -r .server_type.name <<<"$json")"
}
cmd_create() {
check_deps
local name="${1:-}"; shift || true
[ -n "$name" ] || die "create: <name> required" 1
local type="cx22" location="fsn1" image="debian-12"
local ssh_key="" firewall="" user_data=""
while [ $# -gt 0 ]; do
case "$1" in
--type) type="$2"; shift 2 ;;
--location) location="$2"; shift 2 ;;
--image) image="$2"; shift 2 ;;
--ssh-key) ssh_key="$2"; shift 2 ;;
--firewall) firewall="$2"; shift 2 ;;
--user-data) user_data="$2"; shift 2 ;;
*) die "create: unknown flag '$1'" 1 ;;
esac
done
# Idempotent fast-path: if the server already exists, just print its IP.
local existing; existing=$(server_json "$name")
if [ -n "$existing" ]; then
local ip; ip=$(jq -r '.public_net.ipv4.ip // "-"' <<<"$existing")
log "server '$name' already exists → $ip (no-op)"
echo "$ip"
return 0
fi
local args=(server create
--name "$name"
--type "$type"
--image "$image"
--location "$location"
--label "project=kei"
)
[ -n "$ssh_key" ] && args+=(--ssh-key "$ssh_key")
[ -n "$firewall" ] && args+=(--firewall "$firewall")
[ -n "$user_data" ] && { [ -r "$user_data" ] || die "user-data not readable: $user_data" 1; args+=(--user-data-from-file "$user_data"); }
log "creating '$name' ($type @ $location, image=$image)…"
# Capture the JSON directly: no predictable /tmp file, and an API failure
# exits 2 as documented instead of tripping `set -e` with hcloud's own code.
local out; out=$(hcloud "${args[@]}" -o json) || die "hcloud server create failed (see stderr)"
local ip; ip=$(jq -r '.server.public_net.ipv4.ip // ""' <<<"$out")
[ -n "$ip" ] || die "create returned no IPv4 (check stderr)"
log "created '$name' → $ip"
echo "$ip"
}
cmd_destroy() {
check_deps
local name="${1:-}"; shift || true
[ -n "$name" ] || die "destroy: <name> required" 1
local force=""
[ "${1:-}" = "--force" ] && force=1
local existing; existing=$(server_json "$name")
if [ -z "$existing" ]; then
log "server '$name' absent (no-op)"
return 0
fi
if [ -z "$force" ]; then
printf 'Destroy server "%s"? [y/N] ' "$name" >&2
read -r ans
[ "$ans" = "y" ] || [ "$ans" = "Y" ] || { log "aborted"; return 1; }
fi
log "deleting '$name'…"
hcloud server delete "$name" >&2
log "deleted '$name'"
}
main() {
local cmd="${1:-}"; shift || true
case "$cmd" in
create) cmd_create "$@" ;;
destroy) cmd_destroy "$@" ;;
status) cmd_status "$@" ;;
list) cmd_list "$@" ;;
-h|--help|"") cat <<EOF >&2
provision-hetzner.sh — idempotent Hetzner Cloud server provisioning.
USAGE
provision-hetzner.sh create <name> [--type cx22|cax11] [--location fsn1] \\
[--image debian-12] [--ssh-key <id>] \\
[--firewall <name>] [--user-data <file>]
provision-hetzner.sh status <name>
provision-hetzner.sh destroy <name> [--force]
provision-hetzner.sh list
ENV
HCLOUD_TOKEN (required). Load and export it via: set -a; . ~/.claude/secrets/.env; set +a
EOF
[ "$cmd" = "-h" ] || [ "$cmd" = "--help" ] && exit 0 || exit 1
;;
*) die "unknown command '$cmd'. Run --help." 1 ;;
esac
}
main "$@"

_primitives/provision-vultr.sh Executable file

@ -0,0 +1,196 @@
#!/usr/bin/env bash
# provision-vultr.sh — idempotent Vultr VPS provisioning.
# Wraps the `vultr-cli` v3. Install path:
# $HOME/.claude/agents/_primitives/provision-vultr.sh
#
# USAGE
# provision-vultr.sh create <label> [--plan vc2-1c-1gb] [--region ams] \
# [--os-id 2136] [--ssh-key <id>] \
# [--firewall <group-id>] [--user-data <file>]
# provision-vultr.sh status <label>
# provision-vultr.sh destroy <label> [--force]
# provision-vultr.sh list
#
# ENV (RULE 0.8 — secrets single source)
# VULTR_API_KEY — Vultr API key (REQUIRED). Source:
# $(grep ^VULTR_API_KEY ~/.claude/secrets/.env | cut -d= -f2)
#
# NOTES
# * vultr-cli v3: `vultr-cli instance create …` (not `server`).
# * --os-id: 2136 = Debian 12 x86_64 at the time of writing, but Vultr's
# catalog is subject to change. When --os-id is omitted, the script resolves
# the ID at run time from `vultr-cli os list`, so nothing is hard-coded.
# * Vultr identifies instances by UUID; we use the human-friendly `label`
# field for idempotency. Labels must be unique within the account.
#
# EXIT
# 0 ok
# 1 usage / missing args / missing deps / unknown command
# 2 vultr API error
set -euo pipefail
log() { printf '[%s] [provision-vultr] %s\n' "$(date '+%H:%M:%S')" "$*" >&2; }
# die <message> [exit-code]: log only the message, not the exit code.
die() { log "ERROR: $1"; exit "${2:-2}"; }
check_deps() {
command -v vultr-cli >/dev/null 2>&1 || \
die "vultr-cli missing. Install: brew install vultr/vultr-cli/vultr-cli | https://github.com/vultr/vultr-cli" 1
command -v jq >/dev/null 2>&1 || die "jq missing. brew install jq" 1
[ -n "${VULTR_API_KEY:-}" ] || die "VULTR_API_KEY not set. Source ~/.claude/secrets/.env first." 1
}
# Return JSON of instance with matching label, or empty string.
instance_json_by_label() {
local label="$1"
vultr-cli instance list -o json 2>/dev/null \
| jq -c --arg l "$label" '.instances[] | select(.label == $l)' \
| head -n1
}
cmd_list() {
check_deps
vultr-cli instance list -o json \
| jq -r '.instances[] | [.label, .region, .plan, .status, .main_ip] | @tsv'
}
cmd_status() {
check_deps
local label="${1:-}"; [ -n "$label" ] || die "status: <label> required" 1
local json; json=$(instance_json_by_label "$label")
if [ -z "$json" ]; then
echo "absent"
return 0
fi
printf 'label=%s\nid=%s\nstatus=%s\npower=%s\nip=%s\nregion=%s\nplan=%s\n' \
"$(jq -r .label <<<"$json")" \
"$(jq -r .id <<<"$json")" \
"$(jq -r .status <<<"$json")" \
"$(jq -r .power_status <<<"$json")" \
"$(jq -r .main_ip <<<"$json")" \
"$(jq -r .region <<<"$json")" \
"$(jq -r .plan <<<"$json")"
}
resolve_debian_12_os() {
# Return the OS id for "Debian 12 x64" (subject to Vultr catalog updates).
vultr-cli os list -o json \
| jq -r '.os[] | select(.name | test("Debian 12.*x64"; "i")) | .id' \
| head -n1
}
cmd_create() {
check_deps
local label="${1:-}"; shift || true
[ -n "$label" ] || die "create: <label> required" 1
local plan="vc2-1c-1gb" region="ams" os_id=""
local ssh_key="" firewall="" user_data=""
while [ $# -gt 0 ]; do
case "$1" in
--plan) plan="$2"; shift 2 ;;
--region) region="$2"; shift 2 ;;
--os-id) os_id="$2"; shift 2 ;;
--ssh-key) ssh_key="$2"; shift 2 ;;
--firewall) firewall="$2"; shift 2 ;;
--user-data) user_data="$2"; shift 2 ;;
*) die "create: unknown flag '$1'" 1 ;;
esac
done
# Idempotency.
local existing; existing=$(instance_json_by_label "$label")
if [ -n "$existing" ]; then
local ip; ip=$(jq -r '.main_ip // "-"' <<<"$existing")
log "instance '$label' already exists → $ip (no-op)"
echo "$ip"
return 0
fi
[ -z "$os_id" ] && { os_id=$(resolve_debian_12_os) || true; }
[ -n "$os_id" ] || die "cannot resolve Debian 12 OS id. Pass --os-id explicitly." 1
local args=(instance create
--region "$region"
--plan "$plan"
--os "$os_id"
--label "$label"
--tags "project=kei"
)
[ -n "$ssh_key" ] && args+=(--ssh-keys "$ssh_key")
[ -n "$firewall" ] && args+=(--firewall-group-id "$firewall")
if [ -n "$user_data" ]; then
[ -r "$user_data" ] || die "user-data not readable: $user_data" 1
# vultr-cli expects base64 for userdata.
args+=(--userdata "$(base64 < "$user_data" | tr -d '\n')")
fi
log "creating '$label' ($plan @ $region, os=$os_id)…"
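# Capture the JSON directly: no predictable /tmp file, and an API failure
# exits 2 as documented instead of tripping `set -e`.
local out; out=$(vultr-cli "${args[@]}" -o json) || die "vultr-cli instance create failed (see stderr)"
local ip; ip=$(jq -r '.instance.main_ip // ""' <<<"$out")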
# Vultr assigns IP asynchronously — re-poll if empty.
if [ "$ip" = "" ] || [ "$ip" = "null" ] || [ "$ip" = "0.0.0.0" ]; then
log "IP pending — polling instance status up to 60s…"
for _ in $(seq 1 30); do
sleep 2
ip=$(instance_json_by_label "$label" | jq -r '.main_ip // ""')
[ -n "$ip" ] && [ "$ip" != "0.0.0.0" ] && break
done
fi
[ -n "$ip" ] && [ "$ip" != "0.0.0.0" ] || die "create: no IPv4 after 60s poll"
log "created '$label' → $ip"
echo "$ip"
}
cmd_destroy() {
check_deps
local label="${1:-}"; shift || true
[ -n "$label" ] || die "destroy: <label> required" 1
local force=""
[ "${1:-}" = "--force" ] && force=1
local existing; existing=$(instance_json_by_label "$label")
if [ -z "$existing" ]; then
log "instance '$label' absent (no-op)"
return 0
fi
local id; id=$(jq -r .id <<<"$existing")
if [ -z "$force" ]; then
printf 'Destroy instance "%s" (%s)? [y/N] ' "$label" "$id" >&2
read -r ans
[ "$ans" = "y" ] || [ "$ans" = "Y" ] || { log "aborted"; return 1; }
fi
log "deleting '$label' ($id)…"
vultr-cli instance delete "$id" >&2
log "deleted '$label'"
}
main() {
local cmd="${1:-}"; shift || true
case "$cmd" in
create) cmd_create "$@" ;;
destroy) cmd_destroy "$@" ;;
status) cmd_status "$@" ;;
list) cmd_list "$@" ;;
-h|--help|"") cat <<EOF >&2
provision-vultr.sh — idempotent Vultr VPS provisioning.
USAGE
provision-vultr.sh create <label> [--plan vc2-1c-1gb] [--region ams] \\
[--os-id <id>] [--ssh-key <id>] \\
[--firewall <group-id>] [--user-data <file>]
provision-vultr.sh status <label>
provision-vultr.sh destroy <label> [--force]
provision-vultr.sh list
ENV
VULTR_API_KEY (required). Load and export it via: set -a; . ~/.claude/secrets/.env; set +a
EOF
[ "$cmd" = "-h" ] || [ "$cmd" = "--help" ] && exit 0 || exit 1
;;
*) die "unknown command '$cmd'. Run --help." 1 ;;
esac
}
main "$@"


@ -0,0 +1,130 @@
---
name: vm-provision
description: End-to-end VPS provisioning — select provider → plan → provision → harden → verify (ssh-check + firewall-diff hard-gate) → handoff. 6 phases, ≥6 AskUserQuestion calls, defensive-only. Stops if either verification primitive fails.
argument-hint: <optional one-line intent, e.g. "staging api hetzner eu">
---
# /vm-provision — 6-Phase VPS Pipeline (index)
You turn a short intent ("staging API in EU") into a **hardened, verified
VPS** ready to host an app. Six phases. Every provider choice, plan detail,
and fix is surfaced as an `AskUserQuestion` click — no silent defaults.
This `SKILL.md` is the INDEX. Each phase lives in its own file, executed in
order. Never skip a phase. Never re-order phases.
---
## Pipeline overview
| Phase | File | Purpose | AskUserQuestion |
|---|---|---|---|
| 1 | [phase-1-select-provider.md](phase-1-select-provider.md) | Provider + region + plan + ARM/x86 | 2× |
| 2 | [phase-2-plan.md](phase-2-plan.md) | Plan Mode doc: ports, TLS, admin user | 1× |
| 3 | [phase-3-provision.md](phase-3-provision.md) | Provision + SSH first contact | 1× |
| 4 | [phase-4-harden.md](phase-4-harden.md) | Run `harden-base.sh` over SSH | 1× |
| 5 | [phase-5-verify.md](phase-5-verify.md) | `ssh-check` + `firewall-diff` **HARD GATE** | 1× |
| 6 | [phase-6-handoff.md](phase-6-handoff.md) | Artifact list + optional `/web-deploy` | — (final report) |
**Minimum AskUserQuestion count across a complete pipeline: 6+** (the
pure-click contract). Only the intent argument and per-port customisations
are typed.
---
## Hard-Gate Invariant (LOAD-BEARING)
> **No application is deployed onto a VM that has not passed BOTH
> `ssh-check` (exit 0) and `firewall-diff` (exit 0) in Phase 5.**
Enforced by Phase 5:
- `ssh-check --config /etc/ssh/sshd_config --drop-in /etc/ssh/sshd_config.d` → exit 0.
- `ufw status numbered | firewall-diff --intent firewall-intent.yaml --stdin` → exit 0.
- Any non-zero exit → STOP the pipeline; loop back to Phase 4 after the user
approves a remediation path.
The verify step is DEFENSIVE ONLY (read + parse). It never scans the host
for open CVEs or probes third-party endpoints.
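
The gate logic above can be sketched as a small shell helper. This is only an illustration: `gate_verdict` is a hypothetical name, not something the skill ships, and it assumes the two exit codes have already been collected from the commands listed in the bullets.

```shell
# Hypothetical helper (not part of the skill): combine the exit codes of
# ssh-check and firewall-diff into one verdict. Returns 0 only if both are 0.
gate_verdict() {
  local ssh_rc="$1" fw_rc="$2"
  if [ "$ssh_rc" -eq 0 ] && [ "$fw_rc" -eq 0 ]; then
    echo "PASS"
    return 0
  fi
  echo "FAIL ssh-check=$ssh_rc firewall-diff=$fw_rc"
  return 1
}

# Possible wiring on the VM, using the commands from the bullets above:
#   ssh_rc=0; ssh-check --config /etc/ssh/sshd_config --drop-in /etc/ssh/sshd_config.d || ssh_rc=$?
#   fw_rc=0;  ufw status numbered | firewall-diff --intent firewall-intent.yaml --stdin || fw_rc=$?
#   gate_verdict "$ssh_rc" "$fw_rc" || exit 1   # STOP; loop back to Phase 4
```

Collecting both exit codes before deciding means the user sees every failing check in one pass, rather than fixing one and then discovering the other.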
---
## Variables the pipeline produces
| Name | Set in | Meaning |
|---|---|---|
| `INTENT` | arg | 1-line user description of the target VM |
| `PROVIDER` | Phase 1 | hetzner / vultr / digitalocean / upcloud / linode |
| `REGION` | Phase 1 | provider-specific region code |
| `PLAN` | Phase 1 | cx22 / cax11 / vc2-1c-1gb / … |
| `ARCH` | Phase 1 | x86_64 / arm64 |
| `ADMIN_USER` | Phase 2 | default `keiadmin` |
| `SSH_PORT` | Phase 2 | default 22; custom permitted |
| `APP_PORTS` | Phase 2 | e.g. `[443/tcp, 80/tcp]` |
| `TLS_HOST` | Phase 2 | optional FQDN for Caddy |
| `VM_IP` | Phase 3 | IPv4 of the created VM |
| `VM_NAME` | Phase 3 | provider resource label |
| `HARDENED` | Phase 4 | true when harden-base.sh exited 0 |
| `SSH_CHECK_OK` | Phase 5 | exit 0 of `ssh-check` |
| `FW_DIFF_OK` | Phase 5 | exit 0 of `firewall-diff` |
| `HANDOFF_TO` | Phase 6 | next skill (e.g. `/web-deploy`) or `none` |
---
## Final report (emit after Phase 6)
```
=== /VM-PROVISION REPORT ===
Intent: <first 80 chars of INTENT>
Provider: <PROVIDER> / region=<REGION> / plan=<PLAN> / arch=<ARCH>
VM: <VM_NAME> @ <VM_IP>
Admin: <ADMIN_USER> (ssh port <SSH_PORT>)
Ports: <APP_PORTS>
TLS: <TLS_HOST or "none">
Hardened: <HARDENED>
Verification: ssh-check=<PASS/FAIL> firewall-diff=<PASS/FAIL>
Handoff: <HANDOFF_TO>
Artifacts: <terraform state path | cloud-init.yaml path>
```
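
As a rough sketch, the report could be rendered from the pipeline variables with plain `printf`. The function names and the `ARTIFACTS` variable are hypothetical, and the sketch assumes `SSH_CHECK_OK` / `FW_DIFF_OK` hold the string `true` on success; the skill itself does not prescribe either.

```shell
# Hypothetical: map a "true"/anything-else flag to PASS/FAIL.
pass_fail() { if [ "${1:-}" = "true" ]; then echo PASS; else echo FAIL; fi; }

# Hypothetical renderer: reads the pipeline variables from the environment.
# Optional values fall back to "none"; INTENT is cut to 80 characters.
render_report() {
  printf '=== /VM-PROVISION REPORT ===\n'
  printf 'Intent: %.80s\n' "${INTENT:-}"
  printf 'Provider: %s / region=%s / plan=%s / arch=%s\n' \
    "${PROVIDER:-}" "${REGION:-}" "${PLAN:-}" "${ARCH:-}"
  printf 'VM: %s @ %s\n' "${VM_NAME:-}" "${VM_IP:-}"
  printf 'Admin: %s (ssh port %s)\n' "${ADMIN_USER:-keiadmin}" "${SSH_PORT:-22}"
  printf 'Ports: %s\n' "${APP_PORTS:-}"
  printf 'TLS: %s\n' "${TLS_HOST:-none}"
  printf 'Hardened: %s\n' "${HARDENED:-false}"
  printf 'Verification: ssh-check=%s firewall-diff=%s\n' \
    "$(pass_fail "${SSH_CHECK_OK:-}")" "$(pass_fail "${FW_DIFF_OK:-}")"
  printf 'Handoff: %s\n' "${HANDOFF_TO:-none}"
  printf 'Artifacts: %s\n' "${ARTIFACTS:-}"
}
```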
---
## Rules (enforced at every phase)
- **Pure-click contract.** Only `INTENT` (argument) and custom port values
(Phase 2.c) are typed. Every other decision is an `AskUserQuestion`.
- **Hard gate (Phase 5).** `ssh-check` AND `firewall-diff` must exit 0
before Phase 6. Neither can be skipped.
- **RULE -1 NO DOWNGRADE.** Any phase that fails returns 2-3 constructive
paths, never "can't be done".
- **RULE 0.8 Secrets Single Source.** All provider tokens come from
`~/.claude/secrets/.env` (or per-project `secrets/*.env`). NEVER read
a token from the conversation, NEVER write one to a file.
- **RULE 0.4 NO HALLUCINATION.** Provider specifics (prices, region codes,
plan IDs) must be fetched at time of use, not recalled. Cite source.
- **RULE 0.5 Plan Mode First.** Phase 2 writes the plan; no provisioning
happens before the user clicks "approve".
- **Defensive-only.** No scanning tools, no CVE probes, no third-party
attack surface analysis. Pure config linting.
- **Surgical changes.** Harden only the VM being provisioned. Never touch
the caller's workstation config.
- **Constructor Pattern (RULE ZERO).** Each phase file ≤ 200 LOC;
generated cloud-init / Caddyfile artefacts never exceed 200 LOC — split
into role-specific files if they would.
---
## References
- [phase-1-select-provider.md](phase-1-select-provider.md) · [phase-2-plan.md](phase-2-plan.md) · [phase-3-provision.md](phase-3-provision.md) · [phase-4-harden.md](phase-4-harden.md) · [phase-5-verify.md](phase-5-verify.md) · [phase-6-handoff.md](phase-6-handoff.md)
- `_blocks/deploy-hetzner-cloud.md` — Hetzner Cloud specifics (Phase 1)
- `_blocks/deploy-vps-generic.md` — provider-agnostic cloud-init + TF skeleton (Phase 1/3)
- `_blocks/security-ssh-hardening.md` — sshd drop-in baseline (Phase 4/5)
- `_blocks/security-firewall-ufw.md` — ufw intent schema (Phase 2/5)
- `_blocks/security-tls-caddy.md` — TLS (Phase 6 handoff)
- `_blocks/security-audit-logging.md` — auditd baseline (Phase 4)
- `_blocks/security-patching.md` — unattended-upgrades (Phase 4)
- `_primitives/provision-hetzner.sh` · `_primitives/provision-vultr.sh` — provisioners (Phase 3)
- `_primitives/harden-base.sh` — hardening script (Phase 4)
- `_primitives/_rust/ssh-check/` · `_primitives/_rust/firewall-diff/` — verify gate (Phase 5)
- `skills/web-deploy/SKILL.md` — optional Phase 6 handoff


@ -0,0 +1,103 @@
# Phase 1 — Select Provider + Region + Plan
> Goal: lock `PROVIDER`, `REGION`, `PLAN`, `ARCH` via two AskUserQuestion
> calls. No provisioning yet — this is pure decision.
> **Verify criterion:** all four variables set; provider credentials (one
> env-var name) identified in `~/.claude/secrets/.env`.
---
## 1.a — First AskUserQuestion (4 options max)
**Provider?** (single-select, stored as `PROVIDER`):
- **Hetzner Cloud** — cheapest EU, CX22 x86 / CAX11 ARM64 both €3.79/mo
[VERIFIED `_blocks/deploy-hetzner-cloud.md`]. Requires `HCLOUD_TOKEN`.
- **Vultr** — broad region list, high-frequency (HF) compute, $5-10/mo tiers. Requires
`VULTR_API_KEY`.
- **DigitalOcean** — strong US presence, simple API. Requires
`DIGITALOCEAN_TOKEN`. Uses `deploy-vps-generic.md` cloud-init.
- **UpCloud** — preferred for RU-routed workloads (Finnish ASN). Requires
`UPCLOUD_USERNAME` + `UPCLOUD_PASSWORD`.
If the intent argument mentions a provider already, pre-select it.
**Credential check BEFORE acting on the click:** look up the chosen provider's
env-var name in `~/.claude/secrets/.env` (name only; never read out or echo the
value). If it is absent, surface a ONE-line remediation:
> "Provider X needs `<VAR>` in `~/.claude/secrets/.env`. Add it and
> re-invoke — I don't accept tokens pasted into chat (RULE 0.8)."
Do NOT proceed until the token is in place.
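
One way to perform that check without ever printing the secret is a quiet `grep` on the variable name. This is a minimal sketch; `env_has_var` is a hypothetical helper, and it assumes the file uses plain `NAME=value` lines (optionally prefixed with `export`).

```shell
# Hypothetical helper: succeed if FILE defines NAME with a non-empty value.
# Uses `grep -q`, so the value itself is never written to stdout.
env_has_var() {
  local name="$1" file="$2"
  [ -r "$file" ] || return 1
  grep -Eq "^(export[[:space:]]+)?${name}=." "$file"
}

# Example:
#   env_has_var HCLOUD_TOKEN "$HOME/.claude/secrets/.env" \
#     || echo "Provider Hetzner needs HCLOUD_TOKEN in ~/.claude/secrets/.env"
```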
---
## 1.b — Second AskUserQuestion (region + plan + arch, 3 Q's)
Send three questions in one `AskUserQuestion` call. Options are
provider-specific; generate them from the following matrix (do NOT
hallucinate codes — re-verify against the provider doc link on each run):
**Region** (stored as `REGION`):
- Hetzner: `fsn1` (Falkenstein DE), `nbg1` (Nürnberg DE), `hel1` (Helsinki
FI), `ash` (Ashburn US), `hil` (Hillsboro US), `sin` (Singapore)
[VERIFIED https://docs.hetzner.com/cloud/general/locations].
- Vultr: `ams` (Amsterdam), `fra` (Frankfurt), `ewr` (Newark), `lax`
(LA), `nrt` (Tokyo), `sgp` (Singapore).
- DigitalOcean: `nyc1/2/3`, `sfo3`, `ams3`, `fra1`, `lon1`, `sgp1`.
- UpCloud: `de-fra1`, `fi-hel1`, `fi-hel2`, `us-nyc1`, `sg-sin1`.
Pick the region closest to the user's stated audience. Default to an EU region
when the user doesn't specify one (keeps data in the EU, which simplifies GDPR
data-residency questions).
**Plan** (stored as `PLAN`):
- Hetzner x86: `cx22` (2 vCPU / 4 GB / 40 GB / €3.79/mo), `cx32` (4 vCPU /
8 GB / €6.79/mo).
- Hetzner ARM: `cax11` (2 vCPU / 4 GB / €3.79/mo), `cax21` (4 vCPU / 8 GB
/ €6.49/mo).
- Vultr: `vc2-1c-1gb` ($6/mo), `vc2-2c-4gb` ($24/mo), `vhp-1c-2gb-amd`
($14/mo).
- DigitalOcean: `s-1vcpu-1gb` ($6/mo), `s-2vcpu-2gb` ($18/mo).
- UpCloud: `1xCPU-1GB`, `2xCPU-2GB`.
Quote only the plans you can verify against the provider's live pricing
at call-time; do not embed stale pricing as fact.
**Arch** (stored as `ARCH`):
- `x86_64` — default; works with every Debian 12 image.
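- `arm64` — Hetzner `cax*`, AWS Graviton, Oracle Ampere. Sometimes cheaper,
but not always: in the Hetzner plans above `cax11` matches `cx22` and
`cax21` is only about 4% below `cx32`. Rust builds run natively;
Node/Python binary wheels may need extra install steps.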
---
## 1.c — Verify criterion
Before moving to Phase 2:
- [ ] `PROVIDER`, `REGION`, `PLAN`, `ARCH` all set.
- [ ] The provider credential env-var EXISTS in `~/.claude/secrets/.env`
(we only read the env-var name, never the value).
- [ ] The user clicked OK, not "back".
Emit one-liner:
`Phase 1 done: PROVIDER=<x> REGION=<y> PLAN=<z> ARCH=<a>. Credentials ref: $<VAR>.`
Proceed to Phase 2.
---
## 1.d — Constructive-fail paths
If the user says "I don't know":
- **(A)** Default to Hetzner CX22 fsn1 x86 (cheapest EU). 1-click.
- **(B)** Clone an existing project's provider (ask which project,
pattern-match from `~/.claude/projects/*/memory/*.md`).
- **(C)** Defer provisioning — emit a decision memo and exit cleanly.
Never pick silently.


@ -0,0 +1,137 @@
# Phase 2 — Plan Mode Doc
> Goal: produce a written, user-approved plan (RULE 0.5) that enumerates
> every change to the VM before any packet leaves the workstation.
> **Verify criterion:** user clicked "approve" on the plan; plan artefact
> exists at `<run-dir>/plan.md`.
---
## 2.a — Synthesise the plan
Write `<run-dir>/plan.md` (where `<run-dir>` is `./.keisei/vm-provision/<timestamp>/`) with
EXACTLY these sections — no more, no less:
```markdown
# VM-Provision Plan — <timestamp>
## Intent
<INTENT one-line>
## Target
- Provider: <PROVIDER>
- Region: <REGION>
- Plan: <PLAN> (<arch>)
- VM name: kei-<env>-<role> # derived, ASK if ambiguous
## Access
- Admin user: <ADMIN_USER> # default keiadmin
- SSH port: <SSH_PORT> # default 22
- SSH pubkey: <path> # read from ~/.ssh/id_*.pub
## Ports to allow (ufw + provider cloud firewall)
<APP_PORTS list>
## TLS
- Host: <TLS_HOST or none>
- Method: <HTTP-01 | DNS-01 | none>
## Hardening steps (harden-base.sh)
- apt update + upgrade
- install: ufw fail2ban unattended-upgrades needrestart auditd audispd-plugins
- write /etc/ssh/sshd_config.d/99-kei.conf
- ufw default-deny-in + rate-limit ssh + allow APP_PORTS
- fail2ban sshd jail
- auditd baseline ruleset (/etc/audit/rules.d/99-kei.rules)
- unattended-upgrades (AUTO reboot = FALSE)
## Verification (hard gate before handoff)
- ssh-check → exit 0
- firewall-diff (intent YAML vs live ufw) → exit 0
## Rollback
- `_primitives/provision-<provider>.sh destroy <VM_NAME>` — 1-command destroy.
- TF state: <path or "none CLI-driven">
## Cost estimate
<Plan price per month from PROVIDER pricing page; CITE>
```
Cite the source for every price/region/plan detail. Numbers NOT cited =
NO-GO per RULE 0.4.
---
## 2.b — Build the `firewall-intent.yaml`
Write `<run-dir>/firewall-intent.yaml`:
```yaml
default:
  incoming: deny
  outgoing: allow
  routed: deny
rules:
  - port: <SSH_PORT>
    proto: tcp
    action: limit
    from: any
    comment: "ssh (rate-limited)"
  # one entry per APP_PORTS:
  - port: 443
    proto: tcp
    action: allow
    from: any
```
This file is the **source of truth** the Phase 5 `firewall-diff` will
compare against live `ufw status numbered` output. Drift = Phase 5 fail.
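
One way to produce this file from the Phase 2 answers is a small generator. This is a sketch under assumptions: `render_firewall_intent` is a hypothetical name, and it assumes app ports arrive as `port/proto` strings (for example `443/tcp`), defaulting to `tcp` when no protocol is given.

```bash
# Hypothetical generator: print a firewall-intent.yaml for one SSH port
# plus zero or more app ports ("443/tcp", "80/tcp", or bare "8080").
render_firewall_intent() (
  ssh_port=$1
  shift
  printf 'default:\n  incoming: deny\n  outgoing: allow\n  routed: deny\n'
  printf 'rules:\n'
  printf '  - port: %s\n    proto: tcp\n    action: limit\n    from: any\n' "$ssh_port"
  printf '    comment: "ssh (rate-limited)"\n'
  for spec in "$@"; do
    # Assumption: a bare port number means tcp.
    case $spec in
      */*) ;;
      *) spec="$spec/tcp" ;;
    esac
    port=${spec%/*}
    proto=${spec#*/}
    printf '  - port: %s\n    proto: %s\n    action: allow\n    from: any\n' "$port" "$proto"
  done
)

# Usage (not executed here):
#   render_firewall_intent "$SSH_PORT" 443/tcp 80/tcp > "$RUN_DIR/firewall-intent.yaml"
```

Generating the file rather than hand-editing it keeps the SSH rule and the app rules in the same shape every run, which matters because any drift fails Phase 5.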
---
## 2.c — AskUserQuestion (customise ports, TLS, admin name)
One `AskUserQuestion` call with up to 4 questions:
1. **Admin user?** (stored as `ADMIN_USER`)
   - `keiadmin` (default)
   - Custom (user types — only free-text in Phase 2)
2. **SSH port?** (stored as `SSH_PORT`)
   - `22` (default; simpler)
   - `2222` (obscurity; not security, but reduces log noise)
   - Custom
3. **Application ports to open?** (multi-select, stored as `APP_PORTS`)
   - `443/tcp` — HTTPS (most apps)
   - `80/tcp` — HTTP (only if ACME HTTP-01 or redirect)
   - `none` — tunneled via Tailscale / private net only
4. **TLS?** (stored as `TLS_HOST` + method)
   - Caddy HTTP-01 (need 80/tcp + 443/tcp + DNS pointing to VM)
   - Caddy DNS-01 (no port 80 needed; need DNS provider API token)
   - None (app provides its own TLS or is behind a proxy)
---
## 2.d — Present the plan for approval
Render `plan.md` in chat. Ask ONE final AskUserQuestion:
**Proceed with this plan?**
- Approve → Phase 3.
- Iterate → loop back to 2.c with the user's change request.
- Abort → emit plan-only artefact and exit (`HANDOFF_TO=none`).
---
## 2.e — Verify criterion
- [ ] `plan.md` written to `<run-dir>/plan.md`.
- [ ] `firewall-intent.yaml` written to `<run-dir>/firewall-intent.yaml`.
- [ ] User clicked "Approve".
Emit:
`Phase 2 done: plan @ <run-dir>/plan.md. <len(APP_PORTS)> ports, TLS=<method>.`
Proceed to Phase 3.


@ -0,0 +1,107 @@
# Phase 3 — Provision + SSH First Contact
> Goal: create the VM via the right `_primitives/provision-<provider>.sh`,
> wait for `cloud-init` to finish, establish SSH as `ADMIN_USER`.
> **Verify criterion:** `VM_IP` resolves to a live sshd that accepts the
> admin key; `cloud-init status --wait` = `done`.
---
## 3.a — Render cloud-init user-data
Copy `_blocks/deploy-vps-generic.md`'s `cloud-init.yaml` template to
`<run-dir>/cloud-init.yaml`, substituting:
- `${env}`, `${role}` from Phase 2's derived VM name.
- `${ADMIN_PUBKEY}` — read `~/.ssh/id_ed25519.pub` (or ask Phase 2.c which
pubkey). **NEVER** read private keys; pubkeys only.
Render once; do not parameterise further — surgical changes only.
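
A minimal sketch of the substitution step, assuming the template carries the literal placeholders `${env}`, `${role}` and `${ADMIN_PUBKEY}` named above. `replace_literal` is a hypothetical helper; it does plain string replacement so that characters common in public keys (`/`, `+`, `=`) need no escaping.

```bash
# Hypothetical filter: replace every literal occurrence of <placeholder>
# on stdin with <value>. No regex is involved.
# Caveat: awk -v interprets backslash escapes, so values must not
# contain backslashes (SSH public keys do not).
replace_literal() {
  awk -v ph="$1" -v val="$2" '
    {
      out = ""
      while (length(ph) > 0 && (i = index($0, ph)) > 0) {
        out = out substr($0, 1, i - 1) val
        $0 = substr($0, i + length(ph))
      }
      print out $0
    }'
}

# Usage (not executed here; paths are placeholders):
#   replace_literal '${env}' "$ENV_NAME" < template.yaml \
#     | replace_literal '${role}' "$ROLE" \
#     | replace_literal '${ADMIN_PUBKEY}' "$(cat ~/.ssh/id_ed25519.pub)" \
#     > "$RUN_DIR/cloud-init.yaml"
```

Only the `.pub` file is read, which keeps the "pubkeys only" rule intact.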
---
## 3.b — Choose provisioner + run
Dispatch by `PROVIDER`:
- `hetzner` → `_primitives/provision-hetzner.sh create <VM_NAME> --type <PLAN> --location <REGION> --user-data <run-dir>/cloud-init.yaml`
- `vultr` → `_primitives/provision-vultr.sh create <VM_NAME> --plan <PLAN> --region <REGION> --user-data <run-dir>/cloud-init.yaml`
- `digitalocean` / `upcloud` — use each provider's official CLI directly
(no wrapper primitive yet); CITE the command in the plan before running.
Both primitives are idempotent — a second invocation with the same name
prints the existing IP and exits 0. Re-runs after a network blip do NOT
create duplicates.
Capture stdout (just the IPv4) into `VM_IP`.
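
Because everything downstream trusts `VM_IP`, it may be worth validating the captured string before using it. A sketch, assuming the provisioner prints a bare dotted-quad; `is_ipv4` is a hypothetical helper:

```bash
# Hypothetical validator: succeed only for a dotted-quad IPv4 address
# whose four octets are each in 0..255.
is_ipv4() (
  ip=$1
  # Reject empty input and anything that is not digits or dots
  # (this also rejects multi-line output).
  case $ip in
    ''|*[!0-9.]*) exit 1 ;;
  esac
  printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || exit 1
  # Split on dots; the subshell keeps the IFS change local.
  IFS=.
  set -- $ip
  for octet in "$@"; do
    [ "$octet" -le 255 ] || exit 1
  done
)

# Usage (not executed here):
#   VM_IP=$(_primitives/provision-hetzner.sh create "$VM_NAME" ...)
#   is_ipv4 "$VM_IP" || { echo "provisioner returned no usable IP" >&2; exit 2; }
```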
---
## 3.c — SSH first contact (TOFU)
```bash
for i in $(seq 1 60); do
  ssh -o ConnectTimeout=3 \
      -o StrictHostKeyChecking=accept-new \
      -o UserKnownHostsFile=~/.ssh/known_hosts \
      "${ADMIN_USER}@${VM_IP}" "cloud-init status --wait" && break
  sleep 5
done
```
- `StrictHostKeyChecking=accept-new` is TOFU for the FIRST connect only: it
adds an unknown host key but still refuses a changed one. After this, the
key is pinned in `known_hosts` and later connects are checked against it.
- 60 × 5 s = 5 min timeout; long enough for cloud-init on any of the
supported providers.
- `cloud-init status --wait` blocks until cloud-init finishes — no
time-based sleep.
If the loop exhausts without a successful SSH: STOP. Inspect the VM via the
provider console (`hcloud server request-console <name>` / vultr console screenshot)
and surface the failure mode:
- DNS/IP issue → wait + retry (1 constructive path).
- Wrong pubkey → revoke the VM (`provision-<p>.sh destroy`), fix Phase 2,
retry.
- Cloud-init crashed on first boot → enable rescue mode via provider
console, read `/var/log/cloud-init-output.log`, fix template, retry.
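
The bounded wait in 3.c generalises to a small retry helper that also reports exhaustion through its exit status, which makes the STOP condition explicit. `retry` is a hypothetical helper, sketched here under the assumption that the wrapped command is safe to repeat:

```bash
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts run out.
# Exit 0 on first success, 1 if every attempt failed.
retry() (
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && exit 0
    i=$((i + 1))
    # Sleep only if another attempt will follow.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  exit 1
)

# Usage (not executed here):
#   retry 60 5 ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=accept-new \
#     "${ADMIN_USER}@${VM_IP}" "cloud-init status --wait" \
#     || echo "SSH never came up: STOP and inspect the provider console" >&2
```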
---
## 3.d — AskUserQuestion (confirm IP + ready to harden)
One `AskUserQuestion`:
**VM is up at `<VM_IP>`. Cloud-init finished, admin SSH works.**
- Proceed to hardening (Phase 4).
- Pause (inspect the VM first; re-invoke skill when ready).
- Abort + destroy (calls `destroy` on the provisioner, returns to Phase 2).
---
## 3.e — Verify criterion
- [ ] `VM_IP` set.
- [ ] `cloud-init status` returns `done` (not `error`, not `disabled`).
- [ ] `ssh ${ADMIN_USER}@${VM_IP} 'true'` exits 0.
- [ ] `known_hosts` contains the VM's host key (pinned for future connects).
Emit:
`Phase 3 done: <VM_NAME> up @ <VM_IP>, admin=<ADMIN_USER>, cloud-init=done.`
Proceed to Phase 4.
---
## 3.f — Constructive-fail paths
- **Create returned no IP (provisioner exit 2).** Root cause likely API
outage or quota. Paths: (A) retry after 2 min; (B) try sibling region;
(C) fall through to an alternate provider (loops back to Phase 1).
- **cloud-init errored.** Pull logs via rescue; typical causes: bad yaml
indentation, unreachable apt mirror. Fix template; re-provision fresh
(destroy the broken VM first — partial state = harder to reason about).
- **SSH never responded.** Check provider firewall / cloud-init user
creation — some provider images rename `root` → `debian` and our
`keiadmin` sudoers file didn't take. Remediation: add the provider's
default user to the admin whitelist for 1 run, then switch.


@ -0,0 +1,109 @@
# Phase 4 — Harden via `harden-base.sh`
> Goal: run `_primitives/harden-base.sh` on the VM, over SSH, idempotently.
> **Verify criterion:** script exited 0; `systemctl is-active` returns
> `active` for `ssh`, `ufw`, `fail2ban`, `auditd`.
---
## 4.a — Ship the script
The script lives on the workstation; copy to the VM and run with `sudo`:
```bash
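scp _primitives/harden-base.sh "${ADMIN_USER}@${VM_IP}:/tmp/harden-base.sh"
# Build the flags on ONE line: a newline inside the remote command string
# would end the command early and run the rest as a separate command.
ALLOW_FLAGS=""
for p in "${APP_PORTS[@]}"; do ALLOW_FLAGS="${ALLOW_FLAGS} --allow-port ${p}"; done
ssh "${ADMIN_USER}@${VM_IP}" "sudo bash /tmp/harden-base.sh \
  --admin-user ${ADMIN_USER} \
  --ssh-port ${SSH_PORT} \
  ${ALLOW_FLAGS}"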
```
Why not `curl … | bash`? Because that depends on a hosted URL AND a
trusted TLS cert. `scp` the file you already audited locally. Lower
surface area, reproducible.
The script is **idempotent** — safe to re-run. Re-runs converge the VM to
the declared state; missing directives get rewritten, extra ones are left
alone.
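
The "missing directives get rewritten, extra ones are left alone" behaviour can be illustrated with an append-if-absent helper. This is only a sketch of the convergence idea, not the actual implementation inside `harden-base.sh`; `ensure_line` is a hypothetical name.

```bash
# Hypothetical helper: make sure <line> is present in <file> exactly as
# given. Appends it when missing; never touches other lines.
ensure_line() (
  line=$1
  file=$2
  touch "$file"
  # -x: whole-line match, -F: fixed string (no regex surprises).
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
)

# Usage (not executed here; the path is illustrative):
#   ensure_line 'PasswordAuthentication no' /etc/ssh/sshd_config.d/99-kei.conf
```

Running it twice leaves the file unchanged after the first call, which is the property that makes a re-run safe.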
---
## 4.b — Stream logs
`harden-base.sh` logs to stderr with timestamps. Capture to
`<run-dir>/harden.log`:
```bash
ssh "${ADMIN_USER}@${VM_IP}" "sudo bash /tmp/harden-base.sh …" 2> >(tee <run-dir>/harden.log >&2)
```
If the script exits non-zero: STOP. Do NOT proceed to Phase 5. Surface
the last 30 lines of `<run-dir>/harden.log` + ask the user to choose:
- (A) **Fix locally + re-ship** — edit the primitive (if bug is there) or
adjust flags. Commit the fix under `checkpoint:` before retry.
- (B) **Patch the VM manually** — user logs in, fixes, we re-run the
script to ensure idempotency.
- (C) **Destroy + reprovision** — when remediation risk > cost of a
fresh VM (2 min on Hetzner).
---
## 4.c — Post-hardening live-check
After exit 0, SSH back in and confirm:
```bash
# ufw needs root to read its state, so it runs under sudo like auditctl.
ssh "${ADMIN_USER}@${VM_IP}" "
  set -e
  systemctl is-active ssh ufw fail2ban auditd unattended-upgrades.service 2>/dev/null || true
  sudo ufw status | head -20
  sudo auditctl -l | head -10
"
```
The four core services (`ssh`, `ufw`, `fail2ban`, `auditd`) must be `active`. `auditctl -l` must show the baseline
rules (sshd_config, sudoers, identity, module, time). Record the output
in `<run-dir>/post-harden.txt`.
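
Since the live-check masks a failing `systemctl is-active` with `|| true`, the "all active" requirement has to be asserted separately. A sketch of that assertion: `systemctl is-active` prints one state per unit, one per line, and the hypothetical `all_active` filter fails unless every line reads `active`.

```bash
# Hypothetical filter: read unit states on stdin (one per line) and
# succeed only if there is at least one line and all lines are "active".
all_active() {
  awk '$0 != "active" { bad = 1 } END { exit ((NR == 0 || bad) ? 1 : 0) }'
}

# Usage (not executed here):
#   ssh "${ADMIN_USER}@${VM_IP}" "systemctl is-active ssh ufw fail2ban auditd" \
#     | all_active || echo "at least one core service is not active" >&2
```

Empty input counts as a failure, so a broken SSH connection cannot pass the check by producing no output.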
---
## 4.d — AskUserQuestion (ready to verify?)
One `AskUserQuestion`:
**Hardening applied. Four services active; auditd rules loaded.**
- Run verification gate (Phase 5).
- Apply one more pass (typo in `APP_PORTS`, extra user, etc. — loops 4.a
with a delta).
- Pause (leave the VM in current state).
---
## 4.e — Verify criterion
- [ ] `harden-base.sh` exited 0.
- [ ] `ssh / ufw / fail2ban / auditd` all `active`.
- [ ] `<run-dir>/harden.log` + `<run-dir>/post-harden.txt` captured.
Emit:
`Phase 4 done: 4/4 services active. Log: <run-dir>/harden.log.`
Proceed to Phase 5 (hard gate).
---
## 4.f — Non-obvious failure modes
- **`systemctl reload ssh` fails because `sshd -t` rejects the drop-in.**
Usually a custom `SSH_PORT` collides with ufw still configured for 22.
Fix: ensure ufw rule + sshd Port match BEFORE reload. `harden-base.sh`
writes both in one pass, but if an out-of-band edit happened between
runs, you get this.
- **fail2ban service flaps.** Usually a systemd-journal backend mismatch
on very old Debian. Verify `backend = systemd` in
`/etc/fail2ban/jail.local` (script sets this).
- **auditd refuses `-e 2`.** Means an earlier rules load is still
mastered; `augenrules --load` forces reload. Already in the script.
None of these require a Level-2 escalation — all three have known fixes.


@ -0,0 +1,112 @@
# Phase 5 — Verification Hard Gate (`ssh-check` + `firewall-diff`)
> Goal: fail-closed verification. Phase 6 refuses to run unless BOTH
> `ssh-check` AND `firewall-diff` exit 0.
> **Verify criterion:** `SSH_CHECK_OK = true` AND `FW_DIFF_OK = true`.
---
## 5.a — Pull config artefacts from the VM
```bash
scp "${ADMIN_USER}@${VM_IP}:/etc/ssh/sshd_config" <run-dir>/sshd_config
ssh "${ADMIN_USER}@${VM_IP}" "sudo tar -C /etc/ssh -cf - sshd_config.d" \
  | tar -C <run-dir>/ -xf -
ssh "${ADMIN_USER}@${VM_IP}" "sudo ufw status numbered" > <run-dir>/ufw-status.txt
```
The ufw status requires `sudo` on most distros — the admin user has it
via `NOPASSWD:ALL` from `harden-base.sh`. If `sudo` would otherwise prompt
(password or TTY required), use `sudo -n` so it fails fast, and surface the failure.
All captured files are READ ONLY, for `ssh-check` / `firewall-diff` to
parse. We NEVER push config back from the workstation.
---
## 5.b — Run `ssh-check`
```bash
_primitives/_rust/ssh-check/target/release/ssh-check \
  --config <run-dir>/sshd_config \
  --drop-in <run-dir>/sshd_config.d \
  --allow-user "${ADMIN_USER}" \
  --json > <run-dir>/ssh-check.json
SSH_EXIT=$?
```
Exit 0 → `SSH_CHECK_OK=true`. Exit 2 → `SSH_CHECK_OK=false` and
`<run-dir>/ssh-check.json` lists the violating directives with
`file:line` precision. Exit 1 → usage/parse error; surface the stderr and
loop back to Phase 4.
---
## 5.c — Run `firewall-diff`
```bash
_primitives/_rust/firewall-diff/target/release/firewall-diff \
  --intent <run-dir>/firewall-intent.yaml \
  --status-file <run-dir>/ufw-status.txt \
  --json > <run-dir>/firewall-diff.json
FW_EXIT=$?
```
Exit 0 → `FW_DIFF_OK=true`. Exit 2 → the JSON lists `missing` (in intent,
not live) and `extra` (in live, not intent) rules; `default_mismatches`
flags a non-deny inbound policy.
---
## 5.d — Decision tree
| `ssh-check` | `firewall-diff` | Action |
|---|---|---|
| 0 | 0 | Proceed to Phase 6. |
| 2 | 0 | Loop to 4.a with the sshd_config.d fix + re-ship `harden-base.sh`. |
| 0 | 2 | Ask user: apply the `missing`/`extra` deltas via `ufw` commands, or update `firewall-intent.yaml` (the intent was wrong). ONE AskUserQuestion. |
| 2 | 2 | Both failed — show both JSON reports; recommend a single fresh `harden-base.sh` re-run first (common-mode fix), then re-verify. |
| 1 | 1 | Workstation issue (missing binary, bad path) — NOT a VM problem. Rebuild the Rust primitives (`cargo build --release` in `_primitives/_rust/`). |
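
The table can be encoded as a single `case` over the two exit codes. This sketch makes two assumptions that the table does not state outright: the `1 | 1` row is read as "either tool exited 1", and any other code fails closed. The action labels are illustrative names, not identifiers used elsewhere in the skill.

```bash
# Hypothetical dispatcher: map (SSH_EXIT, FW_EXIT) to the table's action.
gate_action() {
  case "$1:$2" in
    0:0) echo proceed ;;                 # both checks passed
    1:*|*:1) echo rebuild-primitives ;;  # usage/parse error on the workstation
    2:0) echo fix-sshd ;;                # loop to 4.a
    0:2) echo fix-firewall ;;            # ask: apply deltas or fix intent
    2:2) echo rerun-harden ;;            # common-mode fix first
    *) echo blocked ;;                   # unknown code: fail closed
  esac
}

# Usage (not executed here):
#   [ "$(gate_action "$SSH_EXIT" "$FW_EXIT")" = proceed ] || echo "hard gate: not proceeding"
```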
---
## 5.e — The AskUserQuestion
Exactly ONE AskUserQuestion, gated on the decision tree above:
**Verification results:** `ssh-check=<PASS|FAIL>`,
`firewall-diff=<PASS|FAIL>`. Pick one:
- **Proceed** (only shown when both PASS) → Phase 6.
- **Fix and retry** → loop to Phase 4 (or to 5.c if intent YAML is wrong).
- **Ignore and proceed** → **BLOCKED.** The hard-gate invariant refuses
this path per `SKILL.md`. You can abort, but you cannot bypass.
---
## 5.f — Verify criterion
- [ ] `ssh-check` exit 0.
- [ ] `firewall-diff` exit 0.
- [ ] `<run-dir>/ssh-check.json` and `<run-dir>/firewall-diff.json` saved.
Emit:
`Phase 5 done: hard-gate PASSED. Artefacts in <run-dir>/.`
Proceed to Phase 6.
---
## 5.g — Non-obvious pitfalls
- **sshd_config.d drop-in not loaded.** Debian 12's default
`/etc/ssh/sshd_config` includes the `.d` directory via an `Include`
directive. `ssh-check` doesn't follow `Include` on purpose (security: includes
can escape the intended tree). Pass `--drop-in` explicitly.
- **ufw status shows IPv6 rules as duplicates.** Intent is IPv4-only by
default; `firewall-diff`'s normalisation treats `(v6)` rules with same
port/proto as "expected" and does not flag them. If you need strict
v6-only rules, open a separate intent file.
- **`MaxAuthTries` at 6 (the OpenSSH default) or higher.** `harden-base.sh` sets
3. If a previous manual edit raised it and we re-ran without rewriting,
ssh-check will FAIL `maxauthtries`. Fix: re-run `harden-base.sh`.


@ -0,0 +1,123 @@
# Phase 6 — Handoff + Final Report
> Goal: emit a single, complete report and (optionally) hand off to
> `/web-deploy` or `/auth-setup`. No further mutation to the VM from this
> skill.
> **Verify criterion:** final report emitted; all Phase-1..5 artefacts
> listed with absolute paths; next-skill dispatch (if any) announced.
---
## 6.a — Artefact ledger
Collect and surface:
- `<run-dir>/plan.md` — Phase 2
- `<run-dir>/cloud-init.yaml` — Phase 3 input
- `<run-dir>/firewall-intent.yaml` — Phase 2 source of truth
- `<run-dir>/harden.log` — Phase 4 stderr
- `<run-dir>/post-harden.txt` — Phase 4 systemctl snapshot
- `<run-dir>/sshd_config` + `sshd_config.d/` — Phase 5 input (captured)
- `<run-dir>/ufw-status.txt` — Phase 5 input (captured)
- `<run-dir>/ssh-check.json` — Phase 5 output
- `<run-dir>/firewall-diff.json` — Phase 5 output
Every path must exist on disk before emitting the report. Missing
artefact = bug in an earlier phase; STOP and surface the gap.
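
The existence check is mechanical and can be scripted. A sketch using the artefact names from the ledger above; `missing_artefacts` is a hypothetical helper that prints every missing path and prints nothing when the ledger is complete.

```bash
# Hypothetical check: list ledger entries that do not exist under <run-dir>.
missing_artefacts() (
  run_dir=$1
  for rel in plan.md cloud-init.yaml firewall-intent.yaml harden.log \
             post-harden.txt sshd_config sshd_config.d ufw-status.txt \
             ssh-check.json firewall-diff.json; do
    [ -e "$run_dir/$rel" ] || printf '%s\n' "$run_dir/$rel"
  done
)

# Usage (not executed here):
#   gaps=$(missing_artefacts "$RUN_DIR")
#   [ -z "$gaps" ] || { echo "missing artefacts:"; echo "$gaps"; exit 1; }
```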
---
## 6.b — Final report
```
=== /VM-PROVISION REPORT ===
Intent: <first 80 chars of INTENT>
Provider: <PROVIDER> / region=<REGION> / plan=<PLAN> / arch=<ARCH>
VM: <VM_NAME> @ <VM_IP>
Admin: <ADMIN_USER> (ssh port <SSH_PORT>)
Ports: <APP_PORTS joined>
TLS: <TLS_HOST or "none">
Hardened: <HARDENED>
Verification: ssh-check=PASS firewall-diff=PASS
Handoff: <HANDOFF_TO>
Artefacts:
- <run-dir>/plan.md
- <run-dir>/cloud-init.yaml
- <run-dir>/firewall-intent.yaml
- <run-dir>/harden.log
- <run-dir>/post-harden.txt
- <run-dir>/sshd_config (+ sshd_config.d/)
- <run-dir>/ufw-status.txt
- <run-dir>/ssh-check.json
- <run-dir>/firewall-diff.json
AskUserQuestion count: <N, should be 6>
```
No prose after the ledger. The report is the contract.
---
## 6.c — Handoff (no AskUserQuestion; next-skill dispatch inferred)
If `TLS_HOST` was set AND the caller's intent mentions deploying an app
— dispatch to `/web-deploy` with the VM IP and admin credentials
(by env-var reference only, RULE 0.8). Surface:
> `Handoff → /web-deploy <VM_IP> --admin <ADMIN_USER> --tls <TLS_HOST>`
If the intent mentions auth / identity — surface:
> `Handoff → /auth-setup <VM_IP>`
Otherwise: `HANDOFF_TO=none`. User invokes the next skill manually when
ready.
**Never** run the next skill automatically — the user already clicked
their way through 6 phases; handing off to another multi-phase skill
without a pause is hostile UX.
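
The inference rules can be sketched as a pure function that only names the next skill and never runs it. The keyword matching (`deploy`, `auth`, `identity`) is an assumption made for illustration; the skill's real intent parsing is not specified here, and `infer_handoff` is a hypothetical name.

```bash
# Hypothetical inference: print the suggested HANDOFF_TO value.
#   $1 = TLS_HOST (empty or "none" when unset), $2 = intent text
infer_handoff() (
  tls_host=$1
  intent=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  has_tls=no
  if [ -n "$tls_host" ] && [ "$tls_host" != "none" ]; then has_tls=yes; fi
  case "$has_tls:$intent" in
    yes:*deploy*) echo "/web-deploy" ;;       # TLS host set AND app deploy mentioned
    *auth*|*identity*) echo "/auth-setup" ;;  # auth / identity mentioned
    *) echo "none" ;;                         # user invokes the next skill manually
  esac
)

# Usage (not executed here):
#   HANDOFF_TO=$(infer_handoff "$TLS_HOST" "$INTENT")
```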
---
## 6.d — Memory save (RULE memory-protocol)
Append to `memory/{project-or-infra}.md`:
```markdown
### VM provisioned: <VM_NAME> (YYYY-MM-DD) [E1]
- Provider: <PROVIDER> <PLAN> @ <REGION>
- IP: <VM_IP>
- Admin: <ADMIN_USER>
- Hardened: harden-base.sh rev <git-sha>
- Verify: ssh-check + firewall-diff both PASS
- Cost: <X>/month (cited @ <date>)
- Artefacts: <run-dir>/
```
Evidence grade E1 — facts are direct observations (we ran the commands,
we have the exit codes, we can re-verify on demand).
If the project file doesn't exist yet, create `memory/{slug}.md` and add
a single line to `MEMORY.md` under the right section.
---
## 6.e — Verify criterion
- [ ] Report emitted.
- [ ] All 9+ artefacts exist on disk at absolute paths.
- [ ] `memory/{project}.md` updated (or created) with the provision entry.
- [ ] `HANDOFF_TO` announced (or `none`).
---
## 6.f — Rollback instructions (always include in the report)
```
# destroy the VM + all its resources (idempotent)
_primitives/provision-<PROVIDER>.sh destroy <VM_NAME> --force
# purge local artefacts (plan, logs, captured configs)
rm -rf <run-dir>
```
Keep them visible — Future-Us will appreciate the 1-command path back.