feat(blocks): 7 VM + security blocks
- deploy-hetzner-cloud.md: CX22/CAX11 (€3.79/mo verified), hcloud TF
- deploy-vps-generic.md: provider-agnostic cloud-init + SSH first-contact
- security-ssh-hardening.md: sshd_config.d/99-kei.conf baseline matrix
- security-firewall-ufw.md: ufw default-deny + rate limiting + intent YAML
- security-tls-caddy.md: Caddy 2 auto-ACME, HTTP-01 / DNS-01, systemd
- security-audit-logging.md: auditd rules + journald persistence
- security-patching.md: unattended-upgrades + needrestart + reboot window

All blocks reference RULE 0.8 env-var-only secrets and cite provider specifics per RULE 0.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent 48d4dd0733
commit 19cbdbd689
7 changed files with 449 additions and 0 deletions
50
_blocks/deploy-hetzner-cloud.md
Normal file
@@ -0,0 +1,50 @@
# DEPLOY: Hetzner Cloud (CX22 / CAX11 + TF + Cloud Firewall)

**Why Hetzner:** cheapest EU VPS with a reputable network. CX22 (x86, 2 vCPU / 4 GB / 40 GB) = **€3.79/mo + VAT**; CAX11 (Ampere ARM64, 2 vCPU / 4 GB / 40 GB) = **€3.79/mo + VAT**. Prices verified on <https://www.hetzner.com/cloud/> [VERIFIED 2026-04-21]. Hourly billing caps at the monthly rate, so it is safe to spin servers up and down for tests.

**Terraform provider:** `hetznercloud/hcloud` (official). Pin version:

```hcl
terraform {
  required_providers {
    hcloud = { source = "hetznercloud/hcloud", version = "~> 1.49" }
  }
}
provider "hcloud" { token = var.hcloud_token }
```

Token via env: `export HCLOUD_TOKEN=$(grep ^HCLOUD_TOKEN= ~/.claude/secrets/.env | cut -d= -f2-)`. The provider block above reads `var.hcloud_token`, so also `export TF_VAR_hcloud_token="$HCLOUD_TOKEN"`, or drop the `token` argument and let the provider pick up `HCLOUD_TOKEN` itself. **NEVER commit the token** (RULE 0.8, see `domain-has-secrets.md`).
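
The `grep | cut` lookup can be wrapped in a small helper so that a missing key fails loudly instead of exporting an empty token. This is a minimal sketch, assuming a plain `KEY=value` file without quoting; `read_env_key` is a hypothetical name, not a KeiSeiKit primitive.

```bash
# Hypothetical helper: print the value of KEY from a KEY=value env file.
# Assumes one assignment per line and no surrounding quotes.
read_env_key() (
  file="$1"
  key="$2"
  # Take the last matching assignment; keep everything after the first '='.
  line="$(grep -E "^${key}=" "$file" 2>/dev/null | tail -n 1)"
  [ -n "$line" ] || { echo "missing key: $key" >&2; exit 1; }
  printf '%s\n' "${line#*=}"
)
```

Usage might look like `HCLOUD_TOKEN="$(read_env_key ~/.claude/secrets/.env HCLOUD_TOKEN)" && export HCLOUD_TOKEN`. Terraform maps an exported `TF_VAR_hcloud_token` onto `var.hcloud_token`, so the same value can feed a provider block that takes the token from a variable.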

**Minimal `hcloud_server` resource:**

```hcl
resource "hcloud_server" "node" {
  name         = "kei-${var.env}-${var.role}"
  image        = "debian-12"
  server_type  = var.arch == "arm64" ? "cax11" : "cx22"
  location     = var.location # fsn1 / nbg1 / hel1 / ash / hil / sin
  ssh_keys     = [hcloud_ssh_key.admin.id]
  user_data    = file("${path.module}/cloud-init.yaml")
  firewall_ids = [hcloud_firewall.base.id]
  labels       = { project = "kei", env = var.env }
}
```

`ssh_keys` is **mandatory**: passing it disables the root password e-mail path.

**Cloud Firewall (stateful, inbound default-deny):**

```hcl
resource "hcloud_firewall" "base" {
  name = "kei-base"
  # HCL allows only one argument in a single-line block, so rules are multi-line.
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = var.admin_cidrs
  }
  rule {
    direction  = "in"
    protocol   = "icmp"
    source_ips = ["0.0.0.0/0", "::/0"]
  }
  # Add app ports (443, 80) only when an app is deployed behind the node.
}
```

Attach to the server via `firewall_ids = [ … ]`. Cloud Firewall is the FIRST line of defense: it drops traffic before it hits the VM's ufw (see `security-firewall-ufw.md`). Both layers MUST agree.

**Locations:** `fsn1` (Falkenstein DE), `nbg1` (Nürnberg DE), `hel1` (Helsinki FI), `ash` (Ashburn US), `hil` (Hillsboro US), `sin` (Singapore). Pick the region closest to users; ARM64 `cax*` available in EU only [VERIFIED 2026-04-21].

**Snapshots + rescue:** `hcloud_snapshot` for golden images; `hcloud server enable-rescue` before SSH lockout recovery. Back up `user_data` and TF state (remote backend: S3-compatible such as R2).

**Primitives provided by KeiSeiKit:**

- `_primitives/provision-hetzner.sh`: wrapper around the `hcloud` CLI, idempotent create/destroy, checks for an existing server by name first.
- Complement with `_primitives/harden-base.sh` run over SSH after first boot.
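
The "check existing server by name first" behaviour can be sketched as follows. This is an illustration only: `_primitives/provision-hetzner.sh` itself is not shown in this commit, `ensure_server` is a hypothetical name, and the sketch assumes the `hcloud` CLI is installed and already authenticated.

```bash
# Hypothetical sketch of an idempotent create; not the actual KeiSeiKit primitive.
# Args: server name, server type, location, name of an uploaded SSH key.
ensure_server() (
  name="$1"; type="$2"; location="$3"; ssh_key="$4"
  # `hcloud server describe` exits non-zero when no server has this name.
  if hcloud server describe "$name" >/dev/null 2>&1; then
    echo "exists: $name"
    exit 0
  fi
  hcloud server create --name "$name" --type "$type" --image debian-12 \
    --location "$location" --ssh-key "$ssh_key" >/dev/null || exit 1
  echo "created: $name"
)
```

Running `ensure_server kei-prod-web cx22 fsn1 admin` twice should report `created` the first time and `exists` the second, which is what makes the wrapper safe to re-run.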

**Forbidden:** hcloud token in `.tf` or `.tfvars` committed to git; Cloud Firewall with port 22 open to `0.0.0.0/0`; creating servers with `keep_disk = false` then snapshotting (destroys data); using Hetzner Storage Boxes for anything needing low latency (they're SFTP-over-WAN).
79
_blocks/deploy-vps-generic.md
Normal file
@@ -0,0 +1,79 @@
# DEPLOY: Generic VPS (provider-agnostic cloud-init + ssh-first-contact)

**Target providers:** DigitalOcean Droplets, Vultr, UpCloud, Linode/Akamai. Each has slightly different Terraform providers + CLIs, but the Day-0 contract is identical: **boot a Debian/Ubuntu image with a cloud-init user-data blob; add one admin SSH key; nothing else.**

**Day-0 cloud-init blob (`cloud-init.yaml`), universal:**

```yaml
#cloud-config
hostname: kei-${env}-${role}
timezone: UTC
package_update: true
package_upgrade: true
packages:
  - ufw
  - fail2ban
  - unattended-upgrades
  - auditd
  - needrestart
  - curl
  - jq
users:
  - name: keiadmin
    groups: sudo
    shell: /bin/bash
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - ${ADMIN_PUBKEY}
ssh_pwauth: false
disable_root: true
write_files:
  - path: /etc/ssh/sshd_config.d/99-kei.conf
    permissions: '0644'
    content: |
      PasswordAuthentication no
      PermitRootLogin no
      MaxAuthTries 3
      AllowUsers keiadmin
      ClientAliveInterval 120
      ClientAliveCountMax 2
runcmd:
  - [ systemctl, restart, ssh ]
  - [ ufw, default, deny, incoming ]
  - [ ufw, default, allow, outgoing ]
  # limit (not allow) keeps SSH rate-limited, matching security-firewall-ufw.md
  - [ ufw, limit, 22/tcp ]
  - [ ufw, --force, enable ]
```

The blob is intentionally provider-neutral. Provider-specific bits (private-network bring-up, metadata service quirks) go in a short appendix the provisioner appends. See `_primitives/harden-base.sh` for post-boot hardening re-runs.

**SSH-first-contact (`ssh-first-contact.sh` pattern):**

```bash
# Wait for cloud-init to finish AND sshd to be ready on the new IP.
for i in $(seq 1 60); do
  ssh -o ConnectTimeout=3 -o StrictHostKeyChecking=accept-new \
    "keiadmin@$IP" "cloud-init status --wait" && break
  sleep 5
done
# Final gate: exits non-zero if the host never finished booting.
ssh "keiadmin@$IP" "sudo test -f /var/lib/cloud/instance/boot-finished"
```

`StrictHostKeyChecking=accept-new` is OK only for the FIRST contact (TOFU). It records the host key in `~/.ssh/known_hosts`; subsequent connects use the default strict mode. Never use `StrictHostKeyChecking=no`: it silently accepts a changed host key, i.e. a MitM.

**Terraform skeleton (provider-agnostic via vars):**

```hcl
variable "provider_kind" {} # "digitalocean" | "vultr" | "upcloud" | "linode"
variable "region" {}
variable "size_slug" {}     # provider-specific size id
variable "admin_pubkey" {}  # raw ssh-ed25519 …
variable "env" {}           # fills ${env} in cloud-init.yaml
variable "role" {}          # fills ${role} in cloud-init.yaml
locals {
  # templatefile() fails unless every ${...} used in the template is supplied.
  user_data = templatefile("${path.module}/cloud-init.yaml", {
    ADMIN_PUBKEY = var.admin_pubkey
    env          = var.env
    role         = var.role
  })
}
# ... then a module-per-provider resource that all read `local.user_data`
```

Keep TF state **local per-env-per-dev by default**; upgrade to a remote backend (R2, S3, Terraform Cloud) only when ≥ 2 humans share state.

**Per-provider gotchas (verified 2026-04-21):**

- **DigitalOcean:** Marketplace "Docker" images skip unattended-upgrades, so start from plain Debian 12 instead. IPv6 requires `ipv6 = true` on the droplet.
- **Vultr:** `vultr-cli` needs `VULTR_API_KEY`; the default firewall is OPEN, so attach a firewall group or rely solely on ufw.
- **UpCloud:** IPs rotate on full stop+start unless you request `floating_ip`. Finnish ASN often preferred over Hetzner in RU-routed workloads (see `project-vortex.md`).
- **Linode:** cloud-init runs before disk resize on some plans → `growpart` may need a rerun on first `ssh`.

**Forbidden:** baking the admin private key into an AMI/snapshot; reusing one SSH keypair across envs; letting cloud-init pull scripts from a mutable URL (`curl … | bash` in `runcmd:`; pin to a hash instead); running `apt-get dist-upgrade -y` in `runcmd` without `needrestart` to surface pending reboots.
77
_blocks/security-audit-logging.md
Normal file
@@ -0,0 +1,77 @@
# SECURITY: Audit Logging (auditd + journald forwarding)

**Goal:** every privileged action (sudo, ssh login, sensitive file edit) leaves a tamper-evident trail that survives the VM being reimaged.

**Stack:**

- `auditd`: userspace daemon for the Linux kernel audit framework; writes to `/var/log/audit/audit.log` in a human-unfriendly but machine-parseable K/V format.
- `journald`: systemd's binary journal (`/var/log/journal/`); captures stdout/stderr of every service plus the syslog stream.
- **Off-box shipping** (optional but recommended): forward journald to a remote log collector (Loki, Vector, rsyslog+TLS). Local logs are destroyed on reimage.

**Install + enable:**

```
sudo apt install -y auditd audispd-plugins
sudo systemctl enable --now auditd
```

**Reference `/etc/audit/rules.d/99-kei.rules`:**

```
# KeiSeiKit audit baseline, pinned 2026-04-21. Loaded by augenrules on boot.
## 1. SSH events
-w /etc/ssh/sshd_config -p wa -k sshd_config
-w /etc/ssh/sshd_config.d/ -p wa -k sshd_config
-w /root/.ssh/ -p wa -k ssh_keys_root
-w /home/keiadmin/.ssh/ -p wa -k ssh_keys_admin

## 2. Sudo events
-w /etc/sudoers -p wa -k sudoers
-w /etc/sudoers.d/ -p wa -k sudoers
-a always,exit -F arch=b64 -S execve -F euid=0 -F auid>=1000 -F auid!=unset -k sudo_root

## 3. Privilege / identity changes
-w /etc/passwd -p wa -k identity
-w /etc/group -p wa -k identity
-w /etc/shadow -p wa -k identity
-w /etc/gshadow -p wa -k identity

## 4. Loading / unloading kernel modules
-a always,exit -F arch=b64 -S init_module -S finit_module -S delete_module -k module

## 5. Time changes (detect attempts to skew audit timestamps)
-a always,exit -F arch=b64 -S adjtimex -S settimeofday -S clock_settime -k time
-w /etc/localtime -p wa -k time

## 6. Make the config itself immutable (place LAST)
-e 2
```

`-e 2` locks the ruleset until reboot (tamper-resistant). Load with `sudo augenrules --load && sudo systemctl restart auditd`; once the lock is active, later rule changes only take effect after a reboot. Test with `sudo ausearch -k sshd_config | tail`.
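
Because `-e 2` only protects the rules loaded before it, it is worth checking that it really is the final rule. A minimal sketch, assuming the file passed in is either `99-kei.rules` or the merged `/etc/audit/audit.rules` that `augenrules` generates; `immutable_flag_is_last` is a hypothetical helper name.

```bash
# Hypothetical check: succeed only if the last effective rule is "-e 2".
immutable_flag_is_last() (
  # Drop comments and blank lines, trim whitespace, keep the final rule.
  last_rule="$(grep -Ev '^[[:space:]]*(#|$)' "$1" \
    | sed 's/^[[:space:]]*//; s/[[:space:]]*$//' \
    | tail -n 1)"
  [ "$last_rule" = "-e 2" ]
)
```

For example, `immutable_flag_is_last /etc/audit/rules.d/99-kei.rules || echo "move -e 2 to the end"` could run before `augenrules --load`.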

**Human-readable summaries:** `sudo aureport -au` (auth events), `aureport -m` (module loads), `aureport -k` (keyed rule hits). Use these in incident response; raw `audit.log` is only for ingest pipelines.

**journald tuning, `/etc/systemd/journald.conf.d/99-kei.conf`:**

```
[Journal]
Storage=persistent
Compress=yes
SystemMaxUse=500M
SystemKeepFree=1G
MaxFileSec=1week
ForwardToSyslog=no
```

`Storage=persistent` creates `/var/log/journal/`; without it, `journalctl` history disappears on reboot. `MaxFileSec=1week` rotates weekly; combine with off-box shipping so you don't lose events.

**Off-box shipping patterns:**

- **systemd-journal-upload**: built-in, ships via HTTPS to a `systemd-journal-remote` receiver. Mutual-TLS recommended.
- **Vector** (<https://vector.dev>): pull from the `journald` source, push to Loki/S3/syslog-TLS. Modern, Rust-native. The source reads the journal through the `journalctl` binary, so it follows the persistent journal under `/var/log/journal/`.
- **rsyslog → remote**: legacy path; useful if you already operate a syslog collector.

Any choice: use TLS, authenticate the receiver, do NOT push cleartext logs across the internet. Logs often contain secrets even when the app tries not to log them.

**Failure-mode handling:** `auditd` can be configured to panic the kernel when the audit queue fills. That is reasonable for high-compliance hosts and DANGEROUS for general VMs. Default `/etc/audit/auditd.conf` has `disk_full_action = SUSPEND` and `disk_error_action = SUSPEND`; keep these, and tune to `HALT` only if a regulatory requirement demands it.

**Verification (skill Phase 5):**

- `sudo auditctl -l` returns a non-empty rule list.
- `systemctl is-active auditd` = `active`.
- `journalctl --disk-usage` shows a non-zero persistent journal.
- (Optional) an off-box log-receiver shows entries within the last N minutes.

**Forbidden:** deleting `/var/log/audit/audit.log` or `/var/log/journal/*` on a live host (breaks chain-of-custody); running auditd with `-e 0` (auditing disabled) or leaving it at `-e 1` (enabled but unlocked, so a root-level attacker can change or disable the rules); shipping logs in cleartext; logging secrets (app-level concern: redact before `logger()`); disabling persistent journald.
62
_blocks/security-firewall-ufw.md
Normal file
@@ -0,0 +1,62 @@
# SECURITY: Firewall (ufw default-deny + rate limiting + nftables alt)

**Posture: default-deny-in / allow-out:**

```
ufw default deny incoming
ufw default allow outgoing
ufw default deny routed    # do NOT forward unless explicitly routing
ufw limit 22/tcp comment 'ssh (rate-limited: 6 conn / 30s)'
ufw logging medium
ufw --force enable
```

`ufw limit` = per-source-IP brute-force mitigation at the kernel level (iptables `recent` module). Use it for SSH; *never* use it for app traffic (false positives on shared-NAT clients).

**Layer ordering (read top-down):**

1. **Cloud Firewall** (Hetzner Cloud Firewall / AWS Security Group / DO Firewall): drops at the provider edge, BEFORE packets hit the VM. Cheapest layer.
2. **ufw** on the VM: defence in depth; also covers provider-firewall misconfigs and private-network paths.
3. **App-level auth**: sshd keys, TLS client certs, app tokens.

Both the Cloud Firewall AND ufw must agree on the port allow-list. A mismatch means "it works from provider console but not from Tailscale" or vice-versa. Use `_primitives/_rust/firewall-diff/` to compare intended rules (YAML) against running `ufw status`.

**Intended-rules YAML schema (`firewall-intent.yaml`):**

```yaml
default:
  incoming: deny
  outgoing: allow
  routed: deny
rules:
  - port: 22
    proto: tcp
    action: limit
    from: any
    comment: "ssh (rate-limited)"
  - port: 443
    proto: tcp
    action: allow
    from: any
    comment: "https / caddy"
  - port: 80
    proto: tcp
    action: allow
    from: any
    comment: "http / acme-http-01"
```

`firewall-diff` compares this against the parsed output of live `ufw status numbered` and prints additions/deletions. Exit 0 iff live ≡ intent.
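
The additions/deletions contract can be illustrated with plain shell. This is a sketch under stated assumptions, not the Rust `firewall-diff` implementation: both inputs are assumed to be already normalized to one rule per line (for example `22/tcp limit any`), and `rules_diff` is a hypothetical name.

```bash
# Hypothetical sketch: compare two normalized rule lists.
# Prints "+ rule" for rules in the intent but not live, "- rule" for the reverse.
# Exit status 0 only when both sets are identical.
rules_diff() (
  intent_sorted="$(mktemp)"
  live_sorted="$(mktemp)"
  trap 'rm -f "$intent_sorted" "$live_sorted"' EXIT
  sort -u "$1" > "$intent_sorted"
  sort -u "$2" > "$live_sorted"
  missing="$(comm -23 "$intent_sorted" "$live_sorted" | sed 's/^/+ /')"
  extra="$(comm -13 "$intent_sorted" "$live_sorted" | sed 's/^/- /')"
  [ -n "$missing" ] && printf '%s\n' "$missing"
  [ -n "$extra" ] && printf '%s\n' "$extra"
  [ -z "$missing$extra" ]
)
```

Treating the rules as sets keeps the comparison independent of rule order, which is enough for an allow-list check; order-sensitive rule sets would need a stricter comparison.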

**Rate limiting patterns:**

- `limit`: built-in; 6 connections / 30 s per IP. Good for SSH.
- Per-app: do it inside the app or a reverse proxy (nginx `limit_req`, Caddy `rate_limit`), not in ufw. Kernel rate-limit doesn't understand HTTP methods.
- ICMP: `ufw default allow outgoing` covers outbound. Inbound ICMP is governed by `/etc/ufw/before.rules` rather than `ufw allow`; keep echo-request and the error types accepted there for monitoring, and do NOT blanket-block ICMP (that breaks path-MTU discovery).

**IPv6:** `/etc/default/ufw` → `IPV6=yes` (default Debian 12). Verify that `ufw status verbose` shows the (v6) rules. Missing IPv6 rules = a trivial bypass on dual-stack VMs.
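
The "(v6) rules are present" check can be automated against the text that `ufw status` prints. A minimal sketch, assuming the usual `To / Action / From` table layout; `missing_v6_rules` is a hypothetical helper, and rules restricted to an IPv4 source legitimately have no v6 twin, so treat hits as prompts to review rather than hard failures.

```bash
# Hypothetical check: read `ufw status` output on stdin and print every target
# (e.g. "443/tcp") that has an IPv4 rule but no "(v6)" counterpart.
# Exit status 1 if anything is printed.
missing_v6_rules() {
  awk '
    $2 ~ /^(ALLOW|LIMIT|DENY|REJECT)$/ { v4[$1] = 1 }                    # "22/tcp LIMIT Anywhere"
    $2 == "(v6)" && $3 ~ /^(ALLOW|LIMIT|DENY|REJECT)$/ { v6[$1] = 1 }    # "22/tcp (v6) LIMIT Anywhere (v6)"
    END {
      bad = 0
      for (t in v4) if (!(t in v6)) { print t; bad = 1 }
      exit bad
    }
  '
}
```

Typical use would be `sudo ufw status | missing_v6_rules`.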

**Logging:** `ufw logging medium` logs through the kernel log, which lands in `/var/log/ufw.log` when rsyslog is installed and in journald otherwise. Forward to journald (default on systemd) or an off-box log collector. Logging `high` is too chatty for steady state; use it only during incident response.

**nftables alternative (for hosts that have Docker-installed iptables-nft):**
ufw is a thin wrapper over iptables/nftables; on Docker-heavy hosts, Docker's daemon aggressively rewrites iptables and can bypass ufw. Two options:
1. **`"iptables": false` in `/etc/docker/daemon.json`** (legacy installs: `DOCKER_OPTS=--iptables=false`), and do NAT yourself (advanced).
2. **`ufw-docker`** companion (<https://github.com/chaifeng/ufw-docker>, not bundled in Debian; pin a tagged release and review the script BEFORE install).

On non-Docker hosts, ufw is sufficient. On Docker hosts, EITHER isolate (dedicated host + Cloud Firewall only) OR use `ufw-docker`; don't half-configure.

**Forbidden:** `ufw default allow incoming` "temporarily"; `allow from any to any port 22` without `limit`; skipping the IPv6 rule set; letting Docker silently override ufw without disabling its iptables chain; relying on `ufw` as the ONLY layer when a Cloud Firewall is available.
62
_blocks/security-patching.md
Normal file
@@ -0,0 +1,62 @@
# SECURITY: Patching (unattended-upgrades + needrestart + reboot window)

**Goal:** security patches applied within 24 h of release; service restarts + kernel reboots happen within a declared maintenance window (NOT ad-hoc at 3 AM UTC on a random Tuesday).

**Install:**

```
sudo apt install -y unattended-upgrades needrestart
```

**`/etc/apt/apt.conf.d/50unattended-upgrades` (essential lines, Debian 12 / Ubuntu 22.04+):**

```
Unattended-Upgrade::Origins-Pattern {
    "origin=Debian,codename=${distro_codename}-security";
    "origin=Debian,codename=${distro_codename}-updates";
};
Unattended-Upgrade::Automatic-Reboot "false";
Unattended-Upgrade::Automatic-Reboot-Time "04:00";
Unattended-Upgrade::Mail "admin@example.com";
Unattended-Upgrade::MailReport "on-change";
```

The patterns above match Debian archives; on Ubuntu the archive origin is `Ubuntu`, so adjust `origin=` there. `Automatic-Reboot "false"` is the SAFE default: an automatic reboot without coordination kills in-flight requests. Pair with `needrestart` to SURFACE the reboot requirement, then schedule the window explicitly (below).

**`/etc/apt/apt.conf.d/20auto-upgrades`:**

```
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
APT::Periodic::AutocleanInterval "7";
```

Triggers daily via `/lib/systemd/system/apt-daily.timer` + `apt-daily-upgrade.timer`.

**needrestart:** after each upgrade, prints services that loaded old library versions and need a restart. `/etc/needrestart/needrestart.conf`:

```
$nrconf{restart} = 'l';        # list only; do NOT auto-restart services
$nrconf{kernelhints} = -1;     # suppress "reboot hint" interactive prompt (non-TTY cron)
```

`$nrconf{restart} = 'a'` (auto) is tempting but dangerous: restarting `postgresql` or a stateful app during a migration = corruption.

**Reboot window pattern (declared, env-var-driven):**

```ini
# /etc/systemd/system/kei-reboot-window.service + .timer
# Only reboots if /var/run/reboot-required exists AND the current time
# falls inside the declared window.
[Service]
Type=oneshot
EnvironmentFile=/etc/default/kei-reboot-window
ExecStart=/usr/local/bin/kei-reboot-window

# /etc/default/kei-reboot-window
# Comments sit on their own lines: systemd's EnvironmentFile= parser
# does not strip trailing "# ..." text from a value.
# day-of-week
KEI_REBOOT_DOW="Sun"
# 24h, UTC
KEI_REBOOT_HOUR="04"
KEI_REBOOT_MIN="15"
# optional pre-reboot drain (e.g. drain a load-balancer slot)
KEI_DRAIN_CMD=""
```

The `kei-reboot-window` script checks `[ -f /var/run/reboot-required ]`, verifies it is the declared DOW/hour, runs `$KEI_DRAIN_CMD`, then `systemctl reboot`. Commit the script once; reuse the env file per-host.
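
The script logic described above might look like the following. It is a sketch only, since the real `/usr/local/bin/kei-reboot-window` is not part of this commit; the function names are hypothetical, and it assumes the accompanying `.timer` fires at `KEI_REBOOT_MIN` past each hour so that the script only has to match day and hour.

```bash
# Hypothetical sketch of /usr/local/bin/kei-reboot-window.
# Succeeds only when a reboot is pending AND the given day/hour match the window.
# $1 = flag file, $2 = current day abbreviation ("Sun"), $3 = current hour ("04").
in_reboot_window() {
  [ -f "$1" ] && [ "$2" = "${KEI_REBOOT_DOW:-}" ] && [ "$3" = "${KEI_REBOOT_HOUR:-}" ]
}

kei_reboot_window_main() {
  # LC_ALL=C keeps the day abbreviation in English; -u matches the UTC window.
  if in_reboot_window /var/run/reboot-required \
       "$(LC_ALL=C date -u +%a)" "$(date -u +%H)"; then
    # Optional drain hook; skip the reboot if the drain fails.
    if [ -n "${KEI_DRAIN_CMD:-}" ]; then
      sh -c "$KEI_DRAIN_CMD" || return 1
    fi
    systemctl reboot
  fi
}
```

An installed script would end with a call to `kei_reboot_window_main`; keeping the window test in its own function makes it easy to exercise without touching `systemctl`.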

**Provider-specific:**

- **Hetzner Cloud / Vultr / UpCloud / DigitalOcean / Linode**: nothing extra; cloud-init already installs the packages per `deploy-vps-generic.md`.
- **AWS EC2**: `ec2-instance-connect` may briefly reject SSH during a reboot; tolerate this in orchestration retries.

**Auditability:** `unattended-upgrades` logs to `/var/log/unattended-upgrades/unattended-upgrades.log`. Forward via journald (see `security-audit-logging.md`). Package a short summary in the skill Phase 5 report.

**Forbidden:** `Unattended-Upgrade::Automatic-Reboot "true"` on stateful services; `$nrconf{restart} = 'a'` on a database host; silently skipping the reboot window to "avoid downtime" (real fix: HA, not skipped patches); installing `.deb` packages from third-party repos without pinning + signature verification; disabling the `apt-daily.timer`, which disables ALL security updates.
51
_blocks/security-ssh-hardening.md
Normal file
@@ -0,0 +1,51 @@
# SECURITY: SSH Hardening (sshd_config.d/99-kei.conf)

**Rule:** hardening goes into a drop-in under `/etc/ssh/sshd_config.d/`, NEVER by editing `/etc/ssh/sshd_config` directly. The main file ships with distro-owned defaults; drop-ins survive package upgrades cleanly and take precedence because Debian/Ubuntu `Include` the directory at the top of `sshd_config` and sshd keeps the first value it reads for each keyword. The same rule applies between drop-ins: a lower-sorting file (such as cloud-init's `50-cloud-init.conf`, where present) overrides `99-kei.conf` for any keyword both set, so confirm the effective values with `sudo sshd -T`.

**Reference file `/etc/ssh/sshd_config.d/99-kei.conf`:**

```
# KeiSeiKit hardened SSH, pinned 2026-04-21, auditable via ssh-check.
Protocol 2
PasswordAuthentication no
ChallengeResponseAuthentication no
KbdInteractiveAuthentication no
PermitRootLogin prohibit-password
PermitEmptyPasswords no
UsePAM yes
MaxAuthTries 3
MaxSessions 4
LoginGraceTime 20
AllowUsers keiadmin
AllowTcpForwarding no
X11Forwarding no
PermitTunnel no
ClientAliveInterval 120
ClientAliveCountMax 2
LogLevel VERBOSE
# Modern crypto only (OpenSSH ≥ 8.9, default Debian 12 / Ubuntu 22.04+):
KexAlgorithms curve25519-sha256,curve25519-sha256@libssh.org,sntrup761x25519-sha512@openssh.com
Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com
MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com
HostKeyAlgorithms ssh-ed25519,rsa-sha2-512,rsa-sha2-256
```

Apply with `sshd -t` (config test) before `systemctl reload ssh`. Use `reload`, NOT `restart`: reload re-reads the config without stopping the listener, while a restart that fails on a bad config leaves no sshd to connect to.

**Field-by-field rationale:**

- `PasswordAuthentication no`: passwords are the #1 SSH brute-force vector. Keys only.
- `PermitRootLogin prohibit-password`: root only via key, never password. `no` blocks even emergency cloud-console rescue paths on some providers; `prohibit-password` is the pragmatic middle. Note that `AllowUsers keiadmin` still refuses root over SSH until `root` is added to that list.
- `MaxAuthTries 3`: reduces per-connection key/password attempts; combine with fail2ban for per-IP bans (separate concern).
- `AllowUsers keiadmin`: a whitelist is simpler than group-based DENY and audits trivially. Adding users = explicit edit.
- `LogLevel VERBOSE`: logs the key fingerprint used; without it you can't tell which admin logged in after a compromise.
- `ClientAliveInterval 120` + `ClientAliveCountMax 2`: unresponsive connections are dropped after about 4 minutes (120 s × 2 missed probes), so lost laptops don't leave open shells. A connected client that still answers the probes is not logged out by these settings.
- `AllowTcpForwarding no` / `PermitTunnel no`: disables SSH-as-VPN. Enable per-use-case via `Match User tunneluser` only.

**Modern KEX/Cipher/MAC lists (2026-04-21):**

- KEX: `sntrup761x25519-sha512@openssh.com` is a post-quantum hybrid (default since OpenSSH 9.0) [VERIFIED https://www.openssh.com/releasenotes.html]; `curve25519-sha256` is the classic ECDH.
- Ciphers: AEAD only (`chacha20-poly1305`, `aes*-gcm`); CBC modes are dropped. Terrapin (CVE-2023-48795) affects `chacha20-poly1305` and CBC with ETM MACs unless both ends support strict KEX (OpenSSH 9.6+ or a distro backport); `aes*-gcm` is unaffected.
- MACs: ETM (Encrypt-Then-MAC) only. Legacy MAC-Then-Encrypt is dropped.
- HostKey: prefer `ssh-ed25519`; keep `rsa-sha2-*` for older client compatibility. Drop `ssh-rsa` (SHA-1, broken).

**Verification (KeiSeiKit primitive):**
`_primitives/_rust/ssh-check/` parses BOTH `sshd_config` AND every `sshd_config.d/*.conf` (in filename sort order; sshd keeps the first value it reads per directive, so the check has to resolve them the same way) and reports violations of the matrix above with `file:line` precision. Run BEFORE every `systemctl reload ssh` and BEFORE the skill phase-5 verify gate.
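
Independently of `ssh-check`, the values sshd will actually use can be spot-checked from `sshd -T`, which prints the resolved configuration as lowercase `keyword value` lines. A minimal sketch covering a handful of directives from the reference file; `check_sshd_effective` is a hypothetical helper, not part of KeiSeiKit.

```bash
# Hypothetical check: read `sshd -T` output on stdin and compare a few
# directives against the values expected by 99-kei.conf.
# Prints one line per mismatch or missing keyword; exit status 1 on any finding.
check_sshd_effective() {
  awk '
    BEGIN {
      bad = 0
      want["passwordauthentication"] = "no"
      want["kbdinteractiveauthentication"] = "no"
      want["maxauthtries"] = "3"
      want["allowtcpforwarding"] = "no"
      want["x11forwarding"] = "no"
    }
    ($1 in want) {
      seen[$1] = 1
      if ($2 != want[$1]) { print $1 " = " $2 " (want " want[$1] ")"; bad = 1 }
    }
    END {
      for (k in want) if (!(k in seen)) { print k " missing"; bad = 1 }
      exit bad
    }
  '
}
```

Run it as `sudo sshd -T | check_sshd_effective`. Because it looks at the resolved configuration, it also catches a lower-numbered drop-in that overrides the hardened values.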

**Forbidden:** editing `/etc/ssh/sshd_config` in-place when a drop-in directory exists; `PermitRootLogin yes`; `PasswordAuthentication yes`; accepting any `diffie-hellman-group1-*` / `ssh-rsa` / CBC ciphers; restarting sshd before `sshd -t` passes; relying on fail2ban alone without key-only auth.
68
_blocks/security-tls-caddy.md
Normal file
@@ -0,0 +1,68 @@
# SECURITY: TLS via Caddy (automatic ACME, HTTP-01 / DNS-01)

**Why Caddy:** zero-config TLS. Caddy 2 auto-provisions certificates via Let's Encrypt / ZeroSSL as soon as a site with a public hostname is loaded from its config (issuance at first request is the opt-in on-demand mode), auto-renews, and stores state under `/var/lib/caddy/`. Official docs: <https://caddyserver.com/docs/automatic-https> [VERIFIED 2026-04-21].

**One-liner install (Debian/Ubuntu, official repo):**

```
# Pinned to official Cloudsmith repo; NEVER `curl … | bash` a random domain.
sudo apt install -y debian-keyring debian-archive-keyring apt-transport-https curl
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/gpg.key' \
  | sudo gpg --dearmor -o /usr/share/keyrings/caddy-stable-archive-keyring.gpg
curl -1sLf 'https://dl.cloudsmith.io/public/caddy/stable/debian.deb.txt' \
  | sudo tee /etc/apt/sources.list.d/caddy-stable.list
sudo apt update && sudo apt install -y caddy
```

This installs the `caddy` systemd service owned by `caddy:caddy`. **Never run Caddy as root**: it uses the `CAP_NET_BIND_SERVICE` ambient capability to bind low ports.

**Minimal `/etc/caddy/Caddyfile`:**

```
{
    # Global options
    email admin@example.com          # ACME account contact (change!)
    # auto_https disable_redirects   # uncomment only if fronted by another TLS-terminating proxy
}

api.example.com {
    encode zstd gzip
    log {
        output file /var/log/caddy/api.log {
            roll_size 10mb
            roll_keep 10
        }
    }
    reverse_proxy 127.0.0.1:8080
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains"
        X-Content-Type-Options "nosniff"
        Referrer-Policy "strict-origin-when-cross-origin"
        -Server
    }
}
```

Run `caddy validate --config /etc/caddy/Caddyfile` BEFORE `systemctl reload caddy`. Reload ≠ restart; reload is zero-downtime.
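
The validate-then-reload order can be enforced with a small wrapper so that a broken Caddyfile never reaches the running service. A minimal sketch, assuming the packaged `caddy` systemd unit; `caddy_safe_reload` is a hypothetical name.

```bash
# Hypothetical wrapper: reload Caddy only if the config validates.
# $1 = config path (defaults to the packaged Caddyfile location).
caddy_safe_reload() (
  config="${1:-/etc/caddy/Caddyfile}"
  if ! caddy validate --config "$config"; then
    echo "invalid config, not reloading: $config" >&2
    exit 1
  fi
  systemctl reload caddy
)
```

Calling `sudo caddy_safe_reload` from a deploy script (or the equivalent inline commands) keeps the previous, known-good config serving traffic whenever validation fails.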

**ACME challenge choice:**

- **HTTP-01** (default): Caddy binds port 80, LE connects back, Caddy serves the challenge. Requires: port 80 open to the internet, DNS pointing to the VM. Works for single-host public services.
- **DNS-01**: Caddy writes a TXT record via the DNS provider API and doesn't need port 80 open. **Required for wildcard certs** (`*.example.com`) and for LAN-only hosts. Needs a DNS-provider plugin (e.g. `caddy-dns/cloudflare`) compiled into the binary; use `xcaddy build` or the Cloudsmith `caddy-dns-*` packages.

**DNS-01 with Cloudflare (`caddy-dns/cloudflare`):**

```
*.internal.example.com, internal.example.com {
    tls {
        dns cloudflare {env.CF_API_TOKEN}
    }
    reverse_proxy 127.0.0.1:8080
}
```

`CF_API_TOKEN`: store in `/etc/caddy/caddy.env` (chmod 0640, `caddy:caddy`), load via a systemd drop-in `EnvironmentFile=`. Never bake the token into the Caddyfile (RULE 0.8, see `domain-has-secrets.md`).
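
The env file and the systemd drop-in could be written as follows. This is a sketch under assumptions: the drop-in file name `env.conf` and the helper name `write_caddy_env` are made up for illustration, the token is taken from the caller's `CF_API_TOKEN` environment variable, and the optional root prefix exists only so the function can be dry-run into a scratch directory.

```bash
# Hypothetical helper: write /etc/caddy/caddy.env and a systemd drop-in that loads it.
# $1 = optional root prefix (empty on a real host, a scratch dir for dry runs).
# The token is read from the CF_API_TOKEN environment variable, never from a file in git.
write_caddy_env() (
  root="${1:-}"
  : "${CF_API_TOKEN:?CF_API_TOKEN must be set in the environment}"
  dropin_dir="$root/etc/systemd/system/caddy.service.d"
  mkdir -p "$root/etc/caddy" "$dropin_dir"
  printf '[Service]\nEnvironmentFile=/etc/caddy/caddy.env\n' > "$dropin_dir/env.conf"
  # Create the secret file with a tight umask, then relax it to the documented 0640.
  umask 077
  printf 'CF_API_TOKEN=%s\n' "$CF_API_TOKEN" > "$root/etc/caddy/caddy.env"
  chmod 0640 "$root/etc/caddy/caddy.env"
)
```

On a real host this would be followed by `chown caddy:caddy /etc/caddy/caddy.env`, `systemctl daemon-reload`, and a `systemctl restart caddy`, since a changed environment is only picked up when the service process starts again.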

**CT log awareness:** every LE cert is published to Certificate Transparency logs. **Any subdomain you cert is publicly searchable** via crt.sh. Use DNS-01 + wildcard for internal services whose names should not leak.

**Firewall interop (see `security-firewall-ufw.md`):** `ufw allow 80,443/tcp` is required for HTTP-01 and for public HTTPS. Do NOT open 80 if using DNS-01 exclusively and not redirecting HTTP→HTTPS publicly; skip the redirect with `auto_https disable_redirects`.

**Hardening:**

- `HSTS` as shown above: 1 year, include subdomains. Add `preload` only after submitting to the HSTS preload list.
- `-Server` header strip: removes the `Server: Caddy` banner.
- Rate limit via the `caddy-ratelimit` module (needs `xcaddy build` with the plugin) for per-IP throttling; otherwise rely on the cloud/ufw layer.

**Forbidden:** running Caddy as root; embedding DNS/ACME API tokens in the Caddyfile; using `tls internal` (Caddy's local CA, not trusted by outside clients) for anything reachable from outside localhost; skipping `caddy validate` before reload; self-hosting ACME (step-ca is great, but needs its own runbook and is out of scope here).