$ man 7 threat-models
[$ ] Threat models for self-hosted privacy infrastructure
// NAME
threat-models — 7 named threat-model dossiers, one per workload mapped in /playbook. Each dossier names the adversaries, the assets, the controls that move the cost curve, and the anti-patterns that look like security but are not.
// METHOD
A threat model is a four-tuple: who is the adversary, what is the asset, what controls shift the cost of the attack, what anti-patterns are mistaken for security. Every dossier below is written in that shape. The dossiers are not exhaustive; they cover the operator-side reality of the workloads XMRHost hosts. They do not cover application-layer threat models (those live in the application's own documentation).
// CLUSTERS
tor-relay · matrix-homeserver · journalism · scraping · ai-inference · crypto-node · vpn
// TOR-RELAY
[$ ] Tor relay operator
// playbook: /playbook/tor-relay
// adversaries
- AS-level passive observer (collects exit-stream metadata for traffic-correlation)
- abuse-mail volume from automated scanners + opportunistic complainants
- Sybil-coordinated relay deployment that aims to dominate the operator's circuit-construction probability
- civil-court correspondence from rightsholders (exit relay only)
// assets
- the relay's reputation in the consensus (Stable / Fast / Guard / HSDir flags)
- operator anonymity (the relay's IP + ASN are public; the operator's identity should not be)
- upstream provider relationship (a compromised provider relationship terminates the relay)
// controls (what actually moves the cost curve)
- MyFamily declaration linking the operator's relays — clients then avoid placing two of them in the same circuit
- ContactInfo published for abuse routing; abuse-mail auto-responder set up before going live
- ExitPolicy reduce-policy or stricter for exits; never run an exit on a provider whose AUP doesn't explicitly permit it
- BandwidthRate / BandwidthBurst bounded so a misconfig doesn't saturate the upstream
- guard-AS diversity check — the relay's guards should not all be in the operator's AS
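The controls above map onto a handful of torrc directives. A minimal sketch — the nickname, contact address, fingerprints, and rates are placeholders, not recommended values:

```
# torrc sketch for a relay; every value below is a placeholder
Nickname      exampleRelay
ContactInfo   abuse@example.org            # abuse routing set up BEFORE go-live
MyFamily      $FINGERPRINT1,$FINGERPRINT2  # declare sibling relays
BandwidthRate  10 MBytes                   # bound steady-state throughput
BandwidthBurst 20 MBytes                   # a misconfig cannot saturate the upstream
ExitRelay     0                            # flip to 1 only where the AUP permits exits
ExitPolicy    reject *:*                   # replace with a reduced policy when exiting
```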
// anti-patterns (look like security, are not)
- running an exit on a 'we don't really care' provider — the AUP catches up the moment a complaint arrives
- operating an exit AND a hidden service on the same physical host — adds a correlation surface
- running with default ContactInfo unchanged — abuse routing fails silently
// MATRIX-HOMESERVER
[$ ] Matrix homeserver operator
// playbook: /playbook/forum
// adversaries
- registration-spam economy (federated, automated; account-creation flood is the dominant load)
- AS-level adversary observing federation traffic (the server-server API runs over TLS, but the SNI is plaintext and the federation graph is not hidden)
- compromised peer homeserver pushing malicious events into rooms the operator's server is in
- data-subpoena to the operator regarding a specific user's message history
// assets
- user-account database (passwords, access tokens, device keys)
- room state database (membership, room-version, power-level history)
- media repository (photos, videos, files attached to messages)
- federation peer list + delegated authority for those peers
// controls (what actually moves the cost curve)
- registration-policy — registration disabled, invite-only, or token-gated; never open without captcha + rate limit
- federation allowlist for sensitive deployments (matrix.xmrhost.io only federates with peers on a vetted list)
- room-version pinning to the latest version that fixes known auth issues
- per-room access-token rotation on a documented cadence
- media-repository quota + admin-API moderation tools wired up before launch
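For Synapse, the registration and federation controls above are a few homeserver.yaml keys. A sketch assuming a Synapse deployment; the domain and the limit are placeholders:

```yaml
# homeserver.yaml sketch (Synapse); domain and quota are placeholders
enable_registration: false         # closed by default
registration_requires_token: true  # token-gated if ever opened
federation_domain_whitelist:       # vetted federation allowlist
  - trusted-peer.example
max_upload_size: 50M               # media-repository per-upload quota
```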
// anti-patterns (look like security, are not)
- open registration without captcha or rate limit — the registration-spam economy exists; expect to be hit within hours
- federating with the entire matrix.org graph by default on a private deployment — leaks event metadata to peers you have no relationship with
- running Synapse with a SQLite backend at any non-trivial scale — operational failure mode, not a security failure mode, but degrades into one fast
// JOURNALISM
[$ ] Journalism / source-protection intake
// playbook: /playbook/journalism
// adversaries
- state-level adversary attempting to identify a source from intake metadata
- civil-litigation discovery against the publication (subpoena to the host)
- physical seizure of the host (rare but the threat model that drives offshore choice)
- intake-server compromise leading to source-side malware delivery
// assets
- source identity (the threat model centres on this — failure means a source is identified)
- submitted material (provenance + integrity from intake to newsroom)
- the publication's reputation as an intake operator (a single failure ends the program)
// controls (what actually moves the cost curve)
- intake on a Tor v3 hidden service — no clearnet IP for the source to leak
- minimal logging — Tor logs at notice level only, no source-identifying fields anywhere
- isolated air-gapped workflow for material handling — material exits the intake host on physical media only
- operator does not retain source IP / browser-fingerprint metadata — and proves this by publishing the auditable config
- regular reproducible-deployment audits so a config drift is caught before an incident
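The hidden-service and minimal-logging controls reduce to a short torrc plus a silenced web layer. A sketch; the paths and port are placeholders:

```
# torrc sketch: intake reachable only as a v3 onion service
HiddenServiceDir  /var/lib/tor/intake/
HiddenServicePort 80 127.0.0.1:8080   # backend bound to loopback only
Log notice stdout                     # notice level; nothing source-identifying

# web layer (nginx) carries no access log at all:
#   access_log off;
#   error_log  /dev/null crit;
```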
// anti-patterns (look like security, are not)
- hosting intake on a clearnet domain reachable via DNS — defeats the point
- running intake on the same host as anything else — adds correlation surface and increases the blast radius of a single misconfiguration
- logging IP at any layer (web server, application, database) — a single noisy log line is a deanonymisation event
- using the publication's main email address as the intake — bypasses every control
// SCRAPING
[$ ] Scraping infrastructure operator
// playbook: /playbook/scraping
// adversaries
- target-site rate-limiting / CAPTCHA / IP-block cascade
- civil cease-and-desist correspondence from target operators
- cloud-provider AUP enforcement (the most common operational failure mode)
- aggregated-IP-reputation services that tag IPs as 'datacenter' even when they aren't actively scraping
// assets
- IP reputation (a single block on a single IP cascades to the rest of the cluster on shared-vendor reputation services)
- scraping configuration / playwright scripts (theft = competitor advantage)
- scraped-data corpus (varies; threat model depends on whether the data is sensitive)
// controls (what actually moves the cost curve)
- stable ASN — pick an offshore provider where the upstream IP block has not been used by a high-volume scraper before
- deterministic per-target rate limit — never the maximum the target tolerates, always 50% of it
- per-target user-agent rotation that resembles real-traffic mix, NOT randomised per request
- respectful robots.txt handling — scraping a site that explicitly disallows you is not 'ethical scraping'
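The deterministic 50%-of-tolerance rate limit and the Retry-After handling can be sketched in a few lines. Illustrative Python; the class and its names are invented for this example, not from any scraping library:

```python
class TargetRateLimiter:
    """Per-target limiter: run at half the rate the target is known to
    tolerate, and let a 429's Retry-After override the schedule.
    Illustrative sketch; not from any real library."""

    def __init__(self, tolerated_rps: float):
        # deterministic interval: 50% of the tolerated rate, never the maximum
        self.interval = 1.0 / (tolerated_rps * 0.5)
        self.next_slot = 0.0  # earliest permitted send time (monotonic seconds)

    def wait_time(self, now: float) -> float:
        """Seconds to sleep before the next request is allowed."""
        delay = max(0.0, self.next_slot - now)
        self.next_slot = max(now, self.next_slot) + self.interval
        return delay

    def backoff(self, retry_after: float, now: float) -> None:
        """A 429 with Retry-After is a signal, not an obstacle: honour it."""
        self.next_slot = max(self.next_slot, now + retry_after)
```

Retrying straight through a 429 is the anti-pattern below; this sketch makes the Retry-After value authoritative over the local schedule.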
// anti-patterns (look like security, are not)
- running scraping infrastructure on a major cloud-vendor IP block — IP reputation is already pre-burned
- ignoring target rate-limit signals (429 / Retry-After) and just retrying — rate-limit infrastructure interprets this as adversarial
- rotating IPs within the same /24 to dodge a per-IP block — block-list infrastructure aggregates at /24 and /16
// AI-INFERENCE
[$ ] AI inference offshore operator
// playbook: /playbook/ai-inference
// adversaries
- US export-control regime (BIS rules apply to certain AI model deployments)
- tenant-side credential leak (vLLM API key compromise = unbounded inference cost)
- model-licence enforcement (some open-weight models have non-commercial clauses)
- abuse routing — the inference endpoint is a CSAM / spam / phishing-content surface if exposed
// assets
- model weights (storage cost; some are large enough to be hard to redownload)
- tenant API keys + tenant data flowing through inference
- GPU compute hours (consumption, billing fraud)
// controls (what actually moves the cost curve)
- tenant-side API-key authentication on every endpoint — never an open inference port
- rate limit + token-budget per tenant — bound the cost of a leaked key
- explicit model-licence audit before serving a given model commercially
- egress filtering on the inference host — no inference output should be making outbound HTTP requests
- abuse-report mailbox routed correctly; expect at least one report per quarter on a non-trivial deployment
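The per-tenant token budget is a small piece of gateway logic sitting in front of the inference server. An illustrative Python sketch — the class and its names are invented for this example, not part of vLLM or any real gateway:

```python
from collections import defaultdict

class TokenBudget:
    """Per-tenant token budget: bounds the cost of a leaked API key.
    Illustrative sketch; names are not from any real inference gateway."""

    def __init__(self, daily_limit: int):
        self.daily_limit = daily_limit
        self.used = defaultdict(int)  # tenant id -> tokens spent today

    def try_spend(self, tenant: str, tokens: int) -> bool:
        """Admit the request only if it fits the tenant's remaining budget."""
        if self.used[tenant] + tokens > self.daily_limit:
            return False  # leaked-key damage is capped at daily_limit
        self.used[tenant] += tokens
        return True
```

With this in place a compromised key burns at most one day's budget before the reset, instead of unbounded GPU hours.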
// anti-patterns (look like security, are not)
- exposing the vLLM / Ollama HTTP server on a public IP without auth — the bot-scrape economy will find it within hours
- ignoring model-licence terms because 'it's open source' — many open-weight models are NOT open-source-licensed
- hosting models trained on proprietary data without a licence — attackable on multiple fronts
// CRYPTO-NODE
[$ ] Crypto node operator (BTC / ETH / XMR / Lightning / staking)
// playbook: /playbook/crypto-node
// adversaries
- wallet-key theft (attacker pivots from host compromise to wallet drain)
- validator-slashing (PoS networks: misconfiguration loses staked funds)
- AS-level intercept of unencrypted RPC traffic
- regulatory designation of the host's IP block (rare; relevant for sanctions-list interactions)
// assets
- wallet private keys (for nodes with attached spend authority — most run wallets read-only or signing-offline)
- stake collateral (PoS validators, Lokinet service nodes)
- node-side metadata (mempool, recent broadcasts) — privacy-relevant for some workloads
// controls (what actually moves the cost curve)
- wallet keys NEVER on the same host as the public-facing node, period — air-gap or hardware-token signing
- RPC bound to localhost or to a Tor hidden service, never the public IP
- validator-key custody on a hardware token; signing-only keys for the operating host
- monitor the node for slashing-eligible behaviour at the protocol level; do not rely on host-level monitoring alone
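For a Bitcoin Core node, the RPC-binding and no-spend-keys controls above are a few bitcoin.conf lines. A sketch; values are illustrative, not recommendations:

```
# bitcoin.conf sketch; values are illustrative
rpcbind=127.0.0.1       # RPC never bound to the public IP
rpcallowip=127.0.0.1
disablewallet=1         # no spend authority on the public-facing node
proxy=127.0.0.1:9050    # outbound peer traffic via Tor
listenonion=1           # inbound reachable as an onion service
```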
// anti-patterns (look like security, are not)
- co-locating spend keys on a public-internet-reachable host because 'it's faster' — single security failure = total wallet drain
- exposing RPC on the public IP because the wallet UI 'needs' it — the wallet UI does not need it
- running validators on a single host without a clear failover plan — the slashing condition you fail will be the one you didn't plan for
// VPN
[$ ] Self-hosted personal VPN operator
// playbook: /playbook/vpn
// adversaries
- AS-level adversary correlating VPN traffic timing with destination traffic timing
- civil-discovery process to the VPN host (the same data-subpoena threat model as any other host)
- exit-IP block-list — the VPN's egress IP gets recognised as a VPN by sites that block VPNs
// assets
- VPN configuration + client keys (one client = one key pair; rotation matters)
- the operator's anonymity (running a VPN is not itself anonymising)
- no traffic logs to seize (this is the design property)
// controls (what actually moves the cost curve)
- WireGuard or OpenVPN with per-client keypair — never a shared static-key model
- no traffic logging at any layer (host firewall, kernel, application) — provable by published config
- kill-switch on client side (if the tunnel drops, the client does not fall back to clearnet) — this is a client-side control, not a server-side one
- egress IP rotation if the destination-block-list problem materialises
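The per-client-keypair model is visible directly in the WireGuard config: one [Peer] block per client, revoked by deleting the block. A sketch with placeholder keys and addresses:

```
# /etc/wireguard/wg0.conf sketch; keys and addresses are placeholders
[Interface]
PrivateKey = <server-private-key>
Address    = 10.0.0.1/24
ListenPort = 51820

# one [Peer] per client — one client, one keypair; rotate by replacing it
[Peer]
PublicKey  = <client-a-public-key>
AllowedIPs = 10.0.0.2/32
```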
// anti-patterns (look like security, are not)
- running a personal VPN as 'a VPN service' for friends — increases the data-subpoena surface materially
- logging connection metadata 'for performance debugging' — the moment you log, you are subject to subpoena for the log
- running the VPN exit on a residential connection — most ISPs will terminate the line for ToS violation
// SEE ALSO
$ ls /usr/share/doc/xmrhost
- /playbook — workload-targeted manpages with plan recommendations.
- /tor — Tor pillar (definitional + plan-side).
- /hardening — hardening pillar (kernel + sshd + auditd).
- /why-monero — payment-rail threat model.
- /glossary — definitions used above.