[$ xmrhost] _

$ man 7 threat-models

[$ ] Threat models for self-hosted privacy infrastructure

// NAME

threat-models — 7 named threat-model dossiers, one per workload mapped in /playbook. Each dossier names the adversaries, the assets, the controls that move the cost curve, and the anti-patterns that look like security but are not.

// METHOD

A threat model is a four-tuple: who is the adversary, what is the asset, what controls shift the cost of the attack, what anti-patterns are mistaken for security. Every dossier below is written in that shape. The dossiers are not exhaustive; they cover the operator-side reality of the workloads XMRHost hosts. They do not cover application-layer threat models (those live in the application's own documentation).

// CLUSTERS

tor-relay · matrix-homeserver · journalism · scraping · ai-inference · crypto-node · vpn

// TOR-RELAY

[$ ] Tor relay operator

// playbook: /playbook/tor-relay

// adversaries

  • AS-level passive observer (collects exit-stream metadata for traffic-correlation)
  • abuse-mail volume from automated scanners + opportunistic complainants
  • Sybil-coordinated relay deployment that aims to capture a disproportionate share of circuit-construction probability in the consensus
  • civil-court correspondence from rightsholders (exit relay only)

// assets

  • the relay's reputation in the consensus (Stable / Fast / Guard / HSDir flags)
  • operator anonymity (the relay's IP + ASN are public; the operator's identity should not be)
  • upstream provider relationship (a compromised provider relationship terminates the relay)

// controls (what actually moves the cost curve)

  • MyFamily declaration linking the operator's relays — clients will not pick two relays from the same family for one circuit
  • ContactInfo published for abuse routing; abuse-mail auto-responder set up before going live
  • ExitPolicy reduce-policy or stricter for exits; never run an exit on a provider whose AUP doesn't explicitly permit it
  • BandwidthRate / BandwidthBurst bounded so a misconfig doesn't saturate the upstream
  • guard-AS diversity check — the relay's guards should not all be in the operator's AS
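
Several of the controls above translate directly into torrc directives. A minimal non-exit sketch — the nickname, contact address, fingerprints, and bandwidth figures are all placeholders, not recommended values:

```
# torrc — illustrative fragment; every value here is a placeholder
Nickname        exampleRelay01
ContactInfo     abuse+tor@example.org        # abuse routing set up before go-live
MyFamily        $FINGERPRINT1,$FINGERPRINT2  # every relay this operator runs
ExitPolicy      reject *:*                   # non-exit; use a reduced policy only
                                             # where the provider AUP permits exits
BandwidthRate   10 MBytes                    # bound steady-state so a misconfig
BandwidthBurst  12 MBytes                    # cannot saturate the upstream
```

A relay missing ContactInfo or MyFamily still joins the consensus, which is why both failures are silent until an incident.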

// anti-patterns (look like security, are not)

  • running an exit on a 'we don't really care' provider — the AUP catches up the moment a complaint arrives
  • operating an exit AND a hidden service on the same physical host — adds a correlation surface
  • running with default ContactInfo unchanged — abuse routing fails silently

// MATRIX-HOMESERVER

[$ ] Matrix homeserver operator

// playbook: /playbook/forum

// adversaries

  • registration-spam economy (federated, automated; account-creation flood is the dominant load)
  • AS-level adversary observing federation traffic (the server-server API is TLS, but SNI and the federation graph are visible on the wire)
  • compromised peer homeserver pushing malicious events into rooms the operator's server is in
  • data-subpoena to the operator regarding a specific user's message history

// assets

  • user-account database (passwords, access tokens, device keys)
  • room state database (membership, room-version, power-level history)
  • media repository (photos, videos, files attached to messages)
  • federation peer list + delegated authority for those peers

// controls (what actually moves the cost curve)

  • registration-policy — registration disabled, invite-only, or token-gated; never open without captcha + rate limit
  • federation allowlist for sensitive deployments (matrix.xmrhost.io only federates with peers on a vetted list)
  • room-version pinning to the latest version that fixes known auth issues
  • access-token rotation on a documented cadence (Matrix access tokens are per-device, not per-room)
  • media-repository quota + admin-API moderation tools wired up before launch
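
On Synapse, the first three controls above are homeserver.yaml settings. An illustrative fragment — the peer domain is a placeholder and option names assume a recent Synapse release:

```yaml
# homeserver.yaml — illustrative fragment
enable_registration: false            # closed by default; invite or token-gate instead
registration_requires_token: true     # only honoured if registration is later enabled
federation_domain_whitelist:          # vetted-peer allowlist; omitting it federates openly
  - trusted-peer.example.org
rc_registration:                      # rate-limit account creation
  per_second: 0.1
  burst_count: 3
```

Note the default behaviour is the dangerous one: omit `federation_domain_whitelist` and the server federates with any peer that knocks.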

// anti-patterns (look like security, are not)

  • open registration without captcha or rate limit — the registration-spam economy exists; expect to be hit within hours
  • federating with the entire matrix.org graph by default on a private deployment — leaks event metadata to peers you have no relationship with
  • running Synapse with a SQLite backend at any non-trivial scale — operational failure mode, not a security failure mode, but degrades into one fast

// JOURNALISM

[$ ] Journalism / source-protection intake

// playbook: /playbook/journalism

// adversaries

  • state-level adversary attempting to identify a source from intake metadata
  • civil-litigation discovery against the publication (subpoena to the host)
  • physical seizure of the host (rare but the threat model that drives offshore choice)
  • intake-server compromise leading to source-side malware delivery

// assets

  • source identity (the threat model centres on this — failure means a source is identified)
  • submitted material (provenance + integrity from intake to newsroom)
  • the publication's reputation as an intake operator (a single failure ends the program)

// controls (what actually moves the cost curve)

  • intake on a Tor v3 hidden service — no clearnet IP for the source to leak
  • minimal logging — Tor logs at notice level only, no source-identifying fields anywhere
  • isolated airgap workflow for material handling — material exits the intake host on physical media
  • operator does not retain source IP / browser-fingerprint metadata — and proves this by publishing the auditable config
  • regular reproducible-deployment audits so a config drift is caught before an incident
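
The hidden-service and minimal-logging controls reduce to a few torrc directives. Paths are placeholders; the intake application is assumed to listen on loopback only:

```
# torrc — intake host, illustrative fragment
HiddenServiceDir  /var/lib/tor/intake/       # v3 onion service (the default in current Tor)
HiddenServicePort 80 127.0.0.1:8080          # intake app reachable only via the onion
Log notice file   /var/log/tor/notices.log   # notice level only
SafeLogging 1                                # scrub addresses from anything that is logged
```

Publishing this fragment verbatim is part of the control: the operator proves the no-metadata claim by making the config auditable.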

// anti-patterns (look like security, are not)

  • hosting intake on a clearnet domain reachable via DNS — defeats the point
  • running intake on the same host as anything else — adds correlation surface and increases the blast radius of a single misconfiguration
  • logging IP at any layer (web server, application, database) — a single noisy log line is a deanonymisation event
  • using the publication's main email address as the intake — bypasses every control

// SCRAPING

[$ ] Scraping infrastructure operator

// playbook: /playbook/scraping

// adversaries

  • target-site rate-limiting / CAPTCHA / IP-block cascade
  • civil cease-and-desist correspondence from target operators
  • cloud-provider AUP enforcement (the most common operational failure mode)
  • aggregated-IP-reputation services that flag IPs as 'datacenter' even when they aren't actively scraping

// assets

  • IP reputation (a single block on a single IP cascades to the rest of the cluster on shared-vendor reputation services)
  • scraping configuration / Playwright scripts (theft = competitor advantage)
  • scraped-data corpus (varies; threat model depends on whether the data is sensitive)

// controls (what actually moves the cost curve)

  • stable ASN — pick an offshore provider where the upstream IP block has not been used by a high-volume scraper before
  • deterministic per-target rate limit — never the maximum the target tolerates, always 50% of it
  • per-target user-agent rotation that resembles real-traffic mix, NOT randomised per request
  • respectful robots.txt handling — scraping a site that explicitly disallows you is not 'ethical scraping'
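
The deterministic-rate-limit and 429-handling controls can be sketched as a small per-target limiter. This is an illustrative Python sketch, not part of the playbook; the class name and API are invented for illustration:

```python
import time

class TargetRateLimiter:
    """Deterministic per-target limiter: run at half the rate the target
    is known to tolerate, and honour Retry-After fully on a 429."""

    def __init__(self, tolerated_rps: float, now=time.monotonic):
        self.min_interval = 1.0 / (tolerated_rps * 0.5)  # 50% of tolerance
        self.now = now
        self.next_allowed = 0.0

    def wait_time(self) -> float:
        """Seconds to sleep before the next request is permitted."""
        return max(0.0, self.next_allowed - self.now())

    def record_request(self) -> None:
        self.next_allowed = self.now() + self.min_interval

    def record_429(self, retry_after: float) -> None:
        """A 429 is a signal, not an obstacle: back off for the full window."""
        self.next_allowed = self.now() + retry_after
```

The point of the design is that the limit is fixed up front rather than discovered by probing: the limiter never ratchets toward the target's ceiling.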

// anti-patterns (look like security, are not)

  • running scraping infrastructure on a major cloud-vendor IP block — IP reputation is already pre-burned
  • ignoring target rate-limit signals (429 / Retry-After) and just retrying — rate-limit infrastructure interprets this as adversarial
  • rotating IPs within the same /24 to dodge a per-IP block — block-list infrastructure aggregates at /24 and /16

// AI-INFERENCE

[$ ] AI inference offshore operator

// playbook: /playbook/ai-inference

// adversaries

  • US export-control regime (BIS rules apply to certain AI model deployments)
  • tenant-side credential leak (vLLM API key compromise = unbounded inference cost)
  • model-licence enforcement (some open-weight models have non-commercial clauses)
  • abuse routing — the inference endpoint is a CSAM / spam / phishing-content surface if exposed

// assets

  • model weights (storage cost; some are large enough to be hard to redownload)
  • tenant API keys + tenant data flowing through inference
  • GPU compute hours (consumption, billing fraud)

// controls (what actually moves the cost curve)

  • tenant-side API-key authentication on every endpoint — never an open inference port
  • rate limit + token-budget per tenant — bound the cost of a leaked key
  • explicit model-licence audit before serving a given model commercially
  • egress filtering on the inference host — no inference output should be making outbound HTTP requests
  • abuse-report mailbox routed correctly; expect at least one report per quarter on a non-trivial deployment
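
The per-tenant token-budget control can be sketched as a simple admission check that bounds the cost of a leaked key. Illustrative Python; the class name and window semantics are assumptions, not a prescribed implementation:

```python
from collections import defaultdict

class TenantTokenBudget:
    """Per-tenant token budget: a leaked key can burn at most the tenant's
    remaining budget for the window, never unbounded GPU time."""

    def __init__(self, budget_per_window: int):
        self.budget = budget_per_window
        self.used = defaultdict(int)

    def try_consume(self, tenant: str, tokens: int) -> bool:
        """Admit the request only if it fits the tenant's remaining budget."""
        if self.used[tenant] + tokens > self.budget:
            return False              # reject; the caller returns HTTP 429
        self.used[tenant] += tokens
        return True

    def reset_window(self) -> None:
        self.used.clear()             # called by a scheduler each billing window
```

The check runs before inference, not after: a budget enforced post-hoc has already spent the GPU hours it was meant to protect.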

// anti-patterns (look like security, are not)

  • exposing the vLLM / Ollama HTTP server on a public IP without auth — the bot-scrape economy will find it within hours
  • ignoring model-licence terms because 'it's open source' — many open-weight models are NOT open-source-licensed
  • hosting models trained on proprietary data without a licence — attackable on multiple fronts

// CRYPTO-NODE

[$ ] Crypto node operator (BTC / ETH / XMR / Lightning / staking)

// playbook: /playbook/crypto-node

// adversaries

  • wallet-key theft (attacker pivots from host compromise to wallet drain)
  • validator-slashing (PoS networks: misconfiguration loses staked funds)
  • AS-level intercept of unencrypted RPC traffic
  • regulatory designation of the host's IP block (rare; relevant for sanctions-list interactions)

// assets

  • wallet private keys (only for nodes with attached spend authority — most deployments keep wallets watch-only or sign offline)
  • stake collateral (PoS validators, Lokinet service nodes)
  • node-side metadata (mempool, recent broadcasts) — privacy-relevant for some workloads

// controls (what actually moves the cost curve)

  • wallet keys NEVER on the same host as the public-facing node, period — air-gap or hardware-token signing
  • RPC bound to localhost or to a Tor hidden service, never the public IP
  • validator-key custody on a hardware token; signing-only keys for the operating host
  • monitor the node for slashing-eligible behaviour at the protocol level; do not rely on host-level monitoring alone
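
For a Bitcoin-style node, the RPC-binding control looks roughly like the fragment below — ports and paths are illustrative, and the onion-service half is optional (for operators who need remote RPC without exposing the public IP):

```
# bitcoin.conf — illustrative fragment
server=1
rpcbind=127.0.0.1           # RPC never listens on the public interface
rpcallowip=127.0.0.1

# torrc — optional: remote RPC reachable only as an onion service
HiddenServiceDir  /var/lib/tor/bitcoin-rpc/
HiddenServicePort 8332 127.0.0.1:8332
```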

// anti-patterns (look like security, are not)

  • co-locating spend keys on a public-internet-reachable host because 'it's faster' — single security failure = total wallet drain
  • exposing RPC on the public IP because the wallet UI 'needs' it — the wallet UI does not need it
  • running validators on a single host without a clear failover plan — the slashing condition you fail will be the one you didn't plan for

// VPN

[$ ] Self-hosted personal VPN operator

// playbook: /playbook/vpn

// adversaries

  • AS-level adversary correlating VPN traffic timing with destination traffic timing
  • civil-discovery process to the VPN host (the same data-subpoena threat model as any other host)
  • exit-IP block-list — the VPN's egress IP gets recognised as a VPN by sites that block VPNs

// assets

  • VPN configuration + client keys (one client = one key pair; rotation matters)
  • the operator's anonymity (running a VPN is not itself anonymising)
  • no traffic logs to seize (this is the design property)

// controls (what actually moves the cost curve)

  • WireGuard or OpenVPN with per-client keypair — never a shared static-key model
  • no traffic logging at any layer (host firewall, kernel, application) — provable by published config
  • kill-switch on client side (if the tunnel drops, the client does not fall back to clearnet) — this is a client-side control, not a server-side one
  • egress IP rotation if the destination-block-list problem materialises
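
The per-client-keypair control maps onto one [Peer] block per client in a WireGuard config. Keys and addresses below are placeholders:

```
# /etc/wireguard/wg0.conf — illustrative; keys and addresses are placeholders
[Interface]
PrivateKey = <server-private-key>
Address    = 10.8.0.1/24
ListenPort = 51820

[Peer]                        # one [Peer] block per client keypair
PublicKey  = <client1-public-key>
AllowedIPs = 10.8.0.2/32      # pin each client to a single tunnel address

[Peer]
PublicKey  = <client2-public-key>
AllowedIPs = 10.8.0.3/32
```

Rotating a client means replacing exactly one [Peer] block — which is why a shared static-key model fails: revoking one client there means rekeying everybody.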

// anti-patterns (look like security, are not)

  • running a personal VPN as 'a VPN service' for friends — increases the data-subpoena surface materially
  • logging connection metadata 'for performance debugging' — the moment you log, you are subject to subpoena for the log
  • running the VPN exit on a residential connection — most ISPs will terminate the line for ToS violation

// SEE ALSO

$ ls /usr/share/doc/xmrhost

  • /playbook — workload-targeted manpages with plan recommendations.
  • /tor — Tor pillar (definitional + plan-side).
  • /hardening — hardening pillar (kernel + sshd + auditd).
  • /why-monero — payment-rail threat model.
  • /glossary — definitions used above.