[$ xmrhost] _

$ man 7 threat-models

[$ ] Threat models for self-hosted privacy infrastructure

// NAME

threat-models — 7 named threat-model dossiers, one per workload mapped in /playbook. Each dossier names the adversaries, the assets, the controls that move the cost curve, and the anti-patterns that look like security but are not.

// METHOD

A threat model is a four-tuple: who is the adversary, what is the asset, what controls shift the cost of the attack, what anti-patterns are mistaken for security. Every dossier below is written in that shape. The dossiers are not exhaustive; they cover the operator-side reality of the workloads XMRHost hosts. They do not cover application-layer threat models (those live in the application's own documentation).

// CLUSTERS

tor-relay · matrix-homeserver · journalism · scraping · ai-inference · crypto-node · vpn

// TOR-RELAY

[$ ] Tor relay operator

// playbook: /playbook/tor-relay

// adversaries

  • AS-level passive observer (collects exit-stream metadata for traffic-correlation)
  • abuse-mail volume from automated scanners + opportunistic complainants
  • Sybil-coordinated relay deployment that aims to capture a disproportionate share of circuit-construction probability in the consensus
  • civil-court correspondence from rightsholders (exit relay only)

// assets

  • the relay's reputation in the consensus (Stable / Fast / Guard / HSDir flags)
  • operator anonymity (the relay's IP + ASN are public; the operator's identity should not be)
  • upstream provider relationship (a compromised provider relationship terminates the relay)

// controls (what actually moves the cost curve)

  • MyFamily declaration linking the operator's relays — clients will not pick two relays from the same family for one circuit
  • ContactInfo published for abuse routing; abuse-mail auto-responder set up before going live
  • ExitPolicy reduce-policy or stricter for exits; never run an exit on a provider whose AUP doesn't explicitly permit it
  • BandwidthRate / BandwidthBurst bounded so a misconfig doesn't saturate the upstream
  • guard-AS diversity check — the relay's guards should not all be in the operator's AS
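
Several of the controls above translate directly into torrc directives. A minimal non-exit sketch — the nickname, contact address, fingerprints, and bandwidth figures are all placeholders, not recommended values:

```
# torrc — illustrative fragment; every value here is a placeholder
Nickname        exampleRelay01
ContactInfo     abuse+tor@example.org        # abuse routing set up before go-live
MyFamily        $FINGERPRINT1,$FINGERPRINT2  # every relay this operator runs
ExitPolicy      reject *:*                   # non-exit; use a reduced policy only
                                             # where the provider AUP permits exits
BandwidthRate   10 MBytes                    # bound steady-state so a misconfig
BandwidthBurst  12 MBytes                    # cannot saturate the upstream
```

A relay missing ContactInfo or MyFamily still joins the consensus, which is why both failures are silent until an incident.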

// anti-patterns (look like security, are not)

  • running an exit on a 'we don't really care' provider — the AUP catches up the moment a complaint arrives
  • operating an exit AND a hidden service on the same physical host — adds a correlation surface
  • running with default ContactInfo unchanged — abuse routing fails silently

// MATRIX-HOMESERVER

[$ ] Matrix homeserver operator

// playbook: /playbook/forum

// adversaries

  • registration-spam economy (federated, automated; account-creation flood is the dominant load)
  • AS-level adversary observing federation traffic (the server-server API is TLS, but SNI and the federation graph are visible on the wire)
  • compromised peer homeserver pushing malicious events into rooms the operator's server is in
  • data-subpoena to the operator regarding a specific user's message history

// assets

  • user-account database (passwords, access tokens, device keys)
  • room state database (membership, room-version, power-level history)
  • media repository (photos, videos, files attached to messages)
  • federation peer list + delegated authority for those peers

// controls (what actually moves the cost curve)

  • registration-policy — registration disabled, invite-only, or token-gated; never open without captcha + rate limit
  • federation allowlist for sensitive deployments (matrix.xmrhost.io only federates with peers on a vetted list)
  • room-version pinning to the latest version that fixes known auth issues
  • access-token rotation on a documented cadence (Matrix access tokens are per-device, not per-room)
  • media-repository quota + admin-API moderation tools wired up before launch
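
On Synapse, the first three controls above are homeserver.yaml settings. An illustrative fragment — the peer domain is a placeholder and option names assume a recent Synapse release:

```yaml
# homeserver.yaml — illustrative fragment
enable_registration: false            # closed by default; invite or token-gate instead
registration_requires_token: true     # only honoured if registration is later enabled
federation_domain_whitelist:          # vetted-peer allowlist; omitting it federates openly
  - trusted-peer.example.org
rc_registration:                      # rate-limit account creation
  per_second: 0.1
  burst_count: 3
```

Note the default behaviour is the dangerous one: omit `federation_domain_whitelist` and the server federates with any peer that knocks.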

// anti-patterns (look like security, are not)

  • open registration without captcha or rate limit — the registration-spam economy exists; expect to be hit within hours
  • federating with the entire matrix.org graph by default on a private deployment — leaks event metadata to peers you have no relationship with
  • running Synapse with a SQLite backend at any non-trivial scale — operational failure mode, not a security failure mode, but degrades into one fast

// JOURNALISM

[$ ] Journalism / source-protection intake

// playbook: /playbook/journalism

// adversaries

  • state-level adversary attempting to identify a source from intake metadata
  • civil-litigation discovery against the publication (subpoena to the host)
  • physical seizure of the host (rare but the threat model that drives offshore choice)
  • intake-server compromise leading to source-side malware delivery

// assets

  • source identity (the threat model centres on this — failure means a source is identified)
  • submitted material (provenance + integrity from intake to newsroom)
  • the publication's reputation as an intake operator (a single failure ends the program)

// controls (what actually moves the cost curve)

  • intake on a Tor v3 hidden service — no clearnet IP for the source to leak
  • minimal logging — Tor logs at notice level only, no source-identifying fields anywhere
  • isolated airgap workflow for material handling — material exits the intake host on physical media
  • operator does not retain source IP / browser-fingerprint metadata — and proves this by publishing the auditable config
  • regular reproducible-deployment audits so a config drift is caught before an incident
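
The hidden-service and minimal-logging controls reduce to a few torrc directives. Paths are placeholders; the intake application is assumed to listen on loopback only:

```
# torrc — intake host, illustrative fragment
HiddenServiceDir  /var/lib/tor/intake/       # v3 onion service (the default in current Tor)
HiddenServicePort 80 127.0.0.1:8080          # intake app reachable only via the onion
Log notice file   /var/log/tor/notices.log   # notice level only
SafeLogging 1                                # scrub addresses from anything that is logged
```

Publishing this fragment verbatim is part of the control: the operator proves the no-metadata claim by making the config auditable.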

// anti-patterns (look like security, are not)

  • hosting intake on a clearnet domain reachable via DNS — defeats the point
  • running intake on the same host as anything else — adds correlation surface and increases the blast radius of a single misconfiguration
  • logging IP at any layer (web server, application, database) — a single noisy log line is a deanonymisation event
  • using the publication's main email address as the intake — bypasses every control

// SCRAPING

[$ ] Scraping infrastructure operator

// playbook: /playbook/scraping

// adversaries

  • target-site rate-limiting / CAPTCHA / IP-block cascade
  • civil cease-and-desist correspondence from target operators
  • cloud-provider AUP enforcement (the most common operational failure mode)
  • aggregated-IP-reputation services that flag IPs as 'datacenter' even when they aren't actively scraping

// assets

  • IP reputation (a single block on a single IP cascades to the rest of the cluster on shared-vendor reputation services)
  • scraping configuration / Playwright scripts (theft = competitor advantage)
  • scraped-data corpus (varies; threat model depends on whether the data is sensitive)

// controls (what actually moves the cost curve)

  • stable ASN — pick an offshore provider where the upstream IP block has not been used by a high-volume scraper before
  • deterministic per-target rate limit — never the maximum the target tolerates, always 50% of it
  • per-target user-agent rotation that resembles real-traffic mix, NOT randomised per request
  • respectful robots.txt handling — scraping a site that explicitly disallows you is not 'ethical scraping'
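
The deterministic-rate-limit and 429-handling controls can be sketched as a small per-target limiter. This is an illustrative Python sketch, not part of the playbook; the class name and API are invented for illustration:

```python
import time

class TargetRateLimiter:
    """Deterministic per-target limiter: run at half the rate the target
    is known to tolerate, and honour Retry-After fully on a 429."""

    def __init__(self, tolerated_rps: float, now=time.monotonic):
        self.min_interval = 1.0 / (tolerated_rps * 0.5)  # 50% of tolerance
        self.now = now
        self.next_allowed = 0.0

    def wait_time(self) -> float:
        """Seconds to sleep before the next request is permitted."""
        return max(0.0, self.next_allowed - self.now())

    def record_request(self) -> None:
        self.next_allowed = self.now() + self.min_interval

    def record_429(self, retry_after: float) -> None:
        """A 429 is a signal, not an obstacle: back off for the full window."""
        self.next_allowed = self.now() + retry_after
```

The point of the design is that the limit is fixed up front rather than discovered by probing: the limiter never ratchets toward the target's ceiling.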

// anti-patterns (look like security, are not)

  • running scraping infrastructure on a major cloud-vendor IP block — IP reputation is already pre-burned
  • ignoring target rate-limit signals (429 / Retry-After) and just retrying — rate-limit infrastructure interprets this as adversarial
  • rotating IPs within the same /24 to dodge a per-IP block — block-list infrastructure aggregates at /24 and /16

// AI-INFERENCE

[$ ] AI inference offshore operator

// playbook: /playbook/ai-inference

// adversaries

  • US export-control regime (BIS rules apply to certain AI model deployments)
  • tenant-side credential leak (vLLM API key compromise = unbounded inference cost)
  • model-licence enforcement (some open-weight models have non-commercial clauses)
  • abuse routing — the inference endpoint is a CSAM / spam / phishing-content surface if exposed

// assets

  • model weights (storage cost; some are large enough to be hard to redownload)
  • tenant API keys + tenant data flowing through inference
  • GPU compute hours (consumption, billing fraud)

// controls (what actually moves the cost curve)

  • tenant-side API-key authentication on every endpoint — never an open inference port
  • rate limit + token-budget per tenant — bound the cost of a leaked key
  • explicit model-licence audit before serving a given model commercially
  • egress filtering on the inference host — no inference output should be making outbound HTTP requests
  • abuse-report mailbox routed correctly; expect at least one report per quarter on a non-trivial deployment
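
The per-tenant token-budget control can be sketched as a simple admission check that bounds the cost of a leaked key. Illustrative Python; the class name and window semantics are assumptions, not a prescribed implementation:

```python
from collections import defaultdict

class TenantTokenBudget:
    """Per-tenant token budget: a leaked key can burn at most the tenant's
    remaining budget for the window, never unbounded GPU time."""

    def __init__(self, budget_per_window: int):
        self.budget = budget_per_window
        self.used = defaultdict(int)

    def try_consume(self, tenant: str, tokens: int) -> bool:
        """Admit the request only if it fits the tenant's remaining budget."""
        if self.used[tenant] + tokens > self.budget:
            return False              # reject; the caller returns HTTP 429
        self.used[tenant] += tokens
        return True

    def reset_window(self) -> None:
        self.used.clear()             # called by a scheduler each billing window
```

The check runs before inference, not after: a budget enforced post-hoc has already spent the GPU hours it was meant to protect.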

// anti-patterns (look like security, are not)

  • exposing the vLLM / Ollama HTTP server on a public IP without auth — the bot-scrape economy will find it within hours
  • ignoring model-licence terms because 'it's open source' — many open-weight models are NOT open-source-licensed
  • hosting models trained on proprietary data without a licence — attackable on multiple fronts

// CRYPTO-NODE

[$ ] Crypto node operator (BTC / ETH / XMR / Lightning / staking)

// playbook: /playbook/crypto-node

// adversaries

  • wallet-key theft (attacker pivots from host compromise to wallet drain)
  • validator-slashing (PoS networks: misconfiguration loses staked funds)
  • AS-level intercept of unencrypted RPC traffic
  • regulatory designation of the host's IP block (rare; relevant for sanctions-list interactions)

// assets

  • wallet private keys (only for nodes with attached spend authority — most deployments keep wallets watch-only or sign offline)
  • stake collateral (PoS validators, Lokinet service nodes)
  • node-side metadata (mempool, recent broadcasts) — privacy-relevant for some workloads

// controls (what actually moves the cost curve)

  • wallet keys NEVER on the same host as the public-facing node, period — air-gap or hardware-token signing
  • RPC bound to localhost or to a Tor hidden service, never the public IP
  • validator-key custody on a hardware token; signing-only keys for the operating host
  • monitor the node for slashing-eligible behaviour at the protocol level; do not rely on host-level monitoring alone
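
For a Bitcoin-style node, the RPC-binding control looks roughly like the fragment below — ports and paths are illustrative, and the onion-service half is optional (for operators who need remote RPC without exposing the public IP):

```
# bitcoin.conf — illustrative fragment
server=1
rpcbind=127.0.0.1           # RPC never listens on the public interface
rpcallowip=127.0.0.1

# torrc — optional: remote RPC reachable only as an onion service
HiddenServiceDir  /var/lib/tor/bitcoin-rpc/
HiddenServicePort 8332 127.0.0.1:8332
```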

// anti-patterns (look like security, are not)

  • co-locating spend keys on a public-internet-reachable host because 'it's faster' — single security failure = total wallet drain
  • exposing RPC on the public IP because the wallet UI 'needs' it — the wallet UI does not need it
  • running validators on a single host without a clear failover plan — the slashing condition you fail will be the one you didn't plan for

// VPN

[$ ] Self-hosted personal VPN operator

// playbook: /playbook/vpn

// adversaries

  • AS-level adversary correlating VPN traffic timing with destination traffic timing
  • civil-discovery process to the VPN host (the same data-subpoena threat model as any other host)
  • exit-IP block-list — the VPN's egress IP gets recognised as a VPN by sites that block VPNs

// assets

  • VPN configuration + client keys (one client = one key pair; rotation matters)
  • the operator's anonymity (running a VPN is not itself anonymising)
  • no traffic logs to seize (this is the design property)

// controls (what actually moves the cost curve)

  • WireGuard or OpenVPN with per-client keypair — never a shared static-key model
  • no traffic logging at any layer (host firewall, kernel, application) — provable by published config
  • kill-switch on client side (if the tunnel drops, the client does not fall back to clearnet) — this is a client-side control, not a server-side one
  • egress IP rotation if the destination-block-list problem materialises
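
The per-client-keypair control maps onto one [Peer] block per client in a WireGuard config. Keys and addresses below are placeholders:

```
# /etc/wireguard/wg0.conf — illustrative; keys and addresses are placeholders
[Interface]
PrivateKey = <server-private-key>
Address    = 10.8.0.1/24
ListenPort = 51820

[Peer]                        # one [Peer] block per client keypair
PublicKey  = <client1-public-key>
AllowedIPs = 10.8.0.2/32      # pin each client to a single tunnel address

[Peer]
PublicKey  = <client2-public-key>
AllowedIPs = 10.8.0.3/32
```

Rotating a client means replacing exactly one [Peer] block — which is why a shared static-key model fails: revoking one client there means rekeying everybody.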

// anti-patterns (look like security, are not)

  • running a personal VPN as 'a VPN service' for friends — increases the data-subpoena surface materially
  • logging connection metadata 'for performance debugging' — the moment you log, you are subject to subpoena for the log
  • running the VPN exit on a residential connection — most ISPs will terminate the line for ToS violation

// SEE ALSO

$ ls /usr/share/doc/xmrhost

  • /playbook — workload-targeted manpages with plan recommendations.
  • /tor — Tor pillar (definitional + plan-side).
  • /hardening — hardening pillar (kernel + sshd + auditd).
  • /why-monero — payment-rail threat model.
  • /glossary — definitions used above.