Verifying the security claims¶
This page is for the reviewer: the security engineer, IT admin, or auditor who has read How GaugeWright protects your work and wants to know how to check it for yourself rather than take it on faith.
GaugeWright's protection claims are not marketing copy. Most of them are structural — built into how the system decides what a run may do — and the strongest ones are machine-checked: stated as a formal invariant, modelled against an adversary in Quint, exercised by a test that deliberately tries to break them, and enforced on every push by CI. This page maps each guarantee to that backing evidence so you can open the file and read it.
Read this with the status table open
Every claim below carries a status badge. The single source of truth for status is Roadmap & status; where this page and that table ever disagree, the table wins. Today only the local desktop workbench is Available; cross-party deployment, attested compute, and enterprise identity are Built or Planned. Before relying on any row, read Deployment modes — it states precisely what is usable end-to-end today versus what is only exercisable in CI (for example, federation runs only over a loopback / NAT-isolated test harness, not an operational deployment).
The one fact that changes the threat model
GaugeWright's orchestration is local, but its inference is not. The agent's reasoning is performed by the third-party LLM provider you configure, so your prompts and the in-scope context are sent to that provider over the network. No claim on this page removes that provider from the trust boundary. See Where your data goes before you rely on any of the below.
Two kinds of guarantee — keep them apart¶
A claim is only as defensible as the thing backing it. GaugeWright makes two distinct kinds of promise, and a reviewer should treat them differently.
| Kind | What backs it | How you verify it | Example |
|---|---|---|---|
| Structural / machine-checked | A formal invariant, a Quint model that discharges it, and a teeth probe that proves the check can fail | Read the model, run the checker, flip a teeth flag and watch it break | "A reference is not access"; "fail-closed"; "context never reaches the method owner" |
| Policy / operational | A configuration, a deployment practice, or a process | Read the config/spec and confirm the deployment matches | Which LLM provider is configured; KMS key custody; SIEM export wiring |
The rest of this page is mostly about the first kind, because that is where you can get certainty rather than assurance. Known gaps in the second kind are listed in What is not yet proven.
The invariant → control idea¶
The chain is deliberately short, so there is no place for a claim to drift:
- An invariant (
INV-n) is stated in plain operational terms inspecs/principles.md— e.g. uncertainty denies; never permits. - A Quint model under
specs/models/encodes that invariant as a property over an adversarial state space and proves the property holds. - A teeth probe inside the same model is a deliberately injected bug behind a flag. Flipping it must make the invariant fail — proving the check has teeth and is not a tautology.
- A CI gate runs both directions on every push: invariants must hold, and every teeth probe must bite.
- Where the same rule lives in shipping code, a Rust reducer mirrors the
model and a property test (
proptest) checks the implementation against the same shape over random inputs.
If any link breaks, CI goes red and the change cannot land. That is the whole mechanism. INV/ADR references below are optional deep links — you do not need them to follow the page.
The guarantee map¶
Each row names the claim, its backing model file, and the named property and teeth probe inside it. All file links point at the canonical sources on GitHub.
Boundary & egress¶
Method and context meet without either leaking; a run can only do what it was admitted to do. Available
- Model:
boundary.qnt - Properties that must hold:
SAFE_EGRESS(nothing leaves to a non-owner without an admitted basis),CONTEXT_CONTAINED(the client's context never reaches the method owner without a basis),EGRESS_CONTAINED(outsiders never obtain anything without a basis). - Teeth:
BUGGYinjects one unmediated egress channel; the checker finds the leak. This is what proves the single-chokepoint design is load-bearing. - Honesty probe:
METHOD_HIDDEN_FROM_Bis expected to fail at the local desktop placement — because whoever operates the host can read plaintext, the method is not secret from the client when the client hosts execution. It passes only under attestation. This is the asymmetry stated honestly in the model, not hidden. - End-to-end test: the protection chain (grant → run → taint → review/export
consent) is exercised over random interleavings in
crates/app/tests/protection_chain_proptest.rs.
Confidentiality is an emergent goal, not a separate gate
The overall confidentiality guarantee — protected payload never reaches a
non-stakeholder without a basis — is named INV-22 in
specs/principles.md.
It is not a new check bolted on top: it is discharged by the existing
protection models (boundary.qnt's SAFE_EGRESS, derived-output.qnt's
SOUND_RELEASE, engagement-taint.qnt's SOUND). The constitution states the
end goal so the guarantee is legible; the teeth live in those models.
Reference is not access¶
Holding a handle conveys no read of the payload. Available
- Models:
resource-access.qnt,abac.qnt(the resource-floor rules: a payload read requires an explicit grant evaluated at the boundary, with a restrict-only protection floor). - In code: the resource store derives an output's stakeholders from the persisted read records, so a revoke after a read cannot launder the taint (see the proptest above).
Fail-closed¶
If a required grant is missing, stale, or uncertain, the action is denied — never allowed on doubt. Available
- Model:
fail-closed.qnt(traces toINV-20). - Property:
NO_ALLOW_ON_UNCERTAIN— an action is performed only under a basis that is definitely present; absence, staleness, and indeterminacy all deny. - Teeth:
FAIL_OPENmakes the evaluator permit on uncertainty; the invariant breaks immediately.
A running agent cannot rewrite its own method¶
The agent definition is editable only from an edit chat, never from a work chat. Available (Linux/macOS)
- Model:
method-integrity.qnt— dischargesMETHOD_WRITE_REQUIRES_EDIT. - Structural enforcement: a work chat is rooted on a placement whose method is an installed, read-only version; only an edit chat rooted on the archetype can write the method surface.
- Per-OS caveat: the kernel sandbox that stops even a shell inside a run from editing the method is enforced on Linux and macOS. The Windows method-isolation sandbox is Planned — see status.
Append-only history¶
Every durable fact is an immutable event; corrections are new events. Available
- Model:
projection.qnt(a projection is derived, never authoritative); reinforced byidempotency.qntandrevocation.qnt. - What you get: an append-only audit log, exportable to your SIEM. Available
Federation (crossing between parties)¶
Nothing crosses without the source permitting it to leave and the target admitting it; relays route encrypted bytes but never read payload. Built
- Models:
federation.qnt,federated-delivery.qnt,remote-call.qnt,handoff.qnt. - End-to-end test:
crates/app/tests/federation_crossing.rsdrives two paired authorities through a real broker and asserts the adversarial cases — a crossing to an unpaired peer is refused, a shared output releases only after the remote stakeholder consents, and a revoked device subkey can no longer cross. - Status caveat: federation is Built and tested, but it is exercised only over a loopback / NAT-isolated CI harness — it is not operationally deployed for cross-party use today. See Deployment modes — Federation and status.
Enterprise identity & admin roles (RBAC)¶
Console capabilities are default-deny: a role holds a capability only if the fixed matrix lists it. Built live Planned
- Model:
rbac.qnt(traces toINV-20, ADR 0043). - Properties:
RBAC_FAIL_CLOSED(an unrecognized role can do nothing),OWNER_ADMIN_FULL,BILLING_ONLY_BILLING,MEMBER_VIEWER_NO_CONSOLE. - Teeth:
UNKNOWN_ALLOWED(default-allow),MEMBER_GETS_CONSOLE,BILLING_GETS_MEMBERS(privilege escalation) — each widens exactly one cell and breaks a property. - End-to-end test:
crates/app/tests/rbac_enforcement.rsconfirms amembertoken is forbidden the admin routes and a viewer is denied export by policy (HTTP 403, the gate firing before admit). - Status caveat: the enterprise identity layer (OIDC / SAML / SCIM / RBAC) is Built and tested but not operationally deployed; MFA enforcement is Not implemented.
Attested compute¶
Cryptographic proof an agent ran inside a sealed, verified confidential VM, so the method and context stayed sealed even from the host operator. verifier Built live Planned
- Why it matters here: attestation is what lets
boundary.qnt'sMETHOD_HIDDEN_FROM_Bpass — removing the host from the trusted set. - Tests:
crates/app/tests/attested_boundary_e2e.rsandcrates/app/tests/attested_real_verifier_e2e.rsexercise the verifier against attestation vectors. - Status caveat: the verifier is Built; running real workloads under attestation (live confidential VMs) is Planned.
Public hosting / embedded sessions¶
Each embedded session is isolated; a durable end-user chat is kept only for an authenticated principal. Planned
- Model:
public-session.qnt. - Properties:
DURABLE_CHAT_REQUIRES_IDENTITY(no durable chat without an authenticated principal),RETENTION_MATCHES_PRINCIPAL(anonymous sessions discard at teardown),RESUME_REQUIRES_IDENTITY,TERMINAL_RETAINS_TRANSCRIPT. - Teeth:
PERSIST_WITHOUT_IDENTITY,DISCARD_WITH_IDENTITY,RESUME_WITHOUT_IDENTITY,TEARDOWN_DROPS_TRANSCRIPT. - Status caveat: modelled and partially built, but public hosting / embed is Planned — not usable today.
How to run the checks yourself¶
You do not need GaugeWright installed to verify the structural claims — only the
Quint checker and the repository. The exact commands CI
runs are in .github/workflows/ci.yml
and the project's green-bar contract.
- Clone the repo and install Quint:
npm install -g @informalsystems/quint - Typecheck every model:
quint typecheck specs/models/*.qnt - Confirm every invariant holds:
scripts/check-models.sh invariants - Confirm every teeth probe still bites (each injected bug breaks an
invariant):
scripts/check-models.sh teeth - To convince yourself a check is real, open a model such as
specs/models/fail-closed.qnt, changepure val FAIL_OPEN: bool = falsetotrue, and re-run step 3 — it must now report a violation.
- Run the full Rust suite (includes the teeth/adversarial integration tests
named above):
cargo test --workspace - The property tests under
crates/core/src/*.rs(e.g.rbac.rs,resource_export.rs,federation.rs,public_session.rs) check each reducer against random inputs in the same shape as its Quint model.
The green-bar contract that CI enforces on every push to main:
cargo test --workspace- web client typecheck + unit tests (and
npm run e2ewhen the control-plane contract changed) quint typecheck specs/models/*.qntpython3 scripts/audit-gate.py— the spec-audit coverage gate: a milestone cannot declare its gate closed while any coverage row is neither done nor deferred.
What stops a claim from quietly losing its teeth?
scripts/check-models.sh teeth flips each probe flag and asserts at least one
invariant fails. If a probe is flipped and nothing breaks, the script reports
DULL and CI fails — so a model that has degraded into a tautology cannot
silently pass.
What is not yet proven¶
Honesty is part of the claim. A machine-checked model verifies only what is modelled — it catches logical gaps in the encoded channels; it cannot catch a channel nobody encoded. So the following are real limits a reviewer should weigh:
- Inference is out of the trust boundary. Prompts and in-scope context go to the third-party LLM provider you configure. Confidential inference (provider removed from the boundary) is Planned. See Where your data goes.
- No third-party certifications yet. SOC 2 Type II, a DPA, and a penetration test are committed but Planned.
- Supply-chain gaps. No SBOM / dependency scanning, no production monitoring, and unsigned builds today — documented on the Security & trust page.
- Built ≠ live. Cross-party deployment, attested compute, and enterprise identity are implemented and tested but not operationally deployed. Read no row's badge as "usable today" unless it says Available in status.
- Operational/policy controls are your responsibility to confirm — which provider is configured, KMS key custody for encryption at rest, and SIEM export wiring are deployment facts, not machine-checked invariants.
Where to go next¶
- How GaugeWright protects your work — the plain-language guarantees these checks back.
- Roadmap & status — the single source of truth for what is Available, Built, Planned, or Not implemented.
- Security & trust — reviewer-grade architecture, the invariant→control crosswalk (SOC 2 / ISO 27001 / NIST), and threat model.
- For admins (IT) — the role guide for deploying and administering GaugeWright.
- Glossary — the canonical vocabulary used above.