Skip to content

Embedding an agent (integrator & end-user)

core Built live Planned

This page is for two readers: the integrator — a consultant (or their web developer) who wants to embed a GaugeWright agent in their own website — and the end-user — that consultant's customer, who chats with the agent in the browser. The two sit on opposite sides of a trust line, and the page keeps them apart on purpose.

Status — read this first

Nothing embeddable ships today. You cannot, in the product you can download now, deploy a public agent and paste a snippet into a live site. The whole embed surface is Planned for end-users.

What is Built (implemented and tested in the codebase, but not operationally live) is the core: the audience-identity seam, the durable-chat data layer, the scoped remote session, and the web-component elements (<gw-session> / <gw-chat> / <gw-chats>). They cannot run end to end without the managed host that serves a live per-visitor session — and that host is infrastructure that does not run in the local scaffold. Until it ships, treat every snippet and config field below as the designed shape, not a switch you can flip. The single source of truth for status is the roadmap.

Where your data goes

GaugeWright orchestrates locally, but it does not run the model. Every turn an end-user takes sends their prompt and the in-scope context to the third-party LLM provider the consultant configured, over the network, in plaintext. There is no local inference. On top of that, an embedded agent runs on a non-attested managed host the platform operates — so the platform operator can, in principle, see what runs there. Both facts must be disclosed to end-users (see End-user disclosure). Full detail: Where your data goes.


The two sides, and the one rule that separates them

An embed has two audiences with completely different authority postures. Conflating them is the mistake the whole design exists to prevent.

Side Who Authority Where they work
Consultant / integrator the agent's owner full authority, acting as their account the desktop workbench (deploy + monitor)
End-user the consultant's customer never an authority — a provider-asserted audience principal inside the consultant's scope the browser (embedded panels)

The governing rule is the workbench rule applied across a trust boundary: embedded panels read scoped projections and submit scoped commands; the end-user never owns product truth, and anything outside the granted scope fails closed rather than degrading. An end-user is an identified actor, not an authority — so rendering to them is not a cross-authority crossing, and the consultant stays the one responsible party for everything the deployment emits.

Three identity classes — keep them straight

An account is a self-sovereign person (keypair, seed recovery) — that's the consultant. An organization is a company you join via SSO. The end-user is the audience: a third class, provider-asserted (managed login or the consultant's identity provider), never holding a keypair or a seed phrase. The embed surface never mints an account for an end-user.


For the integrator

You build and deploy from the desktop app, where you already build agents — there is no separate web console. You act there as your account.

What you will embed (the snippet shapes)

Panels ship as framework-agnostic custom elements, so you compose them inside your own page in any stack (shadow-DOM isolation, CSS-variable theming). A single <gw-session> provider opens the scoped session and holds the connection; the panels nested inside bind to it.

All shapes below are Planned for live use

The elements are Built in code but there is no published embed.js bundle and no live host to point them at yet. These are the designed shapes.

<script src="cdn.gaugewright.com/embed.js"></script>

<gw-session deployment="book-bot" key="pk_live_…">
  <gw-chat></gw-chat>
</gw-session>

The narrowest panel ceiling: the end-user can send messages and receive streamed answers, nothing more.

<gw-session deployment="report-bot" key="pk_live_…">
  <gw-chat></gw-chat>
  <gw-viewer></gw-viewer>   <!-- read + download this session's own artifacts -->
  <gw-files></gw-files>     <!-- browse this session's own worktree -->
</gw-session>

Widening the ceiling: +viewer exposes this session's produced outputs; +files exposes its own worktree. The viewer and files panels are Planned (later than the MVP chat panel).

<gw-session deployment="advisor-bot" key="pk_live_…" auth="managed">
  <gw-chats></gw-chats>     <!-- standalone history pane: this user's own chats -->
  <gw-chat></gw-chat>       <!-- a drawer switcher also lives inside <gw-chat> -->
</gw-session>

In authenticated mode an end-user signs in and can return to their own durable chats. History ships two ways from v1: a drawer inside <gw-chat> and a standalone <gw-chats> element.

Composition is scope

Choosing a panel set is not cosmetic — it is the redaction. Each panel carries a fixed projection scope and verb set. Deploy sets the ceiling; the embed picks within it. A <gw-files> panel against a chat-only key does not render — never a silent widening, never a broken pane. That is the fail-closed rule made visible.

How to deploy and embed (the designed sequence)

Each step is Planned for live use; the badges mark what exists in code.

  1. Open the placement you want to publish. In the desktop, select the placement (an archetype on a project at a pinned version) you have proven on real work. A public deployment (this same placement, marked public and hosted on the managed platform host) is the deployable form of that placement — the words name the same thing seen from two sides: placement is the local, durable install; deployment is that placement published to serve end-users. Every deployment= attribute in the snippets above targets one of these. model Built
  2. Set the panel ceiling. Choose the maximum panel set the deployment will ever expose: chat, +viewer, or +files. This is the scope ceiling the publishable key will be bound to. Planned
  3. Choose the auth mode. One of:

    • anonymous — ephemeral, identity-less visitors; no history, conversation discarded on teardown.
    • managed — the platform runs a lightweight login (email / magic-link / social) so a consultant with no identity provider gets sign-in for free.
    • byo-oidc — you point the deployment at your own IdP (reuses the operator OIDC/JWKS/PKCE machinery), or silent token pass-through when your host site has already signed the user in.

    adapters Built live Planned 4. Register your allowed origins. List the exact site domain(s) the key may mint sessions from. A lifted key used on another site fails closed. See allowed origins. Planned 5. Set budget and quota caps. A hard spend ceiling plus per-visitor/IP rate limits and a max-concurrent-sessions cap. When the budget is hit the deployment fails closed — sessions show "temporarily unavailable", they do not silently degrade or overspend. (Because anonymous agents let anyone spend your compute, these caps are how you stay safe.) Planned 6. Acknowledge the credential ceiling, then deploy. The hosted agent runs on your account's sealed model credential (your linked OpenAI/Anthropic OAuth token) on a non-attested host. Deploy requires an explicit acknowledgement that your credential runs on the platform's non-attested host; you cannot deploy without it. (The attested host is the premium alternative — Planned.) Planned 7. Copy the snippet and the publishable key. The desktop generates the exact <gw-session>…</gw-session> block plus the publishable key to paste into your site. elements Built publish Planned 8. Preview before going live. A live in-desktop preview points our panels at the deployment so you see exactly what a visitor sees. The preview cannot bypass the deployment's panel/verb scope. Planned

Keys: publishable vs. secret, and rotation

The embed key lives in your page's HTML, where anyone can read it — so it is a publishable key (like a pk_…), not a bearer secret.

Publishable key Secret server-side key
Lives in page source (public) your backend only
Grants only the chat/panel verbs the ceiling allows backend API proxying
Protected by origin allowlist + quotas + budget cap not exposed to the browser
Status Built live Planned Planned (deferred)

Because the publishable key is public, it is never enough on its own: it only mints sessions from your allowed origins, within your quotas, up to your budget cap — all fail-closed. Rotate the key from the desktop's Embed/Preview surface if it leaks or on a schedule; rotation issues a new key and invalidates the old. A snippet must never carry a secret key — the desktop emits publishable keys only. A separate secret key for backend proxying is Planned.

For the spec-minded

The publishable key is a capability-scoped principal: the admission shell's pure decide checks its verb set, and rejects anything outside it (fail-closed, INV-20). The non-attested honest ceiling, the credential carry, and the hosting-as-meter tier come from ADR 0050; the panel surface, the two auth modes, and end-user identity from ADR 0051. The full surface contract is specs/experience/embed-surface.md.

Monitoring a live deployment

From the desktop's monitor surface (all Planned for live use; projections are Built) you will see live and recent sessions, spend vs. the budget cap with the fail-closed ceiling visible, the audience directory (authenticated mode only — end-users and their durable chats), and a "what visitors ask" topic insight rolled up over retained transcripts. From there you can pause or disable the deployment, redeploy a new version, edit origins/budget/quotas, rotate the key, open a session transcript, and request erasure of a transcript or an end-user's data.

What the monitor must never do

These are scoped projections, not authority — rebuilding them changes no product truth. The monitor never surfaces one end-user's chats to another or into the aggregate insight without scope, and it can never permit spend past the cap.

Pricing levers

Embedding is billed on hosting / compute, not on attestation — a public book-explainer has nothing confidential to seal, so attestation cannot be the meter. There are four paid levers:

  • Hosting — having an always-on managed deployment at all.
  • Per-visitor compute — snapshot storage, restore count, runtime-seconds.
  • Attestation — the premium attested host ceiling, when a deployment does need sealing. Planned
  • White-label — removing the "powered by" mark. A subtle powered-by mark rides the panels on the free/standard tier; removing it is a paid upgrade gated under the deployment's hosting entitlement. See powered-by / white-label.

For the end-user

This section is what the consultant's customer experiences in the browser. (It is all Planned — no embedded agent runs end to end today.)

Anonymous vs. signed-in

  • Anonymous. You use the agent without logging in. The session is ephemeral and isolated; when it ends, the conversation is discarded — there is no history to come back to. (A single "a session happened" record and the retained transcript are kept on the consultant's side; the live conversation itself is not durable truth.)
  • Signed-in. You sign in via whatever the consultant configured (a managed login, or "sign in with {your IdP}"), and your chats become durable. You can return later — from any device — and pick up an old conversation in my-chats, the history switcher (a drawer inside the chat, or its own pane). You see only your own chats, never anyone else's.

You are an identified actor, not an authority

Signing in gives you a provider-asserted identity inside the consultant's project — not a GaugeWright account, no keypair, no seed phrase. Recovery is provider-style (email reset / re-auth). The consultant's scoping governs everything you can see and do.

Claiming an anonymous conversation

If you start anonymously and then decide to sign up, you can carry that one conversation into your new identity via a one-time claim token offered when the anonymous session ends. This is the only sanctioned bridge across the anonymous/signed-in isolation line, and it is opt-in and fail-closed: a spent, expired, or wrong-site token grants nothing, and without a presented token the two modes stay completely separate. The claim flow is Planned (decided, build deferred).

What you should know about your data

This is the disclosure language a consultant should surface to visitors, stated plainly:

  • Your messages are read by a third-party LLM. The agent does not run on the consultant's computer or "locally". Every message you send, and the context the agent works over, is sent to a third-party LLM provider over the network to generate the reply. That provider's retention and training terms are the provider's, not GaugeWright's.
  • The host is not attested. The agent runs on a managed host the platform operates, which is not a sealed/attested environment — the platform operator can, in principle, see what runs there. (Attested hosting is a separate, premium option that is Planned.)
  • You are isolated from other visitors. Your session sees only what you made here; you never see another visitor's conversation, files, or artifacts, and no panel exposes the consultant's private workspace or method.
  • Retention and erasure. Anonymous conversations are discarded when the session ends. Signed-in chats are retained so you can return to them, and can be deleted on request (the consultant can erase a transcript or your data; signed-in users can delete their own chats). See Where your data goes.

Guarantees: structural vs. operational

GaugeWright keeps machine-checked structural guarantees apart from policy/operational ones so claims stay defensible.

Guarantee Kind Status
A panel beyond the granted ceiling does not render (fail-closed) Structural — model-checked Built
Commands outside the key's verb scope are rejected at admission Structural — model-checked Built
One end-user never sees another's session or chats Structural — model-checked Built
A handle is not the bytes; downloads cross only via resource-export Structural — model-checked Built
Spend past the budget cap is impossible (fail-closed) Structural — model-checked Built
The end-user is never an authority; the consultant stays responsible Structural — model-checked Built
The managed host is honest about being non-attested Operational — platform-operator trust Planned (host not live)
The inference provider is inside the trust boundary (it sees prompts + context) Operational — current reality Available (current state); removing it — confidential inference — Planned
A live per-visitor session actually runs Operational — needs the managed host Planned

No per-OS sandbox caveat applies here

The kernel-enforced method-isolation sandbox (Linux/macOS Available; Windows Planned) protects the consultant's local build loop. An embedded agent runs on the managed host, not the visitor's or the consultant's machine, so that per-OS sandbox is not what guards an embed — the scope/isolation/fail-closed boundary above is. See How GaugeWright protects your work.


Known gaps (today)

Stated here, not only on the external trust site:

  • No usable embed path. There is no published embed.js, no live managed host, and no deploy/monitor surface you can operate. The whole embed surface is Planned for end-users.
  • The viewer and files panels are later than MVP. MVP is the chat panel (anonymous + managed-auth) with my-chats. The viewer and files panels, BYO-OIDC + token pass-through, the claim flow, white-label, the attested-host ceiling, and the server-side secret key are all Planned after MVP.
  • Inference is remote and inside the trust boundary. Visitor prompts and in-scope context reach the third-party provider the consultant configured; confidential inference (removing the provider from the boundary) is Planned.
  • The managed host is non-attested. Attested hosting for embeds is Planned; until it ships, the platform operator is inside the trust boundary of any embedded deployment.

Where to go next