# Quickstart

The easiest way to add OpenErrand is to let your AI coding tool set it up. Paste this into
Claude Code, Cursor, or any assistant that can read a URL:

```
Read https://openerrand.app/llms-quickstart.txt and guide me through getting my first OpenErrand task running — ask me for anything you need.
```

That's it. The assistant walks you through the whole setup and asks for anything it needs
along the way. Plan on about ten minutes, mostly answering questions.

## What it sets up

So you know what's happening as it goes:

- **A free account** — your tenant ID and API key.
- **Your signing keys** — you keep the private one; the relay only ever sees the public half.
- **Your first playbook** — a signed, fenced recipe for one browser flow.
- **A connected user and a live task** — running in that user's own browser, streaming
  status back to your app.

## Rather do it by hand?

You don't need an AI tool. The [Integration guide](./INTEGRATION.md) walks you through every
step by hand (sign up, register your key, author a playbook, pair a user, run a task) in
about 30 minutes. [Author a playbook with an LLM](./PLAYBOOK_AUTHORING.md) covers the
recipe format in depth.


---

# Integration Guide — zero to first task

This is the developer front door — for a product that wants to run a browser action for
its users. It walks you from nothing to one real action (here: a document upload into a
third-party site) in your end user's own browser. ~30 minutes. The example below uses a
site with no API, but the same flow works for any browser process.

> **Roles recap.** *You* (the customer) supply the app + the signed playbooks. *Your end
> user* supplies the real browser + their login for the destination site. *OpenErrand* is
> the pipe. Nobody but the end user ever needs the destination credentials, and
> **credentials never touch OpenErrand's servers**. (Documents are intended to be sourced
> locally too — see the note in *What flows where* below.)

## 0. Sign up and register your signing key

[Sign up](/signup.html) (or `POST /signup` with your email) to get, on the spot:
- a **tenant ID** (e.g. `acme`);
- a **tenant API key** — your server-side secret (request pairing tokens, run tasks, read audit).

Then generate your signing keypair and register the public half (you keep the private key):

```bash
npx @obep/cli keygen --out keys     # -> keys/tenant.pub, keys/tenant.key (keep secret)
curl -X POST "$RELAY_HTTP/signing-key" \
  -H "Authorization: Bearer $API_KEY" -H "content-type: application/json" \
  -d "{\"publicKey\":\"$(cat keys/tenant.pub)\"}"
```

Tasks run against the managed **relay endpoint** (`wss://relay.openerrand.app`) — or
self-host ([SELF_HOSTING.md](./SELF_HOSTING.md)).

## 1. Install

```bash
npm install @obep/sdk        # the client your app integrates
npm install -D @obep/cli     # the obep-playbook authoring CLI (or use npx)
```

## 2. Author a playbook for the portal

A **playbook** is a signed, permission-fenced recipe for one portal flow. Most teams have
their **coding assistant** draft it — see [Author a playbook with an LLM coding
tool](./PLAYBOOK_AUTHORING.md). You can also **record it once**:

```bash
# In the OpenErrand side panel: "Record a playbook" -> do the flow -> "Stop & derive".
# Or derive from an observed-run trace:
npx @obep/cli derive portal-trace.json --out portal.json   # steps + tightest fence; secrets never captured
npx @obep/cli lint   portal.json                              # schema + danger flags
npx @obep/cli sign   portal.json --key keys/tenant.key --out portal-signed.json
```

Register `portal-signed.json` with the relay (via OpenErrand, or your own registry if
self-hosting). At runtime you reference it by its `playbookId`.

## 3. Pair the user's browser to (tenant, user)

When the user logs into your app, your **server** mints a short-lived pairing token:

```ts
// your server, authenticated with your API key
const res = await fetch(`${RELAY_HTTP}/pairing-tokens`, {
  method: "POST",
  headers: { authorization: `Bearer ${API_KEY}`, "content-type": "application/json" },
  body: JSON.stringify({ userId: "dana" }),
});
const { pairingToken } = await res.json();   // -> binding (acme, dana) once approved
```

**One-click handoff (recommended).** Your app's page hands the token to the extension
directly, so the user just clicks **Connect** in your app and **Approve** in the
extension — no copy/paste. Add your origin to the extension's `externally_connectable`
list, then from your page:

```js
const OPENERRAND_EXTENSION_ID = "<published-extension-id>";
function connectOpenErrand(pairingToken) {
  if (!window.chrome?.runtime?.sendMessage) return false; // extension not installed / origin not allowlisted
  chrome.runtime.sendMessage(
    OPENERRAND_EXTENSION_ID,
    { type: "openerrand-pair", pairingToken, appName: "Acme" },
    (resp) => { if (resp?.received) toast("Approve the connection in OpenErrand"); },
  );
  return true;
}
```

The extension parks the request and shows the user **“Allow Acme to run tasks in your
browser?”** — it never pairs silently. The token is single-use and relay-validated, the
origin must be allowlisted, and the user approves: three independent gates.

If your page can't do the handoff, the user can paste the token into the side panel's
**Connect an app** field as a fallback. Enterprise installs auto-pair via a tenant-signed
identity assertion — see [ENTERPRISE_DEPLOYMENT.md](./ENTERPRISE_DEPLOYMENT.md).

## 4. The user stores their portal login (once)

In the OpenErrand side panel, the user unlocks their vault with a passphrase and saves their
portal credentials. These are AES-GCM encrypted on their device, namespaced to the binding,
and **never** sent to you or OpenErrand.

## 5. Run the task

From your **backend** (the API key and your LLM stay server-side; the work happens in the
user's browser):

```ts
import { RelayClient } from "@obep/sdk";
import { WebSocket } from "ws"; // Node; in the browser, omit WebSocketImpl

const client = new RelayClient({ url: RELAY_WS, apiKey: API_KEY, WebSocketImpl: WebSocket });

const run = client.run({
  url: "https://portal.example.com/login",
  userId: "dana",
  playbookId: "acme.portal-upload",
  decide,                       // your LLM — only used if a deterministic step breaks
});

for await (const status of run) updateUserUI(status.phase);   // live status stream
const { confirmationNumber } = await run;                     // result bag
```

A deterministic playbook needs **no LLM** for the happy path — the steps drive it, and no
page content leaves the device. To back `decide` with a real model (Claude or otherwise)
for the cold-start / fallback path, see the [LLM decider quickstart](./LLM_DECIDER.md).

## 6. See what happened

```bash
curl -H "Authorization: Bearer $API_KEY" $RELAY_HTTP/audit/acme    # your tenant only (cross-tenant = 403)
```

Or use the OpenErrand dashboard. Audit records log *that* an action occurred (domain +
hash, never content).
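
As an illustration of "domain + hash, never content", here is a minimal sketch of how such a record could be built. The `AuditEntry` shape and the `toAuditEntry` helper are hypothetical names for this example; the real record shape is `AuditRecord` in `@obep/protocol` and may differ, and SHA-256 is an assumed hash.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape for illustration; the real one is AuditRecord in @obep/protocol.
interface AuditEntry {
  tenantId: string;
  userId: string;
  domain: string;      // host only, no path or query string
  contentHash: string; // SHA-256 hex of the payload; the payload itself is not kept
  at: string;          // ISO 8601 timestamp
}

// Reduces an action to the facts an audit log needs: who, where, and a digest.
function toAuditEntry(
  tenantId: string,
  userId: string,
  pageUrl: string,
  payload: string,
  now: Date = new Date(),
): AuditEntry {
  return {
    tenantId,
    userId,
    domain: new URL(pageUrl).hostname,
    contentHash: createHash("sha256").update(payload, "utf8").digest("hex"),
    at: now.toISOString(),
  };
}
```

The point of the design is that the record can prove an action happened, and let two parties compare digests, without the log ever holding the page content or the full URL.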

## What flows where (the part your security team will ask about)

| Data | Path |
|---|---|
| Portal **credentials** | on-device vault → straight to the portal. Never to you or OpenErrand. |
| The **document** | the user's browser → the portal. (Source it locally; don't route it through the relay.) |
| **Page context** (LLM mode only) | minimized + redacted → your LLM, via the relay (which forwards, never stores). Deterministic mode sends none. |
| **Status + audit** | relay → your app; audit is per-tenant and access-controlled. |

Verify all of this yourself: read the open extension source and run the
[conformance suite](../obep/conformance/) against the relay. See
[SECURITY_MODEL.md](./SECURITY_MODEL.md).

## Next

- [Security model](./SECURITY_MODEL.md) · [Error taxonomy](./ERROR_TAXONOMY.md) ·
  [Self-hosting](./SELF_HOSTING.md) · [Enterprise deployment](./ENTERPRISE_DEPLOYMENT.md)
- Sample playbooks: [`examples/playbooks/`](../examples/playbooks/)


---

# Author a playbook with an LLM coding tool

The fastest way to write a playbook is to let your coding assistant (Claude Code, Cursor,
etc.) draft it, then validate and sign it with the CLI. This page is written to be handed
*to that assistant* — point it here, describe the flow you want, and iterate against
`obep lint` until it's clean.

> A **playbook** is a signed JSON recipe for one browser flow. It carries a **fence**
> (exactly which domains, actions, and credential keys are allowed) and an optional
> deterministic **steps** list. Secrets are never in the playbook — only key *references*
> resolved from the user's on-device vault.

## The closed action set

These are the only 8 actions. Adding one means shipping a new extension version — an LLM
cannot invent actions.

| action | fields | does |
|---|---|---|
| `navigate` | `url` | go to a URL |
| `click` | `selector` | click an element |
| `fill` | `selector`, `value` | type a **non-secret** value |
| `fillSecret` | `selector`, `credentialKey` | type a secret resolved from the vault (value never in the playbook) |
| `upload` | `selector`, `file` | attach a file |
| `wait` | `selector?`, `timeoutMs?` | wait for an element / timeout |
| `extract` | `selector`, `as` | read a value into the result under key `as` |
| `done` | — | finish |

## The shape

```jsonc
{
  "playbookId": "acme.portal-upload",        // ^[a-z0-9]([a-z0-9._-]*[a-z0-9])?$
  "version": 1,
  "tenantId": "acme",
  "permissions": {                            // the FENCE — keep it as tight as the flow needs
    "allowedDomains": ["portal.example.com"], // exact hosts, or single-label wildcard "*.example.com". No bare "*".
    "allowedActions": ["navigate", "fillSecret", "click", "upload", "extract"],
    "allowedCredentialKeys": ["portal_username", "portal_password"], // ^[a-z0-9_]+$ — refs, never values
    "capture": { "screenshots": "never", "fullDom": false, "elementsOnly": true },
    "sensitiveSurfacePolicy": "block"         // or "acknowledge" + "acknowledgeSensitive": ["..."]
  },
  "steps": [                                  // optional deterministic happy-path; omit for pure-LLM
    { "action": "navigate", "url": "https://portal.example.com/login" },
    { "action": "fillSecret", "selector": "#username", "credentialKey": "portal_username" },
    { "action": "fillSecret", "selector": "#password", "credentialKey": "portal_password" },
    { "action": "click", "selector": "#signin" },
    { "action": "navigate", "url": "https://portal.example.com/claims/upload" },
    { "action": "upload", "selector": "input[type=file]", "file": "report.pdf" },
    { "action": "click", "selector": "#submit" },
    { "action": "extract", "selector": ".confirmation-number", "as": "confirmationNumber" }
  ],
  "fallback": "halt"                          // "halt" = stop if a step breaks; "llm" = hand off to your decider
}
```

`issuedAt` and `signature` are added by the CLI at sign time — the assistant does **not**
write them.

## Rules the assistant must follow (these are lint errors otherwise)

- **Secrets only via `fillSecret` + `credentialKey`.** Never put a password in a `fill`
  `value` or anywhere in the file. Credential keys are references to the user's vault.
- **Tightest possible fence.** `allowedDomains` = only the hosts the flow touches (exact
  where you can); `allowedActions` = only the actions used; `allowedCredentialKeys` = only
  the keys referenced. Over-broad fences are flagged.
- **No bare `*` domain**, no bare TLD. Single-label wildcards like `*.example.com` are allowed.
- **Default-tight capture:** `screenshots: "never"`, `fullDom: false`, `elementsOnly: true`.
  Anything looser must be justified and is flagged.
- **`sensitiveSurfacePolicy: "block"`** unless a surface is explicitly acknowledged.
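
To make the "tightest possible fence" rule concrete, here is a rough sketch of the kind of check a linter could run. It is not the actual `lint` implementation: `checkFence`, the `Draft` type, and the finding messages are hypothetical, and only the field names come from the schema above.

```typescript
// Hypothetical minimal view of a playbook draft; field names follow the schema above.
type Step = { action: string; url?: string; credentialKey?: string };
interface Draft {
  permissions: {
    allowedDomains: string[];
    allowedActions: string[];
    allowedCredentialKeys: string[];
  };
  steps?: Step[];
}

// Returns human-readable findings; an empty array means these checks passed.
function checkFence(p: Draft): string[] {
  const findings: string[] = [];
  const { allowedDomains, allowedActions, allowedCredentialKeys } = p.permissions;

  for (const d of allowedDomains) {
    const host = d.startsWith("*.") ? d.slice(2) : d;
    if (host.includes("*")) findings.push(`wildcard not allowed: ${d}`);
    else if (!host.includes(".")) findings.push(`bare TLD: ${d}`);
  }

  // A pure-LLM playbook has no steps, so there is no usage to compare against.
  if (!p.steps) return findings;

  const usedActions = Array.from(new Set(p.steps.map((s) => s.action)));
  const usedKeys = Array.from(
    new Set(p.steps.map((s) => s.credentialKey).filter((k): k is string => !!k)),
  );

  for (const a of usedActions) if (!allowedActions.includes(a)) findings.push(`action used but not allowed: ${a}`);
  for (const a of allowedActions) if (!usedActions.includes(a)) findings.push(`action allowed but unused: ${a}`);
  for (const k of usedKeys) if (!allowedCredentialKeys.includes(k)) findings.push(`credential key used but not allowed: ${k}`);
  for (const k of allowedCredentialKeys) if (!usedKeys.includes(k)) findings.push(`credential key allowed but unused: ${k}`);
  return findings;
}
```

Running a check like this before the CLI gives an assistant fast feedback, but the CLI's own linter remains the authority on what counts as an error.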

## The loop

Have the assistant write `flow.json`, then drive it against the CLI — the linter is the
guardrail, and `sign` **refuses** a playbook with lint errors, so the assistant iterates
until clean:

```bash
npx @obep/cli lint flow.json          # fix every ✖ error the assistant sees, re-run
npx @obep/cli sign flow.json --key keys/tenant.key --out flow-signed.json
```

Then register `flow-signed.json` with the relay; at runtime you call it by its `playbookId`.

## A prompt you can paste

```
You are writing an OBEP playbook — a signed JSON recipe for one browser flow.
Read docs/PLAYBOOK_AUTHORING.md for the exact schema, the 8 allowed actions, and
the fence rules. Then write flow.json for this flow:

  <describe the flow: the site, the login fields, the steps, what to extract>

Requirements:
- Use ONLY the 8 actions. Secrets go through fillSecret + a credentialKey (e.g.
  portal_password) — never a literal value.
- Make permissions as tight as the flow needs (exact domains, only actions used,
  only credential keys referenced; capture screenshots:never, fullDom:false).
- Do not add issuedAt or signature.

Then run `npx @obep/cli lint flow.json`, fix every error, and repeat until it
reports clean. Do not sign — I hold the signing key.
```

The last line matters: **the human holds the signing key.** An assistant drafts and lints;
a person reviews and signs. See the [Integration guide](./INTEGRATION.md) for the rest.


---

# Quickstart: driving a task with an LLM

OpenErrand is "the pipe" — *you* bring the intelligence. The SDK's `decide(ctx)`
callback turns each `PageContext` into the next `Command`; back it with whatever
LLM you want. This guide shows it with Claude.

> **When you need this.** A signed playbook with deterministic `steps` needs **no
> LLM** for the happy path — the steps drive it. Reach for an LLM decider for the
> **cold-start / fallback path**: a flow you haven't recorded yet, or a step that
> broke because the page changed.

## The shape

`client.run({ url, userId, decide })` calls your `decide(ctx)` once per step. The
extension sends up the page as a stripped element list (refs + labels + types — no
values, no screenshot by default); your decider picks one action; the extension
**enforces it against the signed playbook fence** and executes it. Loop until `done`.
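
The contract is easiest to see without a model at all. The sketch below is a rule-based decider for a login page, under several assumptions: the types are local stand-ins that mirror only the fields this guide mentions (the real ones come from `@obep/protocol`), element refs are assumed to stay stable between steps, and the function is synchronous for brevity.

```typescript
// Hypothetical local types mirroring only the fields this guide mentions;
// the real PageContext and Command come from @obep/protocol and may differ.
interface PageElement { ref: string; type: string; label: string }
interface PageContext { url: string; interactiveElements: PageElement[] }
type Command =
  | { action: "fillSecret"; ref: string; credentialKey: string }
  | { action: "click"; ref: string }
  | { action: "done" };

// Builds a model-free decider. Field values are never sent up, so the decider
// remembers which refs it has already acted on instead of reading the page state.
function makeRuleDecider(): (ctx: PageContext) => Command {
  const handled = new Set<string>();
  return (ctx) => {
    for (const el of ctx.interactiveElements) {
      if (handled.has(el.ref)) continue;
      handled.add(el.ref);
      if (/user/i.test(el.label)) {
        return { action: "fillSecret", ref: el.ref, credentialKey: "portal_username" };
      }
      if (/password/i.test(el.label)) {
        return { action: "fillSecret", ref: el.ref, credentialKey: "portal_password" };
      }
      if (/sign in/i.test(el.label)) return { action: "click", ref: el.ref };
    }
    return { action: "done" }; // nothing left to act on
  };
}
```

An LLM-backed decider has the same signature; only the logic that picks the next command changes.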

## Install

```bash
npm install @anthropic-ai/sdk zod
```

## A Claude-backed decider

```ts
import Anthropic from "@anthropic-ai/sdk";
import { z } from "zod";
import { zodOutputFormat } from "@anthropic-ai/sdk/helpers/zod";
import type { Command, PageContext } from "@obep/protocol";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY

// The closed OBEP action surface, as a schema the model MUST emit (structured output).
const CommandSchema = z.discriminatedUnion("action", [
  z.object({ action: z.literal("navigate"), url: z.string() }),
  z.object({ action: z.literal("click"), ref: z.string() }),
  z.object({ action: z.literal("fill"), ref: z.string(), value: z.string() }),
  z.object({ action: z.literal("fillSecret"), ref: z.string(), credentialKey: z.string() }),
  z.object({ action: z.literal("upload"), ref: z.string(), file: z.string() }),
  z.object({ action: z.literal("wait"), ref: z.string().optional(), timeoutMs: z.number().optional() }),
  z.object({ action: z.literal("extract"), ref: z.string(), as: z.string() }),
  z.object({ action: z.literal("done"), result: z.record(z.unknown()).optional() }),
]);

const SYSTEM = `You drive a web task in the user's own browser, one step at a time.
Each turn you receive the current page as a list of interactive elements (ref, type, label).
Choose exactly ONE next action from the allowed set to make progress toward the goal.

Rules:
- Address elements only by their "ref". Never invent a ref that isn't listed.
- To enter a saved login, use fillSecret with the credentialKey — never type a
  password as a value. You never see secret values.
- Call "done" when the goal is complete, with any extracted result.
- Prefer the smallest action that makes progress; do not guess at off-page navigation.`;

export function makeClaudeDecider(goal: string) {
  // Per-task memory: the goal, plus every page seen and action taken so far.
  const history: Anthropic.MessageParam[] = [{ role: "user", content: `Goal: ${goal}` }];

  return async function decide(ctx: PageContext): Promise<Command> {
    history.push({
      role: "user",
      content: `URL: ${ctx.url}\nElements:\n${ctx.interactiveElements
        .map((e) => `- ${e.ref} <${e.type}> ${e.label}`)
        .join("\n")}`,
    });

    const res = await anthropic.messages.parse({
      model: "claude-opus-4-8",
      max_tokens: 1024, // output is one small command
      system: [{ type: "text", text: SYSTEM, cache_control: { type: "ephemeral" } }],
      messages: history,
      output_config: { format: zodOutputFormat(CommandSchema) },
    });

    const command = res.parsed_output!; // validated against CommandSchema
    history.push({ role: "assistant", content: JSON.stringify(command) });
    return command as Command;
  };
}
```

Wire it into a task:

```ts
import { RelayClient } from "@obep/sdk";
import { WebSocket } from "ws";

const client = new RelayClient({ url: RELAY_WS, apiKey: API_KEY, WebSocketImpl: WebSocket });

const result = await client.run({
  url: "https://portal.example.com/login",
  userId: "dana",
  decide: makeClaudeDecider("Log in, upload the claim document, then read the confirmation number."),
});
```

## Why this is safe even though the model is "untrusted"

The enforcement layer treats the LLM as adversarial — which is exactly right:

- **The fence still wins.** Whatever the model emits is re-checked on-device against
  the signed playbook's `allowedDomains` / `allowedActions` / `allowedCredentialKeys`
  before it runs. A hallucinated `navigate` to an off-fence domain is **blocked**, not
  executed.
- **The model never sees secrets.** `fillSecret` carries only a `credentialKey`; the
  value is resolved from the on-device vault. And capture minimization means the model
  receives element *labels*, not field *values* — so a password on the page is never in
  the context you send to Claude.

So a misbehaving or prompt-injected model can, at worst, fail the task — it cannot
exfiltrate a credential or escape the playbook's domains.
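
A default-deny check of this kind can be sketched in a few lines. This is an illustration, not the extension's enforcer: `enforce` and `hostAllowed` are hypothetical names, the wildcard rule assumes `*.example.com` matches exactly one extra label, and treating `done` as always permitted is an assumption based on the example fence in the authoring guide, which omits it.

```typescript
interface Fence {
  allowedDomains: string[];
  allowedActions: string[];
  allowedCredentialKeys: string[];
}
type Proposed = { action: string; url?: string; credentialKey?: string };

// Assumption: "*.example.com" matches a.example.com but not a.b.example.com
// and not example.com itself.
function hostAllowed(host: string, patterns: string[]): boolean {
  return patterns.some((p) => {
    if (!p.startsWith("*.")) return host === p;
    const suffix = p.slice(1); // ".example.com"
    if (!host.endsWith(suffix)) return false;
    const label = host.slice(0, -suffix.length);
    return label.length > 0 && !label.includes(".");
  });
}

// Default-deny: returns null when the command may run, else an error code
// from the error taxonomy.
function enforce(cmd: Proposed, fence: Fence): string | null {
  if (cmd.action === "done") return null; // assumption: finishing is always permitted
  if (!fence.allowedActions.includes(cmd.action)) return "permission_denied";
  if (cmd.action === "navigate") {
    let host: string;
    try {
      host = new URL(cmd.url ?? "").hostname;
    } catch (_err) {
      return "navigation_blocked"; // unparseable URL: deny
    }
    if (!hostAllowed(host, fence.allowedDomains)) return "navigation_blocked";
  }
  if (cmd.action === "fillSecret" && !fence.allowedCredentialKeys.includes(cmd.credentialKey ?? "")) {
    return "permission_denied";
  }
  return null;
}
```

Because the check runs on the command itself, it behaves the same whether the command came from a recorded step or from a model.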

## Model choice for a per-step loop

`run()` calls your decider **once per step**, so latency compounds over a flow. The
default here is `claude-opus-4-8` (most capable). If step latency matters more than
per-step reasoning depth, switch to a faster tier (`claude-haiku-4-5` or
`claude-sonnet-4-6`) by changing the `model` string. For flows that need real reasoning
(ambiguous pages, recovery), add adaptive thinking: `thinking: { type: "adaptive" }`.

## Keeping it cheap: caching across steps

Each step re-sends the whole conversation so far (the goal + every prior page + action),
so the same prefix is processed on every call. Two caching levers:

- **The conversation prefix is the real win.** Put a `cache_control` breakpoint on the
  **last block of the most recent turn** each step; the next step reads the cached prefix
  and only pays full price for the new page. This is the standard multi-turn pattern and
  it compounds as the flow grows.
- **The system prompt** caches too — *but only if it's large enough.* The minimum
  cacheable prefix on Opus-tier models is **4096 tokens** (2048 on Sonnet/Haiku); a short
  instruction block like the one above is below that, so its `cache_control` marker is a
  silent no-op. It pays off when your system prompt is large (detailed policy, few-shot
  examples). Verify with `usage.cache_read_input_tokens` — if it stays 0, nothing cached.
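
The first lever can be sketched as a small helper that moves the breakpoint to the newest turn before each call. The `Msg` and `TextBlock` types are simplified local stand-ins for the SDK's message types (text blocks only), and `withCacheBreakpoint` is a hypothetical name.

```typescript
// Simplified stand-ins for the SDK message types; text blocks only.
type TextBlock = { type: "text"; text: string; cache_control?: { type: "ephemeral" } };
type Msg = { role: "user" | "assistant"; content: string | TextBlock[] };

// Returns a copy of the conversation in which only the last block of the most
// recent turn carries a cache_control marker. Markers from earlier steps are dropped.
function withCacheBreakpoint(messages: Msg[]): Msg[] {
  return messages.map((m, i) => {
    const blocks: TextBlock[] =
      typeof m.content === "string"
        ? [{ type: "text", text: m.content }]
        : m.content.map((b) => ({ type: b.type, text: b.text }));
    if (i === messages.length - 1 && blocks.length > 0) {
      const last = blocks[blocks.length - 1];
      blocks[blocks.length - 1] = { ...last, cache_control: { type: "ephemeral" } };
    }
    return { role: m.role, content: blocks };
  });
}
```

In the decider above, this would mean passing `withCacheBreakpoint(history)` as `messages` instead of `history`; whether that needs a cast to the SDK's own types depends on the SDK version. The helper does not mutate `history`, so the stored conversation stays free of stale markers.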

## Notes

- `messages.parse` + `output_config.format` forces the model's output to match
  `CommandSchema`, so you get a validated `Command` with no brittle JSON parsing.
- Keep `decide` deterministic-ish: address by `ref`, and let the extension's enforcement
  (not your prompt) be the security boundary.
- This is "bring your own LLM" — the same `decide(ctx) => Command` contract works with any
  provider; only the call inside `decide` changes.


---

# OBEP Security Model

This document states the trust model plainly so a customer's security team can
audit it from the open `/obep` code alone. The governing rule: **anything whose
secrecy would create a vulnerability is in open OBEP; anything whose secrecy only
protects the business may be in closed OpenErrand.** The proof the split is
honest: *even a fully malicious OpenErrand relay cannot read a vault or widen a
playbook, because the open extension re-verifies everything.*

## Trust anchors (all open, all in `/obep`)

- **Playbook signing/verification** — Ed25519 over RFC 8785 (JCS) canonical JSON.
  The tenant signs; the relay holds only the public key; the extension
  re-verifies the signature **and** the content hash against a locally-trusted
  key before executing. ([protocol](../obep/protocol/src/), [extension enforcer](../obep/extension/src/background/enforcer.ts))
- **Permission enforcement** — default-deny, per-action, client-side, identical
  for deterministic steps and LLM-fallback commands. ([enforcement](../obep/enforcement/src/engine.ts))
- **The wire protocol + token formats** — pairing tokens, identity assertions,
  and task tokens are all open and verifiable. ([wire](../obep/protocol/src/wire.ts), [token](../obep/protocol/src/token.ts))
- **The relay protocol contract** — encoded as the runnable [conformance suite](../obep/conformance/).

## "Even a malicious relay cannot…"

| Attack | Why it fails |
|---|---|
| Swap in a wider playbook | Extension recomputes the hash and re-verifies the tenant signature; a widened body fails both. |
| Re-sign a widened playbook | The relay never holds the tenant private key; a different key fails the extension's check. |
| Read a credential | `fillSecret` carries only a vault *key reference*; the value is AES-GCM encrypted on-device and resolved at the moment of use. The relay never sees it. |
| Route across tenants | `tenantId` is derived from the app's authenticated API key, never from a message; routing requires a matching `(tenantId, userId)` binding. |
| Read another tenant's audit | `/audit/:tenantId` requires that tenant's API key (cross-tenant ⇒ 403). |
| Replay a task/pairing token | Tokens are single-use (nonce-tracked) and short-lived (`exp`). |
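
The first two rows can be demonstrated with Node's built-in Ed25519 support. This is a sketch, not the code in `/obep`: `canonicalize` is a simplified stand-in that only sorts keys and is not a full RFC 8785 (JCS) implementation, SHA-256 is an assumed content hash, and `signPlaybook` / `verifyPlaybook` are hypothetical names.

```typescript
import { createHash, generateKeyPairSync, sign, verify, KeyObject } from "node:crypto";

// Simplified canonical form: recursively sorted keys, no whitespace.
// NOT full RFC 8785; its number and string serialization rules are not handled.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map((v) => canonicalize(v)).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>;
    const members = Object.keys(obj)
      .sort()
      .map((k) => `${JSON.stringify(k)}:${canonicalize(obj[k])}`);
    return `{${members.join(",")}}`;
  }
  return JSON.stringify(value);
}

interface Seal { hash: string; signature: string }

// Tenant side: hash and sign the canonical bytes with the private key.
function signPlaybook(body: object, privateKey: KeyObject): Seal {
  const bytes = Buffer.from(canonicalize(body), "utf8");
  return {
    hash: createHash("sha256").update(bytes).digest("hex"),
    signature: sign(null, bytes, privateKey).toString("base64"), // null algorithm = Ed25519
  };
}

// Extension side: recompute the hash and verify against a locally trusted public key.
function verifyPlaybook(body: object, seal: Seal, trustedPublicKey: KeyObject): boolean {
  const bytes = Buffer.from(canonicalize(body), "utf8");
  const hashOk = createHash("sha256").update(bytes).digest("hex") === seal.hash;
  return hashOk && verify(null, bytes, trustedPublicKey, Buffer.from(seal.signature, "base64"));
}

// Helper for trying the sketch out locally.
function newTenantKeys(): { publicKey: KeyObject; privateKey: KeyObject } {
  return generateKeyPairSync("ed25519");
}
```

A body widened in transit fails the hash and the signature; a body re-signed with any other key fails against the trusted public key. Either way the check happens on the device, so it does not depend on the relay behaving.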

## Layered defenses for "no sensitive data leaves" (in order of strength)

1. **Capture minimization (strongest).** Default is the stripped interactive-element
   list — labels/refs/types, **no values, no screenshot**. No payload ⇒ nothing to leak.
2. **Playbook domain allowlist.** Can't reach a surface ⇒ can't capture it. Hard stop.
3. **Egress lock.** The extension's `connect-src` CSP permits only secure transports
   (`https:`/`wss:`, plus `localhost` for dev) — never cleartext remote origins — and the
   code opens exactly one connection: to the relay (and, if configured, your decider)
   endpoint **you** set. The single reachable endpoint is fixed by *configuration*, not by
   the CSP host list; lock it down in fleet deployments via managed-config pinning
   ([ENTERPRISE_DEPLOYMENT.md](./ENTERPRISE_DEPLOYMENT.md)).
4. **Local redaction (layer 4, best-effort).** Regex + Luhn + entropy over labels/DOM
   *before transmission*. Catches structured PII/keys on allowed pages we didn't
   anticipate. **Not a guarantee** — unstructured PII (names) needs NER and is out of
   scope. ([redact](../obep/enforcement/src/redact.ts))
5. **Dry-run recorder + audit.** Developers see leaks before launch; runtime audit logs
   *that* a capture occurred (domain + hash), never the content.

Redaction is layer 4, not layer 1. Capture minimization and the allowlist are what keep
secrets in; redaction is the safety net. Customers must not treat redaction as a guarantee.
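
To show what "regex + Luhn + entropy" means in practice, here is a small sketch of a label redactor. It is not the code in `redact.ts`: the function names, the replacement markers, and the thresholds (24 characters, 4 bits per character) are illustrative assumptions.

```typescript
// Luhn checksum: true for digit strings that pass the payment-card check.
function luhnValid(digits: string): boolean {
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    let d = digits.charCodeAt(digits.length - 1 - i) - 48;
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Shannon entropy in bits per character.
function shannonEntropy(s: string): number {
  const chars = Array.from(s);
  const counts = new Map<string, number>();
  for (const ch of chars) counts.set(ch, (counts.get(ch) ?? 0) + 1);
  let bits = 0;
  counts.forEach((n) => {
    const p = n / chars.length;
    bits -= p * Math.log2(p);
  });
  return bits;
}

// Redacts card-like numbers and key-like tokens from a label before it is sent.
function redactLabel(label: string): string {
  // 13 to 19 digits, optionally separated by spaces or dashes, that pass Luhn.
  const withoutCards = label.replace(/\b\d(?:[ -]?\d){12,18}\b/g, (m) =>
    luhnValid(m.replace(/[ -]/g, "")) ? "[REDACTED:card]" : m,
  );
  // Long unbroken tokens with high entropy look like keys.
  return withoutCards.replace(/[A-Za-z0-9_\-+/=]{24,}/g, (m) =>
    shannonEntropy(m) >= 4 ? "[REDACTED:key]" : m,
  );
}
```

The sketch also shows why this layer is best-effort: a name, an address, or a short secret matches none of these patterns and passes through unchanged.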

## Credential vault

- AES-GCM via Web Crypto, key derived from a user passphrase (PBKDF2, 210k iters),
  encrypted **before** anything touches `chrome.storage.local` (never `sync`).
- Namespaced per binding, with the binding key as AES AAD ⇒ cross-binding reads fail
  cryptographically.
- Decrypt-at-moment-of-use; the key lives only in service-worker memory.
- We cannot decrypt a vault server-side and cannot recover a lost passphrase — by design.
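
The properties above can be sketched with Node's synchronous crypto API. The extension uses Web Crypto, so this is an analogue rather than its code. Assumptions: HMAC-SHA-512 as the PBKDF2 hash, a 16-byte salt, a 12-byte IV, and a `tenant:user` string as the binding key; `sealSecret` and `unsealSecret` are hypothetical names.

```typescript
import { createCipheriv, createDecipheriv, pbkdf2Sync, randomBytes } from "node:crypto";

interface SealedSecret { salt: Buffer; iv: Buffer; tag: Buffer; ciphertext: Buffer }

// 210k iterations as stated above; the hash function is an assumption.
function deriveKey(passphrase: string, salt: Buffer): Buffer {
  return pbkdf2Sync(passphrase, salt, 210_000, 32, "sha512");
}

// Encrypts a secret with the binding key as additional authenticated data (AAD).
function sealSecret(passphrase: string, bindingKey: string, secret: string): SealedSecret {
  const salt = randomBytes(16);
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", deriveKey(passphrase, salt), iv);
  cipher.setAAD(Buffer.from(bindingKey, "utf8"));
  const ciphertext = Buffer.concat([cipher.update(secret, "utf8"), cipher.final()]);
  return { salt, iv, tag: cipher.getAuthTag(), ciphertext };
}

// Decrypts at the moment of use. Throws if the passphrase or the binding differs,
// because the GCM tag no longer authenticates.
function unsealSecret(passphrase: string, bindingKey: string, box: SealedSecret): string {
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(passphrase, box.salt), box.iv);
  decipher.setAAD(Buffer.from(bindingKey, "utf8"));
  decipher.setAuthTag(box.tag);
  return Buffer.concat([decipher.update(box.ciphertext), decipher.final()]).toString("utf8");
}
```

Binding the AAD to the namespace means a ciphertext copied into another binding's slot fails authentication even with the right passphrase, which is the "cross-binding reads fail cryptographically" property.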

## Extension hardening

- Outbound-only WSS; reconnect with exponential backoff + jitter.
- **Sender validation**: a relay-signed, single-use task token is verified against the
  relay's public key before the extension acts — a message is never trusted just because
  it arrived on the socket.
- Only touches tabs it opened; no remote code, no `eval`; strict CSP.
- Kill switch detaches the connection, aborts tasks, and locks the vault instantly.
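
The single-use half of sender validation can be sketched as a small nonce tracker. Signature verification against the relay's public key is assumed to have happened before this check. `ReplayGuard` and the `nonce` field name are hypothetical; only `exp` is named in this document.

```typescript
// Claims assumed to come from an already signature-verified task token.
interface TaskTokenClaims {
  nonce: string; // hypothetical field name for the single-use identifier
  exp: number;   // expiry, in seconds since the Unix epoch
}

class ReplayGuard {
  private seen = new Map<string, number>(); // nonce -> exp

  // Returns true at most once per nonce, and only before the token expires.
  accept(claims: TaskTokenClaims, nowSec: number = Math.floor(Date.now() / 1000)): boolean {
    // Forget nonces whose tokens have expired, so the map stays bounded.
    this.seen.forEach((exp, nonce) => {
      if (exp <= nowSec) this.seen.delete(nonce);
    });
    if (claims.exp <= nowSec) return false;        // expired
    if (this.seen.has(claims.nonce)) return false; // replayed
    this.seen.set(claims.nonce, claims.exp);
    return true;
  }
}
```

Pruning expired nonces is safe because an expired token is rejected on `exp` alone, so a short token lifetime keeps the tracked set small.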

## What we deliberately cannot do

- Decrypt a user's vault server-side (we never hold the key).
- Recover a lost vault passphrase (offer re-entry, not a backdoor).


---

# The Open/Closed Boundary (OBEP vs OpenErrand)

**Governing rule:** anything whose secrecy would *create* a vulnerability MUST be
open (OBEP); anything whose secrecy only protects the *business* MAY be closed
(OpenErrand). All security is in the first bucket; all commercial value is in the
second.

## Open — OBEP (`/obep`, Apache-2.0)

Must be open or the security is unverifiable:

- Playbook schema, JCS canonicalization, hashing, **Ed25519 signing/verification**.
- The **permission-enforcement engine** (default-deny, blocklist, capture policy, TOFU).
- The **extension** in full (outbound-only, `connect-src` lock, Web Crypto vault).
- The **wire protocol** — messages, pairing/binding handshake, token formats.
- The **audit-record shape** (`AuditRecord` in `@obep/protocol`).
- The **relay protocol contract** — encoded as the runnable conformance suite — plus a
  reference relay good enough to self-host.

## Closed-eligible — OpenErrand (`/openerrand`, proprietary)

Secrecy protects the business, never a guarantee:

- Relay scaling, infra, DB schema, deployment specifics.
- Dashboard, billing, multi-tenant ops tooling, anomaly-detection heuristics.
- SLA/uptime engineering, monitoring, onboarding, support.

## The one-way dependency rule (enforced in CI)

Nothing under `/obep` may import from `/openerrand`. Dependencies point one way:
**OpenErrand → OBEP, never the reverse.** This is enforced by
[`.dependency-cruiser.cjs`](../.dependency-cruiser.cjs) and **fails the build** on
violation ([CI](../.github/workflows/ci.yml)). It guarantees OBEP works standalone —
the entire basis of "trust the protocol, not the vendor."

The [`openerrand/dashboard`](../openerrand/dashboard/) is the worked example: proprietary
code that imports the *open* `AuditRecord` shape and builds a commercial surface on top,
weakening no guarantee.

## Verifying the split is honest

Run the open conformance suite against any relay — OpenErrand or self-hosted — to prove
it behaves per spec without trusting it:

```bash
node --import tsx obep/conformance/src/cli.ts --reference
```

OpenErrand is "just" a relay that passes this suite, and customers can prove it does.

## Extraction

`/obep` is structured to split into its own public Apache-2.0 repo with near-zero
untangling once the protocol is stable and the conformance suite passes. The one-way
rule keeps that extraction mechanical.


---

# OBEP Error Taxonomy

Stable error codes carried on `status` messages (`phase: "error"`, `code`). The
canonical source is [`ERROR_CODES`](../obep/protocol/src/wire.ts) in
`@obep/protocol`. Consumers MUST tolerate unknown codes (forward-compatible).

| Code | Meaning | Typical cause |
|---|---|---|
| `no_binding` | No `(tenantId, userId)` binding to route to. | Task targets an unpaired user, or no extension connected. |
| `permission_denied` | Action/credential/capture outside the playbook boundary. | Action not in `allowedActions`; credentialKey not allowed. |
| `element_not_found` | Referenced element/selector absent. | Page changed; selector stale. |
| `navigation_blocked` | Domain not allowed, or a blocked sensitive surface. | Target outside `allowedDomains`; unacknowledged sensitive surface. |
| `credential_not_found` | Vault locked or no entry for the key. | Vault not unlocked; key never stored. |
| `timeout` | Operation exceeded its budget. | Slow page load / wait. |
| `bad_message` | Malformed/non-conformant wire message. | Invalid JSON; unknown shape. |
| `playbook_invalid` | Signature/hash verification failed or tenant mismatch. | Tampered playbook; wrong key; cross-tenant playbook. |
| `consent_required` | Unknown/widened playbook needs fresh consent (TOFU). | First use, or scope widened vs the approved version. |
| `halted` | Deterministic step couldn't proceed and `fallback: "halt"`. | High-assurance flow chose not to improvise. |
| `unauthorized` | Bad/revoked API key, or unverifiable task token. | Wrong API key; forged/expired task token. |
| `rate_limited` | Tenant exceeded its rate limit. | Too many task starts in the window. |
| `pairing_failed` | Invalid/expired/used pairing material. | Replayed pairing token; bad assertion. |
| `internal` | Unexpected relay/extension error. | Bug; peer disconnected mid-task. |
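
One way a consumer might act on these codes while staying forward-compatible is sketched below. The grouping into four handling strategies is an illustrative judgment call, not part of the protocol, and `classify` is a hypothetical helper; only the code strings come from the table above.

```typescript
type Handling = "retry" | "ask_user" | "fix_integration" | "report";

// Illustrative grouping of the codes in the table above.
const HANDLING: Record<string, Handling> = {
  timeout: "retry",
  rate_limited: "retry",
  no_binding: "ask_user",
  credential_not_found: "ask_user",
  consent_required: "ask_user",
  pairing_failed: "ask_user",
  permission_denied: "fix_integration",
  element_not_found: "fix_integration",
  navigation_blocked: "fix_integration",
  bad_message: "fix_integration",
  playbook_invalid: "fix_integration",
  halted: "fix_integration",
  unauthorized: "fix_integration",
  internal: "report",
};

// Unknown codes fall through to "report" instead of throwing, so a newer relay
// or extension can introduce codes without breaking this consumer.
function classify(code: string): Handling {
  return Object.prototype.hasOwnProperty.call(HANDLING, code) ? HANDLING[code] : "report";
}
```

The `hasOwnProperty` guard keeps inherited property names such as `toString` from being mistaken for known codes.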


---

# Self-Hosting an OBEP Relay

OBEP is self-hostable: the reference relay in [`obep/relay-reference`](../obep/relay-reference/)
is a conformant relay you can run yourself. You do not need OpenErrand.

## Run the reference relay

```bash
corepack enable && pnpm install
PORT=8787 pnpm --filter @obep/relay-reference start
# ws://localhost:8787   health: http://localhost:8787/health
```

Front it with a TLS-terminating reverse proxy (nginx/Caddy) for `wss://` in production;
the reference relay speaks `ws://` and trusts the proxy.

## REST surface

| Endpoint | Auth | Purpose |
|---|---|---|
| `GET /health` | none | liveness |
| `GET /relay-key` | none | the relay's Ed25519 public key (for task-token verification) |
| `POST /pairing-tokens` | `Bearer <apiKey>` | issue a single-use pairing token for `{ userId }` |
| `GET /audit/:tenantId` | `Bearer <apiKey>` | that tenant's audit records (cross-tenant ⇒ 403) |

WebSocket (`/`) speaks the OBEP wire protocol (hello/pair/task.start/command/context/status).

## Provisioning

The reference relay keeps tenants, API keys, playbooks, and bindings **in memory**
(see [`Registry`](../obep/relay-reference/src/registry.ts)). For production, implement
the `AuditStore` interface against Postgres and add durable tenant/key/playbook storage —
the protocol contract is unchanged. Register programmatically:

```ts
import { startRelay } from "@obep/relay-reference";
const relay = await startRelay({ port: 8787 });
relay.registry.registerTenant("tenantA", tenantPublicKeyB64);
await relay.registry.registerApiKey("tenantA", apiKey);
relay.registry.registerPlaybook(signedPlaybook);
```

Authoring playbooks (sign with the tenant key; the relay holds only the public key):

```bash
P=obep/cli/src/index.ts
node --import tsx $P keygen --out keys
node --import tsx $P sign examples/playbooks/portal-upload.json --key keys/tenant.key --out signed.json
```

## Prove your relay is conformant

Implement `RelayProvisioner` against your relay and run the open conformance suite —
it must pass all checks (see [`runConformance`](../obep/conformance/src/runner.ts)).
This is how a customer's security team verifies *any* relay without trusting it.

## Enterprise installs

Pin the relay endpoint via Chrome managed config (`chrome.storage.managed.relayUrl`) so a
compromised page can't redirect the extension to a rogue relay. See
[Enterprise deployment](./ENTERPRISE_DEPLOYMENT.md) for managed config and auto-pairing.


---

# Enterprise Deployment (force-install + managed config)

Force-install via Google Workspace admin policy is the **primary B2B distribution
path** (target customers are on managed devices). Public Web Store install is
secondary.

## 1. Force-install the extension

In the Google Admin console (Devices → Chrome → Apps & extensions, or via policy):

- Add the extension by ID with **Force install**, or set `ExtensionInstallForcelist`
  / `ExtensionSettings.installation_mode = "force_installed"`.
- See [`enterprise-policy.example.json`](./enterprise-policy.example.json) for a complete
  example. Replace `EXTENSION_ID` with the published Web Store ID (or your self-hosted
  CRX update URL).

## 2. Push managed configuration

The extension reads `chrome.storage.managed` (read-only, admin-set). Schema:
[`obep/extension/src/managed_schema.json`](../obep/extension/src/managed_schema.json).

| Key | Purpose |
|---|---|
| `relayUrl` | **Pins** the relay endpoint, so a compromised page can't redirect the extension to a rogue relay. |
| `tenantId` | Marks an org-managed install for this tenant and enables enterprise auto-pairing. |
| `allowedTenants` | Locks the install to exactly these tenant(s). A normally installed (unmanaged) extension can **never** auto-pair. |
| `unpairLock` | When true, end users cannot unpair org bindings from the side panel. |

Set these under `ExtensionSettings.<id>.managed_configuration` (see the example).
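
As a rough sketch of how an extension might consume these keys, the snippet below models the managed config as a plain object and derives three decisions from it: which relay URL to use, whether auto-pairing is permitted for a tenant, and whether the user may unpair. The `ManagedConfig` value types (in particular `allowedTenants` as a string array), the function names, the fallback to a user-entered URL, and the default of treating `tenantId` as the only allowed tenant are assumptions for illustration; the real schema is in `managed_schema.json`. In the extension, the object would come from `chrome.storage.managed.get()`.

```typescript
// Hypothetical shape of the admin-set config. Key names follow the table above;
// value types are assumed (check managed_schema.json for the real ones).
interface ManagedConfig {
  relayUrl?: string;
  tenantId?: string;
  allowedTenants?: string[];
  unpairLock?: boolean;
}

// A pinned relayUrl always wins over anything entered by the user.
function effectiveRelayUrl(managed: ManagedConfig, userEntered?: string): string | undefined {
  return managed.relayUrl ?? userEntered;
}

// Auto-pairing is only considered for org-managed installs (tenantId present).
// Assumption: when allowedTenants is absent, only tenantId itself is allowed.
function mayAutoPair(managed: ManagedConfig, tenantId: string): boolean {
  if (!managed.tenantId) return false; // unmanaged install: never auto-pair
  const allowed = managed.allowedTenants ?? [managed.tenantId];
  return allowed.includes(tenantId);
}

// End users may unpair unless the admin set unpairLock to true.
function mayUnpair(managed: ManagedConfig): boolean {
  return managed.unpairLock !== true;
}
```

Keeping these as pure functions over the config object makes the policy easy to unit-test without a browser.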

## 3. Enterprise auto-pairing flow

With managed config present, the extension enters auto-pair mode for the pinned
tenant(s):

1. The extension reads the managed config (relay endpoint + tenant), so no manual relay
   URL entry is needed.
2. When the user is logged into the customer's web app, the app posts a **short-lived,
   signed identity assertion** (`{ tenantId, userId }` signed by the tenant's private key)
   to the extension via the trusted-origin channel.
3. The extension forwards it (`pair { assertion }`); the relay verifies the signature
   against the tenant's **registered public key** and creates the `(tenantId, userId)`
   binding — no manual code entry, no prompt.

Guardrails (enforced):

- Auto-pair only when managed config confirms an org-policy install; `allowedTenants`
  pins which tenant(s) may pair (extension + relay both check).
- Identity assertions are signed and single-use/short-lived; a forged or replayed
  assertion is rejected.
- The side panel still lists all bindings; `unpairLock` controls whether end users may unpair.
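
The guardrails above can be sketched as a relay-side check. This is a minimal illustration, not the reference relay's code: the `Assertion` fields beyond `tenantId` and `userId` (`nonce`, `expiresAt`), the names `makeAssertionChecker` and `checkAssertion`, and the injected `verifySignature` callback are all assumptions, and the signature scheme itself is left abstract.

```typescript
// Hypothetical assertion payload. Only tenantId and userId come from the flow
// above; nonce and expiresAt are assumed fields that make the assertion
// single-use and short-lived.
interface Assertion {
  tenantId: string;
  userId: string;
  nonce: string;
  expiresAt: number; // Unix time in milliseconds
}

interface Verdict {
  ok: boolean;
  binding?: string; // set when ok is true
  reason?: string;  // set when ok is false
}

// Stand-in for real verification against the tenant's registered public key.
type VerifySignature = (tenantId: string, payload: Assertion, signature: string) => boolean;

function makeAssertionChecker(allowedTenants: string[], verifySignature: VerifySignature) {
  const seenNonces = new Set<string>(); // in-memory replay cache for this sketch

  return function checkAssertion(a: Assertion, signature: string, now: number): Verdict {
    if (!allowedTenants.includes(a.tenantId)) return { ok: false, reason: "tenant not allowed" };
    if (!verifySignature(a.tenantId, a, signature)) return { ok: false, reason: "bad signature" };
    if (now >= a.expiresAt) return { ok: false, reason: "expired" };
    const nonceKey = `${a.tenantId}:${a.nonce}`;
    if (seenNonces.has(nonceKey)) return { ok: false, reason: "replayed" };
    seenNonces.add(nonceKey); // consume the nonce only after every other check passes
    return { ok: true, binding: `${a.tenantId}:${a.userId}` };
  };
}
```

A production relay would likely want a durable replay cache whose entries expire with the assertions, rather than an unbounded in-memory `Set`.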

## 4. Verify before approving

Your security team can audit the open `/obep` code and run the conformance suite against
the relay before rollout — see [SELF_HOSTING.md](./SELF_HOSTING.md) and
[SECURITY_MODEL.md](./SECURITY_MODEL.md). OpenErrand is just a relay that passes the open
conformance suite.


---

# OpenErrand — Privacy Policy

_Last updated: 2026-06-15_

OpenErrand is a browser extension that runs actions you authorize inside your own
browser. This policy describes exactly what data it handles and where that data goes.
It is written to be verifiable: the protocol and all security-critical code are open
source (Apache-2.0), so you can confirm every claim below against the source.

## The short version

- **Your logins never leave your device.** Credentials you save are encrypted on your
  machine and are sent only to the destination site you're logging into — never to
  OpenErrand, never to a connected app.
- **OpenErrand has no access to any website until you grant it**, one domain at a time.
- **We don't sell your data, and we don't use it for advertising.** If you pair the
  extension with an app, that app receives task status and a minimized page view — see
  *What leaves your device* and *Data you send to a connected app* below.

## What the extension stores on your device

All of this lives in `chrome.storage.local` on your computer. It is device-bound and is
**not** synced to a Google account or to us.

- **Credential vault** — any logins you choose to save, encrypted with AES-GCM using a
  key derived from your passphrase (PBKDF2). We never receive your passphrase or the
  decrypted contents. A forgotten passphrase cannot be recovered by anyone, including us.
- **Connection settings** — relay endpoint, paired-app bindings, trusted signing keys,
  recorded/stored playbooks, and an optional decider endpoint URL.

## What leaves your device, and to where

| Data | Goes to | Notes |
|---|---|---|
| Your **credentials** | the destination site only | Decrypted on-device at the moment of use; never to OpenErrand or a connected app. |
| The **action steps** (navigate/click/fill/upload/extract) | the destination site | This is the task running in your browser. |
| A **minimized page view** (interactive elements — labels and types, **not** values) | only a connected app's decider, and only when a signed recipe runs in app-driven mode | Off entirely for deterministic recipes and for fully local runs. Full screenshots/DOM are off by default and only sent if a signed recipe explicitly enables them. |
| **Status + audit metadata** (that an action occurred: domain + content hash, timestamps) | a connected app / relay you use | Records *that* a capture happened, never its content. Partitioned per tenant. |

If you run a playbook **locally** with no connected app and no decider endpoint, the only
network traffic the extension causes is your browser reaching the destination site —
exactly as if you'd done the steps by hand.

## Site access

The extension ships with **no host permissions**. The first time a recipe needs a
particular site, Chrome prompts you to grant access to that one domain. You can review or
revoke per-site access at any time from Chrome's extension controls. The set of sites the
extension can ever touch is bounded by the domains you've granted, which match the signed
recipe's domain fence.

## What we do *not* do

- We do not collect analytics or telemetry from the extension.
- We do not receive your credentials, your passphrase, or page content.
- We do not sell, rent, or share your data, and we do not use it for advertising or any
  purpose unrelated to running the actions you authorize.

## Data you send to a connected app

If you pair the extension with a third-party app, that app receives the status/audit
metadata and (in app-driven mode) the minimized page view described above. That app's own
privacy policy governs what it does with that data. You can see every connection in the
side panel and unpair any of them — or hit the global kill switch — at any time.

## Self-hosting

OpenErrand can be self-hosted. If you run your own relay, data described as going to "a
relay you use" goes to the server you operate, under your own policies.

## Contact

Questions about this policy: privacy@openerrand.app.