
Author an errand with an LLM coding tool

Record it first if you can. The simplest, most reliable way to author an errand is to record the flow once in the OpenErrand side panel (Advanced → Record an errand). The extension captures every step with the real selectors and derives the errand for you, so there are no screenshots to take, no flow to describe, and no selectors to guess. Reach for the LLM path below only when you can't record, e.g. when you're scripting against a site you don't have open or generating many errands programmatically. If you can record and still want an assistant's help, hand it the recorded trace (not screenshots) as the input.

When recording isn't an option, let your coding assistant (Claude Code, Cursor, etc.) draft the errand, then validate and sign it with the CLI. This page is written to be handed to that assistant — point it here, describe the flow, and iterate against obep lint until it's clean.

An errand is a signed JSON recipe for one browser flow. It carries a fence (exactly which domains, actions, and credential keys are allowed) and an optional deterministic steps list. Secrets are never in the errand; it holds only key references that are resolved from the user's on-device vault.

#The closed action set

These are the only 8 actions. Adding one means shipping a new extension version — an LLM cannot invent actions.

| action | fields | does |
|---|---|---|
| navigate | url | go to a URL |
| click | selector | click an element |
| fill | selector, value | type a non-secret value |
| fillSecret | selector, credentialKey | type a secret resolved from the vault (value never in the errand) |
| upload | selector, file | attach a file |
| wait | selector?, timeoutMs? | fixed pause / timeout |
| waitFor | selector?/text?, condition?, timeoutMs?, prompt? | poll until a target is visible (default) / enabled / hidden |
| extract | selector, as | read a value into the result under key as |
| done | | finish |
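
Because the set is closed, a draft can be checked for invented actions before it ever reaches the linter. The sketch below lists the action names exactly as they appear in the table; the helper names `isKnownAction` and `unknownActions` are hypothetical and are not part of the CLI or SDK.

```typescript
// Action names copied from the table above, as a closed union.
const ACTIONS = [
  "navigate", "click", "fill", "fillSecret", "upload",
  "wait", "waitFor", "extract", "done",
] as const;

type Action = (typeof ACTIONS)[number];

// Hypothetical helper: narrows an arbitrary string to a known action name.
function isKnownAction(name: string): name is Action {
  return ACTIONS.some((a) => a === name);
}

// Returns the action names in a draft's steps that are not in the closed set.
function unknownActions(steps: { action: string }[]): string[] {
  return steps.map((s) => s.action).filter((a) => !isKnownAction(a));
}
```

Running `unknownActions` over an assistant's draft gives a quick list of anything it made up, such as a `hover` or `scroll` step.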

Targeting by text instead of a selector. Any step that targets an element accepts text in place of selector — it matches the element's visible text / accessible label (aria-label, placeholder, an associated <label>, text content, title, or alt; a matched <label> resolves to its control). It's far more resilient to selector churn, so prefer it when a stable label exists:

{ "action": "click", "text": "Invoice" }
{ "action": "fill",  "text": "Email", "value": "{{email}}" }   // fills the input labeled "Email"

text is matched exact-first, then contains, and supports {{inputs}} like any field.
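
The exact-first, then contains order can be pictured as two passes over the candidate elements. This is only a sketch of that ordering, not the extension's matcher: the `Candidate` shape and `matchByText` are hypothetical, and strings are compared as-is because the normalization the extension applies (case, whitespace) is not specified here.

```typescript
// Hypothetical candidate: an element id plus the labels it exposes
// (aria-label, placeholder, associated <label>, text content, title, alt).
interface Candidate {
  id: string;
  labels: string[];
}

// Two-pass match: an exact label match anywhere beats any substring match.
function matchByText(candidates: Candidate[], text: string): Candidate | undefined {
  // Pass 1: exact match on any label.
  for (const c of candidates) {
    if (c.labels.some((l) => l === text)) return c;
  }
  // Pass 2: fall back to "contains".
  for (const c of candidates) {
    if (c.labels.some((l) => l.indexOf(text) !== -1)) return c;
  }
  return undefined;
}
```

With candidates labeled "Invoice history" and "Invoice", in that order, the text "Invoice" resolves to the second one, because the exact pass runs before the contains pass.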

Human-gated steps (the canonical recipe). For out-of-band human steps — verify an email, enter an OTP, complete OAuth or a CAPTCHA — use action → waitFor → continue: do the action that triggers it, then waitFor the result, then continue deterministically.

{ "action": "click", "text": "Send verification email" },
{ "action": "waitFor", "text": "Continue", "condition": "enabled",
  "timeoutMs": 600000,                                  // minutes are fine — the run survives
  "prompt": "Verify your email, then this continues automatically" },  // shown to the user
{ "action": "click", "text": "Continue" }
  • condition: "visible" (default) · "enabled" (visible and not disabled — wait for a button to un-grey) · "hidden" (wait for something to go away, e.g. a spinner or a modal).
  • Long waits are robust: the wait is polled by the extension (survives the page navigating) and the service worker is kept alive for the run's lifetime, so a minutes-long wait won't drop.
  • prompt surfaces a "waiting for you…" hint in the side panel so the user knows the run is alive and what to do.
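
To make the three conditions concrete, here is a minimal sketch of the check a poller could run on each tick. The extension does the real polling; `TargetState`, `conditionMet`, `pollUntil`, and the 250 ms interval are all assumptions for illustration, as is treating a missing target as hidden.

```typescript
type Condition = "visible" | "enabled" | "hidden";

// Hypothetical snapshot of the target at one poll; null means it was not found.
interface TargetState {
  visible: boolean;
  disabled: boolean;
}

function conditionMet(state: TargetState | null, condition: Condition = "visible"): boolean {
  switch (condition) {
    case "visible":
      return state !== null && state.visible;
    case "enabled":
      // Visible and not disabled, e.g. a button that has un-greyed.
      return state !== null && state.visible && !state.disabled;
    case "hidden":
      // Assumption: a target that is absent counts as hidden.
      return state === null || !state.visible;
  }
}

// Polls `probe` until the condition holds (true) or timeoutMs elapses (false).
async function pollUntil(
  probe: () => TargetState | null,
  condition: Condition,
  timeoutMs: number,
  intervalMs = 250, // assumed interval, not taken from the source
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    if (conditionMet(probe(), condition)) return true;
    if (Date.now() >= deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

In the email-verification recipe above, the "Continue" button would fail the `enabled` check while greyed out and pass it once the user has verified, at which point the next step runs.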

#The shape

{
  "playbookId": "acme.portal-upload",        // the errand's id (wire field name) — you pass it to the SDK as `errandId`. ^[a-z0-9]([a-z0-9._-]*[a-z0-9])?$
  "version": 1,
  "tenantId": "acme",
  "permissions": {                            // the FENCE — keep it as tight as the flow needs
    "allowedDomains": ["portal.example.com"], // exact hosts, or single-label wildcard "*.example.com". No bare "*".
    "allowedActions": ["navigate", "fillSecret", "click", "upload", "extract"],
    "allowedCredentialKeys": ["portal_username", "portal_password"], // ^[a-z0-9_]+$ — refs, never values
    "capture": { "screenshots": "never", "fullDom": false, "elementsOnly": true },
    "sensitiveSurfacePolicy": "block"         // or "acknowledge" + "acknowledgeSensitive": ["..."]
  },
  "steps": [                                  // optional deterministic happy-path; omit for pure-LLM
    { "action": "navigate", "url": "https://portal.example.com/login" },
    { "action": "fillSecret", "selector": "#username", "credentialKey": "portal_username" },
    { "action": "fillSecret", "selector": "#password", "credentialKey": "portal_password" },
    { "action": "click", "selector": "#signin" },
    { "action": "navigate", "url": "https://portal.example.com/claims/upload" },
    { "action": "upload", "selector": "input[type=file]", "file": "report.pdf" },
    { "action": "click", "selector": "#submit" },
    { "action": "extract", "selector": ".confirmation-number", "as": "confirmationNumber" }
  ],
  "fallback": "halt"                          // "halt" = stop if a step breaks; "llm" = hand off to your decider
}

issuedAt and signature are added by the CLI at sign time — the assistant does not write them.
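
The two patterns in the comments above, plus the rule that a draft carries no issuedAt or signature, are easy to pre-check. The sketch below assumes a hypothetical `Draft` type covering only the fields it inspects; `draftProblems` is an illustrative helper, and `obep lint` remains the authority.

```typescript
// Patterns copied from the comments in the shape above.
const PLAYBOOK_ID = /^[a-z0-9]([a-z0-9._-]*[a-z0-9])?$/;
const CREDENTIAL_KEY = /^[a-z0-9_]+$/;

// Hypothetical minimal view of a draft errand: only the fields checked here.
interface Draft {
  playbookId: string;
  permissions: { allowedCredentialKeys: string[] };
  issuedAt?: unknown;
  signature?: unknown;
}

function draftProblems(draft: Draft): string[] {
  const problems: string[] = [];
  if (!PLAYBOOK_ID.test(draft.playbookId)) {
    problems.push(`bad playbookId: ${draft.playbookId}`);
  }
  for (const key of draft.permissions.allowedCredentialKeys) {
    if (!CREDENTIAL_KEY.test(key)) problems.push(`bad credential key: ${key}`);
  }
  // The CLI adds these at sign time, so a draft should not carry them.
  if (draft.issuedAt !== undefined) problems.push("draft must not set issuedAt");
  if (draft.signature !== undefined) problems.push("draft must not set signature");
  return problems;
}
```

For the example errand above this returns an empty list; an id such as `Acme.Portal-` or a key such as `portal-password` would each be reported.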

#Rules the assistant must follow (these are lint errors otherwise)

  • Secrets only via fillSecret + credentialKey. Never put a password in a fill value or anywhere in the file. Credential keys are references to the user's vault.
  • Tightest possible fence. allowedDomains = only the hosts the flow touches (exact where you can); allowedActions = only the actions used; allowedCredentialKeys = only the keys referenced. Over-broad fences are flagged.
  • No bare * domain, no bare TLD. Single-label wildcards like *.example.com are allowed.
  • Default-tight capture: screenshots: "never", fullDom: false, elementsOnly: true. Anything looser must be justified and is flagged.
  • sensitiveSurfacePolicy: "block" unless a surface is explicitly acknowledged.
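
The tightest-fence rule amounts to comparing what the steps use against what the fence allows. Here is a rough sketch of that comparison, assuming hypothetical `Step` and `Fence` types and assuming "single-label wildcard" means `*.example.com` covers `a.example.com` but not `a.b.example.com`. It only looks at navigate URLs, so it is no substitute for the linter.

```typescript
interface Step {
  action: string;
  url?: string;
  credentialKey?: string;
}

interface Fence {
  allowedDomains: string[];
  allowedActions: string[];
  allowedCredentialKeys: string[];
}

// Exact host match, or a wildcard that stands for exactly one extra label (assumption).
function hostAllowed(host: string, pattern: string): boolean {
  if (pattern.indexOf("*.") !== 0) return host === pattern;
  const suffix = pattern.slice(1); // "*.example.com" -> ".example.com"
  if (host.length <= suffix.length) return false;
  if (host.slice(host.length - suffix.length) !== suffix) return false;
  return host.slice(0, host.length - suffix.length).indexOf(".") === -1;
}

const unique = (xs: string[]): string[] => xs.filter((x, i) => xs.indexOf(x) === i);
const missing = (from: string[], within: string[]): string[] =>
  unique(from).filter((x) => within.indexOf(x) === -1);

function fenceFindings(steps: Step[], fence: Fence): string[] {
  const findings: string[] = [];
  const usedActions = steps.map((s) => s.action);
  const usedKeys = steps
    .map((s) => s.credentialKey)
    .filter((k): k is string => typeof k === "string");
  const usedHosts = steps
    .filter((s) => s.action === "navigate" && typeof s.url === "string")
    .map((s) => new URL(s.url as string).hostname);

  for (const a of missing(usedActions, fence.allowedActions)) findings.push(`action used but not allowed: ${a}`);
  for (const a of missing(fence.allowedActions, usedActions)) findings.push(`action allowed but never used: ${a}`);
  for (const k of missing(usedKeys, fence.allowedCredentialKeys)) findings.push(`key used but not allowed: ${k}`);
  for (const k of missing(fence.allowedCredentialKeys, usedKeys)) findings.push(`key allowed but never used: ${k}`);
  for (const h of unique(usedHosts)) {
    if (!fence.allowedDomains.some((p) => hostAllowed(h, p))) findings.push(`host outside fence: ${h}`);
  }
  if (fence.allowedDomains.indexOf("*") !== -1) findings.push("bare * domain is not allowed");
  return findings;
}
```

Run against the example errand in "The shape", this reports nothing; adding an unused `fill` to allowedActions would produce one "allowed but never used" finding.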

#The loop

Have the assistant write flow.json, then drive it against the CLI — the linter is the guardrail, and sign refuses an errand with lint errors, so the assistant iterates until clean:

npx @obep/cli lint flow.json          # fix every ✖ error the assistant sees, re-run
npx @obep/cli sign flow.json --key keys/tenant.key --out flow-signed.json

Then register flow-signed.json with the relay; it gets an errandId you call at runtime.

#A prompt you can paste

You are writing an OBEP errand — a signed JSON recipe for one browser flow.
Read docs/ERRAND_AUTHORING.md for the exact schema, the 8 allowed actions, and
the fence rules. Then write flow.json for this flow:

  <describe the flow: the site, the login fields, the steps, what to extract>

Requirements:
- Use ONLY the 8 actions. Secrets go through fillSecret + a credentialKey (e.g.
  portal_password) — never a literal value.
- Make permissions as tight as the flow needs (exact domains, only actions used,
  only credential keys referenced; capture screenshots:never, fullDom:false).
- Do not add issuedAt or signature.

Then run `npx @obep/cli lint flow.json`, fix every error, and repeat until it
reports clean. Do not sign — I hold the signing key.

The last line matters: the human holds the signing key. An assistant drafts and lints; a person reviews and signs. See the Integration guide for the rest.