The Stash protocol.

How LLMs and agentic tools read Stash captures. This page is the human-readable companion to /llms.txt — the same spec, same precedence rules, just nicer to skim.

Precedence: native captures use MCP > XMP > pixel banner; Chrome extension captures use Stash History/MCP import > PNG tEXt chunks > visible pixels. Protocol version: stash-1.
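
The precedence rule can be sketched as a simple ordered lookup. This is an illustration only: the channel labels (`mcp`, `png_text`, and so on) and the function name `best_channel` are made up for this example and are not protocol identifiers.

```python
from typing import Optional

# Illustrative labels for this sketch; the protocol does not define these strings.
NATIVE_ORDER = ["mcp", "xmp", "pixel_banner"]
CHROME_ORDER = ["history_mcp_import", "png_text", "visible_pixels"]

def best_channel(available: set[str], chrome_extension: bool = False) -> Optional[str]:
    """Return the highest-precedence channel that is actually available."""
    order = CHROME_ORDER if chrome_extension else NATIVE_ORDER
    for channel in order:
        if channel in available:
            return channel
    return None

print(best_channel({"xmp", "pixel_banner"}))                                # xmp
print(best_channel({"png_text", "visible_pixels"}, chrome_extension=True))  # png_text
```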

Capture data availability

Supplemental matrix for agent-facing data. Rows are ordered from raw media data to protocol-level metadata and local MCP retrieval.

Screenshot data

A regular screenshot means a normal image with no Stash context.

| # | Data type | Description | LLM use | Regular screenshot | AI mode | MCP | Chrome extension |
|---|---|---|---|---|---|---|---|
| 01 | Pixels | Raster image data. | Model-visible raster input. Supports visual inspection of layout, color, shape, and apparent UI state. Anything outside the visible pixels remains inferred. | Yes | Yes | Yes | Yes |
| 02 | Visible text | Text visible in the image. | Provides visible labels, error messages, filenames, and button text through vision/OCR. Reliability depends on size, contrast, truncation, and table density. | Yes | Yes | Yes | Yes |
| 03 | Visible layout | Spatial relationship of UI elements. | Provides selected rows, disabled controls, overlapping content, alignment defects, empty states, and visible hierarchy. Semantic roles still require inference unless structured data is available. | Yes | Yes | Yes | Yes |
| 04 | App name and bundle ID | Native screenshot metadata, XMP, MCP dossier. | Identifies the owning application. App-specific behavior differs across Chrome DevTools, Cursor, Xcode, Warp, and System Settings. Bundle ID disambiguates applications with similar names or UI. | No | Yes | Yes | No |
| 05 | Window title | Native screenshot metadata, XMP, MCP dossier. | Identifies the active document, project, tab, dashboard, or terminal session when that information appears in the window title. | No | Yes | Yes | No |
| 06 | Browser URL | Native browser context when available. Chrome PNG metadata for extension captures. | Identifies route, environment, object ID, query state, and sometimes tenant/account. Prevents conflating staging with production, list routes with detail routes, or one object record with another. | No | Yes | Yes | Yes |
| 07 | Timestamp | Pixel banner, XMP, MCP dossier, Chrome PNG metadata. | Orders captures in time. Distinguishes current state, previous attempts, and before/after comparisons in a multi-turn coding session. | No | Yes | Yes | Yes |
| 08 | Appearance, OS, display, scale | Pixel banner for appearance and OS. XMP and MCP for display details. | Specifies rendering environment. UI behavior can vary by OS version, color scheme, Retina scale, display dimensions, and viewport dimensions. | No | Yes | Yes | Yes |
| 09 | Stable capture ID | Pixel banner short ID, XMP captureId, History row, MCP tools. | Provides a durable lookup key such as #A1B2C3D4. The same capture can be referenced, fetched, or searched across turns. | No | Yes | Yes | Yes |
| 10 | Annotation summary | Pixel banner line such as user focus: arrow pointing. MCP get_capture returns a fuller annotation explainer; XMP carries structured annotation metadata. | Encodes user focus separately from the application UI. The annotation describes mark behavior while target resolution still comes from vision or accessibility data. MCP carries the most detailed explainer. | No | Yes | Yes | No |
| 11 | Annotation geometry | XMP userFocus array; also returned by MCP get_capture. | Stores coordinates for arrows, boxes, and other marks. Coordinates can be matched against OCR output, accessibility nodes, or visual regions. | No | Yes | Yes | No |
| 12 | XMP payload | Embedded file metadata under the Stash namespace. | Provides file-embedded structured metadata for file-on-disk flows such as email, Drive, and local folders when the local MCP server is not reachable. | No | Yes | No | No |
| 13 | MCP capture dossier | get_capture(id). | Returns the local structured record for a capture: app, display, accessibility, developer context, and related metadata. | No | No | Yes | Yes |
| 14 | Accessibility tree | MCP capture dossier, with summarized fallback in XMP where available. | Provides roles, labels, values, enabled state, and hierarchy as structured text. Identifies controls, selected items, table rows, menu items, and form fields without relying on OCR. | No | No | Yes | No |
| 15 | Developer context | MCP capture dossier for supported editors and terminals. XMP can carry a snapshot summary. | Maps visible editor or terminal state to repo context: file path, language, selected text, cursor position, terminal cwd, git branch, and recent commands when available. | No | No | Yes | No |
| 16 | Recent and search indexes | list_recent(n) and search(query). | Provides retrieval over prior captures by app, window, text, URL, and recency. Supports multi-turn workflows without requiring the image to be pasted again. | No | No | Yes | Yes |
| 17 | Plain rendered image | render_plain(id). | Returns raw image bytes without banner or XMP. Supports evaluation of pixel-only model behavior separately from structured context. | No | No | Yes | No |
| 18 | Chrome full-page capture | Chrome extension scroll-and-stitch capture. | Captures document content beyond the visible viewport. Required when a web page issue depends on content below the fold or on relationships between distant page sections. | No | No | No | Yes |
| 19 | Chrome URL, title, domain, browser | Chrome extension PNG tEXt chunks. | Identifies the browser-origin page and runtime. Relevant for route-specific defects, auth redirects, extension behavior, and browser rendering differences. | No | No | No | Yes |
| 20 | Viewport, DPR, page height, parts | Chrome PNG tEXt chunks, including capturePart and capturePartsTotal. | Describes viewport geometry, device pixel ratio, full page height, and ordered split-image parts. Prevents treating one part as the complete page. | No | No | No | Yes |
| 21 | Image hash and DOM hash | Chrome extension PNG tEXt chunks. | Provides deterministic identifiers for the image artifact and captured DOM structure. Supports before/after comparisons and changed-structure detection. | No | No | No | Yes |
| 22 | Signature and attestation fields | Local ECDSA signature, public key, attestation ID, server timestamp when available. | Provides verification metadata for evidence workflows: local signature, public key, attestation ID, and server timestamp when available. | No | No | No | Yes |
| 23 | Downloads-to-History import | Stash watches ~/Downloads for stash-*.png files when permission is granted. | Indexes Chrome extension PNGs into local Stash History when Downloads access is granted. Imported records become retrievable through MCP recent/search tools. | No | No | No | Yes |

Video data

A regular recording means a normal screen video file with no Stash bundle metadata. There is no Chrome extension video capture path.

| # | Data type | Description | LLM use | Regular recording | Stash recording | MCP |
|---|---|---|---|---|---|---|
| 01 | Video pixels | Rendered frames in the media file. | Model-visible visual input when frames are sampled. Content outside sampled frames remains unavailable to the model. | Yes | Yes | Yes |
| 02 | Audio track | Spoken audio or system audio when present. | Provides spoken context, narration, or system sounds when the recorder includes audio. Stash stores extracted audio as audio.m4a when present. | Yes | Yes | Yes |
| 03 | Duration | Total recording length. | Orders the recording as a time span and supports timeline references such as beginning, middle, and end. | Yes | Yes | Yes |
| 04 | Sampled still frames | frame_NN.jpg, capped at 30 per recording. | Provides discrete visual checkpoints that an LLM can inspect without reading the entire video file. Stash keeps start and end bookends and samples remaining frame budget from interaction and ambient frames. | No | Yes | Yes |
| 05 | Frame order | Numeric frame filenames and frame_tags.json. | Defines the intended reading order for sampled frames, preventing the model from treating still frames as unrelated screenshots. | No | Yes | Yes |
| 06 | Per-frame app/window tags | frame_tags.json. | Maps sampled frames to visible app and window state over time. This lets an agent track app switches, active windows, and context changes inside one recording. | No | Yes | Yes |
| 07 | Frame tag classification | start, interaction, end, ambient, and gap-fill. | Labels why each saved frame exists in the bundle. The tag set is stored in frame_tags.json. | No | Yes | Yes |
| 08 | report.md | YAML frontmatter plus markdown timeline. | Provides a text-first summary of the recording with machine-readable fields such as protocol, bundle version, capture ID, duration, frame count, audio presence, primary app, and MCP URI. | No | Yes | Yes |
| 09 | llms.txt | Offline self-description inside the bundle. | Gives an agent local instructions for reading the bundle even when the website protocol page is unavailable. | No | Yes | Yes |
| 10 | Bundle capture ID and MCP URI | captureId and stash://bundle/<UUID>. | Provides a durable handle for referencing, fetching, and discussing the same recording across turns. | No | Yes | Yes |
| 11 | MCP bundle fetch | get_bundle(id). | Returns the video bundle as one unit: report.md, enriched frame_tags, and absolute paths to every asset in the folder. | No | No | Yes |
| 12 | Absolute asset paths | Returned by get_bundle(id). | Allows a local agent to inspect specific frames, audio, report files, or the original video without asking the user to locate files manually. | No | No | Yes |

Three channels, one capture

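Every native Stash screenshot carries structure in three places (pixel banner, XMP metadata, and the local MCP server) so a capture can always be resolved, even after the user pastes it into a web chat and every byte of metadata is stripped. Chrome extension captures use PNG tEXt chunks as their file-embedded channel.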

| Channel | Survives | What it carries |
|---|---|---|
| Pixel banner | Anything an image survives | App, window title, appearance, OS version, timestamp, shortID |
| XMP metadata | File-on-disk flows (Drive, email) | Full structured payload: annotations, a11y tree summary, dev context |
| Chrome extension PNG tEXt | Downloaded browser captures and ordered long-page parts | URL, title, browser, viewport, page height, image/DOM hashes, part fields, attestation/signature fields, generic visible page context |
| MCP server | Local RPC (same machine) | Live dossier including full a11y tree, annotation explainers, and un-summarized fields |

Screenshot banner

Rendered at the bottom of every Stash screenshot in a monospace font:

📌 Claude — Settings · dark · macOS 26.4 · 2026-04-12 14:24 · #8FD26F28

When the user drew annotations, a second line appears above the pin:

user focus: blue arrow pointing · red box enclosing

The banner describes shape behavior — never the target. Resolve the target yourself using vision and/or the a11y tree.

Standard banner explainers are intentionally compact: arrow pointing, double-arrow connecting, box enclosing, oval enclosing, blur obscuring, mark/highlight marking, callout annotating, emoji marking, and label text.
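
As a rough illustration, a reader that has already OCR'd the pin line could split it into fields like this. The sketch assumes the field order shown in the example banner, an 8-character uppercase hex short ID, and a `YYYY-MM-DD HH:MM` timestamp with no time zone. The function name `parse_banner` and the output keys are hypothetical. Real OCR output may need more tolerant matching.

```python
import re
from datetime import datetime

PIN = "\U0001F4CC"   # pushpin emoji that opens the banner line
SEP = " \u00b7 "     # middle dot between banner fields
DASH = " \u2014 "    # dash between app name and window title

def parse_banner(line: str) -> dict:
    """Parse one banner line into fields; raises ValueError if it does not fit."""
    text = line.strip()
    if text.startswith(PIN):
        text = text[len(PIN):].strip()
    parts = text.split(SEP)
    if len(parts) < 5:
        raise ValueError("not a Stash banner line")
    # Parse from the right, since a window title could itself contain a middle dot.
    appearance, os_version, stamp, short_id = parts[-4:]
    if not re.fullmatch(r"#[0-9A-F]{8}", short_id):   # assumed short ID shape
        raise ValueError("missing short capture ID")
    app, _, window = SEP.join(parts[:-4]).partition(DASH)
    return {
        "app": app,
        "windowTitle": window,
        "appearance": appearance,
        "osVersion": os_version,
        "timestamp": datetime.strptime(stamp, "%Y-%m-%d %H:%M"),
        "shortId": short_id.lstrip("#"),
    }

banner = (PIN + " Claude" + DASH + "Settings" + SEP + "dark" + SEP
          + "macOS 26.4" + SEP + "2026-04-12 14:24" + SEP + "#8FD26F28")
info = parse_banner(banner)
print(info["app"], info["windowTitle"], info["shortId"])   # Claude Settings 8FD26F28
```

The short ID is the useful part: when the MCP server is reachable, it is the lookup key for the fuller record.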

XMP payload

On auto-save-to-desktop for developer apps, the JPEG carries an XMP payload under the namespace http://stash.app/ns/1.0/. The payload is serialized as a single JSON string under stash:payload:

{
  "protocolVersion": "stash-1",
  "source": "xmp-snapshot",
  "captureId": "8FD26F28-…",
  "mcpURI": "stash://capture/8FD26F28-…",
  "snapshotTimestamp": "2026-04-12T14:24:00Z",
  "appName": "Cursor",
  "bundleID": "com.todesktop.230313mzl4w4u92",
  "windowTitle": "ContextBannerRenderer.swift",
  "appearance": "dark",
  "osVersion": "macOS 26.4",
  "userFocus": [
    { "type": "arrow", "color": "BA0C2F", "behavior": "pointing",
      "llmInstruction": "User drew a single arrow. Treat the arrow tip/end point as the specific object, control, text, state, or visual detail the user wants called out. Do not treat it as decoration.",
      "from": [120, 340], "to": [420, 300] }
  ],
  "a11yTreeSummary": { /* trimmed: top 3 levels + labelled controls */ },
  "devContext": {
    "activeFilePath": "/Users/x/proj/Foo.swift",
    "selectedText": "let appearance = …",
    "gitBranch": "main"
  }
}

The file is also tagged with IPTC 2025.1 Iptc4xmpExt:AISystemUsed = "Stash" so conformant tooling can detect AI-assisted captures. Filename convention on save-to-desktop: Stash-YYYY-MM-DD-HHmmss-{shortID}.jpg.
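
Once the stash:payload string has been extracted from the XMP packet (that step is not shown here), reading annotation geometry is ordinary JSON handling. The sketch below assumes that `to` is the arrow tip, which the from/to naming and the llmInstruction wording suggest but do not state outright. The function name `arrow_tips` is hypothetical.

```python
import json

def arrow_tips(payload_json: str) -> list[tuple[int, int]]:
    """Return the assumed tip coordinates of every arrow in a stash:payload string."""
    payload = json.loads(payload_json)
    tips = []
    for mark in payload.get("userFocus", []):
        # Assumption: "to" holds the tip/end point of the arrow.
        if mark.get("type") == "arrow" and "to" in mark:
            x, y = mark["to"]
            tips.append((x, y))
    return tips

sample = json.dumps({
    "protocolVersion": "stash-1",
    "source": "xmp-snapshot",
    "userFocus": [{"type": "arrow", "behavior": "pointing",
                   "from": [120, 340], "to": [420, 300]}],
})
print(arrow_tips(sample))   # [(420, 300)]
```

The returned points can then be matched against OCR boxes or accessibility nodes to resolve what the arrow targets.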

Chrome extension PNG metadata

The Stash Chrome extension saves full-page browser captures as PNG files named stash-{domain}-{date}-{time}.png or stash-{domain}-{date}-{time}-part-XX-of-YY.png. Very tall pages are split into ordered parts, and each part is a separate image artifact.
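
A minimal sketch of putting split parts back in reading order from their filenames. It matches only the `-part-XX-of-YY.png` suffix, because the exact {date} and {time} formats are not given here. It also assumes parts are numbered from 1 and that the list holds parts of a single capture. The sample filenames and the name `order_parts` are hypothetical.

```python
import re

# Only the part suffix is parsed; the domain/date/time portion is left opaque.
PART_RE = re.compile(r"^stash-.+-part-(\d+)-of-(\d+)\.png$")

def order_parts(filenames: list[str]) -> list[str]:
    """Return the parts of one split capture in order; raise if any part is missing."""
    found: dict[int, str] = {}
    total = None
    for name in filenames:
        m = PART_RE.match(name)
        if m:
            found[int(m.group(1))] = name
            total = int(m.group(2))
    if total is None:
        return []
    missing = [n for n in range(1, total + 1) if n not in found]
    if missing:
        raise ValueError(f"missing parts: {missing}")
    return [found[n] for n in range(1, total + 1)]

names = [
    "stash-example.com-DATE-TIME-part-02-of-03.png",
    "stash-example.com-DATE-TIME-part-03-of-03.png",
    "stash-example.com-DATE-TIME-part-01-of-03.png",
]
print(order_parts(names)[0])   # stash-example.com-DATE-TIME-part-01-of-03.png
```

Failing on a missing part is deliberate: it keeps an agent from treating an incomplete set as the whole page.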

Each PNG includes tEXt chunks with stash: keywords, including URL, title, timestamp, browser, viewport, DPR, page height, domain, extension version, image hash, DOM hash, security protocol, extension ID, attestation ID, capture part, capture parts total, generic visible page context, and optional local/server signature fields.

When Stash for Mac is running and has Downloads Folder access, it event-watches ~/Downloads, imports new stash-*.png files into History as source app Chrome Extension, preserves the PNG metadata in metadata_json, and dedupes by the original PNG SHA-256 hash. Prefer the imported History/MCP item over raw PNG parsing when both are available.
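
Deduping by SHA-256 of the original file bytes can be sketched in a few lines. This illustrates the idea only and is not Stash's importer. The function name and the in-memory `files` mapping are hypothetical.

```python
import hashlib

def dedupe_by_sha256(files: dict[str, bytes]) -> dict[str, str]:
    """Map each distinct SHA-256 digest to the first filename that produced it."""
    seen: dict[str, str] = {}
    for name, data in files.items():
        digest = hashlib.sha256(data).hexdigest()
        seen.setdefault(digest, name)   # later duplicates are ignored
    return seen

files = {
    "stash-a.png": b"same bytes",
    "stash-a (1).png": b"same bytes",   # browser re-download of the same file
    "stash-b.png": b"different bytes",
}
print(sorted(dedupe_by_sha256(files).values()))   # ['stash-a.png', 'stash-b.png']
```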

Video bundles

Video bundles are produced by the Stash screen recorder. Each bundle is a self-describing folder, indexable as one unit:

Recordings/<uuid>/
├── report.md           ← YAML frontmatter + markdown timeline
├── frame_tags.json     ← { "frames": [ … ] } — per-frame app/window/tag
├── llms.txt            ← offline self-description
├── frame_NN.jpg        ← 1-indexed, zero-padded; hard-capped at 30 per
│                         recording. Start + end bookends always kept;
│                         remaining budget sampled uniformly from
│                         interaction frames, then ambient. Read in
│                         numeric order via frame_tags.json.
├── audio.m4a           ← extracted audio when present
└── video.mp4           ← original; generally skip
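
frame_tags.json is the intended index for reading order, but its per-frame field names are not listed here. As a fallback sketch, frames can be ordered by the number in their filename. The function name `ordered_frames` is hypothetical.

```python
import re

FRAME_RE = re.compile(r"^frame_(\d+)\.jpg$")
MAX_FRAMES = 30   # hard cap stated for each recording

def ordered_frames(names: list[str]) -> list[str]:
    """Return frame_NN.jpg names in numeric order, ignoring other bundle files."""
    frames = []
    for name in names:
        m = FRAME_RE.match(name)
        if m:
            frames.append((int(m.group(1)), name))
    if len(frames) > MAX_FRAMES:
        raise ValueError("more frames than the 30-frame cap allows")
    return [name for _, name in sorted(frames)]

listing = ["video.mp4", "frame_10.jpg", "report.md", "frame_02.jpg", "frame_01.jpg"]
print(ordered_frames(listing))   # ['frame_01.jpg', 'frame_02.jpg', 'frame_10.jpg']
```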

The report.md opens with machine-readable YAML frontmatter:

---
protocol: stash-1
bundleVersion: 2
captureId: <UUID>
duration: 42.30
frameCount: 12
hasAudio: true
primaryApp: Cursor
mcpURI: stash://bundle/<UUID>
---
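
Because the frontmatter shown above is a flat list of `key: value` pairs, a small reader can pull it out without a YAML library. This sketch assumes the frontmatter stays flat. A real YAML parser is the safer choice if nested values ever appear. The `# Timeline` heading in the sample is made up.

```python
def parse_frontmatter(report_md: str) -> dict[str, str]:
    """Read flat `key: value` pairs between the first two '---' lines."""
    lines = report_md.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("report.md does not start with frontmatter")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so URI values keep their own colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("frontmatter is not closed")

report = """---
protocol: stash-1
bundleVersion: 2
duration: 42.30
hasAudio: true
mcpURI: stash://bundle/<UUID>
---
# Timeline
"""
meta = parse_frontmatter(report)
print(meta["protocol"], float(meta["duration"]), meta["mcpURI"])
# stash-1 42.3 stash://bundle/<UUID>
```

All values come back as strings. The caller converts duration, frameCount, and hasAudio as needed.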

MCP server

Stash ships a local Model Context Protocol server on a UNIX domain socket at ~/Library/Application Support/Stash/mcp.sock — line-delimited JSON-RPC 2.0. Local-only by design; the socket is not exposed to the network.
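
Line-delimited JSON-RPC 2.0 means one JSON object per line. The sketch below shows only the framing and does not open the socket. A standalone script would normally go through the stash-mcp bridge described under Transport, and an MCP session starts with an initialize handshake that is not shown. `tools/call` is the standard MCP method for invoking a tool. The argument key `n` is an assumption based on the list_recent(n) signature.

```python
import json

def encode_request(request_id: int, method: str, params: dict) -> bytes:
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    # json.dumps escapes newlines inside strings, so the message stays on one line.
    return json.dumps(message, separators=(",", ":")).encode("utf-8") + b"\n"

def decode_lines(buffer: bytes) -> tuple[list[dict], bytes]:
    """Split a receive buffer into complete messages plus any unfinished tail."""
    *complete, tail = buffer.split(b"\n")
    return [json.loads(line) for line in complete if line.strip()], tail

wire = encode_request(1, "tools/call", {"name": "list_recent", "arguments": {"n": 5}})
messages, rest = decode_lines(wire + b'{"jsonrpc":"2.0","id":2')
print(messages[0]["method"], len(messages))   # tools/call 1
```

Keeping the unfinished tail lets a reader append the next chunk from the socket and decode again without losing a partial message.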

Transport

Stdio MCP clients (Claude Code, Claude Desktop, Cursor, Codex CLI, Continue, Windsurf, Zed, Warp, Cline, …) connect via a small bridge binary that relays stdin/stdout to the socket. The bridge ships bundled inside the app at /Applications/Stash.app/Contents/Helpers/stash-mcp — pre-signed as part of Stash.app under Stash's Apple team with Hardened Runtime. There is no compile step and no Apple Developer certificate required. The one-line installer at yourstash.ai/claude verifies the bundled helper exists and is validly signed, then points Claude Code, Claude Desktop, and Cursor at that absolute path. Other clients are manual; see /claude#manual-setup for full per-client snippets.

Note: GUI clients launched from Finder/Dock do not inherit your shell PATH and ~ does not expand reliably — always use the absolute path to stash-mcp in the command field. The bundled path /Applications/Stash.app/Contents/Helpers/stash-mcp is already absolute.

Peer auth

Stash reads the peer's codesign team identifier on connect and silently rejects unknown signers. Built-in allowlist: Anthropic (58LP8PCM82) and Stash itself (VJMJQKCRMC). The bundled bridge is signed under VJMJQKCRMC, so it connects with no extra setup. Extend the allowlist via Stash → Settings → Privacy → Additional trusted team IDs, or toggle Allow unsigned MCP clients (the mcpAllowUnsignedClients UserDefault). That toggle is for advanced/local use only: enabling it lets any unsigned local process connect to the MCP socket, so it is not recommended for general use.

One honest caveat: team-ID allowlisting is a speedbump, not a hard boundary. Because the bridge is a pure relay, any local process can launch the trusted bridge and drive it — a confused-deputy weakness. It stops other signed apps from connecting directly; it does not stop arbitrary same-user code. A capability-token handshake is the planned future replacement.

Tools

| Tool | Purpose |
|---|---|
| get_capture(id) | Full dossier for a screenshot or video capture |
| get_bundle(id) | Video bundle: report.md, enriched frame_tags, absolute file paths |
| list_recent(n) | Paste-flow fallback; compact summaries newest-first |
| search(query) | Substring match over app / window / text / URL |
| render_plain(id) | Raw JPEG bytes, no banner, no XMP (for evals) |

All tools except render_plain return the MCP-standard {content: [{type: "text", text: "<json>"}]} envelope. render_plain instead returns an image item: {type: "image", data: "<base64>", mimeType: "image/jpeg"}.
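
A client-side sketch of unwrapping these results. It assumes the image item from render_plain also arrives inside a `content` array, which the text above does not state explicitly. The function name `unwrap_result` and the sample data are hypothetical.

```python
import base64
import json

def unwrap_result(result: dict):
    """Decode the first content item: parsed JSON for text, raw bytes for image."""
    item = result["content"][0]
    if item["type"] == "text":
        return json.loads(item["text"])
    if item["type"] == "image":
        return base64.b64decode(item["data"])
    raise ValueError(f"unexpected content type: {item['type']}")

text_result = {"content": [{"type": "text", "text": json.dumps({"appName": "Cursor"})}]}
image_result = {"content": [{"type": "image",
                             "data": base64.b64encode(b"\xff\xd8\xff").decode("ascii"),
                             "mimeType": "image/jpeg"}]}
print(unwrap_result(text_result))    # {'appName': 'Cursor'}
print(unwrap_result(image_result))   # b'\xff\xd8\xff'
```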

Annotation explainers

The screenshot banner has limited space, so it uses compact behavior labels. MCP returns the same shapes with coordinates plus an expanded llmInstruction, a userFocusSummary, and a userFocusGuidance field that tells agents to treat annotations as intentional user instructions, not decoration.

| Shape | MCP explainer |
|---|---|
| Arrow | The tip/end point is the exact thing the user wants called out. |
| Double arrow | The endpoints define a relationship, comparison, distance, before/after connection, or dependency between two visible objects. |
| Rectangle | The enclosed area is deliberate focus. It may contain existing content to consider together, or it may reference a proposed/new region or shape the user wants the agent to reason about. |
| Ellipse | The enclosed area is deliberate focus. It may call out one object, a cluster, a status, or an ambiguous region the user wants isolated from the rest of the screen. |
| Blur | The region is intentionally hidden or sensitive. The agent should never attempt to retrieve, reconstruct, infer, OCR, or guess what is under the blur; it should reason from surrounding visible context only. |
| Line | The mark indicates alignment, boundary, path, separation, or a visual span. |
| Highlight | The marked content is high-priority evidence. Preserve exact visible wording when possible. |
| Callout | The leader endpoint is the referenced target and the callout marker is an ordered user note or sequence marker. |
| Emoji | The marker position carries user emphasis or sentiment attached to the nearest visible UI/content. |
| Text label | The text is user-authored instruction or context for the nearest visible object or region. |
| Multiple colors | If annotations use multiple colors, assume the user is separating distinct situations, categories, priorities, or comparisons on the screenshot unless the visible context proves otherwise. |

Privacy model

Versioning

stash-1 is stable. Additive changes (new fields, new tools) land as v1.1, v1.2 and do not break v1 readers. A breaking change bumps to stash-2. Prefer live MCP data over a frozen XMP snapshot when both are available.
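
A tolerant v1 reader might look like the sketch below: it keeps the fields it knows, ignores anything additive, and refuses a different major version. This assumes additive v1.x releases keep protocolVersion set to "stash-1", which is not confirmed above. The `KNOWN_FIELDS` subset and the `futureField` key are illustrative.

```python
# Illustrative subset of fields; a real reader would list everything it uses.
KNOWN_FIELDS = {"protocolVersion", "captureId", "appName", "windowTitle"}

def read_v1(payload: dict) -> dict:
    """Keep understood fields, ignore additive ones, reject other protocol majors."""
    version = payload.get("protocolVersion")
    if version != "stash-1":   # assumption: v1.x payloads still report "stash-1"
        raise ValueError(f"unsupported protocol: {version!r}")
    return {k: v for k, v in payload.items() if k in KNOWN_FIELDS}

newer = {"protocolVersion": "stash-1", "appName": "Cursor", "futureField": 123}
print(read_v1(newer))   # {'protocolVersion': 'stash-1', 'appName': 'Cursor'}
```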

The machine-readable version of this page lives at yourstash.ai/llms.txt. Point your agent's config there for a one-shot sync; it's the same spec as above, optimized for plain-text consumption.

Frequently asked questions

What is the stash-1 protocol?

The stash-1 protocol is Stash's specification for how LLMs and agentic tools read Stash captures across native screenshot channels, Chrome extension PNG metadata, and a local Model Context Protocol (MCP) server. Each channel is a fallback for the layer above.

Which channel is most authoritative — banner, XMP, or MCP?

For native Stash captures, the MCP server is most authoritative when the original Mac is reachable, because it carries the full live dossier including the un-summarized accessibility tree. XMP is next — it survives file-on-disk flows like email and Drive. The pixel banner is the last-resort channel that survives anything an image survives, including pasted-into-chat captures where every byte of metadata has been stripped. For Chrome extension captures, prefer the imported Stash History/MCP item, then the PNG tEXt chunks, then visible pixels.

Can I read Stash captures from a remote machine?

No. The MCP server binds to a UNIX domain socket under the user account and is not exposed to the network by design. To read captures remotely, agents must rely on the XMP payload embedded in the file or the pixel banner, or on the PNG tEXt chunks for Chrome extension captures.

How does Stash authenticate MCP clients?

Stash reads the connecting peer's codesign team identifier on connect and silently rejects unknown signers. The built-in allowlist trusts Anthropic (team 58LP8PCM82) and Stash itself (team VJMJQKCRMC). The bridge binary ships bundled inside Stash.app, pre-signed under team VJMJQKCRMC, so it connects with no extra setup. Additional team IDs can be added via Settings → Privacy, or the mcpAllowUnsignedClients toggle can bypass the check entirely — but that lets any unsigned local process connect to the socket, so it's for advanced local use only, not a general recommendation. Note that team-ID allowlisting is a speedbump, not a hard boundary: because the bridge is a pure relay, any local process can launch the trusted bridge and drive it (a confused-deputy weakness). It stops other signed apps from connecting directly; it does not stop arbitrary same-user code. A capability-token handshake is the planned future replacement.

Are Stash video bundles indexable by AI tools?

Yes. Each bundle is a self-describing folder containing report.md with YAML frontmatter, frame_tags.json with per-frame app and window metadata, an offline llms.txt, sampled JPEG frames (capped at 30 per recording), and extracted audio. Agents read it as a single unit via get_bundle().

Will the stash-1 protocol ever break?

No. The stash-1 protocol is stable. Additive changes — new fields, new tools — land as v1.1, v1.2 and do not break existing v1 readers. A breaking change would bump to stash-2.
