Announcing the Theseus Playground

A browser IDE for authoring Theseus agents in markdown. The shape is the one Claude Agent SDK and OpenAI Assistants users already know (a system prompt and a tool catalog), and the playground takes both directly.

June 8, 2026

Most working AI agents come in two parts: a system prompt that defines the persona and operating contract, and a list of tools the model can call. The Claude Agent SDK and OpenAI Assistants both keep this shape clean. Typical production agents are fifty lines of prompt and a handful of HTTP endpoints behind them.

Deploying one onto a verifiable runtime should not require rewriting any of that. Until recently it did. Theseus agents were authored in a Rust subset called the SHIP IR, compiled to SCALE bytes, and registered on chain via the CLI. The result was correct (each agent got its own on-chain account, signed receipts on every model invocation and tool call, and a content hash of the deployed code), but the on-ramp was a Rust file with ToolSpec literals most agent developers had never written.

The playground takes that on-ramp away. You author the files you would for a Claude agent: a THESEUS.md holding the system prompt and frontmatter (the agent's name and model), a tools.yaml declaring its tool surface, and one or more SKILL.md files for the capabilities it can call. The playground reads them, generates the SHIP IR, compiles in the browser via WASM, signs, and submits the register_ship_agent extrinsic. The compiled bytes are byte-identical to what a hand-authored Rust agent would produce. The playground is in private beta; you can request access by emailing eric@theseus.network. If you want to skip to the worked example, the five-minute walkthrough takes a starter agent from a template to an on-chain address.

You sign in with an email magic link, GitHub, or Google, or look around in read-only guest mode first.

The playground sign-in: email magic link, GitHub, or Google, with a guided tour for first-timers

The playground IDE: a THESEUS.md system prompt on the left, a live preview of the agent's name, model, and sovereignty on the right

What this looks like

A working agent in three pieces. Two of them are markdown.

This is the workspace the playground opens with: Hello Agent, a greeter that can read its own balance. THESEUS.md holds the frontmatter, the agent's name, id, and model, followed by the system prompt body. The model, sovereignty, a description, and tags are also editable in the deploy panel beside the editor, shown above.

THESEUS.md open in the playground editor: name, id, and model frontmatter above the system prompt body

The greeter's one capability is a skill at skills/check-balance/SKILL.md. Its frontmatter names the native tool it is allowed to reach, and the body tells the model how to use it.

The check-balance skill open in the playground, declaring allowed-tools: chain.balance

The tool surface lives in tools.yaml. It draws from two pools: native tools the runtime executes on chain (token transfers, balances, bridge sends), and common tools verifiably executed off chain from a curated registry (web_search, web_fetch). Native tools are available by default, so the greeter leaves them as they are; you edit tools.yaml only to narrow the native set or opt into a common tool.

tools.yaml open in the playground, listing the native on-chain tools available to the agent

On deploy the playground reads the workspace, emits the tool surface plus a standard tool-loop control flow (init → think → act → think → done), compiles the assembly to SCALE bytes in the browser, and submits register_ship_agent with your wallet. The agent receives its own on-chain account, funded with an initial endowment you set at deploy. Every subsequent invocation queues a model job, which a prover picks up and resolves on chain. Native tool calls execute on chain; common tool calls dispatch to a verifiable executor and the response body returns on chain.

If you need a control flow the synthesizer does not generate (pauses for human input, custom state fields, scheduled triggers, structured output schemas), you add an agent.rs to the workspace and the playground uses yours instead.

The agent is the account of record

The most important thing the playground deploys is not the prompt or the tools, but the agent's own on-chain account. At register_ship_agent time the chain derives an AccountId from hash(deployer || compiled_hash || salt), transfers an initial endowment from your account to that address, and stores the compiled SHIP IR against it. From that point on the agent's address has its own balance and its own on-chain history, distinct from the deployer's. The chain attributes operations to the agent, not to you.

This is the model How AI Agents Actually Own Assets lays out at length. In practical terms it means two things today. First, every inference result, every refused commission, every reconciled price feed lands on chain indexed by the agent's address; that record is the agent's, not the deployer's. Second, the agent's authority is not a private key but its compiled SHIP IR, which is hash-locked at deploy. A buggy or compromised model output can only spend what has been funded into the agent's address, and only on the actions the SHIP IR encodes. There is no operator key to compromise mid-run because the agent's authorization is the code on chain.

What the chain records

Beyond the account itself, the verifiability surface is straightforward. The deployed agent has a deterministic content hash, which anyone can re-derive from the published THESEUS.md and SKILL.md files. If the agent's behavior is updated on chain, the hash diverges and the change is visible.

Every model invocation generates two on-chain events. A Queued event fires when the agent's SHIP IR triggers inference; it names the model tag and includes the prompt bytes. A Verified event fires when the prover submits the result; it carries the output, the reasoning trace, an output hash, and the identity of the prover who signed. The chain stores an input commitment (hash) in persistent state and preserves the full prompt in the event log. The model tag is fixed by the compiled SHIP IR, so a prover cannot quietly route an inference job to a different model than the agent declared.

Every tool call lands on chain with the args sent and the response body the executor submitted, truncated if oversized. Anyone can hash the body and check it against what the executor reported. A tool that lies about what it returned can be challenged on the bytes themselves.

What's live, and what's next

The playground is live in private beta, and it deploys to the Theseus alpha testnet. Registering an agent, funding its account, and reading its on-chain history all work today. What the testnet does not yet run is the prover. Until it does, demo-agents.theseus.network exercises each agent against centralized model providers, and the on-chain version is wired up and waiting for the prover to come online.

Sovereign agents are no longer a future item. The sovereignty selector lets you deploy an agent with no controller at all, where the on-chain SHIP IR is the only authority, or keep the operator-managed relationship where the deployer can update the agent, add or remove skills, deactivate it, or cancel a run.

Still ahead: the prover resolving inference on chain; the allowed-tools runtime filter, which turns each skill's allowed-tools frontmatter from documentation into a gate on which tools the model can reach; and a transfer_ownership extrinsic so an operator-managed agent's administrative relationship can move NFT-style. The THESEUS.md and tools.yaml you write today do not change when those land.

Try it

The playground is in private beta. Request access by emailing eric@theseus.network.

theseus.network/docs/playground: the five-minute walkthrough. Starts from a markdown template, walks you through the frontmatter and tools.yaml, and ends with an on-chain address.

demo-agents.theseus.network: thirteen worked examples. Eight adjudicators (price oracle, bridge guardian, governance reviewer, FAA safety reviewer, sovereign fund, Polymarket adjudicator, launch sniper, Terra failsafe) and five identity-anchored agents (literary author, visual artist, music critic, legal co-author, in-game NPC chronicler). Each is a single THESEUS.md and a skills directory; the playground ships all thirteen as starter templates.

For what we are shipping next, theseuschain.substack.com carries the long-form thread and t.me/theseusnetwork is the chat.