The AIVM is the execution backbone of Theseus, engineered to support sovereign AI with tensor-native opcodes, deterministic execution, and cryptographic proof generation.
The AIVM comprises a set of tightly scoped, modular subsystems designed specifically for AI workloads:
- **Execution engine:** evaluates tensor opcodes deterministically using stack-based dispatch and fixed-point arithmetic, ensuring that every node running the same computation produces identical results.
- **Memory manager:** manages sandboxed memory for fast retrieval; agent contexts, embeddings, and temporary tensors are stored here during execution.
- **Proof generator:** produces a succinct Tensor Commit receipt so any validator can check a result in milliseconds; this is the bridge between execution and verification.
- **Syscall layer:** provides secure boundary crossing for external calls (e.g., agent messaging, external state anchoring); all cross-boundary operations are verified and metered.
- **State committer:** emits a Merkle root each block, giving auditors a single hash to verify full execution traces; this enables light clients and cross-chain verification.
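The per-block Merkle root can be pictured as a minimal binary hash tree over per-transaction trace hashes. The AIVM's actual tree layout (arity, domain separation, padding rules) is not specified here, so this SHA-256 sketch is purely illustrative:

```python
# Illustrative binary Merkle tree over execution-trace hashes (not the AIVM's
# actual construction, which this text does not specify).
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Pairwise-hash leaves up to a single root; an odd node is carried up."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(level[i] + level[i + 1]))
            else:
                nxt.append(level[i])  # odd node promoted unchanged
        level = nxt
    return level[0]

traces = [b"tx0-trace", b"tx1-trace", b"tx2-trace"]
root = merkle_root(traces)
print(root.hex())  # one hash committing to every execution trace in the block
```

A light client that trusts this root can verify any single trace with a logarithmic-size inclusion proof rather than replaying the whole block.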
Unlike traditional VMs, the AIVM includes specialized operations for AI inference:
All randomness comes from a VRF (Verifiable Random Function), and Tensor Commit validation is fully deterministic, so any full node can reproduce receipts bit-for-bit; accepting a result still requires consensus across the full validator set.
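A toy sketch of what bit-for-bit reproducibility means in practice: deterministic fixed-point arithmetic plus a hash receipt over the serialized result. The Q16.16 format and SHA-256 receipt below are assumptions for illustration, not the actual Tensor Commit construction:

```python
# Hypothetical sketch: Q16.16 fixed-point math (pure integer operations, so
# every node computes identical bits) plus a SHA-256 receipt over the
# canonically serialized result. Not the real Tensor Commit scheme.
import hashlib
import struct

SCALE = 1 << 16                       # Q16.16 fixed point

def to_fixed(x: float) -> int:
    """Quantize once at the input boundary; the only lossy step."""
    return int(round(x * SCALE))

def fx_mul(a: int, b: int) -> int:
    """Fixed-point multiply using integer math: bit-identical on every node."""
    return (a * b) >> 16

def receipt(values: list[int]) -> bytes:
    """Hash the canonical little-endian serialization of the output tensor."""
    blob = b"".join(struct.pack("<q", v) for v in values)
    return hashlib.sha256(blob).digest()

out = [fx_mul(to_fixed(0.5), to_fixed(4.0)),   # 2.0 in Q16.16
       fx_mul(to_fixed(2.0), to_fixed(0.25))]  # 0.5 in Q16.16
print(out)                 # [131072, 32768]
print(receipt(out).hex())  # any node re-running this gets the same hash
```

Because no floating point is involved after quantization, a validator that replays the computation derives the exact same receipt bytes, and any single-bit deviation is detected.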
Every tensor operation carries a gas cost that scales linearly with its FLOP count (floating-point operation count):
A congestion multiplier, broadcast once per block, keeps prices elastic under load; this allows accurate gas metering while fairly tracking hardware costs and inference statefulness.
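The pricing rule can be sketched in a few lines. The rate (0.7 gas per FLOP) and the basis-point congestion multiplier are invented for illustration, with the rate chosen only so the example lands roughly in the gas bands of the benchmark table below; integer arithmetic keeps metering deterministic:

```python
# Hedged sketch of FLOPs-linear gas pricing with a once-per-block congestion
# multiplier. The rate (7 gas per 10 FLOPs) and basis-point scaling are
# hypothetical, not taken from the Theseus spec.
GAS_NUM, GAS_DEN = 7, 10              # hypothetical rate: 0.7 gas per FLOP

def gas_cost(flops: int, congestion_bps: int = 10_000) -> int:
    """gas = flops * rate * congestion, in pure integer arithmetic."""
    return flops * GAS_NUM * congestion_bps // (GAS_DEN * 10_000)

# ~1.2M FLOPs for one token at no congestion (10_000 bps = 1.0x):
print(gas_cost(1_200_000))            # 840000
# Same operation when the block broadcasts a 1.5x congestion multiplier:
print(gas_cost(1_200_000, 15_000))    # 1260000
```

Because the multiplier is fixed per block, every validator meters an identical cost for the same operation within that block.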
| Model | Tokens/sec (A100) | Est. Gas/Token |
|---|---|---|
| GPT-2 | 180-200 | 50K |
| LLaMA-7B | 90-100 | 150K |
| LLaMA-13B | 50-60 | 400K |
| GPT-3.5 | 15-25 | 800K-1M |
| LLaMA-65B | 5-10 | ≥900K |
These benchmarks align with contemporary LLM workloads. For instance, with GPT-3 (175B), a single forward pass over a 1024-token prompt takes approximately 40-60 ms on A100 hardware, requiring upwards of 1.2M FLOPs per token.
Theseus must juggle thousands of simultaneous model calls—from millisecond-critical trading bots to overnight analytics—without favoring whales or letting gas fees spike at random.
Many real-world AI inference tasks, such as Mixture-of-Experts (MoE), do not involve a single forward pass through one model. Treating each step as a separate on-chain transaction costs extra gas, adds block confirmations, and breaks the flow of the overall computation.
Theseus therefore supports model pipelining: the AIVM understands complex multi-stage model behavior and lets one tensor operation feed its output directly into the next within a single chained operation (op-graph chaining):
This supports multi-model workflows (encoder-decoder, RAG, etc.) efficiently.
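Op-graph chaining can be pictured as a fold over stage functions executed inside one transaction. The stages below are trivial stand-ins, not real models, and the function names are invented for illustration:

```python
# Toy sketch of op-graph chaining: several pipeline stages run as one chained
# operation, each stage consuming the previous stage's output. The stage
# functions here are hypothetical stand-ins; the real AIVM dispatches
# tensor opcodes rather than Python callables.
from typing import Any, Callable

def chain(stages: list[Callable[[Any], Any]], x: Any) -> Any:
    """Run stages in sequence inside a single (conceptual) transaction."""
    for stage in stages:
        x = stage(x)
    return x

# e.g. a retrieval-augmented pipeline: encode -> retrieve -> generate
encode   = lambda q: [ord(c) % 7 for c in q]          # stand-in encoder
retrieve = lambda v: v + [sum(v)]                     # stand-in retriever
generate = lambda v: "tok-" + "-".join(map(str, v))   # stand-in decoder

print(chain([encode, retrieve, generate], "hi"))      # tok-6-0-6
```

Running the whole graph as one operation means intermediate tensors never leave the sandboxed execution context, so there are no per-stage confirmations or serialization round-trips.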
The EVM simply isn't built for on-chain AI. It offers no tensor-aware opcodes and no native inference proofs like AIVM's Tensor Commits, so hardware-specific rounding quirks can slip through unchecked.
| Feature | EVM | AIVM |
|---|---|---|
| Tensor operations | No native support | Built-in opcodes |
| Inference proofs | Not supported | Tensor Commits |
| Agent autonomy | Requires human keys | Native sovereignty |
| Gas model | Generic opcodes | FLOPs-based for AI |