AI Virtual Machine (AIVM)

The AIVM is the execution backbone of Theseus, engineered to support sovereign AI with tensor-native opcodes, deterministic execution, and cryptographic proof generation.

Architecture Overview

The AIVM comprises a set of tightly integrated, modular subsystems designed specifically for AI workloads:

Execution Layer

Evaluates tensor opcodes deterministically using a stack-based dispatch and fixed-point arithmetic. This ensures that every node running the same computation produces identical results.
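The determinism claim above rests on fixed-point arithmetic. A minimal sketch, assuming a Q16.16 encoding (16 integer bits, 16 fractional bits) — the actual AIVM format is not specified in this document:

```python
# Deterministic fixed-point math sketch, assuming Q16.16 (an illustrative
# choice, not the documented AIVM format).
SHIFT = 16
SCALE = 1 << SHIFT

def to_fixed(x: float) -> int:
    """Quantize a float to Q16.16; the rounding rule must be fixed by spec."""
    return int(round(x * SCALE))

def fx_mul(a: int, b: int) -> int:
    """Multiply two Q16.16 values; pure integer ops are bit-identical on every node."""
    return (a * b) >> SHIFT

def fx_add(a: int, b: int) -> int:
    return a + b

a = to_fixed(1.5)
b = to_fixed(2.25)
assert fx_mul(a, b) == to_fixed(3.375)  # 1.5 * 2.25 is exact in Q16.16
```

Because every intermediate value is an integer, no node's FPU rounding mode can change the result.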

Memory Layer

Manages sandboxed memory for quick information retrieval. Agent contexts, embeddings, and temporary tensors are stored here during execution.

Proof Interface

Generates a succinct Tensor Commit receipt so any validator can check the result in milliseconds. This is the bridge between execution and verification.
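The receipt interface can be illustrated with a toy hash commitment. This is only the *interface* — the real Tensor-Commit scheme is a succinct proof, and the serialization below is an assumption:

```python
import hashlib
import struct

def tensor_commit(values, shape):
    """Toy commitment: hash the shape plus little-endian int64 values.
    Illustrates the deterministic receipt interface only; the actual
    Tensor-Commit construction is a succinct proof, not a bare hash."""
    h = hashlib.sha256()
    h.update(struct.pack("<I", len(shape)))
    for dim in shape:
        h.update(struct.pack("<Q", dim))
    for v in values:
        h.update(struct.pack("<q", v))
    return h.hexdigest()

receipt = tensor_commit([1, 2, 3, 4], (2, 2))
# Any validator recomputing over identical inputs derives the same receipt.
assert receipt == tensor_commit([1, 2, 3, 4], (2, 2))
```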

Syscall Gateway

Provides secure boundary crossing for external calls (e.g., agent messaging, external state anchoring). All cross-boundary operations are verified and metered.

State Anchoring Layer

Emits a Merkle root each block, giving auditors a single hash to verify full execution traces. This enables light clients and cross-chain verification.
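A minimal sketch of how a per-block Merkle root can anchor many execution traces in one hash — the pairing convention (duplicating the last node on odd levels) is one common choice and may differ on-chain:

```python
import hashlib

def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Hash leaves, then pair-and-hash upward; duplicates the last node
    on odd-sized levels (an assumed convention)."""
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

traces = [b"trace-0", b"trace-1", b"trace-2"]
root = merkle_root(traces)  # a single 32-byte hash anchoring all traces
```

A light client holding only `root` can verify any single trace with a logarithmic-size inclusion path.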

Key Features

Tensor-Native Opcodes

Unlike traditional VMs, AIVM includes specialized operations for AI inference:

TMATMUL - Matrix multiplication for tensors
TEWOP - Element-wise operations (ReLU, GELU, etc.)
TCUSTOM - Call registered custom kernels
TLOAD/TSTORE - Tensor memory operations
TCOMMIT - Generate Tensor-Commit proof
TSTREAM - Streaming inference operations
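A hedged sketch of the stack-based dispatch over a subset of these opcodes (TLOAD, TMATMUL, TEWOP) — the names mirror the list above, but the operand conventions are assumptions:

```python
# Toy stack-based dispatcher for a subset of the AIVM opcodes.
# Tensors are lists of rows of ints; semantics are illustrative.
def tmatmul(a, b):
    """Naive integer matrix multiply: a is n x k, b is k x m."""
    k, m = len(b), len(b[0])
    return [[sum(row[i] * b[i][j] for i in range(k)) for j in range(m)]
            for row in a]

def tewop_relu(a):
    return [[max(0, x) for x in row] for row in a]

def run(program, memory):
    stack = []
    for op, *args in program:
        if op == "TLOAD":
            stack.append(memory[args[0]])
        elif op == "TMATMUL":
            b, a = stack.pop(), stack.pop()
            stack.append(tmatmul(a, b))
        elif op == "TEWOP":  # args[0] would select the element-wise kernel
            stack.append(tewop_relu(stack.pop()))
    return stack.pop()

mem = {"w": [[1, -1], [0, 2]], "x": [[3, 4]]}
out = run([("TLOAD", "x"), ("TLOAD", "w"), ("TMATMUL",), ("TEWOP", "relu")], mem)
# x @ w = [[3*1 + 4*0, 3*(-1) + 4*2]] = [[3, 5]]; ReLU leaves it unchanged
```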

Deterministic Execution

All randomness is derived from a VRF (Verifiable Random Function), and all arithmetic is fixed-point, so any full node can reproduce execution receipts bit-for-bit and the validator set can reach consensus on results by simple re-execution. Tensor-Commit verification is likewise fully deterministic.
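Why VRF-sourced randomness preserves reproducibility can be shown with a small sketch. A real VRF also carries a proof that validators check against the prover's public key; here we only expand an already-agreed VRF output into per-operation randomness (the seed value is a placeholder):

```python
import hashlib

def derive_randomness(vrf_output: bytes, op_index: int) -> int:
    """Expand a block's agreed VRF output into per-op randomness.
    Every node computes the same hash, so 'random' values are
    identical across the network."""
    h = hashlib.sha256(vrf_output + op_index.to_bytes(8, "little")).digest()
    return int.from_bytes(h[:8], "little")

seed = b"vrf-output-for-some-block"  # placeholder for the real VRF output
assert derive_randomness(seed, 0) == derive_randomness(seed, 0)
```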

Gas Model

Every tensor operation carries a linear gas price based on FLOPs (floating-point operations):

Gas_op = γ × FLOPs(op)
MODEL_FEE = Σ_op Gas_op + Proof_Overhead

A congestion multiplier, broadcast once per block, keeps prices elastic under load. This allows accurate gas metering while fairly tracking real hardware cost and the stateful nature of inference.
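The fee rule above can be sketched directly; the value of γ and the congestion multiplier below are illustrative assumptions, not protocol constants:

```python
# Sketch of the FLOPs-based gas rule: Gas_op = γ × FLOPs(op), summed per model
# call plus a proof overhead. GAMMA is an assumed value for illustration.
GAMMA = 2e-4  # gas per FLOP (assumption)

def flops_matmul(n: int, k: int, m: int) -> int:
    """A matmul costs one multiply-add (2 FLOPs) per output element per k."""
    return 2 * n * k * m

def op_gas(flops: int, congestion: float = 1.0) -> int:
    return int(GAMMA * flops * congestion)

def model_fee(op_flops: list[int], proof_overhead: int,
              congestion: float = 1.0) -> int:
    return sum(op_gas(f, congestion) for f in op_flops) + proof_overhead

fee = model_fee([flops_matmul(1, 768, 768)], proof_overhead=5_000,
                congestion=1.25)
```

The per-block congestion factor scales every operation uniformly, so relative costs between models stay fixed while absolute prices track load.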

Performance Metrics

Model       Tokens/sec (A100)   Est. Gas/Token
GPT-2       180-200             50K
LLaMA-7B    90-100              150K
LLaMA-13B   50-60               400K
GPT-3.5     15-25               800K-1M
LLaMA-65B   5-10                ≥900K

These benchmarks align with contemporary LLM workloads. For instance, a single forward pass of GPT-3 (175B) over a 1024-token prompt takes approximately 40-60 ms on A100 hardware, at a compute cost on the order of hundreds of billions of FLOPs per generated token (roughly two FLOPs per parameter).
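As a back-of-envelope consistency check, the table's GPT-2 row can be tied to the gas rule Gas = γ × FLOPs; the parameter count and the implied γ below are illustrative assumptions:

```python
# Back out an implied gas-per-FLOP rate from the GPT-2 row of the table.
gpt2_params = 124_000_000           # GPT-2 small parameter count (assumption)
flops_per_token = 2 * gpt2_params   # ~2 FLOPs per parameter per token
gas_per_token = 50_000              # from the table above
gamma = gas_per_token / flops_per_token
print(f"implied gamma = {gamma:.2e} gas/FLOP")  # prints 2.02e-04
```

Larger models' rows then follow from their larger FLOP counts per token, modulo proof overhead.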

Agent Scheduling

Theseus must juggle thousands of simultaneous model calls—from millisecond-critical trading bots to overnight analytics—without favoring whales or letting gas fees spike at random.

How It Works

  • Every agent gets a priority score based on stake, recent latency, and fairness
  • Calls to AGENT_TICK() or MODEL_INFER() land in epoch-bound queues, so nothing starves
  • The AIVM ships with a minimal, on-chain scheduler that respects latency classes
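The priority score from the first bullet can be sketched as a weighted sum; the weights and normalization below are assumptions, not the chain's actual rule:

```python
# Sketch of an agent priority score over stake, recent latency, and fairness.
# All weights and scaling constants are illustrative assumptions.
def priority(stake: float, recent_latency_ms: float, epochs_waited: int,
             w_stake: float = 0.5, w_latency: float = 0.3,
             w_fair: float = 0.2) -> float:
    latency_term = 1.0 / (1.0 + recent_latency_ms / 100.0)  # faster = higher
    fairness_term = min(epochs_waited / 10.0, 1.0)          # boost the starved
    return w_stake * stake + w_latency * latency_term + w_fair * fairness_term

a = priority(stake=0.9, recent_latency_ms=20, epochs_waited=0)   # fresh whale
b = priority(stake=0.4, recent_latency_ms=20, epochs_waited=10)  # long-waiting
```

The fairness term is what prevents starvation: an agent's score rises the longer it waits, narrowing the gap to higher-staked agents.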

Latency Classes

RT (Real-Time):   deadline ≤ 1 epoch
Interactive:      deadline ≤ 3 epochs
Bulk:             unbounded (best-effort)

Model Pipelining

Many real-world AI inference tasks, such as Mixture-of-Experts (MoE), do not involve a single forward pass through one model. Treating each step as a separate on-chain transaction costs extra gas and extra block confirmations, and breaks the flow of intermediate state between steps.

Theseus enables model pipelining so that the AIVM understands complex model behavior: each tensor operation can feed its output directly into the next within a single chained execution (op-graph chaining):

TLOAD(encoder) -> TMATMUL -> TCUSTOM -> TLOAD(decoder) -> TMATMUL -> TCOMMIT

This supports multi-model workflows (encoder-decoder, RAG, etc.) efficiently.
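The chaining pattern above can be sketched as function composition, where one receipt is produced at the end rather than per stage. The stage functions below are toy stand-ins for TLOAD/TMATMUL/TCUSTOM/TCOMMIT, not real opcode semantics:

```python
import hashlib

def chain(stages, x):
    """Feed each stage's output into the next: one execution, one proof."""
    for stage in stages:
        x = stage(x)
    return x

encoder = lambda tokens: [t * 2 for t in tokens]   # stand-in for encoder matmuls
adapter = lambda hidden: [h + 1 for h in hidden]   # stand-in for a TCUSTOM kernel
decoder = lambda hidden: sum(hidden)               # stand-in for decoder matmuls
commit = lambda out: hashlib.sha256(str(out).encode()).hexdigest()  # TCOMMIT

receipt = chain([encoder, adapter, decoder, commit], [1, 2, 3])
# [1,2,3] -> [2,4,6] -> [3,5,7] -> 15 -> single commitment over the final output
```

Intermediate tensors never leave the VM, so no per-stage transactions, confirmations, or proofs are needed.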

Comparison to EVM

The EVM simply isn't built for on-chain AI. It offers no tensor-aware opcodes and no native inference proofs like AIVM's Tensor Commits, so hardware-specific rounding quirks can slip through unchecked.

Feature             EVM                   AIVM
Tensor operations   No native support     Built-in opcodes
Inference proofs    Not supported         Tensor Commits
Agent autonomy      Requires human keys   Native sovereignty
Gas model           Generic opcodes       FLOPs-based for AI