Tensor Commits Protocol

The Tensor Commitment Protocol is the security foundation of Theseus, providing public verifiability and tamper-evident computation at less than 1% overhead.

Overview

Tensor-commit protocols enable verifiable ML: proving that a machine learning model was executed correctly. Traditional verification methods, such as recomputing the entire model, are prohibitively expensive, especially for large language models.

Theseus' Tensor Commits provide batch verification and reduce opening costs while keeping computational overhead small. This is achieved through a novel application of KZG commitment schemes, extended to multi-dimensional tensor structures.

Key Achievements

<1% Proof Generation Overhead

Minimal impact on inference performance. Generating proofs adds less than 1% to the total computation time, making it practical for production workloads.

<0.1% Verification Time

Fast cryptographic verification. Verifiers can check proofs in milliseconds, enabling thousands of validators to audit simultaneously without bottlenecks.

Efficient & Scalable

  • Compact proofs with O(log n) verification complexity
  • Combined with Terkle trees, a frontier foundational model has a Tensor Commitment proof size under 1 MB
  • Inference verification scales to over a thousand simultaneous verifiers
  • Sublinear scaling with model size

Terkle Trees

A Terkle tree (tensor Merkle tree) is a Merkle tree whose leaves are sub-tensors and whose internal nodes carry tensor commitments instead of hash values. Tensor Commits use Terkle trees to compress proofs and to provide membership proofs for large models (with millions of parameters).

Structure

Theseus uses Terkle trees by partitioning the full tensor into blocks. For a weight tensor with dimensions d₁ × d₂ × ... × dₖ:

  • Each dimension j is split into mⱼ = ⌈dⱼ / bⱼ⌉ blocks, where bⱼ is the block size along that dimension
  • The total number of leaf nodes (blocks) is M = m₁ × m₂ × ... × mₖ
  • Each leaf cℓ is a tensor commitment to a sub-tensor Tℓ
  • Parent nodes are computed by committing to the concatenation of their children's tensors
  • The root cᵣₒₒₜ is the global fingerprint of the model
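The partitioning and tree construction above can be sketched in a few lines. Here a SHA-256 hash stands in for the real tensor commitment (an actual deployment would use a homomorphic scheme such as KZG), and the tensor is 1-D for brevity:

```python
import hashlib

def commit(data: bytes) -> bytes:
    # Stand-in for a real tensor commitment (e.g. KZG); a hash is NOT
    # homomorphic, it only illustrates the tree layout.
    return hashlib.sha256(data).digest()

def partition(tensor, block_size):
    # Split a flat list of weights into fixed-size blocks (sub-tensors T_l).
    return [tensor[i:i + block_size] for i in range(0, len(tensor), block_size)]

def terkle_root(tensor, block_size):
    # Leaves: one commitment c_l per sub-tensor block.
    level = [commit(repr(b).encode()) for b in partition(tensor, block_size)]
    # Internal nodes: commit to the concatenation of the two children.
    while len(level) > 1:
        if len(level) % 2:                    # duplicate last node if odd
            level.append(level[-1])
        level = [commit(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]                           # c_root: global model fingerprint

weights = list(range(16))                     # toy "weight tensor"
root = terkle_root(weights, block_size=4)
```

Changing any single weight changes a leaf commitment and therefore the root, which is why cᵣₒₒₜ can serve as the model's on-chain fingerprint.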

Benefits

  • Batch verification: Verify multiple tensor operations in one proof
  • Selective opening: Open specific sub-tensors without revealing the entire model
  • Efficient membership proofs: Prove a weight exists in the model with logarithmic proof size
  • Hierarchical structure: Natural fit for neural network layer organization
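A membership proof in this setting is a Merkle authentication path over the leaf commitments, so its size grows with the logarithm of the leaf count. A minimal hash-based sketch (helper names are hypothetical, not the Theseus wire format):

```python
import hashlib

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def build_levels(leaves):
    # Full tree over leaf commitments; returns every level, leaves first.
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]
        levels.append([H(cur[i], cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels

def membership_proof(levels, index):
    # Sibling commitments along the path to the root: O(log n) of them.
    path = []
    for level in levels[:-1]:
        if len(level) % 2:
            level = level + [level[-1]]
        path.append((level[index ^ 1], index % 2))  # (sibling, is_right_child)
        index //= 2
    return path

def verify_membership(root, leaf, path):
    node = leaf
    for sibling, is_right in path:
        node = H(sibling, node) if is_right else H(node, sibling)
    return node == root
```

For a model partitioned into M blocks, the path holds ⌈log₂ M⌉ siblings, matching the logarithmic proof size claimed above.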

How Verification Works

1. Model Registration

When a model is registered, the prover uploads the model weights along with their Tensor Commit. This commitment is stored on-chain as the canonical fingerprint of the model.

2. Inference Execution

During inference, the prover (a specialized node with high-end hardware) runs the full forward pass and emits a Tensor Commit proof. The proof includes:

  • Opening (y, π) showing the computation result
  • Commitment to the input embeddings
  • Commitment to each layer's output
  • Merkle path through the Terkle tree
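The proof bundle can be modeled as a simple record. All field names here are illustrative assumptions, not the actual Theseus proof format:

```python
from dataclasses import dataclass

@dataclass
class TensorCommitProof:
    # Hypothetical layout mirroring the components listed above.
    opening: tuple            # (y, pi): claimed result and opening proof
    input_commitment: bytes   # commitment to the input embeddings
    layer_commitments: list   # one commitment per transformer layer
    merkle_path: list         # sibling commitments along the Terkle path

    def size_bytes(self) -> int:
        # Rough size estimate: the openings and commitments dominate.
        _y, pi = self.opening
        return (len(pi) + len(self.input_commitment)
                + sum(len(c) for c in self.layer_commitments)
                + sum(len(s) for s in self.merkle_path))
```

With 48-byte commitments per layer and a logarithmic Merkle path, total proof size stays in the hundreds of kilobytes even for deep models, consistent with the figures below.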

3. Verification

Every verifier in the active set verifies every single inference. The process:

  • Proof size: On the order of hundreds of kB or less
  • Check time: ≈ 2 ms on a modern CPU core
  • Network overhead: Each proof is gossiped once; 1,000 validators can confirm 100 simultaneous prover jobs in well under one second thanks to ordinary parallelism
  • Consensus: Verifiers run standard BFT finality atop the proof layer (2/3 verifier agreement needed)
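The 2/3 agreement rule can be sketched as a simple tally. The `finalize` helper and the vote representation are hypothetical, and real BFT finality also involves rounds and signatures:

```python
from collections import Counter
from fractions import Fraction

def finalize(votes):
    """Return the proof digest backed by >= 2/3 of verifiers, if any.

    votes maps verifier id -> the digest that verifier accepted.
    Exact rational arithmetic avoids float edge cases at the threshold.
    """
    if not votes:
        return None
    digest, count = Counter(votes.values()).most_common(1)[0]
    if Fraction(count, len(votes)) >= Fraction(2, 3):
        return digest
    return None
```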

Performance Comparison

Operation          Latency   Proof Size   Gas Cost
TMATMUL 512x512    4.1 ms    230 KB       18K
TSTREAM 4x512      8.6 ms    400 KB       27K
TCOMMIT 70B        22 ms     470 KB       120K

* Gas costs based on base-load multiplier m = 1.0. Actual costs scale with network congestion.
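Under the stated base-load convention, costs at other congestion levels can be extrapolated. The linear scaling and the operation keys below are assumptions for illustration:

```python
# Base gas costs at multiplier m = 1.0, from the table above.
BASE_GAS = {"TMATMUL_512x512": 18_000,
            "TSTREAM_4x512": 27_000,
            "TCOMMIT_70B": 120_000}

def gas_cost(op: str, multiplier: float = 1.0) -> int:
    # Assumes gas scales linearly with the congestion multiplier m.
    return round(BASE_GAS[op] * multiplier)
```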

LLM-Specific Optimizations

Token and Positional Embeddings

Token embeddings are committed polynomially together with their positional encodings; the commitments' homomorphic properties allow efficient verification without revealing the input content.

Layer Normalization

Using polynomial commitments, we verify mean and variance computations. Inverse square root approximation is efficiently handled via polynomial approximation.
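One way to see why the inverse square root is amenable to polynomial verification: Newton's iteration for y = 1/√x uses only additions and multiplications, so each step is a low-degree polynomial identity that a polynomial commitment can check. A sketch:

```python
def inv_sqrt_newton(x: float, iters: int = 25) -> float:
    # Newton's method for y = 1/sqrt(x):  y <- y * (3 - x*y*y) / 2.
    # Every step uses only + and *, so the whole trace is polynomial
    # in x and the iterate -- the form a commitment scheme can verify.
    y = 1.0 if x < 1.0 else 1.0 / x   # crude initial guess inside the basin
    for _ in range(iters):
        y = y * (3.0 - x * y * y) / 2.0
    return y
```

A verifier only needs to check that each iterate satisfies the polynomial relation, never recomputing a square root itself.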

Multi-Head Attention

Attention computations (query Q, key K, and value V matrices) are committed individually. Attention scores and softmax weights are polynomially approximated for efficient verification.

Residual Connections

Residual paths are handled directly via commitment homomorphism: the commitment to the residual sum x₍ℓ₎ + A(x₍ℓ₎) is the product of the component commitments, Comᵣ = Comₓ₍ℓ₎ · ComA. Each subsequent layer leverages the prior layer's commitments, enabling efficient recursive proof verification.
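The homomorphism used for residuals can be demonstrated with a toy Pedersen commitment, where multiplying commitments yields a commitment to the sum of the messages. The parameters below are illustrative and insecure; Theseus' actual scheme is KZG-based:

```python
# Toy Pedersen commitment over Z_p^* -- small, insecure parameters, for
# illustration only (a real system uses an elliptic-curve/pairing group).
P = 2**61 - 1          # a Mersenne prime
G, Hh = 3, 7           # two bases whose discrete-log relation is "unknown"

def com(m: int, r: int) -> int:
    # Com(m; r) = g^m * h^r mod p  (additively homomorphic in m and r)
    return pow(G, m, P) * pow(Hh, r, P) % P

# Residual connection x + A(x): the product of the two commitments equals
# the commitment to the sum, so the verifier never sees raw activations.
x, ax = 12345, 67890   # stand-ins for x(l) and A(x(l))
rx, ra = 111, 222      # blinding randomness
com_res = com(x, rx) * com(ax, ra) % P
assert com_res == com(x + ax, rx + ra)
```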

Mixture-of-Experts

Sparse expert activations are committed and verified efficiently using sparse tensor commitments. Only activated experts contribute proofs, significantly reducing verification complexity.
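A sketch of the sparse-commitment idea: only the router's top-k experts are committed, so the proof count tracks k rather than the total expert count. Helper names are hypothetical, and a hash stands in for the tensor commitment:

```python
import hashlib

def commit(data) -> bytes:
    # Hash stand-in for a tensor commitment, for illustration only.
    return hashlib.sha256(repr(data).encode()).digest()

def moe_proof(expert_outputs: dict, router_topk: list) -> dict:
    # Commit only to the experts the router actually activated:
    # proof count is k, independent of the total number of experts.
    return {e: commit(expert_outputs[e]) for e in router_topk}

outputs = {e: [e * 0.1] * 4 for e in range(64)}   # 64 experts, toy outputs
proof = moe_proof(outputs, router_topk=[3, 17])   # k = 2 activated
```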

Why This Matters

Theseus' tensor commitments uniquely support scalable, transparent, and cryptographically sound verification for LLM inference. This novel commitment mechanism positions Theseus as the ideal blockchain solution for deploying trustworthy, decentralized, and verifiable large language models.

  • No recomputation required: Verifiers do not need to re-run the entire model
  • Hardware independence: Proofs are valid regardless of the hardware used
  • Privacy preserving: Model weights remain private while computation is verifiable
  • Scalable verification: Thousands of validators can verify simultaneously