zkVM exploration

SP1 zkVM

Overview

SP1 uses a FRI-based STARK proving system with recursive aggregation and optional STARK-to-SNARK wrapping. It supports on-chain verification on Ethereum and Solana.

SP1 can execute and prove programs written in Rust, C, C++, or any other language that compiles to RISC-V.

Proving System

  • Base proving system: FRI-based STARK

  • Field: Baby Bear field

    • Small field with prime

      p = 15 \cdot 2^{27} + 1

  • Recursion: Fully supported (recursive STARKs)

  • Acceleration:

    • Cryptographic precompiles

    • GPU acceleration
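As a quick sanity check on the field parameters listed above, the Python sketch below computes the Baby Bear prime and two properties that make it convenient for FRI-based provers: it fits in 31 bits, and p - 1 is divisible by 2^27, so the field has large power-of-two multiplicative subgroups. The helper names are illustrative and are not part of SP1.

```python
# Baby Bear prime, as given above: p = 15 * 2^27 + 1
P = 15 * 2**27 + 1


def two_adicity(n: int) -> int:
    """Return the largest k such that 2**k divides n."""
    k = 0
    while n % 2 == 0:
        n //= 2
        k += 1
    return k


def mul(a: int, b: int) -> int:
    """Multiply two field elements modulo P."""
    return (a * b) % P


def inv(a: int) -> int:
    """Invert a nonzero field element via Fermat's little theorem."""
    return pow(a, P - 2, P)


print(P)                   # 2013265921
print(P.bit_length())      # 31
print(two_adicity(P - 1))  # 27
print(mul(7, inv(7)))      # 1
```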

Proof Types in SP1

SP1 supports multiple proof types:

1. Core Proof (Default)

  • A list of STARK proofs

  • Fully transparent (no trusted setup)

  • Large proof size

  • Best suited for:

    • Off-chain verification

    • Further recursive proving and aggregation

2. Compressed Proof

  • Constant-size proof

  • Required for proof aggregation

  • Used as input for recursive verification inside SP1

Note: To verify an SP1 proof within SP1, a compressed proof must be generated first.

3. Groth16 Proof

  • ~260 bytes

  • ~270k gas on Ethereum

  • Requires a trusted setup

  • Setup details:

    • Aztec ceremony

    • Additional entropy contributions from the Succinct team

4. PLONK Proof

  • ~868 bytes

  • ~300k gas on Ethereum

  • Universal trusted setup
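The trade-offs between the four proof types can be captured in a small lookup table. The sketch below is plain Python, not the SP1 SDK: the `ProofType` fields and the `cheapest_onchain` helper are hypothetical, and the sizes and gas figures are the approximate numbers quoted above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProofType:
    name: str
    trusted_setup: bool
    size_bytes: Optional[int]  # None where the post quotes no figure
    evm_gas: Optional[int]     # None where no on-chain cost is quoted


PROOF_TYPES = [
    ProofType("core", False, None, None),
    ProofType("compressed", False, None, None),
    ProofType("groth16", True, 260, 270_000),
    ProofType("plonk", True, 868, 300_000),
]


def cheapest_onchain(types):
    """Pick the proof type with the lowest quoted Ethereum gas cost."""
    candidates = [t for t in types if t.evm_gas is not None]
    return min(candidates, key=lambda t: t.evm_gas)


print(cheapest_onchain(PROOF_TYPES).name)  # groth16
```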

Proof Aggregation & Recursion

  • SP1 natively supports recursive STARKs

  • Multiple proofs can be aggregated into a single proof

  • Aggregation occurs inside the zkVM itself

  • Aggregated proofs can then be wrapped into:

    • Groth16

    • PLONK

Local & Remote Proving

Local Proving

  • Fully supported (STARK, Groth16, and PLONK proofs)

  • Requires a Docker environment to generate Groth16 and PLONK proofs

  • Hardware requirements:

    • In practice, 32 GB of RAM is recommended for Groth16 and PLONK proofs

  • Requires downloading the Groth16 (2.82 GiB) and PLONK (1.09 GiB) circuit artifacts

Remote Proving

SP1 supports a prover network that enables:

  • Fast proving with a GPU cluster
  • Decentralized proof generation

Precompiles

SP1 includes a set of precompiles to accelerate common cryptographic operations inside the zkVM.
External contributors can add new precompiles, but documentation is currently limited: only a few examples exist, and the SP1 team says a full developer guide is forthcoming.

Benchmark (TBD)

I’m benchmarking large circuits and cryptographic primitives; results coming soon.

Let me know if I missed any key features.

Exploration of other zkVMs, including Nexus and OpenVM, is underway. Please suggest additional promising zkVMs so I can adjust priorities accordingly.


Benchmark Report: SP1 vs RISC Zero

All measurements were taken on an Apple M-series Mac with --release builds.

  • SP1: v5.2.4 (compressed STARK proofs)
  • RISC Zero: v3.0.3 (succinct STARK proofs)

TODO: benchmark Groth16 proofs and performance on GPU

Compliance Circuit

SP1: k256 precompilation is not enabled for the compliance circuit because of the hash2curve issue. I tried switching to secp256k1 (which supports precompilation but lacks hash2curve) on the use_secp256k1 branch and manually reimplemented hash2curve, but performance got worse. The proper fix is to integrate hash2curve into the k256 precompile, which should significantly reduce the cycle count.

| Metric | SP1 | RISC Zero |
| --- | --- | --- |
| Cycles | 12,932,689 | 1,092,294 user / 1,310,720 total |
| Proving | 177.74s | 86.10s |
| Verification | 62.10ms | 11.47ms |

Note: SP1’s “cycles” number is closest to RISC Zero’s “user cycles”: both count the instructions actually executed by the program.

  • RISC Zero proves the compliance circuit 2.1x faster than SP1.
  • RISC Zero verification is 5.4x faster.
  • SP1 reports significantly more cycles (12.9M vs 1.1M user), reflecting different ISA overhead and the lack of k256 precompilation on SP1 for this circuit.
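The ratios quoted above follow directly from the compliance table. A minimal Python sketch of that arithmetic (the `speedup` helper and dictionary keys are illustrative, not taken from either benchmark repo):

```python
def speedup(slower: float, faster: float) -> float:
    """How many times larger `slower` is than `faster`."""
    return slower / faster


# Compliance circuit numbers from the table above
sp1 = {"cycles": 12_932_689, "prove_s": 177.74, "verify_ms": 62.10}
risc0 = {"user_cycles": 1_092_294, "prove_s": 86.10, "verify_ms": 11.47}

print(f"proving:      {speedup(sp1['prove_s'], risc0['prove_s']):.1f}x")      # 2.1x
print(f"verification: {speedup(sp1['verify_ms'], risc0['verify_ms']):.1f}x")  # 5.4x
print(f"cycles:       {speedup(sp1['cycles'], risc0['user_cycles']):.1f}x")   # 11.8x
```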

Transfer Circuit

SP1 results shown with k256 precompilation enabled.

Persistent Resource Consumption

| Metric | SP1 | RISC Zero |
| --- | --- | --- |
| Cycles | 269,280 | 423,815 user / 524,288 total |
| Proving | 85.78s | 29.77s |
| Verification | 65.37ms | 9.99ms |

Persistent Resource Creation

| Metric | SP1 | RISC Zero |
| --- | --- | --- |
| Cycles | 148,716 | 569,929 user / 1,048,576 total |
| Proving | 83.69s | 57.05s |
| Verification | 66.00ms | 9.97ms |

  • RISC Zero proves transfer circuits 1.5-2.9x faster than SP1.
  • RISC Zero verification is 6.5-6.6x faster.
  • SP1 reports fewer cycles with k256 precompilation (269K/149K vs 424K/570K user), but the proving time remains higher due to SP1’s larger constant overhead.
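One way to see that cycle count alone does not predict proving time is to normalize the transfer results to proving time per 1,000 user cycles. The sketch below does only that; the `ms_per_kcycle` helper is illustrative, and the normalization is a rough lens rather than a cost model, since it folds each prover's fixed overhead into the per-cycle figure.

```python
# (user cycles, proving seconds) from the two transfer tables above
runs = {
    ("consumption", "SP1"): (269_280, 85.78),
    ("consumption", "RISC Zero"): (423_815, 29.77),
    ("creation", "SP1"): (148_716, 83.69),
    ("creation", "RISC Zero"): (569_929, 57.05),
}


def ms_per_kcycle(cycles: int, seconds: float) -> float:
    """Proving time in milliseconds per 1,000 user cycles."""
    return seconds * 1000 / (cycles / 1000)


for (circuit, vm), (cycles, seconds) in runs.items():
    print(f"{circuit:12s} {vm:10s} {ms_per_kcycle(cycles, seconds):7.1f} ms per 1k cycles")
```

On these numbers SP1 spends several times more proving time per user cycle than RISC Zero on both circuits, which is consistent with a fixed overhead dominating at this workload size.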

SHA-256

| Hashes | SP1 Cycles | RISC Zero Cycles (user / total) | SP1 Proving | RISC Zero Proving | SP1 Verify | RISC Zero Verify |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | 9,416 | 12,007 / 65,536 | 30.56s | 13.26s | 62.14ms | 18.55ms |
| 100 | 53,696 | 57,727 / 131,072 | 31.21s | 18.51s | 65.61ms | 18.44ms |
| 1,000 | 496,496 | 514,927 / 1,048,576 | 42.63s | 86.11s | 63.44ms | 18.41ms |
| 10,000 | 4,924,496 | 5,086,927 / 6,291,456 | 251.80s | 540.92s | 61.57ms | 18.53ms |

  • User cycle counts are comparable across both VMs, confirming similar computational work.
  • RISC Zero total cycles are padded to power-of-2 segment boundaries; SP1 does not pad.
  • At low hash counts (10-100): RISC Zero proves 1.7-2.3x faster due to lower base overhead.
  • At high hash counts (1,000-10,000): SP1 proves 2.0-2.1x faster, scaling more efficiently with workload size.
  • Crossover point: Between 100 and 1,000 hashes, SP1’s proving time becomes faster than RISC Zero’s.
  • RISC Zero verification is consistently 3.3-3.6x faster regardless of workload size.
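The crossover point can be narrowed down with a simple interpolation. The sketch below assumes proving time grows linearly between the 100-hash and 1,000-hash measurements for both VMs, which is an assumption rather than something the benchmark establishes, and solves for the hash count where the two lines meet. The `crossover` helper is illustrative.

```python
def crossover(x0, x1, sp1_times, risc0_times):
    """Linearly interpolate both curves between x0 and x1 and
    return the x at which they intersect."""
    d0 = sp1_times[0] - risc0_times[0]  # SP1 minus RISC Zero at x0
    d1 = sp1_times[1] - risc0_times[1]  # SP1 minus RISC Zero at x1
    if d0 * d1 > 0:
        raise ValueError("no crossover in this interval")
    return x0 + (x1 - x0) * d0 / (d0 - d1)


# Proving times (seconds) at 100 and 1,000 hashes from the table above
n = crossover(100, 1_000, (31.21, 42.63), (18.51, 86.11))
print(f"estimated crossover: ~{n:.0f} hashes")
```

Under that assumption the estimate lands at roughly 300 hashes. With only two data points per VM in the interval, this should be treated as a rough guide; a measurement at a few hundred hashes would confirm it.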

ECDSA Signature Verification (k256)

| Signatures | SP1 Cycles | RISC Zero Cycles (user / total) | SP1 Proving | RISC Zero Proving | SP1 Verify | RISC Zero Verify |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 237,785 | 340,472 / 524,288 | 79.30s | 45.13s | 65.61ms | 19.20ms |
| 10 | 2,256,413 | 3,067,240 / 3,407,872 | 138.93s | 311.68s | 66.44ms | 19.39ms |

  • SP1 uses fewer cycles per verification (238K vs 340K for 1 signature), likely due to a more effective k256 precompile, although RISC Zero also enables k256 acceleration.
  • At 1 verification: RISC Zero proves 1.8x faster (lower base overhead).
  • At 10 verifications: SP1 proves 2.2x faster, demonstrating much better scaling for repeated ECDSA operations.
  • RISC Zero verification is 3.4x faster in both cases.

Summary

| Dimension | SP1 | RISC Zero |
| --- | --- | --- |
| Proving (small workloads) | Slower (higher base cost ~30s) | Faster (base cost ~13s) |
| Proving (large workloads) | Faster (scales ~2x better) | Slower (cost grows faster) |
| Proof verification | ~62-66ms | ~10-19ms (3-6x faster) |
| Precompile support | SHA-256, k256 patches available | Built-in accelerators |
| Cycle efficiency | Comparable user cycles | Comparable, but padded to power-of-2 |
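The "padded to power-of-2" entry can be checked against the reported RISC Zero totals. The sketch below takes a few (user, total) pairs from the tables in this report and shows that every total is a multiple of a large power of two, and what fraction of each total is not user cycles. The helper names are illustrative. The non-user share may include system overhead as well as padding, so it is best read as an upper bound on pure padding.

```python
# (user cycles, total cycles) reported for RISC Zero in this report
risc0_runs = {
    "compliance": (1_092_294, 1_310_720),
    "sha256 x10": (12_007, 65_536),
    "sha256 x10000": (5_086_927, 6_291_456),
    "ecdsa x10": (3_067_240, 3_407_872),
}


def largest_pow2_divisor(n: int) -> int:
    """Largest power of two that divides n (its lowest set bit)."""
    return n & -n


def non_user_share(user: int, total: int) -> float:
    """Fraction of total cycles that is not user cycles."""
    return (total - user) / total


for name, (user, total) in risc0_runs.items():
    print(f"{name:14s} total is a multiple of {largest_pow2_divisor(total):>9,d}; "
          f"non-user share {non_user_share(user, total):.0%}")
```

The small SHA-256 run shows the effect most clearly: most of its 65,536 total cycles are not user cycles, whereas the large runs carry a much smaller share.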

Key Takeaways

  1. RISC Zero has lower base proving overhead: For small circuits, RISC Zero’s proving starts at ~13s vs SP1’s ~30s, making it faster for lightweight workloads.

  2. SP1 scales better with workload size: As computation grows, SP1’s proving time increases more slowly. This is visible in both SHA-256 (2.1x faster at 10K hashes) and ECDSA (2.2x faster at 10 verifications).

  3. RISC Zero has significantly faster verification: Proof verification is consistently 3-6x faster on RISC Zero (~10-19ms vs ~62-66ms), which is important for on-chain verification cost.

  4. Precompilation matters: SP1’s k256 precompilation reduces transfer circuit cycles by 17-39x (from 4.7M/5.9M to 269K/149K), though proving time improvement is modest (~3-18s savings). This suggests proving cost is dominated by factors beyond raw cycle count.

Code Links

SP1 benchmark repo: XuyangSong/arm-sp1 (GitHub)

RISC Zero benchmark (compliance, SHA-256, and ECDSA): anoma/arm-risc0, branch xuyang/circuit_bench (GitHub)

RISC Zero benchmark (transfer circuit): https://github.com/anoma/anomapay-backend/tree/xuyang/circuit_bench/simple_transfer/transfer_circuit
