Resource machine optimization explorations

Currently, proofs for the RISC0 RM take several minutes to create (on fast hardware), and sometimes tens of minutes or more on slower hardware. Considering specifically the compliance proof (for which we have a dedicated profile file thanks to @xuyang):

  • ~ 2/3 of proving time is spent on proving elliptic curve operations
  • ~ 1/6 of proving time is spent on proving hash computations (about 50 hashes)
  • ~ 1/6 of proving time is spent on encoding/decoding

For latency-sensitive applications (even simple ones, e.g. Anoma Pay / shielded transfers), proving times in the minutes are not tenable from a UX perspective. For more complex applications, proving times could run into the hours, which is probably also unworkable, so I think we should devote some attention to how the situation might be improved. Unfortunately, for a zkVM stack that we did not write and do not deeply understand (e.g. RISC0), our ability to optimize may be somewhat limited.

At a high level, if we want substantially better proving times in the short term, I think there are directionally two options:

  1. Attempt to optimize the resource machine RISC0 implementation.
  2. Explore alternatives to zkVMs (which allow easy programmability but clearly still come with a lot of overhead).

Optimizing the RISC0 RM

For substantially improved proving times, I imagine that we would need to do both of the following: either highly optimize or remove the elliptic curve operations, and either reduce the count or increase the proving speed of hash functions.
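To make the "both" part concrete, the rough profile shares quoted earlier (2/3 elliptic curve, 1/6 hashing, 1/6 encoding/decoding) can be plugged into an Amdahl's-law style estimate. This is a minimal sketch rather than a measurement: the function name `overall_speedup` and the example speedup factors are illustrative assumptions, not numbers from the profile.

```python
from fractions import Fraction

# Approximate shares of compliance proving time, taken from the profile above.
SHARES = {
    "elliptic_curve": Fraction(2, 3),
    "hashing": Fraction(1, 6),
    "encoding": Fraction(1, 6),
}

def overall_speedup(component_speedups):
    """Amdahl's-law estimate: each component's share shrinks by its own factor.

    component_speedups maps a component name to a speedup factor
    (hypothetical values; use float("inf") for a component that is removed).
    Components that are not listed keep a factor of 1.
    """
    remaining = Fraction(0)
    for name, share in SHARES.items():
        factor = component_speedups.get(name, 1)
        if factor != float("inf"):
            remaining += share / Fraction(factor)
    return 1 / remaining

# Removing elliptic curve work entirely, with nothing else changed.
ec_removed = overall_speedup({"elliptic_curve": float("inf")})
# Hypothetical: 10x faster EC and 2x faster hashing together.
combined = overall_speedup({"elliptic_curve": 10, "hashing": 2})
print(f"EC removed: {float(ec_removed):.2f}x")
print(f"EC 10x + hashing 2x: {float(combined):.2f}x")
```

Under these shares, even removing the elliptic curve work completely caps the overall gain at 3x, which is why the hash and encoding costs also need attention for a larger improvement.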

Optimize elliptic curve operations

I don’t know the details here and would defer this question to @xuyang – are we using an optimized gadget/precompile for the elliptic curve operations? If not, could we write one, or is that not really possible in RISC0?

Remove elliptic curve operations

We could design a version of the RM which dispenses with the balance check entirely. Some relevant ideas were discussed in this topic a few months ago – we haven’t fully worked out what this would look like, but it’s certainly possible to explore. Likely the elliptic curve operations would be traded for more hash functions, since we’d need to use more ephemeral resources to carry the constraints which are currently carried by the delta and balance checks.

Reduce the count of hash functions required

I don’t know the details here either – maybe this topic is relevant?

Increase the proving speed of hash functions

Are we using the fastest hash function available on RISC0?

Explore alternatives to zkVMs

An interesting option could be to resurrect the old Taiga codebase, which had pretty good proving times even on old hardware. The primary constraint that we would need to figure out how to satisfy is EVM proof verification (we still want a protocol-adapter-like construction). We might be able to verify Halo2 proofs: I did a bit of searching and found this repo by Axiom, although I’m not sure of the performance characteristics there. Another option could be to use Aligned Layer, which supports more verifiers.

As an aside, have we explored OpenVM? It looks pretty interesting.

I think there are roughly two feasibility questions here:

  1. How difficult would it be to update the old Taiga codebase to the current shielded RM design and construction?
  2. Can we manage to verify the proofs on Ethereum? Would the above Solidity contracts work in a reasonably performant way or would we need to build something custom / do a layer of Groth16 recursion / etc.?

This is just a brief sketch, but I wanted to lay out these thoughts and ping the experts – @vveiln @xuyang @ArtemG – for feedback and a sense as to which directions here are most worth exploring.


Just to keep in mind: for compliance circuits, the balance check is the only place where we deal with elliptic curves. In resource logics, though, signature verification is one of the most common authorisation mechanisms. For example, in the kudos transfer with the example denomination and receive logic, each denomination function (issue, burn, transfer) is authorised with a signature. In addition, the kudos main logic also requires a signature check. So if we want to optimise the elliptic curve situation by getting rid of the EC checks, we would have to come up with a new go-to mechanism for such things. One option would be to explore the PRF approach further.

If we succeed in coming up with a mechanism to replace signatures, we would still have to deal with signatures the moment we leave our application land. For example, if the action is authorised by some external mechanism, say, Metamask, it would be signed by its key. There are no other mechanisms available, and the resource logic would have to check whatever is authorised.


We may also explore the Sigmabus technique, which was designed for exactly this. It offloads EC scalar multiplications from the compliance circuit (or any other circuit) and replaces them with hashes and EC additions. It requires the kind group element to be known, though, so Sigmabus would need to be adapted to use a hidden basepoint instead.


I created a quick benchmark test to see if OpenVM might be helpful. At least per this benchmark, SHA256 proving appears to be pretty fast: I proved 1000 hashes in < 10 seconds on my MacBook Pro. @xuyang, can you take a quick look at this? Am I missing something obvious here?

Just a reminder that the hash function that we are using must be available as a precompile on Ethereum and the other chains we want to deploy to. Otherwise, gas costs would be too high.


No, we’re using an accelerated version of the ec library provided by Risc0, which includes some general hacks from the original ec lib. It’s not a precompiled or natively implemented gadget. I asked the Risc0 team about this when I noticed that signature verification in their example code was also slow. However, they said the performance is normal and that their system currently doesn’t support precompiled or custom gadgets implemented in the native proving system. We could write some native circuits manually, but we can’t integrate them into the Risc0 compiler. Hopefully, they will add support in the future.

If we can figure out a way to remove ec from our circuits, I believe it could significantly reduce the proving time.

According to the hash benchmark in risc0, yes. I also asked the risc0 team, and they confirmed sha2 is the best.

Using a variable-sized tree could help save a few hashes in logics, but it may not be an obvious improvement, as sha2 is quite cheap.

Regarding Taiga and the Halo2 system, we can achieve the best proving time using Taiga/Halo2 or any native proving system, but at the cost of reduced programmability. There are two versions of Halo2: zcash-IPA-halo2 and PSE-KZG-halo2 (forked from zcash-halo2, and now seemingly maintained by Axiom). KZG-halo2 uses the bn254 curve, enabling Ethereum-compatible proof verification. Currently, Taiga uses zcash-halo2, but switching to KZG-halo2 is feasible and not hard. Btw, KZG-halo2 requires a setup; it’s basically a PLONK system.

It may not be that hard. We don’t need major changes to the high-level design, but some details need refinement, and we would need to switch to KZG-halo2.

Yes, using KZG-halo2 in Taiga could make it work on Ethereum. We don’t need a Groth16 layer, as KZG-halo2 proofs can be directly verified. I remember Axiom’s repo includes proof aggregation. I’m not sure whether Halo2 now supports recursion. I haven’t followed recent updates.

It looks nice, and the performance makes sense, as it supports custom instructions as described in the README. That’s the feature we expect:

  • Extensible Instruction Set: The instruction set architecture (ISA) is designed to be extended with new custom instructions that integrate directly with the virtual machine. Current extensions available for OpenVM include:

    • RISC-V support via RV32IM

    • A native field arithmetic extension for proof recursion and aggregation

    • The Keccak-256 and SHA2-256 hash functions

    • Int256 arithmetic

    • Modular arithmetic over arbitrary fields

    • Elliptic curve operations, including multi-scalar multiplication and ECDSA scalar multiplication.

    • Pairing operations on the BN254 and BLS12-381 curves.


Initial benchmarks of openvm and risc0 on a Mac M4. Groth16 proofs in risc0 were generated using Bonsai.

openvm-sha2-iter | app-proving | app-proof | stark-proving | stark-proof | snark-proving(evm) | snark-proof(evm)
10 | 1.3s | 1.9M | 14.1s | 1.5M | - | -
100 | 1.8s | 1.9M | 14.6s | 1.5M | - | -
1000 | 8.0s | 2.1M | 21.4s | 1.5M | - | -

risc0-sha2-iter | composition-proving | composition-proof | succinct-proving | succinct-proof | groth16-proving(bonsai) | groth16-proof(evm)
10 | 1.5s | 209K | 3.1s | 222K | 21.4s | 256 B
100 | 6.0s | 243K | 7.7s | 222K | 22.2s | 256 B
1000 | 24.2s | 267K | 25.9s | 222K | 24.4s | 256 B

openvm-ecc-add-iter | app-proving | app-proof | stark-proving | stark-proof | snark-proving(evm) | snark-proof(evm)
10 | 1.8s | 3.4M | 19.1s | 1.5M | - | -
100 | 1.9s | 3.4M | 19.9s | 1.5M | - | -
1000 | 2.4s | 3.4M | 20.4s | 1.5M | - | -

risc0-ecc-add-iter | composition-proving | composition-proof | succinct-proving | succinct-proof | groth16-proving(bonsai) | groth16-proof(evm)
10 | 52s | 281K | 54s | 222K | 28s | 256 B
100 | 320s | 1941K | 349s | 222K | 47s | 256 B

openvm-ecc-multiply-iter | app-proving | app-proof | stark-proving | stark-proof | snark-proving(evm) | snark-proof(evm)
10 | 2.3s | 3.4M | 19.6s | 1.5M | - | -
100 | 6.9s | 3.4M | 24.7s | 1.5M | - | -
1000 | 51.4s | 6.9M | 82.7s | 1.6M | - | -

openvm-ecc-msm-iter | app-proving | app-proof | stark-proving | stark-proof | snark-proving(evm) | snark-proof(evm)
10 | 1.9s | 3.4M | 19.0s | 1.5M | - | -
100 | 2.4s | 3.4M | 19.7s | 1.5M | - | -
1000 | 4.3s | 3.4M | 21.5s | 1.5M | - | -

openvm-signature-verification | app-proving | app-proof | stark-proving | stark-proof | snark-proving(evm) | snark-proof(evm)
1 | 2.5s | 3.4M | 26.8s | 1.5M | - | -

risc0-signature-verification | composition-proving | composition-proof | succinct-proving | succinct-proof | groth16-proving(bonsai) | groth16-proof(evm)
1 | 25.5s | 267K | 26.4s | 222K | 35.2s | 256 B
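
As a quick reading aid for the numbers above, the first-stage proving times can be turned into ratios. This is only a sketch: pairing openvm "app-proving" with risc0 "composition-proving" is an assumption about which stages are comparable, and the selection of three workloads is arbitrary.

```python
# First-stage proving times in seconds, copied from the benchmark tables above.
# Assumption: openvm "app-proving" and risc0 "composition-proving" are the
# comparable first stage of each pipeline.
FIRST_STAGE_SECONDS = {
    "sha2 x1000": {"openvm": 8.0, "risc0": 24.2},
    "ecc-add x100": {"openvm": 1.9, "risc0": 320.0},
    "signature verification x1": {"openvm": 2.5, "risc0": 25.5},
}

def speedup(times):
    """Return how many times faster openvm is than risc0 for one workload."""
    return times["risc0"] / times["openvm"]

ratios = {name: speedup(times) for name, times in FIRST_STAGE_SECONDS.items()}
for name, ratio in ratios.items():
    print(f"{name}: openvm is about {ratio:.1f}x faster")
```

The gap is much larger for elliptic curve work than for hashing, which matches the earlier observation that risc0 has no native gadget for EC operations.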

A mocked compliance circuit in openvm: GitHub - XuyangSong/compliance-performance-evaluation-openvm

The basic operations are roughly as follows. I couldn’t find the hash_to_curve implementation in their library. Instead, I counted the field operations, which may not be very accurate.

commitments: two hashes on 256 bytes
nf: one hash on 128 bytes
path check: 32 hashes on 64 bytes

ec operations: 3 additions + 3 multiplications
hash_to_curve: 2 hashes + ~45 additions + ~80 multiplications + 15 squares + 1 pow + 6 inverts

  • hash_to_field - (1 hash + init 3 fields + 1 addition + 1 multiplication on fields)
  • map_to_curve * 2
    • osswu: 6 squares + 15 multiplications + 4 additions + 1 pow + 1 invert + 3 normalizes
    • isogeny: 1 square + 2 inverts + 4 multiplications
      • 4 compute_iso( 4 additions + 4 multiplications ) = 16 additions + 16 multiplications
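
The breakdown above can be tallied mechanically. The sketch below is only one reading of it: it assumes that one isogeny evaluation includes the 4 compute_iso calls, and that hash_to_curve runs hash_to_field, osswu and isogeny twice each. Field initialisation and the "normalize" steps are left out, and all names are illustrative.

```python
from collections import Counter

# Per-component field-operation counts, copied from the breakdown above.
HASH_TO_FIELD = Counter(hash=1, add=1, mul=1)
OSSWU = Counter(square=6, mul=15, add=4, pow=1, invert=1)
COMPUTE_ISO = Counter(add=4, mul=4)
ISOGENY_BASE = Counter(square=1, invert=2, mul=4)

def scale(counts, times):
    """Multiply every count in a Counter by an integer factor."""
    return Counter({op: n * times for op, n in counts.items()})

# Assumption: one isogeny evaluation includes 4 compute_iso calls, and
# hash_to_curve runs hash_to_field + osswu + isogeny twice.
isogeny = ISOGENY_BASE + scale(COMPUTE_ISO, 4)
hash_to_curve = scale(HASH_TO_FIELD + OSSWU + isogeny, 2)

for op, n in sorted(hash_to_curve.items()):
    print(f"{op}: {n}")
```

Under these assumptions the tally gives 42 additions and 72 multiplications, in the same range as the ~45 and ~80 quoted above. Exact agreement is not expected, since the original counts are themselves approximate and the grouping used here is a guess.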

The compliance proving performance:

openvm-compliance-circuit app-proving app-proof stark-proving stark-proof snark-proving(evm) snark-proof(evm)
1 2.1s 3.7M 19.7s 1.5M

Note: The stark proof inherently includes the app proof. We currently lack remote and EVM proof benchmarks.

For a transfer transaction in the Kudos app, we require 3 compliance proofs and 6 logic proofs, i.e. 9 proofs in total. We can estimate transaction creation time as proof_num * compliance_proof_time. Generating a transfer transaction with app proofs takes about 9 × 3 = 27 seconds (taking roughly 3 seconds per app proof), while generating Stark proofs takes about 9 × 20 = 180 seconds. The latency must also include time for remote EVM proof generation.
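
The same proof_num * compliance_proof_time estimate can be written out using the measured per-proof times from the compliance table instead of rounded values. This is a minimal sketch that assumes every proof (compliance or logic) costs the same as one compliance proof and that proofs are generated sequentially; the constant and function names are illustrative.

```python
# Per-proof times in seconds for the mocked compliance circuit (table above).
APP_PROOF_SECONDS = 2.1
STARK_PROOF_SECONDS = 19.7

# Proof counts for a Kudos transfer, as stated in the post.
COMPLIANCE_PROOFS = 3
LOGIC_PROOFS = 6

def estimate_seconds(per_proof_seconds, proof_count):
    """Sequential estimate: every proof is assumed to cost the same."""
    return per_proof_seconds * proof_count

proof_count = COMPLIANCE_PROOFS + LOGIC_PROOFS
app_total = estimate_seconds(APP_PROOF_SECONDS, proof_count)
stark_total = estimate_seconds(STARK_PROOF_SECONDS, proof_count)
print(f"{proof_count} proofs: app ~{app_total:.0f}s, stark ~{stark_total:.0f}s")
```

With the unrounded inputs the app-proof estimate comes out somewhat below the conservative figure quoted above, while the Stark estimate stays close to three minutes. Neither includes remote EVM proof generation.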


Notes from meeting 2025.08.14:

  • We don’t think that it’s possible to generate EVM proofs locally with OpenVM.
  • Currently app and STARK proofs are not zero-knowledge. Axiom plans to make them zero-knowledge (their ETA was ~2 months) but this may impact proving time (we don’t know yet).
  • What we want to compare: the actual remote proving times.
  • PLONK proofs may cost additional gas to verify.
  • Axiom’s next release includes an open source GPU prover.

1st priority: protocol adapter updates (@xuyang @ArtemG)

Next step on OpenVM: get end-to-end remote proving benchmark for OpenVM mock compliance circuit (@xuyang)

Other optimization directions:

  • Reduce # of resources in an application
    • @vveiln to analyze whether we can build a simpler version of shielded kudos which requires fewer resources - forum post incoming, follow up early next week
      • One denomination per ERC20
      • No receive authorization
      • Think about how send authorization should work (signature?)
      • Called something else
      • Denominations are only tied to ERC20s
  • Increase # of resources in a single compliance circuit (@xuyang) - forum post incoming
    • Currently: 1 consumed, 1 created
    • Could extend this to more and have each action use one compliance circuit; then we don’t need dummy resources anymore. We still need to figure out the relationship between nonce and nullifier, though (perhaps hash again to generate more nonces).
      • More fundamentally, need a way to make the new commitment unique
      • @xuyang has a rough idea of what to do here
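
One possible reading of "hash again to generate more nonces" is sketched below. It is purely hypothetical and is not the design @xuyang has in mind: the scheme nonce_i = SHA-256(tag || nullifier || i), the domain tag, and the function name `derive_nonces` are all assumptions made for illustration, and the nullifier is assumed to be a fixed-length 32-byte value.

```python
import hashlib

def derive_nonces(nullifier: bytes, count: int) -> list[bytes]:
    """Derive `count` distinct 32-byte nonces from one nullifier.

    Hypothetical scheme: nonce_i = SHA-256(domain tag || nullifier || i).
    """
    tag = b"hypothetical-nonce-derivation"  # illustrative domain separator
    return [
        hashlib.sha256(tag + nullifier + index.to_bytes(4, "big")).digest()
        for index in range(count)
    ]

# Stand-in nullifier for the example; a real one would come from the RM.
nullifier = hashlib.sha256(b"example consumed resource").digest()
nonces = derive_nonces(nullifier, 3)
print([nonce.hex()[:16] for nonce in nonces])
```

Indexing the hash input gives each created resource in the same compliance unit a different nonce, which is one way the new commitments could be kept unique. Whether this is sound for the actual shielded RM would need analysis.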

Duplicating info here for visibility:

We explored two approaches that might allow for secure proof generation.

Full Local EVM Proofs

The user generates all the proofs for the transactions - and hence the entire RM transaction - locally. This would of course be preferable for any cypherpunks and people actually interested in security.

However, this is practically unattainable currently.

Firstly, we have to take into account that the user will have to download an 18 GB proving key. That part is OK, since the user would need to do it only once. If they are interested in security that is bearable, although - if we’re comparing ourselves to Railgun - this may take hours depending on internet speed. It took 3 hours for me, but I have a pretty bad connection.

Secondly, the local generation of EVM proofs currently ranges from very slow to impossible.

Running the sha256 example from the official repo locally takes about 20 minutes - for a single hash function run and one equality comparison! - and nearly made my 64 GB RAM machine shut down. At some point it simply froze while running at 99+% of RAM usage.

While testing the mock compliance proof generation provided by @xuyang, my laptop simply ran out of memory after 20+ minutes.

Local app-proving and server-side aggregation proof

The user generates all the app proofs and then sends them to e.g. us to aggregate them into an EVM-verifiable proof.

Currently this does not actually grant us anything because it seems that the app proofs lack any ZK properties. While theoretically it may not be easy to get any info from them, for all intents and purposes we should look at it as basically being transparent.

However, as @cwgoes mentioned, if they can make them ZK, then it might be worth returning to this idea in the future, hoping that the proof sizes and times do not escalate as a result.