Using Zero-Knowledge Proofs to prevent spam in erasure coding

Various approaches to prevent spam have been listed in specs. A more generic question on the same was also posted at How is spam prevented in Marlin?.

Marlin splits blocks into erasure-coded chunks to ensure that blocks are propagated through the network despite the presence of malicious actors. As an individual relayer can’t detect if a chunk is from a valid block just by looking at a single chunk, a malicious source may publish an invalid block which may spam the network. Staking or producer payments might be insufficient to thwart externally motivated actors. To ensure such malicious sources are penalized/make griefing costly, it is necessary to prove on-chain that the published set of chunks are invalid.

A Naive way of solving this problem is to reconstruct the block from erasure-coded chunks on a smart contract and slash the source. This approach is limited by the amount of data smart contracts can handle.

Current Gas Limit in Ethereum = 10 million gas

Gas per byte of call data (Post EIP 2028) = 16 gas/Byte

Max calldata that can be sent to a smart contract = 10000000/16 = 625 KB

As calculated above the maximum calldata size is 625 KB which is less than the size of a bitcoin block ignoring any increase in size due to erasure coding. Hence it will not be feasible to directly reconstruct block on a smart contract.

We have explored multiple approaches and scaling techniques available which are listed below

  • Optimistic Rollups: Optimistic Rollups are useful to reduce the compute performed onchain but the input data should still be pushed onchain for availability of the transaction data. This is not possible as we have seen above the block sizes might be higher than the maximum input data that can be sent to Ethereum.
  • Plasma: Plasma chains might support a higher gas limit compared to the base chain but it is still hard to send the input as a block of other blockchain which can be arbitraryly large.
  • State Channels: Assuming existence of state channel tech which can support arbitrary computations. In case of conflicts the data has to be verified onchain which leads us back to same problem of pushing block data onchain.
  • Truebit: Trubit can solve the computational aspect of the onchain verification but similar to other techniques it is also prone to sending the block data onchain.

ZkSnarks might be a valid approach to solve the problem. In case of ZkSnarks, the input can be private data and hence need not be published onchain and only the hashes of the chunks could be published. The proof size and verification time is O(1) which ensures that the proof can be verified on the smart contract for a reasonably large number of chunks.

The challenge with this approach is that as we might have to perform a significant number of operations to create a hash, the ZkSnark circuit size blows up. As an estimate, around 57000 constraints are necessary for generating a hash using Zokrates. This is a significant issue as we want to verify the transaction Merkle tree where we need to perform a significant number of hashes based on the number of transactions in the tree. So as the number of constraints blows up, the proof generation time also increases.

We have to look at ways to decrease the proof generation time by finding an efficient algorithm and accelerating compute perhaps using GPU and ensure that it can be performed on consumer-grade hardware.