Approaches to Verifiable Machine Learning
The scorer's dilemma
For an off-chain credit scorer, it is advantageous to be able to prove properties of the model, and of how any score was computed, without revealing the exact model. This is the ideal setup for most blockchain applications: data is stored on-chain, computation is done off-chain, and only the proofs are posted on-chain. There is therefore no reason for proofs of computation to be limited to valid state transitions (as in zk-rollups); they can extend to arbitrarily complex computations that can be cast as a zero-knowledge proof statement.
A starting model: ZKP for Decision Trees
A decision tree is structurally similar to a Merkle tree, so at a high level a zero-knowledge proof for a decision tree can be viewed as a combination of Merkle trees and SNARK proofs.
Steps to produce a proof that a Decision Tree performed as expected:
- We first train the decision tree on public blockchain activity.
- The next step is to merkle-ize the decision tree, which lets the scorer cryptographically commit to it. Because the commitment is published before any score is assigned, a user who interacts with the system can check that their credit score was computed with the committed model. This shows that no ad-hoc changes were made to the model that could bias the score classification.
- The final step is to produce proofs that the credit score was calculated honestly, that is, from the blockchain data and the committed decision tree.
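The commitment and the honesty check in the steps above can be sketched in a few lines of Python. This is a minimal illustration, not the scheme's actual construction: the node layout (`Leaf`, `Split`), the SHA-256 hashing with `0x00`/`0x01` domain tags, the 8-byte integer encoding, and the helper names `node_hash`, `score_with_path` and `verify` are all assumptions made for the example. Note that the path is revealed in the clear here; a zero-knowledge version would keep it private.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Leaf:
    score: int


@dataclass
class Split:
    feature: int      # index into the feature vector
    threshold: int    # go left when x[feature] <= threshold
    left: "Node"
    right: "Node"


Node = Union[Leaf, Split]
# One step of a root-to-leaf path: (feature, threshold, sibling_hash, went_left)
Step = Tuple[int, int, bytes, bool]


def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


def enc(n: int) -> bytes:
    # Fixed-width encoding so concatenated fields cannot be confused.
    return n.to_bytes(8, "big", signed=True)


def node_hash(node: Node) -> bytes:
    """Merkle-ize the tree: the hash of the root is the commitment."""
    if isinstance(node, Leaf):
        return h(b"\x00", enc(node.score))
    return h(b"\x01", enc(node.feature), enc(node.threshold),
             node_hash(node.left), node_hash(node.right))


def score_with_path(root: Node, x: List[int]) -> Tuple[int, List[Step]]:
    """Scorer side: evaluate the tree and record the path that was taken."""
    path: List[Step] = []
    node = root
    while isinstance(node, Split):
        went_left = x[node.feature] <= node.threshold
        child, sibling = ((node.left, node.right) if went_left
                          else (node.right, node.left))
        path.append((node.feature, node.threshold, node_hash(sibling), went_left))
        node = child
    return node.score, path


def verify(commitment: bytes, x: List[int], score: int, path: List[Step]) -> bool:
    """Verifier side: rebuild the root from the leaf and check every branch."""
    acc = h(b"\x00", enc(score))
    for feature, threshold, sibling, went_left in reversed(path):
        if (x[feature] <= threshold) != went_left:
            return False  # the branch taken contradicts the input data
        left, right = (acc, sibling) if went_left else (sibling, acc)
        acc = h(b"\x01", enc(feature), enc(threshold), left, right)
    return acc == commitment


# Toy model: two features, three possible scores.
tree = Split(0, 10, Leaf(300), Split(1, 5, Leaf(600), Leaf(800)))
commitment = node_hash(tree)
score, path = score_with_path(tree, [25, 3])
```

With the toy tree above, `verify(commitment, [25, 3], score, path)` accepts, while a different claimed score or a different feature vector is rejected, because either the rebuilt root no longer matches the commitment or a recorded branch contradicts the data.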
The model flow
zk-SNARKs are used to hide the particular path taken through the decision tree, which prevents users from colluding to learn the model parameters. The statement we prove is that we know a valid path in the tree that is consistent with the commitment, without revealing the path itself.
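One way to write this statement down as a relation is sketched below. The notation is an assumption made for illustration and does not come from a specific construction: $c$ is the public commitment (the Merkle root), $x$ the feature vector derived from public blockchain data, $s$ the claimed score, $H$ a collision-resistant hash, and the private witness $\pi$ is the path of depth $d$, where each step holds a feature index $f_i$, a threshold $t_i$, a sibling hash $h_i$ and a direction bit $b_i$. The block assumes the `amsmath` and `amssymb` packages.

```latex
\[
\begin{aligned}
\mathcal{R} = \bigl\{\, \bigl((c, x, s);\ \pi\bigr) \ :\
  & \pi = (f_i, t_i, h_i, b_i)_{i=1}^{d}, \\
  & a_d = H(0 \,\|\, s), \\
  & b_i = \mathbb{1}[\, x_{f_i} \le t_i \,] \quad (1 \le i \le d), \\
  & a_{i-1} =
    \begin{cases}
      H(1 \,\|\, f_i \,\|\, t_i \,\|\, a_i \,\|\, h_i) & \text{if } b_i = 1, \\
      H(1 \,\|\, f_i \,\|\, t_i \,\|\, h_i \,\|\, a_i) & \text{if } b_i = 0
    \end{cases}
    \quad (1 \le i \le d), \\
  & a_0 = c \,\bigr\}
\end{aligned}
\]
```

The verifier sees only $(c, x, s)$ and the proof. The thresholds, feature indices and sibling hashes in $\pi$ stay hidden, which is what keeps users from piecing the model together across many scoring requests.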