Challenge 1: Credit Scoring

Submission File Format

The submission file should include the following for each address in the validation dataset:

  • Predicted probabilities of liquidation

  • Predicted labels

  • All feature values used to make the prediction

The predictions should be formatted as follows (and submitted via Spectral CLI):

borrower_id,pred_prob,pred_label,feature_1,feature_2,...,feature_n-1,feature_n
0xa,0.3532861062879764,0,-0.5532861062879764,-0.5228489554109936,...,-0.6116379823629728,0.6216379823629728
0xb,0.5532861062879764,1,-0.11900525633358609,0.34030821858390026,...,-0.4023606161112323,-0.3023606161112323
...
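The layout above can be produced with a few lines of Python. The following is a minimal sketch, not the official tooling: the helper names `build_submission_rows` and `write_submission` are hypothetical, and the 0.5 cut-off used to derive `pred_label` from `pred_prob` is an assumption (the format above does not specify how labels are derived). Submitting the resulting file through the Spectral CLI is not shown.

```python
import csv
import io


def build_submission_rows(borrower_ids, probs, features, threshold=0.5):
    """Return a header row plus one row per borrower.

    `threshold` is an assumed cut-off for turning a probability into a
    label; it is not specified by the challenge documentation.
    """
    n_features = len(features[0])
    header = ["borrower_id", "pred_prob", "pred_label"]
    header += [f"feature_{i}" for i in range(1, n_features + 1)]
    rows = [header]
    for borrower_id, prob, feats in zip(borrower_ids, probs, features):
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"{borrower_id}: {prob} is not a probability")
        if len(feats) != n_features:
            raise ValueError(f"{borrower_id}: expected {n_features} features")
        rows.append([borrower_id, prob, int(prob >= threshold), *feats])
    return rows


def write_submission(rows, file_handle):
    """Write the rows as comma-separated values, one row per line."""
    csv.writer(file_handle, lineterminator="\n").writerows(rows)


# Toy data with two features per borrower (illustrative values only).
rows = build_submission_rows(
    borrower_ids=["0xa", "0xb"],
    probs=[0.35, 0.55],
    features=[[-0.55, 0.62], [-0.12, -0.30]],
)
buffer = io.StringIO()
write_submission(rows, buffer)
submission_text = buffer.getvalue()
print(submission_text)
```

To write to disk instead of an in-memory buffer, pass a handle opened with `open(path, "w", newline="")` to `write_submission`.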

Note: Depending on the model architecture, some models predict logits instead of probabilities. Please ensure that your model outputs probabilities directly, without a separate step that converts logits into probabilities through torch.nn.Sigmoid() or similar functions, so that it remains compatible with our zkML setup.

Model Validation Criteria

All submitted models will be evaluated on the weighted average of the following seven model validation metrics:

  • Area Under the Receiver Operating Characteristic Curve (AUC/AUROC)

  • Area Under the Precision-Recall Curve (PR-AUC)

  • Recall Score

  • F1 Score

  • Brier Score (since a lower Brier Score is better, models are scored on 1 - Brier Score)

  • Kolmogorov-Smirnov Statistic (KS Statistic)

  • Predicted Probability Densities (difference between the median predicted probabilities of the two labels)

These metrics will be calculated for the predictions (probabilities + labels) returned by the modeler on the validation dataset.
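To sanity-check a submission locally, most of these metrics can be approximated in plain Python. The sketch below is an illustration under stated assumptions, not the official scoring code: the function names are hypothetical, label 1 is assumed to mean "liquidated", the median gap is assumed to be taken as label 1 minus label 0, and PR-AUC is omitted for brevity. The official implementation may differ in tie handling and other details.

```python
from statistics import median


def split_by_label(y_true, probs):
    """Separate predicted probabilities by their true label."""
    pos = [p for t, p in zip(y_true, probs) if t == 1]
    neg = [p for t, p in zip(y_true, probs) if t == 0]
    return pos, neg


def recall_and_f1(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def one_minus_brier(y_true, probs):
    """1 - mean squared error between probabilities and labels."""
    mse = sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)
    return 1.0 - mse


def roc_auc(y_true, probs):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos, neg = split_by_label(y_true, probs)
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def ks_statistic(y_true, probs):
    """Largest gap between the empirical CDFs of the two classes."""
    pos, neg = split_by_label(y_true, probs)
    return max(
        abs(sum(p <= t for p in pos) / len(pos) - sum(n <= t for n in neg) / len(neg))
        for t in sorted(set(probs))
    )


def median_gap(y_true, probs):
    """Median probability of label 1 minus that of label 0 (assumed direction)."""
    pos, neg = split_by_label(y_true, probs)
    return median(pos) - median(neg)


# Toy data (illustrative only); labels use an assumed 0.5 cut-off.
y_true = [0, 0, 1, 1, 0, 1]
probs = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]
y_pred = [int(p >= 0.5) for p in probs]

recall, f1 = recall_and_f1(y_true, y_pred)
results = {
    "roc_auc": roc_auc(y_true, probs),
    "recall": recall,
    "f1": f1,
    "one_minus_brier": one_minus_brier(y_true, probs),
    "ks_statistic": ks_statistic(y_true, probs),
    "median_gap": median_gap(y_true, probs),
}
print(results)
```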

The respective weights and knock-out thresholds for each of the above metrics are:

Additional Details:

  • The overall Model Score (a number between 0 and 100 inclusive) is the weighted average of all seven metrics, computed from their respective weights (akin to Excel’s SUMPRODUCT function).

  • The Knock-Out Thresholds indicate the minimum required value of each metric, i.e., any model for which any of the seven metrics falls below its knock-out threshold will be automatically discarded, irrespective of its overall Model Score.
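The scoring rule described above can be sketched as follows. This is a hedged illustration only: the function name `model_score` is hypothetical, the weights and knock-out thresholds below are placeholders rather than the official values, and scaling the weighted average by 100 is an assumption made to match the stated 0 to 100 range.

```python
def model_score(metrics, weights, knockouts):
    """Return (score, failed_metrics).

    The score is None when any metric is below its knock-out threshold;
    otherwise it is the weighted average of the metrics scaled to 0-100
    (the scaling is an assumption, not confirmed by the documentation).
    """
    failed = sorted(name for name, value in metrics.items() if value < knockouts[name])
    if failed:
        return None, failed
    total_weight = sum(weights[name] for name in metrics)
    weighted_sum = sum(metrics[name] * weights[name] for name in metrics)
    return 100.0 * weighted_sum / total_weight, []


METRIC_NAMES = [
    "roc_auc", "pr_auc", "recall", "f1",
    "one_minus_brier", "ks_statistic", "median_gap",
]

# Placeholder values: NOT the official weights or thresholds.
weights = {name: 1.0 for name in METRIC_NAMES}
knockouts = {name: 0.5 for name in METRIC_NAMES}

# Hypothetical metric values for two models.
passing = {name: 0.8 for name in METRIC_NAMES}
failing = dict(passing, recall=0.3)  # recall below its knock-out threshold

passing_score, passing_failed = model_score(passing, weights, knockouts)
failing_score, failing_failed = model_score(failing, weights, knockouts)
print(passing_score, passing_failed)
print(failing_score, failing_failed)
```

The knock-out check runs before the weighted average, so a model with a high average can still be discarded on a single weak metric, as the second example shows.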
