[Step 3] Fetch Training Dataset

Please refer to our Starter Kit to see how you can participate as a Modeler!


To participate in a challenge, you first need its CHALLENGE_ID. To list the available challenges and their IDs, run:

spectral-cli list-challenges

Sample output

Available challenges:
Credit Scoring: 0xFDC1BE05aD924e6Fc4Ab2c6443279fF7C0AB5544
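
If you want to pick up the CHALLENGE_ID in a script rather than copying it by hand, the listing can be parsed with a few lines of Python. This is a minimal sketch that assumes every challenge line follows the `Name: 0x...` shape of the sample output above; the real CLI output may differ, and the helper name `parse_challenges` is hypothetical, not part of spectral-cli.

```python
import re

# Assumption: each challenge line looks like "<name>: 0x<40 hex characters>",
# as in the sample output. Lines that do not match (e.g. the header) are skipped.
CHALLENGE_LINE = re.compile(r"^\s*(?P<name>.+?):\s*(?P<id>0x[0-9a-fA-F]{40})\s*$")


def parse_challenges(cli_output: str) -> dict[str, str]:
    """Map challenge names to CHALLENGE_IDs (hypothetical helper)."""
    challenges = {}
    for line in cli_output.splitlines():
        match = CHALLENGE_LINE.match(line)
        if match:
            challenges[match.group("name")] = match.group("id")
    return challenges


sample = """Available challenges:
Credit Scoring: 0xFDC1BE05aD924e6Fc4Ab2c6443279fF7C0AB5544
"""
print(parse_challenges(sample))
```

Keying the result by challenge name keeps the lookup readable, for example `parse_challenges(text)["Credit Scoring"]`.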

Fetch Training Dataset

The ID shown in the sample output is only an example. Please confirm the exact CHALLENGE_ID from the output of your own Spectral CLI!

Then, you can fetch the training dataset for the challenge:

spectral-cli fetch-training-data <CHALLENGE_ID>
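
When the fetch step is scripted, it can help to check the ID before invoking the CLI. The sketch below only builds the command shown above; it assumes a CHALLENGE_ID is shaped like an Ethereum address (`0x` plus 40 hex characters), which is inferred from the sample ID and not confirmed by the CLI documentation. The helper name `build_fetch_command` is hypothetical.

```python
import re
import shlex

# Assumption: a CHALLENGE_ID has the shape of an Ethereum address.
ADDRESS = re.compile(r"0x[0-9a-fA-F]{40}")


def build_fetch_command(challenge_id: str) -> list[str]:
    """Return the argv for `spectral-cli fetch-training-data <CHALLENGE_ID>`."""
    if not ADDRESS.fullmatch(challenge_id):
        raise ValueError(f"unexpected CHALLENGE_ID format: {challenge_id!r}")
    return ["spectral-cli", "fetch-training-data", challenge_id]


cmd = build_fetch_command("0xFDC1BE05aD924e6Fc4Ab2c6443279fF7C0AB5544")
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the command as a list avoids shell quoting issues if it is later handed to `subprocess.run`.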

Challenge 1: Data Dictionary

Disclaimer on Challenge 1's data dictionary

  1. Given that more than 800,000 cryptocurrencies exist, we applied some filters and decisions, possibly conservative ones, to mitigate the impact of outliers.

  2. Each row in the dataset corresponds to a borrow event, and feature values are generally calculated as of that borrow event, using data up to and including the prior block.

  3. DeFi protocols considered in this dataset are Compound v2 and Aave v2 on Ethereum.

  4. Conceptually, the risk factor is the reciprocal of the health factor; the reciprocal is used for numerical stability.

  5. We define risky smart contracts (unrelated to the risk factor) as tokens that have exhibited extreme volatility (a price drop of more than 90% in a day) or a lack of trading volume (less than $1,000 within a day) on DEXs. We define risky transactions as transactions with risky smart contracts.

  6. All timestamps are in UTC+0 (relevant for features that observe daily values).

  7. ERC20 token amounts and transactions (normal wallet transactions, DeFi transactions, etc.) are converted to their value in ETH using the price at the first block of the day. Some tokens that are infrequently used or have pricing issues have been excluded to err on the side of caution.

  8. Whenever an intermediary smart contract is involved in a DeFi transaction, the transaction is traced back to the EOA (externally owned account) that initiated it.
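
To make points 4, 5 and 7 concrete, here is a minimal Python sketch of the three definitions. It is an illustration under stated assumptions, not the pipeline used to build the dataset: all function and parameter names are hypothetical, the daily price drop is assumed to be measured from the start to the end of the day, and the ETH price is assumed to be supplied by the caller.

```python
def risk_factor(health_factor: float) -> float:
    """Point 4: risk factor as the reciprocal of health factor.

    An unbounded health factor (a position with no debt) maps to 0.0,
    which is easier to handle numerically than infinity.
    """
    if health_factor <= 0:
        raise ValueError("health factor must be positive")
    return 1.0 / health_factor


def is_risky_token(price_start: float, price_end: float, volume_usd: float) -> bool:
    """Point 5: more than a 90% price drop in a day, or under $1,000 daily volume.

    Assumption: the drop is measured from the start to the end of the day.
    """
    drop = (price_start - price_end) / price_start
    return drop > 0.90 or volume_usd < 1_000


def to_eth(token_amount: float, token_price_in_eth: float) -> float:
    """Point 7: convert a token amount to ETH.

    Assumption: token_price_in_eth is the price at the first block of the day.
    """
    return token_amount * token_price_in_eth


print(risk_factor(2.0))
print(risk_factor(float("inf")))
print(is_risky_token(100.0, 5.0, 50_000.0))
print(to_eth(200.0, 0.25))
```

Note how a health factor that grows without bound becomes a risk factor of zero, which is one way to read the "numerical stability" remark in point 4.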
