---
license: mit
---
This classifier is fine-tuned from [deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on [this forecastability classification dataset](https://huggingface.co/datasets/noanabeshima/forecastability_classification) to predict whether Claude 3.7 Sonnet considers a [fineweb](https://huggingface.co/datasets/HuggingFaceFW/fineweb/viewer/default/train) document 'forecastable', i.e. a useful seed for generating pastcasting questions.
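Although the classifier has a ROC AUC of 0.9625, only ~2% of fineweb documents are considered forecastable, so the low base rate weighs heavily on precision. The classifier's precision/recall curves on random unseen fineweb documents look like this:

[Figure: precision/recall curves on random unseen fineweb documents]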

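To illustrate why a ~2% base rate limits precision even when ROC AUC is high, here is a minimal sketch that computes the precision implied by a given true-positive rate and false-positive rate. The helper `precision_at` and the operating points in the loop are hypothetical, chosen only for illustration; they are not measurements of this classifier.

```python
def precision_at(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision implied by a (TPR, FPR) operating point and a positive base rate."""
    true_pos = tpr * base_rate            # fraction of all documents that are true positives
    false_pos = fpr * (1.0 - base_rate)   # fraction of all documents that are false positives
    return true_pos / (true_pos + false_pos)


BASE_RATE = 0.02  # ~2% of fineweb documents are considered forecastable

# Hypothetical (TPR, FPR) operating points, NOT taken from this classifier.
for tpr, fpr in [(0.90, 0.10), (0.80, 0.05), (0.50, 0.01)]:
    p = precision_at(tpr, fpr, BASE_RATE)
    print(f"TPR={tpr:.2f} FPR={fpr:.2f} -> precision={p:.3f}")
```

Because negatives outnumber positives roughly 49 to 1, even a small false-positive rate contributes many false positives, so choosing a threshold from the precision/recall curve is likely more informative than relying on ROC AUC alone.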
To load the model, use:
```
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained('noanabeshima/forecastability-classifier-v1')
tokenizer = AutoTokenizer.from_pretrained('noanabeshima/forecastability-classifier-v1')
```