---
base_model: Qwen/Qwen3-8B
datasets:
- nikhilchandak/OpenForesight
language:
- en
license: mit
pipeline_tag: text-generation
library_name: transformers
tags:
- forecasting
- reasoning
- question-answering
- reinforcement-learning
- calibration
---

# OpenForecaster-8B

**OpenForecaster-8B** is a specialized language model for forecasting future events. It is post-trained from **Qwen3-8B** using reinforcement learning on the [OpenForesight dataset](https://huggingface.co/datasets/nikhilchandak/OpenForesight), and was introduced in the paper [Scaling Open-Ended Reasoning to Predict the Future](https://huggingface.co/papers/2512.25070).

**🌐 [Website](https://openforecaster.github.io) | 📄 [Paper](https://huggingface.co/papers/2512.25070) | 💻 [Code](https://github.com/OpenForecaster/scaling-forecasting-training)**

## Model Description

OpenForecaster-8B is designed to make calibrated predictions on open-ended questions about future events. The model has been trained to:

- Provide calibrated confidence estimates when asked to do so
- Reason about uncertainty and future scenarios
- Leverage retrieved information (when provided in context) to improve predictions

## Training

This model was trained on the **OpenForesight** dataset, which contains over 52,000 forecasting questions generated from global news events. Training used Group Relative Policy Optimization (GRPO) to optimize a joint reward combining accuracy and Brier score.

**Base Model**: Qwen3-8B

**Training Dataset**: [OpenForesight](https://huggingface.co/datasets/nikhilchandak/OpenForesight)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nikhilchandak/OpenForecaster-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prompt template
prompt = "What is the likelihood that [future event] will occur by [date]?"
# Example prompt
prompt = "Who will become the next Prime Minister of India based on the general election to be held in 2029? Provide specific predictions with probabilities."

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8192)
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(prediction)
```

## Performance

OpenForecaster-8B achieves performance competitive with much larger models such as DeepSeek-v3 and Qwen3-235B-A22B on forecasting benchmarks. Key improvements include:

- **Improved Accuracy**: Better prediction of future events
- **Better Calibration**: More reliable confidence estimates
- **Enhanced Consistency**: Fewer logical violations in predictions

## Citation

```bibtex
@article{openforesight2025,
  title  = {Scaling Open-Ended Reasoning to Predict the Future},
  author = {Chandak, Nikhil and Goel, Shashwat and Prabhu, Ameya and Hardt, Moritz and Geiping, Jonas},
  year   = {2025}
}
```

## License

This model is released under the MIT License.

## Contact

For questions or issues, please visit our [website](https://openforecaster.github.io) or open an issue on the model repository.
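
## Example: Computing a Brier Score

The Brier score referenced in the Training section is a standard measure of probabilistic forecast quality. The sketch below is a minimal illustration of how the metric is computed; it is not the reward implementation used during training.

```python
# Brier score: mean squared error between predicted probabilities
# and binary outcomes (1 = event occurred, 0 = it did not).
# 0.0 is a perfect score; always predicting 0.5 scores 0.25.

def brier_score(probs, outcomes):
    assert len(probs) == len(outcomes) and len(probs) > 0
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A sharp, well-calibrated forecaster earns a low score:
print(round(brier_score([0.9, 0.2, 0.7], [1, 0, 1]), 4))  # 0.0467
```

Lower is better: hedging everything at 0.5 caps the score at 0.25, so confident predictions only pay off when they are also well calibrated.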