---
base_model: Qwen/Qwen3-8B
datasets:
- nikhilchandak/OpenForesight
language:
- en
license: mit
pipeline_tag: text-generation
library_name: transformers
tags:
- forecasting
- reasoning
- question-answering
- reinforcement-learning
- calibration
---

# OpenForecaster-8B

**OpenForecaster-8B** is a specialized language model for forecasting future events. It is post-trained from **Qwen3-8B** using reinforcement learning on the [OpenForesight dataset](https://huggingface.co/datasets/nikhilchandak/OpenForesight), and was introduced in the paper [Scaling Open-Ended Reasoning to Predict the Future](https://huggingface.co/papers/2512.25070).

**🌐 [Website](https://openforecaster.github.io) | 📄 [Paper](https://huggingface.co/papers/2512.25070) | 💻 [Code](https://github.com/OpenForecaster/scaling-forecasting-training)**

## Model Description

OpenForecaster-8B is designed to make calibrated predictions on open-ended questions about future events. The model has been trained to:

- Provide calibrated confidence estimates when asked to do so
- Reason about uncertainty and future scenarios
- Leverage retrieved information (when provided in context) to improve predictions

## Training

This model was trained on the **OpenForesight** dataset, which contains over 52,000 forecasting questions generated from global news events. Training used Group Relative Policy Optimization (GRPO) to optimize a joint reward combining accuracy and Brier score.

**Base Model**: Qwen3-8B

**Training Dataset**: [OpenForesight](https://huggingface.co/datasets/nikhilchandak/OpenForesight)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nikhilchandak/OpenForecaster-8B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Prompt template
prompt = "What is the likelihood that [future event] will occur by [date]?"
# Example prompt
prompt = "Who will become the next Prime Minister of India based on the general election to be held in 2029? Provide specific predictions with probabilities."

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8192)
prediction = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(prediction)
```

## Performance

OpenForecaster-8B achieves performance competitive with much larger models such as DeepSeek-v3 and Qwen3-235B-A22B on forecasting benchmarks. Key improvements include:

- **Improved Accuracy**: Better prediction of future events
- **Better Calibration**: More reliable confidence estimates
- **Enhanced Consistency**: Fewer logical violations in predictions

## Citation

```bibtex
@article{openforesight2025,
  title  = {Scaling Open-Ended Reasoning to Predict the Future},
  author = {Chandak, Nikhil and Goel, Shashwat and Prabhu, Ameya and Hardt, Moritz and Geiping, Jonas},
  year   = {2025}
}
```

## License

This model is released under the MIT License.

## Contact

For questions or issues, please visit our [website](https://openforecaster.github.io) or open an issue on the model repository.
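
## Example: Computing a Brier Score

The Brier score referenced in the Training section is a standard measure of probabilistic forecast quality. The sketch below is a minimal illustration of how the metric is computed; it is not the reward implementation used during training.

```python
# Brier score: mean squared error between predicted probabilities
# and binary outcomes (1 = event occurred, 0 = it did not).
# 0.0 is a perfect score; always predicting 0.5 scores 0.25.

def brier_score(probs, outcomes):
    assert len(probs) == len(outcomes) and len(probs) > 0
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)

# A sharp, well-calibrated forecaster earns a low score:
print(round(brier_score([0.9, 0.2, 0.7], [1, 0, 1]), 4))  # 0.0467
```

Lower is better: hedging everything at 0.5 caps the score at 0.25, so confident predictions only pay off when they are also well calibrated.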