FinQA Overview
FinQA is a financial question-answering agent fine-tuned from Qwen3-4B-Instruct-2507 using reinforcement learning (RL). The model answers questions about SEC 10-K financial statements using specialized tools (SQL queries, table lookup, calculators), achieving 59.70% accuracy on Snorkel Finance Benchmark and 26.6% on Snorkel Finance Reasoning.
Data
Our training dataset is built from SEC 10-K filings and consists of 5,110 question-answer pairs across:
- 207 companies spanning multiple sectors
- 6,923 financial tables extracted from 10-K filings
- Single-table questions: Direct lookups and calculations from individual tables
- Multi-table questions: Cross-table reasoning requiring data from multiple sources
The dataset is available on HuggingFace.
Tools
The agent uses 4 specialized tools for financial analysis:
| Tool | Description |
|---|---|
get_table_names |
List available tables for a given company |
get_table_info |
Get table metadata, columns, dtypes, and sample values |
sql_query |
Execute SQL queries on financial tables (SQLite) |
calculator |
Evaluate mathematical expressions |
Training
We fine-tune Qwen3-4B-Instruct-2507 using GRPO with LLM-as-judge rewards for correctness evaluation. A more detailed description of the training recipe can be found in our documentation.
Evaluation
| Model | FinQA | FinQA Reasoning |
|---|---|---|
| Qwen3-4B-Instruct-2507 (Base) | 27.90% | 13.90% |
| gpt-5-nano-2025-08-07 | 50.00% | 26.60% |
| Qwen3-235B-A22B | 51.37% | 18.90% |
| rLLM-FinQA-4B (Ours) | 59.70% | 26.60% |
| Gemini-2.5-Pro-Preview | 60.60% | 34.60% |
| GPT-4.1-2025-04-14 | 62.70% | 37.90% |
| o3-mini-2025-01-31 | 63.79% | 30.37% |
Serving FinQA
Start a vLLM server and run the agent:
python -m vllm.entrypoints.openai.api_server \
--model rLLM/rLLM-FinQA-4B \
--host 0.0.0.0 \
--port 30000 \
--dtype bfloat16
python -m projects.finqa.run_finqa
For detailed setup instructions, see the project README.
Acknowledgement
- This is a joint collaboration between the rLLM team at UC Berkeley and Snorkel AI.
- Our model is trained on top of
Qwen3-4B-Instruct-2507. - Our work is done as part of Berkeley Sky Computing Lab.
Citation
@misc{rllm2026finqa,
title={FinQA: Training Financial Agents with Reinforcement Learning},
author={Manan Roongta and Sijun Tan and Bhavishya Pohani and Charles Dickens and Christopher Glaze},
year={2026},
howpublished={\url{https://rllm-project.com/post.html?post=finqa.md}},
note={Blog Post}
}
- Downloads last month
- 27
Model tree for rLLM/rLLM-FinQA-4B
Base model
Qwen/Qwen3-4B-Instruct-2507