---
license: apache-2.0
library_name: transformers
datasets:
- rLLM/rLLM-FinQA-Dataset
language:
- en
base_model:
- Qwen/Qwen3-4B-Instruct-2507
pipeline_tag: text-generation
tags:
- finance
- tool-use
- agent
---
<div align="center">
<span style="font-family: default; font-size: 1.5em;">FinQA</span>
<div>
Training Financial Agents with Reinforcement Learning
</div>
</div>
<br>
<div align="center" style="line-height: 1;">
  <a href="https://github.com/rllm-org/rllm" style="margin: 2px;">
    <img alt="Code" src="https://img.shields.io/badge/FinQA-000000?style=for-the-badge&logo=github&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://rllm-project.com/post.html?post=finqa.md" target="_blank" style="margin: 2px;">
    <img alt="Blog" src="https://img.shields.io/badge/Blog-%23000000.svg?style=for-the-badge&logo=notion&logoColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://x.com/rllm_project" style="margin: 2px;">
    <img alt="X" src="https://img.shields.io/badge/rLLM-white?style=for-the-badge&logo=X&logoColor=000&color=000&labelColor=white" style="display: inline-block; vertical-align: middle;"/>
  </a>
  <a href="https://huggingface.co/rLLM" style="margin: 2px;">
    <img alt="Hugging Face" src="https://img.shields.io/badge/rLLM-fcd022?style=for-the-badge&logo=huggingface&logoColor=000&labelColor" style="display: inline-block; vertical-align: middle;"/>
  </a>
</div>

## FinQA Overview

FinQA is a financial question-answering agent fine-tuned from Qwen3-4B-Instruct-2507 with reinforcement learning (RL). The model answers questions about SEC 10-K financial statements using specialized tools (SQL queries, table lookups, a calculator), achieving 59.70% accuracy on the Snorkel Finance Benchmark and 26.60% on Snorkel Finance Reasoning.

## Data

Our training dataset is built from SEC 10-K filings and consists of 5,110 question-answer pairs covering:

- **207 companies** spanning multiple sectors
- **6,923 financial tables** extracted from 10-K filings
- **Single-table questions**: direct lookups and calculations on individual tables
- **Multi-table questions**: cross-table reasoning that requires combining data from multiple sources

The dataset is available on [Hugging Face](https://huggingface.co/datasets/rLLM/rLLM-FinQA-Dataset).

## Tools

The agent uses four specialized tools for financial analysis:

| Tool | Description |
|------|-------------|
| `get_table_names` | List the available tables for a given company |
| `get_table_info` | Get a table's metadata: columns, dtypes, and sample values |
| `sql_query` | Execute SQL queries on financial tables (SQLite) |
| `calculator` | Evaluate mathematical expressions |

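To make the tool interface concrete, here is a minimal sketch of how `get_table_names`, `get_table_info`, and `sql_query` could be backed by an in-memory SQLite database. This is an illustration, not the repository's actual implementation; the `income_statement` table and its figures are made up for the example.

```python
# Illustrative sketch of the SQLite-backed tools; the table name, schema,
# and values below are hypothetical, not part of the FinQA dataset.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE income_statement (fiscal_year INTEGER, revenue REAL, net_income REAL)"
)
conn.executemany(
    "INSERT INTO income_statement VALUES (?, ?, ?)",
    [(2022, 394.3, 99.8), (2023, 383.3, 97.0)],
)

def get_table_names() -> list[str]:
    """List the tables currently available to the agent."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
    return [r[0] for r in rows]

def get_table_info(table: str) -> list[tuple[str, str]]:
    """Return (column, declared dtype) pairs for a table."""
    # Note: naive string interpolation here is fine for a sketch, but a real
    # tool would validate the table name before querying.
    return [(r[1], r[2]) for r in conn.execute(f"PRAGMA table_info({table})")]

def sql_query(query: str) -> list[tuple]:
    """Execute a SQL query and return the result rows."""
    return conn.execute(query).fetchall()

print(get_table_names())  # ['income_statement']
print(sql_query("SELECT revenue FROM income_statement WHERE fiscal_year = 2023"))  # [(383.3,)]
```

In the actual agent loop these functions are exposed to the model as tool calls, so the model first inspects the schema with `get_table_info` before issuing a `sql_query`.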
## Training

We fine-tune Qwen3-4B-Instruct-2507 using GRPO with an LLM-as-judge reward for correctness. A more detailed description of the training recipe can be found in our [documentation](https://rllm-project.readthedocs.io/en/latest/projects/finqa/).

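The core idea of GRPO is that each question gets a group of sampled rollouts, each rollout is scored by the reward (here, an LLM judge), and advantages are computed relative to the group rather than from a learned value function. A minimal sketch of that group-relative normalization, with made-up judge scores:

```python
# Sketch of GRPO's group-relative advantage: rewards for a group of rollouts
# on one question are normalized by the group mean and std. The reward values
# are invented for illustration (1 = judged correct, 0 = judged incorrect).
from statistics import mean, pstdev

def group_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Judge scores for 4 rollouts of a single FinQA question.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_advantages(rewards)
print(advantages)  # correct rollouts get positive advantage, incorrect negative
```

Rollouts judged correct are pushed up and incorrect ones pushed down, and the advantages in each group sum to roughly zero; the details of the actual recipe (group size, KL handling, judge prompt) are in the linked documentation.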
## Evaluation

| Model | FinQA | FinQA Reasoning |
|-------|-------|-----------------|
| Qwen3-4B-Instruct-2507 (Base) | 27.90% | 13.90% |
| gpt-5-nano-2025-08-07 | 50.00% | 26.60% |
| Qwen3-235B-A22B | 51.37% | 18.90% |
| **rLLM-FinQA-4B (Ours)** | **59.70%** | **26.60%** |
| Gemini-2.5-Pro-Preview | 60.60% | 34.60% |
| GPT-4.1-2025-04-14 | 62.70% | 37.90% |
| o3-mini-2025-01-31 | 63.79% | 30.37% |

## Serving FinQA

Start a vLLM server, then run the agent:

```bash
python -m vllm.entrypoints.openai.api_server \
  --model rLLM/rLLM-FinQA-4B \
  --host 0.0.0.0 \
  --port 30000 \
  --dtype bfloat16

python -m projects.finqa.run_finqa
```

For detailed setup instructions, see the [project README](https://github.com/rllm-org/rllm/tree/main/projects/finqa).

## Acknowledgements

- This work is a collaboration between the [rLLM](https://github.com/rllm-org/rllm) team at UC Berkeley and [Snorkel AI](https://snorkel.ai/).
- Our model is trained on top of [`Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507).
- This work was done as part of the [Berkeley Sky Computing Lab](https://skycomputing.berkeley.edu/).

## Citation

```bibtex
@misc{rllm2026finqa,
  title={FinQA: Training Financial Agents with Reinforcement Learning},
  author={Manan Roongta and Sijun Tan and Bhavishya Pohani and Charles Dickens and Christopher Glaze},
  year={2026},
  howpublished={\url{https://rllm-project.com/post.html?post=finqa.md}},
  note={Blog Post}
}
```