---
license: mit
language:
- en
base_model:
- Qwen/Qwen3-4B-Instruct-2507
tags:
- text-to-sql
- ambiguity
- reinforcement-learning
- grpo
---
# IntentRL-Ambig-Text2SQL-4B
This model is trained to handle **ambiguous text-to-SQL requests** by explicitly reasoning about user intent and producing multiple interpretation–answer pairs rather than silently committing to a single interpretation.
It is based on [Qwen/Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507), fine-tuned with **RL (DAPO/GRPO)** using a custom reward that encourages recall (covering more valid interpretations) for ambiguous questions and precision for unambiguous ones.
## Example
Given a schema and an ambiguous question:
> **Schema:** `CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Min_Years INTEGER, Pref_Years INTEGER, Position TEXT, Salary REAL);`
>
> **Question:** Show the required experience for the best-paid role.
The model produces multiple interpretation–answer pairs:
1. **Minimum years of experience required:** `SELECT Min_Years ...`
2. **Preferred years of experience:** `SELECT Pref_Years ...`
3. **Both minimum and preferred years:** `SELECT Min_Years, Pref_Years ...`
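To see why the three readings matter, the sketch below runs one query per interpretation against the example schema using Python's built-in `sqlite3`. The rows and the full query texts are hypothetical stand-ins written for this illustration. They are not the model's output, and the elided queries in the list above are left as they are.

```python
import sqlite3

# Schema copied from the example above.
SCHEMA = (
    "CREATE TABLE Jobs (JobID INTEGER PRIMARY KEY, Min_Years INTEGER, "
    "Pref_Years INTEGER, Position TEXT, Salary REAL);"
)

# Hypothetical rows, invented for illustration only.
ROWS = [
    (1, 2, 4, "Analyst", 60000.0),
    (2, 5, 8, "Engineer", 120000.0),
    (3, 3, 6, "Designer", 90000.0),
]

# Hypothetical stand-in queries, one per interpretation (assumed, not model output).
CANDIDATES = {
    "minimum": "SELECT Min_Years FROM Jobs ORDER BY Salary DESC LIMIT 1",
    "preferred": "SELECT Pref_Years FROM Jobs ORDER BY Salary DESC LIMIT 1",
    "both": "SELECT Min_Years, Pref_Years FROM Jobs ORDER BY Salary DESC LIMIT 1",
}


def run_candidates(schema, rows, candidates):
    """Execute every candidate query on a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(schema)
        conn.executemany("INSERT INTO Jobs VALUES (?, ?, ?, ?, ?)", rows)
        return {name: conn.execute(sql).fetchall() for name, sql in candidates.items()}
    finally:
        conn.close()


results = run_candidates(SCHEMA, ROWS, CANDIDATES)
for name, result in results.items():
    print(name, result)
```

With these made-up rows, each interpretation returns a different result, so committing silently to one reading would hide the other two from the user.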
## Paper
[Reasoning about Intent for Ambiguous Requests](https://arxiv.org/abs/2511.10453)
**Authors:** Irina Saparina, Mirella Lapata
## Training Details
- **Base model:** Qwen3-4B-Instruct-2507
- **Method:** RL with DAPO/GRPO and a custom recall/precision reward
- **Training data:** [Ambrosia](https://ambrosia-benchmark.github.io/) text-to-SQL benchmark
- **Sampling:** Ambiguous examples are upsampled to balance training
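The recall/precision idea behind the reward can be illustrated with a toy function. This is a minimal sketch under stated assumptions, not the reward used in training: the function name `interpretation_reward`, the set-overlap matching, and the use of hashable result identifiers are all hypothetical. The actual reward is defined in the paper and the training code linked below.

```python
def interpretation_reward(predicted, gold, is_ambiguous):
    """Toy reward: recall for ambiguous questions, precision otherwise.

    `predicted` and `gold` are iterables of hashable identifiers, for example
    canonicalized execution results of each SQL query (an assumption of this
    sketch, not a detail taken from the model card).
    """
    pred_set = set(predicted)
    gold_set = set(gold)
    if not pred_set or not gold_set:
        return 0.0
    matched = len(pred_set & gold_set)
    if is_ambiguous:
        # Recall: fraction of valid interpretations that were covered.
        return matched / len(gold_set)
    # Precision: fraction of produced answers that are valid.
    return matched / len(pred_set)


# Ambiguous question: two of three valid interpretations are covered.
ambiguous_score = interpretation_reward(["a", "b"], ["a", "b", "c"], True)

# Unambiguous question: one correct answer plus one spurious answer.
unambiguous_score = interpretation_reward(["a", "x"], ["a"], False)

print(ambiguous_score, unambiguous_score)
```

Under this sketch, an ambiguous question is rewarded for covering more valid readings, while an unambiguous one is penalized for offering extra answers that do not match.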
## Code
Training and evaluation code: [https://github.com/saparina/intentRL](https://github.com/saparina/intentRL)
## Citation
```bibtex
@misc{saparina2025reasoningintentambiguousrequests,
title={Reasoning about Intent for Ambiguous Requests},
author={Irina Saparina and Mirella Lapata},
year={2025},
eprint={2511.10453},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2511.10453},
}
```