R^3-SQL: Ranking Reward and Resampling for Text-to-SQL
Abstract
R$^3$-SQL addresses inconsistencies in scoring functionally equivalent SQL queries and improves candidate recall through unified reward ranking and agentic resampling techniques.
Modern Text-to-SQL systems generate multiple candidate SQL queries and rank them to select a final prediction. However, existing methods face two limitations. First, they often score functionally equivalent SQL queries inconsistently even though their execution results are identical. Second, ranking cannot recover when the correct SQL is absent from the candidate pool. We propose R^3-SQL, a Text-to-SQL framework that addresses both issues through a unified reward for ranking and resampling. R^3-SQL first groups candidates by execution result and ranks the groups rather than individual queries, so that equivalent queries receive consistent scores. To score each group, it combines a pairwise preference across groups with a pointwise utility derived from the best group rank and size, capturing relative preference, consistency, and candidate quality. To improve candidate recall, R^3-SQL introduces agentic resampling, which judges the generated candidate pool and selectively resamples when the correct SQL is likely absent. R^3-SQL achieves 75.03 execution accuracy on BIRD-dev, a new state of the art among methods using models with disclosed sizes, with consistent gains across five benchmarks.
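The grouping step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the table, the candidate queries, and the helper names `execution_signature` and `group_by_execution` are hypothetical, and the final selection by group size is a plain majority-vote stand-in rather than the paper's unified reward, whose pairwise and pointwise terms are not specified on this page. The sketch also assumes that result rows are compared without regard to order.

```python
import sqlite3
from collections import defaultdict


def execution_signature(conn, sql):
    """Return a hashable summary of a query's result, or None if it fails.

    Assumption: row order is ignored, so rows are sorted before comparison.
    """
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error:
        return None
    return tuple(sorted(rows, key=repr))


def group_by_execution(conn, candidates):
    """Group candidate SQL strings whose execution results are identical."""
    groups = defaultdict(list)
    for sql in candidates:
        sig = execution_signature(conn, sql)
        if sig is not None:  # drop candidates that fail to execute
            groups[sig].append(sql)
    return list(groups.values())


# Hypothetical toy database and candidate pool, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE emp (id INTEGER, name TEXT, salary INTEGER);
    INSERT INTO emp VALUES (1, 'Ann', 90), (2, 'Bob', 70), (3, 'Cy', 90);
    """
)

candidates = [
    "SELECT name FROM emp WHERE salary = (SELECT MAX(salary) FROM emp)",
    "SELECT name FROM emp WHERE salary = 90",
    "SELECT name FROM emp ORDER BY salary DESC LIMIT 1",
    "SELECT nam FROM emp",  # invalid column, fails to execute
]

groups = group_by_execution(conn, candidates)

# Stand-in for the learned group score: prefer the largest group.
largest = max(groups, key=len)
print(len(groups), len(largest))
```

Because the first two candidates return the same rows, they fall into one group and would receive a single shared score, which is the consistency property the abstract refers to. A real system would replace the size-based choice with a learned group score.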
Community
R³-SQL improves Text-to-SQL reranking by grouping execution-equivalent SQL candidates for consistent groupwise ranking and selectively resampling candidate pools when correct queries are missing, achieving state-of-the-art execution accuracy and more robust candidate selection.
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- Every Step Counts: Step-Level Credit Assignment for Tool-Integrated Text-to-SQL (2026)
- TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas (2026)
- DPC: Training-Free Text-to-SQL Candidate Selection via Dual-Paradigm Consistency (2026)
- PV-SQL: Synergizing Database Probing and Rule-based Verification for Text-to-SQL Agents (2026)
- You Only Judge Once: Multi-response Reward Modeling in a Single Forward Pass (2026)
- FINER-SQL: Boosting Small Language Models for Text-to-SQL (2026)
- From Isolated Scoring to Collaborative Ranking: A Comparison-Native Framework for LLM-Based Paper Evaluation (2026)