thanhdath committed 920a8c2 (verified) · 1 parent: c937783

Replace README with minimal version (GitHub link + citation)

  1. README.md +7 -139
README.md CHANGED
@@ -1,145 +1,13 @@
- ---
- base_model:
- - griffith-bigdata/Qwen-2.5-Coder-0.5B-SQL-Writer
- license: apache-2.0
- language:
- - en
- tags:
- - text-to-sql
- - bird
- - grpo
- - finer-sql
- - code
- library_name: transformers
- pipeline_tag: text-generation
- ---
-
- # FINER-SQL-0.5B-BIRD
-
- A small but capable 0.5 B-parameter Text-to-SQL model fine-tuned from
- [`griffith-bigdata/Qwen-2.5-Coder-0.5B-SQL-Writer`](https://huggingface.co/griffith-bigdata/Qwen-2.5-Coder-0.5B-SQL-Writer)
- with GRPO + the FINER-SQL dense rewards (Memory + Atomic).
-
- ✅ **50.85% Execution Accuracy on BIRD Dev** (n=30, value-aware voting). Runs on a 4-8 GB GPU.
-
- 📄 See other models: https://huggingface.co/collections/griffith-bigdata/finer-sql
- 📄 GitHub: https://github.com/thanhdath/finer-sql/tree/main
-
- ---
-
- ## FINER-SQL Model Family — Comparison Across All Sizes
-
- | Model | Params | BIRD Dev (n=30, vav) | Spider Dev (n=30, vav, +agg_hint) |
- |-------|--------|---------------------|----------------------------------|
- | [FINER-SQL-3B-BIRD](https://huggingface.co/griffith-bigdata/FINER-SQL-3B-BIRD) | 3 B | **67.54%** ✅ | 83.8% |
- | [FINER-SQL-3B-Spider](https://huggingface.co/griffith-bigdata/FINER-SQL-3B-Spider) | 3 B | 63.04% | **85.10%** ✅ |
- | **FINER-SQL-0.5B-BIRD** *(this model)* | 0.5 B | **50.85%** ✅ | 68.6% |
- | [FINER-SQL-0.5B-Spider](https://huggingface.co/griffith-bigdata/FINER-SQL-0.5B-Spider) | 0.5 B | TBD | **75.0%** ✅ |
-
- The 0.5 B family demonstrates that GRPO + FINER rewards scale down to deployment-friendly sizes while retaining most of the gain.
-
- ---
-
- ## Inference
-
- ### Quick start (vLLM)
-
- ```python
- from vllm import LLM, SamplingParams
-
- llm = LLM(
- model="griffith-bigdata/FINER-SQL-0.5B-BIRD",
- dtype="bfloat16",
- max_model_len=4096,
- gpu_memory_utilization=0.7,
- )
-
- system_prompt = """You are a meticulous SQL expert. Generate a single, correct SQL query for the user question and the provided database schema.
- Follow this exact response format:
-
- Rules:
- - Output exactly one SQL statement.
- - The SQL must be executable on SQLite.
- - Do not include any explanatory text.
- - Output one SQL statement only. Do not include any extra text, tags, or code fences."""
-
- sampling = SamplingParams(n=30, temperature=1.0, max_tokens=2048)
- messages = [
- {"role": "system", "content": system_prompt},
- {"role": "user", "content": f"Database Schema:\n{schema}\n\nQuestion: {question}\n\nEvidence: {evidence}"},
- ]
- output = llm.chat(messages, sampling)
- candidate_sqls = [c.text.split("</think>")[-1].strip() for c in output[0].outputs]
- # Apply majority voting (vav) — see GitHub repo
- ```
-
- ### Recommended evaluation pipeline
-
- 1. Generate n=30 candidates with temperature=1.0
- 2. Execute each candidate; group results
- 3. Pick from the largest non-empty success group (value-aware voting, "vav")
- 4. Score with the official BIRD evaluator
-
- This pipeline gives **50.85% MV** on BIRD Dev V2 prompts (best 0.5 B result).
-
- ---
-
- ## Detailed BIRD Dev results (V2 prompts, n=30, vav)
-
- | Difficulty | Count | Execution Accuracy |
- |------------|-------|--------------------|
- | Simple | 925 | ~58% |
- | Moderate | 464 | ~42% |
- | Challenging | 145 | ~38% |
- | **All** | **1534** | **50.85%** |
-
- Recall@30: **68.32%** (any-correct rate among 30 candidates).
-
- ---
-
- ## Cross-benchmark: this model on Spider Dev (zero-shot)
-
- | Setup | Spider Official EX |
- |-------|--------------------|
- | Default | 68.6% |
- | FINER-SQL-0.5B-Spider (specialist) | **75.0%** |
-
- For Spider use-cases, the [FINER-SQL-0.5B-Spider](https://huggingface.co/griffith-bigdata/FINER-SQL-0.5B-Spider) checkpoint is preferred (+6.4 pp).
-
- ---
-
- ## Training
-
- | Parameter | Value |
- |-----------|-------|
- | Base model | `griffith-bigdata/Qwen-2.5-Coder-0.5B-SQL-Writer` |
- | Algorithm | GRPO |
- | Train data | BIRD train (V2 prompts, top-30 GRAST) |
- | Total steps | 4000 (this checkpoint = 3000) |
- | Learning rate | 8e-6 |
- | Num generations per prompt | 32 |
- | Gradient accumulation | 32 |
- | Max completion length | 2048 |
- | Max prompt length | 2048 |
- | Temperature (rollout) | 1.0 |
- | Selection during eval | vav (value-aware voting) |
- | Rewards | Execution + Atomic + Memory + Format |
- | Intrinsic Top-K | 20 (ChromaDB) |
-
- ---
-
- ## License
-
- Inherits the base model's license (Apache 2.0).
-
- ---
+ 📄 GitHub: https://github.com/thanhdath/finer-sql
 
  ## Citation
 
  ```bibtex
- @article{finer-sql-2026,
- title = {FINER-SQL: Fine-grained reasoning rewards for small Text-to-SQL models},
- author = {Thanh Dat and others},
- year = {2026},
+ @inproceedings{finersql,
+ author = {Thanh Dat Hoang and Thanh Trung Huynh and Matthias Weidlich and Thanh Tam Nguyen and Tong Chen and Hongzhi Yin and Quoc Viet Hung Nguyen},
+ title = {Boosting Small Language Models for Text-to-SQL with Fine-Grained Execution Feedback and Cost-Efficient Rewards},
+ booktitle = {ICDE},
+ publisher = {IEEE},
+ year = {2026},
  }
  ```