anonymous-2321 committed on
Commit
7f8f89f
·
verified ·
1 Parent(s): b0f26e0

Create README.md

Files changed (1)
  1. README.md +98 -0
README.md ADDED
@@ -0,0 +1,98 @@
+ ---
+ base_model: Qwen/Qwen3-4B
+ library_name: transformers
+ tags:
+ - generated_from_trainer
+ - open-r1
+ - Text2SQL
+ - Reasoning
+ license: apache-2.0
+ language:
+ - en
+ ---
+
+ # Model Information
+
+ This model is the reasoning model for the Text-to-SQL task introduced in [Think2SQL: Blueprinting Reward Density and Advantage Scaling for Effective Text-to-SQL Reasoning]().
+
+ It is a fine-tuned version of [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B), trained on the [BIRD](https://bird-bench.github.io/) dataset using [TRL](https://github.com/huggingface/trl).
+
+ ## Quick start
+
+ The model performs best when given the system and user prompts shown below.
+ It expects three inputs: the question, the evidence, and the database schema.
+
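The README does not say how the schema input was serialized during training. One common option, sketched here purely as an assumption, is to dump the `CREATE TABLE` statements that SQLite keeps in its `sqlite_master` table. The helper name `dump_schema` and the toy tables are hypothetical.

```python
import sqlite3


def dump_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements of a SQLite database as one string."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    # Each row is a 1-tuple holding the original DDL text of a table.
    return "\n\n".join(sql for (sql,) in rows if sql)


# Toy in-memory database, used only to demonstrate the helper.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE song ("
    "id INTEGER PRIMARY KEY, "
    "singer_id INTEGER REFERENCES singer(id), "
    "title TEXT)"
)

schema = dump_schema(conn)
print(schema)
```

The resulting string can be used as the schema input. If the training prompts used a different serialization, matching that format is likely to give better results.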
+ Because the base model is Qwen3, `transformers >= 4.51.0` is required. With that version or later, you can run conversational inference using the Transformers `pipeline` abstraction or the Auto classes with the `generate()` function.
+
+ Update your installation with `pip install --upgrade transformers`.
+
+ ```python
+ import torch
+ import transformers
+
+ model_id = "anonymous-2321/Think2SQL-4B"
+
+ # Load the model in bfloat16 and let Accelerate place it on the available devices.
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ system_message = """
+ You are a data science expert that provides well-reasoned and detailed responses. Your task is to understand the schema and generate a valid SQL query to answer the question.
+ You first think about the reasoning process as an internal monologue and then provide the user with the answer.
+ Respond in the following format:
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
+ """.strip()
+
+ # Replace the {{ question }}, {{ evidence }}, and {{ schema }} placeholders
+ # with real values before running inference.
+ user_message = """
+ Answer the following question with the SQL code. Use the piece of evidence and base your answer on the database schema.
+ Given the question, the evidence and the database schema, return in the <answer> tags only the SQL script that addresses the question.
+
+ Database Engine:
+ SQLite
+
+ Question:
+ {{ question }}
+
+ Evidence:
+ {{ evidence }}
+
+ Database Schema:
+ {{ schema }}
+ """
+
+ messages = [
+     {"role": "system", "content": system_message},
+     {"role": "user", "content": user_message},
+ ]
+
+ outputs = pipeline(
+     messages,
+     max_new_tokens=4096,
+     do_sample=True,  # sampling must be enabled for temperature/top_p/top_k to apply
+     temperature=0.6,
+     top_p=0.95,
+     top_k=20,
+ )
+ # The last message in the returned conversation is the assistant's reply.
+ print(outputs[0]["generated_text"][-1]["content"])
+ ```
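Two steps around the call above are left to the reader: filling the `{{ ... }}` placeholders and pulling the SQL out of the `<answer>` tags. The following is a minimal sketch of both, assuming that plain string replacement is enough for the template and that the reply follows the format requested in the system prompt. `fill_prompt` and `extract_sql` are hypothetical helper names, not part of the model's tooling, and the sample reply is made up for the demonstration.

```python
import re
from typing import Optional


def fill_prompt(template: str, question: str, evidence: str, schema: str) -> str:
    """Substitute the three {{ name }} placeholders in the user prompt."""
    values = {"question": question, "evidence": evidence, "schema": schema}
    for key, value in values.items():
        template = template.replace("{{ " + key + " }}", value)
    return template


_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL | re.IGNORECASE)


def extract_sql(reply: str) -> Optional[str]:
    """Return the text of the last <answer> block, or None if there is none."""
    matches = _ANSWER_RE.findall(reply)
    if not matches:
        return None
    sql = matches[-1].strip()
    return sql or None


# Shortened template and a made-up reply, for illustration only.
template = (
    "Question:\n{{ question }}\n\n"
    "Evidence:\n{{ evidence }}\n\n"
    "Database Schema:\n{{ schema }}"
)
prompt = fill_prompt(
    template,
    question="How many singers are there?",
    evidence="",
    schema="CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT)",
)
reply = (
    "<reasoning>\nCount the rows of singer.\n</reasoning>\n"
    "<answer>\nSELECT COUNT(*) FROM singer;\n</answer>"
)
print(extract_sql(reply))  # SELECT COUNT(*) FROM singer;
```

Returning `None` when no `<answer>` block is present lets the caller distinguish a malformed reply (for example, one cut off by `max_new_tokens`) from an empty query.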
+
+ ## 📖 Overview
+ Think2SQL is a systematic study on injecting reasoning capabilities into Text-to-SQL through Reinforcement Learning with Verifiable Rewards (RLVR). We uncover the critical interplay between reward density, advantage scaling, and model capacity, proposing novel execution-guided dense rewards and optimal scaling strategies. Our 4B-parameter model achieves reasoning capabilities competitive with state-of-the-art models, while providing a comprehensive analysis for optimizing Text-to-SQL reasoning under computational constraints.
+
+ **Key Contributions:**
+ - Execution-guided dense reward function that outperforms binary signals
+ - Analysis of advantage scaling mechanics for models of different sizes
+ - Evaluation of cold start effects and supervised fine-tuning impact
+ - Pareto frontier mapping for training efficiency optimization
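To make the contrast in the first contribution concrete, the sketch below compares a binary execution reward with one possible dense alternative. It is an illustration only: this README does not define the paper's reward, so the Jaccard overlap of result rows used here is a stand-in, and `run`, `binary_reward`, `dense_reward`, and the toy table are hypothetical.

```python
import sqlite3


def run(conn, sql):
    """Execute a query and return its rows as a set, or None if it fails.

    Using a set ignores row order and duplicate rows (a simplification).
    """
    try:
        return set(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def binary_reward(conn, predicted, gold):
    """1.0 only when both queries run and return the same set of rows."""
    pred_rows = run(conn, predicted)
    return 1.0 if pred_rows is not None and pred_rows == run(conn, gold) else 0.0


def dense_reward(conn, predicted, gold):
    """Jaccard overlap of the result rows: partial credit for partial matches."""
    pred_rows, gold_rows = run(conn, predicted), run(conn, gold)
    if pred_rows is None or gold_rows is None:
        return 0.0
    union = pred_rows | gold_rows
    if not union:
        return 1.0  # both results are empty, so they agree
    return len(pred_rows & gold_rows) / len(union)


# Toy data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ann", 25), ("Bob", 30), ("Cy", 35), ("Di", 40)],
)

gold = "SELECT name FROM singer WHERE age > 30"
predicted = "SELECT name FROM singer WHERE age >= 30"  # off by one row

binary = binary_reward(conn, predicted, gold)
dense = dense_reward(conn, predicted, gold)
print(binary, dense)
```

Here the predicted query returns one extra row, so the binary signal gives no credit while the overlap score still rewards the two rows it got right. That difference in signal is what "reward density" refers to in this sketch.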