Zheyuan Zhao committed
Commit 1e0227f · verified · 1 Parent(s): a49e9d7

Update model card: add GitHub link, design docs, and benchmark setup guide

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -31,6 +31,8 @@ model-index:
31
 
32
  A fine-tuned [Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) model for generating **Pipe SQL** through multi-turn tool-calling conversations.
33
 
34
  ## What is Pipe SQL?
35
 
36
  Pipe SQL is a more readable SQL syntax that uses the `|>` (pipe) operator to chain operations in a linear, top-to-bottom flow:
@@ -56,6 +58,17 @@ This is transpiled to standard SQL via [sqlglot](https://github.com/tobymao/sqlg
56
  | **Attention Heads** | 12 (2 KV heads) |
57
  | **Context Length** | 2048 tokens (training) |
58
 
59
  ## Training
60
 
61
  The model was fine-tuned using **QLoRA** on multi-turn tool-calling conversations for text-to-SQL generation.
@@ -160,7 +173,7 @@ Tables in database 'concert_singer':
160
 
161
  ### Inference
162
 
163
- For inference with the correct chat template, see the evaluation server code in the [sqlglot repository](https://github.com/nittygritty-zzy/sqlglot/tree/main/evaluation/server).
164
 
165
  ## Reproducing the Benchmark
166
 
 
31
 
32
  A fine-tuned [Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) model for generating **Pipe SQL** through multi-turn tool-calling conversations.
33
 
34
+ **GitHub**: [nittygritty-zzy/sqlglot](https://github.com/nittygritty-zzy/sqlglot)
35
+
36
  ## What is Pipe SQL?
37
 
38
  Pipe SQL is a more readable SQL syntax that uses the `|>` (pipe) operator to chain operations in a linear, top-to-bottom flow:
 
58
  | **Attention Heads** | 12 (2 KV heads) |
59
  | **Context Length** | 2048 tokens (training) |
60
 
61
+ ## Design Documents
62
+
63
+ The full design and methodology behind this project is documented in the following design docs (also available in [docs/design/](https://github.com/nittygritty-zzy/sqlglot/tree/main/docs/design) on GitHub):
64
+
65
+ | Document | Description |
66
+ |----------|-------------|
67
+ | [Fine-Tuning Design Doc](docs/pipe-sql-fine-tuning-design-doc.md) | End-to-end system design for incremental pipe SQL synthesis and specialized fine-tuning of 1.5B-7B models |
68
+ | [Decompiler Design Doc](docs/pipe-sql-decompiler-design-doc.md) | Standard SQL to pipe SQL decompiler — the deterministic data generation component |
69
+ | [Validation Loop Design Doc](docs/pipe-sql-validation-loop-design-doc.md) | SQLite round-trip validation and feedback loop to ensure semantic correctness |
70
+ | [Training Reproduction Guide](docs/pipe-sql-training-reproduction-guide.md) | Step-by-step guide to reproduce the full training pipeline from scratch |
71
+
72
  ## Training
73
 
74
  The model was fine-tuned using **QLoRA** on multi-turn tool-calling conversations for text-to-SQL generation.
 
173
 
174
  ### Inference
175
 
176
+ For inference with the correct chat template, see the [evaluation server code](https://github.com/nittygritty-zzy/sqlglot/tree/main/evaluation/server) on GitHub.
177
 
178
  ## Reproducing the Benchmark
179