Ellbendls commited on
Commit
fac762c
·
verified ·
1 Parent(s): 2212d7b

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +168 -7
README.md CHANGED
@@ -1,11 +1,172 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
  # Ellbendls/Qwen-3-4b-Text_to_SQL-GGUF
2
 
3
- Derived GGUF exports of `Ellbendls/Qwen-3-4b-Text_to_SQL`
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
4
 
5
- Files:
6
- - Base: `Qwen-3-4b-Text_to_SQL-F16.gguf`
7
- - Quants: Qwen-3-4b-Text_to_SQL-q2_k.gguf, Qwen-3-4b-Text_to_SQL-q3_k_m.gguf, Qwen-3-4b-Text_to_SQL-q4_k_m.gguf, Qwen-3-4b-Text_to_SQL-q5_k_m.gguf, Qwen-3-4b-Text_to_SQL-q8_0.gguf
8
 
9
- Converted and quantized with llama.cpp.
10
- Attribution: base model `Ellbendls/Qwen-3-4b-Text_to_SQL` (see original license).
11
- Generated: 2025-09-17T01:33:07.726161Z
 
1
+
2
+ ---
3
+ library_name: gguf
4
+ license: apache-2.0
5
+ base_model:
6
+ - Ellbendls/Qwen-3-4b-Text_to_SQL
7
+ - Qwen/Qwen3-4B-Instruct-2507
8
+ tags:
9
+ - gguf
10
+ - llama.cpp
11
+ - qwen
12
+ - text-to-sql
13
+ - sql
14
+ - instruct
15
+ language:
16
+ - eng
17
+ - zho
18
+ - fra
19
+ - spa
20
+ - por
21
+ - deu
22
+ - ita
23
+ - rus
24
+ - jpn
25
+ - kor
26
+ - vie
27
+ - tha
28
+ - ara
29
+ pipeline_tag: text-generation
30
+ ---
31
+
32
  # Ellbendls/Qwen-3-4b-Text_to_SQL-GGUF
33
 
34
+ Quantized GGUF builds of `Ellbendls/Qwen-3-4b-Text_to_SQL` for fast CPU/GPU inference with llama.cpp-compatible runtimes.
35
+
36
+ - **Base model**. Fine-tuned from **Qwen/Qwen3-4B-Instruct-2507** for Text-to-SQL.
37
+ - **License**. Apache-2.0 (inherits from base). Keep attribution.
38
+ - **Purpose**. Turn natural language into SQL. When schema is missing, the model can infer a simple schema then produce SQL.
39
+
40
+ ## Files
41
+
42
+ Base and quantized variants:
43
+
44
+ - `Qwen-3-4b-Text_to_SQL-F16.gguf` — reference float16 export
45
+ - `Qwen-3-4b-Text_to_SQL-q2_k.gguf`
46
+ - `Qwen-3-4b-Text_to_SQL-q3_k_m.gguf`
47
+ - `Qwen-3-4b-Text_to_SQL-q4_k_m.gguf` ← good default
48
+ - `Qwen-3-4b-Text_to_SQL-q5_k_m.gguf`
49
+ - `Qwen-3-4b-Text_to_SQL-q8_0.gguf` ← near-lossless, larger
50
+
51
+ Conversion and quantization done with `llama.cpp`.
52
+
53
+ ## Recommended pick
54
+
55
+ - **Q4_K_M**. Best balance of speed and quality for laptops and small servers.
56
+ - **Q5_K_M**. Higher quality, a bit more RAM/VRAM.
57
+ - **Q8_0**. Highest quality among quants. Use if you have headroom.
58
+
59
+ ## Approximate memory needs
60
+
61
+ These are ballpark for a 4B model. Real usage varies by runtime and context length.
62
+
63
+ - Q4_K_M: 3–4 GB RAM/VRAM
64
+ - Q5_K_M: 4–5 GB
65
+ - Q8_0: 6–8 GB
66
+ - F16: 10–12 GB
67
+
68
+ ## Quick start
69
+
70
+ ### llama.cpp (CLI)
71
+
72
+ CPU only:
73
+ ```bash
74
+ ./llama-cli -m Qwen-3-4b-Text_to_SQL-q4_k_m.gguf \
75
+ -p "Generate SQL to get average salary by department in 2024." \
76
+ -n 256 -t 6
77
+ ````
78
+
79
+ NVIDIA GPU offload (build with `-DLLAMA_CUBLAS=ON`):
80
+
81
+ ```bash
82
+ ./llama-cli -m Qwen-3-4b-Text_to_SQL-q4_k_m.gguf \
83
+ -p "Generate SQL to get average salary by department in 2024." \
84
+ -n 256 -ngl 999 -t 6
85
+ ```
86
+
87
+ ### Python (llama-cpp-python)
88
+
89
+ ```python
90
+ from llama_cpp import Llama
91
+
92
+ llm = Llama(model_path="Qwen-3-4b-Text_to_SQL-q4_k_m.gguf", n_ctx=4096, n_gpu_layers=35) # set 0 for CPU-only
93
+ prompt = "Generate SQL to list total orders and revenue by month for 2024."
94
+ out = llm(prompt, max_tokens=256, temperature=0.2, top_p=0.9)
95
+ print(out["choices"][0]["text"].strip())
96
+ ```
97
+
98
+ ### LM Studio / Kobold / text-generation-webui
99
+
100
+ * Select the `.gguf` file and load.
101
+ * Set temperature 0.1–0.3 for deterministic SQL.
102
+ * Use a system prompt to anchor behavior.
103
+
104
+ ## Prompting tips (Text-to-SQL)
105
+
106
+ Use clear instructions. Give schema if you have it. Ask for SQL only.
107
+
108
+ **With schema**
109
+
110
+ ```
111
+ You are a Text-to-SQL generator. Return only SQL.
112
+ Schema:
113
+ tables: employees(emp_id, name, dept_id, salary, hired_at)
114
+ departments(dept_id, name)
115
+
116
+ Task:
117
+ Average salary by department for year 2024. Use ANSI SQL.
118
+ ```
119
+
120
+ **Without schema**
121
+
122
+ ```
123
+ You are a Text-to-SQL generator. Return only SQL.
124
+ If schema missing, assume a minimal reasonable schema.
125
+
126
+ Task:
127
+ Top 5 products by revenue in Q2 2024. Use ANSI SQL.
128
+ ```
129
+
130
+ ## Model details
131
+
132
+ * **Base**. `Qwen/Qwen3-4B-Instruct-2507` (32k context, multilingual).
133
+ * **Fine-tune**. Trained on `gretelai/synthetic_text_to_sql`.
134
+ * **Task**. NL → SQL. Capable of simple schema inference when needed.
135
+ * **Languages**. Works best in English. Can follow prompts in several languages from the base model.
136
+
137
+ ## Conversion reproducibility
138
+
139
+ Export used:
140
+
141
+ ```bash
142
+ python convert_hf_to_gguf.py /path/to/hf_model --outtype f16 --outfile Qwen-3-4b-Text_to_SQL-F16.gguf
143
+ ```
144
+
145
+ Quantization used:
146
+
147
+ ```bash
148
+ ./llama-quantize Qwen-3-4b-Text_to_SQL-F16.gguf Qwen-3-4b-Text_to_SQL-q4_k_m.gguf Q4_K_M
149
+ # likewise for q2_k, q3_k_m, q5_k_m, q8_0
150
+ ```
151
+
152
+ ## Intended use and limits
153
+
154
+ * **Use**. Analytics, reporting, dashboards, data exploration, SQL prototyping.
155
+ * **Limits**. No database connectivity. It only generates SQL text. Validate and test queries before use in production. Provide real schema for best accuracy.
156
+
157
+ ## Attribution
158
+
159
+ * Base model: [`Qwen/Qwen3-4B-Instruct-2507`](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507)
160
+ * Fine-tuned model: [`Ellbendls/Qwen-3-4b-Text_to_SQL`](https://huggingface.co/Ellbendls/Qwen-3-4b-Text_to_SQL)
161
+
162
+ ## License
163
+
164
+ Apache-2.0. Include license and NOTICE from upstream when redistributing the weights. Do not imply endorsement from Qwen or original authors.
165
+
166
+ ## Changelog
167
 
168
+ * 2025-09-17. Initial GGUF release. Added q2\_k, q3\_k\_m, q4\_k\_m, q5\_k\_m, q8\_0, and F16.
 
 
169
 
170
+ ```
171
+ ::contentReference[oaicite:0]{index=0}
172
+ ```