SQL-Gemma3
SQL-Gemma3 is a fine-tuned version of Gemma 3 1B Instruct for text-to-SQL generation. It was trained on a balanced, sampled subset of the Gretel synthetic_text_to_sql dataset to improve SQL generation from a table schema and a natural-language question.
Model Details
- Base model: unsloth/gemma-3-1b-it
- Task: Natural language to SQL
- Training data: balanced sampled subset of gretelai/synthetic_text_to_sql
- Reported training loss: 0.201
- Reported test loss: 0.21
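For a rough intuition about the reported losses, they can be converted to perplexity. The sketch below assumes the values are mean per-token cross-entropy in nats; the card does not state the unit or the reduction, so treat the result as illustrative only. The function name `perplexity` is introduced here for the example.

```python
import math


def perplexity(mean_token_loss: float) -> float:
    """Convert a mean per-token cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_token_loss)


# Reported values from the model card; reading them as nats is an assumption.
train_ppl = perplexity(0.201)
test_ppl = perplexity(0.21)
print(f"train perplexity ~ {train_ppl:.2f}, test perplexity ~ {test_ppl:.2f}")
# prints: train perplexity ~ 1.22, test perplexity ~ 1.23
```

Under that assumption, the small gap between the two values suggests little overfitting on the sampled subset, but it says nothing about whether the generated SQL is correct.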
Intended Use
This model is intended for:
- Generating SQL queries from schema-aware prompts
- Learning and experimentation with text-to-SQL workflows
- Prototyping NL-to-SQL assistants
It is not guaranteed to produce correct, executable, or secure SQL for every prompt. Review generated queries before using them in production systems.
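One way to review a generated query before running it is to compile it against the schema in a throwaway database. The following is a minimal sketch, not a complete safety check: it assumes the schema and query are SQLite-compatible, it only accepts statements that start with SELECT (so valid queries beginning with WITH are rejected too), and the helper name `check_sql` is hypothetical.

```python
import sqlite3


def check_sql(schema: str, query: str) -> tuple[bool, str]:
    """Return (ok, message) for a generated query checked against a schema.

    Uses an in-memory SQLite database, so only SQLite-compatible SQL passes.
    """
    stripped = query.strip().rstrip(";").strip()
    # Allow read-only statements only; anything that is not a SELECT is rejected.
    if not stripped.lower().startswith("select"):
        return False, "only SELECT statements are allowed"
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)           # create the (empty) tables
        conn.execute(f"EXPLAIN {stripped}")  # compile the query without running it
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # Covers unknown tables/columns, syntax errors, and stacked statements.
        return False, str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE employees(id INT, name TEXT, salary INT);"
print(check_sql(schema, "SELECT AVG(salary) FROM employees;"))
print(check_sql(schema, "SELECT AVG(wage) FROM employees;"))
print(check_sql(schema, "DROP TABLE employees;"))
```

A check like this catches references to missing tables or columns and blocks obvious write statements, but it does not confirm that the query answers the question that was asked.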
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "vishnurchityala/sql-gemma3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt is the table schema, a blank line, then the question.
messages = [
    {
        "role": "user",
        "content": (
            "CREATE TABLE employees(id INT, name TEXT, salary INT);\n\n"
            "Find the average salary of all employees."
        ),
    }
]

# Render the chat template to text, then tokenize it into PyTorch tensors.
inputs = tokenizer(
    tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    ),
    return_tensors="pt",
)

# Greedy decoding (do_sample=False) gives deterministic output.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
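The prompt in the usage example follows a simple pattern: the schema, a blank line, then the question. A small helper can keep that pattern consistent across prompts. The sketch below is illustrative; `build_messages` and `completion_ids` are hypothetical names, and the prompt layout is inferred from the single example above rather than from a documented training format.

```python
def build_messages(schema: str, question: str) -> list[dict]:
    """Build a single-turn chat message: schema first, blank line, then the question."""
    content = f"{schema.strip()}\n\n{question.strip()}"
    return [{"role": "user", "content": content}]


def completion_ids(output_ids, prompt_length: int):
    """Drop the prompt tokens that generate() returns ahead of the new tokens."""
    return output_ids[prompt_length:]


messages = build_messages(
    "CREATE TABLE employees(id INT, name TEXT, salary INT);",
    "Find the average salary of all employees.",
)
print(messages[0]["content"])
```

Because `generate` returns the prompt tokens followed by the new tokens for decoder-only models, decoding `outputs[0]` directly prints the prompt as well. With the names from the usage example, `tokenizer.decode(completion_ids(outputs[0], inputs["input_ids"].shape[-1]), skip_special_tokens=True)` would print only the generated text.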
Limitations
- Performance is summarized here using loss only, not execution accuracy
- Output quality depends heavily on schema clarity and prompt format
- The model may generate dialect-specific or invalid SQL in some cases
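Since only loss is reported, execution accuracy is something a reader could measure on their own examples. The sketch below shows one possible approach under stated assumptions: queries are SQLite-compatible, each example supplies its own sample rows, and two queries "match" when they return the same rows regardless of order. All function names are hypothetical, and this is not the evaluation used for the model.

```python
import sqlite3
from collections import Counter


def run_query(schema: str, rows_sql: str, query: str):
    """Run a query on a fresh in-memory SQLite database; return rows, or None on error."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema + "\n" + rows_sql)
        return conn.execute(query).fetchall()
    except (sqlite3.Error, sqlite3.Warning):
        return None
    finally:
        conn.close()


def execution_match(schema: str, rows_sql: str, predicted: str, reference: str) -> bool:
    """True if both queries run and return the same multiset of rows (order ignored)."""
    pred = run_query(schema, rows_sql, predicted)
    ref = run_query(schema, rows_sql, reference)
    if pred is None or ref is None:
        return False
    return Counter(pred) == Counter(ref)


def execution_accuracy(examples) -> float:
    """Fraction of (schema, rows_sql, predicted, reference) examples that match."""
    if not examples:
        return 0.0
    return sum(execution_match(*ex) for ex in examples) / len(examples)


schema = "CREATE TABLE employees(id INT, name TEXT, salary INT);"
rows = "INSERT INTO employees VALUES (1, 'Ann', 100), (2, 'Bob', 300);"
examples = [
    # Different SQL, same result: counts as a match.
    (schema, rows, "SELECT AVG(salary) FROM employees",
     "SELECT SUM(salary) * 1.0 / COUNT(*) FROM employees"),
    # Valid SQL, wrong result: counts as a miss.
    (schema, rows, "SELECT MAX(salary) FROM employees",
     "SELECT AVG(salary) FROM employees"),
]
print(execution_accuracy(examples))  # prints 0.5
```

Ignoring row order is a simplification: for questions that require a specific ordering, a stricter comparison of the row lists would be needed.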
Acknowledgements
- Base model: Gemma 3
- Dataset: Gretel AI synthetic_text_to_sql