---
language:
- en
license: gemma
base_model: unsloth/gemma-3-1b-it
tags:
- text-to-sql
- finetuning
datasets:
- gretelai/synthetic_text_to_sql
pipeline_tag: text-generation
---
# SQL-Gemma3
`SQL-Gemma3` is a fine-tuned version of `Gemma 3 1B Instruct` for text-to-SQL generation. It was trained on a balanced subset sampled from the [Gretel synthetic_text_to_sql dataset](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) to improve SQL generation from a table schema and a natural-language question.
## Model Details
- Base model: `unsloth/gemma-3-1b-it`
- Task: Natural language to SQL
- Training data: balanced subset sampled from `gretelai/synthetic_text_to_sql`
- Reported training loss: `0.201`
- Reported test loss: `0.21`
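For a rough sense of scale, the reported losses can be converted to perplexity. This is a minimal sketch that assumes the losses are mean per-token cross-entropy in nats, which this card does not state; the `perplexity` helper is hypothetical.

```python
import math


def perplexity(mean_cross_entropy_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy_nats)


# Loss values copied from the list above; the unit (nats) is an assumption.
train_ppl = perplexity(0.201)
test_ppl = perplexity(0.21)
print(f"train perplexity ~ {train_ppl:.3f}, test perplexity ~ {test_ppl:.3f}")
```

If the losses were computed differently (for example, averaged per sequence), these numbers do not apply.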
## Intended Use
This model is intended for:
- Generating SQL queries from schema-aware prompts
- Learning and experimentation with text-to-SQL workflows
- Prototyping NL-to-SQL assistants
It is not guaranteed to produce correct, executable, or secure SQL for every prompt. Review generated queries before using them in production systems.
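One way to start that review is an automated dry run before a human looks at the query. The sketch below is an illustration, not part of this model: it assumes the target dialect is close enough to SQLite to be parsed by Python's built-in `sqlite3` module, and `check_generated_sql` is a hypothetical helper. It only plans the query against an empty in-memory database, so it can catch syntax errors and unknown tables or columns, but it says nothing about whether the query answers the question.

```python
import sqlite3


def check_generated_sql(schema: str, query: str) -> tuple[bool, str]:
    """Dry-run `query` against an empty in-memory SQLite database built from `schema`."""
    # Coarse illustrative policy: accept SELECT statements only.
    if not query.lstrip().lower().startswith("select"):
        return False, "only SELECT statements are accepted"
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)  # create the tables described in the prompt
        # EXPLAIN QUERY PLAN parses and plans the statement without running it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        # sqlite3.Warning covers the multi-statement error on older Python versions.
        return False, str(exc)
    finally:
        conn.close()


schema = "CREATE TABLE employees(id INT, name TEXT, salary INT);"
print(check_generated_sql(schema, "SELECT AVG(salary) FROM employees;"))
print(check_generated_sql(schema, "SELECT AVG(wage) FROM employees;"))
```

A check like this is a first filter only. Permissions, query cost, and semantic correctness still need separate review.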
## Usage
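```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "vishnurchityala/sql-gemma3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The prompt is the table schema, a blank line, then the question.
messages = [
    {
        "role": "user",
        "content": (
            "CREATE TABLE employees(id INT, name TEXT, salary INT);\n\n"
            "Find the average salary of all employees."
        ),
    }
]

# Apply the chat template and tokenize in one step. Rendering the template to
# a string and tokenizing it separately can prepend a second BOS token.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)

# Greedy decoding keeps the output deterministic.
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)

# Decode only the newly generated tokens, not the echoed prompt.
prompt_length = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True))
```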
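The example above puts the schema first, then a blank line, then the question. A small helper can keep that layout consistent across prompts. This is a sketch: `build_messages` is a hypothetical name, and the card does not state whether the model was fine-tuned with exactly this layout, so treat the format as an assumption taken from the usage example.

```python
def build_messages(schema: str, question: str) -> list[dict]:
    """Build a single-turn chat message: schema, blank line, then the question."""
    content = f"{schema.strip()}\n\n{question.strip()}"
    return [{"role": "user", "content": content}]


messages = build_messages(
    "CREATE TABLE employees(id INT, name TEXT, salary INT);",
    "Find the average salary of all employees.",
)
print(messages[0]["content"])
```

The returned list can be passed to `tokenizer.apply_chat_template` in place of a hand-written `messages` list.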
## Limitations
- Performance is summarized here using loss only, not execution accuracy
- Output quality depends heavily on schema clarity and prompt format
- The model may generate dialect-specific or invalid SQL in some cases
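Execution accuracy, which this card does not report, could be estimated by running the predicted and reference queries against the same data and comparing the results. The sketch below shows one way to do that under stated assumptions: it uses SQLite through Python's `sqlite3` module, compares result rows as unordered multisets, and counts any query that fails to run as a miss. The `run_query` and `execution_accuracy` helpers and the sample data are hypothetical.

```python
import sqlite3
from collections import Counter


def run_query(setup_sql: str, query: str):
    """Run `query` on a fresh in-memory database; return rows as a multiset, or None on error."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(setup_sql)  # schema plus any sample rows
        return Counter(conn.execute(query).fetchall())
    except (sqlite3.Error, sqlite3.Warning):
        return None
    finally:
        conn.close()


def execution_accuracy(examples) -> float:
    """`examples` is an iterable of (setup_sql, gold_query, predicted_query) tuples."""
    matches = 0
    total = 0
    for setup_sql, gold_query, predicted_query in examples:
        total += 1
        gold_rows = run_query(setup_sql, gold_query)
        predicted_rows = run_query(setup_sql, predicted_query)
        # A prediction counts only if the gold query ran and both results agree.
        if gold_rows is not None and gold_rows == predicted_rows:
            matches += 1
    return matches / total if total else 0.0


setup = (
    "CREATE TABLE employees(id INT, name TEXT, salary INT);"
    "INSERT INTO employees VALUES (1, 'Ana', 100), (2, 'Ben', 300);"
)
examples = [
    # Different SQL text, same result: counts as a match.
    (setup, "SELECT AVG(salary) FROM employees",
     "SELECT SUM(salary) * 1.0 / COUNT(*) FROM employees"),
    # Misspelled column: fails to execute, counts as a miss.
    (setup, "SELECT name FROM employees", "SELECT nme FROM employees"),
]
print(execution_accuracy(examples))
```

Ignoring row order is a simplification. Queries with `ORDER BY` would need an order-sensitive comparison, and results on small sample tables can match by coincidence.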
## Acknowledgements
- Base model: [Gemma 3](https://huggingface.co/google)
- Dataset: [Gretel AI synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)