---
language:
- en
license: gemma
base_model: unsloth/gemma-3-1b-it
tags:
- text-to-sql
- finetuning
datasets:
- gretelai/synthetic_text_to_sql
pipeline_tag: text-generation
---
# SQL-Gemma3
`SQL-Gemma3` is a fine-tuned version of `Gemma 3 1B Instruct` for text-to-SQL generation. It was trained on a balanced subset sampled from the [Gretel synthetic_text_to_sql dataset](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql) to improve SQL generation from a table schema and a natural-language question.
## Model Details
- Base model: `unsloth/gemma-3-1b-it`
- Task: Natural language to SQL
- Training data: balanced subset sampled from `gretelai/synthetic_text_to_sql`
- Reported training loss: `0.201`
- Reported test loss: `0.21`
## Intended Use
This model is intended for:
- Generating SQL queries from schema-aware prompts
- Learning and experimentation with text-to-SQL workflows
- Prototyping NL-to-SQL assistants
It is not guaranteed to produce correct, executable, or secure SQL for every prompt. Review generated queries before using them in production systems.
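One lightweight safeguard is to parse-check a generated query against a scratch database before it touches real data. The sketch below is illustrative only (the `is_valid_sqlite` helper is hypothetical, not part of this model): it uses SQLite's `EXPLAIN QUERY PLAN` on an in-memory copy of the example schema, so the query is compiled but never executed.

```python
import sqlite3

def is_valid_sqlite(query: str) -> bool:
    """Return True if `query` compiles against the example schema (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE employees(id INT, name TEXT, salary INT);")
        # EXPLAIN QUERY PLAN compiles the statement without running it.
        conn.execute(f"EXPLAIN QUERY PLAN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

print(is_valid_sqlite("SELECT AVG(salary) FROM employees;"))  # True
print(is_valid_sqlite("SELEC AVG(salary) FROM employees;"))   # False
```

Note this only catches syntax and schema errors, not semantically wrong but well-formed queries.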
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "vishnurchityala/sql-gemma3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Prompt with the table schema followed by the natural-language question.
messages = [
    {
        "role": "user",
        "content": (
            "CREATE TABLE employees(id INT, name TEXT, salary INT);\n\n"
            "Find the average salary of all employees."
        ),
    }
]

# Tokenize through the chat template directly so special tokens are added once.
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
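Chat-tuned models often wrap the query in a markdown code fence, so downstream code usually needs to pull the SQL out of the raw completion. A minimal sketch, assuming fenced output (the `extract_sql` helper is hypothetical; the fence string is built programmatically so this snippet stays copy-paste safe):

```python
import re

FENCE = "`" * 3  # a literal markdown code fence

def extract_sql(text: str) -> str:
    """Return the first fenced SQL snippet, or the stripped text if no fence is found."""
    match = re.search(rf"{FENCE}(?:sql)?\s*(.*?){FENCE}", text, re.DOTALL)
    return match.group(1).strip() if match else text.strip()

raw = f"{FENCE}sql\nSELECT AVG(salary) FROM employees;\n{FENCE}"
print(extract_sql(raw))  # SELECT AVG(salary) FROM employees;
```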
## Limitations
- Performance is summarized here using loss only, not execution accuracy
- Output quality depends heavily on schema clarity and prompt format
- The model may generate dialect-specific or invalid SQL in some cases
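Since only loss is reported, one way to estimate execution accuracy yourself is to run predicted and reference queries against a scratch database and compare result sets. A rough sketch (the `execution_match` helper and the tiny fixture are illustrative, not part of any released evaluation):

```python
import sqlite3

def execution_match(pred_sql: str, gold_sql: str, setup: list[str]) -> bool:
    """Hypothetical eval helper: True if both queries return the same rows."""
    conn = sqlite3.connect(":memory:")
    for stmt in setup:
        conn.execute(stmt)
    try:
        pred = sorted(conn.execute(pred_sql).fetchall())
        gold = sorted(conn.execute(gold_sql).fetchall())
        return pred == gold
    except sqlite3.Error:
        return False  # a query that fails to execute counts as a miss
    finally:
        conn.close()

setup = [
    "CREATE TABLE employees(id INT, name TEXT, salary INT);",
    "INSERT INTO employees VALUES (1, 'Ada', 100), (2, 'Lin', 200);",
]
print(execution_match(
    "SELECT AVG(salary) FROM employees;",
    "SELECT SUM(salary) / COUNT(*) FROM employees;",
    setup,
))  # True
```

Result-set comparison is a coarse metric (two different queries can coincide on one fixture), but it is a far stronger signal than loss alone.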
## Acknowledgements
- Base model: [Gemma 3](https://huggingface.co/google)
- Dataset: [Gretel AI synthetic_text_to_sql](https://huggingface.co/datasets/gretelai/synthetic_text_to_sql)