---
language: en
tags:
- sql
- code-generation
- reinforcement-learning
- text-generation
datasets:
- spider
---

# Model Card for RL-GRPO-SQL-Model

## Model Details

### Model Description

- **Model type**: Causal language model fine-tuned with reinforcement learning
- **Training approach**: Group Relative Policy Optimization (GRPO)
- **Task**: SQL generation and understanding
- **Developed by**: Ali Assi

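The group-relative part of GRPO can be sketched as follows: for each prompt a group of completions is sampled and scored, and each completion's advantage is its reward normalized against the group's own mean and standard deviation. This is an illustrative sketch, not the model's actual training code; the function name and the 0/1 execution-accuracy reward are assumptions.

```python
# Minimal sketch of the group-relative advantage computation at the
# heart of GRPO: each completion's reward is normalized against the
# statistics of its own sampling group, so no learned value function
# (critic) is needed. Names here are illustrative.

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize a group of scalar rewards to zero mean / unit std."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Example: 4 sampled SQL completions for one prompt, scored 0/1 on
# whether they execute correctly (a common reward choice for text-to-SQL).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get positive advantage and are reinforced; the rest are suppressed.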
## Training Data

- **Data sources**: Spider training set
- **Preprocessing**: SQL parsing and validation
- **Languages**: English

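Spider examples pair a natural-language question with a gold SQL query, and preprocessing turns these into prompt/target pairs. A minimal illustration follows; the prompt template and helper name are hypothetical, though `question` and `query` are the Spider dataset's actual field names.

```python
# Illustrative preprocessing of one Spider-style example into a
# prompt/target pair. This is a sketch, not the card's actual
# preprocessing code.

def to_prompt_and_target(example: dict) -> tuple[str, str]:
    """Wrap the question in a generation prompt; the gold SQL is the target."""
    prompt = f"Generate SQL for: {example['question']}"
    return prompt, example["query"]

# A Spider-style example (fields as in the public dataset)
example = {
    "question": "How many singers do we have?",
    "query": "SELECT count(*) FROM singer",
}
prompt, target = to_prompt_and_target(example)
```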
## How to Use

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ALI-USER/rl-grpo-sql-model")
model = AutoModelForCausalLM.from_pretrained("ALI-USER/rl-grpo-sql-model")

# Example usage
prompt = "Generate SQL for: Find all customers with orders over $100"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Limitations

- Performance may vary with database schema complexity (e.g., many tables or nested joins)

## Ethical Considerations

- May generate SQL queries that are inefficient or unsafe if not properly validated
- Generated queries should be validated before execution

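One way to implement the validation recommended above: parse-check a generated query against the target schema with SQLite's `EXPLAIN` (which compiles the statement without running it) and reject anything that is not a single read-only `SELECT`. This is a hedged sketch, not part of the model; real deployments should also use a proper SQL parser and database-level permissions, and the `SELECT`-prefix check shown here would reject legitimate `WITH` queries.

```python
# Sketch of pre-execution validation for generated SQL: compile-check
# against an in-memory copy of the schema and allow only a single
# read-only SELECT statement. Illustrative only.
import sqlite3

def is_safe_select(sql: str, schema_ddl: str) -> bool:
    stmt = sql.strip().rstrip(";")
    # Reject multi-statement inputs and anything that is not a SELECT
    if ";" in stmt or not stmt.lower().startswith("select"):
        return False
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        # EXPLAIN compiles the statement (syntax + schema check)
        # without executing its effects
        conn.execute("EXPLAIN " + stmt)
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```

For example, `is_safe_select("DROP TABLE customers;", ddl)` is rejected by the `SELECT` check, while a query referencing a nonexistent table fails the `EXPLAIN` compile step.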
## Intended Uses

**Primary use cases:**

- Natural language to SQL translation
- SQL code generation assistance
- Educational use for learning SQL

**Out-of-scope uses:**

- Direct production deployment without query validation
- Non-English queries (the model was not trained on them)