---
base_model: unsloth/llama-3-8B
library_name: peft
pipeline_tag: text-generation
tags:
- text-to-sql
- dpo
- lora
- transformers
- trl
- sql-generation
- database
---

# Text-to-SQL DPO Model

A LLaMA-3-8B model fine-tuned with Direct Preference Optimization (DPO) for text-to-SQL generation. It was trained with LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning method, and is loaded as an adapter on top of the base model.

## Model Details

### Model Description

This model is a LoRA adapter for LLaMA-3-8B, fine-tuned with DPO on preference pairs (a preferred and a rejected response for each prompt) so that it favors accurate SQL queries when given a natural language description.

- **Developed by:** faizack
- **Model type:** Causal Language Model with LoRA adapter
- **Language(s) (NLP):** English
- **License:** Apache 2.0 (inherited from base model)
- **Finetuned from model:** unsloth/llama-3-8B

### Model Sources

- **Repository:** [Text-to-SQL DPO Repository](https://github.com/IDEAS-Incubator/text-to-sql_DPO)
- **Base Model:** [unsloth/llama-3-8B](https://huggingface.co/unsloth/llama-3-8B)

## Uses

### Direct Use

This model is designed for generating SQL queries from natural language descriptions. It can be used for:

- Converting natural language questions to SQL queries
- Database query generation
- Text-to-SQL applications
- Database interaction interfaces

### Example Usage

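```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Use half precision on GPU; fall back to float32 on CPU, where float16 is slow or unsupported
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Load the base model and tokenizer
base_model = "unsloth/llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=dtype)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
model.to(device).eval()

# Generate SQL query
prompt = "Show me all users from the customers table"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
with torch.no_grad():
    # max_new_tokens bounds the generated text only; max_length would also count the prompt
    outputs = model.generate(**inputs, max_new_tokens=100)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
response = tokenizer.decode(new_tokens, skip_special_tokens=True)
print(response)
```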

### Out-of-Scope Use

This model should not be used for:

- General-purpose text generation beyond SQL queries
- Generating malicious or harmful SQL queries
- Database operations without proper validation
- Production use without proper testing and validation

## Bias, Risks, and Limitations

### Limitations

- The model is specialized for SQL generation and may not perform well on other tasks
- Generated SQL queries should be validated before execution
- Performance may vary depending on database schema complexity
- The model may generate queries that are syntactically correct but logically incorrect
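
Because output quality depends on the schema, it can help to put the schema in the prompt. The sketch below reads the table definitions from an SQLite database and prepends them to the question. The `Schema:` / `Question:` / `SQL:` layout and the helper names are hypothetical; the prompt format used during training is not documented in this card, so treat the layout as a starting point to experiment with.

```python
import sqlite3


def schema_ddl(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements of all user tables in the database."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] + ";" for row in rows)


def build_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Combine schema and question (hypothetical layout, not the training format)."""
    return f"Schema:\n{schema_ddl(conn)}\n\nQuestion: {question}\nSQL:"


# Illustrative in-memory database with a made-up table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
prompt = build_prompt(conn, "Show me all users from the customers table")
print(prompt)
```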

### Recommendations

- Always validate generated SQL queries before execution
- Test the model on your specific database schema
- Run generated queries with safeguards such as read-only or least-privilege credentials, query timeouts, and row limits
- Consider the model's limitations when integrating into production systems
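
One way to validate a generated query before running it is sketched below, assuming an SQLite target and a read-only use case. The function name `validate_select` and the example schema are illustrative, not part of this model's tooling. The check compiles the query against an empty in-memory copy of the schema, so it catches syntax errors and unknown tables or columns, but it cannot detect a query that is valid yet answers the wrong question.

```python
import sqlite3


def validate_select(sql: str, schema_ddl: str) -> tuple[bool, str]:
    """Check that `sql` is a single SELECT that compiles against the given schema."""
    stripped = sql.strip().rstrip(";").strip()
    # Conservative allow-list: only plain SELECT statements (this also rejects CTEs)
    if not stripped.lower().startswith("select"):
        return False, "only SELECT statements are allowed"
    # Conservative: any remaining semicolon is treated as a second statement,
    # even if it sits inside a string literal
    if ";" in stripped:
        return False, "multiple statements are not allowed"

    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)  # build empty tables from the schema
        conn.execute(f"EXPLAIN QUERY PLAN {stripped}")  # compile without reading data
    except sqlite3.Error as exc:
        return False, f"SQLite rejected the query: {exc}"
    finally:
        conn.close()
    return True, "ok"


# Illustrative schema and queries
schema = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);"
for candidate in ("SELECT name FROM customers;", "DROP TABLE customers", "SELECT age FROM customers"):
    print(candidate, "->", validate_select(candidate, schema))
```

A prefix check like this is deliberately strict; a production system would more likely rely on database permissions than on string inspection alone.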

## How to Get Started with the Model

### Installation

```bash
pip install transformers peft torch
```

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load model and adapter
base_model = "unsloth/llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Generate SQL
prompt = "Find all orders placed in the last 30 days"
inputs = tokenizer(prompt, return_tensors="pt")
# temperature only takes effect when sampling is enabled (do_sample=True)
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)
# Decode only the generated continuation, skipping the prompt tokens
sql_query = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(sql_query)
```

## Training Details

### Training Data

The model was trained on the `zerolink/zsql-sqlite-dpo` dataset, which contains preference pairs for text-to-SQL tasks.
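
For context, DPO uses each preference pair to push the fine-tuned policy toward the chosen response and away from the rejected one, relative to a frozen reference model. The sketch below computes the standard DPO loss for a single pair from summed sequence log-probabilities. The log-probability values are made up for illustration, and `beta=0.1` is an assumption; this card does not report the beta used in training.

```python
import math


def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float, beta: float = 0.1) -> float:
    """DPO loss for one preference pair, given summed log-probabilities.

    loss = -log(sigmoid(beta * margin)), where the margin is how much more the
    policy prefers the chosen response than the reference model does.
    """
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    x = beta * margin
    # Numerically stable form of log(1 + exp(-x))
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Policy identical to the reference: margin is 0, so the loss equals log(2)
loss_start = dpo_loss(-12.0, -12.5, -12.0, -12.5)

# Policy now rates the chosen SQL higher and the rejected SQL lower (made-up values)
loss_later = dpo_loss(-10.0, -14.0, -12.0, -12.5)

print(f"loss at start: {loss_start:.3f}")
print(f"loss after preference shift: {loss_later:.3f}")
```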

### Training Procedure

#### Training Hyperparameters

- **Training regime:** DPO (Direct Preference Optimization)
- **Epochs:** 6
- **Batch size:** 2
- **Gradient accumulation:** 32
- **Learning rate:** 5e-5
- **LoRA rank:** 16
- **LoRA alpha:** 16
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj
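
Two quantities follow directly from the values listed above: the effective batch size per optimizer step and the LoRA scaling factor. The short calculation below derives both. The device count is an assumption, since this card does not state how many GPUs were used.

```python
# Values copied from the hyperparameter list
per_device_batch_size = 2
gradient_accumulation_steps = 32
lora_rank = 16
lora_alpha = 16

# Assumption: a single training device (not stated in the card)
num_devices = 1

# Gradients from this many preference pairs are accumulated before each weight update
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Standard LoRA scales the low-rank update by alpha / rank
lora_scaling = lora_alpha / lora_rank

print(f"Preference pairs per optimizer step: {effective_batch_size}")
print(f"LoRA scaling factor (alpha / rank): {lora_scaling}")
```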

#### Training Infrastructure

- **Base model:** unsloth/llama-3-8B
- **Framework:** PEFT (Parameter-Efficient Fine-Tuning)
- **Training method:** LoRA (Low-Rank Adaptation)
- **Total steps:** 120
- **Steps per epoch:** 3660

## Technical Specifications

### Model Architecture

- **Base architecture:** LLaMA-3-8B
- **Adapter type:** LoRA
- **Trainable parameters:** ~16M (LoRA adapter only)
- **Total parameters:** ~8B (base model + adapter)

### Compute Infrastructure

- **Hardware:** GPU-based training
- **Framework versions:**
  - PEFT: 0.17.1
  - Transformers: 4.56.2
  - PyTorch: Compatible with CUDA

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{text-to-sql-dpo-2024,
  title={Text-to-SQL DPO Model},
  author={faizack},
  year={2024},
  url={https://huggingface.co/faizack/text-to-sql-dpo}
}
```

## Model Card Contact

For questions or issues related to this model, please contact the model author or open an issue in the repository.
