---
base_model: unsloth/llama-3-8B
library_name: peft
pipeline_tag: text-generation
tags:
- text-to-sql
- dpo
- lora
- transformers
- trl
- sql-generation
- database
---

# Text-to-SQL DPO Model

A LLaMA-3-8B model fine-tuned with Direct Preference Optimization (DPO) for text-to-SQL generation. It was trained with LoRA (Low-Rank Adaptation), a parameter-efficient fine-tuning method, so only a small adapter is stored on top of the base model.

## Model Details

### Model Description

This model is a version of LLaMA-3-8B fine-tuned with DPO specifically for text-to-SQL tasks. It was trained on preference pairs so that it favors accurate SQL queries when given natural language descriptions.

- **Developed by:** faizack
- **Model type:** Causal Language Model with LoRA adapter
- **Language(s) (NLP):** English
- **License:** Apache 2.0 (inherited from base model)
- **Finetuned from model:** unsloth/llama-3-8B

### Model Sources

- **Repository:** [Text-to-SQL DPO Repository](https://github.com/IDEAS-Incubator/text-to-sql_DPO)
- **Base Model:** [unsloth/llama-3-8B](https://huggingface.co/unsloth/llama-3-8B)

## Uses

### Direct Use

This model is designed for generating SQL queries from natural language descriptions. It can be used for:

- Converting natural language questions to SQL queries
- Database query generation
- Text-to-SQL applications
- Database interaction interfaces

### Example Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model = "unsloth/llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype=torch.float16)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")

# Generate a SQL query; keep the inputs on the same device as the model
prompt = "Show me all users from the customers table"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
# max_new_tokens limits only the generated part, not prompt + output
outputs = model.generate(**inputs, max_new_tokens=100)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

### Out-of-Scope Use

This model should not be used for:

- General-purpose text generation beyond SQL queries
- Generating malicious or harmful SQL queries
- Database operations without proper validation
- Production use without proper testing and validation

## Bias, Risks, and Limitations

### Limitations

- The model is specialized for SQL generation and may not perform well on other tasks
- Generated SQL queries should be validated before execution
- Performance may vary depending on database schema complexity
- The model may generate queries that are syntactically correct but logically incorrect

### Recommendations

- Always validate generated SQL queries before execution
- Test the model on your specific database schema
- Use appropriate safety measures when executing generated queries
- Consider the model's limitations when integrating into production systems

## How to Get Started with the Model

### Installation

```bash
pip install transformers peft torch
```

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter
base_model = "unsloth/llama-3-8B"
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, "faizack/text-to-sql-dpo")
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Generate SQL; do_sample=True is required for temperature to take effect
prompt = "Find all orders placed in the last 30 days"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.1)
sql_query = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(sql_query)
```

## Training Details

### Training Data

The model was trained on the `zerolink/zsql-sqlite-dpo` dataset, which contains preference pairs for text-to-SQL tasks.

### Training Procedure

#### Training Hyperparameters

- **Training regime:** DPO (Direct Preference Optimization)
- **Epochs:** 6
- **Batch size:** 2
- **Gradient accumulation:** 32
- **Learning rate:** 5e-5
- **LoRA rank:** 16
- **LoRA alpha:** 16
- **LoRA dropout:** 0.05
- **Target modules:** q_proj, v_proj

#### Training Infrastructure

- **Base model:** unsloth/llama-3-8B
- **Framework:** PEFT (Parameter-Efficient Fine-Tuning)
- **Training method:** LoRA (Low-Rank Adaptation)
- **Total steps:** 120
- **Steps per epoch:** 3660

## Technical Specifications

### Model Architecture

- **Base architecture:** LLaMA-3-8B
- **Adapter type:** LoRA
- **Trainable parameters:** ~16M (LoRA adapter only)
- **Total parameters:** ~8B (base model + adapter)

### Compute Infrastructure

- **Hardware:** GPU-based training
- **Framework versions:**
  - PEFT: 0.17.1
  - Transformers: 4.56.2
  - PyTorch: Compatible with CUDA

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{text-to-sql-dpo-2024,
  title={Text-to-SQL DPO Model},
  author={faizack},
  year={2024},
  url={https://huggingface.co/faizack/text-to-sql-dpo}
}
```

## Model Card Contact

For questions or issues related to this model, please contact the model author or open an issue in the repository.

## Framework versions

- PEFT 0.17.1
- Transformers 4.56.2