# LoRA Adapters for `sqlchat` Model

This repository contains the **LoRA (Low-Rank Adaptation) adapters** for the `nnul/sqlchat` model. These adapters represent the fine-tuned "knowledge layer" that specializes the base model for Text-to-SQL tasks.

Using these adapters provides maximum flexibility. You can load them on top of the original base model to replicate the `sqlchat` model, or use them as a starting point for further fine-tuning. This approach is highly efficient for experimentation and allows for easy conversion to various quantized formats (like GGUF) with minimal quality loss.
## Model Details

* **Base Model:** `Qwen/Qwen3-1.7B`
* **Fine-Tuning Library:** [Unsloth](https://github.com/unslothai/unsloth)
* **Technique:** LoRA (Low-Rank Adaptation)
* **Rank (`r`):** 32
* **Alpha (`lora_alpha`):** 32
* **Training Dataset:** `nnul/sql-chat-dataset` (a combination of `b-mc2/sql-create-context` and `gretelai/synthetic_text_to_sql`)
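For reference, attaching adapters with these hyperparameters to the base model looks roughly like this in Unsloth. This is a sketch of how the adapters were presumably configured, not the exact training script; in particular, the `target_modules` list is an assumption (the attention/MLP projections typically targeted on Qwen-family models):

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model first.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen3-1.7B",
    max_seq_length=4096,
    load_in_4bit=True,
)

# Attach fresh LoRA adapters using the hyperparameters listed above.
# target_modules is an assumption, not read from the adapter config.
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```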
## How to Use These Adapters

To use these LoRA adapters, you must load them on top of the original base model using the Unsloth library. This ensures all performance optimizations are correctly applied.

### Prerequisites

First, install the necessary libraries:
```bash
pip install unsloth
pip install "torch>=2.3.1"
```
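Loading in 4-bit, as the examples below do, assumes a CUDA-capable GPU. A quick sanity check before proceeding:

```python
import torch

# Should print True; the load_in_4bit path below requires a CUDA GPU.
print(torch.cuda.is_available())
```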
### Running Inference with LoRA Adapters

Here is a Python script demonstrating how to load the base model and apply these LoRA adapters for inference.
```python
import torch
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Pass the LoRA adapter repository. Unsloth reads the base model name from the
# adapter config, loads the 4-bit base model, then applies these adapters on top.
print("Loading base model and applying sqlchat-lora adapters...")
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nnul/sqlchat-lora",  # this LoRA adapter repository
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)
print("Model and adapters loaded successfully.")

# Optimize the model for the fastest possible inference.
FastLanguageModel.for_inference(model)

def generate_sql(instruction: str, context: str = ""):
    """Generate SQL from a natural language prompt and an optional schema context."""
    prompt = tokenizer.apply_chat_template(
        [
            {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
            {"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,  # ensures direct SQL output instead of a reasoning trace
    )
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True)

    print(f"User Instruction: {instruction}")
    print("\nModel Output:")
    print("---------------------------------")
    _ = model.generate(
        **inputs,
        streamer=text_streamer,
        max_new_tokens=256,
        do_sample=False,  # greedy decoding for deterministic output
        use_cache=True,
    )
    print("---------------------------------\n")

# --- Example Usage ---
generate_sql(
    instruction="Which department has the most number of employees?",
    context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)",
)
```
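Because the adapters stay trainable after loading, you can also use them as a starting point for further fine-tuning, as mentioned in the introduction. Below is a minimal sketch with TRL's `SFTTrainer`; it assumes the dataset exposes a `train` split with a ready-to-train `text` column (an assumption about `nnul/sql-chat-dataset`; adjust to its actual columns), and it uses the older TRL argument signature commonly paired with Unsloth (newer TRL versions move these arguments into `SFTConfig`):

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Switch the model back into training mode (undoes for_inference above).
FastLanguageModel.for_training(model)

# Assumption: a "train" split with a pre-formatted "text" column.
dataset = load_dataset("nnul/sql-chat-dataset", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=4096,
    args=TrainingArguments(
        output_dir="sqlchat-lora-continued",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        logging_steps=10,
    ),
)
trainer.train()
```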
## Merging the Adapters

If you wish to create a standalone, merged model from these adapters (as was done for `nnul/sqlchat`), you can do so easily.
```python
# Load the model and adapters with the same arguments as in the inference example
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nnul/sqlchat-lora",
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)

# Merge the adapters into the base weights and save locally
model.save_pretrained_merged("sqlchat_merged_4bit", tokenizer, save_method="merged_4bit_forced")

# Or, push the merged model directly to a new Hub repository
# model.push_to_hub_merged("your-username/your-new-merged-repo", tokenizer, save_method="merged_4bit_forced")
```
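As noted in the introduction, the adapters also make GGUF export straightforward. A sketch using Unsloth's GGUF helpers; `q4_k_m` is one common llama.cpp quantization preset and is an arbitrary choice here, so pick whichever suits your deployment:

```python
# Merge the adapters and export to GGUF for llama.cpp-compatible runtimes.
model.save_pretrained_gguf("sqlchat_gguf", tokenizer, quantization_method="q4_k_m")

# Or push the GGUF file straight to a Hub repository:
# model.push_to_hub_gguf("your-username/your-gguf-repo", tokenizer, quantization_method="q4_k_m")
```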