nnul/sqlchat-lora

nnul committed on Jul 13, 2025

Commit b976900 (verified) · 1 parent: a8bcc87

Update README.md

README.md +102 -22

@@ -1,22 +1,102 @@
----
-base_model: unsloth/Qwen3-1.7B-unsloth-bnb-4bit
-tags:
-- text-generation-inference
-- transformers
-- unsloth
-- qwen3
-- trl
-license: apache-2.0
-language:
-- en
----
-# Uploaded  model
-- **Developed by:** nnul
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/Qwen3-1.7B-unsloth-bnb-4bit
-This qwen3 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

+# LoRA Adapters for `sqlchat` Model
+This repository contains the **LoRA (Low-Rank Adaptation) adapters** for the `nnul/sqlchat` model. These adapters represent the fine-tuned "knowledge layer" that specializes the base model for Text-to-SQL tasks.
+The adapters can be used in two ways: load them on top of the original base model to reproduce the `sqlchat` model, or use them as a starting point for further fine-tuning. Keeping the adapters separate from the base weights makes experimentation cheap and lets you merge and convert to quantized formats (such as GGUF) later, typically with little quality loss.
+## Model Details
+*   **Base Model:** `Qwen/Qwen3-1.7B`
+*   **Fine-Tuning Library:** [Unsloth](https://github.com/unslothai/unsloth)
+*   **Technique:** LoRA (Low-Rank Adaptation)
+    *   **Rank (`r`):** 32
+    *   **Alpha (`lora_alpha`):** 32
+*   **Training Dataset:** `nnul/sql-chat-dataset` (a combination of `b-mc2/sql-create-context` and `gretelai/synthetic_text_to_sql`).
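As a rough illustration of what the rank and alpha values above imply, the sketch below computes the LoRA scaling factor and the number of trainable parameters added to a single weight matrix. It assumes standard LoRA scaling (`lora_alpha / r`, not the rank-stabilized variant) and a hypothetical 2048×2048 projection; the actual target modules and their shapes are not stated in this README.

```python
def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update B @ A."""
    return lora_alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted matrix: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


# Values from the Model Details list above.
R, LORA_ALPHA = 32, 32

# Hypothetical layer shape, chosen only for illustration.
D_IN, D_OUT = 2048, 2048

scaling = lora_scaling(R, LORA_ALPHA)
adapter_params = lora_param_count(D_IN, D_OUT, R)
full_params = D_IN * D_OUT

print(f"scaling factor: {scaling}")
print(f"adapter parameters: {adapter_params}")
print(f"fraction of full matrix: {adapter_params / full_params:.4f}")
```

With `r` equal to `lora_alpha`, the low-rank update is applied unscaled under this convention.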
+## How to Use These Adapters
+To use these LoRA adapters, you must load them on top of the original base model using the Unsloth library. This ensures all performance optimizations are correctly applied.
+### Prerequisites
+First, install the necessary libraries.
+```bash
+pip install unsloth
+pip install "torch>=2.3.1"
+```
+### Running Inference with LoRA Adapters
+Here is a Python script demonstrating how to load the base model and apply these LoRA adapters for inference.
+```python
+import torch
+from unsloth import FastLanguageModel
+from transformers import TextStreamer
+# Pass the adapter repository as model_name. Unsloth reads the base model name
+# from the adapter config, loads that base model in 4-bit, then attaches the adapters.
+print("Loading base model and applying sqlchat-lora adapters...")
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name="nnul/sqlchat-lora", # YOUR LoRA adapter repository
+    max_seq_length=4096,
+    dtype=None,
+    load_in_4bit=True,
+)
+print("Model and adapters loaded successfully.")
+# Switch the model to Unsloth's optimized inference mode.
+FastLanguageModel.for_inference(model)
+def generate_sql(instruction: str, context: str = ""):
+    """
+    A helper function to generate SQL from a natural language prompt.
+    """
+    prompt = tokenizer.apply_chat_template(
+        [
+            {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
+            {"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"},
+        ],
+        tokenize=False,
+        add_generation_prompt=True,
+        enable_thinking=False, # Ensures direct SQL output
+    )
+    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
+    text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True)
+    print(f"User Instruction: {instruction}")
+    print("\nModel Output:")
+    print("---------------------------------")
+    _ = model.generate(
+        **inputs,
+        streamer=text_streamer,
+        max_new_tokens=256,
+        do_sample=False, # Use greedy decoding for deterministic output
+        use_cache=True,
+    )
+    print("---------------------------------\n")
+# --- Example Usage ---
+generate_sql(
+    instruction="Which department has the most number of employees?",
+    context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)"
+)
+```
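The prompt layout used by `generate_sql` can be inspected without loading the model. The sketch below rebuilds the same message list in a hypothetical helper, `build_messages` (not part of this repository), so the instruction/context layout can be checked or unit-tested before it is passed to `tokenizer.apply_chat_template`.

```python
SYSTEM_PROMPT = (
    "You are a helpful assistant that generates SQL queries based on "
    "natural language questions and database schemas."
)


def build_messages(instruction: str, context: str = "") -> list[dict[str, str]]:
    """Return chat messages in the same layout as the generate_sql example."""
    user_content = f"### Instruction:\n{instruction}\n\n### Context:\n{context}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


messages = build_messages(
    instruction="Which department has the most number of employees?",
    context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)",
)

# Print each message so the exact text sent to the chat template is visible.
for message in messages:
    print(f"[{message['role']}]\n{message['content']}\n")
```

Keeping the layout identical to the one used during fine-tuning is likely to matter for output quality, so a helper like this is one way to avoid drift between scripts.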
+## Merging the Adapters
+To create a standalone, merged model from these adapters (as was done for `nnul/sqlchat`), load them as in the inference example and call Unsloth's merged-save methods.
+```python
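+from unsloth import FastLanguageModel
+# Load the base model and adapters with the same arguments as in the inference example
+model, tokenizer = FastLanguageModel.from_pretrained(
+    model_name="nnul/sqlchat-lora",
+    max_seq_length=4096,
+    dtype=None,
+    load_in_4bit=True,
+)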
+# Merge and save locally
+model.save_pretrained_merged("sqlchat_merged_4bit", tokenizer, save_method="merged_4bit_forced")
+# Or, push the merged model directly to a new Hub repository
+# model.push_to_hub_merged("your-username/your-new-merged-repo", tokenizer, save_method="merged_4bit_forced")
+```