File size: 5,202 Bytes
408699b
b6a6964
408699b
b6a6964
408699b
b6a6964
408699b
b6a6964
408699b
 
 
 
b6a6964
408699b
b6a6964
408699b
b6a6964
408699b
b6a6964
 
 
408699b
2d23776
b6a6964
408699b
b6a6964
408699b
b6a6964
 
 
408699b
b6a6964
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
408699b
 
 
 
 
b6a6964
 
 
 
 
 
 
 
 
 
 
 
 
408699b
 
 
b6a6964
408699b
b6a6964
408699b
b6a6964
408699b
b6a6964
 
 
408699b
b6a6964
 
 
 
 
 
 
 
 
408699b
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
# `nnul/sqlchat`: A Conversational AI for SQL Generation

This repository contains `sqlchat`, a powerful and efficient language model designed specifically for **Text-to-SQL** tasks. It can understand natural language questions and database schemas to generate accurate SQL queries, including complex statements for creating and managing tables (Data Definition Language).

This model is provided as a standalone 4-bit quantized model, optimized for easy deployment and high-performance, low-resource inference. It was built using the [Unsloth](https://github.com/unslothai/unsloth) library to ensure maximum speed and memory efficiency.

## Model Capabilities

*   **Natural Language to SQL:** Translates complex English questions into executable SQL queries.
*   **Schema-Aware:** Understands `CREATE TABLE` contexts provided in the prompt to generate relevant queries.
*   **DDL Generation:** Capable of generating `CREATE TABLE` statements, including constraints like `PRIMARY KEY` and `FOREIGN KEY` relationships.
*   **Complex Query Logic:** Successfully handles `JOIN`s, aggregations (`COUNT`, `MAX`), and sorting (`ORDER BY ... LIMIT`).

## How to Use

The easiest way to use `sqlchat` is with the Unsloth library, which will ensure you get the best performance.

### Prerequisites

First, install the necessary libraries.

```bash
pip install unsloth
pip install "torch>=2.3.1"
```

### Running Inference

Here is a simple, reusable Python script to run inference with the model.

```python
import torch
from unsloth import FastLanguageModel
from transformers import TextStreamer

# Load the sqlchat model from the Hugging Face Hub
# This is a standalone 4-bit model, so we load it as such.
print("Loading sqlchat model...")
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nnul/sqlchat",
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)
print("Model loaded successfully.")

# This call optimizes the model for the fastest possible inference.
FastLanguageModel.for_inference(model)

def generate_sql(instruction: str, context: str = ""):
    """
    A helper function to generate SQL from a natural language prompt.
    """
    prompt = tokenizer.apply_chat_template(
        [
            {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
            {"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False, # Ensures direct SQL output
    )

    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True)

    print(f"User Instruction: {instruction}")
    print("\nModel Output:")
    print("---------------------------------")
    _ = model.generate(
        **inputs,
        streamer=text_streamer,
        max_new_tokens=256,
        do_sample=False, # Use greedy decoding for deterministic output
        use_cache=True,
    )
    print("---------------------------------\n")

# --- Example 1: Querying Data ---
generate_sql(
    instruction="Which department has the most number of employees?",
    context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)"
)

# --- Example 2: Creating a Table (DDL) ---
generate_sql(
    instruction="We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.",
    context="""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name VARCHAR(255));
CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_title VARCHAR(255));
"""
)
```

### Expected Output

```
User Instruction: Which department has the most number of employees?

Model Output:
---------------------------------
SELECT name FROM department ORDER BY num_employees DESC LIMIT 1;
---------------------------------

User Instruction: We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.

Model Output:
---------------------------------
CREATE TABLE student_enrollment (student_id INTEGER, course_id INTEGER, PRIMARY KEY (student_id, course_id), FOREIGN KEY (student_id) REFERENCES students(student_id), FOREIGN KEY (course_id) REFERENCES courses(course_id));
---------------------------------
```

## Performance

The model was benchmarked on an NVIDIA A40 GPU. In a batch-processing scenario, it achieves a throughput of **~55-70 tokens/second**. Single-prompt latency is well within real-time requirements for interactive applications.

*   **Peak VRAM Usage (Inference):** ~6.1 GB

## Prompt Template

To get the best results, your prompts should follow this structure:

```
<|im_start|>system
You are a helpful assistant that generates SQL queries based on natural language questions and database schemas.<|im_end|>
<|im_start|>user
### Instruction:
{Your natural language question}

### Context:
{The CREATE TABLE statements for the relevant tables}<|im_end|>
<|im_start|>assistant
```