sqlchat / README.md

Update README.md

2d23776 verified 6 months ago

5.2 kB

	# `nnul/sqlchat`: A Conversational AI for SQL Generation

	This repository contains `sqlchat`, a powerful and efficient language model designed specifically for Text-to-SQL tasks. It can understand natural language questions and database schemas to generate accurate SQL queries, including complex statements for creating and managing tables (Data Definition Language).

	This model is provided as a standalone 4-bit quantized model, optimized for easy deployment and high-performance, low-resource inference. It was built using the [Unsloth](https://github.com/unslothai/unsloth) library to ensure maximum speed and memory efficiency.

	## Model Capabilities

	* Natural Language to SQL: Translates complex English questions into executable SQL queries.
	* Schema-Aware: Understands `CREATE TABLE` contexts provided in the prompt to generate relevant queries.
	* DDL Generation: Capable of generating `CREATE TABLE` statements, including constraints like `PRIMARY KEY` and `FOREIGN KEY` relationships.
	* Complex Query Logic: Successfully handles `JOIN`s, aggregations (`COUNT`, `MAX`), and sorting (`ORDER BY ... LIMIT`).

	## How to Use

	The easiest way to use `sqlchat` is with the Unsloth library, which will ensure you get the best performance.

	### Prerequisites

	First, install the necessary libraries.

	```bash
	pip install unsloth
	pip install "torch>=2.3.1"
	```

	### Running Inference

	Here is a simple, reusable Python script to run inference with the model.

	```python
	import torch
	from unsloth import FastLanguageModel
	from transformers import TextStreamer

	# Load the sqlchat model from the Hugging Face Hub
	# This is a standalone 4-bit model, so we load it as such.
	print("Loading sqlchat model...")
	model, tokenizer = FastLanguageModel.from_pretrained(
	model_name="nnul/sqlchat",
	max_seq_length=4096,
	dtype=None,
	load_in_4bit=True,
	)
	print("Model loaded successfully.")

	# This call optimizes the model for the fastest possible inference.
	FastLanguageModel.for_inference(model)

	def generate_sql(instruction: str, context: str = ""):
	"""
	A helper function to generate SQL from a natural language prompt.
	"""
	prompt = tokenizer.apply_chat_template(
	[
	{"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
	{"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"},
	],
	tokenize=False,
	add_generation_prompt=True,
	enable_thinking=False, # Ensures direct SQL output
	)

	inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
	text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True)

	print(f"User Instruction: {instruction}")
	print("\nModel Output:")
	print("---------------------------------")
	_ = model.generate(
	**inputs,
	streamer=text_streamer,
	max_new_tokens=256,
	do_sample=False, # Use greedy decoding for deterministic output
	use_cache=True,
	)
	print("---------------------------------\n")

	# --- Example 1: Querying Data ---
	generate_sql(
	instruction="Which department has the most number of employees?",
	context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)"
	)

	# --- Example 2: Creating a Table (DDL) ---
	generate_sql(
	instruction="We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.",
	context="""
	CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name VARCHAR(255));
	CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_title VARCHAR(255));
	"""
	)
	```

	### Expected Output

	```
	User Instruction: Which department has the most number of employees?

	Model Output:
	---------------------------------
	SELECT name FROM department ORDER BY num_employees DESC LIMIT 1;
	---------------------------------

	User Instruction: We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.

	Model Output:
	---------------------------------
	CREATE TABLE student_enrollment (student_id INTEGER, course_id INTEGER, PRIMARY KEY (student_id, course_id), FOREIGN KEY (student_id) REFERENCES students(student_id), FOREIGN KEY (course_id) REFERENCES courses(course_id));
	---------------------------------
	```

	## Performance

	The model was benchmarked on an NVIDIA A40 GPU. In a batch-processing scenario, it achieves a throughput of ~55-70 tokens/second. Single-prompt latency is well within real-time requirements for interactive applications.

	* Peak VRAM Usage (Inference): ~6.1 GB

	## Prompt Template

	To get the best results, your prompts should follow this structure:

	```
	<\|im_start\|>system
	You are a helpful assistant that generates SQL queries based on natural language questions and database schemas.<\|im_end\|>
	<\|im_start\|>user
	### Instruction:
	{Your natural language question}

	### Context:
	{The CREATE TABLE statements for the relevant tables}<\|im_end\|>
	<\|im_start\|>assistant
	```