
Libraries

PEFT

How to use saadkhi/SQL_Chat_finetuned_model with PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("unsloth/Phi-3-mini-4k-instruct-bnb-4bit")
model = PeftModel.from_pretrained(base_model, "saadkhi/SQL_Chat_finetuned_model")

Transformers

How to use saadkhi/SQL_Chat_finetuned_model with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="saadkhi/SQL_Chat_finetuned_model")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

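# Load the model directly
from transformers import AutoModelForCausalLM

# AutoModelForCausalLM keeps the language-modeling head that generate() needs;
# plain AutoModel would load only the base transformer.
model = AutoModelForCausalLM.from_pretrained("saadkhi/SQL_Chat_finetuned_model", dtype="auto")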
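
When the pipeline is called with a list of chat messages, recent Transformers releases return the whole conversation under `generated_text`, with the model's reply as the last message. A minimal sketch for pulling that reply out is shown here. The helper name `last_reply` is hypothetical, and the sample result is hand-written for illustration, not real model output.

```python
def last_reply(result):
    """Return the reply text from a text-generation pipeline result."""
    generated = result[0]["generated_text"]
    # Plain-string prompts come back as a str; chat prompts as a list of message dicts.
    if isinstance(generated, str):
        return generated
    return generated[-1]["content"]


# Hand-written stand-in for `pipe(messages)`; not real model output.
sample = [{"generated_text": [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "(model reply)"},
]}]
print(last_reply(sample))
```

With a loaded pipeline, the call would be `last_reply(pipe(messages))`.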


vLLM

How to use saadkhi/SQL_Chat_finetuned_model with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "saadkhi/SQL_Chat_finetuned_model"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "saadkhi/SQL_Chat_finetuned_model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
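
The same request can be made from Python. The following is a minimal sketch using only the standard library, assuming the server from the commands above is listening on `localhost:8000` and returns the usual OpenAI-style `choices[0].message.content` field. The helper names `build_chat_request` and `ask` are hypothetical, and the example question is illustrative.

```python
import json
import urllib.request


def build_chat_request(question, base_url="http://localhost:8000",
                       model="saadkhi/SQL_Chat_finetuned_model"):
    """Build (but do not send) a POST request for /v1/chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, **kwargs):
    """Send the request and return the reply text; needs a running server."""
    with urllib.request.urlopen(build_chat_request(question, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Building the request works offline; calling ask() requires the server.
req = build_chat_request("List the ten most recent orders")
print(req.full_url)
```

Passing a different `base_url` (for example one ending in port 30000) should let the same sketch talk to any other OpenAI-compatible server.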


SGLang

How to use saadkhi/SQL_Chat_finetuned_model with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "saadkhi/SQL_Chat_finetuned_model" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "saadkhi/SQL_Chat_finetuned_model",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "saadkhi/SQL_Chat_finetuned_model" \
        --host 0.0.0.0 \
        --port 30000
# Then call the server with the same curl request as in the pip-based SGLang setup above.

Docker Model Runner
How to use saadkhi/SQL_Chat_finetuned_model with Docker Model Runner:
```
docker model run hf.co/saadkhi/SQL_Chat_finetuned_model
```

🧠 SQL Chat – Phi-3-mini SQL Assistant

Model ID: saadkhi/SQL_Chat_finetuned_model
Base model: unsloth/Phi-3-mini-4k-instruct-bnb-4bit
Model type: LoRA (merged)
Task: Natural Language → SQL query generation + conversational SQL assistance
Language: English
License: Apache 2.0

This model is a fine-tuned version of Phi-3-mini-4k-instruct (4-bit quantized), specialized in understanding natural-language questions about databases and turning them into clean SQL queries.

✨ Key Features

Balances model size, inference speed, and SQL generation quality
Works well with common database dialects (PostgreSQL, MySQL, SQLite, SQL Server, etc.)
Can explain queries, suggest improvements and handle follow-up questions
Fast inference even on consumer hardware (especially with 4-bit quantization)
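
Follow-up questions only work if earlier turns are sent back to the model along with the new one. A minimal sketch of keeping that history in the `messages` format used elsewhere on this card follows. The helper name `add_turn` is hypothetical, and the assistant text is a hand-written placeholder, not real model output.

```python
def add_turn(history, user_text, assistant_text=None):
    """Return a new message list with one more user turn (and its reply, if known)."""
    updated = history + [{"role": "user", "content": user_text}]
    if assistant_text is not None:
        updated.append({"role": "assistant", "content": assistant_text})
    return updated


history = []
# First exchange: the question plus a placeholder reply.
history = add_turn(history, "Show all customers from Germany",
                   "SELECT * FROM customers WHERE country = 'Germany';")
# Follow-up question that depends on the previous turn.
history = add_turn(history, "Now only those who signed up in 2024")
print([m["role"] for m in history])
```

The resulting list can be passed as `messages` to the chat template or to a chat-completions endpoint, so the model sees the earlier query when answering the follow-up.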

🎯 Intended Use & Capabilities

Best for:

Converting natural language questions → SQL queries
Helping beginners learn SQL through explanations
Quick prototyping of SQL queries in development
Building SQL chat interfaces / tools / assistants
Educational purposes

Limitations / Not recommended for:

Extremely complex analytical/business intelligence queries
Real-time query optimization advice
Very database-specific or proprietary SQL extensions
Production systems without human review (always validate generated SQL!)
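
One inexpensive way to act on the "always validate" advice is to let a database engine compile the generated query against an empty copy of the schema before anyone runs it. The sketch below does this with Python's built-in `sqlite3`. The helper name `check_sql`, the schema, and both queries are hypothetical, and the check only covers SQLite syntax and table/column names, so other dialects and the query's actual logic still need human review.

```python
import sqlite3


def check_sql(sql, schema_ddl):
    """Return (ok, message) after asking SQLite to plan the query on an empty schema."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)             # create empty tables only
        conn.execute(f"EXPLAIN QUERY PLAN {sql}")  # compiles the statement without running it
        return True, "ok"
    except (sqlite3.Error, sqlite3.Warning) as exc:
        return False, str(exc)
    finally:
        conn.close()


# Hypothetical schema and queries, for illustration only.
schema = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, order_date TEXT);
"""
good = ("SELECT c.name FROM customers c JOIN orders o ON o.customer_id = c.id "
        "WHERE o.order_date >= '2024-01-01' AND o.order_date < '2025-01-01' "
        "GROUP BY c.id HAVING COUNT(*) > 5")
bad = "SELECT nme FROM customers"  # misspelled column name

print(check_sql(good, schema))
print(check_sql(bad, schema))
```

A query that passes this check is only known to parse and to reference existing tables and columns; it can still return the wrong rows.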

🛠️ Quick Start (merged LoRA version)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "saadkhi/SQL_Chat_finetuned_model"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Wrap the question in the model's chat template
prompt = "Show all customers who placed more than 5 orders in 2024"

messages = [{"role": "user", "content": prompt}]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt"
).to(model.device)

# Greedy decoding; temperature only applies when do_sample=True, so it is omitted
outputs = model.generate(
    **inputs,
    max_new_tokens=180,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)

# Decode only the newly generated tokens, not the echoed prompt
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

