Instructions for using melihemin/qwen2.5-0.5b-text2sql-full with libraries and local apps.

Libraries

How to use melihemin/qwen2.5-0.5b-text2sql-full with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="melihemin/qwen2.5-0.5b-text2sql-full")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("melihemin/qwen2.5-0.5b-text2sql-full")
model = AutoModelForCausalLM.from_pretrained("melihemin/qwen2.5-0.5b-text2sql-full")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use melihemin/qwen2.5-0.5b-text2sql-full with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "melihemin/qwen2.5-0.5b-text2sql-full"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "melihemin/qwen2.5-0.5b-text2sql-full",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
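
The same OpenAI-compatible endpoint can be called from Python. The following is a minimal sketch using only the standard library; it assumes the server started by `vllm serve` is listening on the default port 8000 shown in the curl example. The helper names `build_chat_request` and `ask`, and the `max_tokens` value, are illustrative choices rather than part of vLLM or of this model card.

```python
import json
import urllib.request

MODEL_ID = "melihemin/qwen2.5-0.5b-text2sql-full"
# Port taken from the curl example; change it if you serve elsewhere.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(question: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build (but do not send) a chat-completions request for one question."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": question}],
        "max_tokens": max_tokens,  # illustrative cap, not a tuned value
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(question), timeout=60) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask("How many heads of the departments are older than 56?")` requires the server to be running. Building the request separately from sending it makes the payload easy to inspect or log before any network call is made.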

Use Docker

docker model run hf.co/melihemin/qwen2.5-0.5b-text2sql-full

SGLang

How to use melihemin/qwen2.5-0.5b-text2sql-full with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "melihemin/qwen2.5-0.5b-text2sql-full" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "melihemin/qwen2.5-0.5b-text2sql-full",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "melihemin/qwen2.5-0.5b-text2sql-full" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "melihemin/qwen2.5-0.5b-text2sql-full",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use melihemin/qwen2.5-0.5b-text2sql-full with Docker Model Runner:
```
docker model run hf.co/melihemin/qwen2.5-0.5b-text2sql-full
```

Model Card for Qwen2.5-0.5B Text-to-SQL

Model Summary

This model converts natural language questions into SQL queries.
It is a fine-tuned version of Qwen2.5-0.5B, adapted specifically for the Text-to-SQL task using the LoRA (Low-Rank Adaptation) method.

With about 0.5B parameters, the model is small enough for local experimentation and educational use.

Model Details

Model Description

Developed by: Melih Emin
Model type: Causal Language Model (Text-to-SQL)
Language(s): English
License: Apache 2.0
Finetuned from model: Qwen/Qwen2.5-0.5B
Fine-tuning method: LoRA (Low-Rank Adaptation)

This model was fine-tuned as part of a Generative Artificial Intelligence course assignment.
The primary goal was to explore parameter-efficient fine-tuning techniques on limited local hardware.
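
To illustrate why LoRA suits limited hardware: it freezes a dense weight matrix W of shape d_out x d_in and trains only two low-rank factors, A (r x d_in) and B (d_out x r), so each adapted matrix adds r * (d_in + d_out) trainable parameters. The sketch below works through that arithmetic. The matrix shape and rank are hypothetical values chosen for illustration; this card does not state which rank or target modules were used.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the low-rank factors A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen dense weight matrix W (d_out x d_in)."""
    return d_in * d_out


# Hypothetical values for illustration only; not taken from the model card.
d_in, d_out, rank = 896, 896, 8

trainable = lora_trainable_params(d_in, d_out, rank)
frozen = full_params(d_in, d_out)
print(f"trainable: {trainable}, frozen: {frozen}, ratio: {trainable / frozen:.4f}")
# prints "trainable: 14336, frozen: 802816, ratio: 0.0179"
```

Under these assumed values, the adapter trains under 2% of the parameters of the matrix it modifies.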

Model Sources

Base Model: https://huggingface.co/Qwen/Qwen2.5-0.5B
Repository: https://huggingface.co/melihemin/qwen2.5-0.5b-text2sql-full

Uses

Direct Use

Converting English questions into SQL queries
Educational demonstrations of Text-to-SQL systems
Local experimentation with small language models

Downstream Use

Can be integrated into database query assistants
Can serve as a baseline for more advanced Text-to-SQL systems
Can be fine-tuned further on schema-specific datasets

Out-of-Scope Use

Production-grade database querying without validation
Complex multi-database or highly nested SQL queries
Security-critical or sensitive data environments

Bias, Risks, and Limitations

The model may generate syntactically valid but semantically incorrect SQL
It does not validate generated queries against a database schema
Output quality depends heavily on prompt structure
It was trained on a limited dataset and may not generalize to unseen schemas
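
Because output quality depends on prompt structure, it can help to build prompts with one small function rather than by hand. The sketch below reproduces the `### Question:` / `### SQL:` layout used in this card. The optional `### Schema:` section, the function name `build_prompt`, and the example table are assumptions for illustration; the card does not document a schema section, so its effect on this model is untested here.

```python
def build_prompt(question: str, schema_ddl: str = "") -> str:
    """Format a question (and optional schema DDL) in the card's prompt style.

    The "### Schema:" section is an assumption for illustration; the model card
    only documents the "### Question:" and "### SQL:" sections.
    """
    parts = []
    if schema_ddl.strip():
        parts.append("### Schema:\n" + schema_ddl.strip() + "\n")
    parts.append("### Question:\n" + question.strip() + "\n")
    parts.append("### SQL:\n")
    return "\n".join(parts)


# Hypothetical table definition, used only to show the layout.
example_schema = "CREATE TABLE head (head_id INTEGER, name TEXT, age REAL);"
print(build_prompt("How many heads of the departments are older than 56?", example_schema))
```

With an empty `schema_ddl`, the function returns exactly the two-section prompt layout shown in this card.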

Recommendations

Always validate generated SQL before execution
Use schema-aware prompting for better results
Do not use directly in production without safeguards
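
One way to validate generated SQL before execution is to compile it against an empty copy of the schema. The following is a minimal sketch under stated assumptions: the target dialect is SQLite, only single `SELECT` statements are acceptable, and `validate_sql` is a hypothetical helper name. It catches syntax errors and unknown tables or columns, but it cannot tell whether a query answers the question correctly.

```python
import sqlite3


def validate_sql(sql: str, schema_ddl: str) -> tuple[bool, str]:
    """Check that `sql` is a single SELECT that compiles against the schema.

    Nothing runs against real data: the schema is created in an empty
    in-memory SQLite database and the query is only planned via EXPLAIN.
    """
    statement = sql.strip().rstrip(";").strip()
    if not statement:
        return False, "empty statement"
    # Conservative check: also rejects semicolons inside string literals.
    if ";" in statement:
        return False, "multiple statements are not allowed"
    if not statement.lower().startswith("select"):
        return False, "only SELECT queries are allowed"

    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)        # build empty tables
        conn.execute("EXPLAIN " + statement)  # compile the query plan only
    except sqlite3.Error as exc:
        return False, f"does not compile: {exc}"
    finally:
        conn.close()
    return True, "ok"


# Hypothetical schema and query, for illustration only.
schema = "CREATE TABLE head (head_id INTEGER, name TEXT, age REAL);"
print(validate_sql("SELECT count(*) FROM head WHERE age > 56", schema))
```

A check like this is a first safeguard only. Running accepted queries through a read-only database connection adds a second layer of protection.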

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "melihemin/qwen2.5-0.5b-text2sql-full"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """### Question:
How many heads of the departments are older than 56?

### SQL:
"""

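# Tokenize the prompt and move the tensors to the model's device
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens so the prompt is not echoed back
new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))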
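
Small causal models sometimes keep generating after the query, for example by starting another `### Question:` section. The sketch below shows one way to isolate the first SQL statement from decoded text. The helper name `extract_sql` and the sample output string are hypothetical; the sample is not a recorded output of this model.

```python
def extract_sql(generated: str, marker: str = "### SQL:") -> str:
    """Return the first SQL statement that follows the first `marker`.

    If the marker is absent (the prompt was not echoed back), the whole
    text is treated as the model's answer.
    """
    _, found, after = generated.partition(marker)
    text = after if found else generated
    # Stop if the model starts a new "### ..." section.
    text = text.split("###", 1)[0]
    # Keep the first statement only.
    text = text.split(";", 1)[0]
    # Collapse newlines and repeated spaces into single spaces.
    return " ".join(text.split())


# Hypothetical decoded output, for illustration only.
sample = (
    "### Question:\nHow many heads of the departments are older than 56?\n\n"
    "### SQL:\nSELECT count(*)\nFROM head WHERE age > 56;\n\n### Question:\n"
)
print(extract_sql(sample))
# prints "SELECT count(*) FROM head WHERE age > 56"
```

Splitting on the first semicolon is a simplification: a string literal that contains `;` would be cut short, so treat the result as a candidate to validate rather than a final query.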

Safetensors

Model size: 0.5B params
Tensor type: F16