# Qwen3.5-9B-Physics

A parameter-efficient fine-tuned LoRA adapter built on Qwen/Qwen3.5-9B, optimized for physics problem-solving. Trained with LLaMA Factory on the camel_physics dataset.

This repository provides both the lightweight LoRA adapter and a standalone quantized GGUF model for local deployment.
## Model Details

- **Base Model:** Qwen/Qwen3.5-9B
- **Fine-tuning Method:** Supervised Fine-Tuning (SFT) with LoRA
- **Training Framework:** LLaMA Factory
- **Training Dataset:** camel_ai/physics (5k curated physics question-answer samples)
- **Training Precision:** 4-bit quantized training (QLoRA via bitsandbytes, BF16 compute)
- **LoRA Hyperparameters:**
  - Rank: 16
  - Alpha: 32
  - Dropout: 0.0
- **Quantized Model:** Q4_K_M GGUF (6.4 GB)
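For intuition about why the adapter is so small relative to the 9B base model: LoRA adds only `r * (d_in + d_out)` parameters per adapted weight matrix. A back-of-envelope sketch (the 4096-wide projection below is a hypothetical example, not the actual Qwen3.5-9B architecture):

```python
def lora_params(d_in: int, d_out: int, rank: int = 16) -> int:
    """Extra parameters LoRA adds to one weight matrix of shape (d_out, d_in)."""
    return rank * (d_in + d_out)

# Hypothetical 4096x4096 attention projection with this repo's rank=16:
per_matrix = lora_params(4096, 4096)  # 16 * (4096 + 4096) = 131,072 params
scaling = 32 / 16                     # alpha / rank, applied at inference
print(per_matrix, scaling)
```

Summed over all adapted matrices, this lands in the hundreds of millions of *bytes* rather than gigabytes, which is consistent with the ~169 MB adapter shipped here.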
## Model Capabilities

- Specialized in high-school and undergraduate physics problem-solving, formula derivation, and conceptual analysis
- Preserves the general conversational and reasoning ability of the original Qwen3.5-9B base model
- Dual deployment support: a lightweight LoRA adapter for development and an optimized GGUF model for local inference
- Compatible with Transformers, PEFT, llama.cpp, and Ollama
## Usage

### 1. Load LoRA Adapter (For Development)

Combine the LoRA adapter with the official Qwen3.5-9B base model to get the full fine-tuned capabilities:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "Qwen/Qwen3.5-9B"
lora_model_id = "Alumin-Hydro/Qwen3.5-9B-Physics"

# Load base model and tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    device_map="auto",
    torch_dtype="auto",
)

# Attach the physics LoRA adapter
model = PeftModel.from_pretrained(model, lora_model_id)
model.eval()

# Run inference through the chat template
messages = [{"role": "user", "content": "A 2 kg block slides down a frictionless 30-degree incline. Find its acceleration."}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
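Under the hood, `PeftModel` attaches low-rank update matrices to the frozen base weights: each adapted matrix effectively becomes `W + (alpha / r) * B @ A`. A minimal NumPy sketch of that merge (the shapes are toy values for illustration, not the model's real dimensions):

```python
import numpy as np

# Illustrative LoRA merge: W' = W + (alpha / r) * B @ A.
# Shapes are toy values, not Qwen3.5-9B's real dimensions.
rng = np.random.default_rng(0)
d_out, d_in, rank, alpha = 64, 64, 16, 32

W = rng.normal(size=(d_out, d_in))        # frozen base weight
A = rng.normal(size=(rank, d_in)) * 0.01  # LoRA down-projection
B = np.zeros((d_out, rank))               # LoRA up-projection (initialized to zero)

W_merged = W + (alpha / rank) * (B @ A)

# With B initialized to zero, the merged weight equals the base weight,
# which is why LoRA training starts from the base model's behavior.
print(np.allclose(W_merged, W))
```

The GGUF file in this repository is the result of performing this merge on every adapted matrix and then quantizing the full model, which is why it needs no separate adapter at inference time.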
### 2. Run Quantized GGUF Model (For Local Deployment)

A standalone Q4_K_M quantized GGUF model is provided for fast local inference without the Python stack.
#### Ollama Deployment

```shell
# Build the model from the provided Modelfile, then chat with it
ollama create qwen3.5-9b-physics -f Modelfile
ollama run qwen3.5-9b-physics
```
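The repository ships a ready-made `Modelfile`; for reference, a minimal Modelfile pointing at a local GGUF typically looks like the sketch below (the parameter value is illustrative, not necessarily the shipped configuration):

```
FROM ./qwen3.5-9b-physics-q4_K_M.gguf
PARAMETER temperature 0.7
```

Use the `Modelfile` included in this repository rather than this sketch, since it also carries the chat template expected by the fine-tune.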
#### llama.cpp Deployment

Load `qwen3.5-9b-physics-q4_K_M.gguf` directly with llama.cpp (e.g. via `llama-cli -m qwen3.5-9b-physics-q4_K_M.gguf`) or any other GGUF-compatible inference framework.
## File Description

- `adapter_model.safetensors` & `adapter_config.json`: Lightweight LoRA adapter (~169 MB)
- `qwen3.5-9b-physics-q4_K_M.gguf`: Merged & quantized full model (Q4_K_M, 6.4 GB)
- `Modelfile`: Ollama configuration file
## License

This model is released under the Apache 2.0 license, consistent with the original Qwen3.5-9B base model.