# 🧠 Qwen3.5-4B Prompter — GGUF

A multilingual prompt-engineering model fine-tuned on Yusiko/prompter, a 5,000-sample dataset covering 10 languages and 7 domains.

Given any short, vague user input, this model expands it into a fully structured, production-ready prompt with role assignment, context, step-by-step instructions, output format, and quality standards, following the best practices in Google's Prompt Engineering Whitepaper.

> 🚀 Trained 2x faster with Unsloth · Exported to GGUF · Ready for Ollama & llama.cpp
## 📦 Available Files

| File | Quantization | Size | Use case |
|---|---|---|---|
| Qwen3.5-4B.Q4_0.gguf | Q4_0 | ~2.54 GB | 💡 Recommended — fast, efficient |
| Qwen3.5-4B.BF16-mmproj.gguf | BF16 | larger | 🔬 Higher precision |
## 🚀 Quick Start

### Ollama

```bash
ollama run hf.co/Yusiko/qwen3.5-prompter
```

### llama.cpp

```bash
# Text-only
llama-cli -hf Yusiko/qwen3.5-prompter --jinja

# Multimodal
llama-mtmd-cli -hf Yusiko/qwen3.5-prompter --jinja
```
### Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Yusiko/qwen3.5-prompter",
    filename="Qwen3.5-4B.Q4_0.gguf",
    n_ctx=2048,
)

response = llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": (
                "Below is an instruction that describes a task, paired with an input "
                "that provides further context. Write a response that appropriately "
                "completes the request.\n\n"
                "### Instruction:\n"
                "As a prompt engineer, transform this simple input into a fully detailed, professional prompt\n\n"
                "### Input:\n"
                "Write a Python function\n\n"
                "### Response:"
            ),
        }
    ],
    max_tokens=512,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```
## 💡 Prompt Format (Alpaca)

```
Below is an instruction that describes a task, paired with an input that provides further context.
Write a response that appropriately completes the request.

### Instruction:
As a prompt engineer, transform this simple input into a fully detailed, professional prompt

### Input:
{your simple prompt here}

### Response:
```
> ⚠️ Always use string concatenation — not `.format()` — when building prompts programmatically. The model's outputs contain `{curly braces}` that will cause a `KeyError` with Python's string formatter.
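As a sketch of the warning above, building the Alpaca prompt by plain concatenation sidesteps the brace problem entirely (the `build_prompt` helper name is ours for illustration, not part of this model card):

```python
def build_prompt(user_input: str) -> str:
    """Build the Alpaca-style prompt by concatenation (brace-safe)."""
    return (
        "Below is an instruction that describes a task, paired with an input "
        "that provides further context. Write a response that appropriately "
        "completes the request.\n\n"
        "### Instruction:\n"
        "As a prompt engineer, transform this simple input into a fully "
        "detailed, professional prompt\n\n"
        "### Input:\n" + user_input + "\n\n"
        "### Response:"
    )

prompt = build_prompt("Write a Python function")

# By contrast, a .format() template raises KeyError as soon as the text
# being spliced in (for example, a model output pasted back into a
# template) contains stray braces:
try:
    "contains {braces}".format()  # KeyError: 'braces'
except KeyError as e:
    print("format() failed on stray braces:", e)
```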
## 🎯 What This Model Does

**Input:** a short, vague prompt

```
Write a Python function
```

**Output:** a complete, structured, professional prompt

```markdown
## System Prompt
You are a senior software engineer with 10+ years of Python experience.
Your task is to write a Python function with production-quality standards.

## Role & Context
Act as a senior engineer conducting a thorough implementation session...

## Step-by-Step Instructions
1. Clarify requirements, edge cases, and constraints before writing any code
2. Design the interface and data structures first, then implement logic
3. Write the implementation with comprehensive inline documentation
4. Add input validation and robust error handling for all edge cases
5. Write unit tests covering happy path, edge cases, and error scenarios

## Output Requirements
- Implementation: Complete, working code with no placeholders
- Documentation: Inline comments explaining non-obvious logic
- Tests: At minimum 3 test cases (happy path, edge case, error case)
...
```
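Because the model's output is organized under `## `-prefixed headings, it is easy to post-process into sections. This small helper (`parse_sections` is our name, not part of the model card) is one way to do it, assuming every section starts with a `## ` heading line:

```python
def parse_sections(text: str) -> dict:
    """Split '## Heading'-structured text into a {heading: body} dict."""
    sections = {}
    current = None
    body_lines = []
    for line in text.splitlines():
        if line.startswith("## "):
            if current is not None:
                sections[current] = "\n".join(body_lines).strip()
            current = line[3:].strip()
            body_lines = []
        else:
            body_lines.append(line)
    if current is not None:
        sections[current] = "\n".join(body_lines).strip()
    return sections

example = (
    "## System Prompt\nYou are a senior software engineer.\n\n"
    "## Step-by-Step Instructions\n1. Clarify requirements\n"
)
parsed = parse_sections(example)
# parsed["System Prompt"] == "You are a senior software engineer."
```

This makes it straightforward to, say, feed only the `System Prompt` section into a downstream chat call while logging the rest.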
## 📊 Training Details
| Field | Value |
|---|---|
| 🤖 Base model | Qwen/Qwen3.5-4B |
| 🗂️ Dataset | Yusiko/prompter |
| 📦 Dataset size | 5,000 samples |
| 🌍 Languages | 10 (az, en, tr, ru, de, fr, zh, ar, es, ja) |
| 🎯 Method | QLoRA (rank=16, alpha=16) |
| ⚙️ Framework | Unsloth + TRL SFTTrainer |
| 💻 Hardware | NVIDIA RTX 5070 (12 GB) |
| 🧮 Optimizer | AdamW (PyTorch) |
| 📐 Seq length | 1024 tokens |
| 🔢 Batch size | 1 × 8 grad accum = 8 effective |
| 📉 LR scheduler | Cosine |
| 🔁 Training steps | 500 |
| 🏷️ Export format | GGUF Q4_0 |
## 🏗️ Dataset Overview
The Yusiko/prompter dataset contains 4 output types, each following Google's Prompt Engineering Whitepaper:
| Type | Count | Description |
|---|---|---|
| 🔷 Standard | ~3,280 | Role + system + contextual prompting |
| 🔶 Few-shot | ~1,000 | 2 examples shown before the main task |
| 🔹 Chain-of-Thought | ~460 | Step-by-step reasoning structure |
| 🔸 Step-back | ~260 | General principles → specific implementation |
**Domains covered:** Coding · Writing · Analysis · ML/AI · DevOps · Data Engineering · Business Strategy
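A quick sanity check on the composition table: the approximate per-type counts sum to the stated 5,000 samples, and per-type shares follow directly (the counts are the card's approximations, so the percentages are likewise approximate):

```python
# Approximate output-type counts from the dataset overview table
counts = {
    "Standard": 3280,
    "Few-shot": 1000,
    "Chain-of-Thought": 460,
    "Step-back": 260,
}

total = sum(counts.values())
print(total)  # 5000, matching the stated dataset size

for name, n in counts.items():
    print(f"{name}: {n / total:.1%}")
# Standard: 65.6%, Few-shot: 20.0%, Chain-of-Thought: 9.2%, Step-back: 5.2%
```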
## ⚙️ Hardware Requirements
| Setup | VRAM / RAM | Speed |
|---|---|---|
| GPU (Q4_0) | 4–6 GB VRAM | Fast |
| CPU only (Q4_0) | ~6 GB RAM | Moderate |
| Apple Silicon (Q4_0) | ~6 GB unified RAM | Fast via Metal |
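The table lists 4–6 GB for the ~2.54 GB Q4_0 file because the KV cache and runtime buffers need headroom beyond the file itself. A tiny helper makes that arithmetic explicit; the 1.5 GB overhead figure here is illustrative, not from the card:

```python
def fits_in_memory(file_gb: float, available_gb: float, overhead_gb: float = 1.5) -> bool:
    """Rough check: model file plus KV-cache/runtime overhead vs. free (V)RAM."""
    return file_gb + overhead_gb <= available_gb

print(fits_in_memory(2.54, 6.0))  # comfortable on a 6 GB setup
print(fits_in_memory(2.54, 3.0))  # below the table's 4 GB floor
```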
## 📜 Citation

```bibtex
@misc{yusiko_qwen35_prompter_2025,
  author    = {Yusif},
  title     = {Qwen3.5-4B Prompter: Multilingual Prompt Engineering Model},
  year      = {2025},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Yusiko/qwen3.5-prompter},
  note      = {Dataset: https://huggingface.co/datasets/Yusiko/prompter}
}
```
## 🙏 Acknowledgements
- Unsloth — 2x faster fine-tuning, GGUF export
- Google Prompt Engineering Whitepaper — Lee Boonstra et al., Feb 2025
- TRL — SFTTrainer + SFTConfig
- Qwen Team — Qwen3.5 base model
Built with ❤️ by Yusif · Apache 2.0 License