🚀 Z-Engineer V2.5 (4B)

The "Z-Engineer" is back — longer, deeper, and smarter.

This is Z-Engineer V2.5, a specialized 4B parameter model fine-tuned on the Qwen 3 architecture. It serves as a dedicated Creative Director for your image generation workflow, capable of extrapolating complex, cohesive visual narratives from minimal seed concepts. It doesn't just describe a scene; it engineers the light, lens, and atmosphere necessary to render it.

🧠 What is this?

Z-Engineer V2.5 is a merged LoRA fine-tuned version of high-performance text encoder from Tongyi-MAI/Z-Image-Turbo. It has been trained to specifically understand the nuances of AI Image Generation (Z-Image-Turbo, Flux2 Klein). It excels at:

Expanding Concepts: Turn "dog on a bike" into a cinematic narrative.
Technical Precision: It understands lenses (35mm vs 85mm), lighting (rembrandt, volumetric), and film stocks.
Stylistic Consistency: It avoids the robotic "AI feel" and writes with a distinct, creative voice.

🔑 Key Use Cases

✨ Prompt Enhancement: A lightweight, low-VRAM solution to create, edit, and enrich simple image ideas into detailed narratives.
🔌 Z-Image Turbo Encoder: Fully backwards compatible as a drop-in CLIP text encoder for Z-Image Turbo workflows, producing varied and unique results from the same seed.
🛡️ Local & Private: Runs entirely on your machine. No API fees, no data logging, no censorship.
⚡ Hybrid Power: Use it to expand a prompt, then use the model itself as the encoder for the generation stage.

📉 Key Improvements

Base Model Upgrade: Switched from standard Qwen3 Instruct to the native text encoder from Z-Image-Turbo for perfect alignment.
All-Layer Training: Unlike typical lightweight LoRAs, I trained adapters on all 36 layers of the model, ensuring deep behavioral alignment.
Massive Iteration Count: Trained for 10,000 iterations to fully saturate the weights with the dataset concepts.

📊 CLIP Model Comparison

Z-Engineer V2.5 can be used as a drop-in CLIP text encoder for Z-Image-Turbo workflows. Here's how it compares to previous versions and the base model:

Model	Result
Z-Engineer V2.5	✅ Clean, natural output with excellent detail and coherence.
Z-Engineer V2	✅ Good quality, but V2.5 shows improved texture and lighting.
Z-Engineer V1	❌ Broken: Produces severe visual artifacts and distortions.
Base Qwen3 4B	⚠️ Functional but generic; lacks the specialized prompt understanding.

Visual Comparison

Note: V1 exhibits catastrophic artifacts (bottom-left in each grid) due to training instabilities. V2.5 (top-left) consistently produces the cleanest, most natural results.

🔌 ComfyUI Integration (Recommended)

I have released a custom node for seamless integration with ComfyUI!

Features: Optimized for local OpenAI API compatible backends (LM Studio, Ollama, etc.).
Get it here: ComfyUI-Z-Engineer

💻 Training Facts

I believe in open science. Here is exactly how this was built:

Hardware: Trained locally on a Mac with 48GB Unified Memory (Apple Silicon).
Framework: MLX (Apple's native machine learning framework).
Dataset: Generated locally using Qwen3 VL 30B A3B Instruct
- Size: ~34,678 high-quality examples.
- Content: A curated mix of "Prompt Enhancement" pairs, teaching the model how to take a seed idea and "engineer" it into a final prompt.
Hyperparameters:
- Iterations: 10,000
- Batch Size: 4
- LoRA Layers: 36 (All Linear Layers)
- Learning Rate: 1e-5

📦 GGUF & Quantization

I provide a full suite of GGUF quantizations for use with llama.cpp, Ollama, and LM Studio.

Quantization	Size	Use Case
Q4_K_S	2.2 GB	🔻 Max Compression
Q4_K_M	2.3 GB	⚡️ Fast / Mobile / Edge
Q5_K_M	2.7 GB	⚖️ Recommended Balance
Q6_K	3.1 GB	💎 High Quality
Q8_0	4.0 GB	🎬 Near-Lossless
F16	7.5 GB	🧪 Reference / Conversion

⚠️ Disclaimer

This model generates text for image prompts. While I have filtered the dataset, users should use their best judgment. I am not responsible for the content you generate.

Downloads last month: 2,428

GGUF

Model size

4B params

Architecture

qwen3

Hardware compatibility

4-bit

5-bit

6-bit

8-bit

View +1 variant

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for BennyDaBall/Qwen3-4b-Z-Image-Engineer-V2.5

Base model

Tongyi-MAI/Z-Image-Turbo

Quantized

(27)

this model

Collection including BennyDaBall/Qwen3-4b-Z-Image-Engineer-V2.5

Z-Image-Engineer

Collection

Various versions of my Z-Image-Engineer models. • 6 items • Updated 7 days ago • 1