πŸš€ Z-Engineer V2.5 (4B)

The "Z-Engineer" is back β€” longer, deeper, and smarter.

This is Z-Engineer V2.5, a specialized 4B parameter model fine-tuned on the Qwen 3 architecture. It serves as a dedicated Creative Director for your image generation workflow, capable of extrapolating complex, cohesive visual narratives from minimal seed concepts. It doesn't just describe a scene; it engineers the light, lens, and atmosphere necessary to render it.

🧠 What is this?

Z-Engineer V2.5 is a merged LoRA fine-tuned version of high-performance text encoder from Tongyi-MAI/Z-Image-Turbo. It has been trained to specifically understand the nuances of AI Image Generation (Z-Image-Turbo, Flux2 Klein). It excels at:

  • Expanding Concepts: Turn "dog on a bike" into a cinematic narrative.
  • Technical Precision: It understands lenses (35mm vs 85mm), lighting (rembrandt, volumetric), and film stocks.
  • Stylistic Consistency: It avoids the robotic "AI feel" and writes with a distinct, creative voice.

πŸ”‘ Key Use Cases

  • ✨ Prompt Enhancement: A lightweight, low-VRAM solution to create, edit, and enrich simple image ideas into detailed narratives.
  • πŸ”Œ Z-Image Turbo Encoder: Fully backwards compatible as a drop-in CLIP text encoder for Z-Image Turbo workflows, producing varied and unique results from the same seed.
  • πŸ›‘οΈ Local & Private: Runs entirely on your machine. No API fees, no data logging, no censorship.
  • ⚑ Hybrid Power: Use it to expand a prompt, then use the model itself as the encoder for the generation stage.

πŸ“‰ Key Improvements

  • Base Model Upgrade: Switched from standard Qwen3 Instruct to the native text encoder from Z-Image-Turbo for perfect alignment.
  • All-Layer Training: Unlike typical lightweight LoRAs, I trained adapters on all 36 layers of the model, ensuring deep behavioral alignment.
  • Massive Iteration Count: Trained for 10,000 iterations to fully saturate the weights with the dataset concepts.

πŸ“Š CLIP Model Comparison

Z-Engineer V2.5 can be used as a drop-in CLIP text encoder for Z-Image-Turbo workflows. Here's how it compares to previous versions and the base model:

Model Result
Z-Engineer V2.5 βœ… Clean, natural output with excellent detail and coherence.
Z-Engineer V2 βœ… Good quality, but V2.5 shows improved texture and lighting.
Z-Engineer V1 ❌ Broken: Produces severe visual artifacts and distortions.
Base Qwen3 4B ⚠️ Functional but generic; lacks the specialized prompt understanding.

Visual Comparison

CLIP Comparison 1

CLIP Comparison 2

Note: V1 exhibits catastrophic artifacts (bottom-left in each grid) due to training instabilities. V2.5 (top-left) consistently produces the cleanest, most natural results.

πŸ”Œ ComfyUI Integration (Recommended)

I have released a custom node for seamless integration with ComfyUI!

  • Features: Optimized for local OpenAI API compatible backends (LM Studio, Ollama, etc.).
  • Get it here: ComfyUI-Z-Engineer

πŸ’» Training Facts

I believe in open science. Here is exactly how this was built:

  • Hardware: Trained locally on a Mac with 48GB Unified Memory (Apple Silicon).
  • Framework: MLX (Apple's native machine learning framework).
  • Dataset: Generated locally using Qwen3 VL 30B A3B Instruct
    • Size: ~34,678 high-quality examples.
    • Content: A curated mix of "Prompt Enhancement" pairs, teaching the model how to take a seed idea and "engineer" it into a final prompt.
  • Hyperparameters:
    • Iterations: 10,000
    • Batch Size: 4
    • LoRA Layers: 36 (All Linear Layers)
    • Learning Rate: 1e-5

πŸ“¦ GGUF & Quantization

I provide a full suite of GGUF quantizations for use with llama.cpp, Ollama, and LM Studio.

Quantization Size Use Case
Q4_K_S 2.2 GB πŸ”» Max Compression
Q4_K_M 2.3 GB ⚑️ Fast / Mobile / Edge
Q5_K_M 2.7 GB βš–οΈ Recommended Balance
Q6_K 3.1 GB πŸ’Ž High Quality
Q8_0 4.0 GB 🎬 Near-Lossless
F16 7.5 GB πŸ§ͺ Reference / Conversion

⚠️ Disclaimer

This model generates text for image prompts. While I have filtered the dataset, users should use their best judgment. I am not responsible for the content you generate.

Downloads last month
2,428
GGUF
Model size
4B params
Architecture
qwen3
Hardware compatibility
Log In to view the estimation

4-bit

5-bit

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for BennyDaBall/Qwen3-4b-Z-Image-Engineer-V2.5

Quantized
(27)
this model

Collection including BennyDaBall/Qwen3-4b-Z-Image-Engineer-V2.5