Releasing YOLO-Coder-8B and YOLO-Coder-1.5B — fine-tuned models for fixing broken CLI commands, running 100% locally.
Both models are fine-tuned from Qwen2.5-Coder using MLX LoRA on Apple Silicon, trained on 6,719 real CLI error→fix pairs across 15 categories (Python, pip, Node.js, npm, Docker, Git, Cargo, SSH, database, and more).
Unlike general-purpose coding assistants, these models are laser-focused on a single task: given a CLI error, output exactly one bare shell command that fixes it. No explanation. No markdown. One command.
**Benchmark results (YOLO-Bench, 218 verified CLI errors, structural match scoring):**
- YOLO-Coder-8B raw LLM: **59.2%** (vs GPT-4o 48.6%, Claude Sonnet 60.1%)
- YOLO-Coder-8B full pipeline: **77.1%**
- YOLO-Coder-1.5B raw LLM: **42.2%**
- YOLO-Coder-1.5B full pipeline: **71.1%**
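To illustrate what "structural match scoring" means in practice, here is a minimal sketch: two commands count as a match if they tokenize to the same shell words, so cosmetic whitespace differences are ignored. The function name and exact rules are hypothetical; the real scorer lives in the linked benchmark repo.

```python
import shlex

def structural_match(predicted: str, expected: str) -> bool:
    """Compare two shell commands token by token, ignoring
    cosmetic whitespace (hypothetical sketch of the scoring idea)."""
    try:
        return shlex.split(predicted.strip()) == shlex.split(expected.strip())
    except ValueError:
        # Unparseable output (e.g. unbalanced quotes) scores as a miss.
        return False

print(structural_match("pip install  requests", "pip install requests"))    # True
print(structural_match("pip install requests==2.0", "pip install requests"))  # False
```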
The full pipeline layers 73 deterministic interceptors and fix memory on top of the LLM — roughly half of all fixes never reach the model.
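The interceptor idea can be sketched as a lookup table of error patterns tried before the LLM is ever invoked. The two rules below are hypothetical examples, not the actual 73 shipped in the pipeline:

```python
import re

# Hypothetical interceptor table: (error pattern, fix template).
INTERCEPTORS = [
    (re.compile(r"ModuleNotFoundError: No module named '([\w.]+)'"),
     lambda m: f"pip install {m.group(1)}"),
    (re.compile(r"command not found: (\S+)"),
     lambda m: f"brew install {m.group(1)}"),
]

def intercept(error: str):
    """Return a deterministic fix if any rule matches,
    else None (the error falls through to the LLM)."""
    for pattern, fix in INTERCEPTORS:
        m = pattern.search(error)
        if m:
            return fix(m)
    return None

print(intercept("ModuleNotFoundError: No module named 'requests'"))
# -> pip install requests
```

Because each rule is a pure regex match, interceptor hits are instant and fully reproducible, which is why a large share of fixes never need the model at all.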
Both models are available as Q4_K_M GGUFs for Ollama:
- 🔗 [YOLO-Coder-8B](erdemozkan/YOLO-Coder-8B)
- 🔗 [YOLO-Coder-1.5B](erdemozkan/YOLO-Coder-1.5B)
Benchmark dataset and scoring code: [github.com/erdemozkan/YOLO-CODER/tree/main/benchmark](https://github.com/erdemozkan/YOLO-CODER/tree/main/benchmark)