Daily Model Scout Report - 2026-04-03

#4
by msudharsanan - opened
Denali Advanced Integration org

Current Denali-AI Baselines

| Model | Weighted Score | Notes |
|---|---|---|
| qwen3-vl-8b-sft+grpo | 0.9131 | Best overall |
| qwen3-vl-8b-sft-grpo-nvfp4 | 0.8945 | Best quantized |
| qwen3-vl-2b-sft-grpo-v9 | 0.8948 | Best small model |
| qwen35-2b-base | 0.8437 | Best Qwen3.5 base |
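
When a candidate from this report gets evaluated, its weighted score has to be read against the baselines above. The following is a minimal sketch of that comparison, not the team's actual tooling: the function names are hypothetical, and it assumes the candidate is scored with the same weighted metric as the baselines (the report does not define how that metric is computed).

```python
# Baseline weighted scores, copied from the baseline table in this report.
BASELINES = {
    "qwen3-vl-8b-sft+grpo": 0.9131,
    "qwen3-vl-8b-sft-grpo-nvfp4": 0.8945,
    "qwen3-vl-2b-sft-grpo-v9": 0.8948,
    "qwen35-2b-base": 0.8437,
}


def gap_to_baselines(candidate_score: float) -> dict[str, float]:
    """Return (candidate - baseline) for every baseline, rounded to 4 places.

    Positive values mean the candidate beats that baseline.
    Hypothetical helper, for illustration only.
    """
    return {
        name: round(candidate_score - score, 4)
        for name, score in BASELINES.items()
    }


def baselines_beaten(candidate_score: float) -> list[str]:
    """List the baselines that the candidate strictly outperforms."""
    return [name for name, score in BASELINES.items() if candidate_score > score]
```

For example, a hypothetical candidate scoring 0.9000 would beat every baseline except `qwen3-vl-8b-sft+grpo`, which `baselines_beaten(0.9000)` reports by omitting that name from the returned list.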

NEW MODELS (Last 7 Days: March 27 - April 3, 2026)

1. Google Gemma 4 (Released April 2, 2026) - HIGH RELEVANCE

  • Sizes: E2B, E4B, 26B-A4B (MoE), 31B (Dense)
  • HuggingFace: google/gemma-4-E2B, google/gemma-4-E4B, google/gemma-4-26B-A4B, google/gemma-4-31B
  • Architecture: Native multimodal (text + image + video + audio on E2B/E4B), Apache 2.0 license
  • Key features: OCR (multilingual), document/PDF parsing, chart comprehension, object detection, pointing. Variable aspect ratio. Context: 128K-256K.
  • Why relevant: E2B/E4B are extremely small and natively multimodal, so they could replace our 0.8B-2B tier. 26B-A4B (only 4B active parameters) delivers large-model quality at small-model cost. Strong OCR aligns with garment label reading.
  • Relevance: HIGH. Evaluate E4B and 26B-A4B immediately.

2. IBM Granite 4.0 3B Vision (Released April 1, 2026) - MEDIUM RELEVANCE

  • Size: ~3.5B base + 0.5B LoRA adapter
  • HuggingFace: ibm-granite/granite-4.0-3b-vision
  • Architecture: LoRA adapter on Granite 4.0 Micro, specialized for document extraction
  • Key features: Chart/table extraction, semantic key-value pair extraction. 85.5% exact-match accuracy zero-shot.
  • Why relevant: KVP extraction directly analogous to garment JSON extraction task.
  • Relevance: MEDIUM. Worth a quick eval.
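
To make the KVP comparison concrete, here is a rough sketch of a field-level exact-match metric that could be applied to garment JSON output. This is an assumption about how such a score might be computed, not a description of how IBM derived the 85.5% figure; the function names and the garment fields in the usage note (`brand`, `size`, `material`) are hypothetical.

```python
def field_exact_match(predicted: dict, gold: dict) -> float:
    """Fraction of gold fields whose predicted value equals the gold value exactly.

    A missing key in the prediction counts as a miss. Extra predicted keys
    are ignored. Hypothetical metric, for illustration only.
    """
    if not gold:
        raise ValueError("gold record must contain at least one field")
    hits = sum(
        1 for key, value in gold.items()
        if key in predicted and predicted[key] == value
    )
    return hits / len(gold)


def corpus_exact_match(pairs: list[tuple[dict, dict]]) -> float:
    """Micro-averaged exact match over all gold fields in (predicted, gold) pairs."""
    total = 0
    hits = 0
    for predicted, gold in pairs:
        total += len(gold)
        hits += sum(
            1 for key, value in gold.items()
            if key in predicted and predicted[key] == value
        )
    if total == 0:
        raise ValueError("no gold fields to score")
    return hits / total
```

With a prediction of `{"brand": "Acme", "size": "M", "material": "cotton"}` against a gold record that differs only in `size`, `field_exact_match` scores two of three fields as correct. Whether values should be normalized (case, whitespace) before comparison is a design choice left open here.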

3. Z.ai GLM-5V-Turbo (Released April 1, 2026) - LOW

  • API-only, coding/agentic focused. Not relevant for classification.

4. Qwen 3.6 Plus Preview (Released March 31, 2026) - MEDIUM (WATCH)

  • API-only preview, 1M context, strong reasoning. Watch for open weights.

5. Qwen3.5-Omni (Released March 30, 2026) - LOW

  • Closed source. Cannot fine-tune.

RECENT MODELS STILL WORTH EVALUATING

6. Phi-4-Reasoning-Vision-15B (March 4, 2026) - MEDIUM

  • HuggingFace: microsoft/Phi-4-reasoning-vision-15B
  • 15B params, SigLIP-2 vision encoder, strong visual reasoning

7. Keye-VL-1.5-8B (Recent) - MEDIUM

  • HuggingFace: Kwai-Keye/Keye-VL-1_5-8B
  • 8B params, RL-trained, strong image comprehension. Competitor to Qwen3-VL-8B.

RECOMMENDATIONS

Immediate (This Week):

  1. Evaluate Gemma 4 E4B and 26B-A4B as base models for garment classification
  2. Evaluate IBM Granite 4.0 3B Vision on garment JSON extraction

Watch List:

  3. Qwen 3.6 Plus: monitor for open-weight release
  4. Keye-VL-1.5-8B: evaluate when bandwidth allows

No Action Needed: Qwen3.5-Omni (closed), GLM-5V-Turbo (API/coding)


Summary Table

| Model | Released | Size | Open | Relevance | Action |
|---|---|---|---|---|---|
| Gemma 4 (E4B, 26B-A4B) | Apr 2 | E4B / 26B MoE | Yes (Apache 2.0) | HIGH | Eval now |
| IBM Granite 4.0 3B Vision | Apr 1 | ~4B | Yes | MEDIUM | Eval this week |
| GLM-5V-Turbo | Apr 1 | Unknown | No (API) | LOW | Skip |
| Qwen 3.6 Plus Preview | Mar 31 | Large | No (API) | MEDIUM | Watch |
| Qwen3.5-Omni | Mar 30 | Multiple | No | LOW | Skip |
| Phi-4-Reasoning-Vision | Mar 4 | 15B | Yes | MEDIUM | Eval when free |
| Keye-VL-1.5-8B | Recent | 8B | Yes | MEDIUM | Eval when free |
