Airy / README.md

nhannt201

Refresh research model card metadata

ba4ce94 verified 5 days ago

metadata

language:
  - en
  - vi
license: other
library_name: gguf
tags:
  - acne
  - dermatology
  - skincare
  - gguf
  - research
  - quantization
  - qwen3.5
pipeline_tag: text-generation
base_model:
  - Qwen/Qwen3.5-0.8B

Acnoryx AI Research Bundle

Overview

Base model: Qwen/Qwen3.5-0.8B
Model size: 0.8B parameters
Research quantizations: Q3_K_M, IQ3_M, Q2_K, IQ2_M, IQ2_XS, IQ2_XXS, IQ1_M, IQ1_S
Purpose: evaluate quality vs. size trade-offs below the production threshold

Notes

IQ1/IQ2 formats require an importance matrix (imatrix).
These files are more experimental than the release bundle.
Production-facing use should prefer the release bundle.
If prompting in Vietnamese, write with full accents for best consistency.
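
As a rough illustration of the imatrix note, the sketch below builds (but does not run) a llama.cpp `llama-quantize` command line and refuses to build one for IQ1/IQ2 types without an importance matrix. It assumes the llama.cpp `llama-quantize` tool with its `--imatrix` option was used to produce these files, which this card does not state. The helper names, the output naming scheme, and the file names are hypothetical.

```python
from pathlib import Path

# Per the note above, IQ1_* and IQ2_* types are treated as requiring an imatrix.
IMATRIX_PREFIXES = ("IQ1_", "IQ2_")


def needs_imatrix(quant: str) -> bool:
    """Return True for quant types that this card says need an importance matrix."""
    return quant.upper().startswith(IMATRIX_PREFIXES)


def quantize_argv(src_gguf, quant, imatrix=None, binary="llama-quantize"):
    """Build the argument list for a quantization run without executing it.

    Assumed CLI shape: llama-quantize [--imatrix FILE] INPUT OUTPUT TYPE
    """
    if needs_imatrix(quant) and imatrix is None:
        raise ValueError(f"{quant} requires an importance matrix (imatrix)")
    src = Path(src_gguf)
    # Hypothetical naming scheme: <input stem>-<quant>.gguf next to the input file.
    out = src.with_name(f"{src.stem}-{quant}.gguf")
    argv = [binary]
    if imatrix is not None:
        argv += ["--imatrix", str(imatrix)]
    argv += [str(src), str(out), quant]
    return argv
```

For example, `quantize_argv("model-f16.gguf", "IQ2_M", imatrix="imatrix.dat")` yields an argument list that can be passed to `subprocess.run`, while the same call without `imatrix` raises `ValueError`.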

Evaluation Snapshot

Scores for the research GGUFs carry over the existing results and merge them with the latest rerun on the same curated 58-question bilingual (English/Vietnamese) benchmark.

Quant	Think	No-Think	Avg	Status
Q3_K_M	74.1%	72.4%	73.2%	Best current research quant
IQ3_M	60.3%	60.3%	60.3%	Heavy quality loss
IQ2_M	20.7%	19.0%	19.8%	Below usable threshold
IQ2_XS	5.2%	3.4%	4.3%	Triggered early-stop for lower bits
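
The Avg column looks like the unweighted mean of the Think and No-Think pass rates. The sketch below recomputes it from the table rows under that assumption; the names `ROWS`, `average_pass_rate`, and `check_rows` are illustrative and not part of the benchmark harness.

```python
# Pass rates in percent, copied from the table: (think, no_think, reported_avg).
ROWS = {
    "Q3_K_M": (74.1, 72.4, 73.2),
    "IQ3_M": (60.3, 60.3, 60.3),
    "IQ2_M": (20.7, 19.0, 19.8),
    "IQ2_XS": (5.2, 3.4, 4.3),
}


def average_pass_rate(think: float, no_think: float) -> float:
    """Assumed definition: plain mean of the two prompting modes."""
    return (think + no_think) / 2


def check_rows(rows=ROWS, tolerance=0.1):
    """Recompute each average; raise if it differs from the reported value.

    The tolerance absorbs the one-decimal rounding used in the table.
    """
    recomputed = {}
    for quant, (think, no_think, reported) in rows.items():
        avg = average_pass_rate(think, no_think)
        if abs(avg - reported) > tolerance:
            raise ValueError(f"{quant}: recomputed {avg:.2f}, reported {reported}")
        recomputed[quant] = avg
    return recomputed
```

Calling `check_rows()` returns the recomputed averages without raising, so every reported value agrees with the mean of its two columns to within rounding.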

Research Guidance

Public research recommendation: Q3_K_M only
IQ3_M can still be uploaded for experiments, but its quality is clearly degraded (60.3% average versus 73.2% for Q3_K_M)
The rerun stopped automatically after IQ2_XS because the average pass rate had fallen under 50%, so lower-bit quants should be treated as archival artifacts rather than viable deployments
For any user-facing scenario, prefer the release bundle instead of this research branch
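
The early-stop behaviour mentioned in the guidance can be sketched as a sweep that walks the quants from most to fewest bits and stops once an average falls under a threshold. This is only one plausible reading: the harness's exact rule is not documented here (the table shows IQ2_M already under 50% while IQ2_XS was still scored, so the real rule may differ, for instance by applying only to quants that were freshly rerun). The function name and the `evaluate` callback are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def sweep_with_early_stop(
    quants: Iterable[str],
    evaluate: Callable[[str], float],
    threshold: float = 50.0,
) -> Tuple[Dict[str, float], List[str]]:
    """Evaluate quants in the given order (highest bit width first).

    Stops after the first quant whose average pass rate (in percent) is below
    `threshold`. Returns (scores of evaluated quants, quants that were skipped).
    """
    ordered = list(quants)
    scores: Dict[str, float] = {}
    for index, quant in enumerate(ordered):
        average = evaluate(quant)
        scores[quant] = average
        if average < threshold:
            # Everything after this point is lower-bit and is not evaluated.
            return scores, ordered[index + 1:]
    return scores, []
```

With a stand-in evaluator such as `{"A": 80.0, "B": 40.0, "C": 10.0}.get`, the sweep scores `A` and `B`, then reports `C` as skipped.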

For cross-family ranking and release-vs-research comparison, see results/COMPARISON.md in the workspace.