Acnoryx / Airy


Refresh research model card metadata

ba4ce94 verified 5 days ago


	---
	language:
	- en
	- vi
	license: other
	library_name: gguf
	tags:
	- acne
	- dermatology
	- skincare
	- gguf
	- research
	- quantization
	- qwen3.5
	pipeline_tag: text-generation
	base_model:
	- Qwen/Qwen3.5-0.8B
	---

	# Acnoryx AI Research Bundle

	## Overview

	- Base model: Qwen/Qwen3.5-0.8B
	- Model size: 0.8B parameters
	- Research quantizations: Q3_K_M, IQ3_M, Q2_K, IQ2_M, IQ2_XS, IQ2_XXS, IQ1_M, IQ1_S
	- Purpose: evaluate quality vs. size trade-offs below the production threshold

	## Notes

	- IQ1/IQ2 formats require an importance matrix (imatrix).
	- These files are more experimental than the release bundle.
	- Production-facing use should prefer the release bundle.
	- If prompting in Vietnamese, write with full accents for best consistency.
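
The imatrix requirement noted above can be illustrated with a minimal sketch. It assumes the quantizations are produced with the llama.cpp command-line tools `llama-imatrix` and `llama-quantize`, which this card does not confirm. The file names (`airy-f16.gguf`, `calibration.txt`, `imatrix.dat`) are hypothetical placeholders, and the helpers only build the command lines rather than running them.

```python
import shlex


def imatrix_cmd(model, calibration_text, out="imatrix.dat"):
    """Build a llama-imatrix command that writes an importance matrix.

    Assumes the llama.cpp `llama-imatrix` binary is on PATH.
    """
    return ["llama-imatrix", "-m", model, "-f", calibration_text, "-o", out]


def quantize_cmd(model_in, model_out, quant_type, imatrix=None):
    """Build a llama-quantize command, optionally passing an imatrix file."""
    cmd = ["llama-quantize"]
    if imatrix is not None:
        cmd += ["--imatrix", imatrix]
    cmd += [model_in, model_out, quant_type]
    return cmd


# Hypothetical file names, used for illustration only.
steps = [
    imatrix_cmd("airy-f16.gguf", "calibration.txt"),
    quantize_cmd("airy-f16.gguf", "airy-IQ2_M.gguf", "IQ2_M", imatrix="imatrix.dat"),
]

for step in steps:
    # Print the commands; pass a step to subprocess.run(step, check=True) to execute it.
    print(shlex.join(step))
```

The imatrix is computed once from the unquantized model and can then be reused for each low-bit target. The calibration text would presumably need to cover both English and Vietnamese, although the card does not describe the data that was used.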

	## Evaluation Snapshot

	Results for the research GGUFs combine the existing results with the latest rerun. Both were scored on the same curated 58-question bilingual benchmark.

	| Quant | Think | No-Think | Avg | Status |
	|---|---:|---:|---:|---|
	| Q3_K_M | 74.1% | 72.4% | 73.2% | Best current research quant |
	| IQ3_M | 60.3% | 60.3% | 60.3% | Heavy quality loss |
	| IQ2_M | 20.7% | 19.0% | 19.8% | Below usable threshold |
	| IQ2_XS | 5.2% | 3.4% | 4.3% | Triggered early-stop for lower bits |
	## Research Guidance

	- Public research recommendation: Q3_K_M only.
	- IQ3_M can still be uploaded for experiments, but its quality is clearly degraded.
	- The rerun stopped automatically after IQ2_XS because the average pass rate had fallen under 50%, so lower-bit quants should be treated as archival artifacts rather than viable deployments.
	- For any user-facing scenario, prefer the release bundle over this research branch.
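
The 50% figure in the guidance can be applied to the published averages as a simple filter. The sketch below is an illustration only, not the rerun's actual early-stop logic: the card does not specify how the cutoff was applied, and the function names are hypothetical.

```python
THRESHOLD = 50.0  # percent; the cutoff mentioned in the guidance

# Avg column copied from the Evaluation Snapshot table.
AVG_PASS_RATE = {
    "Q3_K_M": 73.2,
    "IQ3_M": 60.3,
    "IQ2_M": 19.8,
    "IQ2_XS": 4.3,
}


def split_by_threshold(avg_pass_rate, threshold=THRESHOLD):
    """Partition quant names into those at or above the cutoff and those below."""
    above = [q for q, score in avg_pass_rate.items() if score >= threshold]
    below = [q for q, score in avg_pass_rate.items() if score < threshold]
    return above, below


def best_quant(avg_pass_rate):
    """Return the quant with the highest average pass rate."""
    return max(avg_pass_rate, key=avg_pass_rate.get)


above, below = split_by_threshold(AVG_PASS_RATE)
print("at or above cutoff:", above)
print("below cutoff:", below)
print("highest average:", best_quant(AVG_PASS_RATE))
```

Under this reading, IQ3_M clears the cutoff but is still excluded from the public recommendation, which suggests the recommendation rests on more than the 50% figure alone.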

	For cross-family ranking and release-vs-research comparison, see `results/COMPARISON.md` in the workspace.