@@ -1,31 +1,95 @@
-# Acnoryx AI Research Bundle
-## Overview
-- Base model: Qwen/Qwen3.5-0.8B
-- Model size: 0.8b
-- Research quantizations: Q3_K_M, IQ3_M, Q2_K, IQ2_M, IQ2_XS, IQ2_XXS, IQ1_M, IQ1_S
-- Purpose: evaluate quality vs. size trade-offs below the production threshold
-## Notes
-- IQ1/IQ2 formats require an importance matrix (imatrix).
-- These files are more experimental than the release bundle.
-- Production-facing use should prefer the release bundle.
-- If prompting in Vietnamese, write with full accents for best consistency.
-## Evaluation snapshot
-- Dataset: curated `release_eval_v1-58` benchmark with 58 bilingual cases aligned to the training system style.
-- Protocol: quants were evaluated from higher quality to lower quality, in both `thinking` and `non-thinking` modes.
-- Early-stop rule: stop the downward sweep when the average of the two modes drops below 50%.
-- Stop point reached at `acnoryx-0.8b-iq2_m`: 27.6% thinking, 34.5% non-thinking, 31.1% average.
-- Because of that stop rule, lower research quants such as `Q2_K`, `IQ2_XS`, `IQ2_XXS`, `IQ1_M`, and `IQ1_S` were not executed in the combined all-model release gate.
-## Tested research results
-| Quant | Size | Thinking | Non-thinking | Avg | Interpretation |
-|---|---:|---:|---:|---:|---|
-| Q3_K_M | 445 MB | 81.0% | 79.3% | 80.2% | Smallest clearly usable research quant |
-| IQ3_M | 433 MB | 70.7% | 70.7% | 70.7% | Experimental but still above minimum viability |
-| IQ2_M | 360 MB | 27.6% | 34.5% | 31.1% | Below threshold, triggered early stop |

+---
+language:
+- en
+- vi
+license: other
+library_name: gguf
+pipeline_tag: text-generation
+base_model:
+- Qwen/Qwen3.5-0.8B
+tags:
+- Airy
+- dermatology
+- skincare
+- acne
+- gguf
+- quantization
+- research
+- bilingual
+---
+# Airy
+Airy is the public research branch of the Acnoryx dermatology model family. It contains the lower-bit GGUF experiments derived from Airy-Core-0.8B, with emphasis on size reduction, edge deployment trade-offs, and empirical failure tracking rather than production safety.
+## Scope
+- Public repo: `Acnoryx/Airy`
+- Production counterpart: `Acnoryx/Airy-Core-0.8B`
+- Base model: `Qwen/Qwen3.5-0.8B`
+- Languages: Vietnamese and English
+- Runtime format: GGUF for llama.cpp-style runtimes
+- Goal: identify the smallest research quant that remains meaningfully useful
+## Intended use
+- Quantization research
+- Edge-device experiments
+- Quality-vs-size benchmarking
+- Comparative analysis against the release branch
+## Not intended use
+- Default production deployment
+- Unsupervised medical advice
+- General-purpose assistant behavior outside skincare and dermatology
+## Evaluation protocol
+- Benchmark: `release_eval_v1-58`
+- Coverage: 58 bilingual cases across identity, knowledge, refusal, scan interpretation, language, format, hallucination, subtitle, rude-user handling, and medical-defer behavior
+- Modes: `thinking` and `non-thinking`
+- Sweep order: higher quality to lower quality
+- Early-stop rule: stop when the average of the two modes falls below 50%
+- Result: the sweep stopped at `Airy-0.8b-IQ2_M` with a 31.1% average, so the remaining research quants (`Q2_K`, `IQ2_XS`, `IQ2_XXS`, `IQ1_M`, `IQ1_S`) were not run in the combined release-gate run
+## Research vs release comparison
+| Branch | Representative quant | Size | Thinking | Non-thinking | Avg | Interpretation |
+|---|---|---:|---:|---:|---:|---|
+| Release | Airy-Core-0.8b-Q4_K_M | 505 MB | 93.1% | 82.8% | 87.9% | Best practical release balance |
+| Release | Airy-Core-0.8b-Q4_0 | 478 MB | 87.9% | 82.8% | 85.4% | Smallest release quant still solid |
+| Research | Airy-0.8b-Q3_K_M | 445 MB | 81.0% | 79.3% | 80.2% | Strongest research quant tested |
+| Research | Airy-0.8b-IQ3_M | 433 MB | 70.7% | 70.7% | 70.7% | Usable only for experimental work |
+| Research | Airy-0.8b-IQ2_M | 360 MB | 27.6% | 34.5% | 31.1% | Collapse point, triggered early stop |
+## Detailed research results
+| Public model file | Local source file | Size | Thinking | Non-thinking | Avg | Notes |
+|---|---|---:|---:|---:|---:|---|
+| Airy-0.8b-Q3_K_M.gguf | acnoryx-0.8b-q3_k_m.gguf | 445 MB | 81.0% | 79.3% | 80.2% | Smallest clearly useful research quant |
+| Airy-0.8b-IQ3_M.gguf | acnoryx-0.8b-iq3_m.gguf | 433 MB | 70.7% | 70.7% | 70.7% | Experimental floor before major degradation |
+| Airy-0.8b-IQ2_M.gguf | acnoryx-0.8b-iq2_m.gguf | 360 MB | 27.6% | 34.5% | 31.1% | Below threshold, not recommended |
+| Airy-0.8b-Q2_K.gguf | acnoryx-0.8b-q2_k.gguf | 403 MB | not run | not run | not run | Skipped after early stop |
+| Airy-0.8b-IQ2_XS.gguf | acnoryx-0.8b-iq2_xs.gguf | 347 MB | not run | not run | not run | Skipped after early stop |
+| Airy-0.8b-IQ2_XXS.gguf | acnoryx-0.8b-iq2_xxs.gguf | 336 MB | not run | not run | not run | Skipped after early stop |
+| Airy-0.8b-IQ1_M.gguf | acnoryx-0.8b-iq1_m.gguf | 323 MB | not run | not run | not run | Skipped after early stop |
+| Airy-0.8b-IQ1_S.gguf | acnoryx-0.8b-iq1_s.gguf | 315 MB | not run | not run | not run | Skipped after early stop |
+## Interpretation
+- `Airy-0.8b-Q3_K_M` (445 MB, 80.2% average) is the best public research checkpoint when file size matters more than absolute quality.
+- `Airy-0.8b-IQ3_M` (433 MB, 70.7% average) stays above the 50% threshold but scores 9.5 points below `Q3_K_M` in exchange for a 12 MB saving.
+- `Airy-0.8b-IQ2_M` (360 MB, 31.1% average) marks the quality collapse point for this benchmark.
+- If you need stable app-facing behavior, use the release branch instead of this repo.
+## Public naming
+Public Hugging Face files in this repository are published with the `Airy-...` prefix for clarity. The local workspace may still keep the original `acnoryx-...` filenames for build compatibility.
+## Notes
+- IQ1 and IQ2 formats require an importance matrix (imatrix) at quantization time.
+- These quants are more experimental than the release branch.
+- For Vietnamese prompts, use full accents for the most consistent behavior.
+- Outputs remain reference-only and do not replace dermatologist care.
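
The early-stop rule in the evaluation protocol above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual harness: `run_sweep`, `SweepResult`, `REPORTED`, and `SWEEP_ORDER` are hypothetical names, the evaluator is a stand-in that only looks up the scores published in the results table, and the sweep order is assumed to follow the row order of the detailed results table.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

THRESHOLD = 50.0  # percent; the early-stop threshold stated in the protocol

# Reported (thinking, non-thinking) scores from the results table, in percent.
REPORTED: Dict[str, Tuple[float, float]] = {
    "Q3_K_M": (81.0, 79.3),
    "IQ3_M": (70.7, 70.7),
    "IQ2_M": (27.6, 34.5),
}

# Assumption: the sweep order matches the row order of the detailed results table.
SWEEP_ORDER: List[str] = [
    "Q3_K_M", "IQ3_M", "IQ2_M", "Q2_K", "IQ2_XS", "IQ2_XXS", "IQ1_M", "IQ1_S",
]


@dataclass
class SweepResult:
    averages: Dict[str, float] = field(default_factory=dict)  # quant -> mean of both modes
    skipped: List[str] = field(default_factory=list)          # quants never evaluated


def run_sweep(
    order: Sequence[str],
    evaluate: Callable[[str], Tuple[float, float]],
    threshold: float = THRESHOLD,
) -> SweepResult:
    """Evaluate quants in order and stop once a two-mode average drops below threshold."""
    result = SweepResult()
    for index, quant in enumerate(order):
        thinking, non_thinking = evaluate(quant)
        average = (thinking + non_thinking) / 2
        result.averages[quant] = average
        if average < threshold:
            # The failing quant is still recorded; everything after it is skipped.
            result.skipped = list(order[index + 1:])
            break
    return result


# Stand-in evaluator: looks up published scores and raises KeyError if the
# sweep ever asks for a quant that was not run.
result = run_sweep(SWEEP_ORDER, REPORTED.__getitem__)

for quant, average in result.averages.items():
    print(f"{quant}: {average:.2f}% average")
print("skipped:", ", ".join(result.skipped))
```

With the published scores, the loop evaluates `Q3_K_M`, `IQ3_M`, and `IQ2_M`, then stops, leaving the five remaining quants unevaluated, which matches the "not run" rows in the table. The unrounded two-mode means (80.15, 70.70, 31.05) agree with the table's Avg column up to rounding.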