nhannt201 committed on
Commit 2009f9e · verified · 1 Parent(s): 36b8b8e

Update Airy research model card (0.8b)

Files changed (1):
  1. README.md +10 -120

README.md CHANGED
@@ -1,125 +1,15 @@
- ---
- language:
- - en
- - vi
- license: other
- library_name: gguf
- pipeline_tag: text-generation
- base_model:
- - Qwen/Qwen3.5-0.8B
- model-index:
- - name: Airy
- results:
- - task:
- type: text-generation
- metrics:
- - name: Q3_K_M Pass Rate (58 questions)
- type: custom
- value: 81.0
- - name: IQ3_M Pass Rate (58 questions)
- type: custom
- value: 70.7
- tags:
- - Airy
- - dermatology
- - skincare
- - acne
- - gguf
- - quantization
- - research
- - bilingual
- ---

- # Airy

- Airy is the public research branch of the Acnoryx dermatology model family. It contains lower-bit GGUF experiments focused on size reduction, edge deployment trade-offs, and empirical failure tracking rather than production safety.
-
- ## Model Overview
-
- - Model name: Airy
- - Public repo: `Acnoryx/Airy`
- - Base model: `Qwen/Qwen3.5-0.8B`
- - Languages: Vietnamese and English
- - Runtime format: GGUF for llama.cpp-style runtimes
- - Goal: identify the smallest research quant that still remains meaningfully useful
- - App: [Acnoryx on Google Play](https://play.google.com/store/apps/details?id=com.fivecanh.acnoryx)
-
- ## Intended use
-
- - Quantization research
- - Edge-device experiments
- - Quality-vs-size benchmarking
- - Comparative analysis against the release branch
-
- ## Not intended use
-
- - Default production deployment
- - Unsupervised medical advice
- - General-purpose assistant behavior outside skincare and dermatology
-
- ## Hugging Face Summary
-
- | Field | Value |
- |---|---|
- | Model | Airy |
- | Public repo | Acnoryx/Airy |
- | Base model | Qwen/Qwen3.5-0.8B |
- | Languages | English + Vietnamese |
- | Format | GGUF |
- | License | other |
- | Pipeline | text-generation |
- | App link | [Google Play](https://play.google.com/store/apps/details?id=com.fivecanh.acnoryx) |
-
- ## Evaluation protocol
-
- - Benchmark: `release_eval_v1-58`
- - Coverage: 58 bilingual cases across identity, knowledge, refusal, scan interpretation, language, format, hallucination, subtitle, rude-user handling, and medical-defer behavior
- - Modes: `thinking` and `non-thinking`
- - Sweep order: higher quality to lower quality
- - Early-stop rule: stop when the average of the 2 modes falls below 50%
- - Result: the sweep stopped at `Airy-0.8b-IQ2_M` with 31.1% average, so lower public research quants were not considered viable in the combined release-gate run
-
- ## Best Scores
-
- | Rank | Public model file | Size | Thinking | Non-thinking | Avg | Status |
- |---|---|---:|---:|---:|---:|---|
- | 1 | Airy-0.8b-Q3_K_M.gguf | 445 MB | 81.0% | 79.3% | 80.2% | Best viable research quant |
- | 2 | Airy-0.8b-IQ3_M.gguf | 433 MB | 70.7% | 70.7% | 70.7% | Experimental, still usable |
- | 3 | Airy-0.8b-IQ2_M.gguf | 360 MB | 27.6% | 34.5% | 31.1% | Threshold failure |
-
- ## Research vs release comparison
-
- | Branch | Representative quant | Size | Thinking | Non-thinking | Avg | Interpretation |
- |---|---|---:|---:|---:|---:|---|
- | Release | Q4_K_M | 505 MB | 93.1% | 82.8% | 87.9% | Best practical release balance |
- | Release | Q4_0 | 478 MB | 87.9% | 82.8% | 85.4% | Smallest release quant still solid |
- | Research | Airy-0.8b-Q3_K_M | 445 MB | 81.0% | 79.3% | 80.2% | Strongest research quant tested |
- | Research | Airy-0.8b-IQ3_M | 433 MB | 70.7% | 70.7% | 70.7% | Usable only for experimental work |
- | Research | Airy-0.8b-IQ2_M | 360 MB | 27.6% | 34.5% | 31.1% | Collapse point, triggered early stop |
-
- ## Detailed research results
-
- | Public model file | Local source file | Size | Thinking | Non-thinking | Avg | Notes |
- |---|---|---:|---:|---:|---:|---|
- | Airy-0.8b-Q3_K_M.gguf | gguf/Airy-0.8b-Q3_K_M.gguf | 445 MB | 81.0% | 79.3% | 80.2% | Smallest clearly useful research quant |
- | Airy-0.8b-IQ3_M.gguf | gguf/Airy-0.8b-IQ3_M.gguf | 433 MB | 70.7% | 70.7% | 70.7% | Experimental floor before major degradation |
- | Airy-0.8b-IQ2_M.gguf | gguf/Airy-0.8b-IQ2_M.gguf | 360 MB | 27.6% | 34.5% | 31.1% | Below threshold, not recommended |
- | Airy-0.8b-Q2_K.gguf | gguf/Airy-0.8b-Q2_K.gguf | 403 MB | not run | not run | not run | Skipped after early stop |
- | Airy-0.8b-IQ2_XS.gguf | gguf/Airy-0.8b-IQ2_XS.gguf | 347 MB | not run | not run | not run | Skipped after early stop |
- | Airy-0.8b-IQ2_XXS.gguf | gguf/Airy-0.8b-IQ2_XXS.gguf | 336 MB | not run | not run | not run | Skipped after early stop |
- | Airy-0.8b-IQ1_M.gguf | gguf/Airy-0.8b-IQ1_M.gguf | 323 MB | not run | not run | not run | Skipped after early stop |
- | Airy-0.8b-IQ1_S.gguf | gguf/Airy-0.8b-IQ1_S.gguf | 315 MB | not run | not run | not run | Skipped after early stop |
-
- ## Interpretation
-
- - `Airy-0.8b-Q3_K_M` is the best public research checkpoint when file size matters more than absolute quality.
- - `Airy-0.8b-IQ3_M` is still measurable but already loses substantial domain fidelity.
- - `Airy-0.8b-IQ2_M` marks the quality collapse point for this benchmark.
- - If you need stable app-facing behavior, use the release branch instead of this repo.
+ # Acnoryx AI Research Bundle
+
+ ## Overview
+
+ - Base model: Qwen/Qwen3.5-0.8B
+ - Model size: 0.8b
+ - Research quantizations: Q3_K_M, IQ3_M, Q2_K, IQ2_M, IQ2_XS, IQ2_XXS, IQ1_M, IQ1_S
+ - Purpose: evaluate quality vs. size trade-offs below the production threshold

  ## Notes

- - IQ1 and IQ2 formats require an importance matrix.
- - These quants are more experimental than the release branch.
- - For Vietnamese prompts, use full accents for the most consistent behavior.
- - Outputs remain reference-only and do not replace dermatologist care.
+ - IQ1/IQ2 formats require an importance matrix (imatrix).
+ - These files are more experimental than the release bundle.
+ - Production-facing use should prefer the release bundle.
+ - If prompting in Vietnamese, write with full accents for best consistency.
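The early-stop rule in the model card's evaluation protocol (sweep from higher to lower quality, halt when the two-mode average drops below 50%) can be sketched as follows. This is a minimal illustration, not code from the repo: `run_sweep` and the data layout are hypothetical names, while the scores are the published research numbers from the tables above.

```python
# Threshold from the protocol: stop when the average of the two modes
# (thinking / non-thinking) falls below 50%.
EARLY_STOP_THRESHOLD = 50.0

# (quant, thinking %, non-thinking %), ordered higher -> lower quality.
# Entries past the collapse point carry no scores: they were never run.
RESEARCH_SCORES = [
    ("Q3_K_M", 81.0, 79.3),
    ("IQ3_M", 70.7, 70.7),
    ("IQ2_M", 27.6, 34.5),
    ("Q2_K", None, None),
    ("IQ2_XS", None, None),
    ("IQ2_XXS", None, None),
    ("IQ1_M", None, None),
    ("IQ1_S", None, None),
]


def run_sweep(scores, threshold=EARLY_STOP_THRESHOLD):
    """Apply the early-stop rule; return (evaluated, skipped) quant lists."""
    evaluated, skipped = [], []
    stopped = False
    for quant, thinking, non_thinking in scores:
        if stopped or thinking is None:
            skipped.append(quant)
            continue
        avg = (thinking + non_thinking) / 2
        evaluated.append((quant, avg))
        if avg < threshold:
            stopped = True  # everything below this point is never run
    return evaluated, skipped


evaluated, skipped = run_sweep(RESEARCH_SCORES)
```

With these inputs the sweep evaluates Q3_K_M and IQ3_M, halts at IQ2_M (the ~31% average reported in the card), and skips the five lower quants, matching the "Skipped after early stop" rows in the detailed results table.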