nhannt201 committed (verified)
Commit 82a7317 · Parent: e5479b7

feat: upload airy acne assistant gguf for mobile
.gitattributes CHANGED
@@ -33,3 +33,16 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq1_m.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq1_s.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq2_m.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq2_s.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq2_xs.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq2_xxs.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq3_s.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq3_xs.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-iq3_xxs.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-q2_k_s.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-q2_k.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-tq1_0.gguf filter=lfs diff=lfs merge=lfs -text
+ gguf/airy-tq2_0.gguf filter=lfs diff=lfs merge=lfs -text
PROMPT_TEMPLATE.txt ADDED
@@ -0,0 +1,12 @@
+ Default chat template behavior:
+ - If there is no system message, the tokenizer injects a short default identity prompt for Acnoryx AI.
+ - Preferred languages: Vietnamese or English.
+ - Preferred use: chat mode with short, direct user prompts.
+
+ Example user prompts:
+ - Xin chào (Hello)
+ - Bạn là ai? (Who are you?)
+ - Da em nhiều mụn ẩn ở trán thì nên bắt đầu từ đâu? (My skin has a lot of hidden acne on the forehead; where should I start?)
+ - Hello, I have inflamed acne and dark marks. What should I prioritize?
+ - Kết quả quét của tôi: mụn cám (28%), mụn nang (19%), mụn mủ (35%). Hãy tóm tắt. (My scan results: comedones 28%, cystic acne 19%, pustules 35%. Please summarize.)
+ - Phân tích AI vùng má phải: mụn mủ (41%), thâm đỏ sau mụn (24%), mụn cám (20%), mụn nang (15%). (AI analysis of the right cheek: pustules 41%, post-acne redness 24%, comedones 20%, cystic acne 15%.)
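For context, a minimal Python sketch of how an app might assemble the scan-summary prompts listed above; the `scan_prompt` helper and its dict format are illustrative assumptions, not part of this repo:

```python
# Hypothetical helper: format Acnoryx scan percentages into the short
# Vietnamese summary-prompt style listed above.
def scan_prompt(region: str, findings: dict[str, int]) -> str:
    parts = ", ".join(f"{name} ({pct}%)" for name, pct in findings.items())
    return f"Phân tích AI vùng {region}: {parts}. Hãy tóm tắt."

print(scan_prompt("má phải", {"mụn mủ": 41, "thâm đỏ sau mụn": 24, "mụn cám": 20, "mụn nang": 15}))
```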
README.md ADDED
@@ -0,0 +1,177 @@
+ ---
+ language:
+ - vi
+ - en
+ license: apache-2.0
+ tags:
+ - gguf
+ - qwen3
+ - quantization
+ - llama-cpp
+ - acne
+ - skincare
+ base_model: Qwen/Qwen3.5-0.8B
+ model_type: qwen3
+ pipeline_tag: text-generation
+ library_name: gguf
+ ---
+
+ # Acnoryx/Airy GGUFs
+
+ GGUF pack for the fine-tuned Acnoryx acne-care assistant model, prepared for direct publishing under `Acnoryx/Airy`.
+
+ - Hugging Face repo: `Acnoryx/Airy`
+ - Product app: [Acnoryx - AI for Acne Care](https://play.google.com/store/apps/details?id=com.fivecanh.acnoryx)
+
+ ## Important scope
+
+ This repo is a **research snapshot**: it explores how far the model can be compressed while still keeping acceptable acne-domain quality.
+
+ It is **not** the production model used inside the Acnoryx app; the app may use different quantization, prompt format, runtime settings, and evaluation criteria.
+
+ ## What this repo contains
+
+ This repo focuses on a small set of high-value research GGUFs for Acnoryx/Airy. The most important published candidates are listed in the evaluation tables below.
+
+ ## Model Quality Summary
+
+ The tables below group the models by practical quality. Models with <50% thinking accuracy are separated into a “Low-quality / not recommended” section so the Core and Airy tables stay focused on the usable candidates.
+
+ ### Core (release floor)
+
+ | Model | Size | Thinking Pass | Thinking % | Non-Think Pass | Non-Think % | Think t/s | Non-Think t/s |
+ |---|---:|---:|---:|---:|---:|---:|---:|
+ | `acnoryx-0.8b-f16` | 1446 MB | 99/100 | 99.0% | 100/100 | 100.0% | 44 | 44 |
+ | `acnoryx-0.8b-q8_0` | 774 MB | 98/100 | 98.0% | 98/100 | 98.0% | 69 | 69 |
+ | `acnoryx-0.8b-q4_k_m` | 505 MB | 98/100 | 98.0% | 100/100 | 100.0% | 75 | 80 |
+ | `acnoryx-0.8b-iq4_nl` | 493 MB | 99/100 | 99.0% | 100/100 | 100.0% | 93 | 93 |
+ | `acnoryx-0.8b-iq4_xs` | 482 MB | 99/100 | 99.0% | 98/100 | 98.0% | 92 | 92 |
+ | `acnoryx-0.8b-q3_k_m` | 445 MB | 99/100 | 99.0% | 97/100 | 97.0% | 96 | 95 |
+ | `acnoryx-0.8b-iq3_m` | 433 MB | 91/100 | 91.0% | 96/100 | 96.0% | 88 | 88 |
+
+ ### Airy candidates (acceptable quality)
+
+ | Model | Size | Thinking Pass | Thinking % | Non-Think Pass | Non-Think % | Think t/s | Non-Think t/s |
+ |---|---:|---:|---:|---:|---:|---:|---:|
+ | `airy-iq3_s.gguf` | 415.6 MB | 100/100 | 100% | 100/100 | 100% | ~20 | ~20 |
+ | `airy-iq3_xs.gguf` | 408.4 MB | 100/100 | 100% | 98/100 | 98% | ~19 | ~19 |
+ | `airy-q2_k.gguf` | 377.4 MB | 95/100 | 95% | 97/100 | 97% | ~23 | ~23 |
+ | `airy-q2_k_s.gguf` | 370.2 MB | 98/100 | 98% | 93/100 | 93% | ~20 | ~20 |
+ | `airy-iq2_s.gguf` | 320.2 MB | 78/100 | 78% | 81/100 | 81% | ~21 | ~21 |
+ | `airy-iq2_xs.gguf` | 317.5 MB | 82/100 | 82% | 74/100 | 74% | ~23 | ~23 |
+
+ ### Low-quality / not recommended (thinking < 50%)
+
+ These models are included for transparency but are not recommended for deployment due to very low thinking accuracy.
+
+ | Model | Size | Thinking Pass | Thinking % | Non-Think Pass | Non-Think % | Think t/s | Non-Think t/s |
+ |---|---:|---:|---:|---:|---:|---:|---:|
+ | `airy-iq2_xxs.gguf` | 303.1 MB | 49/100 | 49% | 50/100 | 50% | ~21 | ~21 |
+ | `airy-iq2_m.gguf` | 334.3 MB | 44/100 | 44% | 48/100 | 48% | ~22 | ~22 |
+ | `airy-tq1_0.gguf` | 311.4 MB | 11/100 | 11% | 10/100 | 10% | (high) | (high) |
+ | `airy-iq1_m.gguf` | 285.5 MB | 14/100 | 14% | 14/100 | 14% | ~30 | ~30 |
+ | `airy-iq1_s.gguf` | 275.0 MB | 7/100 | 7% | 6/100 | 6% | ~30 | ~30 |
+
+ ## Key Takeaways
+
+ | Finding | Notes |
+ |---|---|
+ | `airy-iq3_s.gguf` is the best overall file | Highest quality; matches or exceeds the reference floor while being smaller. |
+ | `airy-q2_k.gguf` is the best value | Largest size reduction while keeping high accuracy. |
+ | `airy-q2_k_s.gguf` is the best aggressive option | Very high thinking accuracy with a modest non-thinking drop. |
+ | `airy-iq2_s.gguf` and `airy-iq2_xs.gguf` are the lower usable edge | The smallest usable models before quality drops sharply. |
+ | `airy-iq2_xxs.gguf` and below are not reliable | Performance and accuracy degrade too far for typical deployment. |
+ | Some files lack final benchmark records | `airy-iq3_xxs.gguf` and `airy-tq2_0.gguf` were generated but not merged into the final evaluation. |
+
+ ## Recommendations
+
+ | Recommendation | Model(s) | Why |
+ |---|---|---|
+ | Default (publish) | `airy-iq3_s.gguf` | Best quality for minimal size; safest drop-in. |
+ | Best compression/value | `airy-q2_k.gguf` | Great size reduction with minimal quality loss. |
+ | Aggressive small | `airy-q2_k_s.gguf` | Strong thinking mode; acceptable non-thinking. |
+ | Smallest still usable | `airy-iq2_s.gguf` | Smallest model that retains practical utility. |
+ | Archive / not recommended | `airy-iq2_xxs.gguf`, `airy-iq2_m.gguf`, `airy-tq1_0.gguf`, `airy-iq1_m.gguf`, `airy-iq1_s.gguf` | Included for transparency, not for deployment. |
+
+ ## How To Run
+
+ ### 1. Install Python packages
+
+ ```bash
+ pip install huggingface_hub llama-cpp-python
+ ```
+
+ `huggingface_hub` downloads the GGUF file from Hugging Face.
+
+ `llama-cpp-python` loads and runs the GGUF model in Python.
+
+ ### 2. Download a GGUF from Hugging Face in Python
+
+ ```python
+ from huggingface_hub import hf_hub_download
+
+ model_path = hf_hub_download(
+     repo_id="Acnoryx/Airy",
+     filename="airy-iq3_s.gguf",
+     local_dir="gguf",
+ )
+
+ print(model_path)
+ ```
+
+ ### 3. Run inference in Python
+
+ ```python
+ from huggingface_hub import hf_hub_download
+ from llama_cpp import Llama
+
+ model_path = hf_hub_download(
+     repo_id="Acnoryx/Airy",
+     filename="airy-iq3_s.gguf",
+     local_dir="gguf",
+ )
+
+ llm = Llama(
+     model_path=model_path,
+     n_ctx=4096,
+     n_gpu_layers=-1,
+     verbose=False,
+ )
+
+ result = llm.create_chat_completion(
+     messages=[
+         {"role": "system", "content": "You are Acnoryx AI, a dermatology assistant focused on acne and skincare."},
+         {"role": "user", "content": "What are blackheads?"},
+     ],
+     temperature=0.2,
+ )
+
+ print(result["choices"][0]["message"]["content"])
+ ```
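The chat template (see `chat_template.jinja` below) opens a `<think>` block for the assistant, so raw completions usually contain reasoning before a `</think>` tag. A minimal sketch, continuing from the snippet above, for keeping only the user-facing answer:

```python
# The template starts the assistant turn inside <think>, so strip the
# reasoning up to the closing tag (no-op if the tag never appears).
raw = result["choices"][0]["message"]["content"]
answer = raw.split("</think>", 1)[-1].strip()
print(answer)
```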
+
+ ### 4. Swap model files quickly
+
+ Just change the `filename` value; a helper sketch follows this list:
+
+ - `airy-iq3_s.gguf`
+ - `airy-q2_k.gguf`
+ - `airy-iq2_s.gguf`
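A small wrapper makes swapping a one-argument change. `load_airy` is a hypothetical convenience helper (not part of the repo), reusing the settings from the inference example above:

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Hypothetical helper: download any Airy quant and load it with the same
# settings as the inference example above.
def load_airy(filename: str) -> Llama:
    path = hf_hub_download(repo_id="Acnoryx/Airy", filename=filename, local_dir="gguf")
    return Llama(model_path=path, n_ctx=4096, n_gpu_layers=-1, verbose=False)

llm = load_airy("airy-q2_k.gguf")  # or "airy-iq3_s.gguf", "airy-iq2_s.gguf"
```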
+
+ ### 5. Quick pick guide
+
+ - Choose `airy-iq3_s.gguf` for the strongest overall result.
+ - Choose `airy-q2_k.gguf` for the best compression/value balance.
+ - Choose `airy-iq2_s.gguf` only if you need a much smaller file and can accept a visible quality drop.
+
+ ## Final Conclusion
+
+ The current Airy lineup shows that the model can drop below the old release-floor size while keeping strong acne-domain quality.
+
+ The best files in this repo are:
+
+ - `airy-iq3_s.gguf`
+ - `airy-iq3_xs.gguf`
+ - `airy-q2_k.gguf`
+ - `airy-q2_k_s.gguf`
+
+ Among them, `airy-iq3_s.gguf` is the safest default publish choice, while `airy-q2_k.gguf` is the best efficiency result.
chat_template.jinja ADDED
@@ -0,0 +1,29 @@
+ {%- set default_system = 'You are Acnoryx AI, a friendly dermatology assistant in the Acnoryx acne scanner app. Your expertise: acne types, causes, treatments, skincare routines. Only answer skincare/dermatology topics. Politely decline unrelated questions. You are NOT a doctor. Be clear, friendly, and helpful. Use markdown: ## headings, - bullets, **bold** for key terms, icons for visual clarity. For medical/skin analysis topics, always add a disclaimer note. Keep the answer clear, useful, and moderately detailed. Reason carefully before answering. Put your reasoning inside <think>...</think>, then provide the final user-facing answer after the closing tag.' -%}
+ {%- if messages|length == 0 or messages[0].role != 'system' -%}
+ {{- '<|im_start|>system
+ ' + default_system + '<|im_end|>
+ ' -}}
+ {%- endif -%}
+ {%- for message in messages -%}
+ {%- if message.role == 'assistant' and '</think>' in message.content -%}
+ {%- set parts = message.content.split('</think>') -%}
+ {%- set thinking = parts[0].replace('<think>', '').strip() -%}
+ {%- set answer = parts[1].strip() -%}
+ {{- '<|im_start|>assistant
+ <think>
+ ' + thinking + '
+ </think>
+
+ ' + answer + '<|im_end|>
+ ' -}}
+ {%- else -%}
+ {{- '<|im_start|>' + message.role + '
+ ' + message.content + '<|im_end|>
+ ' -}}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ {{- '<|im_start|>assistant
+ <think>
+ ' -}}
+ {%- endif -%}
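To sanity-check the template outside the app, here is a minimal sketch that renders it with plain `jinja2` (an assumption for local testing; it expects the file saved as `chat_template.jinja` in the working directory):

```python
from jinja2 import Template

# Render the chat template locally to inspect the exact prompt string.
# With no system message, the default Acnoryx AI identity is injected.
with open("chat_template.jinja", encoding="utf-8") as f:
    template = Template(f.read())

prompt = template.render(
    messages=[{"role": "user", "content": "Bạn là ai?"}],
    add_generation_prompt=True,
)
print(prompt)  # ends with the open assistant turn: <|im_start|>assistant / <think>
```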
gguf/airy-iq1_m.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c5d6deda2dff96a7c81dc7e2d7371a74f062fd0d5a45d1f005b39cce1f840e6b
+ size 299396448
gguf/airy-iq1_s.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5ed73a47d42cdb52594278b861dfced31c626aca64dd7ab939246e747e0abf95
+ size 288360288
gguf/airy-iq2_m.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f727572b5992b34888b9f48a1799cdac88baa754021813874eb9e9c89a9b51b6
+ size 350500704
gguf/airy-iq2_s.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:159f900fba886589df07bb00890a4b9aa3077e751cbbb485cc560d18ce576788
+ size 335785824
gguf/airy-iq2_xs.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5b7ed630bddcabd7af7a7a6fcad53baa5c5fcea81911aa443e712e8fadb9cea0
+ size 332898144
gguf/airy-iq2_xxs.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c9025f75968790a54b52578c92048e2203e7d615a96e56245df595099fbf79ae
+ size 317790048
gguf/airy-iq3_s.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:04720a6c9202d63f96f6bcb90d9ffa48c13534f2615bcaa82ca50bbb46778a7c
+ size 435774304
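Each block above is a git-lfs pointer: the `oid` records the SHA-256 of the real file. A minimal sketch for verifying a downloaded GGUF against it, using the `airy-iq3_s.gguf` oid from this commit:

```python
import hashlib

# Hash a file in 1 MiB chunks and compare with the LFS pointer's oid.
def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

expected = "04720a6c9202d63f96f6bcb90d9ffa48c13534f2615bcaa82ca50bbb46778a7c"
assert sha256_of("gguf/airy-iq3_s.gguf") == expected, "hash mismatch"
```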
gguf/airy-iq3_xs.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bcb901f60e12b3fff5e94d9cb9c9d9f6c23e761635cd63fdd67e1b017615d325
+ size 428254048
gguf/airy-iq3_xxs.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f9e1951d762635c33a0ea81554e29d086b984f3fff58ccd6d9d0cd1e560d804c
+ size 377644896
gguf/airy-q2_k.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bf2ebc1be1a88978d64875a9ca65e4bd256e895d2b998e03d64c9f4d20745e04
+ size 395778400
gguf/airy-q2_k_s.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f55c7b25071ebb8eb0cf599d758802f780139fe158f6b8d42b5c70c83d8b4784
+ size 388135264
gguf/airy-tq1_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f9a8e6f60503d6277c77e5a9fb106475030252b3ba90cc1f779191d8b700cbf5
+ size 326503264
gguf/airy-tq2_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:01cb2a675cc6c84666e0d55b910fb0753d404946cf098c983311f8b4f9599240
+ size 349828960