riverkan committed
Commit 4a99e73 · verified · 1 Parent(s): 4266a32

Update README.md

Files changed (1): README.md (+9 -9)
README.md CHANGED
@@ -6,11 +6,11 @@ base_model_relation: quantized
  pipeline_tag: text-generation
  tags:
  - chatllm.cpp
- - ggml
  - quantization
  - int4
  - int8
  - cpu-inference
+ - ggmm
  quantized_by: riverkan
  language:
  - en
@@ -27,7 +27,7 @@ language:

  Author and distribution: [Riverkan](https://riverkan.com)

- This repository provides CPU/GPU-friendly quantized builds of Ling‑Mini‑2.0 for [ChatLLM.cpp](https://github.com/foldl/chatllm.cpp). It is not a LLaMA model, is not affiliated with Meta, and does not use the LLaMA license. Files are distributed in ChatLLM.cpp’s GGML-based format (.bin), ready for local inference.
+ This repository provides CPU/GPU-friendly quantized builds of Ling‑Mini‑2.0 for [ChatLLM.cpp](https://github.com/foldl/chatllm.cpp). It is not a LLaMA model, is not affiliated with Meta, and does not use the LLaMA license. Files are distributed in ChatLLM.cpp’s GGMM-based format (.bin), ready for local inference.

  - Available quantizations: Q4_0 (int4), Q8_0 (int8)
  - Tested runtime: ChatLLM.cpp
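For a rough sense of what those two options cost on disk: in the classic ggml block layouts, Q4_0 works out to about 4.5 bits per weight and Q8_0 to about 8.5, so a quick estimate is parameter count × bits ÷ 8. A minimal sketch, assuming ChatLLM.cpp's Q4_0/Q8_0 use those layouts; the parameter count below is a placeholder, not the official Ling‑Mini‑2.0 figure:

```bash
# Back-of-the-envelope file-size estimate. Assumes classic ggml block layouts
# (~4.5 bits/weight for Q4_0, ~8.5 for Q8_0). PARAMS is a placeholder value.
PARAMS=16000000000
awk -v p="$PARAMS" 'BEGIN {
  printf "Q4_0: ~%.1f GB\nQ8_0: ~%.1f GB\n", p * 4.5 / 8 / 1e9, p * 8.5 / 8 / 1e9
}'
```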
@@ -38,7 +38,7 @@ Notes:

  ## ChatLLM.cpp Quantizations of Ling‑Mini‑2.0

- Quantized with the ChatLLM.cpp toolchain for GGML-format inference (.bin). These builds are intended for the ChatLLM.cpp runtime (CPU and optional GPU acceleration as provided by ChatLLM’s GGML backends). Use ChatLLM.cpp’s convert and run flow described below.
+ Quantized with the ChatLLM.cpp toolchain for GGMM-format inference (.bin). These builds are intended for the ChatLLM.cpp runtime (CPU and optional GPU acceleration as provided by ChatLLM’s GGMM backends). Use ChatLLM.cpp’s convert and run flow described below.

  Original (float) model: to be announced by Riverkan.

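The convert-and-run flow referenced above starts from a built ChatLLM.cpp checkout. A minimal build sketch, assuming the standard CMake steps from the chatllm.cpp repository (generator and output paths can vary by platform):

```bash
# Build the ChatLLM.cpp runtime from source; --recursive pulls in the
# vendored submodules the build expects.
git clone --recursive https://github.com/foldl/chatllm.cpp.git
cd chatllm.cpp
cmake -B build
cmake --build build -j --config Release
```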
@@ -80,7 +80,7 @@ No special tokens are required by the model itself; most UIs can just send user

  Notes:
  - File sizes depend on the base model size; check the release or hosting page for exact sizes.
- - These are GGML (.bin) files for ChatLLM.cpp, not GGUF.
+ - These are GGMM (.bin) files for ChatLLM.cpp, not GGUF.

  ## How to use with ChatLLM.cpp

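With a build in place and a quantized .bin downloaded, a session looks roughly like the sketch below. The binary path and the `-m` (model file) and `-i` (interactive mode) flags follow chatllm.cpp's published usage examples; treat them as assumptions and confirm against your build's help output:

```bash
# Hypothetical invocation: interactive chat with the Q4_0 build.
# Verify the binary location and flags with your build's --help.
./build/bin/main -i -m ./Ling‑Mini‑2.0‑Q4_0.bin
```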
@@ -163,19 +163,19 @@ pip install -U "huggingface_hub[cli]"

  Download a specific file:
  ```bash
- huggingface-cli download riverkan/ling-mini-2.0-GGML --include "Ling‑Mini‑2.0‑Q4_0.bin" --local-dir ./
+ huggingface-cli download RiverkanIT/Ling-mini-2.0-Quantized --include "Ling‑Mini‑2.0‑Q4_0.bin" --local-dir ./
  ```

  Or the Q8_0 build:
  ```bash
- huggingface-cli download riverkan/ling-mini-2.0-GGML --include "Ling‑Mini‑2.0‑Q8_0.bin" --local-dir ./
+ huggingface-cli download RiverkanIT/Ling-mini-2.0-Quantized --include "Ling‑Mini‑2.0‑Q8_0.bin" --local-dir ./
  ```

  Replace the model repo path with the actual hosting path if different.

  ## Building your own quant (optional)

- If you have the float/base weights and want to generate your own GGML quantized file for ChatLLM.cpp:
+ If you have the float/base weights and want to generate your own GGMM quantized file for ChatLLM.cpp:

  1) Install Python deps for ChatLLM.cpp’s conversion pipeline:
  ```bash
@@ -193,13 +193,13 @@ python convert.py -i /path/to/base/model -t q4_0 -o Ling‑Mini‑2.0‑Q4_0.bin
  ```

  Notes:
- - ChatLLM.cpp uses GGML-based .bin files (not GGUF).
+ - ChatLLM.cpp uses GGMM-based .bin files (not GGUF).
  - See ChatLLM.cpp docs for model-specific flags and supported architectures.

  ## Credits

  - Model and quantized distributions by Riverkan
- - Runtime and tooling: ChatLLM.cpp (thanks to the maintainers and the GGML community)
+ - Runtime and tooling: ChatLLM.cpp (thanks to the maintainers and the GGMM community)
  - Thanks to the InclusionAI team for their foundational work and support!
  - Everyone in the open-source LLM community who provided benchmarks, ideas, and tools

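The hunk header above carries the Q4_0 convert command; producing the Q8_0 file should only require a different `-t` target and output name. A sketch, assuming convert.py accepts q8_0 as a quantization type (check its help output to confirm):

```bash
# Q8_0 counterpart to the q4_0 convert command above. Assumes q8_0 is an
# accepted -t target in ChatLLM.cpp's convert.py; verify via its --help.
python convert.py -i /path/to/base/model -t q8_0 -o Ling‑Mini‑2.0‑Q8_0.bin
```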