inikitin committed · verified · commit 012805d · 1 parent: 61b5552

Update README.md

Files changed (1): README.md (+22 −31)

README.md CHANGED
@@ -144,7 +144,7 @@ This work validates Fortytwo’s thesis: **intelligence can scale horizontally t
 
 ---
 
-## 🔬 Research & References
+## Research & References
 
 - [Self-Supervised Inference of Agents in Trustless Environments](https://arxiv.org/abs/2409.08386) – *High-level overview of Fortytwo architecture*
 
@@ -172,47 +172,38 @@ To run a Fortytwo node or contribute your own models and fine-tunes, visit: [for
 
 ---
 
-## Inference Examples
-
-### Using `pipeline`
-
-```python
-from transformers import pipeline
-
-pipe = pipeline("text-generation", model="Fortytwo-Network/Strand-Rust-Coder-14B-v1")
-messages = [
-    {"role": "user", "content": "Write a Rust function that finds the first string longer than 10 characters in a vector."},
-]
-pipe(messages)
-```
-
-### Using Transformers Directly
-
-```python
-# Load model directly
-from transformers import AutoTokenizer, AutoModelForCausalLM
-
-tokenizer = AutoTokenizer.from_pretrained("Fortytwo-Network/Strand-Rust-Coder-14B-v1")
-model = AutoModelForCausalLM.from_pretrained("Fortytwo-Network/Strand-Rust-Coder-14B-v1")
-
-messages = [
-    {"role": "user", "content": "Write a Rust function that finds the first string longer than 10 characters in a vector."},
-]
-
-inputs = tokenizer.apply_chat_template(
-    messages,
-    add_generation_prompt=True,
-    tokenize=True,
-    return_dict=True,
-    return_tensors="pt",
-).to(model.device)
-
-outputs = model.generate(**inputs, max_new_tokens=40)
-print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
-```
+## GGUF Quantized Versions
+
+This repository provides **GGUF-format quantizations** of the model [Fortytwo-Network/Strand-Rust-Coder-14B-v1](https://huggingface.co/Fortytwo-Network/Strand-Rust-Coder-14B-v1), optimized for local inference using tools such as **llama.cpp**, **Jan**, **Ollama**, **LM Studio**, and other compatible runtimes.
+
+These quantizations significantly reduce memory requirements while preserving near-original accuracy, making deployment possible on a wide range of consumer hardware.
+
+| **Quantization** | **File Size** | **Bit Precision** | **Description** |
+|------------------|---------------|-------------------|-----------------|
+| **Q8_0**   | 15.7 GB | **8-bit** | Near-full precision, for the most demanding local inference |
+| **Q6_K**   | 12.1 GB | **6-bit** | Balanced performance and efficiency |
+| **Q5_K_M** | 10.5 GB | **5-bit** | Lightweight deployment with strong accuracy retention |
+| **Q4_K_M** | 8.99 GB | **4-bit** | Ultra-fast, compact variant for consumer GPUs and laptops |
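The file sizes in the quantization table follow a simple back-of-the-envelope formula: size ≈ parameter count × effective bits per weight / 8. The sketch below assumes a ~14.8 B parameter count and typical effective bits-per-weight figures for llama.cpp K-quants (per-block scales push them above the nominal bit width); none of these numbers are taken from this repository's metadata.

```python
# Rough GGUF file-size estimate: parameters * effective bits per weight / 8.
# K-quants store per-block scales, so effective bits exceed the nominal width
# (e.g. Q4_K_M is ~4.85 bits/weight, not 4.0). All figures are approximate.

def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 14.8e9  # assumed parameter count for a "14B" model

for name, bpw in [("Q8_0", 8.5), ("Q6_K", 6.56), ("Q5_K_M", 5.67), ("Q4_K_M", 4.85)]:
    print(f"{name}: ~{gguf_size_gb(N_PARAMS, bpw):.1f} GB")  # e.g. "Q8_0: ~15.7 GB"
```

With those assumptions the estimates land within roughly 1% of the listed file sizes, which is a useful sanity check when deciding which variant fits your RAM or VRAM.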
 
 
+---
+
+### Usage
+
+You can load the GGUF models with **llama.cpp** or compatible backends:
+
+```bash
+./main -m models/Strand-Rust-Coder-14B-v1.Q5_K_M.gguf -p "Write a Rust function that reads a file line by line."
 ```
 
+Or run interactively in **Jan**, **LM Studio**, or **Ollama** by simply importing the model.
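After a large (possibly resumed) download, it is worth checking that a file really is GGUF before pointing a runtime at it. Below is a minimal sketch of the fixed GGUF v3 header (little-endian: 4-byte magic `GGUF`, a `uint32` version, then `uint64` tensor and metadata key/value counts); the helper name is illustrative, not part of any library.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def read_gguf_header(stream):
    """Parse the fixed-size GGUF v3 header; returns (version, n_tensors, n_kv)."""
    if stream.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # version: uint32, tensor count: uint64, metadata key/value count: uint64
    version, n_tensors, n_kv = struct.unpack("<IQQ", stream.read(20))
    return version, n_tensors, n_kv

# Demo on a synthetic header (a real file carries its actual counts):
fake = io.BytesIO(struct.pack("<4sIQQ", GGUF_MAGIC, 3, 579, 42))
print(read_gguf_header(fake))  # (3, 579, 42)
```

Note also that recent llama.cpp releases ship the command-line binary as `llama-cli`; `./main` in the snippet above is the older name from earlier builds.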
+
 ---
 
+### License
+
+These quantized weights are distributed under the same **Apache 2.0 License** as the original model.
+
 **Fortytwo – An open, networked intelligence shaped collectively by its participants**
 
 Join the swarm: [fortytwo.network](https://fortytwo.network)