inikitin committed · verified · commit 012805d · 1 parent: 61b5552

Update README.md

Files changed (1): README.md (+22 −31)

README.md CHANGED
@@ -144,7 +144,7 @@ This work validates Fortytwo’s thesis: **intelligence can scale horizontally t
 
 ---
 
-## 🔬 Research & References
+## Research & References
 
 - [Self-Supervised Inference of Agents in Trustless Environments](https://arxiv.org/abs/2409.08386) – *High-level overview of Fortytwo architecture*
 
@@ -172,47 +172,38 @@ To run a Fortytwo node or contribute your own models and fine-tunes, visit: [for
 
 ---
 
-## Inference Examples
-
-### Using `pipeline`
-
-```python
-from transformers import pipeline
-
-pipe = pipeline("text-generation", model="Fortytwo-Network/Strand-Rust-Coder-14B-v1")
-messages = [
-    {"role": "user", "content": "Write a Rust function that finds the first string longer than 10 characters in a vector."},
-]
-pipe(messages)
-```
-
-### Using Transformers Directly
-
-```python
-# Load model directly
-from transformers import AutoTokenizer, AutoModelForCausalLM
-
-tokenizer = AutoTokenizer.from_pretrained("Fortytwo-Network/Strand-Rust-Coder-14B-v1")
-model = AutoModelForCausalLM.from_pretrained("Fortytwo-Network/Strand-Rust-Coder-14B-v1")
-
-messages = [
-    {"role": "user", "content": "Write a Rust function that finds the first string longer than 10 characters in a vector."},
-]
-
-inputs = tokenizer.apply_chat_template(
-    messages,
-    add_generation_prompt=True,
-    tokenize=True,
-    return_dict=True,
-    return_tensors="pt",
-).to(model.device)
-
-outputs = model.generate(**inputs, max_new_tokens=40)
-print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
-```
+## GGUF Quantized Versions
+
+This repository provides **GGUF-format quantizations** of the model [Fortytwo-Network/Strand-Rust-Coder-14B-v1](https://huggingface.co/Fortytwo-Network/Strand-Rust-Coder-14B-v1), optimized for local inference using tools such as **llama.cpp**, **Jan**, **Ollama**, **LM Studio**, and other compatible runtimes.
+
+These quantizations significantly reduce memory requirements while preserving near-original accuracy, making deployment possible on a wide range of consumer hardware.
+
+| **Quantization** | **File Size** | **Bit Precision** | **Description** |
+|------------------|---------------|-------------------|-----------------|
+| **Q8_0**   | 15.7 GB | **8-bit** | Near-full precision, for the most demanding local inference |
+| **Q6_K**   | 12.1 GB | **6-bit** | Balanced performance and efficiency |
+| **Q5_K_M** | 10.5 GB | **5-bit** | Lightweight deployment with strong accuracy retention |
+| **Q4_K_M** | 8.99 GB | **4-bit** | Ultra-fast, compact variant for consumer GPUs and laptops |
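The file sizes in the quantization table follow a simple back-of-the-envelope formula: size ≈ parameter count × effective bits per weight / 8. The sketch below assumes a ~14.8 B parameter count and typical effective bits-per-weight figures for llama.cpp K-quants (per-block scales push them above the nominal bit width); none of these numbers are taken from this repository's metadata.

```python
# Rough GGUF file-size estimate: parameters * effective bits per weight / 8.
# K-quants store per-block scales, so effective bits exceed the nominal width
# (e.g. Q4_K_M is ~4.85 bits/weight, not 4.0). All figures are approximate.

def gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate file size in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 14.8e9  # assumed parameter count for a "14B" model

for name, bpw in [("Q8_0", 8.5), ("Q6_K", 6.56), ("Q5_K_M", 5.67), ("Q4_K_M", 4.85)]:
    print(f"{name}: ~{gguf_size_gb(N_PARAMS, bpw):.1f} GB")  # e.g. "Q8_0: ~15.7 GB"
```

With those assumptions the estimates land within roughly 1% of the listed file sizes, which is a useful sanity check when deciding which variant fits your RAM or VRAM.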
 
 
+---
+
+### Usage
+
+You can load the GGUF models with **llama.cpp** or compatible backends:
+
+```bash
+./main -m models/Strand-Rust-Coder-14B-v1.Q5_K_M.gguf -p "Write a Rust function that reads a file line by line."
 ```
 
+Or run interactively in **Jan**, **LM Studio**, or **Ollama** by simply importing the model.
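After a large (possibly resumed) download, it is worth checking that a file really is GGUF before pointing a runtime at it. Below is a minimal sketch of the fixed GGUF v3 header (little-endian: 4-byte magic `GGUF`, a `uint32` version, then `uint64` tensor and metadata key/value counts); the helper name is illustrative, not part of any library.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file

def read_gguf_header(stream):
    """Parse the fixed-size GGUF v3 header; returns (version, n_tensors, n_kv)."""
    if stream.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # version: uint32, tensor count: uint64, metadata key/value count: uint64
    version, n_tensors, n_kv = struct.unpack("<IQQ", stream.read(20))
    return version, n_tensors, n_kv

# Demo on a synthetic header (a real file carries its actual counts):
fake = io.BytesIO(struct.pack("<4sIQQ", GGUF_MAGIC, 3, 579, 42))
print(read_gguf_header(fake))  # (3, 579, 42)
```

Note also that recent llama.cpp releases ship the command-line binary as `llama-cli`; `./main` in the snippet above is the older name from earlier builds.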
+
 ---
 
+### License
+
+These quantized weights are distributed under the same **Apache 2.0 License** as the original model.
+
 **Fortytwo – An open, networked intelligence shaped collectively by its participants**
 
 Join the swarm: [fortytwo.network](https://fortytwo.network)