It is trained on the high-quality FineWeb-Edu dataset.

- **Model Name:** RessAI Onner-300m
- **Organization:** RessAI
- **Architecture:** `RessAiForCausalLM`
- **Model Type:** `onner`
- **Parameters:** ~199.9 Million (0.20B)
- **Context Window:** 4,096 tokens
- **Vocabulary:** 128,256
- **Training Precision:** Bfloat16
- **License:** Apache 2.0

This model uses a custom configuration inspired by BERT-base sizing but with Llama-style components.

| Parameter | Value | Description |
|---|---|---|
| **KV Heads** | 2 | Grouped Query Attention (GQA 6:1) |
| **Intermediate Size** | 3,072 | MLP Width |
| **RoPE Theta** | 500,000 | Rotary Embeddings Base |
| **Max Sequence** | 4,096 | Context Length |

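As a sanity check on the stated parameter count, the table values can be combined with a few assumed dimensions in a back-of-the-envelope sketch. The hidden size (768), layer count (12), head count (12), and tied input/output embeddings below are assumptions drawn from the "BERT-base sizing" comparison, not confirmed by the card:

```python
# Back-of-the-envelope parameter and KV-cache estimate for Onner-300m.
# Hidden size (768), 12 layers, 12 heads, and tied embeddings are ASSUMED
# ("BERT-base sizing"); vocab, KV heads, intermediate size, and context
# length come from the card itself.
vocab, hidden, layers = 128_256, 768, 12
n_heads, kv_heads, head_dim = 12, 2, 64           # GQA 6:1
intermediate, max_seq = 3_072, 4_096

embed = vocab * hidden                            # shared with the LM head if tied
attn = 2 * hidden * (n_heads * head_dim)          # q_proj + o_proj
attn += 2 * hidden * (kv_heads * head_dim)        # k_proj + v_proj
mlp = 3 * hidden * intermediate                   # gate, up, down (Llama-style MLP)
total = embed + layers * (attn + mlp)             # norm weights omitted (negligible)
print(f"params ~ {total / 1e6:.0f}M")             # ~200M, consistent with "~199.9 Million"

# KV cache at full context in bf16 (2 bytes per value): K and V per layer.
kv_bytes = 2 * layers * kv_heads * head_dim * max_seq * 2
print(f"kv cache ~ {kv_bytes / 2**20:.0f} MiB")   # small, thanks to only 2 KV heads
```

Under these assumptions the count lands within rounding of the stated 0.20B, and the 2 KV heads keep the full-context KV cache in the tens of megabytes.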
## 💻 Usage

### Python Code (Transformers)

Since this model uses a custom architecture (`model_type: onner`), make sure you have a recent version of `transformers` installed and pass `trust_remote_code=True` when loading.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "RessAI/Onner-300m"

# 1. Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 2. Load Model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # Use float16 if bfloat16 is not supported
    device_map="auto",
    trust_remote_code=True,
)

# 3. Inference
prompt = "The future of artificial intelligence is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
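The `temperature` and `top_p` arguments above shape the sampling distribution at each decoding step. A minimal, self-contained sketch of what nucleus (top-p) sampling does to a toy logit vector (not the `transformers` implementation, which also handles batching and edge cases):

```python
import math

def nucleus_filter(logits, temperature=0.7, top_p=0.9):
    """Temperature-scale logits, then keep the smallest set of tokens
    whose cumulative probability reaches top_p; renormalize over them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    order = sorted(range(len(probs)), key=lambda i: -probs[i])
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    return {i: probs[i] / cum for i in kept}      # sample the next token from this subset

dist = nucleus_filter([2.0, 1.0, 0.5, -1.0])
print(dist)  # only the two most likely tokens survive the 0.9 cutoff here
```

Lower `temperature` sharpens the distribution before the cutoff, and lower `top_p` shrinks the candidate set; `do_sample=False` bypasses all of this and decodes greedily.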