RessAI committed on
Commit b37d57d · verified · 1 Parent(s): 6b70a03

Update README.md

Files changed (1)
  1. README.md +27 -25
README.md CHANGED
@@ -11,70 +11,72 @@ tags:
  - text-generation
  - onner
  ---
- # 🚀 RessAI-Ultra 2B
+ # 🚀 RessAI Onner-300m

- **RessAI-Ultra 2B** is a custom 2.56 billion-parameter language model built on the highly optimized `onner` architecture. Designed for deep reasoning and long-context understanding, it features a 128k context window and a "Deep & Dense" layer structure.
+ **Onner-300m** (internally `RessAI-Ultra-300M`) is a compact, high-efficiency language model designed for educational reasoning and lightweight deployment. With approximately **200 million parameters**, it follows a "Dense & Deep" philosophy scaled down for speed and accessibility.
+
+ It is trained on the high-quality [FineWeb-Edu](https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu) dataset, utilizing a custom architecture (`RessAiForCausalLM`) optimized for efficient inference.

  <div align="center">
- <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_logo_name.png" width="300"/>
+ <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers_logo_name.png" width="200"/>
  </div>

  ## 🔍 Model Details

- - **Model Name:** RessAI-Ultra 2B
+ - **Model Name:** RessAI Onner-300m
  - **Organization:** RessAI
- - **Architecture:** `RessAiForCausalLM` (Custom Llama-based structure)
+ - **Architecture:** `RessAiForCausalLM` (Custom Llama-style structure)
  - **Model Type:** `onner`
- - **Parameters:** ~2.56 Billion
- - **Context Window:** 131,072 tokens (128k)
+ - **Parameters:** ~199.9 Million (0.20B)
+ - **Context Window:** 4,096 tokens
+ - **Vocabulary:** 128,256 (Llama-3 compatible)
  - **Training Precision:** Bfloat16
  - **License:** Apache 2.0

  ## 🧠 Technical Specifications

- RessAI-Ultra utilizes a custom configuration designed for efficiency and long-range dependencies:
+ This model uses a custom configuration inspired by BERT-base sizing but with Llama's causal attention mechanisms:

  | Hyperparameter | Value | Description |
  | :--- | :--- | :--- |
- | **Hidden Size** | 2560 | Custom embedding dimension |
- | **Layers** | 32 | Deep network structure |
- | **Attention Heads** | 32 | Standard query heads |
- | **KV Heads** | 4 | Grouped Query Attention (GQA, 8:1 ratio) |
- | **Intermediate Size** | 7168 | Wide MLP for high capacity |
- | **RoPE Theta** | 2,000,000 | Enhanced for long-context stability |
- | **Vocab Size** | 128,256 | Llama-3 tokenizer compatibility |
+ | **Hidden Size** | 768 | Embedding dimension (compact) |
+ | **Layers** | 12 | Network depth |
+ | **Attention Heads** | 12 | Query heads |
+ | **KV Heads** | 2 | Grouped Query Attention (GQA, 6:1 ratio) |
+ | **Intermediate Size** | 3,072 | MLP width |
+ | **RoPE Theta** | 500,000 | Rotary embeddings base |
+ | **Max Sequence** | 4,096 | Context length |

  ## 💻 Usage

- Because this model uses a custom architecture type (`onner`) and configuration (`RessAiConfig`), you can load it using the standard `transformers` library.
-
- ### Python Code
+ ### Python Code (Transformers)
+
+ Since this model uses a custom architecture configuration (`onner`), make sure you have a recent version of `transformers` installed.

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch

- model_id = "RessAI/RessAI-Ultra"
+ model_id = "RessAI/Onner-300m"

- # Load Tokenizer
+ # 1. Load Tokenizer
  tokenizer = AutoTokenizer.from_pretrained(model_id)

- # Load Model
- # Note: Ensure you have the latest transformers version
+ # 2. Load Model
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
-     torch_dtype=torch.bfloat16,
+     torch_dtype=torch.bfloat16,  # Use float16 if bfloat16 is not supported
      device_map="auto",
-     trust_remote_code=True  # Required for the custom config/architecture
+     trust_remote_code=True
  )

- # Inference
+ # 3. Inference
  prompt = "The future of artificial intelligence is"
  inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

  outputs = model.generate(
      **inputs,
-     max_new_tokens=100,
+     max_new_tokens=50,
      temperature=0.7,
      top_p=0.9,
      do_sample=True
  )
  ```
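The extracted diff is cut off mid-`generate` call, but the parameter counts claimed in both versions of the card can still be checked against the listed hyperparameters. Below is a minimal sketch, assuming a standard Llama-style decoder layout (GQA attention, SwiGLU MLP with gate/up/down projections, RMSNorm) with tied input/output embeddings; the tied-embedding part is an assumption the cards do not state:

```python
def llama_param_count(vocab, hidden, layers, heads, kv_heads, intermediate):
    """Approximate parameter count of a Llama-style decoder-only model,
    assuming tied embeddings, GQA, SwiGLU MLP, and RMSNorm."""
    head_dim = hidden // heads
    attn = 2 * hidden * hidden + 2 * hidden * kv_heads * head_dim  # q/o proj + k/v proj (GQA)
    mlp = 3 * hidden * intermediate                                # gate, up, down projections
    norms = 2 * hidden                                             # two RMSNorm vectors per layer
    # token embeddings + decoder blocks + final norm
    return vocab * hidden + layers * (attn + mlp + norms) + hidden

# Old card: RessAI-Ultra 2B (hidden 2560, 32 layers, 32 Q / 4 KV heads, MLP 7168)
ultra = llama_param_count(128_256, 2560, 32, 32, 4, 7168)
# New card: Onner-300m (hidden 768, 12 layers, 12 Q / 2 KV heads, MLP 3072)
onner = llama_param_count(128_256, 768, 12, 12, 2, 3072)

print(f"Ultra: {ultra / 1e9:.2f}B")  # ≈ 2.56B, matching the old "~2.56 Billion" claim
print(f"Onner: {onner / 1e6:.1f}M")  # ≈ 200.0M, matching the new "~199.9 Million" claim
```

Both headline figures check out under these assumptions; notably, the ~199.9M figure only works with tied embeddings — untied embeddings would put the count near 298M, closer to the "300m" in the model's name.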