RishAILabs committed 88b3bf6 (verified) · 1 parent: 552ee30

Update README.md

Files changed (1): README.md (+87 −2)
tags:
- transformer
- pytorch
- causal-lm
- moe
- mixture-of-experts
- rish-ai-labs
---

# RLLM (Base Model)

## Model Description

RLLM is a base language model developed by **Rish AI Labs**, an applied artificial intelligence lab focused on LLMs, generative AI, AI consulting, and research.

The model uses a **Mixture of Experts (MoE)** architecture with 16 experts and top-2 routing, which scales capacity efficiently by activating only a subset of expert parameters per token. It was trained with identity-focused pretraining to establish a foundation for downstream tasks.

## Key Features

- **Architecture**: Transformer with MoE (16 experts, top-2 routing)
- **Parameters**: ~275M total
- **Training**: Identity-focused pretraining
- **Precision**: FP32 training, optimized for inference
- **Framework**: PyTorch + Transformers
## Intended Use

This base model serves as a foundation for:

- Fine-tuning on specific domains
- Research in efficient language model architectures
- Development of specialized AI applications
- Understanding MoE dynamics and scaling
## About Rish AI Labs

**Rish AI Labs** is an applied AI lab based in Bangalore, India, working on enterprise AI through research, applied solutions, and LLM-driven innovation. We focus on:

- **Applied AI Solutions**: Enterprise-grade AI implementations
- **Research**: Cutting-edge AI research and publications
- **LLM Development**: Large language model research and deployment
- **AI Consulting**: Expert guidance for AI transformation

### Mission

"Pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation."

### Contact

- Website: [rishailabs.com](https://rishailabs.com)
- Location: Bangalore, India
- Focus: Enterprise AI, LLMs, Generative AI, AI Research
## Model Architecture Details

- **Layers**: 12 transformer layers
- **Attention Heads**: 12
- **Hidden Size**: 768
- **Experts**: 16 (MoE)
- **Top-K Routing**: 2
- **Vocabulary**: 50,304 tokens
- **Sequence Length**: Configurable (trained on various lengths)
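The top-2 routing described in this card can be sketched numerically. The following is an illustrative NumPy toy, not RLLM's actual router: the hidden size is shrunk to 64, the gate and expert weights are random, and each "expert" is a single matrix. The expert count (16) and top-k (2) match the card, and the mechanism is the same: a gate scores every expert per token, only the two highest-scoring experts run, and their outputs are mixed with softmax-normalized gate weights.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 64        # toy size for the sketch (the real model uses 768)
NUM_EXPERTS = 16   # from the model card
TOP_K = 2          # from the model card

# Illustrative random parameters; in the real model these are learned.
gate_w = rng.normal(0.0, 0.02, (HIDDEN, NUM_EXPERTS))
expert_w = rng.normal(0.0, 0.02, (NUM_EXPERTS, HIDDEN, HIDDEN))  # toy one-matrix "experts"

def moe_forward(x):
    """x: (tokens, HIDDEN) -> mixed output plus the chosen expert indices."""
    logits = x @ gate_w                              # (tokens, NUM_EXPERTS)
    top = np.argsort(logits, axis=-1)[:, -TOP_K:]    # 2 best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, top[t]]
        w = np.exp(scores - scores.max())
        w /= w.sum()                                 # softmax over the 2 winners
        for weight, e in zip(w, top[t]):
            out[t] += weight * (x[t] @ expert_w[e])  # only 2 of 16 experts run
    return out, top

x = rng.normal(size=(4, HIDDEN))
y, routed = moe_forward(x)
print(y.shape, routed.shape)  # (4, 64) (4, 2)
```

Because only 2 of 16 experts execute per token, compute per token stays close to a dense model an eighth the size, which is the efficiency argument behind MoE scaling.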
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RishAILabs/RLLM-Base")
model = AutoModelForCausalLM.from_pretrained("RishAILabs/RLLM-Base")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # cap generated tokens, not total length
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Training Details

- **Dataset**: Identity-focused dataset for stable pretraining
- **Precision**: FP32 for training stability
- **Optimizer**: AdamW
- **Framework**: Custom Rish-Core training framework
- **Hardware**: Optimized for both CPU and GPU inference
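The AdamW optimizer listed above can be written out as a short update rule. This is a generic sketch of Adam with decoupled weight decay, not the Rish-Core training loop; the hyperparameters are the usual illustrative defaults.

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8, wd=0.01):
    """One AdamW update: Adam moment estimates plus decoupled weight decay."""
    m = b1 * m + (1 - b1) * grad           # first moment (EMA of gradients)
    v = b2 * v + (1 - b2) * grad ** 2      # second moment (EMA of squared grads)
    m_hat = m / (1 - b1 ** t)              # bias correction
    v_hat = v / (1 - b2 ** t)
    # Weight decay is applied directly to the weights, not folded into the gradient.
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * theta)
    return theta, m, v

# Smoke test: minimize f(x) = (x - 3)^2 starting from x = 0.
theta = np.array([0.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 5001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adamw_step(theta, grad, m, v, t, lr=0.05)
print(theta)  # settles near 3.0
```

The decoupled decay term (`wd * theta`) is what distinguishes AdamW from plain Adam with L2 regularization and is a common choice for stable transformer pretraining.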
## Limitations

- Base model: may require fine-tuning for specific tasks
- English-language focus
- Generated content should be reviewed for appropriateness
## Citation

If you use this model in your research, please cite:

---

*Developed by Rish AI Labs - Applied Artificial Intelligence & Research*