---
language: en
license: apache-2.0
tags:
- transformer
- pytorch
- causal-lm
- moe
- mixture-of-experts
- rish-ai-labs
---
# RLLM (Base Model)
## Model Description
RLLM is a base language model developed by **Rish AI Labs**, an applied artificial intelligence lab focused on LLMs, Generative AI, AI consulting, and research.
This model features a **Mixture of Experts (MoE)** architecture with 16 experts. Because only the top-2 experts are activated per token, compute per token stays well below what the total parameter count would suggest, enabling efficient scaling and expert specialization. It was trained using identity-focused pretraining to establish a strong foundation for downstream tasks.
## Key Features
- **Architecture**: Transformer with MoE (16 experts, top-2 routing)
- **Parameters**: ~275M total parameters
- **Training**: Identity-focused pretraining
- **Precision**: FP32 training, optimized for inference
- **Framework**: PyTorch + Transformers
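The top-2 routing mentioned above can be illustrated with a small sketch: each token's hidden state is projected onto one score per expert, the two highest-scoring experts are kept, and their gate weights are renormalized with a softmax. This is a toy NumPy illustration of the general technique, not the actual RLLM routing code; the shapes follow the card's dimensions (hidden size 768, 16 experts).

```python
import numpy as np

def top2_route(x, w_router):
    """Toy top-2 MoE router: pick the 2 highest-scoring experts per
    token and renormalize their gate weights (illustrative only)."""
    logits = x @ w_router                          # (tokens, num_experts)
    top2 = np.argsort(logits, axis=-1)[:, -2:]     # indices of the 2 best experts
    top2_logits = np.take_along_axis(logits, top2, axis=-1)
    # softmax over just the two selected experts
    e = np.exp(top2_logits - top2_logits.max(axis=-1, keepdims=True))
    gates = e / e.sum(axis=-1, keepdims=True)
    return top2, gates

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 768))     # 4 tokens, hidden size 768
w_router = rng.normal(size=(768, 16))  # router projection to 16 experts
experts, gates = top2_route(tokens, w_router)
```

Each token's output is then the gate-weighted sum of its two selected experts' FFN outputs, which is what keeps per-token compute roughly constant as the expert count grows.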
## Intended Use
This base model serves as a foundation for:
- Fine-tuning on specific domains
- Research in efficient language model architectures
- Development of specialized AI applications
- Understanding MoE dynamics and scaling
## About Rish AI Labs
**Rish AI Labs** is an applied artificial intelligence lab based in Bangalore, India. We focus on:
- **Applied AI Solutions**: Enterprise-grade AI implementations
- **Research**: Cutting-edge AI research and publications
- **LLM Development**: Large language model research and deployment
- **AI Consulting**: Expert guidance for AI transformation
### Mission
"Pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation."
### Contact
- Website: [rishailabs.com](https://rishailabs.com)
- Location: Bangalore, India
- Focus: Enterprise AI, LLMs, Generative AI, AI Research
## Model Architecture Details
- **Layers**: 12 transformer layers
- **Heads**: 12 attention heads
- **Hidden Size**: 768
- **Experts**: 16 (MoE)
- **Top-K Routing**: 2
- **Vocabulary**: 50,304 tokens
- **Sequence Length**: Configurable (trained on various lengths)
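A couple of back-of-envelope numbers follow directly from the table above (illustrative arithmetic only; the ~275M total includes the MoE expert layers, which are not broken down here):

```python
# Derived from the architecture table above
hidden_size = 768
num_heads = 12
vocab_size = 50_304

head_dim = hidden_size // num_heads           # per-head dimension: 64
embedding_params = vocab_size * hidden_size   # token-embedding matrix: ~38.6M

print(head_dim)          # 64
print(embedding_params)  # 38633472
```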
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("RishAILabs/RLLM-Base")
model = AutoModelForCausalLM.from_pretrained("RishAILabs/RLLM-Base")

# Tokenize a prompt and generate a continuation
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Training Details
- **Dataset**: Identity-focused dataset for stable pretraining
- **Precision**: FP32 for training stability
- **Optimization**: AdamW optimizer
- **Framework**: Custom Rish-Core training framework
- **Hardware**: Optimized for both CPU and GPU inference
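For reference, the AdamW update rule used in training decouples weight decay from the adaptive gradient term (Loshchilov & Hutter). The sketch below shows a single update step in NumPy; the hyperparameters are generic defaults, not the values used to train RLLM:

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update step (illustrative, generic hyperparameters)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2       # second-moment estimate
    m_hat = m / (1 - beta1**t)                  # bias correction
    v_hat = v / (1 - beta2**t)
    # weight decay is applied directly to the weights, decoupled from
    # the adaptive gradient term (this is what distinguishes AdamW from Adam)
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

theta = np.ones(3)
m, v = np.zeros(3), np.zeros(3)
theta, m, v = adamw_step(theta, np.array([0.1, -0.2, 0.3]), m, v, t=1)
```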
## Limitations
- Base model: not instruction-tuned, so fine-tuning is usually required for specific tasks
- English language focus
- Generated content should be reviewed for appropriateness
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{rishailabs_2026,
  author    = {RishAILabs},
  title     = {RLLM-Base (Revision 552ee30)},
  year      = 2026,
  url       = {https://huggingface.co/RishAILabs/RLLM-Base},
  doi       = {10.57967/hf/7560},
  publisher = {Hugging Face}
}
```
*Developed by Rish AI Labs - Applied Artificial Intelligence & Research*