---
language: en
license: apache-2.0
tags:
- transformer
- pytorch
- causal-lm
- moe
- mixture-of-experts
- rish-ai-labs
---
# RLLM (Base Model)
## Model Description
RLLM is a base language model developed by **Rish AI Labs**, an applied artificial intelligence lab focused on LLMs, Generative AI, AI consulting, and research.
This model features a **Mixture of Experts (MoE)** architecture with 16 experts and top-2 routing: each token is processed by only 2 of the 16 experts, adding model capacity without a proportional increase in per-token compute. It was trained using identity-focused pretraining to establish a strong foundation for downstream tasks.
## Key Features
- **Architecture**: Transformer with MoE (16 experts, top-2 routing)
- **Parameters**: ~275M total parameters
- **Training**: Identity-focused pretraining
- **Precision**: FP32 training, optimized for inference
- **Framework**: PyTorch + Transformers
## Intended Use
This base model serves as a foundation for:
- Fine-tuning on specific domains
- Research in efficient language model architectures
- Development of specialized AI applications
- Understanding MoE dynamics and scaling
## About Rish AI Labs
**Rish AI Labs** is pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation. Based in Bangalore, India, we focus on:
- **Applied AI Solutions**: Enterprise-grade AI implementations
- **Research**: Cutting-edge AI research and publications
- **LLM Development**: Large language model research and deployment
- **AI Consulting**: Expert guidance for AI transformation
### Mission
"Pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation."
### Contact
- Website: [rishailabs.com](https://rishailabs.com)
- Location: Bangalore, India
- Focus: Enterprise AI, LLMs, Generative AI, AI Research
## Model Architecture Details
- **Layers**: 12 transformer layers
- **Heads**: 12 attention heads
- **Hidden Size**: 768
- **Experts**: 16 (MoE)
- **Top-K Routing**: 2
- **Vocabulary**: 50,304 tokens
- **Sequence Length**: Configurable (trained on various lengths)
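To make the routing scheme above concrete, here is a minimal PyTorch sketch of a top-2 MoE feed-forward block. The dimensions match this card (hidden size 768, 16 experts, top-2 routing), but the class name, the 4x FFN expansion, and the router design are illustrative assumptions; the actual RLLM expert implementation is not published here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Illustrative top-2 mixture-of-experts feed-forward block (not RLLM's actual code)."""

    def __init__(self, hidden_size=768, num_experts=16, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router scores every token against every expert
        self.router = nn.Linear(hidden_size, num_experts)
        # Each expert is a standard FFN (4x expansion is an assumption)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, 4 * hidden_size),
                nn.GELU(),
                nn.Linear(4 * hidden_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):
        # x: (batch, seq, hidden)
        logits = self.router(x)                          # (B, S, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)   # keep the 2 best experts per token
        weights = F.softmax(weights, dim=-1)             # normalize over the selected experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[..., k] == e                  # tokens whose k-th choice is expert e
                if mask.any():
                    out[mask] += weights[..., k][mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = Top2MoE()
y = moe(torch.randn(2, 8, 768))
print(y.shape)  # torch.Size([2, 8, 768])
```

Only 2 of the 16 expert FFNs run per token, which is why an MoE model of this size can be cheaper at inference than a dense model with the same total parameter count.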
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RishAILabs/RLLM-Base")
model = AutoModelForCausalLM.from_pretrained("RishAILabs/RLLM-Base")

# Tokenize a prompt and generate up to 50 new tokens
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Training Details
- **Dataset**: Identity-focused dataset for stable pretraining
- **Precision**: FP32 for training stability
- **Optimization**: AdamW optimizer
- **Framework**: Custom Rish-Core training framework
- **Hardware**: Optimized for both CPU and GPU inference
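The FP32 + AdamW setup described above can be sketched as a minimal training loop. This is a hedged illustration only: the tiny stand-in model, the random token batch, and the learning rate are assumptions for demonstration, not the Rish-Core framework or RLLM's actual hyperparameters.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in next-token model (not RLLM): embed 8 tokens, predict the 9th
model = nn.Sequential(
    nn.Embedding(50304, 32),   # vocab size matches the card; dim 32 is a toy choice
    nn.Flatten(1),
    nn.Linear(8 * 32, 50304),
).float()                      # FP32, as used for training stability

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
loss_fn = nn.CrossEntropyLoss()

# Fake batch: 4 sequences of 9 random token ids (inputs = first 8, target = last)
tokens = torch.randint(0, 50304, (4, 9))
inputs, targets = tokens[:, :-1], tokens[:, -1]

for step in range(3):
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
```

Training in full FP32 avoids the loss-scaling machinery that mixed-precision training needs, at the cost of higher memory use.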
## Limitations
- This is a base model; it may require fine-tuning for specific tasks
- Trained primarily on English text
- Generated content should be reviewed for appropriateness
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{rishailabs_2026,
  author    = {RishAILabs},
  title     = {RLLM-Base (Revision 552ee30)},
  year      = {2026},
  url       = {https://huggingface.co/RishAILabs/RLLM-Base},
  doi       = {10.57967/hf/7560},
  publisher = {Hugging Face}
}
```
*Developed by Rish AI Labs - Applied Artificial Intelligence & Research*