---
language: en
license: apache-2.0
tags:
- transformer
- pytorch
- causal-lm
- moe
- mixture-of-experts
- rish-ai-labs
---
# RLLM (Base Model)
## Model Description
RLLM is a base language model developed by **Rish AI Labs**, an applied artificial intelligence lab focused on LLMs, Generative AI, AI consulting, and research.
This model features a **Mixture of Experts (MoE)** architecture with 16 experts. Because only the top-2 experts are activated per token, compute per token stays well below what the total parameter count would suggest, enabling efficient scaling and expert specialization. It was trained using identity-focused pretraining to establish a strong foundation for downstream tasks.
## Key Features
- **Architecture**: Transformer with MoE (16 experts, top-2 routing)
- **Parameters**: ~275M total parameters
- **Training**: Identity-focused pretraining
- **Precision**: FP32 training, optimized for inference
- **Framework**: PyTorch + Transformers
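The top-2 routing mentioned above can be illustrated with a small sketch: each token's hidden state is projected onto one score per expert, the two highest-scoring experts are kept, and their gate weights are renormalized with a softmax. This is a toy NumPy illustration of the general technique, not the actual RLLM routing code; the shapes follow the card's dimensions (hidden size 768, 16 experts).

```python
import numpy as np

def top2_route(x, w_router):
    """Toy top-2 MoE router: pick the 2 highest-scoring experts per
    token and renormalize their gate weights (illustrative only)."""
    logits = x @ w_router                          # (tokens, num_experts)
    top2 = np.argsort(logits, axis=-1)[:, -2:]     # indices of the 2 best experts
    top2_logits = np.take_along_axis(logits, top2, axis=-1)
    # softmax over just the two selected experts
    e = np.exp(top2_logits - top2_logits.max(axis=-1, keepdims=True))
    gates = e / e.sum(axis=-1, keepdims=True)
    return top2, gates

rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 768))     # 4 tokens, hidden size 768
w_router = rng.normal(size=(768, 16))  # router projection to 16 experts
experts, gates = top2_route(tokens, w_router)
```

Each token's output is then the gate-weighted sum of its two selected experts' FFN outputs, which is what keeps per-token compute roughly constant as the expert count grows.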
## Intended Use
This base model serves as a foundation for:
- Fine-tuning on specific domains
- Research in efficient language model architectures
- Development of specialized AI applications
- Understanding MoE dynamics and scaling
## About Rish AI Labs
**Rish AI Labs** is an applied artificial intelligence lab based in Bangalore, India. We focus on:
- **Applied AI Solutions**: Enterprise-grade AI implementations
- **Research**: Cutting-edge AI research and publications
- **LLM Development**: Large language model research and deployment
- **AI Consulting**: Expert guidance for AI transformation
### Mission
"Pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation."
### Contact
- Website: [rishailabs.com](https://rishailabs.com)
- Location: Bangalore, India
- Focus: Enterprise AI, LLMs, Generative AI, AI Research
## Model Architecture Details
- **Layers**: 12 transformer layers
- **Heads**: 12 attention heads
- **Hidden Size**: 768
- **Experts**: 16 (MoE)
- **Top-K Routing**: 2
- **Vocabulary**: 50,304 tokens
- **Sequence Length**: Configurable (trained on various lengths)
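A couple of back-of-envelope numbers follow directly from the table above (illustrative arithmetic only; the ~275M total includes the MoE expert layers, which are not broken down here):

```python
# Derived from the architecture table above
hidden_size = 768
num_heads = 12
vocab_size = 50_304

head_dim = hidden_size // num_heads           # per-head dimension: 64
embedding_params = vocab_size * hidden_size   # token-embedding matrix: ~38.6M

print(head_dim)          # 64
print(embedding_params)  # 38633472
```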
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("RishAILabs/RLLM-Base")
model = AutoModelForCausalLM.from_pretrained("RishAILabs/RLLM-Base")

# Tokenize a prompt and generate a continuation
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
## Training Details
- **Dataset**: Identity-focused dataset for stable pretraining
- **Precision**: FP32 for training stability
- **Optimization**: AdamW optimizer
- **Framework**: Custom Rish-Core training framework
- **Hardware**: Optimized for both CPU and GPU inference
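For reference, the AdamW update rule used in training decouples weight decay from the adaptive gradient term (Loshchilov & Hutter). The sketch below shows a single update step in NumPy; the hyperparameters are generic defaults, not the values used to train RLLM:

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update step (illustrative, generic hyperparameters)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad**2       # second-moment estimate
    m_hat = m / (1 - beta1**t)                  # bias correction
    v_hat = v / (1 - beta2**t)
    # weight decay is applied directly to the weights, decoupled from
    # the adaptive gradient term (this is what distinguishes AdamW from Adam)
    theta = theta - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * theta)
    return theta, m, v

theta = np.ones(3)
m, v = np.zeros(3), np.zeros(3)
theta, m, v = adamw_step(theta, np.array([0.1, -0.2, 0.3]), m, v, t=1)
```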
## Limitations
- Base model: not instruction-tuned, so fine-tuning is usually required for specific tasks
- English language focus
- Generated content should be reviewed for appropriateness
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{rishailabs_2026,
  author    = {RishAILabs},
  title     = {RLLM-Base (Revision 552ee30)},
  year      = 2026,
  url       = {https://huggingface.co/RishAILabs/RLLM-Base},
  doi       = {10.57967/hf/7560},
  publisher = {Hugging Face}
}
```
*Developed by Rish AI Labs - Applied Artificial Intelligence & Research*