---
language: en
license: apache-2.0
tags:
- transformer
- pytorch
- causal-lm
- moe
- mixture-of-experts
- rish-ai-labs
---

# RLLM (Base Model)

## Model Description

RLLM is a base language model developed by **Rish AI Labs**, an applied artificial intelligence lab focused on LLMs, generative AI, AI consulting, and research.

The model uses a **Mixture of Experts (MoE)** architecture with 16 experts and top-2 routing, so only a fraction of the network's parameters is active for each token. It was trained with identity-focused pretraining to establish a foundation for downstream tasks.

## Key Features

- **Architecture**: Transformer with MoE (16 experts, top-2 routing)
- **Parameters**: ~275M total
- **Training**: Identity-focused pretraining
- **Precision**: FP32 training, optimized for inference
- **Framework**: PyTorch + Transformers

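Top-2 routing means that, for each token, a small router scores all 16 experts and only the two highest-scoring ones run; their outputs are mixed by softmax-renormalized gate weights. A minimal sketch of that selection step (illustrative only, not the model's actual router code):

```python
import numpy as np

def top2_route(logits):
    """Pick the 2 highest-scoring experts and renormalize their gate weights.

    `logits` is one token's router score per expert. Returns the chosen
    expert indices (best first) and their mixing weights, which sum to 1.
    """
    top2 = np.argsort(logits)[-2:][::-1]            # indices of the two best experts
    gates = np.exp(logits[top2] - logits[top2].max())
    gates = gates / gates.sum()                      # softmax over the selected pair
    return top2, gates

# One token's router scores over 16 experts:
rng = np.random.default_rng(0)
logits = rng.normal(size=16)
experts, weights = top2_route(logits)
# Only 2 of the 16 expert FFNs run for this token; their outputs are
# combined as weights[0] * expert_out[experts[0]] + weights[1] * expert_out[experts[1]].
```

This is what makes MoE scaling efficient: total parameter count grows with the number of experts, while per-token compute grows only with the top-k.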
## Intended Use

This base model serves as a foundation for:

- Fine-tuning on specific domains
- Research in efficient language model architectures
- Development of specialized AI applications
- Understanding MoE dynamics and scaling

## About Rish AI Labs

**Rish AI Labs** is an applied artificial intelligence lab based in Bangalore, India. We focus on:

- **Applied AI Solutions**: Enterprise-grade AI implementations
- **Research**: Cutting-edge AI research and publications
- **LLM Development**: Large language model research and deployment
- **AI Consulting**: Expert guidance for AI transformation

### Mission

"Pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation."

### Contact

- Website: [rishailabs.com](https://rishailabs.com)
- Location: Bangalore, India
- Focus: Enterprise AI, LLMs, Generative AI, AI Research

## Model Architecture Details

- **Layers**: 12 transformer layers
- **Attention Heads**: 12
- **Hidden Size**: 768
- **Experts**: 16 (MoE)
- **Top-K Routing**: 2
- **Vocabulary**: 50,304 tokens
- **Sequence Length**: Configurable (trained on various lengths)

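Two quantities follow directly from the numbers above: the per-head attention dimension and the fraction of experts a token activates. A quick sketch (variable names are ours, not the model's config keys):

```python
# Dimensions as listed in the model card.
hidden_size = 768
num_heads = 12
num_experts = 16
top_k = 2

# Standard multi-head attention splits the hidden size evenly across heads:
assert hidden_size % num_heads == 0
head_dim = hidden_size // num_heads  # 64

# Under top-2 routing over 16 experts, each token runs only a small
# fraction of the expert FFNs in every MoE layer:
active_fraction = top_k / num_experts  # 0.125
```
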
## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RishAILabs/RLLM-Base")
model = AutoModelForCausalLM.from_pretrained("RishAILabs/RLLM-Base")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
# max_new_tokens bounds only the generated continuation, not prompt + output
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

- **Dataset**: Identity-focused dataset for stable pretraining
- **Precision**: FP32 for training stability
- **Optimization**: AdamW optimizer
- **Framework**: Custom Rish-Core training framework
- **Inference**: Runs on both CPU and GPU

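The card lists AdamW, whose distinguishing feature is decoupled weight decay: the decay is applied directly to the parameter rather than folded into the gradient. A single-scalar sketch of the textbook update rule (illustrative only, not Rish-Core's actual training loop):

```python
import math

def adamw_step(theta, grad, state, lr=1e-3, beta1=0.9, beta2=0.999,
               eps=1e-8, weight_decay=0.01):
    """One AdamW update for a single scalar parameter."""
    state["t"] += 1
    # Exponential moving averages of the gradient and squared gradient:
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized moments:
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    # Decoupled weight decay: applied to theta itself, not to the gradient.
    return theta - lr * (m_hat / (math.sqrt(v_hat) + eps) + weight_decay * theta)

state = {"t": 0, "m": 0.0, "v": 0.0}
theta = adamw_step(1.0, 0.1, state)  # parameter moves against the gradient
```
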
## Limitations

- Base model: typically requires fine-tuning for specific tasks
- Primarily English-language
- Generated content should be reviewed for appropriateness

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{rishailabs_2026,
  author    = {RishAILabs},
  title     = {RLLM-Base (Revision 552ee30)},
  year      = 2026,
  url       = {https://huggingface.co/RishAILabs/RLLM-Base},
  doi       = {10.57967/hf/7560},
  publisher = {Hugging Face}
}
```

---

*Developed by Rish AI Labs - Applied Artificial Intelligence & Research*