---
language: en
license: apache-2.0
tags:
- transformer
- pytorch
- causal-lm
- moe
- mixture-of-experts
- rish-ai-labs
---

# RLLM (Base Model)

## Model Description

RLLM is a base language model developed by **Rish AI Labs**, an applied artificial intelligence lab focused on LLMs, generative AI, AI consulting, and research. The model uses a **Mixture of Experts (MoE)** architecture with 16 experts, providing efficient scaling and specialization capabilities. It was trained using identity-focused pretraining to establish a strong foundation for downstream tasks.

## Key Features

- **Architecture**: Transformer with MoE (16 experts, top-2 routing)
- **Parameters**: ~275M total parameters
- **Training**: Identity-focused pretraining
- **Precision**: FP32 training, optimized for inference
- **Framework**: PyTorch + Transformers

## Intended Use

This base model serves as a foundation for:

- Fine-tuning on specific domains
- Research into efficient language model architectures
- Development of specialized AI applications
- Understanding MoE dynamics and scaling

## About Rish AI Labs

**Rish AI Labs** is pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation. Based in Bangalore, India, we focus on:

- **Applied AI Solutions**: Enterprise-grade AI implementations
- **Research**: Cutting-edge AI research and publications
- **LLM Development**: Large language model research and deployment
- **AI Consulting**: Expert guidance for AI transformation

### Mission

"Pioneering the future of Enterprise AI through research, applied solutions, and LLM-driven innovation."
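The 16-expert, top-2 routing listed in the key features can be sketched as a minimal PyTorch layer: a linear router scores each token, the two best-scoring experts process it, and their outputs are combined with softmax-renormalized weights. This is a generic sketch, not RLLM's actual implementation; the expert feed-forward width (3072, i.e. 4x the hidden size) and the GELU activation are assumptions not stated in this card.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Minimal top-2 Mixture-of-Experts feed-forward layer (illustrative only)."""

    def __init__(self, hidden_size=768, num_experts=16, ffn_size=3072, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Router assigns a score per expert for every token.
        self.router = nn.Linear(hidden_size, num_experts)
        # Each expert is an ordinary two-layer feed-forward block.
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(hidden_size, ffn_size),
                nn.GELU(),
                nn.Linear(ffn_size, hidden_size),
            )
            for _ in range(num_experts)
        )

    def forward(self, x):  # x: (num_tokens, hidden_size)
        logits = self.router(x)                         # (tokens, num_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)  # keep the 2 best experts per token
        weights = F.softmax(weights, dim=-1)            # renormalize over just the top-2
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e                             # (tokens, top_k) hits for expert e
            token_ids, slot = mask.nonzero(as_tuple=True)
            if token_ids.numel():
                # Weighted contribution of expert e to the tokens routed to it.
                out[token_ids] += weights[token_ids, slot, None] * expert(x[token_ids])
        return out

layer = Top2MoE()
y = layer(torch.randn(4, 768))
```

Because only 2 of the 16 expert FFNs run per token, the per-token compute stays close to that of a dense model with a single FFN, while the parameter count scales with the number of experts.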
### Contact

- Website: [rishailabs.com](https://rishailabs.com)
- Location: Bangalore, India
- Focus: Enterprise AI, LLMs, Generative AI, AI Research

## Model Architecture Details

- **Layers**: 12 transformer layers
- **Heads**: 12 attention heads
- **Hidden Size**: 768
- **Experts**: 16 (MoE)
- **Top-K Routing**: 2
- **Vocabulary**: 50,304 tokens
- **Sequence Length**: Configurable (trained on various lengths)

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RishAILabs/RLLM-Base")
model = AutoModelForCausalLM.from_pretrained("RishAILabs/RLLM-Base")

inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Training Details

- **Dataset**: Identity-focused dataset for stable pretraining
- **Precision**: FP32 for training stability
- **Optimization**: AdamW optimizer
- **Framework**: Custom Rish-Core training framework
- **Hardware**: Optimized for both CPU and GPU inference

## Limitations

- Base model: may require fine-tuning for specific tasks
- English language focus
- Generated content should be reviewed for appropriateness

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{rishailabs_2026,
  author    = {RishAILabs},
  title     = {RLLM-Base (Revision 552ee30)},
  year      = 2026,
  url       = {https://huggingface.co/RishAILabs/RLLM-Base},
  doi       = {10.57967/hf/7560},
  publisher = {Hugging Face}
}
```

---

*Developed by Rish AI Labs - Applied Artificial Intelligence & Research*