---
license: mit
library_name: transformers
---

# Nepalaya-R

Nepalaya-R is a large language model project with full source code, configuration files, and deployment tooling for local and Hugging Face use.

## About This Model

This repository contains the Nepalaya-R model implementation, including:

- ✅ Full source code and inference implementations
- ✅ Tokenizer configuration adapted for Nepalaya-R
- ✅ Easy-to-use inference scripts
- ✅ Documentation and setup guides

## Quick Start

### Installation

```bash
pip install -r requirements.txt
```

### Download & Setup

**Option 1: Download from Hugging Face**

```bash
export HF_TOKEN=your_token
python download_model.py --model-id your-username/Nepalaya-R --local-dir ./model_weights
```

**Option 2: Run quick inference**

```bash
python quick_inference.py --prompt "Your prompt here"
```

### Mirror Setup

To create your own mirror of the Nepalaya-R repository:

```bash
export HF_TOKEN=your_token
python mirror_to_hf.py \
  --source source-org/source-model \
  --dest your-username/Nepalaya-R
```

## Documentation

- **[SETUP.md](SETUP.md)** - Detailed setup and configuration guide
- **[GITHUB_DEPLOY.md](GITHUB_DEPLOY.md)** - Deployment instructions
- **[inference/README.md](inference/README.md)** - Inference code documentation

## Model Architecture

Nepalaya-R architecture summary:

- **Parameters:** 671B
- **Context Length:** Extended via sparse attention
- **Training:** Sparse-attention training pipeline
- **Architecture:** Optimized transformer with mixture-of-experts (MoE)
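
To make the sparse-attention idea concrete, here is a minimal sketch (not Nepalaya-R's actual attention kernel) of a causal sliding-window mask, where each query token may attend only to itself and the previous few positions:

```python
def sliding_window_mask(seq_len, window):
    """Causal attention mask where query position q may attend only to
    key positions k with q - window < k <= q. True = attend, False = masked."""
    return [
        [0 <= q - k < window for k in range(seq_len)]
        for q in range(seq_len)
    ]

# With window=3, token 4 attends only to positions 2, 3, and 4:
mask = sliding_window_mask(6, window=3)
print(mask[4])  # [False, False, True, True, True, False]
```

Restricting attention this way cuts the per-layer cost from O(n²) to O(n·w) in sequence length n, which is what makes extended context lengths practical.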

## Key Features

- Multi-expert routing for efficient inference
- Sparse attention for long-context processing
- Chat template support
- Distributed inference capabilities
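
As an illustration of multi-expert routing, here is a generic top-k gating sketch (an assumption for exposition, not Nepalaya-R's actual router): a router scores all experts per token, keeps only the top k, and renormalizes their weights so each token is processed by a small subset of experts.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_top_k(gate_logits, k=2):
    """Pick the top-k experts for one token and renormalize their weights
    so the selected gate weights sum to 1."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# One token's hypothetical router logits over 4 experts:
print(route_top_k([2.0, 0.5, 1.5, -1.0], k=2))  # experts 0 and 2 are selected
```

Because only k of the experts run per token, inference FLOPs scale with the active parameters rather than the full 671B parameter count.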

## System Requirements

- **GPU Memory:** 48GB+ VRAM recommended
- **RAM:** 64GB+ system memory
- **Storage:** ~300GB for the full model weights; fast SSD recommended
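
As a back-of-envelope check, raw weight storage is roughly parameter count × bytes per parameter; for 671B parameters, the ~300GB figure above is consistent with roughly 4-bit quantized weights (an assumption, since the card does not state the storage dtype):

```python
def weight_storage_gb(n_params, bytes_per_param):
    """Approximate size of the raw weights in decimal GB (ignores KV cache,
    activations, and optimizer state)."""
    return n_params * bytes_per_param / 1e9

N = 671e9  # parameter count stated above
for fmt, b in [("fp16/bf16", 2), ("fp8/int8", 1), ("4-bit", 0.5)]:
    print(f"{fmt}: ~{weight_storage_gb(N, b):.0f} GB")
```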

## Usage Examples

### Basic Generation

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "your-username/Nepalaya-R",
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("your-username/Nepalaya-R")

inputs = tokenizer("Hello", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Chat Mode

```python
messages = [
    {"role": "user", "content": "What is machine learning?"}
]
# return_dict=True yields a dict (input_ids, attention_mask) that can be
# unpacked into generate(); add_generation_prompt opens the assistant turn.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt:
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

## Repository Structure

```
Nepalaya-R/
├── README.md                      # This file
├── SETUP.md                       # Setup guide
├── GITHUB_DEPLOY.md               # Deployment guide
├── requirements.txt               # Python dependencies
├── config.json                    # Model configuration
├── tokenizer.json                 # Tokenizer
├── quick_inference.py             # Quick inference script
├── download_model.py              # Model downloader
├── mirror_to_hf.py                # HF mirroring tool
├── inference/                     # Inference code
│   ├── generate.py                # Generation script
│   ├── model.py                   # Model implementation
│   ├── convert.py                 # Weight converter
│   └── config_671B_nepalaya.json  # Inference config
└── assets/                        # Chat templates
```

## Files Included

- **Source Code:** Full inference implementation
- **Configuration:** Model and generation configs
- **Tokenizer:** Complete tokenizer setup
- **Documentation:** Setup and usage guides
- **Utilities:** Download and mirror scripts

## License

MIT License - see the [LICENSE](LICENSE) file.

## Support

- For setup and configuration, see [SETUP.md](SETUP.md)
- For deployment, see [GITHUB_DEPLOY.md](GITHUB_DEPLOY.md)

---

Nepalaya-R model card and repository maintained by the Nepalaya-R project.
|