---
license: mit
library_name: transformers
---
# Nepalaya-R
Nepalaya-R is a large language model project with full source, configs, and deployment tooling for local and Hugging Face usage.
## About This Model
This repository contains the Nepalaya-R model implementation with:
- ✅ Full source code and inference implementations
- ✅ Tokenizer configuration adapted for Nepalaya-R
- ✅ Easy-to-use inference scripts
- ✅ Documentation and setup guides
## Quick Start
### Installation
```bash
pip install -r requirements.txt
```
### Download & Setup
Option 1: Download from Hugging Face
```bash
export HF_TOKEN=your_token
python download_model.py --model-id your-username/Nepalaya-R --local-dir ./model_weights
```
Option 2: Run Quick Inference
```bash
python quick_inference.py --prompt "Your prompt here"
```
### Mirror Setup
To create your own Nepalaya-R repo mirror:
```bash
export HF_TOKEN=your_token
python mirror_to_hf.py \
  --source source-org/source-model \
  --dest your-username/Nepalaya-R
```
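Under the hood, a mirroring script like `mirror_to_hf.py` can be built on the `huggingface_hub` client. The sketch below is an illustrative assumption (the `mirror` function and the `./mirror_tmp` directory are hypothetical names), not the script's actual implementation:

```python
import os

from huggingface_hub import HfApi, snapshot_download

def mirror(source: str, dest: str, local_dir: str = "./mirror_tmp") -> None:
    """Hypothetical sketch: pull a full repo snapshot, then push it to a new repo."""
    token = os.environ.get("HF_TOKEN")
    # 1. Download every file in the source repository to local_dir.
    snapshot_download(repo_id=source, local_dir=local_dir, token=token)
    # 2. Create the destination repo if needed, then upload the snapshot.
    api = HfApi(token=token)
    api.create_repo(repo_id=dest, repo_type="model", exist_ok=True)
    api.upload_folder(folder_path=local_dir, repo_id=dest, repo_type="model")

if __name__ == "__main__":
    mirror("source-org/source-model", "your-username/Nepalaya-R")
```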
## Documentation
- **[SETUP.md](SETUP.md)** - Detailed setup and configuration guide
- **[GITHUB_DEPLOY.md](GITHUB_DEPLOY.md)** - Deployment instructions
- **[inference/README.md](inference/README.md)** - Inference code documentation
## Model Architecture
Nepalaya-R architecture summary:
- **Parameters:** 671B
- **Context Length:** Extended via sparse attention
- **Training:** Sparse-attention-based training pipeline
- **Architecture:** Optimized transformer with mixture-of-experts
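One common way sparse attention extends context length is a sliding-window mask, where each token attends only to a fixed-size local window, so attention cost grows linearly rather than quadratically in sequence length. The sketch below is a generic illustration of that pattern, not Nepalaya-R's actual attention kernel:

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean causal mask: each query position i attends to itself and the
    previous (window - 1) positions only."""
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    return (j <= i) & (j > i - window)

mask = sliding_window_mask(6, 3)
print(mask.astype(int))
```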
## Key Features
- Multi-expert routing for efficient inference
- Sparse attention for long-context processing
- Chat template support
- Distributed inference capabilities
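The multi-expert routing idea above can be illustrated with a minimal top-k gating sketch in NumPy. This is a generic mixture-of-experts router, not Nepalaya-R's actual routing code:

```python
import numpy as np

def top_k_route(logits: np.ndarray, k: int = 2):
    """Select the k highest-scoring experts per token and renormalize their
    gate weights with a softmax over just the selected logits."""
    top = np.argsort(logits, axis=-1)[..., -k:]        # indices of the k best experts
    sel = np.take_along_axis(logits, top, axis=-1)     # their raw router scores
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))  # numerically stable softmax
    return top, w / w.sum(axis=-1, keepdims=True)

# One token, four experts: only the two best experts receive any load.
experts, weights = top_k_route(np.array([[0.1, 2.0, -1.0, 1.5]]), k=2)
```

Because each token is processed by only `k` experts, inference cost scales with the active experts rather than the full parameter count.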
## System Requirements
- **GPU Memory:** 48GB+ VRAM recommended
- **RAM:** 64GB+ system memory
- **Storage:** ~300GB for full model weights
- **SSD:** Fast storage recommended
## Usage Examples
### Basic Generation
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
    "your-username/Nepalaya-R",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("your-username/Nepalaya-R")
inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Chat Mode
```python
messages = [
    {"role": "user", "content": "What is machine learning?"}
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0, inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
## Repository Structure
```
Nepalaya-R/
├── README.md # This file
├── SETUP.md # Setup guide
├── GITHUB_DEPLOY.md # Deployment guide
├── requirements.txt # Python dependencies
├── config.json # Model configuration
├── tokenizer.json # Tokenizer
├── quick_inference.py # Quick inference script
├── download_model.py # Model downloader
├── mirror_to_hf.py # HF mirroring tool
├── inference/ # Inference code
│ ├── generate.py # Generation script
│ ├── model.py # Model implementation
│ ├── convert.py # Weight converter
│ └── config_671B_nepalaya.json # Inference config
└── assets/ # Chat templates
```
## Files Included
- **Source Code:** Full inference implementation
- **Configuration:** Model and generation configs
- **Tokenizer:** Complete tokenizer setup
- **Documentation:** Setup and usage guides
- **Utilities:** Download and mirror scripts
## License
MIT License - See [LICENSE](LICENSE) file
## Support
- For setup and configuration, see [SETUP.md](SETUP.md)
- For deployment, see [GITHUB_DEPLOY.md](GITHUB_DEPLOY.md)
---
Nepalaya-R model card and repository maintained by the Nepalaya-R project.