---
license: other
license_name: deepseek-license
license_link: LICENSE
pipeline_tag: text-generation
tags:
- code
- mixture-of-experts
- SarvaCode
- india-stack
language:
- en
base_model:
- deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
---

# SarvaCode-16B-Indigenous

**SarvaCode** is an open-source Mixture-of-Experts (MoE) code language model customized in India. It is built on the DeepSeek-Coder-V2 architecture and optimized for the **Indian software ecosystem**.

While general-purpose code models target generic programming tasks, SarvaCode is fine-tuned to understand **Indian English instructions**, Indian tax rules (GST, TDS), and the technical frameworks of **India Stack** (UPI, ONDC, Aadhaar/UIDAI).

## 1. Key Improvements
Compared to the base Lite model, **SarvaCode** features:
- **Higher Active Parameters:** Routing is increased from 6 to **8 active experts per token**, raising the active parameter count to **~3.2B per token**.
- **Indigenous Logic:** Enhanced accuracy on India-specific tasks such as GST calculation, IFSC validation, and regional date/currency formatting.
- **India Stack Awareness:** Pre-loaded context for integrating with NPCI (UPI), ONDC, and DigiLocker APIs.
- **Massive Context:** Maintains a **128K context window**, enough to take in entire Indian government technical gazettes or large codebases in a single prompt.
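
To make the India-specific tasks listed above concrete, here is a minimal Python sketch of the kind of code such prompts target. It is an illustration written for this card, not output from SarvaCode. The function names (`split_gst`, `is_valid_ifsc`, `format_inr`) are hypothetical, the GST split assumes an intra-state supply where the slab divides equally into CGST and SGST, and the IFSC check covers only the 11-character format, not whether the branch exists.

```python
import re
from decimal import Decimal, ROUND_HALF_UP

# IFSC format: 4 letters (bank code), a literal 0, then 6 alphanumeric characters (branch code).
IFSC_PATTERN = re.compile(r"[A-Z]{4}0[A-Z0-9]{6}")


def split_gst(taxable_value, slab_percent=18):
    """Split GST into equal CGST and SGST halves (assumes an intra-state supply)."""
    value = Decimal(str(taxable_value))
    half_rate = Decimal(str(slab_percent)) / Decimal(200)  # half of the slab, as a fraction
    half_tax = (value * half_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return {"cgst": half_tax, "sgst": half_tax, "total": value + 2 * half_tax}


def is_valid_ifsc(code):
    """Check the IFSC format only; this does not confirm that the branch exists."""
    return IFSC_PATTERN.fullmatch(code.strip().upper()) is not None


def format_inr(amount):
    """Format a non-negative amount with Indian digit grouping (thousands, then lakh/crore)."""
    quantized = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    rupees, paise = f"{quantized:.2f}".split(".")
    head, tail = rupees[:-3], rupees[-3:]  # last three digits form the first group
    groups = []
    while head:  # remaining digits are grouped in pairs
        groups.insert(0, head[-2:])
        head = head[:-2]
    return "₹" + ",".join(groups + [tail]) + "." + paise


if __name__ == "__main__":
    print(split_gst(1000))
    print(is_valid_ifsc("SBIN0001234"))
    print(format_inr(1234567.891))
```

`Decimal` is used instead of `float` so that tax amounts round predictably to two decimal places.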

## 2. Model Specifications

| **Model** | **#Total Params** | **#Active Params** | **Context Length** | **Specialization** |
| :---: | :---: | :---: | :---: | :---: |
| **SarvaCode-16B** | 16B | **3.2B** | 128K | India Stack & Fintech |
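
The expert routing behind the active-parameter figure can be sanity-checked from the checkpoint's `config.json`. The sketch below assumes the checkpoint keeps the field names used by DeepSeek-V2 style configs (`num_experts_per_tok`, `n_routed_experts`, `n_shared_experts`) and lives in the same `./SarvaCode` directory used in the inference example. Both are assumptions, and `summarize_routing` is a hypothetical helper.

```python
import json
from pathlib import Path


def summarize_routing(config, expected_active=8):
    """Report MoE routing settings from a config dict (field names are assumed)."""
    active = config.get("num_experts_per_tok")
    return {
        "routed_experts": config.get("n_routed_experts"),
        "shared_experts": config.get("n_shared_experts"),
        "active_experts_per_token": active,
        "matches_expected": active == expected_active,
    }


if __name__ == "__main__":
    config_path = Path("./SarvaCode") / "config.json"  # assumed local checkpoint directory
    if config_path.exists():
        print(summarize_routing(json.loads(config_path.read_text())))
    else:
        print(f"No config found at {config_path}")
```

If `matches_expected` is `False`, the checkpoint is routing a different number of experts per token than the card describes.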

## 3. How to Run Locally

### Inference with Transformers
Pass `trust_remote_code=True` so that Transformers can load the specialized MoE configuration and model code shipped with the checkpoint.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_path = "./SarvaCode"  # Your local directory

# Load the tokenizer and model in bfloat16, then move the model to the GPU.
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path, trust_remote_code=True, torch_dtype=torch.bfloat16
).cuda()  # Requires a CUDA-capable GPU

# Example: Indian Financial Logic
input_text = "User: Write a Python function to calculate the GST for a service with an 18% slab, ensuring the output separates CGST and SGST.\n\nAssistant:"

# Tokenize the prompt, generate up to 256 new tokens, and decode the full sequence.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```