Indic-mobile

Indic-mobile is a 0.5B-parameter language model pretrained entirely from scratch: it is not a fine-tune of, or an adapter on top of, an existing checkpoint. It targets all 22 officially recognized Indian languages and is designed for efficient deployment on mobile and edge devices.

🤗 GGUF quantized versions are available at mradermacher/Indic-mobile-GGUF


Model Details

Property              Value
Developed by          Rocky Singh Rajput
Model type            Causal Language Model
Architecture          Custom (from scratch)
Parameters            0.5B
Precision             BF16
Languages             All 22 official Indian languages
License               Apache 2.0
Trained from scratch  ✅ Yes (not a fine-tune)

Supported Languages

Indic-mobile covers all 22 languages recognized under the 8th Schedule of the Indian Constitution:

Assamese, Bengali, Bodo, Dogri, Gujarati, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu, Urdu


Why Indic-mobile?

India has 1.4 billion people and 22 officially recognized languages — yet most language models were never built with this diversity in mind. Indic-mobile is designed to change that:

  • Built from scratch — not a fine-tune or adapter on an existing English-centric model
  • Truly multilingual — trained across all 22 Indian languages from the ground up
  • Mobile-first: 0.5B parameters keeps the memory footprint small enough to target smartphones and edge devices
  • Open source — weights, architecture, and everything else, freely available
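
As a rough way to put the mobile-first point into numbers, the sketch below estimates how much memory the weights alone would occupy at a few bit widths. It uses only the 0.5B parameter count stated in this card. The 8-bit and 4-bit rows are illustrative assumptions, not the sizes of the published GGUF files, and the estimate ignores activations, the KV cache, and runtime overhead.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 0.5e9  # parameter count stated in this card

# Bit widths below are illustrative; real quantized files also carry
# metadata and per-block scales, so actual sizes will differ somewhat.
for label, bits in [("BF16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label:>5}: ~{weight_memory_gb(NUM_PARAMS, bits):.2f} GB")
```

Under these assumptions the BF16 weights come to about 1 GB, and each halving of the bit width halves that figure.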

Usage

Load with 🤗 Transformers

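from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "RockySinghRajput/Indic-mobile"

# Load the tokenizer and the BF16 weights. device_map="auto" requires the
# `accelerate` package and places the model on a GPU when one is available.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Tokenize the prompt and move the tensors to the model's device.
prompt = "भारत एक विविधताओं से भरा देश है।"  # Example Hindi prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Generate up to 100 new tokens and decode the result back to text.
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))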

Run with Ollama

ollama run hf.co/mradermacher/Indic-mobile-GGUF

Run with vLLM

vllm serve RockySinghRajput/Indic-mobile
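
Once the server is up, it can be queried over HTTP. The sketch below is a minimal client that assumes `vllm serve` is running locally on its default port (8000) and exposing its OpenAI-compatible `/v1/completions` endpoint. The helper names `build_completion_request` and `complete` are hypothetical, chosen for illustration only.

```python
import json
import urllib.request

# Assumption: `vllm serve` is running on this machine with its default port.
BASE_URL = "http://localhost:8000/v1"


def build_completion_request(prompt: str, max_tokens: int = 100) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": "RockySinghRajput/Indic-mobile",
        "prompt": prompt,
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        f"{BASE_URL}/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt: str, max_tokens: int = 100) -> str:
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_completion_request(prompt, max_tokens)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server running, `complete("भारत एक विविधताओं से भरा देश है।")` should return the model's continuation of the Hindi prompt. If the server was started with a different host or port, `BASE_URL` needs to change accordingly.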

Model Architecture

  • Architecture: Custom (trained from scratch)
  • Parameters: 0.5B
  • Precision: BF16
  • Training: Pretrained from scratch (no base model used)
  • Objective: Causal language modeling across 22 Indic languages

Intended Uses

Direct Use

  • Text generation in any of the 22 official Indian languages
  • Multilingual Indic chatbots and assistants
  • On-device / mobile NLP applications
  • Low-resource language research and experimentation

Downstream Use

  • Fine-tuning for specific Indic language tasks (classification, summarization, translation, QA)
  • Integration into larger Indic NLP pipelines
  • RAG (Retrieval-Augmented Generation) systems for Indian language content
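
To make the RAG use case concrete, here is a toy sketch of the retrieval and prompt-assembly steps that would sit in front of the model. Everything in it is an assumption for illustration: the character-trigram retriever stands in for a real embedding or search index, the three Hindi passages are made-up sample data, and the `Context:` / `Question:` / `Answer:` template is a placeholder, since this card does not document a prompt format.

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Character n-grams, so no per-script word tokenizer is needed."""
    text = " ".join(text.split())  # normalize whitespace
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}


def retrieve(query: str, passages: list[str], k: int = 2) -> list[str]:
    """Return the k passages with the highest trigram Jaccard overlap."""
    q = char_ngrams(query)

    def score(passage: str) -> float:
        g = char_ngrams(passage)
        return len(q & g) / len(q | g)

    return sorted(passages, key=score, reverse=True)[:k]


def build_prompt(query: str, passages: list[str]) -> str:
    """Assemble a prompt from retrieved passages (hypothetical template)."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Made-up sample passages, used only to exercise the functions above.
passages = [
    "दिल्ली भारत की राजधानी है।",
    "गंगा भारत की सबसे लंबी नदी है।",
    "क्रिकेट एक लोकप्रिय खेल है।",
]
query = "भारत की राजधानी क्या है?"
prompt = build_prompt(query, retrieve(query, passages, k=1))
```

The resulting `prompt` string can be passed to the model through any of the loading options in the Usage section. Character n-grams are used here only because they work across scripts without language-specific tokenization; a production system would likely use a multilingual embedding model instead.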

Out-of-Scope Use

  • High-stakes decision making without human oversight
  • Generation of harmful, misleading, or abusive content in any language
  • Tasks requiring deep factual accuracy without verification

Bias, Risks, and Limitations

  • As a small 0.5B model, it may struggle with complex reasoning or long-form generation compared to larger models
  • Training data distribution across all 22 languages may not be perfectly balanced; lower-resource languages may underperform
  • Like all language models, it may reflect biases present in the training data
  • Not intended for use in safety-critical or high-stakes applications without further evaluation and fine-tuning

Recommendations

Users should evaluate the model on their specific use case and language before deployment, particularly for lower-resource Indic languages.
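
One simple way to compare behaviour across languages is per-language perplexity on held-out text. The sketch below shows only the arithmetic: it assumes the per-token log-probabilities (natural log) have already been collected, for example from a log-softmax over the model's logits. The function names and the input layout are hypothetical.

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity = exp of the mean negative log-likelihood per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def per_language_perplexity(samples: dict[str, list[list[float]]]) -> dict[str, float]:
    """Pool token log-probs over each language's texts, then compute perplexity.

    `samples` maps a language name to a list of texts, where each text is
    the list of log-probabilities the model assigned to its tokens.
    """
    return {
        lang: perplexity([lp for text in texts for lp in text])
        for lang, texts in samples.items()
    }
```

Token-level perplexity depends on how the tokenizer segments each script, so values are best compared within a language (for example, before and after fine-tuning) rather than treated as a strict ranking across languages.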


Evaluation

Formal benchmarks are in progress. Community evaluations and feedback are welcome — please open a Discussion to share results!


Citation

If you use Indic-mobile in your research or projects, please consider citing:

@misc{indic-mobile-2025,
  author    = {Rocky Singh Rajput},
  title     = {Indic-mobile: A 0.5B Language Model for All 22 Official Indian Languages},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/RockySinghRajput/Indic-mobile}
}

Model Card Author

Rocky Singh Rajput

For questions, feedback, or collaboration, please open a Community Discussion.
