# Indro-Veda: The Sovereign Reasoning Model (500M)
Indro-Veda is a state-of-the-art Small Language Model (SLM) developed by Indro-ai. With 500 million parameters, it is specifically engineered to demonstrate high-level reasoning, logical deduction, and structured problem-solving: capabilities typically reserved for much larger models.
The name "Indro-Veda" signifies the fusion of supreme intelligence (Indro) and profound knowledge (Veda).
## Model Highlights
- Architecture: Optimized Transformer-based architecture (Llama-style).
- Parameters: 500 Million.
- Training Tokens: 3 Billion curated tokens.
- Specialization: Mathematics, Algorithmic Code, and High-Quality Educational Reasoning.
- Framework: Trained using PyTorch/XLA on TPU infrastructure for maximum efficiency.
## Training Philosophy: "Reasoning over Recall"
Unlike traditional small models that focus on memorizing facts, Indro-Veda is trained on a Reasoning-Heavy Dataset Mixture:
- Logical Core (Math): Powered by `UltraData-Math` to ensure the model understands step-by-step derivation.
- Structural Core (Code): Trained on `starcoderdata` to enhance algorithmic thinking and syntax awareness.
- Knowledge Core (Education): Built on `FineWeb-Edu` to provide a clean, high-signal educational foundation.
## Dataset Distribution
The model was pre-trained on the Indro-Veda Dataset (3B Tokens) with a fixed-ratio mixture designed to prevent catastrophic forgetting and maintain balanced intelligence across domains.
| Component | Focus | Data Source |
|---|---|---|
| Reasoning | Mathematics & Logic | UltraData-Math |
| Structure | Programming & Algorithms | starcoderdata |
| Knowledge | High-Quality Educational Web | FineWeb-Edu |
| Identity | Sovereign Alignment | Indro-ai Proprietary |
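The card does not publish the exact mixture percentages, so the weights below are placeholders, but the fixed-ratio idea itself can be sketched in a few lines: instead of training on one corpus after another, each training document is drawn from a weighted lottery over the sources, so every domain stays present throughout training.

```python
import random

# Hypothetical mixture ratios -- the card states a fixed-ratio mixture
# but does not publish the percentages, so these are placeholders.
MIXTURE = {
    "UltraData-Math": 0.40,
    "starcoderdata": 0.30,
    "FineWeb-Edu": 0.25,
    "Indro-ai-Identity": 0.05,
}

def sample_source(rng: random.Random) -> str:
    """Pick which corpus the next training document comes from.

    Sampling with fixed weights (rather than concatenating corpora
    sequentially) keeps every domain in the data stream at all times,
    which is what guards against catastrophic forgetting.
    """
    sources = list(MIXTURE)
    weights = list(MIXTURE.values())
    return rng.choices(sources, weights=weights, k=1)[0]

# Empirically check that the stream matches the target ratios
rng = random.Random(0)
counts = {name: 0 for name in MIXTURE}
for _ in range(100_000):
    counts[sample_source(rng)] += 1

for name, n in counts.items():
    print(f"{name}: {n / 100_000:.3f}")
```

Over a long run the observed fractions converge to the configured weights, so the effective domain balance is independent of the order in which the raw corpora were assembled.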
## Usage
You can use Indro-Veda with the Hugging Face `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Indro-ai/Indro-Veda-500M"

# Load the tokenizer and model weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Explain the concept of logical deduction."
inputs = tokenizer(prompt, return_tensors="pt")

# Generate up to 100 new tokens beyond the prompt
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```