
Apply for a GPU community grant: Personal project


by abhinav337463 - opened Feb 24


abhinav337463

indro-veda org Feb 24


edited Feb 24

We are developing Indro-Veda-500M, a 500M-parameter Llama-style model designed to excel at logical reasoning, algorithmic thinking, and educational depth. Our goal is to show that high-quality, curated data can enable sub-billion-parameter models to exhibit reasoning capabilities typically seen in much larger models.

Technical Specifications:

Architecture: Llama-based Transformer (500M parameters).

Dataset: 3 billion high-quality tokens (Indro-Veda-Dataset).

Dataset Mixture: A fixed-ratio blend of UltraData-Math (for logical derivation), Starcoderdata (for algorithmic structure), and FineWeb-Edu (for high-signal educational knowledge).

Framework: PyTorch/XLA optimized for distributed training.
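As a rough illustration of what a fixed-ratio blend means in practice, the sketch below splits the 3B-token budget across the three sources and draws per-document source labels with fixed probabilities. The weights (0.4 / 0.3 / 0.3) and the helper names are hypothetical; the actual ratio used for Indro-Veda is not stated in this post.

```python
import random

# Hypothetical mixture weights: the post says the ratio is fixed but does not state it.
MIXTURE = {"UltraData-Math": 0.4, "Starcoderdata": 0.3, "FineWeb-Edu": 0.3}
TOTAL_TOKENS = 3_000_000_000  # 3B-token budget from the post


def token_budget(mixture, total_tokens):
    """Split a total token budget across sources according to fixed weights."""
    if abs(sum(mixture.values()) - 1.0) > 1e-9:
        raise ValueError("mixture weights must sum to 1")
    return {name: round(total_tokens * weight) for name, weight in mixture.items()}


def sample_sources(mixture, n, seed=0):
    """Pick the source of each of n documents with fixed probabilities."""
    rng = random.Random(seed)
    names = list(mixture)
    weights = [mixture[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    for name, tokens in token_budget(MIXTURE, TOTAL_TOKENS).items():
        print(f"{name}: {tokens:,} tokens")
```

Sampling by document rather than by token is a simplification; a production loader would more likely track consumed tokens per source so the realized ratio matches the target.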

Current Progress & Bottleneck:

We have completed the initial training phases (past step 400) on Kaggle’s TPU v5e-8 infrastructure, but we have hit a critical bottleneck: the 9-hour session limit. Frequent checkpointing (the optimizer states alone exceed 3.5 GB) and the lack of persistent, continuous compute are hindering our ability to finish the full 3B-token pre-training run.
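To make the checkpoint overhead concrete, here is a minimal sketch under stated assumptions. It estimates optimizer-state size assuming an Adam-style optimizer with two fp32 moments per parameter (the post does not name the optimizer), and shows one way to stop and save once per session just before the time limit rather than checkpointing frequently. `train_step`, `save_checkpoint`, and the 15-minute safety margin are hypothetical placeholders, not part of the actual training code.

```python
import time

SESSION_LIMIT_S = 9 * 3600   # 9-hour session limit from the post
SAFETY_MARGIN_S = 15 * 60    # hypothetical reserve for the final save and upload


def adam_state_bytes(n_params, bytes_per_value=4, moments=2):
    """Optimizer-state size, assuming two fp32 moment tensors per parameter."""
    return n_params * bytes_per_value * moments


def run_session(train_step, save_checkpoint, start_step, total_steps,
                limit_s=SESSION_LIMIT_S, margin_s=SAFETY_MARGIN_S,
                clock=time.monotonic):
    """Train until total_steps or until the time budget is nearly spent, then save once."""
    started = clock()
    step = start_step
    while step < total_steps and clock() - started < limit_s - margin_s:
        train_step(step)
        step += 1
    save_checkpoint(step)  # single save at the end of the session
    return step            # pass this as start_step in the next session


if __name__ == "__main__":
    size = adam_state_bytes(500_000_000)
    print(f"Estimated optimizer state: {size / 1e9:.1f} GB")
```

Under these assumptions a 500M-parameter model carries about 4 GB of optimizer state, in the same range as the 3.5 GB+ figure above. The loop only guards against the session limit; it does not protect against a crash mid-session, so an occasional intermediate save may still be worthwhile.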

The Grant Request:

We are requesting a Community Compute Grant (A100/H100 tier) to overcome the "9-hour wall." Stable, continuous compute will allow us to:

1. Complete the full 3-billion-token pre-training uninterrupted.

2. Experiment with larger batch sizes and Flash Attention 2 for maximum efficiency.

3. Open-source the final weights and training recipes for the global research community.
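The effect of batch size on run length can be sketched with simple arithmetic: the number of optimizer steps needed to consume the 3B-token budget is the budget divided by tokens per step. The sequence length (2048) and global batch sizes below are hypothetical values chosen for illustration; the post does not state the actual settings.

```python
import math

TOTAL_TOKENS = 3_000_000_000  # 3B-token budget from the post
SEQ_LEN = 2048                # hypothetical sequence length


def steps_for_budget(total_tokens, global_batch_size, seq_len):
    """Optimizer steps needed to see total_tokens once at a given batch shape."""
    tokens_per_step = global_batch_size * seq_len
    return math.ceil(total_tokens / tokens_per_step)


if __name__ == "__main__":
    for batch in (128, 256, 512):  # hypothetical global batch sizes
        steps = steps_for_budget(TOTAL_TOKENS, batch, SEQ_LEN)
        print(f"batch={batch}: {steps:,} steps")
```

Doubling the global batch roughly halves the step count, but wall-clock time only improves if the hardware can process the larger batch at close to the same step time.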

Open Source Commitment:

Indro-Veda is a "Sovereign AI" initiative. All our models, datasets, and training configurations are hosted under the Indro-ai Organization on Hugging Face and will remain fully open-source under the Apache-2.0 license.

Links:

Organization: https://huggingface.co/Indro-ai

Dataset: https://huggingface.co/datasets/Indro-ai/indro-web-data

Model Page: https://huggingface.co/Indro-ai/Indro-Veda-V1-500M

abhinav337463

indro-veda org Feb 25

abhinav337463 pinned discussion Feb 26
