AutonomousX
Open Source Research for Building Large Language Models from Scratch and Finetuning on TPUs & GPUs
Mission
AutonomousX aims to make LLM training infrastructure accessible and reproducible for researchers, students, and developers. While modern language models are widely used, complete end-to-end guides for training LLMs from scratch on TPUs remain scarce, particularly for beginners working with JAX and distributed TPU training. AutonomousX focuses on filling this gap by publishing fully reproducible open-source pipelines that demonstrate how to train language models from scratch using limited compute resources.
Compute supporting the development of this organization and its models was provided by Google's TPU Research Cloud (TRC) program.
Research Focus
The organization explores multiple aspects of efficient LLM training on TPUs, including:
- Custom transformer architectures
- Variants with and without RoPE (Rotary Positional Embeddings)
- Memory-efficient training techniques
- Custom optimizer experiments
- Training pipeline optimization using JAX + pmap
- Efficient dataset streaming and preprocessing
The goal is to demonstrate how meaningful LLM research can be conducted even in compute-limited environments.
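As a rough illustration of the JAX + pmap approach listed above, here is a minimal sketch of a data-parallel training step. The loss function, parameter names, and the optax optimizer choice are placeholders for illustration, not the actual Instinct training code:

```python
import functools
import jax
import jax.numpy as jnp
import optax  # assumed optimizer library; any JAX-compatible optimizer works

def loss_fn(params, batch):
    # Placeholder loss; a real pipeline would run the transformer forward pass here.
    preds = batch["inputs"] @ params["w"]
    return jnp.mean((preds - batch["targets"]) ** 2)

optimizer = optax.adamw(learning_rate=3e-4)

@functools.partial(jax.pmap, axis_name="devices")
def train_step(params, opt_state, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    # Average gradients and loss across TPU cores so every replica stays in sync.
    grads = jax.lax.pmean(grads, axis_name="devices")
    loss = jax.lax.pmean(loss, axis_name="devices")
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

# Params, optimizer state, and batches must carry a leading device axis,
# e.g. replicated with jax.device_put_replicated before calling train_step.
```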
Instinct Model Family
AutonomousX develops the Instinct family of language models. These models are built entirely from scratch, including tokenizer, architecture, training pipeline, and TPU training infrastructure. Instinct models explore different configurations such as:
- Transformer architectures with and without RoPE
- Custom training optimizers
- TPU-optimized training pipelines using JAX + pmap
- Memory-efficient training for limited hardware environments
The models are designed to demonstrate how modern language models can be trained on small TPU slices such as TPU v4-8.
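For the RoPE variants mentioned above, the sketch below shows one common way rotary positional embeddings are applied in JAX; the shapes and naming are assumptions for illustration, not necessarily how the Instinct models implement it:

```python
import jax.numpy as jnp

def rotary_embedding(x, base=10000.0):
    """Apply RoPE to x of shape (seq_len, num_heads, head_dim).

    Rotates pairs of channels by a position-dependent angle so that
    attention scores depend on relative token positions.
    """
    seq_len, _, head_dim = x.shape
    half = head_dim // 2
    freqs = 1.0 / (base ** (jnp.arange(half) / half))        # (half,)
    angles = jnp.arange(seq_len)[:, None] * freqs[None, :]   # (seq_len, half)
    cos = jnp.cos(angles)[:, None, :]                        # broadcast over heads
    sin = jnp.sin(angles)[:, None, :]
    x1, x2 = x[..., :half], x[..., half:]
    # Standard rotation of each (x1, x2) channel pair.
    return jnp.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```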
Compute Strategy
One of the core goals of AutonomousX is to explore efficient training on limited compute resources. Research focuses on training models:
- Up to ~1.5B parameters
- On small TPU v4-8 slices
- Across hundreds of billions of tokens
By optimizing training pipelines and architecture design, AutonomousX investigates how far efficient training can scale without access to massive GPU clusters.
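As a rough illustration of how a ~1.5B-parameter budget can be reached, here is a back-of-the-envelope sizing sketch for a GPT-style decoder; the dimensions below are example values, not a published Instinct configuration:

```python
# Approximate parameter count for a GPT-style decoder.
d_model, n_layers, vocab_size = 2048, 30, 32_000  # assumed example values

# Per layer: ~4*d^2 for attention (Q, K, V, output projections)
# plus ~8*d^2 for the MLP (two projections with a 4x hidden expansion) = ~12*d^2.
non_embedding = 12 * n_layers * d_model**2        # ~1.51B
embedding = vocab_size * d_model                  # ~66M (tied input/output embeddings)
print(f"total ~ {(non_embedding + embedding) / 1e9:.2f}B parameters")
```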
Open Source Philosophy
AutonomousX publishes complete, reproducible implementations, including:
- Dataset pipelines
- Tokenizer training
- Model architectures
- TPU training scripts
- Checkpointing systems
- Inference pipelines
All repositories aim to provide transparent, educational implementations so the open-source community can learn how large language models are trained from the ground up.
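As one example of such a component, here is a minimal tokenizer-training sketch using the Hugging Face tokenizers library; the file path, vocabulary size, and special tokens are placeholders rather than the organization's actual settings:

```python
from tokenizers import ByteLevelBPETokenizer

# Train a byte-level BPE tokenizer on a plain-text corpus shard.
tokenizer = ByteLevelBPETokenizer()
tokenizer.train(
    files=["corpus.txt"],              # hypothetical training text
    vocab_size=32_000,
    min_frequency=2,
    special_tokens=["<pad>", "<bos>", "<eos>"],
)
tokenizer.save("tokenizer.json")       # single-file tokenizer for the training pipeline
```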
Why This Matters
Many tutorials focus only on using pretrained models, but very few resources explain:
- How to train LLMs from scratch
- How to run training pipelines on TPUs
- How distributed JAX training works
- How datasets like The PILE are processed at scale
AutonomousX aims to make these processes accessible, understandable, and reproducible.
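As a small illustration of the last point above, large corpora can be streamed during preprocessing rather than downloaded in full; the dataset name below is a placeholder mirror and the "text" field is an assumption about the corpus schema:

```python
from datasets import load_dataset

# Stream the corpus so it never has to fit on local disk.
stream = load_dataset("monology/pile-uncopyrighted", split="train", streaming=True)

for example in stream.take(3):
    # Each record's text would be tokenized and packed into fixed-length
    # sequences before being fed to the TPU training pipeline.
    print(example["text"][:80])
```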