Update README.md
README.md
CHANGED
|
@@ -1,10 +1,128 @@
|
|
| 1 |
---
|
| 2 |
title: README
|
| 3 |
-
emoji: 🐢
|
| 4 |
colorFrom: blue
|
| 5 |
colorTo: blue
|
| 6 |
sdk: static
|
| 7 |
pinned: false
|
| 8 |
---
|
| 9 |
|
| 10 |
-
| 1 |
---
|
| 2 |
title: README
|
|
|
|
| 3 |
colorFrom: blue
|
| 4 |
colorTo: blue
|
| 5 |
sdk: static
|
| 6 |
pinned: false
|
| 7 |
---
|
| 8 |
|
| 9 |
+
---
|
| 10 |
+
# AutonomousX
|
| 11 |
+
### Open Source Research for Building Large Language Models from Scratch on TPUs
|
| 12 |
+
|
| 13 |
+
**AutonomousX** is an open-source initiative focused on developing **Large Language Models (LLMs) from scratch** using custom training pipelines built with **JAX on Google TPUs**.
|
| 14 |
+
|
| 15 |
+
Compute for developing this organization's models was provided by **Google's TPU Research Cloud (TRC) program**.
|
| 16 |
+
|
| 17 |
+
The organization and its research projects are developed by **Rohit Yadav**, a **third-year B.Tech student at Dr. B.R. Ambedkar National Institute of Technology (NIT) Jalandhar, India**.
|
| 18 |
+
|
| 19 |
+
---
|
| 20 |
+
|
| 21 |
+
# Mission
|
| 22 |
+
|
| 23 |
+
AutonomousX aims to make **LLM training infrastructure accessible and reproducible** for researchers, students, and developers.
|
| 24 |
+
|
| 25 |
+
While modern language models are widely used, **complete end-to-end guides for training LLMs from scratch on TPUs remain scarce**, particularly for beginners working with **JAX and distributed TPU training**.
|
| 26 |
+
|
| 27 |
+
AutonomousX focuses on filling this gap by publishing **fully reproducible open-source pipelines** that demonstrate how to train language models from scratch using limited compute resources.
|
| 28 |
+
|
| 29 |
+
---
|
| 30 |
+
|
| 31 |
+
# Research Focus
|
| 32 |
+
|
| 33 |
+
The organization explores multiple aspects of **efficient LLM training on TPUs**, including:
|
| 34 |
+
|
| 35 |
+
• Custom transformer architectures
|
| 36 |
+
• Variants with and without **RoPE (Rotary Positional Embeddings)**
|
| 37 |
+
• Memory-efficient training techniques
|
| 38 |
+
• Custom optimizer experiments
|
| 39 |
+
• Training pipeline optimization using **JAX + pmap**
|
| 40 |
+
• Efficient dataset streaming and preprocessing
|
| 41 |
+
|
| 42 |
+
The goal is to demonstrate that meaningful LLM research can be conducted even in **compute-limited environments**.
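
As an illustration of the RoPE variants listed above, here is a minimal pure-Python sketch of rotary positional embeddings. It is not the Instinct implementation: the function names `rope` and `dot`, the base of 10000, and the pairing of adjacent dimensions are assumptions chosen for readability. A real pipeline would apply the same rotation to batched query and key arrays in JAX.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each adjacent pair (vec[i], vec[i+1]) by position * base**(-i/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even head dimension")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower pairs rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors for a single attention head.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.75, -0.5, 1.0]

# Both pairs of positions are 3 apart, so the attention scores should match.
near = dot(rope(q, 5), rope(k, 2))
far = dot(rope(q, 105), rope(k, 102))
print(near, far)
```

The two scores agree up to floating-point error because the rotation makes the query-key dot product depend only on the distance between positions, which is the property that distinguishes RoPE from learned absolute position embeddings.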
|
| 43 |
+
|
| 44 |
+
---
|
| 45 |
+
|
| 46 |
+
# Instinct Model Family
|
| 47 |
+
|
| 48 |
+
AutonomousX develops the **Instinct** family of language models.
|
| 49 |
+
|
| 50 |
+
These models are built **entirely from scratch**, including tokenizer, architecture, training pipeline, and TPU training infrastructure.
|
| 51 |
+
|
| 52 |
+
Instinct models explore different configurations such as:
|
| 53 |
+
|
| 54 |
+
• Transformer architectures with and without **RoPE**
|
| 55 |
+
• Custom training optimizers
|
| 56 |
+
• TPU-optimized training pipelines using **JAX + pmap**
|
| 57 |
+
• Memory-efficient training for limited hardware environments
|
| 58 |
+
|
| 59 |
+
The models are designed to demonstrate how **modern language models can be trained on small TPU slices such as TPU v4-8**.
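
To see why memory-efficient training matters on this hardware, the rough estimate below computes the memory taken by weights, gradients, and optimizer state alone. It is a sketch under stated assumptions, not a measurement of the Instinct models: it assumes fp32 weights and gradients, an Adam-style optimizer with two fp32 moments per parameter, and plain data parallelism in which every chip holds a full replica. The helper name `training_state_gib` is hypothetical.

```python
def training_state_gib(n_params, bytes_param=4, bytes_grad=4, bytes_optimizer=8):
    """Memory for weights, gradients and optimizer state; activations are excluded."""
    total_bytes = n_params * (bytes_param + bytes_grad + bytes_optimizer)
    return total_bytes / 2**30

N_PARAMS = 1.5e9        # assumed model size, for illustration only
HBM_PER_CHIP_GIB = 32   # HBM capacity of one TPU v4 chip

state = training_state_gib(N_PARAMS)
headroom = HBM_PER_CHIP_GIB - state
print(f"replicated training state: {state:.1f} GiB, headroom: {headroom:.1f} GiB")
```

Under these assumptions a 1.5B-parameter replica leaves under 10 GiB per chip for activations, which is why techniques such as lower-precision state, gradient checkpointing, or sharding the optimizer state become relevant at this scale.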
|
| 60 |
+
|
| 61 |
+
---
|
| 62 |
+
|
| 63 |
+
# Compute Strategy
|
| 64 |
+
|
| 65 |
+
One of the core goals of AutonomousX is to explore **efficient training on limited compute resources**.
|
| 66 |
+
|
| 67 |
+
Research focuses on training models:
|
| 68 |
+
|
| 69 |
+
• Up to **~1.5B parameters**
|
| 70 |
+
• On **small TPU v4-8 slices**
|
| 71 |
+
• Across **hundreds of billions of tokens**
|
| 72 |
+
|
| 73 |
+
By optimizing training pipelines and architecture design, AutonomousX investigates how far **efficient training can scale without access to massive GPU clusters**.
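
The scale of this goal can be estimated with the common approximation that training costs about 6 FLOPs per parameter per token. The figures below are assumptions for illustration, not reported results: 300B tokens as one reading of "hundreds of billions", 4 chips for a v4-8 at the published TPU v4 peak of 275 TFLOP/s each in bf16, and a guessed 40% hardware utilization. The function name `training_days` is hypothetical.

```python
def training_days(n_params, n_tokens, n_chips, peak_flops_per_chip, utilization):
    """Wall-clock days for one training run under the 6 * N * D approximation."""
    total_flops = 6 * n_params * n_tokens
    sustained_flops_per_s = n_chips * peak_flops_per_chip * utilization
    return total_flops / sustained_flops_per_s / 86400

days = training_days(
    n_params=1.5e9,             # upper end of the stated model size
    n_tokens=300e9,             # assumed token budget
    n_chips=4,                  # chips in a TPU v4-8
    peak_flops_per_chip=275e12, # TPU v4 bf16 peak
    utilization=0.4,            # assumed, varies widely in practice
)
print(f"estimated training time: {days:.0f} days")
```

With these inputs the estimate comes to roughly ten weeks of continuous training, so each percentage point of utilization gained through pipeline optimization translates directly into days saved.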
|
| 74 |
+
|
| 75 |
+
---
|
| 76 |
+
|
| 77 |
+
# Open Source Philosophy
|
| 78 |
+
|
| 79 |
+
AutonomousX publishes **complete reproducible implementations** including:
|
| 80 |
+
|
| 81 |
+
• Dataset pipelines
|
| 82 |
+
• Tokenizer training
|
| 83 |
+
• Model architectures
|
| 84 |
+
• TPU training scripts
|
| 85 |
+
• Checkpointing systems
|
| 86 |
+
• Inference pipelines
|
| 87 |
+
|
| 88 |
+
All repositories aim to provide **transparent and educational implementations** so the open-source community can learn how large language models are trained from the ground up.
|
| 89 |
+
|
| 90 |
+
---
|
| 91 |
+
|
| 92 |
+
# Why This Matters
|
| 93 |
+
|
| 94 |
+
Many tutorials focus only on **using pretrained models**, but very few resources explain:
|
| 95 |
+
|
| 96 |
+
• How to train LLMs from scratch
|
| 97 |
+
• How to run training pipelines on TPUs
|
| 98 |
+
• How distributed JAX training works
|
| 99 |
+
• How datasets like **The Pile** are processed at scale
|
| 100 |
+
|
| 101 |
+
AutonomousX aims to make these processes **accessible, understandable, and reproducible**.
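
As one small example of the dataset processing mentioned above, the sketch below packs a stream of tokenized documents into fixed-length training sequences without loading the whole corpus into memory. It is an illustrative assumption about how such a step can look, not the AutonomousX pipeline: the name `pack_sequences`, the end-of-document token id `0`, and the choice to drop a trailing partial block are all hypothetical.

```python
def pack_sequences(token_streams, seq_len, eos_id):
    """Yield fixed-length blocks from an iterable of tokenized documents."""
    buffer = []
    for tokens in token_streams:
        buffer.extend(tokens)
        buffer.append(eos_id)  # mark the document boundary
        while len(buffer) >= seq_len:
            yield buffer[:seq_len]
            buffer = buffer[seq_len:]
    # Any leftover tokens shorter than seq_len are dropped in this sketch.

# Stand-in for a streamed, already tokenized corpus.
docs = iter([[1, 2, 3], [4, 5], [6, 7, 8, 9]])
blocks = list(pack_sequences(docs, seq_len=4, eos_id=0))
print(blocks)
```

Because the function is a generator, it can sit directly behind a file or network reader and feed batches to the training loop as they become available.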
|
| 102 |
+
|
| 103 |
+
---
|
| 104 |
+
|
| 105 |
+
# Author
|
| 106 |
+
|
| 107 |
+
**Rohit Yadav**
|
| 108 |
+
|
| 109 |
+
B.Tech 3rd Year
|
| 110 |
+
Dr. B.R. Ambedkar National Institute of Technology (NIT) Jalandhar, India
|
| 111 |
+
|
| 112 |
+
E-mail: yrohit1825@gmail.com
|
| 113 |
+
|
| 114 |
+
LinkedIn:
|
| 115 |
+
https://www.linkedin.com/in/rohit-yadav-25535b256/
|
| 116 |
+
|
| 117 |
+
GitHub:
|
| 118 |
+
https://github.com/YADAV1825
|
| 119 |
+
|
| 120 |
+
---
|
| 121 |
+
|
| 122 |
+
# Organization
|
| 123 |
+
|
| 124 |
+
**AutonomousX**
|
| 125 |
+
|
| 126 |
+
Open-source research initiative for **training Large Language Models from scratch on TPUs using JAX**.
|
| 127 |
+
|
| 128 |
+
---
|