YADAV0206 committed on
Commit
8e2f90c
·
verified ·
1 Parent(s): 7f7ac19

Update README.md

  1. README.md +120 -2
README.md CHANGED
@@ -1,10 +1,128 @@
1
  ---
2
  title: README
3
- emoji: 🐢
4
  colorFrom: blue
5
  colorTo: blue
6
  sdk: static
7
  pinned: false
8
  ---
9
 
10
- Edit this `README.md` markdown file to author your organization card.
1
  ---
2
  title: README
 
3
  colorFrom: blue
4
  colorTo: blue
5
  sdk: static
6
  pinned: false
7
  ---
8
 
9
+ ---
10
+ # AutonomousX
11
+ ### Open Source Research for Building Large Language Models from Scratch on TPUs
12
+
13
+ **AutonomousX** is an open-source initiative focused on developing **Large Language Models (LLMs) from scratch** using custom training pipelines built with **JAX on Google TPUs**.
14
+
15
+ Compute supporting the development of this organization and its models was provided by **Google's TRC Program (TPU Research Cloud)**.
16
+
17
+ The organization and its research projects are developed by **Rohit Yadav**, a **third-year B.Tech student at Dr. B.R. Ambedkar National Institute of Technology (NIT) Jalandhar, India**.
18
+
19
+ ---
20
+
21
+ # Mission
22
+
23
+ AutonomousX aims to make **LLM training infrastructure accessible and reproducible** for researchers, students, and developers.
24
+
25
+ While modern language models are widely used, **complete end-to-end guides for training LLMs from scratch on TPUs remain scarce**, particularly for beginners working with **JAX and distributed TPU training**.
26
+
27
+ AutonomousX focuses on filling this gap by publishing **fully reproducible open-source pipelines** that demonstrate how to train language models from scratch using limited compute resources.
28
+
29
+ ---
30
+
31
+ # Research Focus
32
+
33
+ The organization explores multiple aspects of **efficient LLM training on TPUs**, including:
34
+
35
+ • Custom transformer architectures
36
+ • Variants with and without **RoPE (Rotary Positional Embeddings)**
37
+ • Memory-efficient training techniques
38
+ • Custom optimizer experiments
39
+ • Training pipeline optimization using **JAX + pmap**
40
+ • Efficient dataset streaming and preprocessing
41
+
42
+ The goal is to demonstrate how meaningful LLM research can be conducted even in **compute-limited environments**.
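
To make the **JAX + pmap** item above concrete, here is a minimal sketch of a data-parallel training step. It is not taken from the AutonomousX repositories: the toy linear model, the names `loss_fn` and `train_step`, the axis name `"devices"`, and the learning rate of 0.1 are all assumptions chosen for illustration. It only shows the general pattern of replicating parameters, sharding the batch along a leading device axis, and averaging gradients with `jax.lax.pmean`.

```python
import functools

import jax
import jax.numpy as jnp


def loss_fn(params, batch):
    # Toy linear model standing in for a transformer (assumption); mean squared error.
    preds = batch["x"] @ params["w"]
    return jnp.mean((preds - batch["y"]) ** 2)


@functools.partial(jax.pmap, axis_name="devices")
def train_step(params, batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, batch)
    # Average gradients and loss across devices so every replica applies the same update.
    grads = jax.lax.pmean(grads, axis_name="devices")
    loss = jax.lax.pmean(loss, axis_name="devices")
    # Plain SGD with a hypothetical learning rate of 0.1.
    params = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, params, grads)
    return params, loss


n_dev = jax.local_device_count()
params = {"w": jnp.zeros((4, 1))}
# Replicate parameters by adding a leading device axis.
replicated = jax.tree_util.tree_map(lambda p: jnp.stack([p] * n_dev), params)

# Synthetic data: each device receives its own shard of 8 examples.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (n_dev, 8, 4))
y = jnp.sum(x, axis=-1, keepdims=True)
batch = {"x": x, "y": y}

losses = []
for _ in range(20):
    replicated, loss = train_step(replicated, batch)
    losses.append(float(loss[0]))
```

The same script runs on a single CPU device (where `n_dev` is 1) and on a multi-core TPU host, because the only device-dependent quantity is the size of the leading axis.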
43
+
44
+ ---
45
+
46
+ # Instinct Model Family
47
+
48
+ AutonomousX develops the **Instinct** family of language models.
49
+
50
+ These models are built **entirely from scratch**, including tokenizer, architecture, training pipeline, and TPU training infrastructure.
51
+
52
+ Instinct models explore different configurations such as:
53
+
54
+ • Transformer architectures with and without **RoPE**
55
+ • Custom training optimizers
56
+ • TPU-optimized training pipelines using **JAX + pmap**
57
+ • Memory-efficient training for limited hardware environments
58
+
59
+ The models are designed to demonstrate how **modern language models can be trained on small TPU slices such as TPU v4-8**.
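
For readers unfamiliar with **RoPE**, the following is a minimal sketch of rotary positional embeddings in JAX. It is an illustration rather than the Instinct implementation: the function name `apply_rope`, the base of 10000, and the "rotate-half" pairing of features (first half with second half) are assumptions, and real implementations differ in how they pair features and handle batch and head dimensions.

```python
import jax.numpy as jnp


def apply_rope(x, base=10000.0):
    """Rotate feature pairs of x, shape (seq_len, head_dim), by position-dependent angles."""
    seq_len, head_dim = x.shape
    half = head_dim // 2
    # One rotation frequency per feature pair.
    inv_freq = 1.0 / (base ** (jnp.arange(half) / half))
    angles = jnp.arange(seq_len)[:, None] * inv_freq[None, :]  # (seq_len, half)
    cos, sin = jnp.cos(angles), jnp.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Standard 2-D rotation applied to each (x1, x2) pair.
    return jnp.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)


# The same query and key vector repeated at every position.
q = jnp.tile(jnp.array([[1.0, 2.0, 3.0, 4.0]]), (6, 1))
k = jnp.tile(jnp.array([[0.5, -1.0, 2.0, 0.0]]), (6, 1))
q_rot, k_rot = apply_rope(q), apply_rope(k)

# Attention scores depend only on the relative offset between positions.
score_a = jnp.dot(q_rot[1], k_rot[3])  # offset 2
score_b = jnp.dot(q_rot[2], k_rot[4])  # offset 2
```

Because each pair of features is rotated by an angle proportional to its position, the dot product between a rotated query and a rotated key depends on the difference of their positions, which is why `score_a` and `score_b` agree up to floating-point error.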
60
+
61
+ ---
62
+
63
+ # Compute Strategy
64
+
65
+ One of the core goals of AutonomousX is to explore **efficient training on limited compute resources**.
66
+
67
+ Research focuses on training models:
68
+
69
+ • Up to **~1.5B parameters**
70
+ • On **small TPU v4-8 slices**
71
+ • Across **hundreds of billions of tokens**
72
+
73
+ By optimizing training pipelines and architecture design, AutonomousX investigates how far **efficient training can scale without access to massive GPU clusters**.
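
To give a feel for the scale involved, here is a back-of-envelope estimate of training time. It is a rough sketch, not a measurement from this project: it uses the common approximation of 6 FLOPs per parameter per token, picks 300 billion tokens as one example of "hundreds of billions", takes the published peak of 275 bf16 TFLOP/s per TPU v4 chip with 4 chips in a v4-8, and assumes a hypothetical 40% hardware utilization.

```python
# Rough training-time estimate; every constant below is an assumption for illustration.
n_params = 1.5e9            # model size from the list above
n_tokens = 300e9            # example value for "hundreds of billions of tokens"
flops_per_param_token = 6   # common forward + backward approximation

peak_flops_per_chip = 275e12  # TPU v4 peak bf16 throughput per chip
n_chips = 4                   # a v4-8 slice has 4 chips (8 TensorCores)
utilization = 0.40            # hypothetical model FLOPs utilization

total_flops = flops_per_param_token * n_params * n_tokens
sustained_flops = peak_flops_per_chip * n_chips * utilization
train_seconds = total_flops / sustained_flops
train_days = train_seconds / 86400

print(f"total compute: {total_flops:.2e} FLOPs")
print(f"estimated wall-clock time: {train_days:.0f} days")
```

Under these assumptions the run takes on the order of ten weeks, which shows why utilization and pipeline efficiency matter so much on a single small slice.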
74
+
75
+ ---
76
+
77
+ # Open Source Philosophy
78
+
79
+ AutonomousX publishes **complete reproducible implementations** including:
80
+
81
+ • Dataset pipelines
82
+ • Tokenizer training
83
+ • Model architectures
84
+ • TPU training scripts
85
+ • Checkpointing systems
86
+ • Inference pipelines
87
+
88
+ All repositories aim to provide **transparent and educational implementations** so the open-source community can learn how large language models are trained from the ground up.
89
+
90
+ ---
91
+
92
+ # Why This Matters
93
+
94
+ Many tutorials focus only on **using pretrained models**, but very few resources explain:
95
+
96
+ • How to train LLMs from scratch
97
+ • How to run training pipelines on TPUs
98
+ • How distributed JAX training works
99
+ • How datasets like **The Pile** are processed at scale
100
+
101
+ AutonomousX aims to make these processes **accessible, understandable, and reproducible**.
102
+
103
+ ---
104
+
105
+ # Author
106
+
107
+ **Rohit Yadav**
108
+
109
+ B.Tech 3rd Year
110
+ Dr. B.R. Ambedkar National Institute of Technology (NIT) Jalandhar, India
111
+
112
+ E-mail: yrohit1825@gmail.com
113
+
114
+ LinkedIn:
115
+ https://www.linkedin.com/in/rohit-yadav-25535b256/
116
+
117
+ GitHub:
118
+ https://github.com/YADAV1825
119
+
120
+ ---
121
+
122
+ # Organization
123
+
124
+ **AutonomousX**
125
+
126
+ Open-source research initiative for **training Large Language Models from scratch on TPUs using JAX**.
127
+
128
+ ---