Update README.md
tags:
- physicswallah
language:
- en
model_name: PhysicsWallah/Aryabhata-1.0
model_creator: Physics Wallah AI Research
model_type: Causal decoder-based model
base_model: Qwen/Qwen2.5-Math-7B
pipeline_tag: text-generation
# Aryabhata 1.0 🌟

**Aryabhata 1.0** is a 7B-parameter small language model for mathematics developed by **Physics Wallah AI Research**, optimized for high-stakes Indian competitive exams like **JEE Main**. Despite its compact size, Aryabhata 1.0 achieves **state-of-the-art performance** on exam-centric reasoning tasks with impressive **token efficiency** and low inference cost.

> 🚧 *Aryabhata 1.0 is an **experimental release**. We are actively seeking feedback; please contribute in the Discussions tab of this repo.*

---

## 🧠 Key Features
- **Reinforcement Learning with Verifiable Rewards (RLVR)**

### 🔀 Model Merging

We began with model merging (weighted average) to build a strong initialization (Aryabhata 0.5) by combining diverse model capabilities:

* Qwen 2.5 Math: a robust math-centric LLM with solid symbolic math foundations.
* AceMath: an enhanced version of Qwen 2.5 Math, fine-tuned by NVIDIA for improved accuracy on mathematics benchmarks.
* DeepSeek R1 Distill Qwen: a long-form reasoning model, fine-tuned on reasoning traces distilled from DeepSeek R1.
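The weighted-average merge can be sketched as follows. This is a minimal illustration on toy state dicts; the actual merge weights and parameter handling used to build Aryabhata 0.5 are not published here, so the numbers below are purely hypothetical.

```python
# Hedged sketch of weighted-average model merging (not the team's exact recipe):
# each parent model's parameters are combined element-wise with scalar weights.
def merge_state_dicts(state_dicts, weights):
    """Parameter-wise weighted average of model state dicts.

    state_dicts: list of {param_name: list of floats}
    weights: list of floats that should sum to 1.0
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "merge weights should sum to 1"
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(weights, state_dicts))
            for i in range(len(state_dicts[0][name]))
        ]
    return merged

# Toy stand-ins for the three parent models, with hypothetical weights.
qwen_math = {"layer.w": [1.0, 2.0]}
ace_math = {"layer.w": [3.0, 2.0]}
r1_distill = {"layer.w": [5.0, 2.0]}
merged = merge_state_dicts([qwen_math, ace_math, r1_distill], [0.5, 0.3, 0.2])
# Weighted average of 1, 3, 5 with weights 0.5/0.3/0.2 is 2.4.
print(merged["layer.w"])
```

In practice this would run over real `state_dict()` tensors (e.g. with `torch`), but the arithmetic is the same.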
We extracted ~250K raw questions from Physics Wallah's internal database and app…

Final curated dataset: ~130K high-quality questions.

For each question:

* Generated 4 CoTs using Aryabhata 0.5.
* Retained only those leading to correct final answers.

Resulting Dataset:
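The keep-only-correct-CoTs step amounts to rejection sampling against known answers. A minimal sketch, assuming each CoT ends with a `\boxed{...}` final answer; the function names and answer format here are illustrative, not the team's actual pipeline.

```python
# Illustrative rejection-sampling filter: keep CoTs whose final boxed answer
# matches the known answer. (Nested braces inside \boxed{} are not handled.)
def extract_final_answer(cot):
    """Return the content of the last \\boxed{...} in a chain of thought, or None."""
    start = cot.rfind("\\boxed{")
    if start == -1:
        return None
    end = cot.find("}", start)
    return cot[start + len("\\boxed{"):end]

def filter_cots(cots, gold_answer):
    """Retain only CoTs leading to the correct final answer."""
    return [c for c in cots if extract_final_answer(c) == gold_answer]

# Four sampled CoTs for one question whose known answer is "2":
cots = [
    "x^2 = 4 and x > 0, so x = 2. Final answer: \\boxed{2}",
    "A sign error gives \\boxed{-2}",
    "Therefore \\boxed{2}",
    "No boxed answer here",
]
print(len(filter_cots(cots, "2")))  # only the two correct CoTs survive
```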
We used a custom in-house variant of Group Relative Policy Optimization (GRPO)…

We used RLVR on the remaining ~30K questions.

This multi-phase training strategy allows Aryabhata 1.0 to capture **pedagogy-aligned reasoning patterns**, making it highly effective for solving real student queries in mathematics.

---
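A minimal sketch of the RLVR idea: a verifier assigns a binary reward to each sampled solution, and a GRPO-style update scores each sample relative to its own group. This is illustrative only; Physics Wallah's in-house GRPO variant is not public, and real implementations operate on token log-probabilities rather than bare rewards.

```python
# Hedged sketch of verifiable rewards + group-relative scoring (GRPO-style).
def verifiable_reward(pred_answer, gold_answer):
    """Binary verifiable reward: 1.0 iff the final answer matches the gold answer."""
    return 1.0 if pred_answer.strip() == gold_answer.strip() else 0.0

def group_relative_advantages(rewards):
    """Normalize each sampled response's reward against its own group."""
    mean = sum(rewards) / len(rewards)
    std = (sum((r - mean) ** 2 for r in rewards) / len(rewards)) ** 0.5
    return [(r - mean) / (std + 1e-8) for r in rewards]

# Four sampled solutions to one question whose verified answer is "2":
rewards = [verifiable_reward(ans, "2") for ans in ["2", "2", "-2", "7"]]
print(group_relative_advantages(rewards))  # correct samples get positive advantage
```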
We used a composite evaluation metric to reflect real-world grading rigor and re…

### 🔹 Accuracy Comparison Across Models



> *Aryabhata has the best accuracy on JEE Main Maths, on par with frontier models.*

### 🔹 Accuracy vs Token Usage



> *Aryabhata is on par with frontier models in accuracy versus token usage.*

---
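The accuracy-vs-token-usage trade-off shown in the plots can be summarized per model roughly like this (toy numbers for illustration, not the benchmark results above):

```python
# Sketch of the two axes in the plot: accuracy and mean tokens per question.
def summarize(results):
    """results: list of (is_correct, tokens_used) pairs, one per question."""
    n = len(results)
    accuracy = sum(correct for correct, _ in results) / n
    mean_tokens = sum(tokens for _, tokens in results) / n
    return accuracy, mean_tokens

# Hypothetical per-question outcomes for one model:
runs = [(1, 800), (1, 1200), (0, 3000), (1, 900)]
acc, toks = summarize(runs)
print(acc, toks)  # higher accuracy at lower mean tokens means better efficiency
```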
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig

model_id = "PhysicsWallahAI/Aryabhata-1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Minimal generation loop (illustrative; adjust GenerationConfig as needed)
messages = [{"role": "user", "content": "Find all the values of \\sqrt[3]{1}"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, generation_config=GenerationConfig(max_new_tokens=512))
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
To run the model efficiently using vLLM:

```python
from vllm import LLM, SamplingParams

# Initialize model (downloads from Hugging Face if not local)
llm = LLM(model="PhysicsWallahAI/Aryabhata-1.0")

# Define prompt and sampling configuration
query = 'Find all the values of \\sqrt[3]{1}'
# Illustrative sampling settings; tune for your use case
sampling_params = SamplingParams(temperature=0.0, max_tokens=1024)

# Generate and print the completion
results = llm.generate([query], sampling_params)
print(results[0].outputs[0].text.strip())
```
## 🚀 Roadmap

**Aryabhata 2.0** (upcoming):

- Extending domain coverage to **Physics** and **Chemistry**
- Supporting **JEE Advanced**, **NEET**, and **Foundation syllabus**
- Further optimization for affordability and accuracy in real-time deployments
If you use this model, please cite:

```bibtex
@misc{Aryabhata2025,
  title  = {Aryabhata 1.0: A compact, exam-focused language model tailored for mathematics in Indian competitive exams, especially JEE Main},
  author = {Physics Wallah AI Research},
  year   = {2025},
  note   = {\url{https://huggingface.co/PhysicsWallahAI/Aryabhata-1.0}},
}
```