---
title: README
emoji: π»
colorFrom: pink
colorTo: gray
sdk: static
pinned: false
---
# Nalanda Data
**Open data and models for Indian STEM education AI.**
We build and publish datasets and fine-tuned models grounded in Indian academic content: JEE & NEET competitive exam questions, school and college textbooks, and multimodal science material. The goal is to make Indian education AI workable for researchers, builders, and edtech companies who can't easily source this data elsewhere.
[nalandadata.ai](https://nalandadata.ai)
[info@nalandadata.ai](mailto:info@nalandadata.ai) (commercial) · [tech@nalandadata.ai](mailto:tech@nalandadata.ai) (technical)
---
## What we publish
### Datasets
| Dataset | Focus | License |
|---|---|---|
| [`NalandaJEENEETBench`](https://huggingface.co/datasets/Nalandadata/NalandaJEENEETBench) | JEE & NEET benchmark across Physics, Chemistry, Mathematics, Biology | CC-BY-NC-4.0 |
| [`DrishtiTable`](https://huggingface.co/datasets/Nalandadata/DrishtiTable) | Table structure recognition in Indian textbooks (EN + HI) | Apache-2.0 |
| [`nalanda-image-qa`](https://huggingface.co/datasets/Nalandadata/nalanda-image-qa) | 1,000 multimodal STEM Q&A pairs (image + text) | CC-BY-4.0 |
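
The datasets above can most likely be pulled with the Hugging Face `datasets` library. The snippet below is a minimal sketch, not an official loader: the helper name `load_nalanda_dataset` and the `DATASETS` mapping are hypothetical, and it assumes each repo loads with its default configuration (check the dataset card for config names, splits, or access gating).

```python
# Repo IDs are taken from the table above; everything else is illustrative.
DATASETS = {
    "NalandaJEENEETBench": "Nalandadata/NalandaJEENEETBench",
    "DrishtiTable": "Nalandadata/DrishtiTable",
    "nalanda-image-qa": "Nalandadata/nalanda-image-qa",
}


def load_nalanda_dataset(name: str):
    """Load one of the listed datasets by its short name (hypothetical helper)."""
    repo_id = DATASETS[name]  # raises KeyError for names not in the table

    # Imported lazily so the mapping is usable without `datasets` installed.
    from datasets import load_dataset

    # Assumption: the default config is sufficient; some repos may need
    # an explicit config name or split.
    return load_dataset(repo_id)


if __name__ == "__main__":
    # Requires `pip install datasets` and network access.
    bench = load_nalanda_dataset("NalandaJEENEETBench")
    print(bench)
```
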
### Models
| Model | Built on | Use case | License |
|---|---|---|---|
| [`nalanda-qwen-7b-grpo`](https://huggingface.co/Nalandadata/nalanda-qwen-7b-grpo) | Qwen 2.5 7B Instruct + GRPO | JEE / NEET problem solving | Apache-2.0 |
| [`nalanda-image-vl`](https://huggingface.co/Nalandadata/nalanda-image-vl) | Llama 3.2 11B Vision (LoRA) | Multimodal STEM Q&A | Llama 3.2 |
| [`DrishtiTable-Qwen2.5-VL-7B`](https://huggingface.co/Nalandadata/DrishtiTable-Qwen2.5-VL-7B) | Qwen 2.5 VL 7B (LoRA) | Table structure recognition | Apache-2.0 |
### Demos
- [`nalanda-jee-neet-solver`](https://huggingface.co/spaces/Nalandadata/nalanda-jee-neet-solver): try the JEE/NEET solver in your browser.
---
## Who this is for
- **Researchers** building or evaluating models on Indian-context STEM
- **Edtech companies** training tutoring, grading, or content-generation systems
- **AI labs** that need benchmarks reflecting non-Western curricula and multilingual (EN / HI) educational content
---
## Data and licensing
The public artifacts on Hugging Face are samples of larger internal datasets. Each repo has its own license, listed in its dataset/model card.
- **Open-licensed releases** (Apache-2.0, CC-BY-4.0) can be used commercially under the terms of those licenses.
- **Non-commercial releases** (CC-BY-NC-4.0) require a separate commercial license for production or revenue-generating use.
- **Full-scale versions, custom slices, and licensed access to the parent corpora** are available on request.
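
The licensing rules above can be sketched as a small lookup. This is an illustrative helper only: the function name and the status strings are hypothetical, the license values are copied from the tables in this README, and any license not named in the bullets (for example `Llama 3.2`) is deliberately routed to "check the repo card" rather than guessed.

```python
# License groups as described in the bullets above.
OPEN_LICENSES = {"Apache-2.0", "CC-BY-4.0"}
NON_COMMERCIAL_LICENSES = {"CC-BY-NC-4.0"}

# Repo -> license, copied from the Datasets and Models tables.
REPO_LICENSES = {
    "NalandaJEENEETBench": "CC-BY-NC-4.0",
    "DrishtiTable": "Apache-2.0",
    "nalanda-image-qa": "CC-BY-4.0",
    "nalanda-qwen-7b-grpo": "Apache-2.0",
    "nalanda-image-vl": "Llama 3.2",
    "DrishtiTable-Qwen2.5-VL-7B": "Apache-2.0",
}


def commercial_use_status(license_id: str) -> str:
    """Map a license string to a coarse commercial-use status (hypothetical)."""
    if license_id in OPEN_LICENSES:
        return "allowed under the license terms"
    if license_id in NON_COMMERCIAL_LICENSES:
        return "separate commercial license required"
    # Anything else is not covered by the bullets above.
    return "check the repo card"


if __name__ == "__main__":
    for repo, license_id in REPO_LICENSES.items():
        print(f"{repo}: {license_id} -> {commercial_use_status(license_id)}")
```

The repo's own dataset or model card remains the authoritative source for its license.
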
**For commercial licensing, full dataset access, custom data work, or partnerships:**
**[info@nalandadata.ai](mailto:info@nalandadata.ai)**
**For technical questions, integration help, or fine-tuning support:**
**[tech@nalandadata.ai](mailto:tech@nalandadata.ai)**
**[nalandadata.ai](https://nalandadata.ai)**