Nalan-data committed on
Commit 75faadb · verified · 1 Parent(s): 54dd8b1

Replace placeholder org README with commercial-grade org card

Files changed (1)
  1. README.md +64 -10
README.md CHANGED
@@ -1,10 +1,64 @@
- ---
- title: README
- emoji: 😻
- colorFrom: pink
- colorTo: gray
- sdk: static
- pinned: false
- ---
-
- Edit this `README.md` markdown file to author your organization card.
+ ---
+ title: README
+ emoji: 😻
+ colorFrom: pink
+ colorTo: gray
+ sdk: static
+ pinned: false
+ ---
+
+ # Nalanda Data
+
+ **Open data and models for Indian STEM education AI.**
+
+ We build and publish datasets and fine-tuned models grounded in Indian academic content: JEE & NEET competitive exam questions, school and college textbooks, and multimodal science material. The goal is to make Indian education AI workable for researchers, builders, and edtech companies who can’t easily source this data elsewhere.
+
+ 🌐 [nalandadata.ai](https://nalandadata.ai)
+ 📧 tech@nalandadata.ai
+
+ ---
+
+ ## What we publish
+
+ ### Datasets
+
+ | Dataset | Focus | License |
+ |---|---|---|
+ | [`NalandaJEENEETBench`](https://huggingface.co/datasets/Nalandadata/NalandaJEENEETBench) | JEE & NEET benchmark across Physics, Chemistry, Mathematics, Biology | CC-BY-NC-4.0 |
+ | [`DrishtiTable`](https://huggingface.co/datasets/Nalandadata/DrishtiTable) | Table structure recognition in Indian textbooks (EN + HI) | Apache-2.0 |
+ | [`nalanda-image-qa`](https://huggingface.co/datasets/Nalandadata/nalanda-image-qa) | 1,000 multimodal STEM Q&A pairs (image + text) | CC-BY-4.0 |
+
+ ### Models
+
+ | Model | Built on | Use case | License |
+ |---|---|---|---|
+ | [`nalanda-qwen-7b-grpo`](https://huggingface.co/Nalandadata/nalanda-qwen-7b-grpo) | Qwen 2.5 7B Instruct + GRPO | JEE / NEET problem solving | Apache-2.0 |
+ | [`nalanda-image-vl`](https://huggingface.co/Nalandadata/nalanda-image-vl) | Llama 3.2 11B Vision (LoRA) | Multimodal STEM Q&A | Llama 3.2 |
+ | [`DrishtiTable-Qwen2.5-VL-7B`](https://huggingface.co/Nalandadata/DrishtiTable-Qwen2.5-VL-7B) | Qwen 2.5 VL 7B (LoRA) | Table structure recognition | Apache-2.0 |
+
+ ### Demos
+
+ - [`nalanda-jee-neet-solver`](https://huggingface.co/spaces/Nalandadata/nalanda-jee-neet-solver): try the JEE/NEET solver in your browser.
+
+ ---
+
+ ## Who this is for
+
+ - **Researchers** building or evaluating models on Indian-context STEM
+ - **Edtech companies** training tutoring, grading, or content-generation systems
+ - **AI labs** that need benchmarks reflecting non-Western curricula and multilingual (EN / HI) educational content
+
+ ---
+
+ ## Data and licensing
+
+ The public artifacts on Hugging Face are samples of larger internal datasets. Each repo has its own license, listed in its dataset/model card.
+
+ - **Open-licensed releases** (Apache-2.0, CC-BY-4.0) can be used commercially under the terms of those licenses.
+ - **Non-commercial releases** (CC-BY-NC-4.0) require a separate commercial license for production or revenue-generating use.
+ - **Full-scale versions, custom slices, and licensed access to the parent corpora** are available on request.
+
+ For commercial licensing, full dataset access, custom data work, or partnerships:
+
+ 📧 **tech@nalandadata.ai**
+ 🌐 **[nalandadata.ai](https://nalandadata.ai)**