Update README.md
README.md
CHANGED
@@ -6,5 +6,14 @@ colorTo: gray
 sdk: static
 pinned: false
 ---
+We are a publishing house with 2,000+ titles, dedicated to the digitization and preservation of literary classics and linguistic resources. Our focus is on building high-quality, aligned datasets for Low-Resource Languages, specifically within the Dravidian and Indic language families.
 
-
+Currently, we are working on large-scale projects including:
+
+Parallel Corpora: Multilingual alignment of classic literature (English, Malayalam, Hindi, Kannada, and Tamil).
+
+Lexical Datasets: Digitizing comprehensive dictionaries like Shabdatharavali for AI training and NLP research.
+
+Classic Literature Digitization: Converting a vast catalog of public domain titles into AI-ready formats (e-Pub/JSON).
+
+Our goal is to bridge the gap in Machine Translation and NLU for Indian languages by providing clean, human-verified, and culturally rich data.