insightpublica committed on
Commit 96fa118 · verified · 1 Parent(s): 5603d5c

Update README.md

Files changed (1)
  1. README.md +10 -1
README.md CHANGED
@@ -6,5 +6,14 @@ colorTo: gray
 sdk: static
 pinned: false
 ---
+We are a publishing house with 2,000+ titles, dedicated to the digitization and preservation of literary classics and linguistic resources. Our focus is on building high-quality, aligned datasets for Low-Resource Languages, specifically within the Dravidian and Indic language families.
 
-Edit this `README.md` markdown file to author your organization card.
+Currently, we are working on large-scale projects including:
+
+Parallel Corpora: Multilingual alignment of classic literature (English, Malayalam, Hindi, Kannada, and Tamil).
+
+Lexical Datasets: Digitizing comprehensive dictionaries like Shabdatharavali for AI training and NLP research.
+
+Classic Literature Digitization: Converting a vast catalog of public domain titles into AI-ready formats (e-Pub/JSON).
+
+Our goal is to bridge the gap in Machine Translation and NLU for Indian languages by providing clean, human-verified, and culturally rich data.