---
title: README
emoji: ๐
colorFrom: pink
colorTo: purple
sdk: static
pinned: false
---

# Orature AI - Pioneering Urdu Language AI

**Mission:** Orature AI is dedicated to advancing the frontiers of Artificial Intelligence and Language Models for the Urdu language. We aim to develop computationally efficient, culturally aware, and accessible language technologies that empower local communities, researchers, and businesses. Our work focuses on bridging the linguistic digital divide and promoting equitable and sustainable AI development.

**Vision:** To be a leading force in creating and democratizing state-of-the-art NLP resources for Urdu, a low-resource language, fostering innovation and inclusivity in the global AI landscape.

## About Us

Orature AI has emerged from the foundational work of the ALIF الف project, a Final Year Project at Habib University (Spring 2025). Our core team comprises passionate researchers and engineers committed to open-source principles and collaborative innovation.

**Core Team (Founders of ALIF الف):**
* Syed Muhammad Ali Naqvi
* Zainab Haider
* Syeda Haya Fatima
* Hammad Sajid
* Ali Muhammad Asad

**Supervisors of ALIF الف:**
* Dr. Abdul Samad
* Dr. Inayat Ullah

**Affiliation:**
* Habib University, Dhanani School of Science and Engineering

## Our Focus Areas

* **Data Curation & Tokenization:** Creating and meticulously preprocessing large-scale, culturally relevant datasets and language-specific tokenizers.
* **Urdu Language Model Development:** Creating robust pretrained and instruction-tuned Small Language Models (SLMs) for Urdu.
* **Low-Resource NLP:** Developing scalable frameworks and methodologies for building language models for underrepresented languages.
* **Open Source Contribution:** Sharing models, datasets, and research findings with the global community.
* **Sustainable AI:** Advocating for efficient and environmentally conscious AI practices.

## Our Flagship Project: ALIF الف

The **ALIF الف** project represents our initial and core contribution, featuring a series of Urdu pretrained generative models, custom tokenizers, and comprehensive datasets.
<!-- * [Link to ALIF Project Paper/Website (if separate from HF)]
* [Link to ALIF Models on Hugging Face]
* [Link to ALIF Datasets on Hugging Face] -->

<!-- ## Values

* **Openness:** We believe in the power of open-source to accelerate research and development.
* **Inclusivity:** We strive to make AI accessible and beneficial for all linguistic communities.
* **Rigor:** We are committed to high-quality research and meticulous development practices.
* **Collaboration:** We welcome partnerships and contributions from the wider community.
* **Impact:** We aim to create AI solutions that have a tangible positive impact. -->

## Get Involved

* **Explore our Models & Datasets:** Browse our contributions on the Hugging Face Hub.
<!-- * **Contribute:** We encourage contributions to our open-source projects. Check out our GitHub repositories [Link to Orature AI GitHub Org, if applicable]. -->
<!-- * **Contact Us:** For collaborations, inquiries, or feedback, please reach out to [YOUR_ORATURE_AI_EMAIL_OR_CONTACT_METHOD]. -->