--- |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- axmeeabdhullo/axya-tech-dv100 |
|
|
language: |
|
|
- dv |
|
|
metrics: |
|
|
- accuracy |
|
|
base_model: |
|
|
- openai-community/gpt2 |
|
|
new_version: axmeeabdhullo/axya-mini |
|
|
pipeline_tag: question-answering |
|
|
library_name: adapter-transformers |
|
|
--- |
|
|
|
|
|
# Axya-Mini |
|
|
|
|
|
> A fine-tuned GPT-2 adapter model for question answering and text generation in Dhivehi (Thaana script).
|
|
|
|
|
## Model Description |
|
|
|
|
|
Axya-Mini is a lightweight, adapter-based language model for Dhivehi, the national language of the Maldives. Built on the GPT-2 architecture and fine-tuned through small adapter layers, it targets question-answering tasks while staying compact and fast at inference.
|
|
|
|
|
**Model Type:** Adapter-based Fine-tuned Model |
|
|
**Base Model:** GPT-2 (openai-community/gpt2) |
|
|
**Language:** Dhivehi (ދިވެހި) |
|
|
**Framework:** Quetzal (CPU-optimized training library)
|
|
|
|
|
## Key Features |
|
|
|
|
|
- 🌟 **Language-Specific:** Optimized for Dhivehi (dv) language processing |
|
|
- ⚡ **Lightweight:** Efficient adapter architecture for fast inference |
|
|
- 🎉 **Question Answering:** Trained on question-answering tasks |
|
|
- 💾 **Safetensors Format:** Secure model serialization |
|
|
- 🤗 **Adapter-Based:** Uses adapter layers for efficient fine-tuning and storage |
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Intended Use |
|
|
|
|
|
This model is designed for: |
|
|
- Question answering in Dhivehi |
|
|
- Text generation tasks in Dhivehi |
|
|
- Language understanding for Dhivehi content |
|
|
- Building Dhivehi NLP applications (see the usage sketch below)
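
As a rough starting point, the snippet below sketches what inference could look like. It assumes the published weights load as a standard GPT-2 checkpoint through `transformers`; if only adapter weights are hosted, they would instead be attached with `load_adapter()` from the `adapters` package (see the training sketch under Training Methodology).

```python
# Minimal inference sketch. Assumption: the repo loads as a standard
# GPT-2 checkpoint; the prompt format is up to you.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("axmeeabdhullo/axya-mini")

prompt = "<your Dhivehi (Thaana) question here>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```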
|
|
|
|
|
### Training Data |
|
|
|
|
|
**Dataset:** [axmeeabdhullo/axya-tech-dv100](https://huggingface.co/datasets/axmeeabdhullo/axya-tech-dv100) |
|
|
A curated Dhivehi dataset of 100 technical and educational text samples.
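
The dataset can be pulled with the standard `datasets` API; the `train` split name below is an assumption, as the card does not state the split layout.

```python
from datasets import load_dataset

# Split name "train" is assumed; adjust to the dataset's actual layout.
ds = load_dataset("axmeeabdhullo/axya-tech-dv100", split="train")
print(len(ds))  # expected to be 100 per the description above
```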
|
|
|
|
|
### Training Methodology |
|
|
|
|
|
- **Fine-tuning Approach:** Adapter-based fine-tuning |
|
|
- **Metrics:** Accuracy optimization |
|
|
- **Library:** adapter-transformers |
|
|
- **Optimization:** Efficient parameter updates through adapter modules (see the sketch below)
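
To make this concrete, here is a minimal fine-tuning sketch using the `adapters` package, the maintained successor to adapter-transformers. The adapter name `dv_qa`, the hyperparameters, and the dataset's `text` column are illustrative assumptions; the actual training configuration is not documented in this card.

```python
# Sketch of adapter-based fine-tuning with the `adapters` package.
# Adapter name, hyperparameters, and the "text" column are assumptions.
import adapters
from adapters import AdapterTrainer
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
tokenizer.pad_token = tokenizer.eos_token       # GPT-2 has no pad token

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
adapters.init(model)            # retrofit adapter support onto the model
model.add_adapter("dv_qa")      # insert bottleneck adapter layers
model.train_adapter("dv_qa")    # freeze base weights; train only the adapter

raw = load_dataset("axmeeabdhullo/axya-tech-dv100", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_dataset = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

trainer = AdapterTrainer(
    model=model,
    args=TrainingArguments(output_dir="axya-mini", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_adapter("axya-mini/dv_qa", "dv_qa")  # save adapter weights only
```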
|
|
### Quetzal Library Optimization |
|
|
|
|
|
**Quetzal** is the CPU-optimized library used to train this model efficiently, making high-quality AI accessible without expensive GPUs.
|
|
|
|
|
**Library Features:** |
|
|
- 🚀 **3x Faster CPU Training**: Advanced optimizations for CPU-based training |
|
|
- 📊 **Data Augmentation**: Train accurate models with minimal data (5-10x augmentation) |
|
|
- 💾 **Memory Efficient**: 4-bit quantization and LoRA for reduced memory usage |
|
|
- 🎯 **High Accuracy**: Specialized techniques for low-resource scenarios |
|
|
- 🌍 **Multilingual**: Optimized for languages like Dhivehi, but works for any language |
|
|
- 🔧 **Easy to Use**: Simple API similar to popular libraries |
|
|
|
|
|
**Installation:** |
|
|
```bash |
|
|
pip install quetzal-ai |
|
|
``` |
|
|
|
|
|
**Why Quetzal for Dhivehi?** |
|
|
Quetzal is specifically optimized for low-resource languages like Dhivehi, enabling efficient model training and deployment without requiring expensive GPU infrastructure. This makes it ideal for building NLP models for endangered or underrepresented languages. |
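
Quetzal's own API is not documented in this card, so the sketch below illustrates the LoRA technique from the feature list with the standard `peft` library as a stand-in; the rank and alpha values are illustrative, not Axya-Mini's actual settings.

```python
# CPU-friendly LoRA sketch on GPT-2 with the standard `peft` library,
# as a stand-in for Quetzal's (undocumented here) API. Hyperparameters
# are illustrative only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"],   # GPT-2's fused query/key/value projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small LoRA matrices train
# From here, train with a standard transformers Trainer loop.
```

Because only the low-rank matrices receive gradients, the approach keeps memory use small enough for CPU training, which matches the library's stated goal.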
|
|
|
|
|
|
|
|
## Model Performance |
|
|
|
|
|
- **Inference Speed:** Fast and efficient due to adapter architecture |
|
|
- **Model Size:** Compact compared to full model fine-tuning |
|
|
- **Accuracy:** Optimized for Dhivehi language understanding tasks |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Trained specifically on Dhivehi language content |
|
|
- Performance may vary with dialects or regional variations |
|
|
- Inference is fastest on GPU/TPU; CPU inference works but is slower
|
|
- Limited evaluation on diverse downstream tasks |
|
|
|
|
|
## Recommendations |
|
|
|
|
|
1. **Fine-tuning:** Can be further fine-tuned on domain-specific Dhivehi data |
|
|
2. **Deployment:** Use with sufficient computational resources for production |
|
|
3. **Evaluation:** Test on your specific use case before deployment |
|
|
4. **Updates:** Check for newer versions of the model for improved performance |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{axya_mini,
  author       = {Abdhullo, Axmee},
  title        = {Axya-Mini: Dhivehi Language Question-Answering Model},
  year         = {2025},
  publisher    = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/axmeeabdhullo/axya-mini}}
}
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
This model is licensed under the Apache License 2.0. See the LICENSE file for details. |
|
|
|
|
|
## Related Resources |
|
|
|
|
|
- **Dataset:** [axya-tech-dv100](https://huggingface.co/datasets/axmeeabdhullo/axya-tech-dv100) |
|
|
- **Base Model:** [GPT-2](https://huggingface.co/openai-community/gpt2) |
|
|
- **Library Documentation:** [Adapter-Transformers](https://adapterhub.ml/) |
|
|
- **Hugging Face Hub:** [Model Hub](https://huggingface.co/) |
|
|
|
|
|
## Author |
|
|
|
|
|
**Axmee Abdhullo** |
|
|
AI/ML Developer specializing in Dhivehi NLP |
|
|
[Hugging Face](https://huggingface.co/axmeeabdhullo) |
|
|
|
|
|
## Contact & Support |
|
|
|
|
|
For questions, suggestions, or support: |
|
|
- Open an issue on the model's repository |
|
|
- Join the Hugging Face community discussions |
|
|
- Check the model card for updates |
|
|
|
|
|
--- |
|
|
|
|
|
**Last Updated:** December 2024 |
|
|
**Status:** Active Development |
|
|
**Version:** 1.0 |
|
|
|