---
license: apache-2.0
datasets:
- axmeeabdhullo/axya-tech-dv100
language:
- dv
metrics:
- accuracy
base_model:
- openai-community/gpt2
new_version: axmeeabdhullo/axya-mini
pipeline_tag: question-answering
library_name: adapter-transformers
---
# Axya-Mini
> A fine-tuned GPT-2 adapter model for Dhivehi (Thaana) language question-answering and text generation tasks.
## Model Description
Axya-Mini is a lightweight, efficient adapter-based language model specifically designed for the Dhivehi language, the national language of the Maldives. Built on the GPT-2 architecture and optimized using adapter layers, this model excels at question-answering tasks while maintaining compact size and fast inference.
**Model Type:** Adapter-based Fine-tuned Model
**Base Model:** GPT-2 (openai-community/gpt2)
**Language:** Dhivehi (ދިވެހި)
**Framework:** Quetzal (CPU-optimized training library)
## Key Features
- 🌟 **Language-Specific:** Optimized for Dhivehi (dv) language processing
- ⚡ **Lightweight:** Efficient adapter architecture for fast inference
- 🎉 **Question Answering:** Trained on question-answering tasks
- 💾 **Safetensors Format:** Secure model serialization
- 🤗 **Adapter-Based:** Uses adapter layers for efficient fine-tuning and storage
## Model Details
### Intended Use
This model is designed for:
- Question answering in Dhivehi
- Text generation tasks in Dhivehi
- Language understanding for Dhivehi content
- Building Dhivehi NLP applications
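The uses above can be sketched in code. The snippet below is an illustrative example, not an official snippet from this repository: the `generate_answer` and `build_qa_prompt` helpers are hypothetical, and the assumption that the adapter loads via the `adapters` library (the successor to adapter-transformers) should be verified against the actual files on the model page.

```python
# Illustrative usage sketch for Axya-Mini; the helper names and the
# loading path via the `adapters` library are assumptions, not the
# repository's documented API.


def build_qa_prompt(question: str) -> str:
    """Format a Dhivehi question into a plain QA-style prompt."""
    return f"Question: {question}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 64) -> str:
    """Load base GPT-2, activate the Axya-Mini adapter, and generate.

    Requires: pip install transformers adapters
    """
    # Heavy imports kept inside the function so the helpers above can be
    # used without transformers installed.
    import adapters
    from transformers import AutoTokenizer, GPT2LMHeadModel

    tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
    model = GPT2LMHeadModel.from_pretrained("openai-community/gpt2")
    adapters.init(model)  # add adapter support to the plain model
    name = model.load_adapter("axmeeabdhullo/axya-mini")
    model.set_active_adapters(name)

    inputs = tokenizer(build_qa_prompt(question), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Calling `generate_answer` downloads the base model and adapter weights on first use, so network access and a few hundred MB of disk are needed.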
### Training Data
**Dataset:** [axmeeabdhullo/axya-tech-dv100](https://huggingface.co/datasets/axmeeabdhullo/axya-tech-dv100)
A curated Dhivehi dataset containing 100 high-quality technical and educational content samples.
### Training Methodology
- **Fine-tuning Approach:** Adapter-based fine-tuning
- **Metrics:** Accuracy optimization
- **Library:** adapter-transformers
- **Optimization:** Efficient parameter updating through adapter modules
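The adapter idea behind this methodology can be shown in a few lines: a small bottleneck module (down-projection, nonlinearity, up-projection, residual) is inserted into a frozen base model, and only the bottleneck weights are trained. The dimensions below are illustrative, not Axya-Mini's actual configuration.

```python
# Conceptual sketch of a bottleneck adapter layer in plain NumPy.
# Only W_down and W_up would be trainable; the base model stays frozen.
import numpy as np

rng = np.random.default_rng(0)
hidden, bottleneck = 768, 64  # GPT-2 hidden size, illustrative bottleneck

W_down = rng.normal(0, 0.02, (hidden, bottleneck))  # trainable
W_up = rng.normal(0, 0.02, (bottleneck, hidden))    # trainable


def adapter_forward(h: np.ndarray) -> np.ndarray:
    """Down-project, apply ReLU, up-project, then add the residual."""
    return h + np.maximum(h @ W_down, 0.0) @ W_up


h = rng.normal(size=(1, 10, hidden))  # a batch of hidden states
out = adapter_forward(h)
print(out.shape)  # (1, 10, 768) -- shape is preserved by the residual

# Trainable parameter count vs. GPT-2's ~124M full weights:
adapter_params = W_down.size + W_up.size
print(adapter_params)  # 98304
```

The residual connection means an untrained (near-zero) adapter barely perturbs the base model, which is why adapters can be trained stably on small datasets like the 100-sample corpus used here.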
### Quetzal Library Optimization
**Quetzal** is a CPU-optimized training library that powered the training of this model, making adapter fine-tuning practical without expensive GPUs.
**Library Features:**
- 🚀 **3x Faster CPU Training**: Advanced optimizations for CPU-based training
- 📊 **Data Augmentation**: Train accurate models with minimal data (5-10x augmentation)
- 💾 **Memory Efficient**: 4-bit quantization and LoRA for reduced memory usage
- 🎯 **High Accuracy**: Specialized techniques for low-resource scenarios
- 🌍 **Multilingual**: Optimized for languages like Dhivehi, but works for any language
- 🔧 **Easy to Use**: Simple API similar to popular libraries
**Installation:**
```bash
pip install quetzal-ai
```
**Why Quetzal for Dhivehi?**
Quetzal is specifically optimized for low-resource languages like Dhivehi, enabling efficient model training and deployment without requiring expensive GPU infrastructure. This makes it ideal for building NLP models for endangered or underrepresented languages.
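One of the memory techniques listed above, 4-bit quantization, can be illustrated generically. This is NOT Quetzal's actual API, just the underlying idea: map float weights to 16 integer levels plus a scale, cutting storage 8x versus float32 at the cost of a bounded rounding error.

```python
# Generic 4-bit symmetric quantization round-trip in NumPy; an
# illustration of the technique, not Quetzal's implementation.
import numpy as np


def quantize_4bit(w: np.ndarray):
    """Map float weights to int levels in [-8, 7] with a per-tensor scale."""
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale


def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale


rng = np.random.default_rng(1)
w = rng.normal(0, 0.02, size=(768, 768)).astype(np.float32)
q, scale = quantize_4bit(w)
w_hat = dequantize(q, scale)

# Rounding error is bounded by half a quantization step.
max_err = np.abs(w - w_hat).max()
print(max_err <= scale / 2 + 1e-6)  # True
```

Real systems refine this with per-block scales and non-uniform level spacings, but the storage/accuracy trade-off shown here is the core of why 4-bit weights fit low-resource training budgets.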
## Model Performance
- **Inference Speed:** Adapter layers add little overhead, so inference is close to base GPT-2 speed
- **Model Size:** Only the adapter weights are stored, far smaller than a full fine-tuned GPT-2 checkpoint
- **Accuracy:** Trained against an accuracy objective on Dhivehi QA data; no public benchmark figures are available yet
## Limitations
- Trained specifically on Dhivehi language content
- Performance may vary with dialects or regional variations
- Runs on CPU, but a GPU will speed up inference for production workloads
- Limited evaluation on diverse downstream tasks
## Recommendations
1. **Fine-tuning:** Can be further fine-tuned on domain-specific Dhivehi data
2. **Deployment:** Use with sufficient computational resources for production
3. **Evaluation:** Test on your specific use case before deployment
4. **Updates:** Check for newer versions of the model for improved performance
## Citation
If you use this model, please cite:
```bibtex
@misc{axya_mini,
  author       = {Abdhullo, Axmee},
  title        = {Axya-Mini: Dhivehi Language Question-Answering Model},
  year         = {2025},
  publisher    = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/axmeeabdhullo/axya-mini}}
}
```
## License
This model is licensed under the Apache License 2.0. See the LICENSE file for details.
## Related Resources
- **Dataset:** [axya-tech-dv100](https://huggingface.co/datasets/axmeeabdhullo/axya-tech-dv100)
- **Base Model:** [GPT-2](https://huggingface.co/openai-community/gpt2)
- **Library Documentation:** [Adapter-Transformers](https://adapterhub.ml/)
- **Hugging Face Hub:** [Model Hub](https://huggingface.co/)
## Author
**Axmee Abdhullo**
AI/ML Developer specializing in Dhivehi NLP
[Hugging Face](https://huggingface.co/axmeeabdhullo)
## Contact & Support
For questions, suggestions, or support:
- Open an issue on the model's repository
- Join the Hugging Face community discussions
- Check the model card for updates
---
**Last Updated:** December 2024
**Status:** Active Development
**Version:** 1.0