Axya-Mini

A fine-tuned GPT-2 adapter model for Dhivehi (Thaana) language question-answering and text generation tasks.

Model Description

Axya-Mini is a lightweight, adapter-based language model designed for Dhivehi, the national language of the Maldives. Built on the GPT-2 architecture and fine-tuned with adapter layers, it targets question-answering tasks while keeping a compact size and fast inference.

Model Type: Adapter-based Fine-tuned Model
Base Model: GPT-2 (openai-community/gpt2)
Language: Dhivehi (ދިވެހި)
Framework: Quetzal (CPU-optimized training library)

Key Features

  • 🌟 Language-Specific: Optimized for Dhivehi (dv) language processing
  • Lightweight: Efficient adapter architecture for fast inference
  • 🎉 Question Answering: Trained on question-answering tasks
  • 💾 Safetensors Format: Secure model serialization
  • 🤗 Adapter-Based: Uses adapter layers for efficient fine-tuning and storage
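
The adapter approach mentioned above inserts small bottleneck modules into each transformer layer and trains only those, leaving the base GPT-2 weights frozen. A minimal conceptual sketch (pure Python, with illustrative dimensions, not the model's actual values):

```python
# Conceptual sketch of a bottleneck adapter layer (pure Python, no framework).
# Dimensions and initialization are illustrative, not Axya-Mini's actual values.
import random

def matvec(W, x):
    """Multiply matrix W (rows x cols) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def adapter(x, W_down, W_up):
    """Bottleneck adapter: down-project, ReLU, up-project, residual add."""
    h = matvec(W_down, x)                   # hidden_dim -> bottleneck_dim
    h = [max(0.0, v) for v in h]            # nonlinearity
    up = matvec(W_up, h)                    # bottleneck_dim -> hidden_dim
    return [a + b for a, b in zip(x, up)]   # residual connection

random.seed(0)
hidden, bottleneck = 8, 2                   # GPT-2 uses hidden=768; 8 keeps this readable
W_down = [[random.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(bottleneck)]
W_up = [[0.0] * bottleneck for _ in range(hidden)]  # zero init: adapter starts as identity
x = [1.0] * hidden
assert adapter(x, W_down, W_up) == x        # zero up-projection: output == input
```

Zero-initializing the up-projection is a common design choice: the adapter initially passes inputs through unchanged, so fine-tuning starts from the base model's behavior.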

Model Details

Intended Use

This model is designed for:

  • Question answering in Dhivehi
  • Text generation tasks in Dhivehi
  • Language understanding for Dhivehi content
  • Building Dhivehi NLP applications

Training Data

Dataset: axmeeabdhullo/axya-tech-dv100
A curated Dhivehi dataset of 100 high-quality technical and educational samples.

Training Methodology

  • Fine-tuning Approach: Adapter-based fine-tuning
  • Metrics: Accuracy optimization
  • Library: adapter-transformers
  • Optimization: Efficient parameter updating through adapter modules
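
The efficiency claim above comes down to parameter counts: adapters train only a small fraction of the network. A back-of-envelope calculation, assuming GPT-2 small (12 layers, hidden size 768) and a typical bottleneck size of 64 (the actual value used for Axya-Mini is not documented here):

```python
# Back-of-envelope comparison: adapter fine-tuning vs. full fine-tuning.
# Bottleneck size 64 is an assumed typical value, not necessarily Axya-Mini's.
layers, hidden, bottleneck = 12, 768, 64
full_gpt2_params = 124_000_000                      # approximate GPT-2 small total

# Each adapter: down-projection + up-projection (weights plus biases).
per_adapter = hidden * bottleneck + bottleneck + bottleneck * hidden + hidden
adapter_total = per_adapter * layers                # one adapter per layer (simplified)

print(adapter_total)                                # 1,189,632 trainable parameters
print(round(100 * adapter_total / full_gpt2_params, 2))  # about 0.96% of the full model
```

Under these assumptions, roughly 1% of the parameters are updated, which is why adapter checkpoints stay small and training fits on modest hardware.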

Quetzal Library Optimization

Quetzal is a CPU-optimized training library used to train this model efficiently, making high-quality model development accessible without expensive GPUs.

Library Features:

  • 🚀 3x Faster CPU Training: Advanced optimizations for CPU-based training
  • 📊 Data Augmentation: Train accurate models with minimal data (5-10x augmentation)
  • 💾 Memory Efficient: 4-bit quantization and LoRA for reduced memory usage
  • 🎯 High Accuracy: Specialized techniques for low-resource scenarios
  • 🌍 Multilingual: Optimized for languages like Dhivehi, but works for any language
  • 🔧 Easy to Use: Simple API similar to popular libraries
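
To illustrate the memory-saving idea behind the 4-bit quantization bullet above, here is a hedged sketch of absmax quantization in pure Python. Quetzal's actual quantization scheme is not documented here; this only shows the general technique:

```python
# Illustrative 4-bit absmax quantization of a weight vector (pure Python).
def quantize_4bit(weights):
    """Map floats to signed integers in [-7, 7] plus a single float scale."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    q = [max(-7, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.42, -1.3, 0.07, 2.1, -0.88]
q, s = quantize_4bit(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
assert all(-7 <= v <= 7 for v in q)     # each value fits in 4 bits (sign + magnitude)
assert max_err <= s / 2                 # rounding error bounded by half a step
```

Each weight shrinks from 32 bits to 4 (plus one shared scale per block), which is where the roughly 8x memory reduction comes from.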

Installation:

pip install quetzal-ai

Why Quetzal for Dhivehi? Quetzal is specifically optimized for low-resource languages like Dhivehi, enabling efficient model training and deployment without requiring expensive GPU infrastructure. This makes it ideal for building NLP models for endangered or underrepresented languages.
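
The 5-10x data augmentation mentioned in the feature list can be sketched with simple token-level perturbations. This is an illustrative example of the general technique; Quetzal's actual augmentation pipeline is not documented here:

```python
# Hedged sketch of simple text augmentation (random token swap or deletion),
# the kind of technique that can expand a small dataset several-fold.
import random

def augment(tokens, n_variants=5, seed=0):
    """Generate n_variants perturbed copies of a token sequence."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_variants):
        t = list(tokens)
        op = rng.choice(["swap", "drop"])
        if op == "swap" and len(t) > 1:
            i, j = rng.sample(range(len(t)), 2)
            t[i], t[j] = t[j], t[i]     # swap two token positions
        elif op == "drop" and len(t) > 2:
            del t[rng.randrange(len(t))]  # delete one token
        out.append(t)
    return out

variants = augment(["this", "is", "a", "sample", "sentence"])
assert len(variants) == 5               # 5x augmentation of a single example
```

For a 100-sample dataset like axya-tech-dv100, this style of augmentation yields the 500-1000 effective examples implied by the 5-10x figure.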

Model Performance

  • Inference Speed: Fast and efficient due to adapter architecture
  • Model Size: Compact compared to full model fine-tuning
  • Accuracy: Optimized for Dhivehi language understanding tasks

Limitations

  • Trained specifically on Dhivehi language content
  • Performance may vary with dialects or regional variations
  • Inference is fastest on GPU/TPU hardware, though the adapter architecture keeps CPU inference practical
  • Limited evaluation on diverse downstream tasks

Recommendations

  1. Fine-tuning: Can be further fine-tuned on domain-specific Dhivehi data
  2. Deployment: Use with sufficient computational resources for production
  3. Evaluation: Test on your specific use case before deployment
  4. Updates: Check for newer versions of the model for improved performance

Citation

If you use this model, please cite:

@misc{axya_mini,
  author       = {Abdhullo, Axmee},
  title        = {Axya-Mini: Dhivehi Language Question-Answering Model},
  year         = {2025},
  publisher    = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/axmeeabdhullo/axya-mini}}
}

License

This model is licensed under the Apache License 2.0. See the LICENSE file for details.

Author

Axmee Abdhullo
AI/ML Developer specializing in Dhivehi NLP
Hugging Face

Contact & Support

For questions, suggestions, or support:

  • Open an issue on the model's repository
  • Join the Hugging Face community discussions
  • Check the model card for updates

Last Updated: December 2024
Status: Active Development
Version: 1.0
