--- |
|
|
license: apache-2.0 |
|
|
datasets: |
|
|
- axmeeabdhullo/axya-tech-dv100 |
|
|
language: |
|
|
- dv |
|
|
metrics: |
|
|
- accuracy |
|
|
base_model: |
|
|
- openai-community/gpt2 |
|
|
new_version: axmeeabdhullo/axya-mini |
|
|
pipeline_tag: question-answering |
|
|
library_name: adapter-transformers |
|
|
--- |
|
|
|
|
|
# Axya-Mini |
|
|
|
|
|
> A fine-tuned GPT-2 adapter model for question answering and text generation in Dhivehi (Thaana script).
|
|
|
|
|
## Model Description |
|
|
|
|
|
Axya-Mini is a lightweight, adapter-based language model for Dhivehi, the national language of the Maldives. Built on the GPT-2 architecture and fine-tuned through small adapter layers, it targets question-answering tasks while staying compact and fast at inference.
|
|
|
|
|
**Model Type:** Adapter-based Fine-tuned Model |
|
|
**Base Model:** GPT-2 (openai-community/gpt2) |
|
|
**Language:** Dhivehi (ދިވެހި) |
|
|
**Framework:** Quetzal (CPU-optimized training library)
|
|
|
|
|
## Key Features |
|
|
|
|
|
- 🌟 **Language-Specific:** Optimized for Dhivehi (dv) language processing |
|
|
- ⚡ **Lightweight:** Efficient adapter architecture for fast inference |
|
|
- 🎉 **Question Answering:** Trained on question-answering tasks |
|
|
- 💾 **Safetensors Format:** Secure model serialization |
|
|
- 🤗 **Adapter-Based:** Uses adapter layers for efficient fine-tuning and storage |
|
|
|
|
|
## Model Details |
|
|
|
|
|
### Intended Use |
|
|
|
|
|
This model is designed for: |
|
|
- Question answering in Dhivehi |
|
|
- Text generation tasks in Dhivehi |
|
|
- Language understanding for Dhivehi content |
|
|
- Building Dhivehi NLP applications (see the usage sketch below)
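
As a rough starting point, the snippet below sketches what inference could look like. It assumes the published weights load as a standard GPT-2 checkpoint through `transformers`; if only adapter weights are hosted, they would instead be attached with `load_adapter()` from the `adapters` package (see the training sketch under Training Methodology).

```python
# Minimal inference sketch. Assumption: the repo loads as a standard
# GPT-2 checkpoint; the prompt format is up to you.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
model = AutoModelForCausalLM.from_pretrained("axmeeabdhullo/axya-mini")

prompt = "<your Dhivehi (Thaana) question here>"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```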
|
|
|
|
|
### Training Data |
|
|
|
|
|
**Dataset:** [axmeeabdhullo/axya-tech-dv100](https://huggingface.co/datasets/axmeeabdhullo/axya-tech-dv100) |
|
|
A curated Dhivehi dataset of 100 technical and educational text samples.
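
The dataset can be pulled with the standard `datasets` API; the `train` split name below is an assumption, as the card does not state the split layout.

```python
from datasets import load_dataset

# Split name "train" is assumed; adjust to the dataset's actual layout.
ds = load_dataset("axmeeabdhullo/axya-tech-dv100", split="train")
print(len(ds))  # expected to be 100 per the description above
```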
|
|
|
|
|
### Training Methodology |
|
|
|
|
|
- **Fine-tuning Approach:** Adapter-based fine-tuning |
|
|
- **Metrics:** Accuracy optimization |
|
|
- **Library:** adapter-transformers |
|
|
- **Optimization:** Efficient parameter updates through adapter modules (see the sketch below)
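
To make this concrete, here is a minimal fine-tuning sketch using the `adapters` package, the maintained successor to adapter-transformers. The adapter name `dv_qa`, the hyperparameters, and the dataset's `text` column are illustrative assumptions; the actual training configuration is not documented in this card.

```python
# Sketch of adapter-based fine-tuning with the `adapters` package.
# Adapter name, hyperparameters, and the "text" column are assumptions.
import adapters
from adapters import AdapterTrainer
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("openai-community/gpt2")
tokenizer.pad_token = tokenizer.eos_token       # GPT-2 has no pad token

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")
adapters.init(model)            # retrofit adapter support onto the model
model.add_adapter("dv_qa")      # insert bottleneck adapter layers
model.train_adapter("dv_qa")    # freeze base weights; train only the adapter

raw = load_dataset("axmeeabdhullo/axya-tech-dv100", split="train")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

train_dataset = raw.map(tokenize, batched=True, remove_columns=raw.column_names)

trainer = AdapterTrainer(
    model=model,
    args=TrainingArguments(output_dir="axya-mini", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
model.save_adapter("axya-mini/dv_qa", "dv_qa")  # save adapter weights only
```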
|
|
### Quetzal Library Optimization |
|
|
|
|
|
**Quetzal** is the CPU-optimized library used to train this model efficiently, making high-quality AI accessible without expensive GPUs.
|
|
|
|
|
**Library Features:** |
|
|
- 🚀 **3x Faster CPU Training**: Advanced optimizations for CPU-based training |
|
|
- 📊 **Data Augmentation**: Train accurate models with minimal data (5-10x augmentation) |
|
|
- 💾 **Memory Efficient**: 4-bit quantization and LoRA for reduced memory usage |
|
|
- 🎯 **High Accuracy**: Specialized techniques for low-resource scenarios |
|
|
- 🌍 **Multilingual**: Optimized for languages like Dhivehi, but works for any language |
|
|
- 🔧 **Easy to Use**: Simple API similar to popular libraries |
|
|
|
|
|
**Installation:** |
|
|
```bash |
|
|
pip install quetzal-ai |
|
|
``` |
|
|
|
|
|
**Why Quetzal for Dhivehi?** |
|
|
Quetzal is specifically optimized for low-resource languages like Dhivehi, enabling efficient model training and deployment without requiring expensive GPU infrastructure. This makes it ideal for building NLP models for endangered or underrepresented languages. |
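
Quetzal's own API is not documented in this card, so the sketch below illustrates the LoRA technique from the feature list with the standard `peft` library as a stand-in; the rank and alpha values are illustrative, not Axya-Mini's actual settings.

```python
# CPU-friendly LoRA sketch on GPT-2 with the standard `peft` library,
# as a stand-in for Quetzal's (undocumented here) API. Hyperparameters
# are illustrative only.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("openai-community/gpt2")

lora = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["c_attn"],   # GPT-2's fused query/key/value projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small LoRA matrices train
# From here, train with a standard transformers Trainer loop.
```

Because only the low-rank matrices receive gradients, the approach keeps memory use small enough for CPU training, which matches the library's stated goal.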
|
|
|
|
|
|
|
|
## Model Performance |
|
|
|
|
|
- **Inference Speed:** Fast and efficient due to adapter architecture |
|
|
- **Model Size:** Compact compared to full model fine-tuning |
|
|
- **Accuracy:** Optimized for Dhivehi language understanding tasks |
|
|
|
|
|
## Limitations |
|
|
|
|
|
- Trained specifically on Dhivehi language content |
|
|
- Performance may vary with dialects or regional variations |
|
|
- Inference is fastest on GPU/TPU; CPU inference works but is slower
|
|
- Limited evaluation on diverse downstream tasks |
|
|
|
|
|
## Recommendations |
|
|
|
|
|
1. **Fine-tuning:** Can be further fine-tuned on domain-specific Dhivehi data |
|
|
2. **Deployment:** Use with sufficient computational resources for production |
|
|
3. **Evaluation:** Test on your specific use case before deployment |
|
|
4. **Updates:** Check for newer versions of the model for improved performance |
|
|
|
|
|
## Citation |
|
|
|
|
|
If you use this model, please cite: |
|
|
|
|
|
```bibtex |
|
|
@misc{axya_mini,
  author       = {Abdhullo, Axmee},
  title        = {Axya-Mini: Dhivehi Language Question-Answering Model},
  year         = {2025},
  publisher    = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/axmeeabdhullo/axya-mini}}
}
|
|
``` |
|
|
|
|
|
## License |
|
|
|
|
|
This model is licensed under the Apache License 2.0. See the LICENSE file for details. |
|
|
|
|
|
## Related Resources |
|
|
|
|
|
- **Dataset:** [axya-tech-dv100](https://huggingface.co/datasets/axmeeabdhullo/axya-tech-dv100) |
|
|
- **Base Model:** [GPT-2](https://huggingface.co/openai-community/gpt2) |
|
|
- **Library Documentation:** [Adapter-Transformers](https://adapterhub.ml/) |
|
|
- **Hugging Face Hub:** [Model Hub](https://huggingface.co/) |
|
|
|
|
|
## Author |
|
|
|
|
|
**Axmee Abdhullo** |
|
|
AI/ML Developer specializing in Dhivehi NLP |
|
|
[Hugging Face](https://huggingface.co/axmeeabdhullo) |
|
|
|
|
|
## Contact & Support |
|
|
|
|
|
For questions, suggestions, or support: |
|
|
- Open an issue on the model's repository |
|
|
- Join the Hugging Face community discussions |
|
|
- Check the model card for updates |
|
|
|
|
|
--- |
|
|
|
|
|
**Last Updated:** December 2024 |
|
|
**Status:** Active Development |
|
|
**Version:** 1.0 |
|
|
|