# 🤖 **Meena** - Enterprise AI Pipeline

[![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/sheikh-vegeta/Meena/auto-train-publish.yml?branch=main&style=for-the-badge&logo=github&label=CI%2FCD&color=4CAF50)](https://github.com/sheikh-vegeta/Meena/actions)
[![Python Version](https://img.shields.io/badge/python-3.8%2B-blue?style=for-the-badge&logo=python&color=3776AB)](https://python.org)
[![Hugging Face](https://img.shields.io/badge/🤗%20Hugging%20Face-Models-yellow?style=for-the-badge&color=FFD21E)](https://huggingface.co)
[![License](https://img.shields.io/github/license/sheikh-vegeta/Meena?style=for-the-badge&color=FF6B6B)](https://github.com/sheikh-vegeta/Meena/blob/main/LICENSE)

**🌍 Bengali & English Conversational AI**

*"Revolutionary technology meets mother tongue"*

---

### ⚡ **Enterprise-grade CI/CD pipeline for training, benchmarking, and deploying intelligent conversational AI**
---

## 🎯 **Key Features**
### 🚀 **Automated Pipeline**

- ⚙️ **CI/CD Automation**
- 🔍 **Smart Change Detection**
- 🔄 **Multi-environment Support**

*"Set it up once, run it forever"*

### 🧠 **Advanced AI Training**

- 🎯 **LoRA Fine-tuning**
- 📊 **Integrated Benchmarking**
- 🌍 **Multilingual Support**

*"Specially optimized for the Bengali language"*

### 📦 **Professional Deployment**

- 🤗 **HuggingFace Integration**
- 📝 **Auto Model Cards**
- 🔔 **Smart Notifications**

*"World-class model deployment"*
---

## 🛠️ **Architecture Overview**
```mermaid
flowchart TD
    A[🔄 Code Push] --> B[🕵️ Change Detection]
    B --> C{📝 Changes?}
    C -->|Training| D[🎓 Model Training]
    C -->|Benchmark| E[📈 Performance Eval]
    D --> F[📊 Training Metrics]
    E --> G[📈 Benchmark Results]
    F --> H[🚀 Model Publishing]
    G --> H
    H --> I[🤗 HuggingFace Hub]
    H --> J[📦 GitHub Release]
    I --> K[🧪 Testing]
    J --> K
    K --> L[✅ Quality Gates]
    L --> M[🔔 Notification]

    style A fill:#e3f2fd
    style D fill:#f3e5f5
    style E fill:#fff8e1
    style H fill:#e8f5e8
    style M fill:#fce4ec
```
---

## 🚀 **Quick Start**

### Instructions

```bash
# Clone repository
git clone https://github.com/sheikh-vegeta/Meena.git
cd Meena

# Create virtual environment
python -m venv meena-env
source meena-env/bin/activate  # Windows: meena-env\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Start training
python train.py --language bengali

# Run benchmark
python benchmark.py --eval-lang bn
```

> 💡 **Pro tip:** Use `--language mixed` to train on Bengali and English together!

---

## 📋 **Pipeline Jobs**
| 🎯 Job | Description | Triggers |
|---------|-------------|----------|
| 🕵️ **detect-changes** | Change Detection | Always |
| 🎓 **train** | Model Training | Training scripts modified |
| 📈 **benchmark** | Performance Testing | Model changes |
| 🚀 **publish** | Model Publishing | Training success |
| 🧪 **test** | Final Validation | Post-deployment |
| 🔔 **notify** | Send Notifications | Pipeline completion |
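
The trigger column above can be read as a small routing rule: changed files decide which jobs run. The following is a minimal sketch of that idea, not the project's actual workflow logic. The job names come from the table; `JOB_PATTERNS`, `select_jobs`, and the glob patterns are hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical path patterns per job; the real workflow filters are not
# documented in this README.
JOB_PATTERNS = {
    "train": ["train.py", "datasets/*"],
    "benchmark": ["benchmark.py", "models/*"],
}


def select_jobs(changed_files, training_succeeded=True):
    """Return the pipeline jobs to run, in order, for a list of changed paths."""
    jobs = ["detect-changes"]  # always runs
    for job, patterns in JOB_PATTERNS.items():
        if any(fnmatchcase(path, pattern)
               for path in changed_files
               for pattern in patterns):
            jobs.append(job)
    # publish and test only follow a successful training run
    if "train" in jobs and training_succeeded:
        jobs.extend(["publish", "test"])
    jobs.append("notify")  # runs when the pipeline completes
    return jobs


if __name__ == "__main__":
    print(select_jobs(["datasets/bengali/chat.json", "README.md"]))
```

A documentation-only change selects just `detect-changes` and `notify`, which is the point of detecting changes before spending compute on training.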
---

## 🌍 **Multilingual Support**

### 🇧🇩 Bengali (বাংলা) Features
| Feature | Status |
|---------|--------|
| 📚 **Native Datasets** | ✅ Active |
| 🔤 **Proper Tokenization** | ✅ Active |
| 🎭 **Cultural Context** | ✅ Active |
| ⚡ **Fast Inference** | ✅ Active |
> **Special optimization for the Bengali language:**
> *"Our model can understand Bengali grammar, idioms, and regional language variation."*

### Training Data Structure

```
datasets/
├── 🇧🇩 bengali/
│   ├── আনুষ্ঠানিক-কথোপকথন.json    # Formal dialogues
│   ├── নৈমিত্তিক-চ্যাট.json          # Casual conversations
│   └── সাহিত্যিক-সংলাপ.json        # Literary dialogues
├── 🇺🇸 english/
│   ├── dialogpt_data.json
│   └── general_conversations.json
└── 🌍 mixed/
    └── bilingual_pairs.json       # Bilingual pairs
```

---

## 📊 **Benchmarking**

### Metrics Overview
| Metric | What It Measures | Bengali | English | Mixed |
|--------|------------------|---------|---------|-------|
| 📈 **Perplexity** | Language model quality | `< 15` | `< 12` | `< 18` |
| 🎯 **BLEU Score** | Translation quality | `> 85` | `> 88` | `> 82` |
| 🗣️ **Dialogue Coherence** | Dialogue consistency | `> 90%` | `> 92%` | `> 88%` |
| ⚡ **Inference Speed** | Response speed | `< 200ms` | `< 180ms` | `< 220ms` |
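
One way to use the thresholds in the table above is as an automated pass/fail check on benchmark results. The sketch below assumes the results arrive as a plain dictionary. The threshold values are copied from the table, while `THRESHOLDS`, `failed_gates`, and the metric key names are hypothetical and not part of the repository's documented interface.

```python
# Thresholds copied from the metrics table: (comparison, limit) per language.
# The key names are illustrative assumptions.
THRESHOLDS = {
    "perplexity": {"bengali": ("<", 15), "english": ("<", 12), "mixed": ("<", 18)},
    "bleu": {"bengali": (">", 85), "english": (">", 88), "mixed": (">", 82)},
    "dialogue_coherence_pct": {"bengali": (">", 90), "english": (">", 92), "mixed": (">", 88)},
    "inference_ms": {"bengali": ("<", 200), "english": ("<", 180), "mixed": ("<", 220)},
}


def failed_gates(language, results):
    """Return a description of every metric that misses its threshold."""
    failures = []
    for metric, per_language in THRESHOLDS.items():
        op, limit = per_language[language]
        value = results[metric]
        # Strict comparisons, matching the "<" and ">" signs in the table
        passed = value < limit if op == "<" else value > limit
        if not passed:
            failures.append(f"{metric}: {value} (needs {op} {limit})")
    return failures


if __name__ == "__main__":
    sample = {"perplexity": 14.2, "bleu": 86.0,
              "dialogue_coherence_pct": 91.0, "inference_ms": 150}
    print(failed_gates("bengali", sample) or "all gates passed")
```

Because the comparisons are strict, a value exactly equal to a limit counts as a failure.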
> **Bengali-specific metrics:**
> *"Our benchmarking system is built specifically for the Bengali language."*

---

## 🔔 **Notification System**
### 📱 **Smart Notifications**

| Platform | Notification Type | Status |
|----------|-------------------|--------|
| 📧 **Email** | Critical Failures | 🟢 Active |
| 💬 **Slack** | Team Updates | 🟢 Active |
| 🚨 **Discord** | Community Alerts | 🟢 Active |
| 📱 **GitHub** | Issue Tracking | 🟢 Active |
---

## 🤝 **Contributing**

### 🌟 **How to Contribute**
```mermaid
flowchart LR
    A[🍴 Fork Repository] --> B[🌿 Create Branch]
    B --> C[⚡ Make Changes]
    C --> D[✅ Test Locally]
    D --> E[📝 Commit & Push]
    E --> F[🚀 Pull Request]
```
### Contribution Areas

- 🧠 **Model Improvements**
- 🌐 **Language Support**
- 📊 **Benchmarking**
- 🔧 **Infrastructure**
- 📚 **Documentation**

> **A message for contributors:**
> *"Every contribution you make helps build a bright future for Bengali AI. We welcome your creativity and expertise!"*

---

## 🏆 **Acknowledgments**
### 🙏 **Special Thanks**

| 🤝 Contributor | Contribution |
|-----------------|--------------|
| 🤗 **Hugging Face** | Transformers Library |
| 🌐 **Bengali NLP Community** | Datasets & Feedback |
| 👥 **All Contributors** | Code & Documentation |
| 🇧🇩 **Bangladesh AI Community** | Inspiration & Support |
---

### 🔮 **Future Vision**

**"A world where technology honors our mother tongue"**

---

**Made with ❤️ by the Meena Team**

[![⭐ Star this repository](https://img.shields.io/github/stars/sheikh-vegeta/Meena?style=social)](https://github.com/sheikh-vegeta/Meena)
[![🐛 Report Bug](https://img.shields.io/badge/🐛-Report%20Bug-red?style=flat-square)](https://github.com/sheikh-vegeta/Meena/issues)
[![💡 Request Feature](https://img.shields.io/badge/💡-Request%20Feature-blue?style=flat-square)](https://github.com/sheikh-vegeta/Meena/issues)

**📧 Contact:** [GitHub Issues](https://github.com/sheikh-vegeta/Meena/issues) | **💬 Discuss:** [GitHub Discussions](https://github.com/sheikh-vegeta/Meena/discussions)