---
title: Urdu Emoji Predictor
emoji: 🎯
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: app.py
pinned: false
---
# 🎯 Urdu Emoji Predictor
A tool that predicts relevant emojis for Urdu text by comparing multilingual sentence embeddings of the text with emoji embeddings using cosine similarity.
## 🚀 Try It Out!
Simply enter Urdu text and get the most relevant emojis instantly.
## 🎯 Examples
- `میں بہت خوش ہوں` ("I am very happy") → 🎉 🎊 👌
- `دل ٹوٹ گیا ہے` ("My heart is broken") → 🌚 😞 💔
- `نیند آ رہی ہے` ("I am feeling sleepy") → 😴 😞 🌚
- `دوستوں کے ساتھ پارٹی` ("Party with friends") → 🎉 😋 🎊
## 🔧 How It Works
1. **Text Encoding**: Converts Urdu text to semantic embeddings using multilingual sentence transformers
2. **Similarity Search**: Compares text embeddings with pre-computed emoji embeddings
3. **Ranking**: Returns top emojis based on cosine similarity scores
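The similarity search and ranking steps can be sketched in a few lines. This is a minimal illustration, not the Space's actual code: `cosine_similarity` and `rank_emojis` are hypothetical helper names, and the 3-dimensional vectors are toy stand-ins for the real sentence-transformer embeddings.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_emojis(text_vec, emoji_vecs, top_k=3):
    """Return the top_k (emoji, score) pairs, highest similarity first."""
    scored = [(emoji, cosine_similarity(text_vec, vec))
              for emoji, vec in emoji_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy vectors (assumption): real embeddings would come from the encoder.
emoji_vecs = {
    "🎉": [1.0, 0.0, 0.0],
    "💔": [0.0, 1.0, 0.0],
    "😴": [0.0, 0.0, 1.0],
}
text_vec = [0.9, 0.1, 0.3]

top_two = rank_emojis(text_vec, emoji_vecs, top_k=2)
print([emoji for emoji, _ in top_two])
```

Because the emoji embeddings do not depend on the input text, they can be computed once and reused for every query, which is what "pre-computed" refers to in step 2.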
## 🏗️ Technical Details
- **Model**: `sentence-transformers/paraphrase-multilingual-mpnet-base-v2`
- **Emojis**: 80 most common emojis from Urdu social media
- **Method**: Cosine similarity between text and emoji embeddings
- **Framework**: Gradio + FastAPI
## 📊 Model Performance
- **Top-1 Accuracy**: ~16%
- **Top-3 Accuracy**: ~30%
- **Trained on**: 800K+ Urdu text-emoji pairs
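For readers unfamiliar with the metric, the sketch below shows how top-k accuracy is commonly computed, assuming the usual definition: a prediction counts as correct when the true emoji appears among the k highest-ranked candidates. The function name `top_k_accuracy` and the four toy examples are illustrative only and are not taken from this project's evaluation code or data.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of examples whose true label is in the first k predictions."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)

# Toy data (assumption): each inner list is ranked best-first.
ranked_predictions = [
    ["🎉", "🎊", "👌"],
    ["😞", "💔", "🌚"],
    ["😴", "😞", "🌚"],
    ["😋", "🎊", "🎉"],
]
true_labels = ["🎉", "💔", "🌚", "😂"]

print(top_k_accuracy(ranked_predictions, true_labels, k=1))
print(top_k_accuracy(ranked_predictions, true_labels, k=3))
```

With 80 candidate emojis, uniform random guessing would score about 1.25% at top-1 and 3.75% at top-3, which gives a baseline for reading the figures above.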
## 🎮 Usage
```python
from urdu_specific_embedding import UrduOptimizedPredictor

# Load the predictor from the saved model directory.
predictor = UrduOptimizedPredictor("models/urdu_optimized_model")
# Request the three best-matching emojis with their similarity scores.
predictions = predictor.predict_smart("میں بہت خوش ہوں", top_k=3)
# Returns: [('🎉', 0.555), ('🎊', 0.537), ('👌', 0.439)]
```