---
language: he
license: apache-2.0
datasets:
- custom
tags:
- text-classification
- intent-classification
- hebrew
- nlp
- bert
- customer-service
widget:
- text: "שכחתי את הסיסמה שלי"
  example_title: "Password Reset"
- text: "רוצה לבטל את המנוי"
  example_title: "Cancel Subscription"
- text: "כמה עולה החבילה"
  example_title: "General Question"
- text: "האתר לא עובד"
  example_title: "Technical Support"
---

# Hebrew Intent Classification Model

## Model Description

This is a BERT model fine-tuned for Hebrew intent classification in customer service scenarios. It classifies Hebrew text into four intent categories commonly found in customer support interactions.

## Supported Intent Classes

1. **ביטול מנוי** (Cancel Subscription) - Requests to cancel or terminate services
2. **שאלה כללית** (General Question) - General inquiries about services, pricing, or account management
3. **שכחת סיסמה** (Password Reset) - Issues related to forgotten passwords or login problems
4. **תמיכה טכנית** (Technical Support) - Technical issues, bugs, or system problems

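Because the model returns its labels in Hebrew, downstream code written in English may want a lookup table. The following is a minimal sketch built from the four classes listed above; the names `LABEL_TRANSLATIONS` and `to_english` are hypothetical, and the sketch assumes the predicted label strings match the Hebrew names exactly as written in this card.

```python
# Hebrew labels from the list above, mapped to their English names.
# Assumption: the model emits these exact strings as labels.
LABEL_TRANSLATIONS = {
    "ביטול מנוי": "Cancel Subscription",
    "שאלה כללית": "General Question",
    "שכחת סיסמה": "Password Reset",
    "תמיכה טכנית": "Technical Support",
}


def to_english(label: str) -> str:
    """Return the English name for a Hebrew label, or the label itself if unknown."""
    return LABEL_TRANSLATIONS.get(label, label)
```

Unknown labels are passed through unchanged rather than raising, so a mismatch in label spelling shows up in the output instead of crashing the caller.
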
## Usage

```python
from transformers import pipeline

# Load the model
classifier = pipeline("text-classification", model="Huggingm1r@n/hebrew-intent-classifier")

# Make predictions
result = classifier("שכחתי את הסיסמה שלי")
print(result)
# [{'label': 'שכחת סיסמה', 'score': 0.95}]

# Test other examples
examples = [
    "רוצה לבטל את המנוי",
    "כמה עולה החבילה",
    "האתר לא עובד"
]

for text in examples:
    result = classifier(text)
    print(f"'{text}' -> {result[0]['label']} ({result[0]['score']:.2%})")
```

## Direct Usage with Transformers

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Huggingm1r@n/hebrew-intent-classifier")
model = AutoModelForSequenceClassification.from_pretrained("Huggingm1r@n/hebrew-intent-classifier")

def predict_intent(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probabilities = torch.softmax(logits, dim=-1)

    predicted_id = torch.argmax(logits, dim=-1).item()
    predicted_label = model.config.id2label[predicted_id]
    confidence = probabilities[0][predicted_id].item()

    return predicted_label, confidence

# Example
intent, confidence = predict_intent("שכחתי את הסיסמה")
print(f"Intent: {intent}, Confidence: {confidence:.2%}")
```

## Training Details

- **Base Model**: bert-base-multilingual-cased
- **Training Data**: 135 Hebrew customer service examples (augmented from 12 original examples)
- **Data Augmentation**: Manual variations, formal/informal styles, polite forms
- **Performance**: Above 90% accuracy on the validation set

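Given the small training set, it may be worth measuring accuracy on your own labeled examples before relying on the model. The helper below is a minimal sketch and not the evaluation code used for this model; the name `accuracy` and the `(text, expected_label)` pair format are assumptions. It accepts any callable that returns pipeline-style output, such as the `classifier` object from the Usage section.

```python
from typing import Callable, Iterable, Tuple


def accuracy(classifier: Callable, labeled_examples: Iterable[Tuple[str, str]]) -> float:
    """Fraction of examples whose top predicted label equals the expected label."""
    examples = list(labeled_examples)
    if not examples:
        raise ValueError("labeled_examples is empty")

    correct = 0
    for text, expected in examples:
        # Pipeline-style output: a list whose first item holds the top label.
        predicted = classifier(text)[0]["label"]
        if predicted == expected:
            correct += 1
    return correct / len(examples)
```
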
## Example Predictions

| Hebrew Text | Predicted Intent | English Translation |
|-------------|------------------|---------------------|
| שכחתי את הסיסמה שלי | שכחת סיסמה | I forgot my password |
| רוצה לבטל את המנוי | ביטול מנוי | Want to cancel subscription |
| כמה עולה החבילה | שאלה כללית | How much does the package cost |
| האתר לא עובד | תמיכה טכנית | The website doesn't work |

## Use Cases

- **Customer Service Chatbots**: Route Hebrew customer queries automatically
- **Support Ticket Classification**: Categorize support requests by intent
- **Voice of Customer Analysis**: Analyze Hebrew customer feedback
- **Automated Response Systems**: Trigger appropriate responses based on intent

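As an illustration of the routing use case, the sketch below maps a predicted intent to a support queue and hands low-confidence predictions to a human. The queue names, the `route` function, and the default `threshold` of 0.7 are all hypothetical choices, not part of this model; `classifier` is any callable returning pipeline-style output.

```python
from typing import Callable

# Hypothetical queue names; replace them with the queues your system uses.
QUEUES = {
    "ביטול מנוי": "retention",
    "שאלה כללית": "general",
    "שכחת סיסמה": "account-access",
    "תמיכה טכנית": "tech-support",
}

FALLBACK_QUEUE = "human-agent"


def route(text: str, classifier: Callable, threshold: float = 0.7) -> str:
    """Return a queue name, falling back to a human agent on low confidence."""
    top = classifier(text)[0]
    # Unknown labels and scores below the threshold both go to a person.
    if top["score"] < threshold or top["label"] not in QUEUES:
        return FALLBACK_QUEUE
    return QUEUES[top["label"]]
```

The threshold is worth tuning on real traffic: a higher value sends more queries to people, a lower value trusts the model more often.
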
## Limitations

- Designed specifically for the customer service domain
- Limited to 4 predefined intent classes
- May not perform well on very informal Hebrew or slang
- Requires Hebrew text input

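Since the model expects Hebrew input, a caller may want to screen out text with no Hebrew characters before classifying it. The check below is a minimal sketch using the Unicode Hebrew block (U+0590 to U+05FF); the function name `contains_hebrew` is hypothetical, and the check only detects the script, not whether the text is well-formed Hebrew.

```python
import re

# The Unicode Hebrew block spans U+0590 to U+05FF.
_HEBREW_CHAR = re.compile(r"[\u0590-\u05FF]")


def contains_hebrew(text: str) -> bool:
    """Return True if the text has at least one character from the Hebrew block."""
    return _HEBREW_CHAR.search(text) is not None
```
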
## Model Files

- Uses the `safetensors` format for secure model storage
- Compatible with the latest Transformers library
- Includes the full tokenizer configuration

## Citation

```bibtex
@misc{hebrew-intent-classifier-2025,
  title={Hebrew Intent Classification Model for Customer Service},
  author={Huggingm1r@n},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/Huggingm1r@n/hebrew-intent-classifier}
}
```

## License

This model is released under the Apache 2.0 License.