	---
	language: he
	license: apache-2.0
	datasets:
	- custom
	tags:
	- text-classification
	- intent-classification
	- hebrew
	- nlp
	- bert
	- customer-service
	widget:
	- text: "שכחתי את הסיסמה שלי"
	  example_title: "Password Reset"
	- text: "רוצה לבטל את המנוי"
	  example_title: "Cancel Subscription"
	- text: "כמה עולה החבילה"
	  example_title: "General Question"
	- text: "האתר לא עובד"
	  example_title: "Technical Support"
	---

	# Hebrew Intent Classification Model

	## Model Description

	This is a BERT model fine-tuned for Hebrew intent classification in customer service scenarios. It classifies Hebrew text into four intent categories that are common in customer support interactions.

	## Supported Intent Classes

	1. ביטול מנוי (Cancel Subscription) - Requests to cancel or terminate services
	2. שאלה כללית (General Question) - General inquiries about services, pricing, or account management
	3. שכחת סיסמה (Password Reset) - Issues related to forgotten passwords or login problems
	4. תמיכה טכנית (Technical Support) - Technical issues, bugs, or system problems
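
Because the model returns its labels in Hebrew, downstream code written in English may want a stable mapping to English identifiers. The sketch below is one way to do that. The English keys (`cancel_subscription` and so on) and the `to_english` helper are hypothetical names chosen for illustration; they are not part of the model's configuration.

```python
# Hypothetical mapping from the model's Hebrew labels to English identifiers.
# The Hebrew strings are the four classes listed above; the English names
# are illustrative assumptions, not values stored in the model config.
INTENT_TO_ENGLISH = {
    "ביטול מנוי": "cancel_subscription",
    "שאלה כללית": "general_question",
    "שכחת סיסמה": "password_reset",
    "תמיכה טכנית": "technical_support",
}


def to_english(label: str) -> str:
    """Translate a predicted Hebrew label, or return 'unknown' if unseen."""
    return INTENT_TO_ENGLISH.get(label.strip(), "unknown")


print(to_english("שכחת סיסמה"))
```

Returning `"unknown"` instead of raising keeps a caller from crashing if the label set ever changes.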

	## Usage

	```python
	from transformers import pipeline

	# Load the model
	classifier = pipeline("text-classification", model="Huggingm1r@n/hebrew-intent-classifier")

	# Make predictions
	result = classifier("שכחתי את הסיסמה שלי")
	print(result)
	# Illustrative output (the exact score will vary):
	# [{'label': 'שכחת סיסמה', 'score': 0.95}]

	# Test other examples
	examples = [
	    "רוצה לבטל את המנוי",
	    "כמה עולה החבילה",
	    "האתר לא עובד",
	]

	for text in examples:
	    result = classifier(text)
	    print(f"'{text}' -> {result[0]['label']} ({result[0]['score']:.2%})")
	```

	## Direct Usage with Transformers

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification
	import torch

	# Load model and tokenizer
	tokenizer = AutoTokenizer.from_pretrained("Huggingm1r@n/hebrew-intent-classifier")
	model = AutoModelForSequenceClassification.from_pretrained("Huggingm1r@n/hebrew-intent-classifier")

	def predict_intent(text):
	    """Return the predicted intent label and its softmax probability."""
	    inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

	    # Inference only, so skip gradient tracking
	    with torch.no_grad():
	        outputs = model(**inputs)
	        logits = outputs.logits
	        probabilities = torch.softmax(logits, dim=-1)

	    # Highest-scoring class and its probability
	    predicted_id = torch.argmax(logits, dim=-1).item()
	    predicted_label = model.config.id2label[predicted_id]
	    confidence = probabilities[0][predicted_id].item()

	    return predicted_label, confidence

	# Example
	intent, confidence = predict_intent("שכחתי את הסיסמה")
	print(f"Intent: {intent}, Confidence: {confidence:.2%}")
	```
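
The confidence returned by `predict_intent` is the softmax probability of the highest logit. As a minimal, framework-free sketch of that step, the snippet below applies softmax to four made-up logits and picks the argmax. The numbers are illustrative assumptions, not real model output.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for the four intent classes (illustrative only)
example_logits = [0.2, -1.0, 3.1, 0.5]
probs = softmax(example_logits)

# Index of the most probable class, equivalent to argmax over the logits
predicted_id = max(range(len(probs)), key=probs.__getitem__)

print(predicted_id, probs[predicted_id])
```

Since softmax preserves ordering, taking the argmax over logits or over probabilities selects the same class.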

	## Training Details

	- Base Model: bert-base-multilingual-cased
	- Training Data: 135 Hebrew customer service examples (augmented from 12 original examples)
	- Data Augmentation: Manual variations, formal/informal styles, polite forms
	- Performance: Over 90% accuracy on the validation set
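
The augmentation described above was done manually. As a rough programmatic approximation of the same idea (not the author's actual procedure), the sketch below expands a seed utterance by adding polite openers and closers. The helper name `augment` and the specific Hebrew phrases are assumptions chosen for illustration.

```python
from itertools import product

# Illustrative polite openers: none, "Hello, ", "Please, ".
# These phrases are assumptions, not the actual training data.
PREFIXES = ["", "שלום, ", "בבקשה, "]

# Illustrative closers: none, " thanks".
SUFFIXES = ["", " תודה"]


def augment(seed):
    """Return de-duplicated variants of a seed utterance, seed first."""
    variants = []
    for prefix, suffix in product(PREFIXES, SUFFIXES):
        candidate = f"{prefix}{seed}{suffix}"
        if candidate not in variants:
            variants.append(candidate)
    return variants


for variant in augment("רוצה לבטל את המנוי"):
    print(variant)
```

Every variant inherits the label of its seed, so a handful of seeds per class can be expanded into a larger labelled set.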

	## Example Predictions

	| Hebrew Text | Predicted Intent | English Translation |
	|-------------|------------------|---------------------|
	| שכחתי את הסיסמה שלי | שכחת סיסמה | I forgot my password |
	| רוצה לבטל את המנוי | ביטול מנוי | Want to cancel subscription |
	| כמה עולה החבילה | שאלה כללית | How much does the package cost |
	| האתר לא עובד | תמיכה טכנית | The website doesn't work |

	## Use Cases

	- Customer Service Chatbots: Route Hebrew customer queries automatically
	- Support Ticket Classification: Categorize support requests by intent
	- Voice of Customer Analysis: Analyze Hebrew customer feedback
	- Automated Response Systems: Trigger appropriate responses based on intent
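
For the chatbot and ticket-routing use cases, the predicted label can serve as a dictionary key that selects a destination. The following is a sketch only: the queue names and the `route_ticket` helper are hypothetical, and the input is assumed to have the shape the `text-classification` pipeline returns (a list of dicts with `label` and `score` keys).

```python
# Hypothetical queue names keyed by the model's four Hebrew labels.
QUEUES = {
    "ביטול מנוי": "retention",
    "שאלה כללית": "general",
    "שכחת סיסמה": "account-access",
    "תמיכה טכנית": "tech-support",
}


def route_ticket(prediction, default="human-review"):
    """Pick a queue from a pipeline-style prediction such as
    [{'label': ..., 'score': ...}]; unknown labels go to `default`."""
    if not prediction:
        return default
    return QUEUES.get(prediction[0]["label"], default)


# A hand-written prediction standing in for real pipeline output
sample = [{"label": "תמיכה טכנית", "score": 0.91}]
print(route_ticket(sample))
```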

	## Limitations

	- Designed specifically for the customer service domain
	- Limited to the four predefined intent classes
	- May perform poorly on very informal Hebrew or slang
	- Requires Hebrew text input
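
Given these limitations, a caller may want to guard the model with two cheap checks: reject input that contains no Hebrew letters, and treat low-confidence predictions as undecided. The sketch below assumes a threshold of 0.6, which is an arbitrary illustrative value rather than a tuned one, and the helper names are hypothetical.

```python
def contains_hebrew(text):
    """True if any character is a Hebrew letter (U+05D0 to U+05EA)."""
    return any("\u05d0" <= ch <= "\u05ea" for ch in text)


def accept_prediction(text, label, score, threshold=0.6):
    """Return the label, or None when the input has no Hebrew letters
    or the score falls below the (arbitrary) threshold."""
    if not contains_hebrew(text) or score < threshold:
        return None
    return label


# Hand-written labels and scores standing in for real model output
print(accept_prediction("האתר לא עובד", "תמיכה טכנית", 0.93))
print(accept_prediction("the site is down", "תמיכה טכנית", 0.93))
print(accept_prediction("האתר לא עובד", "תמיכה טכנית", 0.41))
```

A `None` result can then be sent to a human agent or a clarifying prompt instead of an automated response.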

	## Model Files

	- Uses `safetensors` format for secure model storage
	- Compatible with the latest Transformers library
	- Includes comprehensive tokenizer configuration

	## Citation

	```bibtex
	@misc{hebrew-intent-classifier-2025,
	  title={Hebrew Intent Classification Model for Customer Service},
	  author={Huggingm1r@n},
	  year={2025},
	  publisher={Hugging Face},
	  url={https://huggingface.co/Huggingm1r@n/hebrew-intent-classifier}
	}
	```

	## License

	This model is released under the Apache 2.0 License.