| --- |
| license: mit |
| tags: |
| - pytorch |
| - audio |
| - emotion-recognition |
| - audio-classification |
| - customer-service |
| --- |
| |
| # Mantis |
|
|
| ## Model Description |
| Mantis is an audio-based emotion recognition model designed for customer service intelligence. It classifies emotional states from speech audio using a HuBERT + CNN hybrid architecture, enabling real-time sentiment monitoring in call center environments. |
|
|
| ## Model Architecture |
| - **Architecture**: HuBERT (feature extractor) + CNN (classifier head) |
| - **Framework**: PyTorch |
| - **Task**: Audio Emotion Classification |
| - **Input**: Raw audio waveforms / mel spectrograms |
| - **Output**: Emotion class (e.g., neutral, happy, angry, sad, frustrated) |
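
The HuBERT + CNN layout listed above can be sketched roughly as follows. This is a minimal illustration, not the released model: the class name `CNNHead`, the layer widths, the kernel sizes, and the choice of five classes are assumptions. A hidden size of 768 matches HuBERT Base, but the card does not say which HuBERT variant Mantis uses.

```python
import torch
from torch import nn


class CNNHead(nn.Module):
    """Hypothetical CNN classifier head over HuBERT frame features."""

    def __init__(self, hidden_size=768, num_classes=5):
        super().__init__()
        # 1D convolutions slide over the time (frame) axis.
        self.conv = nn.Sequential(
            nn.Conv1d(hidden_size, 256, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(256, 128, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.pool = nn.AdaptiveAvgPool1d(1)  # average over all frames
        self.fc = nn.Linear(128, num_classes)

    def forward(self, features):
        # features: (batch, frames, hidden_size), as a HuBERT encoder would emit
        x = features.transpose(1, 2)   # -> (batch, hidden_size, frames)
        x = self.conv(x)               # -> (batch, 128, frames)
        x = self.pool(x).squeeze(-1)   # -> (batch, 128)
        return self.fc(x)              # -> (batch, num_classes) logits


head = CNNHead()
dummy_features = torch.randn(2, 49, 768)  # random stand-in for encoder output
logits = head(dummy_features)
print(logits.shape)
```

Global average pooling lets the head accept utterances of any length, which may matter for call audio where segment durations vary.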
|
|
| ## Training Details |
| - **Dataset**: Trained on emotion speech datasets (e.g., RAVDESS, IEMOCAP, or proprietary customer service audio) |
| - **Approach**: HuBERT pre-trained representations fed into a custom CNN classifier |
| - **Fine-tuning**: End-to-end fine-tuning for customer service emotion categories |
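
One end-to-end fine-tuning step, as described in the list above, might look like the sketch below. It is written under stated assumptions: `TinyEncoder` is a hypothetical stand-in for the HuBERT encoder so the example runs without downloading weights, and the optimizer, learning rate, and batch contents are illustrative rather than the values used to train Mantis.

```python
import torch
from torch import nn

torch.manual_seed(0)


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for HuBERT: (batch, samples) -> (batch, frames, 32)."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(1, 32, kernel_size=400, stride=320)

    def forward(self, wav):
        return self.conv(wav.unsqueeze(1)).transpose(1, 2)


class Head(nn.Module):
    """Small CNN classifier over the encoder's frame features."""

    def __init__(self, hidden=32, num_classes=5):
        super().__init__()
        self.conv = nn.Conv1d(hidden, 16, kernel_size=3, padding=1)
        self.fc = nn.Linear(16, num_classes)

    def forward(self, feats):
        x = torch.relu(self.conv(feats.transpose(1, 2)))
        return self.fc(x.mean(dim=-1))  # average over frames, then classify


encoder, head = TinyEncoder(), Head()
# "End-to-end" here means one optimizer updates encoder and head together.
params = list(encoder.parameters()) + list(head.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-4)

wav = torch.randn(4, 16000)           # random stand-in for 1 s clips at 16 kHz
labels = torch.tensor([0, 2, 4, 1])   # made-up class indices
before = encoder.conv.weight.detach().clone()

logits = head(encoder(wav))
loss = nn.functional.cross_entropy(logits, labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()

encoder_updated = not torch.equal(before, encoder.conv.weight)
print(float(loss), encoder_updated)
```

Freezing the encoder instead would only require passing `head.parameters()` to the optimizer; the card states that fine-tuning was end-to-end, so both parts are updated here.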
|
|
| ## Performance |
| Evaluated on held-out emotion speech samples with strong accuracy across key emotion classes relevant to customer service. |
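
Accuracy "across key emotion classes" is usually reported per class. The sketch below shows one way to compute it; the helper name `per_class_accuracy` is hypothetical and the labels are made-up toy data, not results from Mantis.

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples for each true class."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


# Toy labels for illustration only.
y_true = ["angry", "angry", "neutral", "sad"]
y_pred = ["angry", "neutral", "neutral", "sad"]
scores = per_class_accuracy(y_true, y_pred)
print(scores)
```

Per-class figures can expose weak spots, such as confusing angry with neutral, that a single overall accuracy would hide.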
|
|
| ## Files |
| | File | Description | |
| |------|-------------| |
| | `emotion_model.pth` | Final trained HuBERT-CNN emotion recognition model | |
|
|
| ## Usage |
|
|
| ```python |
| import torch |
| from huggingface_hub import hf_hub_download |
| |
| # Download model |
| model_path = hf_hub_download(repo_id='devanshty/Mantis', filename='emotion_model.pth') |
| |
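| # Load the model. This assumes the file stores a full pickled nn.Module, so the |
| # model class must be importable. PyTorch >= 2.6 defaults to weights_only=True, |
| # which rejects pickled modules; pass weights_only=False only for trusted files. |
| model = torch.load(model_path, map_location='cpu', weights_only=False) |
| model.eval()  # if the file holds a state_dict instead, load it into your model class first |
| |
| # Run inference on audio features |
| # (preprocess audio to match the training pipeline) |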
| ``` |
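
The preprocessing and decoding steps left open in the snippet above could be filled in along these lines. This is a sketch under assumptions: the label order in `LABELS` is hypothetical and must match the order used in training, the helper names are illustrative, and the audio is assumed to be already resampled to 16 kHz, the rate the public HuBERT checkpoints were trained on.

```python
import torch

# Hypothetical label order; replace with the order from the training pipeline.
LABELS = ["neutral", "happy", "angry", "sad", "frustrated"]


def prepare_waveform(waveform: torch.Tensor) -> torch.Tensor:
    """Turn (channels, samples) or (samples,) audio into a normalized (1, samples) batch."""
    if waveform.dim() == 1:
        waveform = waveform.unsqueeze(0)
    mono = waveform.mean(dim=0)                        # downmix to mono
    mono = (mono - mono.mean()) / (mono.std() + 1e-7)  # zero mean, unit variance
    return mono.unsqueeze(0)


def decode_logits(logits: torch.Tensor):
    """Map (1, num_classes) logits to a (label, probability) pair."""
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    idx = int(probs.argmax())
    return LABELS[idx], float(probs[idx])


stereo_clip = torch.randn(2, 16000)  # random stand-in for a 1 s stereo clip
batch = prepare_waveform(stereo_clip)
label, confidence = decode_logits(torch.tensor([[0.0, 0.0, 5.0, 0.0, 0.0]]))
print(batch.shape, label)
```

With a loaded model, the call would be something like `decode_logits(model(batch))` inside `torch.no_grad()`, provided the model accepts raw waveforms; if it expects mel spectrograms, that transform needs to be applied first.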
|
|
|
|