---
license: mit
tags:
- pytorch
- audio
- emotion-recognition
- audio-classification
- customer-service
---

# Mantis

## Model Description

Mantis is an audio-based emotion recognition model designed for customer service intelligence. It classifies emotional states from speech audio using a hybrid HuBERT + CNN architecture, enabling real-time sentiment monitoring in call center environments.

## Model Architecture

- **Architecture**: HuBERT (feature extractor) + CNN (classifier head)
- **Framework**: PyTorch
- **Task**: Audio Emotion Classification
- **Input**: Raw audio waveforms / mel spectrograms
- **Output**: Emotion class (e.g., neutral, happy, angry, sad, frustrated)

## Training Details

- **Dataset**: Trained on emotion speech datasets (e.g., RAVDESS, IEMOCAP, or proprietary customer service audio)
- **Approach**: HuBERT pre-trained representations fed into a custom CNN classifier
- **Fine-tuning**: End-to-end fine-tuning for customer service emotion categories

## Performance

Evaluated on held-out emotion speech samples, with strong accuracy across the key emotion classes relevant to customer service.

## Files

| File | Description |
|------|-------------|
| `emotion_model.pth` | Final trained HuBERT-CNN emotion recognition model |

## Usage

```python
import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub
model_path = hf_hub_download(repo_id='devanshty/Mantis', filename='emotion_model.pth')

# Load model (adjust to your model class).
# This call assumes the file holds a fully pickled model object. If it holds
# a state_dict instead, instantiate your model class and call load_state_dict().
# Recent PyTorch releases may also need weights_only=False to unpickle a full
# model; only do that for files you trust.
model = torch.load(model_path, map_location='cpu')
model.eval()

# Run inference on audio features
# (preprocess audio to match training pipeline)
```

## Download & Use

```python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(repo_id='devanshty/Mantis', filename='emotion_model.pth')
```
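
## Interpreting the Output

The model card lists the output as an emotion class but does not document the raw output format. The sketch below assumes the model returns one raw score (logit) per class and that the classes follow the order of the examples given under Output. Both are assumptions, and `EMOTION_LABELS` and `decode_emotion` are hypothetical names rather than part of the released model. It uses only the standard library so it can be checked without downloading the checkpoint.

```python
import math

# Assumed label order, taken from the examples listed under Output.
# The real order depends on how the checkpoint was trained.
EMOTION_LABELS = ["neutral", "happy", "angry", "sad", "frustrated"]


def softmax(scores):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decode_emotion(scores, labels=EMOTION_LABELS):
    """Return (label, probability) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError("expected one score per label")
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical scores for one utterance, e.g. model(features)[0].tolist()
label, confidence = decode_emotion([0.2, 0.1, 2.5, 0.3, 1.1])
print(label, round(confidence, 3))
```

If the checkpoint already applies a softmax or uses a different label set, adjust `EMOTION_LABELS` and skip the `softmax` step accordingly.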