---
title: Emotion Detector API
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
---
# 🧠 Emotion Detector API
A professional RESTful API for emotion recognition in speech, built on the fine-tuned HuBERT model **abedir/emotion-detector**.
## 🚀 Quick Start
### Health Check
```bash
curl https://YOUR-SPACE-NAME.hf.space/health
```
### Predict Emotion
```bash
curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict" \
-F "file=@audio.wav"
```
### Python Example
```python
import requests
# Predict emotion
url = "https://YOUR-SPACE-NAME.hf.space/predict"
with open("audio.wav", "rb") as f:
    response = requests.post(url, files={"file": f})
result = response.json()
print(f"Emotion: {result['emotion']}")
print(f"Confidence: {result['confidence']:.2%}")
```
## 🎯 Supported Emotions
1. **Angry/Fearful** - Expressions of anger or fear
2. **Happy/Laugh** - Joyful or laughing expressions
3. **Neutral/Calm** - Neutral or calm speech
4. **Sad/Cry** - Expressions of sadness or crying
5. **Surprised/Amazed** - Surprised or amazed reactions
## 📡 API Endpoints
### Core Endpoints
- `GET /` - API welcome and version info
- `GET /health` - Health check with system status
- `GET /docs` - **Interactive API documentation (Swagger UI)**
- `GET /redoc` - Alternative API documentation
- `GET /model/info` - Model configuration details
- `GET /emotions` - List of supported emotions
- `GET /stats` - API and system statistics
- `GET /version` - API version information
### Prediction Endpoints
- `POST /predict` - Basic emotion prediction
- `POST /predict/detailed` - Prediction with audio metadata
- `POST /predict/base64` - Predict from base64 encoded audio
- `POST /predict/batch` - Batch processing (max 50 files)
- `POST /predict/top-k` - Get top K predictions
- `POST /predict/threshold` - Confidence-based prediction
### Analysis Endpoints
- `POST /analyze/audio` - Get audio metadata without prediction
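The request body for `/predict/base64` is not documented above; here is a minimal Python sketch, assuming the JSON field is named `audio_base64` (verify the actual schema at `/docs`):

```python
import base64

BASE_URL = "https://YOUR-SPACE-NAME.hf.space"

def build_base64_payload(audio_bytes: bytes) -> dict:
    """Encode raw audio bytes for /predict/base64.

    NOTE: the "audio_base64" field name is an assumption -- check /docs.
    """
    return {"audio_base64": base64.b64encode(audio_bytes).decode("ascii")}

def predict_base64(path: str) -> dict:
    import requests  # imported here so the encoding helper stays dependency-free
    with open(path, "rb") as f:
        payload = build_base64_payload(f.read())
    response = requests.post(f"{BASE_URL}/predict/base64", json=payload)
    response.raise_for_status()
    return response.json()
```

Base64 transport is convenient when the audio already lives in memory (e.g. recorded in a browser) and no temporary file exists to attach as multipart form data.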
## 📦 Response Format
```json
{
"emotion": "Happy/Laugh",
"confidence": 0.8745,
"probabilities": {
"Angry/Fearful": 0.0234,
"Happy/Laugh": 0.8745,
"Neutral/Calm": 0.0521,
"Sad/Cry": 0.0178,
"Surprised/Amazed": 0.0322
}
}
```
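Clients should not assume the `probabilities` map arrives sorted; ranking it locally takes one line (field names taken from the sample response above):

```python
def rank_emotions(result: dict) -> list:
    """Sort the probability map from a /predict response, highest first."""
    return sorted(result["probabilities"].items(), key=lambda kv: kv[1], reverse=True)

# Sample response from the Response Format section above
sample = {
    "emotion": "Happy/Laugh",
    "confidence": 0.8745,
    "probabilities": {
        "Angry/Fearful": 0.0234,
        "Happy/Laugh": 0.8745,
        "Neutral/Calm": 0.0521,
        "Sad/Cry": 0.0178,
        "Surprised/Amazed": 0.0322,
    },
}
print(rank_emotions(sample)[0])  # -> ('Happy/Laugh', 0.8745)
```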
## 🛠️ Integration Examples
### cURL
```bash
# Basic prediction
curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict" \
-F "file=@audio.wav"
# Detailed prediction
curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict/detailed" \
-F "file=@audio.wav"
# Top 3 predictions
curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict/top-k?k=3" \
-F "file=@audio.wav"
# Batch prediction
curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict/batch" \
-F "files=@audio1.wav" \
-F "files=@audio2.wav" \
-F "files=@audio3.wav"
```
### Python
```python
import requests
BASE_URL = "https://YOUR-SPACE-NAME.hf.space"
# Basic prediction
with open("audio.wav", "rb") as f:
response = requests.post(f"{BASE_URL}/predict", files={"file": f})
result = response.json()
print(f"Emotion: {result['emotion']}")
print(f"Confidence: {result['confidence']:.2%}")
# Batch prediction
files = [
("files", open("audio1.wav", "rb")),
("files", open("audio2.wav", "rb")),
("files", open("audio3.wav", "rb"))
]
response = requests.post(f"{BASE_URL}/predict/batch", files=files)
results = response.json()
print(f"Processed {results['total_files']} files in {results['processing_time_seconds']:.2f}s")
```
### JavaScript
```javascript
// Using Fetch API
const formData = new FormData();
formData.append('file', audioFile);
fetch('https://YOUR-SPACE-NAME.hf.space/predict', {
method: 'POST',
body: formData
})
.then(response => response.json())
.then(data => {
console.log('Emotion:', data.emotion);
console.log('Confidence:', data.confidence);
});
```
## 📚 Documentation
After deployment, visit:
- **Swagger UI**: `/docs` - Interactive API testing
- **ReDoc**: `/redoc` - Beautiful API documentation
## 🔧 Technical Details
- **Model**: HuBERT (Hidden-Unit BERT)
- **Model ID**: abedir/emotion-detector
- **Sample Rate**: 16kHz (automatic resampling)
- **Max Duration**: 3 seconds
- **Supported Formats**: WAV, MP3, FLAC, OGG, M4A, WebM
- **Framework**: FastAPI + PyTorch + Transformers
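Because the service resamples to 16 kHz and caps clips at 3 seconds, it can help to inspect a WAV file client-side before uploading. A sketch using only the Python standard library (the API does its own resampling, so this is purely a sanity check):

```python
import io
import wave

MAX_SECONDS = 3.0
TARGET_RATE = 16000  # the API resamples automatically; checked here for information only

def wav_summary(data: bytes) -> dict:
    """Read sample rate and duration from in-memory WAV bytes with the stdlib."""
    with wave.open(io.BytesIO(data)) as w:
        rate = w.getframerate()
        duration = w.getnframes() / rate
    return {"rate": rate, "duration": duration, "too_long": duration > MAX_SECONDS}

# Synthesize one second of 16-bit mono silence at 16 kHz for demonstration
buf = io.BytesIO()
with wave.open(buf, "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(TARGET_RATE)
    w.writeframes(b"\x00\x00" * TARGET_RATE)
print(wav_summary(buf.getvalue()))  # -> {'rate': 16000, 'duration': 1.0, 'too_long': False}
```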
## 🎯 Use Cases
- ✅ Call center sentiment analysis
- ✅ Mental health monitoring
- ✅ Voice assistant emotion detection
- ✅ Gaming and entertainment
- ✅ Media content analysis
- ✅ Research in affective computing
## 🚨 Error Handling
All errors return a consistent format:
```json
{
"error": "Invalid file format",
"detail": "Supported formats: .wav, .mp3, .flac, .ogg, .m4a, .webm",
"timestamp": "2024-02-06T10:30:00"
}
```
HTTP Status Codes:
- `200` - Success
- `400` - Bad Request (invalid input)
- `422` - Validation Error
- `500` - Internal Server Error
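A client can fold success and error responses into one result type. A minimal sketch based on the error format above (the success fields follow the `/predict` response shown earlier):

```python
def parse_response(status_code: int, body: dict) -> dict:
    """Normalize API responses using the documented error format."""
    if status_code == 200:
        return {"ok": True, "emotion": body["emotion"], "confidence": body["confidence"]}
    # Error bodies carry "error" and "detail" fields
    return {"ok": False, "error": body.get("error", "unknown"), "detail": body.get("detail", "")}

err = parse_response(400, {"error": "Invalid file format", "detail": "Supported formats: .wav, ..."})
print(err["error"])  # -> Invalid file format
```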
## 🔗 Related Links
- **Model**: [abedir/emotion-detector](https://huggingface.co/abedir/emotion-detector)
- **HuBERT Paper**: [arXiv:2106.07447](https://arxiv.org/abs/2106.07447)
- **FastAPI**: [Documentation](https://fastapi.tiangolo.com/)
## 📄 License
Apache 2.0
---
**Built with ❤️ using HuBERT, FastAPI, and Transformers**