	---
	title: Emotion Detector API
	emoji: 🎧
	colorFrom: blue
	colorTo: purple
	sdk: docker
	app_port: 7860
	---

	# 🎧 Emotion Detector API

	A REST API for speech emotion recognition, backed by the fine-tuned HuBERT model `abedir/emotion-detector`.

	## 🚀 Quick Start

	### Health Check
	```bash
	curl https://YOUR-SPACE-NAME.hf.space/health
	```

	### Predict Emotion
	```bash
	curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict" \
	-F "file=@audio.wav"
	```

	### Python Example
	```python
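	import requests

	# Replace YOUR-SPACE-NAME with the name of your deployed Space
	url = "https://YOUR-SPACE-NAME.hf.space/predict"

	# Open the file in a context manager so the handle is always closed
	with open("audio.wav", "rb") as f:
	    response = requests.post(url, files={"file": f})
	response.raise_for_status()  # fail early on 4xx/5xx responses
	result = response.json()

	print(f"Emotion: {result['emotion']}")
	print(f"Confidence: {result['confidence']:.2%}")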
	```

	## 🎯 Supported Emotions

	1. Angry/Fearful - Expressions of anger or fear
	2. Happy/Laugh - Joyful or laughing expressions
	3. Neutral/Calm - Neutral or calm speech
	4. Sad/Cry - Expressions of sadness or crying
	5. Surprised/Amazed - Surprised or amazed reactions

	## 📡 API Endpoints

	### Core Endpoints
	- `GET /` - API welcome and version info
	- `GET /health` - Health check with system status
	- `GET /docs` - Interactive API documentation (Swagger UI)
	- `GET /redoc` - Alternative API documentation
	- `GET /model/info` - Model configuration details
	- `GET /emotions` - List of supported emotions
	- `GET /stats` - API and system statistics
	- `GET /version` - API version information

	### Prediction Endpoints
	- `POST /predict` - Basic emotion prediction
	- `POST /predict/detailed` - Prediction with audio metadata
	- `POST /predict/base64` - Predict from base64 encoded audio
	- `POST /predict/batch` - Batch processing (max 50 files)
	- `POST /predict/top-k` - Get top K predictions
	- `POST /predict/threshold` - Confidence-based prediction
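
The request schema for `/predict/base64` is not spelled out in this README, so the sketch below covers only the client-side encoding step. The JSON key `audio_base64` is a placeholder chosen for illustration, not a confirmed field name; check `/docs` on the deployed Space for the real schema before using it.

```python
import base64
import json


def build_base64_payload(audio_bytes, field_name="audio_base64"):
    """Encode raw audio bytes as base64 and wrap them in a JSON body.

    NOTE: `field_name` is a hypothetical key (assumption). The actual key
    is whatever the API schema at /docs specifies.
    """
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return json.dumps({field_name: encoded})


if __name__ == "__main__":
    # Stand-in bytes; in practice use open("audio.wav", "rb").read()
    print(build_base64_payload(b"abc"))
```

The resulting string can be sent as the body of a POST request with a `Content-Type: application/json` header, assuming the endpoint accepts JSON.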

	### Analysis Endpoints
	- `POST /analyze/audio` - Get audio metadata without prediction

	## 📦 Response Format

	```json
	{
	  "emotion": "Happy/Laugh",
	  "confidence": 0.8745,
	  "probabilities": {
	    "Angry/Fearful": 0.0234,
	    "Happy/Laugh": 0.8745,
	    "Neutral/Calm": 0.0521,
	    "Sad/Cry": 0.0178,
	    "Surprised/Amazed": 0.0322
	  }
	}
	```
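
As a rough illustration of working with this response on the client, the sketch below ranks the `probabilities` map and applies a confidence cutoff locally. The helper names `top_k` and `accept` and the default threshold of 0.6 are assumptions made for this example; the response shapes of `/predict/top-k` and `/predict/threshold` are not documented here and may differ.

```python
import json

# Sample body copied from the Response Format example.
SAMPLE_RESPONSE = """
{
  "emotion": "Happy/Laugh",
  "confidence": 0.8745,
  "probabilities": {
    "Angry/Fearful": 0.0234,
    "Happy/Laugh": 0.8745,
    "Neutral/Calm": 0.0521,
    "Sad/Cry": 0.0178,
    "Surprised/Amazed": 0.0322
  }
}
"""


def top_k(result, k=3):
    """Return the k most probable (label, probability) pairs, highest first."""
    ranked = sorted(
        result["probabilities"].items(),
        key=lambda item: item[1],
        reverse=True,
    )
    return ranked[:k]


def accept(result, threshold=0.6):
    """Return the predicted label, or None if confidence is below threshold.

    The 0.6 default is an arbitrary illustrative value (assumption).
    """
    return result["emotion"] if result["confidence"] >= threshold else None


if __name__ == "__main__":
    result = json.loads(SAMPLE_RESPONSE)
    for label, prob in top_k(result, k=2):
        print(f"{label}: {prob:.2%}")
    print(accept(result, threshold=0.9))
```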

	## 🛠️ Integration Examples

	### cURL
	```bash
	# Basic prediction
	curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict" \
	-F "file=@audio.wav"

	# Detailed prediction
	curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict/detailed" \
	-F "file=@audio.wav"

	# Top 3 predictions
	curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict/top-k?k=3" \
	-F "file=@audio.wav"

	# Batch prediction
	curl -X POST "https://YOUR-SPACE-NAME.hf.space/predict/batch" \
	-F "files=@audio1.wav" \
	-F "files=@audio2.wav" \
	-F "files=@audio3.wav"
	```

	### Python
	```python
	import requests
	from contextlib import ExitStack

	BASE_URL = "https://YOUR-SPACE-NAME.hf.space"

	# Basic prediction
	with open("audio.wav", "rb") as f:
	    response = requests.post(f"{BASE_URL}/predict", files={"file": f})
	response.raise_for_status()
	result = response.json()
	print(f"Emotion: {result['emotion']}")
	print(f"Confidence: {result['confidence']:.2%}")

	# Batch prediction: repeat the "files" field once per upload (max 50 files)
	paths = ["audio1.wav", "audio2.wav", "audio3.wav"]
	with ExitStack() as stack:  # closes every file handle when the block exits
	    files = [("files", stack.enter_context(open(p, "rb"))) for p in paths]
	    response = requests.post(f"{BASE_URL}/predict/batch", files=files)
	response.raise_for_status()
	results = response.json()
	print(f"Processed {results['total_files']} files in {results['processing_time_seconds']:.2f}s")
	```
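
Because `/predict/batch` is limited to 50 files per request, a larger collection has to be split across several requests. A minimal sketch of that splitting step follows; the helper name `chunked` is hypothetical, and each yielded chunk would be uploaded with its own batch request.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 50  # limit stated for POST /predict/batch


def chunked(paths: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive slices of `paths`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paths), size):
        yield list(paths[start:start + size])


if __name__ == "__main__":
    all_paths = [f"clip_{i}.wav" for i in range(120)]
    for batch in chunked(all_paths):
        # Each batch would go into one POST /predict/batch request.
        print(len(batch))
```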

	### JavaScript
	```javascript
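	// Using Fetch API
	// audioFile is a File or Blob, e.g. input.files[0] from <input type="file">
	const formData = new FormData();
	formData.append('file', audioFile);

	fetch('https://YOUR-SPACE-NAME.hf.space/predict', {
	  method: 'POST',
	  body: formData
	})
	  .then(response => {
	    if (!response.ok) throw new Error(`HTTP ${response.status}`);
	    return response.json();
	  })
	  .then(data => {
	    console.log('Emotion:', data.emotion);
	    console.log('Confidence:', data.confidence);
	  })
	  .catch(err => console.error('Prediction failed:', err));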
	```

	## 📚 Documentation

	After deployment, visit:
	- Swagger UI: `/docs` - Interactive API testing
	- ReDoc: `/redoc` - Read-only API reference

	## 🔧 Technical Details

	- Model: HuBERT (Hidden-Unit BERT)
	- Model ID: abedir/emotion-detector
	- Sample Rate: 16 kHz (automatic resampling)
	- Max Duration: 3 seconds
	- Supported Formats: WAV, MP3, FLAC, OGG, M4A, WebM
	- Framework: FastAPI + PyTorch + Transformers
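
This README does not say how the server treats clips longer than the 3-second maximum (truncation, rejection, or something else). As an optional precaution, a client could trim audio before uploading. The sketch below does this for uncompressed PCM WAV data using only the standard library; `trim_wav` is a hypothetical helper, and the other supported formats would need a real audio decoder.

```python
import io
import wave

MAX_SECONDS = 3.0  # "Max Duration" from the Technical Details list


def trim_wav(data: bytes, max_seconds: float = MAX_SECONDS) -> bytes:
    """Return PCM WAV bytes cut to at most `max_seconds` of audio.

    Assumption: `data` is an uncompressed PCM WAV file. Channel count,
    sample width and sample rate are copied unchanged.
    """
    with wave.open(io.BytesIO(data), "rb") as src:
        params = src.getparams()
        max_frames = int(src.getframerate() * max_seconds)
        frames = src.readframes(min(src.getnframes(), max_frames))

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setparams(params)    # frame count in the header is corrected on close
        dst.writeframes(frames)
    return out.getvalue()
```

Typical use would be `trimmed = trim_wav(open("audio.wav", "rb").read())`, followed by uploading `trimmed` in place of the original file.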

	## 🎯 Use Cases

	- ✅ Call center sentiment analysis
	- ✅ Mental health monitoring
	- ✅ Voice assistant emotion detection
	- ✅ Gaming and entertainment
	- ✅ Media content analysis
	- ✅ Research in affective computing

	## 🚨 Error Handling

	All errors return a consistent format:

	```json
	{
	  "error": "Invalid file format",
	  "detail": "Supported formats: .wav, .mp3, .flac, .ogg, .m4a, .webm",
	  "timestamp": "2024-02-06T10:30:00"
	}
	```

	HTTP Status Codes:
	- `200` - Success
	- `400` - Bad Request (invalid input)
	- `422` - Validation Error
	- `500` - Internal Server Error
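
One way a client might turn this error format into exceptions is sketched below. `EmotionAPIError` and `raise_for_api_error` are hypothetical names introduced for this example, not part of the API. The fields are read defensively because FastAPI's default `422` body carries only a `detail` key unless the app overrides it, and this README does not confirm which applies.

```python
class EmotionAPIError(Exception):
    """Raised for any non-200 response from the API (illustrative helper)."""

    def __init__(self, status_code, error, detail="", timestamp=""):
        super().__init__(f"{status_code} {error}: {detail}")
        self.status_code = status_code
        self.error = error
        self.detail = detail
        self.timestamp = timestamp


def raise_for_api_error(status_code, body):
    """Return `body` on HTTP 200, otherwise raise EmotionAPIError.

    `body` is the already-decoded JSON object of the response.
    """
    if status_code == 200:
        return body
    raise EmotionAPIError(
        status_code,
        body.get("error", "Unknown error"),
        body.get("detail", ""),
        body.get("timestamp", ""),
    )


if __name__ == "__main__":
    sample = {
        "error": "Invalid file format",
        "detail": "Supported formats: .wav, .mp3, .flac, .ogg, .m4a, .webm",
        "timestamp": "2024-02-06T10:30:00",
    }
    try:
        raise_for_api_error(400, sample)
    except EmotionAPIError as exc:
        print(exc.status_code, exc.error)
```

With `requests`, this would be called as `raise_for_api_error(response.status_code, response.json())`.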

	## 🔗 Related Links

	- Model: [abedir/emotion-detector](https://huggingface.co/abedir/emotion-detector)
	- HuBERT Paper: [arXiv:2106.07447](https://arxiv.org/abs/2106.07447)
	- FastAPI: [Documentation](https://fastapi.tiangolo.com/)

	## 📄 License

	Apache 2.0

	---

	Built with ❤️ using HuBERT, FastAPI, and Transformers