	---
	license: other
	license_name: funasr-model-license
	license_link: https://huggingface.co/emotion2vec/emotion2vec_plus_large/blob/main/LICENSE
	library_name: mlx
	base_model: emotion2vec/emotion2vec_plus_large
	pipeline_tag: audio-classification
	tags:
	- mlx
	- audio
	- audio-classification
	- speech-emotion-recognition
	- emotion-recognition
	- emotion2vec
	- data2vec
	- apple-silicon
	---

	# mlx-community/emotion2vec-plus-large-mlx

	The emotion2vec+ large speech-emotion-recognition model, converted to MLX format for native
	inference on Apple Silicon. The weights are consumed by the [`xocialize/emotion2vec-mlx-swift`](https://github.com/xocialize/emotion2vec-mlx-swift)
	Swift port. Refer to the [original model card](https://huggingface.co/emotion2vec/emotion2vec_plus_large)
	for details on the model itself.

	## Model

	- Family: emotion2vec / emotion2vec+ (Ma et al., "emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation," [arXiv:2312.15185](https://arxiv.org/abs/2312.15185))
	- Architecture: Data2Vec 2.0 — conv feature extractor → transformer encoder → 9-class linear head
	- Output: 9-class categorical emotion (`angry`, `disgusted`, `fearful`, `happy`, `neutral`, `other`, `sad`, `surprised`, `unknown`)
	- Sample rate: 16000 Hz, mono
	- Precision: fp16 (233 tensors)
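
	As a rough illustration of what the 9-class head's output means, the sketch below turns a vector of nine logits into a label and a probability using softmax and argmax. It is not taken from the Swift port: `emotionLabels`, `softmax`, `topEmotion`, and the example logits are hypothetical names and values, and the assumption that the head's output indices follow the label order listed above is not confirmed by this card.

	```swift
	import Foundation

	// Label order copied from the list above. Whether the head's output indices
	// follow this order is an assumption, not something this card confirms.
	let emotionLabels = ["angry", "disgusted", "fearful", "happy", "neutral",
	                     "other", "sad", "surprised", "unknown"]

	/// Numerically stable softmax: subtract the maximum before exponentiating.
	func softmax(_ logits: [Float]) -> [Float] {
	    guard let maxLogit = logits.max() else { return [] }
	    let exps = logits.map { exp($0 - maxLogit) }
	    let total = exps.reduce(0, +)
	    return exps.map { $0 / total }
	}

	/// Returns the most likely label and its probability,
	/// or nil if the logit count does not match the label count.
	func topEmotion(from logits: [Float]) -> (label: String, confidence: Float)? {
	    guard logits.count == emotionLabels.count else { return nil }
	    let probabilities = softmax(logits)
	    guard let best = probabilities.indices.max(by: { probabilities[$0] < probabilities[$1] }) else {
	        return nil
	    }
	    return (emotionLabels[best], probabilities[best])
	}

	// Hypothetical logits, made up for illustration only.
	let exampleLogits: [Float] = [0.1, -1.2, 0.3, 2.5, 1.0, -0.5, 0.0, 0.7, -2.0]
	if let top = topEmotion(from: exampleLogits) {
	    print(top.label, top.confidence)
	}
	```

	With these made-up logits the largest value sits at index 3, so the printed label is `happy`. In practice the Swift port returns the label and confidence directly, so a helper like this is only useful for reasoning about raw head outputs.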

	## Files

	- `emotion2vec_large.safetensors` — the MLX weights (fp16).
	- `emotion2vec_large_config.json` — model config consumed by the loader.
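
	Before handing a directory to a loader, it can help to confirm that both files listed above are present. The following is a minimal sketch using only Foundation; `requiredFiles`, `missingFiles(in:required:)`, and the directory path are hypothetical and are not part of the Swift port.

	```swift
	import Foundation

	// File names exactly as listed in this model card.
	let requiredFiles = ["emotion2vec_large.safetensors", "emotion2vec_large_config.json"]

	/// Returns the names of the required files that are absent from `directory`.
	func missingFiles(in directory: URL, required: [String] = requiredFiles) -> [String] {
	    required.filter { name in
	        !FileManager.default.fileExists(atPath: directory.appendingPathComponent(name).path)
	    }
	}

	// Hypothetical local path; replace with wherever the snapshot was downloaded.
	let modelDirectory = URL(fileURLWithPath: "/path/to/emotion2vec-plus-large-mlx")
	let missing = missingFiles(in: modelDirectory)
	if missing.isEmpty {
	    print("All model files present")
	} else {
	    print("Missing:", missing.joined(separator: ", "))
	}
	```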

	## Usage (Swift / MLX)

	```swift
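	import Foundation
	import Emotion2VecMLX
	import Hub

	// Fetch a local snapshot of the weights and config from the Hugging Face Hub.
	let dir = try await HubApi().snapshot(from: "mlx-community/emotion2vec-plus-large-mlx")

	// Build a recogniser configured for the categorical (9-class) output.
	let recogniser = try await EmotionRecogniser(
	    weightsDirectory: dir,
	    config: EmotionRecogniserConfig(models: .categorical))

	// Placeholder path: replace with your own recording (the model expects 16 kHz mono).
	let speechURL = URL(fileURLWithPath: "speech.wav")
	let result = try await recogniser.classify(audioURL: speechURL)
	print(result.categorical.label, result.categorical.confidence)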
	```

	## Source

	- Original model: https://huggingface.co/emotion2vec/emotion2vec_plus_large
	- Swift consumer: https://github.com/xocialize/emotion2vec-mlx-swift

	## License

	Released under FunASR's custom MODEL_LICENSE, which permits use, copying, modification, and
	redistribution provided that attribution and the model name are retained. It also includes a
	no-denigration clause and disclaims warranty. The license has no SPDX identifier but is
	permissive. See the [original license](https://huggingface.co/emotion2vec/emotion2vec_plus_large/blob/main/LICENSE).