YuNet Face Detection (GGUF)
GGUF conversion of YuNet for use with CrispEmbed.
YuNet is a lightweight face detector based on ShuffleNetV2, originally shipped with OpenCV. This GGUF file was converted from the face_detection_yunet_2023mar.onnx checkpoint using CrispEmbed's convert-face-to-gguf.py converter.
Model Details
| Property | Value |
|---|---|
| Architecture | ShuffleNetV2 backbone + FPN + multi-scale detection heads |
| Input | 640x640 BGR, raw uint8 range [0, 255] |
| Strides | 8, 16, 32 |
| Outputs | cls (confidence), obj (IoU), bbox (4), kps (5 landmarks x 2) per stride |
| Parameters | ~75K |
| GGUF size | 222 KB |
| ONNX source | face_detection_yunet_2023mar.onnx (228 KB) |
| License | Apache 2.0 |
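As a quick sanity check on the table above, the sketch below counts how many candidate locations the detector evaluates for one 640x640 input. It assumes one anchor per feature-map cell (as stated in the comparison table further down) and an input size divisible by every stride; `prior_counts` is an illustrative helper, not part of CrispEmbed.

```python
# Count the prior locations evaluated for one input image.
# Assumptions: one anchor per feature-map cell, input size divisible by each stride.
STRIDES = (8, 16, 32)
VALUES_PER_PRIOR = 1 + 1 + 4 + 10  # cls + obj + bbox (4) + 5 landmarks x 2


def prior_counts(input_size=640, strides=STRIDES):
    """Return the number of feature-map cells (priors) per stride."""
    return {s: (input_size // s) ** 2 for s in strides}


counts = prior_counts()
total = sum(counts.values())
print(counts)                    # {8: 6400, 16: 1600, 32: 400}
print(total)                     # 8400
print(total * VALUES_PER_PRIOR)  # 134400
```

Most priors come from the stride-8 level, which is the level responsible for the smallest faces.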
Usage with CrispEmbed
CLI
# Auto-download and detect
crispembed -m yunet --detect photo.jpg
# JSON output
crispembed -m yunet --detect photo.jpg --json
# Lower confidence threshold
crispembed -m yunet --detect photo.jpg --conf 0.3
Output format
Each detection contains:
- x, y, w, h: bounding box (top-left corner + size) in original image coordinates
- conf: detection confidence (0..1)
- landmarks[10]: 5 facial landmarks as (x, y) pairs:
  - [0,1] right eye
  - [2,3] left eye
  - [4,5] nose tip
  - [6,7] right mouth corner
  - [8,9] left mouth corner
Note: landmark order follows OpenCV's convention (right_eye, left_eye, nose, right_mouth, left_mouth), which differs from InsightFace/SCRFD (left_eye, right_eye, nose, left_mouth, right_mouth).
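If a downstream component (for example an alignment step written for SCRFD output) expects the InsightFace order, the landmarks can be permuted. The following is a minimal sketch that assumes the landmarks arrive as a flat sequence of 10 numbers, as described above; `to_insightface_order` is a hypothetical helper name, not a CrispEmbed function.

```python
# Source order (OpenCV/YuNet):  right_eye, left_eye, nose, right_mouth, left_mouth
# Target order (InsightFace):   left_eye, right_eye, nose, left_mouth, right_mouth
OPENCV_TO_INSIGHTFACE = (1, 0, 2, 4, 3)


def to_insightface_order(landmarks):
    """Reorder a flat sequence of 10 landmark coordinates (5 x/y pairs)."""
    if len(landmarks) != 10:
        raise ValueError("expected 10 values (5 landmarks as x, y pairs)")
    points = [(landmarks[2 * i], landmarks[2 * i + 1]) for i in range(5)]
    reordered = [points[i] for i in OPENCV_TO_INSIGHTFACE]
    return [coord for point in reordered for coord in point]


yunet = [10, 11, 20, 21, 30, 31, 40, 41, 50, 51]
print(to_insightface_order(yunet))
# [20, 21, 10, 11, 30, 31, 50, 51, 40, 41]
```

The permutation is its own inverse, so the same function converts InsightFace-ordered landmarks back to the OpenCV order.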
C API
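#include <stdio.h>
#include "crispembed.h"

int main(void) {
    crispembed_ctx * ctx = crispembed_init("yunet.gguf", 4);
    if (!ctx) return 1;  /* defensive check; assumes init returns NULL on failure */

    /* Room for up to 32 detections; 0.5f is the confidence threshold, 640 the detection size. */
    crispembed_face faces[32];
    int n = crispembed_detect(ctx, "photo.jpg", faces, 32, 0.5f, 640);
    for (int i = 0; i < n; i++) {
        printf("face %d: (%.0f,%.0f,%.0f,%.0f) conf=%.2f\n",
               i, faces[i].x, faces[i].y, faces[i].w, faces[i].h, faces[i].conf);
    }
    crispembed_free(ctx);
    return 0;
}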
Python
from crispembed import CrispFace
det = CrispFace("yunet.gguf")
faces = det.detect("photo.jpg", conf=0.5, det_size=640)
for f in faces:
    print(f"bbox=({f['x']:.0f},{f['y']:.0f},{f['w']:.0f},{f['h']:.0f}) conf={f['confidence']:.2f}")
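Because boxes are reported as floating-point top-left corner plus size, a common next step is turning one into integer pixel bounds for cropping. The sketch below assumes a detection is a dict with the `x`, `y`, `w`, `h` keys used in the loop above; `crop_bounds` and the image dimensions are illustrative, not part of the CrispEmbed API.

```python
import math


def crop_bounds(face, img_w, img_h):
    """Convert an (x, y, w, h) detection to integer (left, top, right, bottom),
    clamped to the image so the result can be used directly for slicing."""
    left = max(0, math.floor(face["x"]))
    top = max(0, math.floor(face["y"]))
    right = min(img_w, math.ceil(face["x"] + face["w"]))
    bottom = min(img_h, math.ceil(face["y"] + face["h"]))
    return left, top, right, bottom


# A box that spills over the left and right edges of a 40x100 image.
face = {"x": -3.2, "y": 10.5, "w": 50.0, "h": 60.0}
print(crop_bounds(face, img_w=40, img_h=100))  # (0, 10, 40, 71)
```

Rounding outward (floor for the top-left, ceil for the bottom-right) keeps the whole detected region inside the crop.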
YuNet vs SCRFD
| | YuNet | SCRFD-10G |
|---|---|---|
| Size | 222 KB | ~16 MB |
| Speed (CPU) | ~5ms | ~50ms |
| Accuracy (WiderFace easy) | 88.3% | 95.2% |
| Anchors per cell | 1 | 2 |
| Bbox decode | center+scale (exp) | distance-based |
| Input normalization | None (raw 0-255) | (v-127.5)/128 |
YuNet is best for latency-critical or resource-constrained scenarios. SCRFD is better when detection accuracy matters more than speed or model size.
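The "Bbox decode" and "Input normalization" rows can be made concrete with a small sketch. It assumes the raw regression outputs are expressed in units of the stride and that `(col, row)` is the feature-map cell; these conventions and the function names are assumptions for illustration, not a description of CrispEmbed's internals.

```python
import math


def decode_center_scale(col, row, stride, dx, dy, dw, dh):
    """Center+scale (exp) decode: offsets shift the cell origin, exp() scales the size."""
    cx = (col + dx) * stride
    cy = (row + dy) * stride
    w = math.exp(dw) * stride
    h = math.exp(dh) * stride
    return cx - w / 2, cy - h / 2, w, h  # (x, y, w, h): top-left corner + size


def decode_distance(col, row, stride, left, top, right, bottom):
    """Distance-based decode: four distances from the anchor point to the box edges."""
    ax, ay = col * stride, row * stride
    x1, y1 = ax - left * stride, ay - top * stride
    x2, y2 = ax + right * stride, ay + bottom * stride
    return x1, y1, x2 - x1, y2 - y1


def normalize_scrfd(v):
    """SCRFD-style pixel normalization from the table; YuNet takes raw 0-255 values."""
    return (v - 127.5) / 128.0


print(decode_center_scale(10, 5, 8, 0.5, 0.5, 0.0, 0.0))  # (80.0, 40.0, 8.0, 8.0)
print(decode_distance(10, 5, 8, 1.0, 1.0, 1.0, 1.0))      # (72.0, 32.0, 16.0, 16.0)
```

The practical consequence is that pre- and post-processing code cannot be shared between the two detectors without accounting for both differences.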
Conversion
python models/convert-face-to-gguf.py \
--onnx face_detection_yunet_2023mar.onnx \
--output yunet.gguf \
--model-type detection \
--model-name yunet
Parity
Tested against OpenCV's cv2.FaceDetectorYN on the same ONNX model:
- Bounding box IoU: >0.99
- Score difference: <0.01
- Landmark difference: <2px
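The metrics above can be reproduced with a few lines of Python. This is a sketch of how such a comparison might be computed, assuming boxes in (x, y, w, h) form and landmarks as flat 10-value sequences; it also assumes the landmark difference is measured as Euclidean distance per point, which the figures above do not specify.

```python
import math


def iou_xywh(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def max_landmark_error(lm_a, lm_b):
    """Largest per-point Euclidean distance between two flat landmark lists."""
    return max(
        math.hypot(lm_a[i] - lm_b[i], lm_a[i + 1] - lm_b[i + 1])
        for i in range(0, len(lm_a), 2)
    )


box = (100, 100, 50, 50)
print(iou_xywh(box, box))  # 1.0
```

Note that a one-pixel horizontal shift of a 50x50 box already drops the IoU to about 0.96, so an IoU above 0.99 implies sub-pixel agreement for faces of that size.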
Source
- ONNX model: opencv/opencv_zoo
- Paper: YuNet: A Tiny Millisecond-level Face Detector (Machine Intelligence Research, 2023)
- Original implementation: ShiqiYu/libfacedetection