metadata
title: WeDetect Open-Vocabulary Detection
emoji: ๐
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
pinned: false
license: gpl-3.0
suggested_hardware: t4-medium
models:
- fushh7/WeDetect
tags:
- object-detection
- zero-shot
- open-vocabulary
- computer-vision
- chinese
WeDetect: Open-Vocabulary Object Detection
WeDetect is a fast, open-vocabulary object detection model that can detect arbitrary objects specified via text prompts. This demo provides an interactive interface for testing the model.
✨ Features
- Open-Vocabulary Detection: Detect any object by simply specifying its name
- English & Chinese Support: Enter class names in English (auto-translated) or Chinese directly
- Editable Translation: Review and correct auto-translations before detection
- Multiple Model Sizes: Choose between Tiny (fast), Base (balanced), or Large (best quality)
- Adjustable Threshold: Fine-tune detection sensitivity
How to Use
- Upload an Image: Click the upload area or drag-and-drop an image
- Choose Input Language: Select English or Chinese (中文)
- Enter Class Names: Type the objects you want to detect, separated by commas
  - English example: person, car, dog, cat
  - Chinese example: 人, 车, 狗, 猫
- Review Translation: If using English, check the Chinese preview and edit if needed
- Adjust Threshold: Lower values return more detections, including low-confidence ones; higher values keep only the more confident detections
- Click Detect: Press the "Detect Objects" button
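
The input handling behind the steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the demo's actual code: `parse_class_names` and `filter_by_threshold` are hypothetical helpers, accepting the full-width comma is an assumption, and the `(label, score)` pairs stand in for whatever the model really returns.

```python
def parse_class_names(text: str) -> list:
    """Split a comma-separated prompt into clean, de-duplicated class names."""
    # Assumption: accept the full-width Chinese comma as well as the ASCII one.
    parts = text.replace("\uff0c", ",").split(",")
    names = []
    for part in parts:
        name = part.strip()
        # Skip empty entries (e.g. from ",,") and repeated names.
        if name and name not in names:
            names.append(name)
    return names


def filter_by_threshold(detections, threshold: float) -> list:
    """Keep only (label, score) pairs whose score is at least the threshold."""
    return [(label, score) for label, score in detections if score >= threshold]


names = parse_class_names("person, car,, dog , car")
print(names)  # ['person', 'car', 'dog']

# Made-up scores, purely to show the effect of the threshold.
detections = [("person", 0.82), ("car", 0.41), ("dog", 0.17)]
print(filter_by_threshold(detections, 0.3))  # [('person', 0.82), ('car', 0.41)]
print(filter_by_threshold(detections, 0.5))  # [('person', 0.82)]
```

Lowering the threshold from 0.5 to 0.3 admits the lower-scoring `car` detection, which is the trade-off the threshold slider exposes.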
Model Information
| Model | Parameters | Speed | Quality | GPU Memory |
|---|---|---|---|---|
| WeDetect-Tiny | ~28M | ⚡⚡⚡ Fastest | Good | ~4-6 GB |
| WeDetect-Base | ~89M | ⚡⚡ Fast | Better | ~8-10 GB |
| WeDetect-Large | ~198M | ⚡ Moderate | Best | ~12-16 GB |
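
One way to use the table is to pick the largest model that fits the available GPU memory. The sketch below is an assumption-laden illustration: `pick_model` is a hypothetical helper, and it conservatively treats the upper end of each memory range in the table as the requirement.

```python
# Upper bounds of the GPU memory ranges from the table, in GB,
# ordered from the smallest model to the largest.
MODEL_MEMORY_GB = {
    "WeDetect-Tiny": 6,
    "WeDetect-Base": 10,
    "WeDetect-Large": 16,
}


def pick_model(available_gb: float):
    """Return the largest model whose estimated need fits, or None if none fits."""
    fitting = [name for name, need in MODEL_MEMORY_GB.items() if need <= available_gb]
    # The dict is ordered smallest to largest, so the last fit is the largest.
    return fitting[-1] if fitting else None


print(pick_model(16))  # WeDetect-Large
print(pick_model(8))   # WeDetect-Tiny
print(pick_model(4))   # None
```

Because the table gives ranges rather than exact figures, a model may still fit in less memory than this helper assumes; treat the result as a starting point.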
Common Class Names
| English | Chinese | English | Chinese | English | Chinese |
|---|---|---|---|---|---|
| person | 人 | car | 车 | dog | 狗 |
| cat | 猫 | bird | 鸟 | bicycle | 自行车 |
| chair | 椅子 | table | 桌子 | bed | 床 |
| phone | 手机 | laptop | 笔记本电脑 | book | 书 |
| bottle | 瓶子 | cup | 杯子 | shoe | 鞋 |
Note: WeDetect is trained on Chinese data, so it works best with Chinese class names. The built-in dictionary covers ~200 common objects.
⚠️ Important Notes
- Chinese Model: WeDetect uses Chinese class names internally. English inputs are auto-translated using a dictionary of ~200 common objects.
- Unknown Words: If a word isn't in the dictionary, it will be passed through unchanged. Check the Chinese preview to verify translations.
- GPU Required: This demo requires GPU acceleration. If you encounter memory errors, try using a smaller model.
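
The dictionary lookup with pass-through described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the demo's implementation: `EN_TO_ZH` is a five-entry stand-in for the ~200-entry dictionary, `translate_classes` is a hypothetical name, and the case-insensitive matching is an assumption.

```python
# Tiny stand-in for the demo's dictionary of ~200 common objects.
EN_TO_ZH = {
    "person": "人",
    "car": "车",
    "dog": "狗",
    "cat": "猫",
    "bicycle": "自行车",
}


def translate_classes(names):
    """Translate known English names and pass unknown words through unchanged.

    Returns (translated, unknown) so the caller can flag words that need
    a manual check in the Chinese preview.
    """
    translated, unknown = [], []
    for name in names:
        word = name.strip()
        key = word.lower()  # assumption: lookups ignore case
        if key in EN_TO_ZH:
            translated.append(EN_TO_ZH[key])
        else:
            translated.append(word)  # passed through as typed
            unknown.append(word)
    return translated, unknown


translated, unknown = translate_classes(["Person", "skateboard", "cat"])
print(translated)  # ['人', 'skateboard', '猫']
print(unknown)     # ['skateboard']
```

Returning the unknown words separately makes it easy to highlight them for the user, since an untranslated English word is unlikely to match well in a model that expects Chinese class names.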
Technical Details
This Space uses:
- Gradio 5.50.0+ - Compatible with huggingface_hub 1.x
- huggingface_hub 1.x - Latest HF Hub API
- MMDetection 3.3.0 - Object detection framework
- @spaces.GPU decorator for GPU acceleration
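
As a rough illustration of the last item, a GPU-bound function is typically wrapped with the decorator as below. This is a sketch only: `detect` and its parameters are hypothetical placeholders rather than the demo's real inference function, and the no-op fallback for machines without the `spaces` package is an assumption added so the snippet also runs locally.

```python
try:
    import spaces  # available on Hugging Face Spaces

    gpu = spaces.GPU
except ImportError:
    # Local fallback (assumption for this sketch): a decorator that does nothing.
    def gpu(fn):
        return fn


@gpu
def detect(image, class_names, threshold=0.3):
    """Hypothetical placeholder for the GPU-bound inference call."""
    # A real app would run the WeDetect model here; this stub finds nothing.
    return []


print(detect(None, ["人", "车"]))
```

Keeping only the inference call inside the decorated function, and doing parsing and translation outside it, limits the work done while the GPU is held.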
Citation
If you use WeDetect in your research, please cite:
@article{fu2025wedetect,
title={WeDetect: Fast Open-Vocabulary Object Detection as Retrieval},
author={Fu, Shenghao and Su, Yukun and Rao, Fengyun and LYU, Jing and Xie, Xiaohua and Zheng, Wei-Shi},
journal={arXiv preprint arXiv:2512.12309},
year={2025}
}
License
This demo uses the WeDetect model, which is licensed under GPL-3.0.
Acknowledgments
- WeDetect by WeChatCV
- MMDetection
- Hugging Face Spaces