mrdbourke / WeDetect-demo / README.md

metadata

title: WeDetect Open-Vocabulary Detection
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.50.0
app_file: app.py
pinned: false
license: gpl-3.0
suggested_hardware: t4-medium
models:
  - fushh7/WeDetect
tags:
  - object-detection
  - zero-shot
  - open-vocabulary
  - computer-vision
  - chinese

🔍 WeDetect: Open-Vocabulary Object Detection

WeDetect is a fast, open-vocabulary object detection model that can detect arbitrary objects specified via text prompts. This demo provides an interactive interface for testing the model.

✨ Features

Open-Vocabulary Detection: Detect any object by simply specifying its name
English & Chinese Support: Enter class names in English (auto-translated) or Chinese directly
Editable Translation: Review and correct auto-translations before detection
Multiple Model Sizes: Choose between Tiny (fast), Base (balanced), or Large (best quality)
Adjustable Threshold: Fine-tune detection sensitivity

🚀 How to Use

Upload an Image: Click the upload area or drag-and-drop an image
Choose Input Language: Select English or Chinese (中文)
Enter Class Names: Type the objects you want to detect, separated by commas
- English example: person, car, dog, cat
- Chinese example: 人, 车, 狗, 猫
Review Translation: If using English, check the Chinese preview and edit if needed
Adjust Threshold: Lower values return more detections, including low-confidence ones; higher values keep only the more confident detections
Click Detect: Press the "🔍 Detect Objects" button
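The class-name and threshold steps above can be sketched in a few lines of Python. This is an illustration only, not the demo's actual code: the `Detection` type and the function names `parse_class_names` and `filter_by_threshold` are hypothetical, and accepting the full-width comma (,) as a separator is an assumption.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """Hypothetical container for one detection result."""
    label: str
    score: float                             # confidence in [0, 1]
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2) in pixels


def parse_class_names(text: str) -> List[str]:
    """Split a comma-separated prompt, trimming whitespace and dropping blanks."""
    # Assumption: treat the full-width comma used in Chinese text like an ASCII comma.
    parts = text.replace(",", ",").split(",")
    return [part.strip() for part in parts if part.strip()]


def filter_by_threshold(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose score is at least the threshold."""
    return [det for det in detections if det.score >= threshold]


if __name__ == "__main__":
    print(parse_class_names("person, car, dog, cat"))
    results = [
        Detection("person", 0.82, (10, 20, 110, 220)),
        Detection("dog", 0.35, (150, 90, 260, 200)),
        Detection("cat", 0.12, (300, 40, 360, 100)),
    ]
    # A lower threshold keeps more boxes; a higher one keeps only confident ones.
    print(len(filter_by_threshold(results, 0.1)))
    print(len(filter_by_threshold(results, 0.5)))
```

With the sample scores above, a threshold of 0.1 keeps all three boxes, while 0.5 keeps only the person.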

📊 Model Information

Model	Parameters	Speed	Quality	GPU Memory
WeDetect-Tiny	~28M	⚡⚡⚡ Fastest	Good	~4-6 GB
WeDetect-Base	~89M	⚡⚡ Fast	Better	~8-10 GB
WeDetect-Large	~198M	⚡ Moderate	Best	~12-16 GB
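One way to use the table is to pick the largest model whose memory estimate fits the available GPU. The sketch below is not part of the demo: `pick_model` is a hypothetical helper, and it assumes the upper bound of each approximate memory range in the table as the requirement.

```python
from typing import Optional

# Upper bounds of the approximate GPU memory ranges from the table above (in GB).
MODEL_MEMORY_GB = {
    "WeDetect-Tiny": 6,
    "WeDetect-Base": 10,
    "WeDetect-Large": 16,
}


def pick_model(available_gb: float) -> Optional[str]:
    """Return the largest model that fits in available_gb, or None if none fits."""
    fitting = {
        name: need for name, need in MODEL_MEMORY_GB.items() if need <= available_gb
    }
    if not fitting:
        return None
    # The model with the highest memory requirement is also the largest one.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for gb in (4, 8, 12, 16):
        print(gb, "GB ->", pick_model(gb))
```

Because the table's figures are estimates, a real application would probably still fall back to a smaller model on an out-of-memory error.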

📚 Common Class Names

English	Chinese	English	Chinese	English	Chinese
person	人	car	车	dog	狗
cat	猫	bird	鸟	bicycle	自行车
chair	椅子	table	桌子	bed	床
phone	手机	laptop	笔记本电脑	book	书
bottle	瓶子	cup	杯子	shoe	鞋

Note: WeDetect is trained on Chinese data, so it works best with Chinese class names. The demo's built-in English-to-Chinese dictionary covers ~200 common objects.

⚠️ Important Notes

Chinese Model: WeDetect uses Chinese class names internally. English inputs are auto-translated using a dictionary of ~200 common objects.
Unknown Words: A word that isn't in the dictionary is passed through unchanged, so it stays in English. Check the Chinese preview and correct any such words before running detection.
GPU Required: This demo requires GPU acceleration. If you encounter memory errors, try using a smaller model.
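The dictionary translation and pass-through behavior described above can be sketched as follows. This is a minimal illustration, not the demo's implementation: `EN_TO_ZH` holds only the entries from the Common Class Names table rather than the full ~200-word dictionary, `translate_class_names` is a hypothetical name, and the case-insensitive lookup is an assumption.

```python
from typing import Dict, List, Tuple

# Subset of the dictionary, taken from the Common Class Names table.
EN_TO_ZH: Dict[str, str] = {
    "person": "人", "car": "车", "dog": "狗",
    "cat": "猫", "bird": "鸟", "bicycle": "自行车",
    "chair": "椅子", "table": "桌子", "bed": "床",
    "phone": "手机", "laptop": "笔记本电脑", "book": "书",
    "bottle": "瓶子", "cup": "杯子", "shoe": "鞋",
}


def translate_class_names(names: List[str]) -> Tuple[List[str], List[str]]:
    """Translate English names to Chinese; unknown words pass through unchanged.

    Returns (translated, unknown) so a UI could flag words that need manual review.
    """
    translated: List[str] = []
    unknown: List[str] = []
    for name in names:
        cleaned = name.strip()
        # Assumption: lookups ignore letter case.
        zh = EN_TO_ZH.get(cleaned.lower())
        if zh is None:
            translated.append(cleaned)  # pass through as-is
            unknown.append(cleaned)
        else:
            translated.append(zh)
    return translated, unknown


if __name__ == "__main__":
    preview, needs_review = translate_class_names(["person", "Car", "skateboard"])
    print(preview)
    print(needs_review)
```

Returning the unknown words separately makes it easy to highlight them in the Chinese preview so they can be corrected by hand.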

🔧 Technical Details

This Space uses:

Gradio 5.50.0+ - Compatible with huggingface_hub 1.x
huggingface_hub 1.x - Latest HF Hub API
MMDetection 3.3.0 - Object detection framework
@spaces.GPU decorator for GPU acceleration

📖 Citation

If you use WeDetect in your research, please cite:

@article{fu2025wedetect,
  title={WeDetect: Fast Open-Vocabulary Object Detection as Retrieval},
  author={Fu, Shenghao and Su, Yukun and Rao, Fengyun and LYU, Jing and Xie, Xiaohua and Zheng, Wei-Shi},
  journal={arXiv preprint arXiv:2512.12309},
  year={2025}
}

📄 License

This demo uses the WeDetect model, which is licensed under GPL-3.0.