---
title: CV MCP Client
emoji: 🐒
colorFrom: indigo
colorTo: red
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
license: mit
short_description: Computer Vision MCP Client for MCP 1st Birthday Hackathon
tags:
- mcp-in-action-track-creative
---

# 🎥 CV MCP Client

A **Computer Vision Model Context Protocol (MCP) Client** that streams webcam images and provides detailed scene analysis in real time, designed for the **MCP 1st Birthday Hackathon**.  

Check out the Hackathon details [here](https://huggingface.co/MCP-1st-Birthday).  

Social media posts about this MCP Hackathon project: [Discord](https://discord.com/channels/879548962464493619/1439001549492719726/1443045145284051084), [Instagram](https://www.instagram.com/p/DRsw2KOADrB/), [Threads](https://www.threads.com/@oppa.ai_the.one.and.only/post/DRsxlNzAdCj?xmt=AQF0fVYU0qfeEUT4nDojv48yYZmjtK6tCrMx3sehnhVyOw).  

Demo video of the CV MCP Server analyzing the video feed from my robot:  
[Demo Video](https://photos.app.goo.gl/guxui1EsdPNoL4mw7)  

GitHub repo for the CV robot Python script:  
https://github.com/OppaAI/CV_Robot_MCP

---

## 🌟 Key Features

- Real-time scene description  
- Human detection  
- Animal and object detection  
- Environment classification (indoor/outdoor)  
- Lighting condition analysis  
- Hazard identification  
- Optimized for efficient context window usage  

---

## 🔧 How It Works

This Gradio Space interacts with an MCP server to analyze webcam images:

1. Capture an image from your webcam.  
2. Convert the image to **Base64 format**.  
3. Send the image along with your **Hugging Face API token** to the MCP server.  
4. Receive detailed scene analysis, including:
   - Scene description  
   - Detected humans  
   - Detected animals and objects  
   - Environment type (indoor/outdoor)  
   - Lighting condition  
   - Hazards  
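
Steps 2 and 3 above can be sketched in Python roughly as follows. This is a minimal illustration, not the Space's actual `app.py`: the function names and the payload field names (`image_b64`, `hf_token`) are assumptions made for the example, and the real MCP server may expect a different request shape.

```python
import base64
import json
import os


def encode_image_to_base64(image_bytes: bytes) -> str:
    """Return the Base64 text encoding of raw image bytes (step 2)."""
    return base64.b64encode(image_bytes).decode("ascii")


def build_request_payload(image_bytes: bytes, hf_token: str) -> str:
    """Bundle the encoded image and the token into a JSON string (step 3).

    The field names "image_b64" and "hf_token" are hypothetical.
    """
    payload = {
        "image_b64": encode_image_to_base64(image_bytes),
        "hf_token": hf_token,
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Placeholder bytes stand in for a captured webcam frame (step 1).
    fake_frame = b"\xff\xd8\xff\xe0 placeholder JPEG bytes"
    # Read the token from the environment rather than hard-coding it.
    token = os.environ.get("HF_TOKEN", "")
    request_body = build_request_payload(fake_frame, token)
    print(f"Request body is {len(request_body)} characters long")
```

Base64 inflates the image by about a third, so downscaling or compressing the frame before encoding is one way to keep the request small.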

---

## ⚡ Demo

Compatible with **PC, mobile, and robots with cameras**.  

Stream images via your webcam or phone camera to receive real-time scene analysis.  

Watch a demo video of the CV MCP Server analyzing the video feed from my robot:  
[Demo Video](https://photos.app.goo.gl/guxui1EsdPNoL4mw7)  

---

## 🔑 Requirements

- A **Hugging Face API token** is required for MCP server access.  
- Set your token as an environment variable:  

```bash
export HF_TOKEN=your_huggingface_token