OppaAI committed on
Commit
0bc6e46
·
verified ·
1 Parent(s): 70d08b0

Update README.md

Files changed (1)
  1. README.md +36 -54
README.md CHANGED
@@ -8,86 +8,68 @@ sdk_version: 6.0.1
8
  app_file: app.py
9
  pinned: false
10
  license: mit
11
- short_description: CV VLM MCP Server for MCP 1st Birthday party Hackathon
12
  tags:
13
  - building-mcp-track-creative
14
  ---
15
 
16
- Check out the Hackathon details at: https://huggingface.co/MCP-1st-Birthday
17
 
18
- Social media post link:
19
- https://discord.com/channels/879548962464493619/
20
 
21
- # 🎥 Robot Vision MCP Server
 
22
 
23
- A Model Context Protocol (MCP) server that provides real-time scene analysis for webcam images. This Space allows users to stream webcam feeds and get detailed information about the environment, objects, humans, and more.
 
24
 
25
  ---
26
 
27
- ## 🌟 Features
28
 
29
- - Real-time scene description
30
- - Human detection
31
- - Animals and objects detection
32
- - Environment classification (indoor/outdoor)
33
- - Lighting condition analysis
34
- - Hazards identification
35
- - Optimized for context window efficiency
36
 
37
  ---
38
 
39
  ## 🔧 How It Works
40
 
41
- The Space uses a MCP server to analyze images captured from your webcam. When an image is streamed:
42
 
43
- 1. The image is sent to the MCP server.
44
- 2. The server processes it using the Model Context Protocol.
45
- 3. Outputs are returned and displayed in the UI, including:
46
- - General description of the scene
47
- - Detected humans
48
- - Animals and objects
49
- - Environment type (indoor/outdoor)
50
- - Lighting condition
51
- - Hazards
 
52
 
53
  ---
54
 
55
  ## ⚡ Demo
56
 
57
- Open the Space and use your webcam to test the real-time scene analysis.
58
-
59
- ---
60
-
61
- ## 🔑 Requirements
62
-
63
- - A valid **Hugging Face API Token** is required.
64
- - Ensure you set your token as an environment variable `HF_TOKEN` if running locally.
65
 
66
- > **Note:** This project is meant to demonstrate MCP-based vision tools. It was created for educational purposes, and the MCP server may have usage limits.
67
 
68
- ---
69
-
70
- ## 📚 References
71
-
72
- Check out the Hugging Face configuration reference for Spaces: [Spaces Config Reference](https://huggingface.co/docs/hub/spaces-config-reference)
73
 
74
  ---
75
 
76
- ## 🚀 Usage
77
-
78
- 1. Click the webcam feed in the CV MCP Client: https://huggingface.co/spaces/MCP-1st-Birthday/CV_MCP_Client
79
- 2. The Space will display real-time outputs in the provided textboxes:
80
- - Description
81
- - Environment
82
- - Indoor/Outdoor
83
- - Lighting Condition
84
- - Human Detected
85
- - Animals Detected
86
- - Objects Detected
87
- - Hazards Identified
88
-
89
- ---
90
 
91
- ## ⚠️ Note
 
92
 
93
- This project was created as a demo for an MCP-based vision server. While fully functional, heavy usage may incur resource limits. Feel free to explore the code to understand how the MCP server processes webcam feeds and returns detailed scene analysis.
 
 
8
  app_file: app.py
9
  pinned: false
10
  license: mit
11
+ short_description: Real-time CV VLM MCP Server for MCP 1st Birthday Hackathon
12
  tags:
13
  - building-mcp-track-creative
14
  ---
15
 
16
+ # 🎥 Robot Vision MCP Server
17
 
18
+ A **Model Context Protocol (MCP) server** that provides **real-time scene analysis** for webcam images.
19
+ This Space allows users to stream live video feeds and get detailed insights about the environment, objects, humans, and more.
20
 
21
+ Check out the Hackathon details [here](https://huggingface.co/MCP-1st-Birthday).
22
+ Join the community discussion on [Discord](https://discord.com/channels/879548962464493619/).
23
 
24
+ 🎬 Watch a demo of the CV MCP Server analyzing a robot’s camera feed:
25
+ [Demo Video](https://photos.app.goo.gl/guxui1EsdPNoL4mw7)
26
 
27
  ---
28
 
29
+ ## 🌟 Key Features
30
 
31
+ - Real-time scene description
32
+ - Human detection
33
+ - Animal and object detection
34
+ - Environment classification (indoor/outdoor)
35
+ - Lighting condition analysis
36
+ - Hazards identification
37
+ - Optimized for **context window efficiency**
38
 
39
  ---
40
 
41
  ## 🔧 How It Works
42
 
43
+ This Space leverages the MCP server to analyze images captured from your webcam:
44
 
45
+ 1. Stream an image from your webcam.
46
+ 2. The image is sent to the **MCP server**.
47
+ 3. The server processes it using the **Model Context Protocol**.
48
+ 4. Outputs are returned and displayed in the UI, including:
49
+ - General scene description
50
+ - Detected humans
51
+ - Detected animals and objects
52
+ - Environment type (indoor/outdoor)
53
+ - Lighting condition
54
+ - Hazards
55
 
56
  ---
57
 
58
  ## ⚡ Demo
59
 
60
+ Compatible with **PC, mobile, and robots with cameras**.
61
 
62
+ Stream images via your webcam or phone camera to receive real-time scene analysis.
63
 
64
+ Watch a demo video of the CV MCP Server analyzing the video feed from my robot:
65
+ [Demo Video](https://photos.app.goo.gl/guxui1EsdPNoL4mw7)
66
 
67
  ---
68
 
69
+ ## 🔑 Requirements
70
 
71
+ - A valid **Hugging Face API Token** is required.
72
+ - If running locally, set your token as an environment variable:
73
 
74
+ ```bash
75
+ export HF_TOKEN=your_huggingface_token