NiWaRe committed on
Commit
1c642fe
·
verified ·
1 Parent(s): d097b9a

Update HUGGINGFACE_DEPLOYMENT.md

Files changed (1)
  1. HUGGINGFACE_DEPLOYMENT.md +10 -113
HUGGINGFACE_DEPLOYMENT.md CHANGED
@@ -8,15 +8,17 @@ The application runs as a FastAPI server on port 7860 (HF Spaces default) with:
8
  - **Main landing page**: `/` - Serves the index.html with setup instructions
9
  - **Health check**: `/health` - Returns server status and W&B configuration
10
  - **MCP endpoint**: `/mcp` - Streamable HTTP transport endpoint for MCP
11
- - Uses Server-Sent Events (SSE) for responses
12
  - Requires `Accept: application/json, text/event-stream` header
13
  - Supports initialize, tools/list, tools/call methods
14
 
 
 
15
  ## Key Changes for HF Spaces
16
 
17
  ### 1. app.py
18
  - Creates a FastAPI application that serves the landing page
19
- - Mounts FastMCP server using `mcp.streamable_http_app()` pattern (following HuggingFace example)
20
  - Uses lifespan context manager for session management
21
  - Configured to run on `0.0.0.0:7860` (HF Spaces requirement)
22
  - Sets W&B cache directories to `/tmp` to avoid permission issues
@@ -27,19 +29,19 @@ The application runs as a FastAPI server on port 7860 (HF Spaces default) with:
27
  - Maintains backward compatibility with CLI usage
28
 
29
  ### 3. Dependencies
30
- - FastAPI and uvicorn moved to main dependencies (not optional)
31
  - All dependencies listed in requirements.txt for HF Spaces
32
 
33
  ### 4. Lazy Loading Fix
34
- - Fixed `TraceService` initialization in `query_weave.py` to use lazy loading
35
- - This allows the server to start even without a W&B API key
36
  - The service is only initialized when first needed
37
 
38
  ## Environment Variables
39
 
40
  No environment variables are required! The server works without any configuration.
41
 
42
- **Note**: Users provide their own W&B API keys as Bearer tokens. No server configuration needed.
43
 
44
  ## Deployment Steps
45
 
@@ -49,7 +51,7 @@ No environment variables are required! The server works without any configuratio
49
 
50
  2. **Configure Secrets**
51
  - Go to Settings → Variables and secrets
52
- - Add `WANDB_API_KEY` as a secret
53
 
54
  3. **Push the Code**
55
  ```bash
@@ -94,55 +96,6 @@ python app.py
94
 
95
  The server will start on http://localhost:7860
96
 
97
- ## MCP Client Configuration
98
-
99
- ### Important Notes
100
-
101
- The MCP server uses the Streamable HTTP transport which:
102
- - Returns responses in Server-Sent Events (SSE) format
103
- - Requires the client to send `Accept: application/json, text/event-stream` header
104
- - Uses session management for stateful operations
105
-
106
- ### Testing with curl
107
-
108
- ```bash
109
- # Initialize the server
110
- curl -X POST https://[your-username]-[space-name].hf.space/mcp \
111
- -H "Content-Type: application/json" \
112
- -H "Accept: application/json, text/event-stream" \
113
- -d '{
114
- "jsonrpc": "2.0",
115
- "method": "initialize",
116
- "params": {
117
- "protocolVersion": "0.1.0",
118
- "capabilities": {},
119
- "clientInfo": {"name": "test-client", "version": "1.0"}
120
- },
121
- "id": 1
122
- }'
123
-
124
- # List available tools
125
- curl -X POST https://[your-username]-[space-name].hf.space/mcp \
126
- -H "Content-Type: application/json" \
127
- -H "Accept: application/json, text/event-stream" \
128
- -d '{"jsonrpc":"2.0","method":"tools/list","params":{},"id":2}'
129
- ```
130
-
131
- ### MCP Client Configuration
132
-
133
- For MCP clients that support streamable HTTP:
134
-
135
- ```json
136
- {
137
- "mcpServers": {
138
- "wandb": {
139
- "url": "https://[your-username]-[space-name].hf.space/mcp",
140
- "transport": "streamable-http"
141
- }
142
- }
143
- }
144
- ```
145
-
146
  ## MCP Architecture & Key Learnings
147
 
148
  ### Understanding MCP and FastMCP
@@ -249,60 +202,4 @@ os.environ["HOME"] = "/tmp"
249
  | Missing Accept header | "Not Acceptable" error | Include `Accept: application/json, text/event-stream` |
250
  | Import-time API key errors | Server fails to start | Use lazy loading pattern |
251
  | Permission errors in HF Spaces | `mkdir /.cache: permission denied` | Set cache dirs to `/tmp` |
252
- | Can't access MCP methods | Methods not exposed | Use FastMCP's built-in decorators and methods |
253
-
254
- ### Testing Strategy
255
-
256
- 1. **Local Testing**: Always test with correct headers
257
- 2. **Check Routes**: Verify mounting creates `/mcp` endpoint
258
- 3. **Test Initialize First**: This method doesn't require session state
259
- 4. **SSE Response Parsing**: Remember responses are SSE formatted, not plain JSON
260
-
261
- ### Evolution of Our Implementation
262
-
263
- Our journey to the correct implementation went through several iterations:
264
-
265
- #### Attempt 1: Direct Protocol Implementation
266
- - **Approach**: Implement MCP protocol directly in FastAPI
267
- - **Issue**: Reinventing the wheel, not using FastMCP's built-in capabilities
268
- - **Learning**: FastMCP already handles the protocol complexity
269
-
270
- #### Attempt 2: Trying to Extract FastMCP's Internal App
271
- - **Approach**: Access FastMCP's internal FastAPI app via attributes
272
- - **Issue**: FastMCP doesn't expose its app in an accessible way
273
- - **Learning**: Need to use FastMCP's intended methods
274
-
275
- #### Attempt 3: Using http_app() Method
276
- - **Approach**: Try various methods like `http_app()`, `asgi_app()`, etc.
277
- - **Issue**: These methods either don't exist or don't work as expected
278
- - **Learning**: Documentation and examples are crucial
279
-
280
- #### Attempt 4: The Correct Pattern
281
- - **Approach**: Use `streamable_http_app()` following HuggingFace example
282
- - **Success**: Works perfectly when mounted at root
283
- - **Key Insight**: The example pattern exists for a reason - follow it!
284
-
285
- ### Key Takeaways
286
-
287
- 1. **Follow Existing Examples**: The HuggingFace example was the key to success
288
- 2. **Understand the Protocol**: MCP uses SSE for good reasons (streaming, stateless option)
289
- 3. **Lazy Loading is Critical**: Avoid initialization-time dependencies
290
- 4. **Environment Matters**: HF Spaces has specific constraints (ports, permissions)
291
- 5. **Test Incrementally**: Start with basic endpoints before complex operations
292
-
293
- ## Differences from Standard Deployment
294
-
295
- | Feature | Standard | HF Spaces |
296
- |---------|----------|-----------|
297
- | Transport | stdio/http | streamable-http only |
298
- | Port | Configurable | Fixed at 7860 |
299
- | Host | Configurable | Fixed at 0.0.0.0 |
300
- | Entry Point | CLI (server.py) | FastAPI (app.py) |
301
- | Static Files | Optional directories | Embedded in app |
302
-
303
- ## Troubleshooting
304
-
305
- 1. **Server not starting**: Check WANDB_API_KEY is set in Space secrets
306
- 2. **MCP connection fails**: Ensure using `/mcp` endpoint with correct transport ("streamable-http")
307
- 3. **Tools not working**: Verify W&B API key has necessary permissions
308
- 4. **Landing page not loading**: Check index.html is included in deployment
 
8
  - **Main landing page**: `/` - Serves the index.html with setup instructions
9
  - **Health check**: `/health` - Returns server status and W&B configuration
10
  - **MCP endpoint**: `/mcp` - Streamable HTTP transport endpoint for MCP
11
+ - The server decides per request whether to return plain JSON or an SSE stream (the client always sends the same request, see below)
12
  - Requires `Accept: application/json, text/event-stream` header
13
  - Supports initialize, tools/list, tools/call methods
14
 
15
+ More details on [Streamable HTTP](https://modelcontextprotocol.io/specification/draft/basic/transports#streamable-http) are available in the official docs and in [this PR](https://github.com/modelcontextprotocol/modelcontextprotocol/pull/206).
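Because the server may answer with either plain JSON or an SSE stream, a client has to branch on the response's `Content-Type`. The following is a minimal sketch of that branching, not the project's actual client code; `parse_mcp_response` is a hypothetical helper name, and it assumes each SSE event carries one JSON-RPC message in its `data:` field.

```python
import json


def parse_mcp_response(content_type: str, body: str) -> list:
    """Return the JSON-RPC messages carried by a plain-JSON or SSE response body."""
    media_type = content_type.split(";")[0].strip().lower()

    if media_type == "application/json":
        # Plain JSON: the whole body is a single message.
        return [json.loads(body)]

    if media_type == "text/event-stream":
        messages = []
        data_lines = []
        # A blank line ends an event; append one so the last event is flushed.
        for line in body.splitlines() + [""]:
            if line == "":
                if data_lines:
                    messages.append(json.loads("\n".join(data_lines)))
                    data_lines = []
            elif line.startswith("data:"):
                value = line[len("data:"):]
                # The SSE format allows one optional space after the colon.
                if value.startswith(" "):
                    value = value[1:]
                data_lines.append(value)
            # Other fields (event:, id:, retry:) and comments are ignored here.
        return messages

    raise ValueError(f"unexpected content type: {content_type!r}")
```

A client would call this with the `Content-Type` header and the decoded body of each `/mcp` response, after sending the request with the `Accept: application/json, text/event-stream` header.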
16
+
17
  ## Key Changes for HF Spaces
18
 
19
  ### 1. app.py
20
  - Creates a FastAPI application that serves the landing page
21
+ - Mounts the FastMCP server using the `mcp.streamable_http_app()` pattern (following [this example from Mistral](https://huggingface.co/spaces/Jofthomas/Multiple_mcp_fastapi_template))
22
  - Uses lifespan context manager for session management
23
  - Configured to run on `0.0.0.0:7860` (HF Spaces requirement)
24
  - Sets W&B cache directories to `/tmp` to avoid permission issues
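The cache-directory point above can be sketched as follows. This is an illustration under assumptions, not a copy of `app.py`: `configure_writable_dirs` and the subdirectory names are hypothetical, while `WANDB_CACHE_DIR`, `WANDB_CONFIG_DIR`, and `WANDB_DIR` are environment variables that W&B reads.

```python
import os


def configure_writable_dirs(base: str = "/tmp") -> dict:
    """Point W&B at directories under `base` and create them if missing."""
    settings = {
        "WANDB_CACHE_DIR": os.path.join(base, "wandb_cache"),
        "WANDB_CONFIG_DIR": os.path.join(base, "wandb_config"),
        "WANDB_DIR": os.path.join(base, "wandb"),
    }
    for name, path in settings.items():
        os.makedirs(path, exist_ok=True)  # safe to call repeatedly
        os.environ[name] = path
    return settings


# Intended usage: call this at the very top of the entry point,
# before `wandb` is imported, so the library picks up the paths.
# configure_writable_dirs()
```

The ordering matters in this sketch: the variables need to be set before any code that resolves W&B's default directories runs.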
 
29
  - Maintains backward compatibility with CLI usage
30
 
31
  ### 3. Dependencies
32
+ - FastAPI and uvicorn are now main (non-optional) dependencies
33
  - All dependencies listed in requirements.txt for HF Spaces
34
 
35
  ### 4. Lazy Loading Fix
36
+ - `TraceService` in `query_weave.py` is now initialized lazily
37
+ - This allows the server to start even without a W&B API key (for example, when it is first added in LeChat before a connection is made)
38
  - The service is only initialized when first needed
39
 
40
  ## Environment Variables
41
 
42
  No environment variables are required! The server works without any configuration.
43
 
44
+ **Note**: Users provide their own W&B API keys as Bearer tokens. No server configuration needed (see AUTH_README.md).
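As a rough illustration of the Bearer-token approach in the note above, the helper below pulls the token out of an `Authorization` header value. It is a hypothetical sketch (`extract_bearer_token` is not taken from the project); AUTH_README.md remains the reference for how authentication is actually handled.

```python
from typing import Optional


def extract_bearer_token(authorization: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' value, else None."""
    if not authorization:
        return None
    scheme, _, token = authorization.strip().partition(" ")
    if scheme.lower() != "bearer":
        return None  # some other scheme, e.g. Basic
    token = token.strip()
    return token or None
```

Returning `None` instead of raising lets the caller decide how to respond to a missing or malformed header, for example with a 401.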
45
 
46
  ## Deployment Steps
47
 
 
51
 
52
  2. **Configure Secrets**
53
  - Go to Settings → Variables and secrets
54
+ - Add `MCP_SERVER_URL` as a variable for the URL to be correctly
55
 
56
  3. **Push the Code**
57
  ```bash
 
96
 
97
  The server will start on http://localhost:7860
98
 
99
  ## MCP Architecture & Key Learnings
100
 
101
  ### Understanding MCP and FastMCP
 
202
  | Missing Accept header | "Not Acceptable" error | Include `Accept: application/json, text/event-stream` |
203
  | Import-time API key errors | Server fails to start | Use lazy loading pattern |
204
  | Permission errors in HF Spaces | `mkdir /.cache: permission denied` | Set cache dirs to `/tmp` |
205
+ | Can't access MCP methods | Methods not exposed | Use FastMCP's built-in decorators and methods |