# HuggingFace Spaces Deployment Guide
This guide explains how to deploy the Multi-Agent Research Paper Analysis System to HuggingFace Spaces.
## Prerequisites

- HuggingFace Account: Create an account at huggingface.co
- Azure OpenAI Resource: You need an active Azure OpenAI resource with:
  - A deployed LLM model (e.g., `gpt-4o-mini`)
  - A deployed embedding model (e.g., `text-embedding-3-small`)
## Required Environment Variables
You MUST configure the following environment variables in HuggingFace Spaces Settings > Repository secrets:
### Azure OpenAI Configuration (REQUIRED)

| Variable Name | Description | Example |
|---|---|---|
| `AZURE_OPENAI_ENDPOINT` | Your Azure OpenAI resource endpoint | `https://your-resource.openai.azure.com/` |
| `AZURE_OPENAI_API_KEY` | Your Azure OpenAI API key | `abc123...` |
| `AZURE_OPENAI_DEPLOYMENT_NAME` | Your LLM deployment name | `gpt-4o-mini` |
| `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME` | Your embedding deployment name | `text-embedding-3-small` |
| `AZURE_OPENAI_API_VERSION` | Azure OpenAI API version | `2024-05-01-preview` |
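The startup check the app performs against these variables can be sketched as follows; `check_required_env` is a hypothetical helper name, not necessarily the project's actual function:

```python
import os

# The five required variables from the table above.
REQUIRED_VARS = [
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
    "AZURE_OPENAI_DEPLOYMENT_NAME",
    "AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME",
    "AZURE_OPENAI_API_VERSION",
]

def check_required_env(env=os.environ):
    """Fail fast at startup, listing every missing name at once,
    instead of surfacing a confusing 404 mid-request."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError(
            "Missing required environment variables: " + ", ".join(missing)
        )
```

Failing at startup with all missing names listed is what produces the clear error shown in the troubleshooting section below.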
### LangFuse Observability (Optional)

| Variable Name | Description | Default |
|---|---|---|
| `LANGFUSE_ENABLED` | Enable/disable LangFuse tracing | `true` |
| `LANGFUSE_PUBLIC_KEY` | LangFuse public key | (required if enabled) |
| `LANGFUSE_SECRET_KEY` | LangFuse secret key | (required if enabled) |
| `LANGFUSE_HOST` | LangFuse host URL | `https://cloud.langfuse.com` |
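The "required if enabled" interaction above can be illustrated with a hypothetical loader (the function name and dict layout are illustrative, not the project's actual code):

```python
import os

def load_langfuse_settings(env=os.environ):
    """Read the optional LangFuse variables, applying the defaults
    from the table above."""
    enabled = env.get("LANGFUSE_ENABLED", "true").strip().lower() == "true"
    settings = {
        "enabled": enabled,
        "public_key": env.get("LANGFUSE_PUBLIC_KEY"),
        "secret_key": env.get("LANGFUSE_SECRET_KEY"),
        "host": env.get("LANGFUSE_HOST", "https://cloud.langfuse.com"),
    }
    # The keys are only required when tracing is enabled.
    if enabled and not (settings["public_key"] and settings["secret_key"]):
        raise ValueError(
            "LANGFUSE_PUBLIC_KEY and LANGFUSE_SECRET_KEY are required "
            "when LANGFUSE_ENABLED=true"
        )
    return settings
```

Note that because `LANGFUSE_ENABLED` defaults to `true`, leaving all four variables unset is an error; set `LANGFUSE_ENABLED=false` to opt out of tracing entirely.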
### MCP Configuration (Optional)

| Variable Name | Description | Default |
|---|---|---|
| `USE_MCP_ARXIV` | Use MCP for arXiv access | `false` |
| `USE_LEGACY_MCP` | Use legacy MCP instead of FastMCP | `false` |
| `MCP_ARXIV_STORAGE_PATH` | MCP server storage path | `./data/mcp_papers/` |
| `FASTMCP_SERVER_PORT` | FastMCP server port | `5555` |
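Reading these flags can be sketched as below; `env_flag` and `mcp_settings` are hypothetical helpers showing how string flags and the defaults in the table might be parsed, not the project's actual API:

```python
import os

def env_flag(name, default=False, env=os.environ):
    """Parse a boolean flag such as USE_MCP_ARXIV from the environment."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in ("1", "true", "yes", "on")

def mcp_settings(env=os.environ):
    """Collect the MCP options with the defaults from the table above."""
    return {
        "use_mcp_arxiv": env_flag("USE_MCP_ARXIV", False, env),
        "use_legacy_mcp": env_flag("USE_LEGACY_MCP", False, env),
        "storage_path": env.get("MCP_ARXIV_STORAGE_PATH", "./data/mcp_papers/"),
        "port": int(env.get("FASTMCP_SERVER_PORT", "5555")),
    }
```

Parsing flags case-insensitively matters in Spaces secrets, where values like `True` or `TRUE` are easy to enter by hand.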
## Common Deployment Issues
### 1. 404 Error: "Resource not found"

Symptoms:

```
Error code: 404 - {'error': {'code': '404', 'message': 'Resource not found'}}
```
Cause: Missing or incorrect `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME` variable.
Solution:
- Go to HuggingFace Spaces Settings > Repository secrets
- Add `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME` with your embedding deployment name
- Verify the deployment exists in your Azure OpenAI resource
### 2. Missing Environment Variables

Symptoms:

```
ValueError: Missing required environment variables: AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
```
Solution: The app validates all required variables on startup. Follow the error message to set the missing variables in HuggingFace Spaces secrets.
### 3. MCP Dependency Conflicts

Symptoms:

```
ImportError: cannot import name 'FastMCP'
```
Solution: The `huggingface_startup.sh` script automatically fixes MCP version conflicts. Ensure this script is configured as the startup command in your Space's settings.
## Deployment Steps
### 1. Create a New Space
- Go to huggingface.co/spaces
- Click "Create new Space"
- Select "Gradio" as the SDK
- Choose Python 3.10 as the version
- Set the Space name and visibility
### 2. Configure Repository Secrets
- Go to your Space's Settings
- Scroll to "Repository secrets"
- Add all required environment variables listed above
- Click "Save" after adding each variable
### 3. Configure Startup Command

In your Space's README.md, ensure the frontmatter is configured as follows:

```yaml
---
title: Multi-Agent Research Paper Analysis
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.11.0
python_version: 3.10
app_file: app.py
startup_duration_timeout: 5m
---
```
In your Space settings, set the startup command to:

```shell
bash huggingface_startup.sh
```
### 4. Push Your Code

```shell
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
git push hf main
```
### 5. Monitor Deployment

- Watch the build logs in HuggingFace Spaces
- Look for the environment variable check output:

```
🔍 Checking environment variables...
✅ Found: AZURE_OPENAI_ENDPOINT
✅ Found: AZURE_OPENAI_API_KEY
✅ Found: AZURE_OPENAI_DEPLOYMENT_NAME
✅ Found: AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME
```

- If any variables are missing, the deployment will fail with clear instructions
## Verifying Deployment
Once deployed, test your Space:
- Open the Space URL
- Enter a research query (e.g., "transformer architectures in NLP")
- Select an arXiv category
- Click "Analyze Papers"
- Verify that papers are retrieved and analyzed successfully
## Troubleshooting
### Check Logs
View real-time logs in HuggingFace Spaces:
- Go to your Space
- Click on "Logs" tab
- Look for error messages or warnings
### Validate Azure OpenAI Deployments
Ensure your deployments exist:
- Go to portal.azure.com
- Navigate to your Azure OpenAI resource
- Click "Model deployments"
- Verify both LLM and embedding deployments are listed and active
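You can also exercise the embedding deployment directly with the `openai` package's `AzureOpenAI` client; the sketch below is an illustrative smoke test, and the function names are hypothetical, not part of this project:

```python
import os

def azure_client_kwargs(env=os.environ):
    """Collect the AzureOpenAI constructor arguments from the
    required variables listed earlier in this guide."""
    return {
        "azure_endpoint": env["AZURE_OPENAI_ENDPOINT"],
        "api_key": env["AZURE_OPENAI_API_KEY"],
        "api_version": env["AZURE_OPENAI_API_VERSION"],
    }

def smoke_test_embeddings(env=os.environ):
    """Call the embedding deployment once. A 404 here reproduces the
    'Resource not found' error from the troubleshooting section and
    usually means AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME is wrong."""
    from openai import AzureOpenAI  # requires the `openai` package
    client = AzureOpenAI(**azure_client_kwargs(env))
    resp = client.embeddings.create(
        model=env["AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME"],
        input="connectivity check",
    )
    return len(resp.data[0].embedding)
```

Running `smoke_test_embeddings()` locally with your `.env` loaded confirms the credentials and deployment names before you ever push to Spaces.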
### Test Locally First

Before deploying to HuggingFace Spaces:
- Copy `.env.example` to `.env`
- Fill in your Azure OpenAI credentials
- Run `python app.py` locally
- Verify everything works
- Then push to HuggingFace Spaces
## Performance Considerations
- Cold Start: First load may take 1-2 minutes as dependencies initialize
- Memory: Recommended minimum 4GB RAM
- Storage: ~500MB for dependencies + downloaded papers
- Timeout: Set `startup_duration_timeout: 5m` in README.md
## Security Best Practices
- Never commit API keys to the repository
- Use HuggingFace Spaces secrets for all sensitive variables
- Rotate keys regularly in both Azure and HuggingFace
- Monitor usage in Azure OpenAI to prevent unexpected costs
- Set rate limits in Azure to prevent abuse
## Cost Management
- Embedding costs: ~$0.02 per 1M tokens
- LLM costs: ~$0.15-$0.60 per 1M tokens (depending on model)
- Typical analysis: 5 papers costs ~$0.10-$0.50
- Monitor usage: Use Azure OpenAI metrics dashboard
- LangFuse observability: Track token usage and costs per request
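The figures above translate into a simple back-of-envelope calculation; the helper below is illustrative and uses the guide's approximate rates, not authoritative Azure pricing:

```python
def estimate_cost_usd(llm_tokens, embedding_tokens,
                      llm_rate_per_m=0.15, embed_rate_per_m=0.02):
    """Rough cost estimate using the approximate rates listed above
    (USD per 1M tokens); actual Azure pricing varies by model and region."""
    return (llm_tokens / 1e6) * llm_rate_per_m \
         + (embedding_tokens / 1e6) * embed_rate_per_m
```

For example, an analysis that consumes ~1M LLM tokens and ~2M embedding tokens lands around $0.19 at these rates, consistent with the per-analysis range above.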
## Support
For issues specific to:
- This application: Open an issue on GitHub
- HuggingFace Spaces: Check HuggingFace Docs
- Azure OpenAI: Consult Azure OpenAI Documentation