MCP Download Issue - Fix Documentation
Problem Summary
The MCP arXiv client's download_paper tool completed successfully on the remote MCP server, but the downloaded PDF files never appeared in the client's local data/mcp_papers/ directory.
Root Cause
The issue stems from the client-server architecture of MCP (Model Context Protocol):
- MCP Server runs as a separate process (possibly remote)
- Server downloads PDFs to its own storage location
- Server returns {"status": "success"} without a file path
- Client expects files in its local data/mcp_papers/ directory
- No file transfer mechanism exists between server and client storage
This is fundamentally a storage path mismatch between what the server uses and what the client expects.
Solution Implemented
1. Tool Discovery (Diagnostic)
Added automatic tool discovery when connecting to MCP server:
- Lists all available MCP tools at session initialization
- Logs tool names, descriptions, and schemas
- Helps diagnose what capabilities the server provides
Location: utils/mcp_arxiv_client.py:88-112 (_discover_tools method)
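A minimal sketch of what such a discovery step might look like follows. It is not the code in utils/mcp_arxiv_client.py. It assumes the session object exposes an async list_tools() call whose result has a tools list with name, description and inputSchema fields, as ClientSession does in the MCP Python SDK. FakeSession is a hypothetical stand-in so the example runs without a server.

```python
import asyncio
import logging
from types import SimpleNamespace

logger = logging.getLogger("utils.mcp_arxiv_client")


async def discover_tools(session):
    """Log each tool the MCP server advertises and return the tool names."""
    result = await session.list_tools()
    tools = list(result.tools)
    logger.info("MCP server provides %d tools:", len(tools))
    for tool in tools:
        logger.info("  - %s: %s", tool.name, tool.description)
        # inputSchema holds the JSON Schema of the tool's arguments.
        logger.debug("    schema: %s", getattr(tool, "inputSchema", None))
    return [tool.name for tool in tools]


class FakeSession:
    """Hypothetical stand-in for a connected session (illustration only)."""

    async def list_tools(self):
        return SimpleNamespace(tools=[
            SimpleNamespace(name="search_papers",
                            description="Search arXiv for papers",
                            inputSchema={"type": "object"}),
            SimpleNamespace(name="download_paper",
                            description="Download paper PDF",
                            inputSchema={"type": "object"}),
        ])


tool_names = asyncio.run(discover_tools(FakeSession()))
```

Returning the names makes it easy for the caller to check whether download_paper is offered before attempting an MCP download.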
2. Direct Download Fallback
Implemented a fallback mechanism that downloads PDFs directly from arXiv when MCP download fails:
- Detects when MCP download completes but file is not accessible
- Downloads PDF directly from https://arxiv.org/pdf/{paper_id}.pdf
- Writes file to client's local storage directory
- Maintains same retry logic and error handling
Location: utils/mcp_arxiv_client.py:114-152 (_download_from_arxiv_direct method)
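The fallback could be sketched roughly as below, using only the standard library. The function name, the retry count, the linear backoff and the %PDF header check are assumptions for illustration, not the project's actual _download_from_arxiv_direct method. The fetch parameter is a hypothetical hook that lets the network call be swapped out.

```python
import time
import urllib.request
from pathlib import Path

ARXIV_PDF_URL = "https://arxiv.org/pdf/{paper_id}.pdf"


def fetch_url(url, timeout=30.0):
    """Default fetcher: return the response body for url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_from_arxiv_direct(paper_id, storage_dir, fetch=fetch_url,
                               retries=3, backoff=1.0):
    """Fetch the PDF for paper_id and write it under storage_dir.

    Returns the written path, or None once every attempt has failed.
    """
    url = ARXIV_PDF_URL.format(paper_id=paper_id)
    target = Path(storage_dir) / f"{paper_id}.pdf"
    target.parent.mkdir(parents=True, exist_ok=True)
    for attempt in range(1, retries + 1):
        try:
            data = fetch(url)
            # Reject HTML error pages that arrive with a 200 status.
            if not data.startswith(b"%PDF"):
                raise ValueError("response is not a PDF")
            target.write_bytes(data)
            return target
        except (OSError, ValueError):
            # urllib's URLError and HTTPError are OSError subclasses.
            if attempt < retries:
                time.sleep(backoff * attempt)
    return None
```

Calling download_from_arxiv_direct("1706.03762", "data/mcp_papers") would write data/mcp_papers/1706.03762.pdf if the request succeeds.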
3. Enhanced Error Handling
Updated download_paper_async to:
- Try MCP download first (preserves existing functionality)
- Check multiple possible file locations
- Fall back to direct download if MCP fails
- Provide detailed logging at each step
Location: utils/mcp_arxiv_client.py:462-479 (updated error handling)
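Checking several candidate locations might look like the following sketch. The helper name find_downloaded_pdf and the reported_path parameter are hypothetical, and the prefix glob is one possible reading of "any file matching the paper ID".

```python
from pathlib import Path


def find_downloaded_pdf(paper_id, storage_dir, reported_path=None):
    """Return the first existing candidate file for paper_id, or None."""
    storage = Path(storage_dir)
    # 1. The path the client expects.
    candidates = [storage / f"{paper_id}.pdf"]
    # 2. A path reported by the server, if the response carried one.
    if reported_path:
        candidates.append(Path(reported_path))
    # 3. Any PDF in storage whose name starts with the paper ID,
    #    which also covers versioned names such as 1706.03762v7.pdf.
    if storage.is_dir():
        candidates.extend(sorted(storage.glob(f"{paper_id}*.pdf")))
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

A None result is the signal to fall back to the direct download.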
How It Works Now
Download Flow
1. Check if file already exists locally → Return if found
2. Call MCP server's download_paper tool
3. Check if file appeared in expected locations:
a. Expected path: data/mcp_papers/{paper_id}.pdf
b. MCP-returned path (if provided in response)
c. Any file in storage matching paper_id
4. If file not found → Fall back to direct arXiv download
5. Download PDF directly to client storage
6. Return path to downloaded file
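The six steps above can be sketched as a single coroutine. This is an illustration under stated assumptions rather than the real download_paper_async: mcp_download and direct_download are hypothetical callables passed in by the caller, and the "path" key in the server response is assumed, since the server described here returns only a status.

```python
from pathlib import Path


async def download_paper(paper_id, storage_dir, mcp_download, direct_download):
    """Return a local path for paper_id, trying the MCP server first.

    mcp_download(paper_id) is awaited and may return a dict.
    direct_download(paper_id, storage) returns a Path or None.
    """
    storage = Path(storage_dir)
    expected = storage / f"{paper_id}.pdf"

    # Step 1: reuse a file that is already present locally.
    if expected.is_file():
        return expected

    # Step 2: ask the MCP server to download the paper.
    reported = None
    try:
        response = await mcp_download(paper_id)
        if isinstance(response, dict):
            reported = response.get("path")  # assumed key, may be absent
    except Exception:
        # A failed MCP call is handled like a missing file.
        pass

    # Step 3: look in every place the file could have landed.
    candidates = [expected]
    if reported:
        candidates.append(Path(reported))
    if storage.is_dir():
        candidates.extend(sorted(storage.glob(f"{paper_id}*.pdf")))
    for candidate in candidates:
        if candidate.is_file():
            return candidate

    # Steps 4-6: fall back to a direct download into client storage.
    return direct_download(paper_id, storage)
```

Passing the two download functions in as arguments keeps the ordering logic testable without a server or network access.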
Benefits
- Zero breaking changes: Existing MCP functionality preserved
- Automatic fallback: Works even with remote MCP servers
- Better diagnostics: Tool discovery helps troubleshoot issues
- More reliable downloads: Direct fallback retrieves files when the MCP path fails, as long as arXiv is reachable
- Client-side storage: Files always accessible to client process
Using the Fix
Running the Application
No changes needed! The fix is automatic:
# Set environment variables (optional - defaults work)
export USE_MCP_ARXIV=true
export MCP_ARXIV_STORAGE_PATH=data/mcp_papers
# Run the application
python app.py
The system will:
- Try MCP download first
- Automatically fall back to direct download if needed
- Log which method succeeded
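For reference, the two environment variables above could be read along these lines. This is a sketch only: the function name load_mcp_settings, the set of accepted truthy strings, and both default values are assumptions, because this document does not state what the application's defaults are.

```python
import os
from pathlib import Path


def load_mcp_settings(env=None):
    """Return (use_mcp, storage_path) from environment-style settings."""
    env = os.environ if env is None else env
    # Assumed default: MCP disabled unless explicitly switched on.
    raw_flag = env.get("USE_MCP_ARXIV", "false")
    use_mcp = raw_flag.strip().lower() in {"1", "true", "yes"}
    # Assumed default: the storage path used throughout this document.
    storage = Path(env.get("MCP_ARXIV_STORAGE_PATH", "data/mcp_papers"))
    return use_mcp, storage
```

Accepting a mapping instead of always reading os.environ makes the parsing easy to check in isolation.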
Running Diagnostics
Use the diagnostic script to test your MCP setup:
python test_mcp_diagnostic.py
This will:
- Check environment configuration
- Verify storage directory setup
- List available MCP tools
- Test search functionality
- Test download with detailed logging
- Show file system state before/after
Expected Output:
================================================================================
MCP arXiv Client Diagnostic Test
================================================================================
[1] Environment Configuration:
USE_MCP_ARXIV: true
MCP_ARXIV_STORAGE_PATH: data/mcp_papers
[2] Storage Directory:
Path: /path/to/data/mcp_papers
Exists: True
Contains 0 PDF files
[3] Initializing MCP Client:
✓ Client initialized successfully
[4] Testing Search Functionality:
✓ Search successful, found 2 papers
First paper: Attention Is All You Need...
Paper ID: 1706.03762
[5] Testing Download Functionality:
Attempting to download: 1706.03762
PDF URL: https://arxiv.org/pdf/1706.03762.pdf
✓ Download successful!
File path: data/mcp_papers/1706.03762v7.pdf
File exists: True
File size: 2,215,520 bytes (2.11 MB)
[6] Storage Directory After Download:
Contains 1 PDF files
Files: ['1706.03762v7.pdf']
[7] Cleaning Up:
✓ MCP session closed
================================================================================
Diagnostic Test Complete
================================================================================
Interpreting Logs
Successful MCP Download
If MCP server works correctly, you'll see:
2025-11-12 01:50:27 - utils.mcp_arxiv_client - INFO - Downloading paper 2203.08975v2 via MCP
2025-11-12 01:50:27 - utils.mcp_arxiv_client - INFO - MCP download_paper response type: <class 'dict'>
2025-11-12 01:50:27 - utils.mcp_arxiv_client - INFO - Successfully downloaded paper to data/mcp_papers/2203.08975v2.pdf
Fallback to Direct Download
If MCP fails but direct download succeeds:
2025-11-12 01:50:27 - utils.mcp_arxiv_client - WARNING - File not found at expected path
2025-11-12 01:50:27 - utils.mcp_arxiv_client - ERROR - MCP download call completed but file not found
2025-11-12 01:50:27 - utils.mcp_arxiv_client - WARNING - Falling back to direct arXiv download...
2025-11-12 01:50:27 - utils.mcp_arxiv_client - INFO - Attempting direct download from arXiv for 2203.08975v2
2025-11-12 01:50:28 - utils.mcp_arxiv_client - INFO - Successfully downloaded 1234567 bytes to data/mcp_papers/2203.08975v2.pdf
Tool Discovery
At session initialization:
2025-11-12 01:50:26 - utils.mcp_arxiv_client - INFO - MCP server provides 3 tools:
2025-11-12 01:50:26 - utils.mcp_arxiv_client - INFO - - search_papers: Search arXiv for papers
2025-11-12 01:50:26 - utils.mcp_arxiv_client - INFO - - download_paper: Download paper PDF
2025-11-12 01:50:26 - utils.mcp_arxiv_client - INFO - - list_papers: List cached papers
Troubleshooting
Issue: MCP server not found
Symptom: Error during initialization: command not found: arxiv-mcp-server
Solution:
- Ensure MCP server is installed and in PATH
- Check server configuration in your MCP settings
- Try using direct ArxivClient instead:
export USE_MCP_ARXIV=false
Issue: Files still not downloading
Symptom: Both MCP and direct download fail
Possible causes:
- Network connectivity issues
- arXiv API rate limiting
- Invalid paper IDs
- Storage directory permissions
Debugging steps:
# Check network connectivity
curl https://arxiv.org/pdf/1706.03762.pdf -o test.pdf
# Check storage permissions
ls -la data/mcp_papers/
touch data/mcp_papers/test.txt
# Run diagnostic script
python test_mcp_diagnostic.py
Issue: MCP server uses different storage path
Symptom: MCP downloads succeed but client can't find files
Current solution: Direct download fallback handles this automatically
Future enhancement: Could add file transfer mechanism if MCP provides retrieval tools
Technical Details
Architecture Decision: Why Fallback Instead of File Transfer?
We chose direct download fallback over implementing a file transfer mechanism because:
- Server is third-party: Cannot modify MCP server to add file retrieval tools
- Simpler implementation: Direct download is straightforward and reliable
- Better performance: Avoids two-step download (server → client transfer)
- Same result: Client gets PDFs either way
- Fail-safe: Works even if MCP server is completely unavailable
Performance Impact
- MCP successful: No performance change (same as before)
- MCP fails: Extra ~2-5 seconds for direct download
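- Network overhead: The client downloads each PDF once either way; in the fallback case the server has also fetched its own copy
- Storage: The client keeps a single local copy; anything the server saved stays in the server's own storage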
Comparison with Direct ArxivClient
| Feature | MCPArxivClient (with fallback) | Direct ArxivClient |
|---|---|---|
| Search via MCP | ✓ | ✗ |
| Download via MCP | Tries first | ✗ |
| Direct download | Fallback | Primary |
| Remote MCP server | ✓ | N/A |
| File storage | Client-side | Client-side |
| Reliability | High (dual method) | High |
Future Enhancements
If MCP server capabilities expand, possible improvements:
- File retrieval tool: MCP server adds get_file(paper_id) tool
- Streaming transfer: MCP response includes base64-encoded PDF
- Shared storage: Configure MCP server to write to shared filesystem
- Batch downloads: Optimize multi-paper downloads
For now, the fallback solution provides robust, reliable downloads without requiring MCP server changes.
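To make the streaming-transfer idea concrete, the client side might decode such a payload as sketched below. No current server response is known to carry this data: the pdf_base64 field name and the helper save_pdf_from_response are purely hypothetical.

```python
import base64
import binascii
from pathlib import Path


def save_pdf_from_response(response, paper_id, storage_dir):
    """Decode a base64 PDF carried in response and write it locally.

    Returns the written path, or None if the payload is absent or invalid.
    """
    encoded = response.get("pdf_base64")  # hypothetical field name
    if not encoded:
        return None
    try:
        # validate=True rejects characters outside the base64 alphabet.
        data = base64.b64decode(encoded, validate=True)
    except (binascii.Error, ValueError):
        return None
    # Only accept payloads that start with the PDF magic bytes.
    if not data.startswith(b"%PDF"):
        return None
    target = Path(storage_dir) / f"{paper_id}.pdf"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)
    return target
```

Base64 inflates the payload by roughly a third, so this approach trades transfer size for not needing shared storage.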
Files Modified
- utils/mcp_arxiv_client.py: Core client with fallback logic
- test_mcp_diagnostic.py: New diagnostic script
- MCP_FIX_DOCUMENTATION.md: This document
Testing
Run the test suite to verify the fix:
# Test MCP client
pytest tests/test_mcp_arxiv_client.py -v
# Run diagnostic
python test_mcp_diagnostic.py
# Full integration test
python app.py
# Then use the Gradio UI to analyze papers with MCP enabled
Summary
The fix ensures reliable PDF downloads by combining MCP capabilities with direct arXiv fallback:
- ✓ Preserves MCP functionality for servers that work correctly
- ✓ Automatic fallback when MCP fails or files aren't accessible
- ✓ No configuration changes required
- ✓ Better diagnostics via tool discovery
- ✓ Comprehensive logging for troubleshooting
- ✓ Zero breaking changes to existing code
The system now works reliably with remote MCP servers, local servers, or no MCP at all.