# Quick Deployment Guide
Follow these steps to deploy the Lineage Graph Extractor to Hugging Face Spaces:
## Quick Start (5 minutes)
### 1. Create Space
```bash
# Go to: https://huggingface.co/new-space
# Choose: Gradio SDK
# Hardware: CPU Basic (free)
```
### 2. Upload Files
Upload these files from `/hf_space/` to your Space:
- ✅ `app.py`
- ✅ `requirements.txt`
- ✅ `README.md`
- ⚠️ `.env.example` (optional reference)
- ⚠️ `SETUP_GUIDE.md` (optional)
### 3. Add Secrets
In Space Settings → Repository Secrets, add:
- `ANTHROPIC_API_KEY` - Your Claude API key (**required**)
- `GOOGLE_CLOUD_PROJECT` - For BigQuery (optional)
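Spaces exposes repository secrets to the running app as environment variables, so `app.py` can check for them at startup and report a clear message instead of failing on the first API call. The following is a minimal sketch; the helper name `missing_secrets` is hypothetical and not part of the template.

```python
import os

# Secret names taken from the list above.
REQUIRED_SECRETS = ("ANTHROPIC_API_KEY",)
OPTIONAL_SECRETS = ("GOOGLE_CLOUD_PROJECT",)


def missing_secrets(env=None, required=REQUIRED_SECRETS):
    """Return the names of required secrets that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_secrets()
    if missing:
        print("Missing required secrets: " + ", ".join(missing))
    else:
        print("All required secrets are set.")
```

Calling `missing_secrets()` before building the Gradio interface makes a misconfigured Space easy to spot in the "Logs" tab.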
### 4. Wait for Build
- Space will automatically build (2-3 minutes)
- Check "Logs" tab for any errors
- Once ready, the app will be live!
## Detailed Step-by-Step
### Method 1: Web Interface (Easiest)
1. **Create Space**
- Go to https://huggingface.co/spaces
- Click "Create new Space"
- Name: `lineage-graph-extractor`
- SDK: Gradio
- Click "Create Space"
2. **Upload Files**
- Click "Files and versions"
- Click "Add file" → "Upload files"
- Select all files from `/hf_space/`
- Click "Commit changes"
3. **Configure Secrets**
- Click "Settings"
- Scroll to "Repository secrets"
- Add `ANTHROPIC_API_KEY` with your API key
- Save
4. **Verify Deployment**
- Go to "App" tab
- Wait for build to complete
- Test the interface
### Method 2: Git CLI (For Developers)
```bash
# Clone your Space
git clone https://huggingface.co/spaces/YOUR_USERNAME/lineage-graph-extractor
cd lineage-graph-extractor
# Copy files (adjust path to where you saved the files)
cp /path/to/hf_space/app.py .
cp /path/to/hf_space/requirements.txt .
cp /path/to/hf_space/README.md .
# Commit and push
git add .
git commit -m "Initial deployment"
git push
```
Then add secrets via the web interface (Settings → Repository secrets).
### Method 3: Hugging Face CLI
```bash
# Install Hugging Face CLI
pip install huggingface_hub
# Login
huggingface-cli login
# Create Space
huggingface-cli repo create lineage-graph-extractor --type space --space_sdk gradio
# Upload files
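# --repo-type space is required; without it the CLI targets a model repo
huggingface-cli upload YOUR_USERNAME/lineage-graph-extractor /path/to/hf_space/ . --repo-type space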
```
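The same steps can also be scripted from Python with the `huggingface_hub` library, which may be convenient for repeat deployments. This is a sketch under the assumption that you have already logged in with `huggingface-cli login`; the function names `space_repo_id` and `deploy_space` are hypothetical.

```python
def space_repo_id(username, space_name="lineage-graph-extractor"):
    """Build the full repo ID ("user/space") that the Hub API expects."""
    return f"{username}/{space_name}"


def deploy_space(username, folder_path, space_name="lineage-graph-extractor"):
    """Create the Space if it does not exist, then upload the folder contents."""
    # Imported here so the helper above works without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up the token saved by `huggingface-cli login`
    repo_id = space_repo_id(username, space_name)
    api.create_repo(
        repo_id=repo_id,
        repo_type="space",
        space_sdk="gradio",
        exist_ok=True,  # do not fail if the Space was already created
    )
    api.upload_folder(folder_path=folder_path, repo_id=repo_id, repo_type="space")
    return repo_id
```

For example, `deploy_space("YOUR_USERNAME", "/path/to/hf_space/")` would create the Space and push the files in one call.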
## Important: Connect Your Agent
⚠️ **The template needs your agent integration!**
The `app.py` file contains placeholder functions. You need to integrate your actual agent:
### Quick Integration Example
Edit `app.py` and replace the `extract_lineage_from_text` function:
```python
import os

import anthropic

# The key comes from the Space's repository secrets (exposed as environment variables)
client = anthropic.Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))


def extract_lineage_from_text(metadata_text, source_type, viz_format):
    """Extract lineage using Claude AI agent."""
    prompt = f"""
You are a lineage extraction expert. Extract data lineage from this {source_type} metadata
and create a {viz_format} visualization.

Metadata:
{metadata_text}

Return:
1. The visualization code
2. A line containing only ---
3. A brief summary
"""
    response = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=4000,
        messages=[{"role": "user", "content": prompt}],
    )

    # Parse response to extract visualization and summary
    text = response.content[0].text

    # Simple parsing (improve this based on your needs): the prompt asks for a
    # "---" separator; if it is missing, the whole reply is the visualization.
    parts = text.split("---", 1)
    visualization = parts[0]
    summary = parts[1] if len(parts) > 1 else "Lineage extracted successfully"
    return visualization.strip(), summary.strip()
```
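Splitting on every occurrence of `---` is fragile, because Mermaid itself uses `---` inside link syntax such as `A --- B`. One possible improvement, sketched below, is to split only on a line that contains nothing but `---`. The helper name `split_response` is hypothetical, and the approach assumes your prompt tells the model to put the separator on its own line.

```python
import re

# A delimiter is a line holding only "---" (optionally padded with spaces or tabs).
_DELIMITER_LINE = re.compile(r"^[ \t]*---[ \t]*$", re.MULTILINE)


def split_response(text, default_summary="Lineage extracted successfully"):
    """Split a model reply into (visualization, summary)."""
    # maxsplit=1 keeps any later delimiter lines inside the summary.
    parts = _DELIMITER_LINE.split(text, maxsplit=1)
    visualization = parts[0].strip()
    summary = parts[1].strip() if len(parts) > 1 else ""
    return visualization, summary or default_summary
```

Inside `extract_lineage_from_text`, the final lines could then become `return split_response(text)`.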
### Using Agent Memory Files
To use your full agent configuration:
1. Copy `/memories/` directory to Space:
```bash
cp -r /memories /path/to/space/
```
2. Reference agent instructions in your code:
```python
with open("memories/agent.md", encoding="utf-8") as f:
    agent_instructions = f.read()
# Use instructions in prompts
```
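One way to use the loaded instructions is to pass them as the `system` prompt of the Messages API call, keeping the user message for the metadata itself. The sketch below assumes that layout; `load_agent_instructions`, `build_request`, and the fallback text are hypothetical and not part of the template.

```python
from pathlib import Path

# Fallback used when the memories directory was not copied to the Space.
DEFAULT_INSTRUCTIONS = "You are a lineage extraction expert."


def load_agent_instructions(path="memories/agent.md", default=DEFAULT_INSTRUCTIONS):
    """Read the agent instructions, falling back to a default if the file is absent."""
    file = Path(path)
    if not file.is_file():
        return default
    return file.read_text(encoding="utf-8").strip() or default


def build_request(metadata_text, instructions, model="claude-3-5-sonnet-20241022"):
    """Assemble keyword arguments for client.messages.create()."""
    return {
        "model": model,
        "max_tokens": 4000,
        "system": instructions,  # agent instructions go in the system prompt
        "messages": [{"role": "user", "content": metadata_text}],
    }
```

With the `client` from the integration example, the call would look like `client.messages.create(**build_request(metadata_text, load_agent_instructions()))`.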
## Post-Deployment
### Test Functionality
1. ✅ Text/File extraction works
2. ✅ BigQuery integration (if configured)
3. ✅ URL fetching works
4. ✅ Visualizations render correctly
### Optimize Performance
- Upgrade hardware if needed (Settings → Hardware)
- Add caching for repeated queries
- Implement rate limiting
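Caching and rate limiting can both be prototyped with the standard library before reaching for anything heavier. The sketch below is illustrative only: `RateLimiter`, `cached_extraction`, and the stand-in `expensive_extraction` are hypothetical names, and the in-memory state is lost whenever the Space restarts.

```python
import time
from collections import deque
from functools import lru_cache


class RateLimiter:
    """Allow at most max_calls within any sliding window of `period` seconds."""

    def __init__(self, max_calls, period, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.calls = deque()  # timestamps of accepted calls, oldest first

    def allow(self):
        now = self.clock()
        # Drop timestamps that have fallen out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            return False
        self.calls.append(now)
        return True


def expensive_extraction(metadata_text, source_type, viz_format):
    """Hypothetical stand-in for the real Claude call in app.py."""
    return f"{viz_format} lineage for {source_type} ({len(metadata_text)} chars)"


@lru_cache(maxsize=128)
def cached_extraction(metadata_text, source_type, viz_format):
    """Serve identical inputs from memory instead of calling the API again."""
    return expensive_extraction(metadata_text, source_type, viz_format)
```

A Gradio handler could check `limiter.allow()` first and return a "please retry later" message when it is `False`, then call `cached_extraction(...)` otherwise.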
### Share Your Space
- Make it public (Settings → Visibility)
- Share URL: `https://huggingface.co/spaces/YOUR_USERNAME/lineage-graph-extractor`
- Add to your profile or collection
## Costs
- **Basic CPU**: Free forever ✅
- **Upgraded CPU**: ~$0.03/hour
- **GPU**: ~$0.60/hour (if needed for heavy processing)
- **API costs**: Anthropic Claude API usage (pay-as-you-go)
## Troubleshooting
### Build Fails
- Check requirements.txt for incompatible versions
- Review logs for specific error messages
- Ensure Python 3.9+ compatibility
### App Won't Load
- Verify `app.py` has no syntax errors
- Check that `demo.launch()` is called
- Review Space logs
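The first two checks above can be run locally before pushing, without importing Gradio at all. The sketch below parses the file with Python's `ast` module; `check_app_source` is a hypothetical helper, and it only detects that some `.launch()` call exists, not that it is reached at runtime.

```python
import ast


def check_app_source(source):
    """Return a list of problems found in the source of app.py (empty if none)."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"Syntax error on line {err.lineno}: {err.msg}"]

    # Look for any call of the form <something>.launch(...)
    has_launch = any(
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Attribute)
        and node.func.attr == "launch"
        for node in ast.walk(tree)
    )
    return [] if has_launch else ["No .launch() call found"]
```

For example, `check_app_source(open("app.py", encoding="utf-8").read())` returns an empty list when both checks pass.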
### API Errors
- Verify `ANTHROPIC_API_KEY` is set correctly
- Check API key has proper permissions
- Monitor API usage and rate limits
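Transient failures such as rate-limit responses can often be handled by retrying with exponential backoff. The helper below is a generic sketch with a hypothetical name, `call_with_retry`; in `app.py` you would likely narrow `retry_on` to the specific exception you want to retry, for example `anthropic.RateLimitError`.

```python
import time


def call_with_retry(func, retries=3, base_delay=1.0, retry_on=(Exception,), sleep=time.sleep):
    """Call func(), retrying with exponential backoff on the given exceptions."""
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original error
            # Delay doubles each time: base_delay, 2x, 4x, ...
            sleep(base_delay * (2 ** attempt))
```

The API call can be wrapped in a lambda, for example `call_with_retry(lambda: client.messages.create(...))`.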
### Visualization Issues
- Test Mermaid syntax at https://mermaid.live/
- Ensure proper code block formatting
- Check browser console for rendering errors
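Model replies sometimes arrive already wrapped in a code fence and sometimes as bare Mermaid source, which makes formatting inconsistent. The sketch below normalizes both cases; `extract_mermaid` and `as_mermaid_block` are hypothetical helpers, and they assume the reply contains a single diagram with no surrounding prose.

```python
FENCE = "`" * 3  # three backticks, built this way so the snippet embeds cleanly


def extract_mermaid(text):
    """Return the Mermaid source from a reply, with any code fence stripped."""
    lines = text.strip().splitlines()
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]  # drop the opening fence (with or without a label)
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence
    return "\n".join(lines).strip()


def as_mermaid_block(text):
    """Wrap Mermaid source in exactly one fenced block labelled mermaid."""
    return f"{FENCE}mermaid\n{extract_mermaid(text)}\n{FENCE}"
```

The output of `extract_mermaid` can be pasted directly into https://mermaid.live/ to confirm that the diagram itself is valid.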
## Support
- **Hugging Face Docs**: https://huggingface.co/docs/hub/spaces
- **Gradio Docs**: https://gradio.app/docs
- **Community Forum**: https://discuss.huggingface.co/
## Next Steps
1. ✅ Deploy to Hugging Face Spaces
2. 🔧 Integrate your agent backend
3. 🧪 Test with real metadata
4. 🎨 Customize UI/UX
5. 📊 Add analytics
6. 🚀 Share with community
---
**Ready to deploy?** Start with Method 1 (Web Interface) - it's the easiest!