
MeeTARA Hugging Face Spaces Deployment Guide

Quick Start ⭐

Option 1: Connect GitHub Repo (Recommended)

  1. Push to GitHub (if not already):

    git add .
    git commit -m "Add HF Space deployment files"
    git push
    
  2. Create Space on HF:

    • Go to https://huggingface.co/spaces
    • Click "Create new Space"
    • Select "Gradio" SDK
    • Choose "Connect to existing repo"
    • Select your GitHub repo: your-username/meetara
    • Click "Create Space"
  3. Done! HF will:

    • Install dependencies automatically
    • Run app.py
    • Download models from meetara-lab repos on first use

Option 2: Create Separate Space on HF

  1. Create Space:

    • Go to https://huggingface.co/spaces and click "Create new Space"
    • Select the "Gradio" SDK and name the Space (e.g. meetara-lab/meetara-space)
  2. Clone and Copy Files:

    # Clone the Space repo
    git clone https://huggingface.co/spaces/meetara-lab/meetara-space
    cd meetara-space
    
    # Copy files from your repo
    cp -r /path/to/meetara/* .
    
    # Commit and push
    git add .
    git commit -m "Initial MeeTARA Space deployment"
    git push
    

Files Structure

meetara/
├── app.py                 # Main Gradio application
├── download_models.py     # Model downloader from HF Hub
├── requirements.txt       # Python dependencies
├── README.md              # Space documentation (shown on HF)
├── core/                  # Core model/agent logic
├── config/                # Configuration files
└── docs/                  # Documentation

What Gets Deployed

  • ✅ Gradio web interface (app.py)
  • ✅ Model downloader from HF Hub (download_models.py)
  • ✅ Dependencies (requirements.txt)
  • ✅ Space documentation (README.md)

Models

Models are automatically downloaded from your HF repos:

  • meetara-lab/meetara-qwen3-4b-instruct-gguf
  • meetara-lab/meetara-qwen3-4b-thinking-gguf
  • meetara-lab/meetara-qwen3-8b-gguf
  • meetara-lab/meetara-qwen3-1.7b-gguf
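The download step in download_models.py can be sketched with the huggingface_hub library roughly as follows (the short model keys and the idea of passing a GGUF filename are illustrative assumptions; match them to the actual script and the files in each repo):

```python
# Sketch: map short model keys to the meetara-lab GGUF repos listed above.
MODEL_REPOS = {
    "qwen3-4b-instruct": "meetara-lab/meetara-qwen3-4b-instruct-gguf",
    "qwen3-4b-thinking": "meetara-lab/meetara-qwen3-4b-thinking-gguf",
    "qwen3-8b": "meetara-lab/meetara-qwen3-8b-gguf",
    "qwen3-1.7b": "meetara-lab/meetara-qwen3-1.7b-gguf",
}

def download_model(key: str, filename: str) -> str:
    """Download one GGUF file from the Hub and return its local cached path."""
    from huggingface_hub import hf_hub_download  # imported lazily
    return hf_hub_download(repo_id=MODEL_REPOS[key], filename=filename)
```

In Spaces, hf_hub_download caches files under the standard Hub cache directory, so repeated initializations reuse the already-downloaded model.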

First Run

  1. Space builds automatically (takes 2-5 minutes)
  2. Click "Initialize" button in the UI
  3. Models download on first initialization (may take 5-10 minutes)
  4. Start chatting!

Resource Considerations

Free Tier Limits

  • CPU: 2 vCPU
  • RAM: 16GB
  • Storage: 50GB
  • Timeout: 60 seconds per request

Recommendations

  1. Start with 4B Instruct model only (2.3GB) for faster startup
  2. Use lazy loading - only load models when needed
  3. Optimize context size - reduce n_ctx in config for faster inference
  4. Consider CPU-only - disable GPU layers to save memory
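Recommendation 2 (lazy loading) can be as small as a cache that constructs a model only on its first use (a minimal sketch; the names are illustrative, not taken from the codebase):

```python
# Minimal lazy-loading cache: a model is built only on the first request for it.
_loaded = {}

def get_model(name, loader):
    """Return the cached model, calling loader() only on the first request."""
    if name not in _loaded:
        _loaded[name] = loader()
    return _loaded[name]
```

With this pattern, Space startup stays fast because no model is loaded until the first message actually needs one.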

Customization

Modify Model Selection

Edit download_models.py to change which models are downloaded.

Adjust Performance

Edit config/meetara_lab_config.json to optimize for Spaces:

  • Reduce n_ctx (context size)
  • Reduce n_threads (CPU threads)
  • Reduce max_tokens (response length)
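A Spaces-friendly configuration might look like the following (a sketch only; the actual key names and structure of config/meetara_lab_config.json may differ):

```json
{
  "n_ctx": 2048,
  "n_threads": 2,
  "max_tokens": 512
}
```

Lower n_ctx shrinks the KV cache, n_threads should match the free tier's 2 vCPUs, and a smaller max_tokens keeps responses inside the request timeout.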

Change UI

Edit app.py to customize the Gradio interface (themes, layout, features).


Troubleshooting

Models Not Downloading

  • Check HF token is set (usually automatic in Spaces)
  • Verify repo IDs are correct in download_models.py
  • Check Space logs for download errors

Import Errors

  • Ensure all files from core/ and config/ are accessible
  • Check Python path in app.py
  • Verify requirements.txt has all dependencies

Out of Memory

  • Use smaller models (1.7B instead of 4B/8B)
  • Reduce context size in config
  • Enable model offloading (offload_kqv=True)
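With llama-cpp-python, these memory-saving tips are constructor parameters on Llama. A hedged sketch (the model path and values are illustrative, not from the repo):

```python
# Memory-saving settings per the tips above; values are illustrative.
LLAMA_PARAMS = {
    "model_path": "models/qwen3-1.7b.gguf",  # hypothetical path: prefer the 1.7B model
    "n_ctx": 2048,        # smaller context -> smaller KV cache
    "n_gpu_layers": 0,    # CPU-only on the free tier
    "offload_kqv": True,  # as suggested above
}

def load_llm():
    """Construct the model (requires llama-cpp-python to be installed)."""
    from llama_cpp import Llama
    return Llama(**LLAMA_PARAMS)
```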

Slow Startup

  • Pre-download models (use HF's persistent storage)
  • Reduce number of models loaded
  • Optimize initialization code
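If persistent storage is enabled for the Space (a paid add-on, mounted at /data), pointing the Hub cache there keeps downloaded models across rebuilds. A sketch:

```shell
# Point the huggingface_hub cache at persistent storage so downloaded
# models survive rebuilds (assumes persistent storage mounted at /data).
export HF_HOME=/data/huggingface
# Then run the downloader once, e.g.: python download_models.py
```

HF_HOME is the standard environment variable for the Hub cache root; without persistent storage, the cache is wiped on every rebuild.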

Testing Locally

Before deploying, test locally:

cd meetara
pip install -r requirements.txt
python app.py

Visit http://localhost:7860 to test.


Updating the Space

After making changes:

  1. Commit changes to GitHub (if using Option 1)
  2. HF Spaces auto-rebuilds on push
  3. Or manually rebuild in Space settings

Support

For issues: