Hugging Face Space Setup Guide

Understanding Your Repositories

You have TWO separate repositories:

1. Model Repository: nitish-spz/ABTestPredictor

  • Purpose: Store the model files
  • Contains:
    • multimodal_gated_model_2.7_GGG.pth (789 MB)
    • multimodal_cat_mappings_GGG.json
  • Access: Read-only, Space downloads from here

2. Space Repository: SpiralyzeLLC/ABTestPredictor

  • Purpose: Run the Gradio application
  • Contains: All application code + downloads model from repo #1
  • Access: This is what deploys and runs

Required Files in Space Repository

Your Space needs these files (but NOT the large model file):

✅ app.py                          # Main application code
✅ requirements.txt                # Python dependencies
✅ packages.txt                    # System dependencies (tesseract-ocr)
✅ README.md                       # Project documentation
✅ confidence_scores.json          # Confidence data (14KB)
✅ .gitattributes                  # Git LFS config
✅ .dockerignore                   # Build optimization
❌ model/ folder                   # NOT needed - downloads from model repo
❌ patterbs.json                   # NOT needed - removed feature
❌ metadata.js                     # NOT needed - removed feature
❌ confidence_scores.js            # NOT needed - use .json instead
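A .dockerignore along these lines keeps the excluded items out of the build context (a sketch, not necessarily the repo's actual file):

```
# Keep the large model and removed legacy files out of the image build
model/
*.pth
patterbs.json
metadata.js
confidence_scores.js
```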

How the Model Loading Works

Your app.py is configured to:

  1. Check if model exists locally in model/ folder
  2. If not, download from nitish-spz/ABTestPredictor model repository
  3. Cache it for future use
# In app.py lines 707-748
if os.path.exists(MODEL_SAVE_PATH):
    model_path = MODEL_SAVE_PATH
    print("✅ Using local model")
else:
    print("📥 Downloading from Model Hub...")
    model_path = download_model_from_hub()
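`download_model_from_hub()` is presumably a thin wrapper around `huggingface_hub`; a minimal sketch (the constant names here are assumptions, not the actual app.py code):

```python
from huggingface_hub import hf_hub_download

# Assumed names -- the real app.py may use different identifiers
MODEL_REPO = "nitish-spz/ABTestPredictor"
MODEL_FILENAME = "multimodal_gated_model_2.7_GGG.pth"

def download_model_from_hub():
    # Downloads into the Hugging Face cache (~/.cache/huggingface/...)
    # and returns the resolved local path; repeat calls reuse the cache.
    return hf_hub_download(repo_id=MODEL_REPO, filename=MODEL_FILENAME)
```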

Deployment Steps

Step 1: Verify Required Files Exist Locally

cd /Users/nitish/Spiralyze/HuggingFace/Spaces/ABTestPredictor

# Check essential files
ls -lh app.py requirements.txt packages.txt README.md confidence_scores.json

# Should all exist

Step 2: Remove Large/Unnecessary Files

# Remove the local model folder (Space will download from model repo)
rm -rf model/

# Remove unused files from old version
rm -f patterbs.json metadata.js confidence_scores.js frontend.html index_v2.html

Step 3: Verify Git Remote Points to Space

git remote -v
# Should show: https://huggingface.co/spaces/SpiralyzeLLC/ABTestPredictor

Step 4: Commit and Push to Space

# Add all files
git add .

# Commit
git commit -m "Deploy: Add all application files, download model from hub"

# Push to Space
git push origin main

Step 5: Monitor Build

  1. Go to https://huggingface.co/spaces/SpiralyzeLLC/ABTestPredictor
  2. Click "Logs" tab
  3. Watch the build progress
  4. First build takes 5-10 minutes (downloading model)

If Build Fails

Check These Files Exist in Space Repo:

# Essential files checklist
app.py                  ✅
requirements.txt        ✅
packages.txt            ✅
README.md               ✅
confidence_scores.json  ✅
.dockerignore           ✅
.gitattributes          ✅

Verify Model Repo is Accessible

Your app downloads from nitish-spz/ABTestPredictor. Verify:

  1. Go to https://huggingface.co/nitish-spz/ABTestPredictor
  2. Check files are visible
  3. Make sure it's public (not private)
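The same check can be done programmatically; a sketch using `HfApi` from `huggingface_hub` (needs network access when called):

```python
from huggingface_hub import HfApi

def check_model_repo(repo_id: str = "nitish-spz/ABTestPredictor"):
    # Raises RepositoryNotFoundError if the repo is private or missing
    # (when called without a token that grants access).
    info = HfApi().model_info(repo_id)
    # Return the filenames visible in the repo
    return [sibling.rfilename for sibling in info.siblings]
```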

Check requirements.txt

cat requirements.txt

Should contain:

torch
transformers
pandas
scikit-learn
Pillow
gradio
pytesseract
spaces
huggingface_hub
python-dotenv
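Note that several import names differ from the pip package names (scikit-learn imports as `sklearn`, Pillow as `PIL`, python-dotenv as `dotenv`). A stdlib-only sketch to verify everything is importable in the running environment:

```python
import importlib.util

# Import names (not pip names) for the packages in requirements.txt
MODULES = ["torch", "transformers", "pandas", "sklearn", "PIL",
           "gradio", "pytesseract", "spaces", "huggingface_hub", "dotenv"]

# find_spec returns None for modules that are not installed
missing = [m for m in MODULES if importlib.util.find_spec(m) is None]
print("missing modules:", missing or "none")
```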

Check packages.txt

cat packages.txt

Should contain:

tesseract-ocr

Common Issues

Issue 1: "Model file not found"

Cause: Model repo is private or inaccessible
Fix: Make nitish-spz/ABTestPredictor public

Issue 2: "No module named X"

Cause: Missing dependency in requirements.txt
Fix: Add the missing package to requirements.txt

Issue 3: "Tesseract not found"

Cause: Missing system dependency
Fix: Ensure packages.txt contains tesseract-ocr
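pytesseract only wraps the `tesseract` command-line binary, so the quickest diagnostic is whether that binary is on PATH:

```python
import shutil

# On Spaces, packages.txt installs tesseract-ocr via apt during the build;
# shutil.which returns None if the binary is missing from PATH.
tesseract_path = shutil.which("tesseract")
print("tesseract binary:", tesseract_path or "NOT FOUND")
```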

Issue 4: Build hangs at "Installing requirements"

Cause: PyTorch is large (~2GB)
Fix: Wait 5-10 minutes; this is normal

Space Configuration

Your Space should have these settings:

  • SDK: Gradio
  • SDK Version: 4.44.0
  • Hardware: GPU (recommended: T4 or better)
  • Python Version: 3.10 (default)
  • Visibility: Public or Private (your choice)
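Spaces read these settings from the YAML front matter at the top of README.md. A front matter consistent with the list above might look like this (the title value is a placeholder):

```yaml
---
title: ABTestPredictor
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
---
```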

File Size Limits

  • Space repo: Each file < 50MB (except LFS)
  • Model repo: Files > 10MB should use Git LFS
  • Total Space size: No hard limit, but keep it reasonable

Success Indicators

✅ Build completes without errors
✅ Space status shows "Running"
✅ Can access the Gradio interface
✅ Making predictions returns results
✅ Logs show "Successfully loaded model"

Expected First-Run Behavior

🚀 Using device: cuda
🔥 GPU: Tesla T4
📥 Model not found locally, downloading from Model Hub...
📥 Downloading model from Hugging Face Model Hub: nitish-spz/ABTestPredictor
✅ Model downloaded to: /home/user/.cache/huggingface/...
✅ Successfully loaded GGG model weights
✅ Model and processors loaded successfully.
Running on public URL: https://spiralyzellc-abtestpredictor.hf.space

Testing After Deployment

Test 1: Web Interface

  1. Visit your Space URL
  2. Upload test images
  3. Select categories
  4. Click predict
  5. Should see results in ~3-5 seconds

Test 2: API Client

from gradio_client import Client, handle_file

client = Client("SpiralyzeLLC/ABTestPredictor")
result = client.predict(
    handle_file("control.jpg"),   # gradio_client >= 1.0 requires handle_file
    handle_file("variant.jpg"),   # for local file inputs
    "SaaS", "B2B", "High-Intent Lead Gen",
    "B2B Software & Tech", "Awareness & Discovery",
    api_name="/predict_with_categorical_data"
)
print(result)

Need Help?

  1. Check Space logs for errors
  2. Review DEPLOYMENT_FIX.md for detailed troubleshooting
  3. Verify all required files are in Space repo
  4. Ensure model repo is public and accessible