# Hugging Face Space Setup Guide

## Understanding Your Repositories

You have **TWO** separate repositories:

### 1. Model Repository: `nitish-spz/ABTestPredictor`

- **Purpose**: Store the model files
- **Contains**:
  - `multimodal_gated_model_2.7_GGG.pth` (789 MB)
  - `multimodal_cat_mappings_GGG.json`
- **Access**: Read-only; the Space downloads from here

### 2. Space Repository: `SpiralyzeLLC/ABTestPredictor`

- **Purpose**: Run the Gradio application
- **Contains**: All application code; the model is downloaded from repo #1
- **Access**: This is what deploys and runs

## Required Files in Space Repository

Your Space needs these files (but NOT the large model file):

```
✅ app.py                  # Main application code
✅ requirements.txt        # Python dependencies
✅ packages.txt            # System dependencies (tesseract-ocr)
✅ README.md               # Project documentation
✅ confidence_scores.json  # Confidence data (14 KB)
✅ .gitattributes          # Git LFS config
✅ .dockerignore           # Build optimization

❌ model/ folder           # NOT needed - downloaded from the model repo
❌ patterbs.json           # NOT needed - removed feature
❌ metadata.js             # NOT needed - removed feature
❌ confidence_scores.js    # NOT needed - use .json instead
```

## How the Model Loading Works

Your `app.py` is configured to:

1. Check whether the model exists locally in the `model/` folder
2. If not, download it from the `nitish-spz/ABTestPredictor` model repository
3.
   Cache it for future use

```python
# In app.py lines 707-748
if os.path.exists(MODEL_SAVE_PATH):
    model_path = MODEL_SAVE_PATH
    print("✅ Using local model")
else:
    print("📥 Downloading from Model Hub...")
    model_path = download_model_from_hub()
```

## Deployment Steps

### Step 1: Verify Required Files Exist Locally

```bash
cd /Users/nitish/Spiralyze/HuggingFace/Spaces/ABTestPredictor

# Check essential files - all should exist
ls -lh app.py requirements.txt packages.txt README.md confidence_scores.json
```

### Step 2: Remove Large/Unnecessary Files

```bash
# Remove the local model folder (the Space will download from the model repo)
rm -rf model/

# Remove unused files from the old version
rm -f patterbs.json metadata.js confidence_scores.js frontend.html index_v2.html
```

### Step 3: Verify Git Remote Points to the Space

```bash
git remote -v
# Should show: https://huggingface.co/spaces/SpiralyzeLLC/ABTestPredictor
```

### Step 4: Commit and Push to the Space

```bash
# Add all files
git add .

# Commit
git commit -m "Deploy: Add all application files, download model from hub"

# Push to the Space
git push origin main
```

### Step 5: Monitor the Build

1. Go to https://huggingface.co/spaces/SpiralyzeLLC/ABTestPredictor
2. Click the "Logs" tab
3. Watch the build progress
4. The first build takes 5-10 minutes (it downloads the model)

## If Build Fails

### Check These Files Exist in the Space Repo:

```
# Essential files checklist
app.py                  ✅
requirements.txt        ✅
packages.txt            ✅
README.md               ✅
confidence_scores.json  ✅
.dockerignore           ✅
.gitattributes          ✅
```

### Verify the Model Repo Is Accessible

Your app downloads from `nitish-spz/ABTestPredictor`. Verify:

1. Go to https://huggingface.co/nitish-spz/ABTestPredictor
2. Check that the files are visible
3.
   Make sure it's **public** (not private)

### Check requirements.txt

```bash
cat requirements.txt
```

Should contain:

```
torch
transformers
pandas
scikit-learn
Pillow
gradio
pytesseract
spaces
huggingface_hub
python-dotenv
```

### Check packages.txt

```bash
cat packages.txt
```

Should contain:

```
tesseract-ocr
```

## Common Issues

### Issue 1: "Model file not found"

**Cause**: The model repo is private or inaccessible
**Fix**: Make `nitish-spz/ABTestPredictor` public

### Issue 2: "No module named X"

**Cause**: Missing dependency in requirements.txt
**Fix**: Add the missing package to requirements.txt

### Issue 3: "Tesseract not found"

**Cause**: Missing system dependency
**Fix**: Ensure packages.txt contains `tesseract-ocr`

### Issue 4: Build hangs at "Installing requirements"

**Cause**: PyTorch is large (~2 GB)
**Fix**: Wait 5-10 minutes; this is normal

## Space Configuration

Your Space should have these settings:

- **SDK**: Gradio
- **SDK Version**: 4.44.0
- **Hardware**: GPU (recommended: T4 or better)
- **Python Version**: 3.10 (default)
- **Visibility**: Public or Private (your choice)

## File Size Limits

- **Space repo**: Each file < 50 MB (except LFS)
- **Model repo**: Files > 10 MB should use Git LFS
- **Total Space size**: No hard limit, but keep it reasonable

## Success Indicators

✅ Build completes without errors
✅ Space status shows "Running"
✅ The Gradio interface is reachable
✅ Making predictions returns results
✅ Logs show "Successfully loaded model"

## Expected First-Run Behavior

```
🚀 Using device: cuda
🔥 GPU: Tesla T4
📥 Model not found locally, downloading from Model Hub...
📥 Downloading model from Hugging Face Model Hub: nitish-spz/ABTestPredictor
✅ Model downloaded to: /home/user/.cache/huggingface/...
✅ Successfully loaded GGG model weights
✅ Model and processors loaded successfully.
Running on public URL: https://spiralyzellc-abtestpredictor.hf.space
```

## Testing After Deployment

### Test 1: Web Interface

1. Visit your Space URL
2.
   Upload test images
3. Select categories
4. Click predict
5. Results should appear in ~3-5 seconds

### Test 2: API Client

```python
from gradio_client import Client

client = Client("SpiralyzeLLC/ABTestPredictor")
result = client.predict(
    "control.jpg",
    "variant.jpg",
    "SaaS",
    "B2B",
    "High-Intent Lead Gen",
    "B2B Software & Tech",
    "Awareness & Discovery",
    api_name="/predict_with_categorical_data"
)
print(result)
```

## Need Help?

1. Check the Space logs for errors
2. Review DEPLOYMENT_FIX.md for detailed troubleshooting
3. Verify all required files are in the Space repo
4. Ensure the model repo is public and accessible
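## Appendix: Preflight Script for the File Checklist

The "Required Files in Space Repository" checklist can be automated before pushing. This is a minimal sketch, not part of the project: `check_space_repo` is a hypothetical helper, and both file lists are copied from the checklist in this guide.

```python
# Hypothetical preflight check mirroring the "Required Files in Space
# Repository" checklist; check_space_repo is not a real project helper.
from pathlib import Path

REQUIRED = ["app.py", "requirements.txt", "packages.txt", "README.md",
            "confidence_scores.json", ".gitattributes", ".dockerignore"]
LEFTOVERS = ["patterbs.json", "metadata.js", "confidence_scores.js"]

def check_space_repo(root: str) -> list[str]:
    """Return a list of problems; an empty list means the checkout looks ready."""
    base = Path(root)
    problems = [f"missing: {name}" for name in REQUIRED
                if not (base / name).exists()]
    problems += [f"remove: {name}" for name in LEFTOVERS
                 if (base / name).exists()]
    return problems
```

Run it against the Space checkout before Step 4; any `remove:` entries correspond to the `rm -f` cleanup in Step 2.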
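## Appendix: Load-or-Download Decision in Isolation

The load-or-download decision described under "How the Model Loading Works" can be factored so it is testable without network access. This is a sketch only: `resolve_model_path` is a hypothetical name (the guide's snippet does this inline in `app.py`), and the downloader is injected rather than calling `download_model_from_hub` directly.

```python
# Sketch of the load-or-download decision from "How the Model Loading Works";
# resolve_model_path is a hypothetical name, not code taken from app.py.
import os
from typing import Callable

def resolve_model_path(local_path: str, download: Callable[[], str]) -> str:
    """Prefer a local copy of the weights; otherwise fetch via the downloader."""
    if os.path.exists(local_path):
        print("✅ Using local model")
        return local_path
    print("📥 Downloading from Model Hub...")
    return download()
```

Injecting the downloader keeps the decision logic independent of `huggingface_hub`, so the fallback path can be exercised with a stub.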