
HuggingFace Space Configuration Guide

Overview

This guide explains how to configure the HuggingFace Space at sadickam/Pythermalcomfort-Chat for deployment.

The Space uses the Docker SDK, which means:

  • HuggingFace builds a Docker image from the repository's Dockerfile
  • The container runs on HuggingFace's infrastructure
  • Environment variables and secrets are injected at runtime
  • The application serves both the Next.js frontend and FastAPI backend

README File Structure

The HuggingFace Space requires a README.md file with specific YAML frontmatter for configuration. This repository is structured so that README.md serves as the Space configuration file directly, eliminating the need for file renaming during deployment.

File Structure

| File | Purpose |
|------|---------|
| README.md | HuggingFace Space configuration with YAML frontmatter; user-facing documentation |
| PROJECT_README.md | Developer documentation, installation instructions, contribution guidelines |

YAML Frontmatter Requirements

The README.md starts with this required frontmatter:

---
title: Pythermalcomfort Chat
emoji: 🌖
colorFrom: yellow
colorTo: red
sdk: docker
pinned: false
license: mit
---

This tells HuggingFace how to build and display the Space.
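A quick pre-push sanity check on the frontmatter can be sketched in shell (`check_frontmatter` is a hypothetical helper, not part of this repository):

```shell
# Hypothetical helper: confirm a Space README declares the Docker SDK
# in its YAML frontmatter before pushing to HuggingFace.
check_frontmatter() {
  head -n 12 "$1" 2>/dev/null | grep -q '^sdk: docker' \
    && echo "frontmatter OK" \
    || echo "missing 'sdk: docker'"
}

check_frontmatter README.md
```

Run it from the repository root before pushing; a missing or mistyped `sdk` key is one of the most common causes of a Space failing to build as a Docker Space.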


Required Secrets Configuration

Secret Reference Table

| Secret Name | Required | Description |
|-------------|----------|-------------|
| GEMINI_API_KEY | Required | Google Gemini API key - primary LLM provider for generating responses |
| HF_TOKEN | Required | HuggingFace token for accessing the index dataset and query logging |
| DEEPSEEK_API_KEY | Optional | DeepSeek API key - first fallback provider if Gemini fails |
| ANTHROPIC_API_KEY | Optional | Anthropic Claude API key - second fallback provider |
| GROQ_API_KEY | Optional | Groq API key - third fallback provider for fast inference |

Step-by-Step Configuration

  1. Navigate to the Space Settings

    https://huggingface.co/spaces/sadickam/Pythermalcomfort-Chat/settings
    
  2. Scroll to the "Repository secrets" section

    • This section is located under "Variables and secrets" in the Settings page
  3. Add each secret

    • Click "New secret"
    • Enter the secret name exactly as shown (e.g., GEMINI_API_KEY)
    • Paste the API key value
    • Click "Add"
    • Repeat for each secret
  4. Required secrets setup

    GEMINI_API_KEY (Required):

    # Obtain from: https://makersuite.google.com/app/apikey
    # This is the primary LLM provider - the chatbot will not function without it
    

    HF_TOKEN (Required):

    # Obtain from: https://huggingface.co/settings/tokens
    # Required permissions:
    #   - Read access to sadickam/pytherm_index (prebuilt indexes)
    #   - Write access to sadickam/Pytherm_Qlog (query logging)
    
  5. Optional fallback provider secrets

    DEEPSEEK_API_KEY (Optional):

    # Obtain from: https://platform.deepseek.com/
    # First fallback if Gemini is unavailable
    

    ANTHROPIC_API_KEY (Optional):

    # Obtain from: https://console.anthropic.com/
    # Second fallback provider
    

    GROQ_API_KEY (Optional):

    # Obtain from: https://console.groq.com/
    # Third fallback - offers fast inference times
    
  6. Restart the Space

    • After adding all secrets, click "Restart" in the Space header
    • The Space will rebuild and inject the new secrets

Hardware Settings

Recommended Configuration

| Setting | Value | Reason |
|---------|-------|--------|
| Hardware | CPU Basic (Free) | No GPU required for inference |
| Sleep timeout | Default (48 hours) | Adjust based on usage patterns |

Why CPU is Sufficient

  1. Prebuilt Indexes: The FAISS and BM25 indexes are built offline with GPU acceleration and published to sadickam/pytherm_index. At startup, the Space downloads these prebuilt artifacts.

  2. Lightweight Query Embedding: Queries are embedded locally with the same BGE model used to build the indexes; embedding a single query at a time runs efficiently on CPU.

  3. External LLM Providers: Response generation is handled by external API providers (Gemini, DeepSeek, etc.), not local models.

  4. Cost Optimization: The free CPU tier is sufficient for the expected load and keeps operational costs at zero.

Configuring Hardware

  1. Navigate to: https://huggingface.co/spaces/sadickam/Pythermalcomfort-Chat/settings
  2. Find the "Space hardware" section
  3. Select "CPU basic" from the dropdown
  4. Click "Save" (the Space will restart automatically)

Deployment Verification Checklist

Pre-flight Checks

  • All required secrets are configured (GEMINI_API_KEY, HF_TOKEN)
  • Hardware is set to "CPU Basic"
  • Space is set to "Public" or "Private" as needed

Build Verification

  • Space builds successfully
    • Check the "Logs" tab for build output
    • Build should complete without errors in 5-10 minutes
    • Look for "Application startup complete" in logs

Health Endpoint Checks

  1. Basic Health Check

    curl https://sadickam-pythermalcomfort-chat.hf.space/health
    

    Expected response:

    {"status": "healthy"}
    
    • Returns HTTP 200
  2. Readiness Check (Optional Diagnostic)

    curl https://sadickam-pythermalcomfort-chat.hf.space/health/ready
    

    Expected response:

    {
      "ready": true,
      "status": "ok"
    }
    
    • Returns HTTP 200 after resources have been loaded
    • If it returns HTTP 503 with {"ready": false, "status": "loading"}, send one query first to trigger lazy loading
  3. Provider Availability

    curl https://sadickam-pythermalcomfort-chat.hf.space/api/providers
    

    Expected response:

    {
      "available": ["gemini", ...],
      "primary": "gemini"
    }
    
    • Returns HTTP 200
    • At least one provider in available list
    • primary is set (typically "gemini")
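When scripting these checks, the providers payload can be parsed with plain shell tools. A hedged sketch using the sample response above (sed-based to avoid a jq dependency; fragile against formatting changes, but portable):

```shell
# Extract the primary provider from an /api/providers-style payload.
# Uses the sample JSON from above rather than a live call.
resp='{"available": ["gemini"], "primary": "gemini"}'
primary=$(echo "$resp" | sed 's/.*"primary": *"\([^"]*\)".*/\1/')
echo "primary provider: $primary"   # prints: primary provider: gemini
```

For a live check, pipe the output of the curl command above into the same sed expression.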

Functional Test

  • Test a sample query through the UI
    1. Open https://sadickam-pythermalcomfort-chat.hf.space
    2. Wait for the interface to load
    3. Enter a test question: "What is PMV?"
    4. Verify:
      • Response streams in real-time
      • Response includes relevant information about PMV (Predicted Mean Vote)
      • Source citations are included

Troubleshooting

Space Stuck in "Building"

Symptoms:

  • Build process runs for more than 15 minutes
  • Build log shows no progress or loops

Solutions:

  1. Check Dockerfile syntax

    # Validate locally before pushing
    docker build -t test-build .
    
  2. Review build logs

    • Click "Logs" tab in the Space
    • Look for error messages or failed commands
    • Common issues: missing files, dependency conflicts
  3. Clear build cache

    • Go to Settings > "Factory reboot"
    • This clears cached layers and rebuilds from scratch

Health Check Failing

Symptoms:

  • /health returns 500 or connection refused
  • Space shows as "Running" but endpoints don't respond

Solutions:

  1. Verify secrets are configured

    • Go to Settings > "Repository secrets"
    • Confirm GEMINI_API_KEY and HF_TOKEN are present
    • Note: You cannot see secret values, only that they exist
  2. Check application logs

    # Look for startup errors in the Logs tab
    # Common messages:
    #   - "Missing required environment variable"
    #   - "Failed to initialize provider"
    
  3. Restart the Space

    • Click the three-dot menu > "Restart"
    • Wait 2-3 minutes for full startup

No Providers Available

Symptoms:

  • /api/providers returns {"available": [], "primary": null}
  • Chat interface shows "No providers available" error

Solutions:

  1. Verify API keys are correct

    • Regenerate the API key from the provider's console
    • Update the secret in HuggingFace Space settings
    • Restart the Space
  2. Check provider status

    • Verify the provider's API is operational
    • Check for rate limiting or account issues
  3. Review provider logs

    # Look for these patterns in logs:
    #   - "API key invalid"
    #   - "Rate limit exceeded"
    #   - "Provider initialization failed"
    
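Those patterns can be scanned for mechanically once the logs are saved locally. A small sketch (`check_provider_errors` and the log path are illustrative, not part of this repository):

```shell
# Hypothetical helper: surface provider-failure lines from a saved
# copy of the Space logs (patterns match the list above).
check_provider_errors() {
  grep -E 'API key invalid|Rate limit exceeded|Provider initialization failed' "$1"
}
```

Copy the output of the Logs tab into a file (e.g. `space.log`), then run `check_provider_errors space.log` to isolate the relevant lines.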

Index Loading Failures

Symptoms:

  • /health/ready keeps returning HTTP 503 with {"ready": false, "status": "loading"} even after sending a query
  • Logs show "Failed to download artifacts"

Solutions:

  1. Verify HF_TOKEN permissions

    • Confirm the token has read access to sadickam/pytherm_index and write access to sadickam/Pytherm_Qlog
  2. Check dataset availability

    • Confirm the sadickam/pytherm_index dataset still exists and has not been made private
  3. Manual verification

    # Test token access locally
    curl -H "Authorization: Bearer $HF_TOKEN" \
      https://huggingface.co/api/datasets/sadickam/pytherm_index
    
  4. Check disk space

    • The index files require ~500MB of storage
    • HuggingFace Spaces have limited ephemeral storage
    • Consider reducing index size if this is an issue
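Free ephemeral storage can be checked from a shell inside the running container. A sketch based on the ~500 MB figure above (the threshold is an assumption, adjust to the actual index size):

```shell
# Report free space in the working directory and compare against the
# ~500 MB the index files need (threshold is an assumption).
free_mb=$(df -Pm . | awk 'NR==2 {print $4}')
if [ "$free_mb" -ge 500 ]; then
  echo "disk OK: ${free_mb} MB free"
else
  echo "low disk: only ${free_mb} MB free"
fi
```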

Slow Response Times

Symptoms:

  • Queries take more than 30 seconds
  • Responses time out frequently

Solutions:

  1. Check provider latency

    • The primary provider (Gemini) may be experiencing high load
    • Fallback providers will be tried automatically
  2. Verify hybrid retrieval settings

    # In environment or settings:
    USE_HYBRID=true   # Enable both FAISS and BM25
    TOP_K=6           # Reduce if responses are slow
    
  3. Monitor Space resources

    • Check the "Metrics" tab for CPU/memory usage
    • Consider upgrading hardware if consistently maxed out

Environment Variables Reference

Secrets (Configure in Space Settings)

| Variable | Required | Description |
|----------|----------|-------------|
| GEMINI_API_KEY | Yes | Google Gemini API key |
| HF_TOKEN | Yes | HuggingFace access token |
| DEEPSEEK_API_KEY | No | DeepSeek API key |
| ANTHROPIC_API_KEY | No | Anthropic API key |
| GROQ_API_KEY | No | Groq API key |

Configuration (Can be set in Dockerfile)

| Variable | Default | Description |
|----------|---------|-------------|
| USE_HYBRID | true | Enable hybrid retrieval (FAISS + BM25) |
| TOP_K | 6 | Number of chunks to retrieve |
| PROVIDER_TIMEOUT_MS | 30000 | Timeout before trying a fallback provider |
| LOG_LEVEL | INFO | Application log level |
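These defaults can be overridden without code changes, either as ENV lines in the Dockerfile or exported in the container shell. The values below are illustrative, not recommendations:

```shell
# Illustrative overrides for the configuration variables above.
export USE_HYBRID=true            # hybrid FAISS + BM25 retrieval
export TOP_K=4                    # retrieve fewer chunks per query
export PROVIDER_TIMEOUT_MS=20000  # fail over to a fallback provider sooner
export LOG_LEVEL=DEBUG            # verbose logs while troubleshooting
```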

Additional Resources