# HuggingFace Space Configuration Guide

## Overview
This guide explains how to configure the HuggingFace Space at `sadickam/Pythermalcomfort-Chat` for deployment.

The Space uses the Docker SDK, which means:

- HuggingFace builds a Docker image from the repository's `Dockerfile`
- The container runs on HuggingFace's infrastructure
- Environment variables and secrets are injected at runtime
- The application serves both the Next.js frontend and FastAPI backend
## README File Structure

The HuggingFace Space requires a README.md file with specific YAML frontmatter for configuration. This repository is structured so that README.md serves as the Space configuration file directly, eliminating the need for file renaming during deployment.

### File Structure

| File | Purpose |
|---|---|
| `README.md` | HuggingFace Space configuration with YAML frontmatter; user-facing documentation |
| `PROJECT_README.md` | Developer documentation, installation instructions, contribution guidelines |
### YAML Frontmatter Requirements

The README.md starts with this required frontmatter:

```yaml
---
title: Pythermalcomfort Chat
emoji: 🌖
colorFrom: yellow
colorTo: red
sdk: docker
pinned: false
license: mit
---
```

This tells HuggingFace how to build and display the Space.
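A quick way to catch frontmatter mistakes before pushing is to check that the expected keys are present. This is a minimal sketch only; the `REQUIRED_KEYS` set mirrors the frontmatter shown above, and the parsing approach is an illustrative assumption, not part of the Space tooling:

```python
# Sketch: verify README.md frontmatter contains the keys this guide's example uses.
# REQUIRED_KEYS and the line-based parsing are assumptions for illustration.
import re

REQUIRED_KEYS = {"title", "emoji", "colorFrom", "colorTo", "sdk", "pinned", "license"}

def check_frontmatter(readme_text: str) -> set:
    """Return the set of required keys missing from the YAML frontmatter block."""
    match = re.match(r"^---\n(.*?)\n---", readme_text, re.DOTALL)
    if not match:
        return set(REQUIRED_KEYS)  # no frontmatter block at all
    present = {
        line.split(":", 1)[0].strip()
        for line in match.group(1).splitlines()
        if ":" in line
    }
    return REQUIRED_KEYS - present
```

`check_frontmatter(open("README.md").read())` returning an empty set means every expected key is present; it does not validate the values themselves.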
## Required Secrets Configuration

### Secret Reference Table

| Secret Name | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Required | Google Gemini API key - Primary LLM provider for generating responses |
| `HF_TOKEN` | Required | HuggingFace token for accessing the index dataset and query logging |
| `DEEPSEEK_API_KEY` | Optional | DeepSeek API key - First fallback provider if Gemini fails |
| `ANTHROPIC_API_KEY` | Optional | Anthropic Claude API key - Second fallback provider |
| `GROQ_API_KEY` | Optional | Groq API key - Third fallback provider for fast inference |
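The table above implies a fixed fallback order. The chain of usable providers can be pictured as a filter over whichever keys are actually set; this sketch is illustrative and not the application's real code:

```python
# Sketch: derive the usable provider chain from configured secrets.
# The env-var-to-provider mapping mirrors the table above; the function is illustrative.
import os

PROVIDER_ORDER = [
    ("gemini", "GEMINI_API_KEY"),        # primary
    ("deepseek", "DEEPSEEK_API_KEY"),    # first fallback
    ("anthropic", "ANTHROPIC_API_KEY"),  # second fallback
    ("groq", "GROQ_API_KEY"),            # third fallback
]

def available_providers(env=os.environ) -> list:
    """Providers whose API key secret is present, in fallback order."""
    return [name for name, key in PROVIDER_ORDER if env.get(key)]
```

With only `GEMINI_API_KEY` and `GROQ_API_KEY` set, the chain would be `["gemini", "groq"]`: DeepSeek and Anthropic are simply skipped.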
### Step-by-Step Configuration

1. **Navigate to the Space settings**
   - https://huggingface.co/spaces/sadickam/Pythermalcomfort-Chat/settings

2. **Scroll to the "Repository secrets" section**
   - This section is located under "Variables and secrets" on the Settings page

3. **Add each secret**
   - Click "New secret"
   - Enter the secret name exactly as shown (e.g., `GEMINI_API_KEY`)
   - Paste the API key value
   - Click "Add"
   - Repeat for each secret

4. **Required secrets setup**

   `GEMINI_API_KEY` (Required):

   ```bash
   # Obtain from: https://makersuite.google.com/app/apikey
   # This is the primary LLM provider - the chatbot will not function without it
   ```

   `HF_TOKEN` (Required):

   ```bash
   # Obtain from: https://huggingface.co/settings/tokens
   # Required permissions:
   #   - Read access to sadickam/pytherm_index (prebuilt indexes)
   #   - Write access to sadickam/Pytherm_Qlog (query logging)
   ```

5. **Optional fallback provider secrets**

   `DEEPSEEK_API_KEY` (Optional):

   ```bash
   # Obtain from: https://platform.deepseek.com/
   # First fallback if Gemini is unavailable
   ```

   `ANTHROPIC_API_KEY` (Optional):

   ```bash
   # Obtain from: https://console.anthropic.com/
   # Second fallback provider
   ```

   `GROQ_API_KEY` (Optional):

   ```bash
   # Obtain from: https://console.groq.com/
   # Third fallback - offers fast inference times
   ```

6. **Restart the Space**
   - After adding all secrets, click "Restart" in the Space header
   - The Space will rebuild and inject the new secrets
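After the restart, it can help to confirm from inside the container (e.g. in a startup hook, surfaced via the Logs tab) that the required secrets actually arrived. A hedged sketch; the variable names match the tables in this guide, but the helper itself is hypothetical:

```python
# Sketch: report missing required/optional secrets at startup.
# REQUIRED/OPTIONAL follow this guide's secret tables; the helper is illustrative.
import os

REQUIRED = ["GEMINI_API_KEY", "HF_TOKEN"]
OPTIONAL = ["DEEPSEEK_API_KEY", "ANTHROPIC_API_KEY", "GROQ_API_KEY"]

def missing_secrets(env=os.environ) -> dict:
    """Return which required and optional secrets are absent or empty."""
    return {
        "required": [k for k in REQUIRED if not env.get(k)],
        "optional": [k for k in OPTIONAL if not env.get(k)],
    }
```

A non-empty `required` list is the situation behind the "Missing required environment variable" startup errors discussed under Troubleshooting.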
## Hardware Settings

### Recommended Configuration
| Setting | Value | Reason |
|---|---|---|
| Hardware | CPU Basic (Free) | No GPU required for inference |
| Sleep timeout | Default (48 hours) | Adjust based on usage patterns |
### Why CPU is Sufficient

1. **Prebuilt Indexes**: The FAISS and BM25 indexes are built offline with GPU acceleration and published to `sadickam/pytherm_index`. At startup, the Space downloads these prebuilt artifacts.
2. **No Local Embedding**: Query embedding uses the same BGE model but runs efficiently on CPU for single queries.
3. **External LLM Providers**: Response generation is handled by external API providers (Gemini, DeepSeek, etc.), not local models.
4. **Cost Optimization**: The free CPU tier is sufficient for the expected load and keeps operational costs at zero.
### Configuring Hardware

1. Navigate to: https://huggingface.co/spaces/sadickam/Pythermalcomfort-Chat/settings
2. Find the "Space hardware" section
3. Select "CPU basic" from the dropdown
4. Click "Save" (the Space will restart automatically)
## Deployment Verification Checklist

### Pre-flight Checks

- [ ] All required secrets are configured (`GEMINI_API_KEY`, `HF_TOKEN`)
- [ ] Hardware is set to "CPU Basic"
- [ ] Space is set to "Public" or "Private" as needed
### Build Verification

- [ ] Space builds successfully
  - Check the "Logs" tab for build output
  - Build should complete without errors in 5-10 minutes
- [ ] Look for "Application startup complete" in logs
### Health Endpoint Checks

#### Basic Health Check

```bash
curl https://sadickam-pythermalcomfort-chat.hf.space/health
```

Expected response:

```json
{"status": "healthy"}
```

- Returns HTTP 200
#### Readiness Check (Optional Diagnostic)

```bash
curl https://sadickam-pythermalcomfort-chat.hf.space/health/ready
```

Expected response:

```json
{"ready": true, "status": "ok"}
```

- Returns HTTP 200 after resources have been loaded
- If it returns HTTP 503 with `{"ready": false, "status": "loading"}`, send one query first to trigger lazy loading
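Because of the lazy loading, a single 503 is not a failure. A small polling sketch; the `fetch` callable is injected so the retry logic stays testable, and the function name is illustrative:

```python
# Sketch: poll /health/ready until the Space reports ready or attempts run out.
# `fetch` must return (status_code, json_body); injecting it keeps this testable offline.
import time

READY_URL = "https://sadickam-pythermalcomfort-chat.hf.space/health/ready"

def wait_until_ready(fetch, attempts=10, delay_s=3.0) -> bool:
    """Return True once the endpoint answers 200 with {"ready": true}."""
    for i in range(attempts):
        status, body = fetch(READY_URL)
        if status == 200 and body.get("ready"):
            return True
        if i < attempts - 1:
            time.sleep(delay_s)  # give lazy loading time to finish
    return False
```

In practice `fetch` could wrap `urllib.request.urlopen` plus `json.load`; the poller stops as soon as the body flips from `{"ready": false, "status": "loading"}` to ready.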
#### Provider Availability

```bash
curl https://sadickam-pythermalcomfort-chat.hf.space/api/providers
```

Expected response:

```json
{"available": ["gemini", ...], "primary": "gemini"}
```

- Returns HTTP 200
- At least one provider in the `available` list
- `primary` is set (typically "gemini")
### Functional Test

Test a sample query through the UI:

1. Open https://sadickam-pythermalcomfort-chat.hf.space
2. Wait for the interface to load
3. Enter a test question: "What is PMV?"
4. Verify:
   - Response streams in real-time
   - Response includes relevant information about PMV (Predicted Mean Vote)
   - Source citations are included
## Troubleshooting

### Space Stuck in "Building"

**Symptoms:**

- Build process runs for more than 15 minutes
- Build log shows no progress or loops

**Solutions:**

1. **Check Dockerfile syntax**

   ```bash
   # Validate locally before pushing
   docker build -t test-build .
   ```

2. **Review build logs**
   - Click "Logs" tab in the Space
   - Look for error messages or failed commands
   - Common issues: missing files, dependency conflicts

3. **Clear build cache**
   - Go to Settings > "Factory reboot"
   - This clears cached layers and rebuilds from scratch
### Health Check Failing

**Symptoms:**

- `/health` returns 500 or connection refused
- Space shows as "Running" but endpoints don't respond

**Solutions:**

1. **Verify secrets are configured**
   - Go to Settings > "Repository secrets"
   - Confirm `GEMINI_API_KEY` and `HF_TOKEN` are present
   - Note: You cannot see secret values, only that they exist

2. **Check application logs**

   ```bash
   # Look for startup errors in the Logs tab
   # Common messages:
   #   - "Missing required environment variable"
   #   - "Failed to initialize provider"
   ```

3. **Restart the Space**
   - Click the three-dot menu > "Restart"
   - Wait 2-3 minutes for full startup
### No Providers Available

**Symptoms:**

- `/api/providers` returns `{"available": [], "primary": null}`
- Chat interface shows "No providers available" error

**Solutions:**

1. **Verify API keys are correct**
   - Regenerate the API key from the provider's console
   - Update the secret in HuggingFace Space settings
   - Restart the Space

2. **Check provider status**
   - Verify the provider's API is operational
   - Check for rate limiting or account issues

3. **Review provider logs**

   ```bash
   # Look for these patterns in logs:
   #   - "API key invalid"
   #   - "Rate limit exceeded"
   #   - "Provider initialization failed"
   ```
### Index Loading Failures

**Symptoms:**

- `/health/ready` keeps returning HTTP 503 with `{"ready": false, "status": "loading"}` even after sending a query
- Logs show "Failed to download artifacts"

**Solutions:**

1. **Verify HF_TOKEN permissions**
   - Go to https://huggingface.co/settings/tokens
   - Ensure the token has "Read" access to `sadickam/pytherm_index`
   - If using a fine-grained token, add explicit repo access

2. **Check dataset availability**
   - Visit https://huggingface.co/datasets/sadickam/pytherm_index
   - Verify the dataset exists and is accessible
   - Check if the dataset is private and the token has access

3. **Manual verification**

   ```bash
   # Test token access locally
   curl -H "Authorization: Bearer $HF_TOKEN" \
     https://huggingface.co/api/datasets/sadickam/pytherm_index
   ```

4. **Check disk space**
   - The index files require ~500MB of storage
   - HuggingFace Spaces have limited ephemeral storage
   - Consider reducing index size if this is an issue
### Slow Response Times

**Symptoms:**

- Queries take more than 30 seconds
- Responses time out frequently

**Solutions:**

1. **Check provider latency**
   - The primary provider (Gemini) may be experiencing high load
   - Fallback providers will be tried automatically

2. **Verify hybrid retrieval settings**

   ```bash
   # In environment or settings:
   USE_HYBRID=true   # Enable both FAISS and BM25
   TOP_K=6           # Reduce if responses are slow
   ```

3. **Monitor Space resources**
   - Check the "Metrics" tab for CPU/memory usage
   - Consider upgrading hardware if consistently maxed out
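The automatic fallback behaviour can be pictured as: try each available provider in order, moving on when a call fails or exceeds the time budget. A simplified sketch; real streaming responses, cancellation, and error types are more involved, and the names here are illustrative:

```python
# Sketch: try providers in order, falling back when a call errors or overruns the budget.
# `providers` maps name -> callable(query) -> answer. Checking elapsed time after the
# call returns (rather than cancelling it mid-flight) is a deliberate simplification.
import time

def ask_with_fallback(query, providers, order, timeout_ms=30000):
    """Return (provider_name, answer) from the first provider that succeeds in time."""
    for name in order:
        call = providers.get(name)
        if call is None:
            continue  # provider not configured; skip
        start = time.monotonic()
        try:
            answer = call(query)
        except Exception:
            continue  # provider errored; try the next one
        if (time.monotonic() - start) * 1000 <= timeout_ms:
            return name, answer
        # too slow: treat as a timeout and fall back
    raise RuntimeError("No providers available")
```

With Gemini down, a query such as "What is PMV?" would fall through to the first configured fallback, which matches the behaviour described in the Symptoms above.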
## Environment Variables Reference

### Secrets (Configure in Space Settings)

| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Yes | Google Gemini API key |
| `HF_TOKEN` | Yes | HuggingFace access token |
| `DEEPSEEK_API_KEY` | No | DeepSeek API key |
| `ANTHROPIC_API_KEY` | No | Anthropic API key |
| `GROQ_API_KEY` | No | Groq API key |
### Configuration (Can be set in Dockerfile)

| Variable | Default | Description |
|---|---|---|
| `USE_HYBRID` | `true` | Enable hybrid retrieval (FAISS + BM25) |
| `TOP_K` | `6` | Number of chunks to retrieve |
| `PROVIDER_TIMEOUT_MS` | `30000` | Timeout before trying fallback provider |
| `LOG_LEVEL` | `INFO` | Application log level |
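Since these arrive as strings in the environment, they need type coercion on read. A minimal loader sketch matching the defaults in the table; the function name and dataclass are illustrative, not the application's actual code:

```python
# Sketch: load the table's configuration variables with their documented defaults.
# The AppConfig dataclass and load_config name are illustrative.
import os
from dataclasses import dataclass

@dataclass
class AppConfig:
    use_hybrid: bool
    top_k: int
    provider_timeout_ms: int
    log_level: str

def load_config(env=os.environ) -> AppConfig:
    """Coerce string env vars into typed settings, applying the documented defaults."""
    return AppConfig(
        use_hybrid=env.get("USE_HYBRID", "true").lower() == "true",
        top_k=int(env.get("TOP_K", "6")),
        provider_timeout_ms=int(env.get("PROVIDER_TIMEOUT_MS", "30000")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Calling `load_config({})` yields the table's defaults; setting e.g. `ENV TOP_K=4` in the Dockerfile overrides just that field.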