
HuggingFace Space Configuration Guide

Overview

This guide explains how to configure the HuggingFace Space at sadickam/Pythermalcomfort-Chat for deployment.

The Space uses the Docker SDK, which means:

  • HuggingFace builds a Docker image from the repository's Dockerfile
  • The container runs on HuggingFace's infrastructure
  • Environment variables and secrets are injected at runtime
  • The application serves both the Next.js frontend and FastAPI backend

README File Structure

The HuggingFace Space requires a README.md file with specific YAML frontmatter for configuration. This repository is structured so that README.md serves as the Space configuration file directly, eliminating the need for file renaming during deployment.

File Structure

| File | Purpose |
|------|---------|
| README.md | HuggingFace Space configuration with YAML frontmatter; user-facing documentation |
| PROJECT_README.md | Developer documentation, installation instructions, contribution guidelines |

YAML Frontmatter Requirements

The README.md starts with this required frontmatter:

---
title: Pythermalcomfort Chat
emoji: 🌖
colorFrom: yellow
colorTo: red
sdk: docker
pinned: false
license: mit
---

This tells HuggingFace how to build and display the Space.
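A quick pre-push sanity check on the frontmatter can be sketched in shell (`check_frontmatter` is a hypothetical helper, not part of this repository):

```shell
# Hypothetical helper: confirm a Space README declares the Docker SDK
# in its YAML frontmatter before pushing to HuggingFace.
check_frontmatter() {
  head -n 12 "$1" 2>/dev/null | grep -q '^sdk: docker' \
    && echo "frontmatter OK" \
    || echo "missing 'sdk: docker'"
}

check_frontmatter README.md
```

Run it from the repository root before pushing; a missing or mistyped `sdk` key is one of the most common causes of a Space failing to build as a Docker Space.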


Required Secrets Configuration

Secret Reference Table

| Secret Name | Required | Description |
|-------------|----------|-------------|
| GEMINI_API_KEY | Required | Google Gemini API key - primary LLM provider for generating responses |
| HF_TOKEN | Required | HuggingFace token for accessing the index dataset and query logging |
| DEEPSEEK_API_KEY | Optional | DeepSeek API key - first fallback provider if Gemini fails |
| ANTHROPIC_API_KEY | Optional | Anthropic Claude API key - second fallback provider |
| GROQ_API_KEY | Optional | Groq API key - third fallback provider for fast inference |

Step-by-Step Configuration

  1. Navigate to the Space Settings

    https://huggingface.co/spaces/sadickam/Pythermalcomfort-Chat/settings
    
  2. Scroll to the "Repository secrets" section

    • This section is located under "Variables and secrets" in the Settings page
  3. Add each secret

    • Click "New secret"
    • Enter the secret name exactly as shown (e.g., GEMINI_API_KEY)
    • Paste the API key value
    • Click "Add"
    • Repeat for each secret
  4. Required secrets setup

    GEMINI_API_KEY (Required):

    # Obtain from: https://makersuite.google.com/app/apikey
    # This is the primary LLM provider - the chatbot will not function without it
    

    HF_TOKEN (Required):

    # Obtain from: https://huggingface.co/settings/tokens
    # Required permissions:
    #   - Read access to sadickam/pytherm_index (prebuilt indexes)
    #   - Write access to sadickam/Pytherm_Qlog (query logging)
    
  5. Optional fallback provider secrets

    DEEPSEEK_API_KEY (Optional):

    # Obtain from: https://platform.deepseek.com/
    # First fallback if Gemini is unavailable
    

    ANTHROPIC_API_KEY (Optional):

    # Obtain from: https://console.anthropic.com/
    # Second fallback provider
    

    GROQ_API_KEY (Optional):

    # Obtain from: https://console.groq.com/
    # Third fallback - offers fast inference times
    
  6. Restart the Space

    • After adding all secrets, click "Restart" in the Space header
    • The Space will rebuild and inject the new secrets

Hardware Settings

Recommended Configuration

| Setting | Value | Reason |
|---------|-------|--------|
| Hardware | CPU Basic (Free) | No GPU required for inference |
| Sleep timeout | Default (48 hours) | Adjust based on usage patterns |

Why CPU is Sufficient

  1. Prebuilt Indexes: The FAISS and BM25 indexes are built offline with GPU acceleration and published to sadickam/pytherm_index. At startup, the Space downloads these prebuilt artifacts.

  2. Lightweight Query Embedding: Queries are embedded locally with the same BGE model used to build the indexes; embedding a single query at a time runs efficiently on CPU.

  3. External LLM Providers: Response generation is handled by external API providers (Gemini, DeepSeek, etc.), not local models.

  4. Cost Optimization: The free CPU tier is sufficient for the expected load and keeps operational costs at zero.

Configuring Hardware

  1. Navigate to: https://huggingface.co/spaces/sadickam/Pythermalcomfort-Chat/settings
  2. Find the "Space hardware" section
  3. Select "CPU basic" from the dropdown
  4. Click "Save" (the Space will restart automatically)

Deployment Verification Checklist

Pre-flight Checks

  • All required secrets are configured (GEMINI_API_KEY, HF_TOKEN)
  • Hardware is set to "CPU Basic"
  • Space is set to "Public" or "Private" as needed

Build Verification

  • Space builds successfully
    • Check the "Logs" tab for build output
    • Build should complete without errors in 5-10 minutes
    • Look for "Application startup complete" in logs

Health Endpoint Checks

  1. Basic Health Check

    curl https://sadickam-pythermalcomfort-chat.hf.space/health
    

    Expected response:

    {"status": "healthy"}
    
    • Returns HTTP 200
  2. Readiness Check (Optional Diagnostic)

    curl https://sadickam-pythermalcomfort-chat.hf.space/health/ready
    

    Expected response:

    {
      "ready": true,
      "status": "ok"
    }
    
    • Returns HTTP 200 after resources have been loaded
    • If it returns HTTP 503 with {"ready": false, "status": "loading"}, send one query first to trigger lazy loading
  3. Provider Availability

    curl https://sadickam-pythermalcomfort-chat.hf.space/api/providers
    

    Expected response:

    {
      "available": ["gemini", ...],
      "primary": "gemini"
    }
    
    • Returns HTTP 200
    • At least one provider in available list
    • primary is set (typically "gemini")
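When scripting these checks, the providers payload can be parsed with plain shell tools. A hedged sketch using the sample response above (sed-based to avoid a jq dependency; fragile against formatting changes, but portable):

```shell
# Extract the primary provider from an /api/providers-style payload.
# Uses the sample JSON from above rather than a live call.
resp='{"available": ["gemini"], "primary": "gemini"}'
primary=$(echo "$resp" | sed 's/.*"primary": *"\([^"]*\)".*/\1/')
echo "primary provider: $primary"   # prints: primary provider: gemini
```

For a live check, pipe the output of the curl command above into the same sed expression.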

Functional Test

  • Test a sample query through the UI
    1. Open https://sadickam-pythermalcomfort-chat.hf.space
    2. Wait for the interface to load
    3. Enter a test question: "What is PMV?"
    4. Verify:
      • Response streams in real-time
      • Response includes relevant information about PMV (Predicted Mean Vote)
      • Source citations are included

Troubleshooting

Space Stuck in "Building"

Symptoms:

  • Build process runs for more than 15 minutes
  • Build log shows no progress or loops

Solutions:

  1. Check Dockerfile syntax

    # Validate locally before pushing
    docker build -t test-build .
    
  2. Review build logs

    • Click "Logs" tab in the Space
    • Look for error messages or failed commands
    • Common issues: missing files, dependency conflicts
  3. Clear build cache

    • Go to Settings > "Factory reboot"
    • This clears cached layers and rebuilds from scratch

Health Check Failing

Symptoms:

  • /health returns 500 or connection refused
  • Space shows as "Running" but endpoints don't respond

Solutions:

  1. Verify secrets are configured

    • Go to Settings > "Repository secrets"
    • Confirm GEMINI_API_KEY and HF_TOKEN are present
    • Note: You cannot see secret values, only that they exist
  2. Check application logs

    # Look for startup errors in the Logs tab
    # Common messages:
    #   - "Missing required environment variable"
    #   - "Failed to initialize provider"
    
  3. Restart the Space

    • Click the three-dot menu > "Restart"
    • Wait 2-3 minutes for full startup

No Providers Available

Symptoms:

  • /api/providers returns {"available": [], "primary": null}
  • Chat interface shows "No providers available" error

Solutions:

  1. Verify API keys are correct

    • Regenerate the API key from the provider's console
    • Update the secret in HuggingFace Space settings
    • Restart the Space
  2. Check provider status

    • Verify the provider's API is operational
    • Check for rate limiting or account issues
  3. Review provider logs

    # Look for these patterns in logs:
    #   - "API key invalid"
    #   - "Rate limit exceeded"
    #   - "Provider initialization failed"
    
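Those patterns can be scanned for mechanically once the logs are saved locally. A small sketch (`check_provider_errors` and the log path are illustrative, not part of this repository):

```shell
# Hypothetical helper: surface provider-failure lines from a saved
# copy of the Space logs (patterns match the list above).
check_provider_errors() {
  grep -E 'API key invalid|Rate limit exceeded|Provider initialization failed' "$1"
}
```

Copy the output of the Logs tab into a file (e.g. `space.log`), then run `check_provider_errors space.log` to isolate the relevant lines.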

Index Loading Failures

Symptoms:

  • /health/ready keeps returning HTTP 503 with {"ready": false, "status": "loading"} even after sending a query
  • Logs show "Failed to download artifacts"

Solutions:

  1. Verify HF_TOKEN permissions

    • Confirm the token has read access to sadickam/pytherm_index and write access to sadickam/Pytherm_Qlog
  2. Check dataset availability

    • Confirm the sadickam/pytherm_index dataset still exists and has not been made private
  3. Manual verification

    # Test token access locally
    curl -H "Authorization: Bearer $HF_TOKEN" \
      https://huggingface.co/api/datasets/sadickam/pytherm_index
    
  4. Check disk space

    • The index files require ~500MB of storage
    • HuggingFace Spaces have limited ephemeral storage
    • Consider reducing index size if this is an issue
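Free ephemeral storage can be checked from a shell inside the running container. A sketch based on the ~500 MB figure above (the threshold is an assumption, adjust to the actual index size):

```shell
# Report free space in the working directory and compare against the
# ~500 MB the index files need (threshold is an assumption).
free_mb=$(df -Pm . | awk 'NR==2 {print $4}')
if [ "$free_mb" -ge 500 ]; then
  echo "disk OK: ${free_mb} MB free"
else
  echo "low disk: only ${free_mb} MB free"
fi
```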

Slow Response Times

Symptoms:

  • Queries take more than 30 seconds
  • Responses time out frequently

Solutions:

  1. Check provider latency

    • The primary provider (Gemini) may be experiencing high load
    • Fallback providers will be tried automatically
  2. Verify hybrid retrieval settings

    # In environment or settings:
    USE_HYBRID=true   # Enable both FAISS and BM25
    TOP_K=6           # Reduce if responses are slow
    
  3. Monitor Space resources

    • Check the "Metrics" tab for CPU/memory usage
    • Consider upgrading hardware if consistently maxed out

Environment Variables Reference

Secrets (Configure in Space Settings)

| Variable | Required | Description |
|----------|----------|-------------|
| GEMINI_API_KEY | Yes | Google Gemini API key |
| HF_TOKEN | Yes | HuggingFace access token |
| DEEPSEEK_API_KEY | No | DeepSeek API key |
| ANTHROPIC_API_KEY | No | Anthropic API key |
| GROQ_API_KEY | No | Groq API key |

Configuration (Can be set in Dockerfile)

| Variable | Default | Description |
|----------|---------|-------------|
| USE_HYBRID | true | Enable hybrid retrieval (FAISS + BM25) |
| TOP_K | 6 | Number of chunks to retrieve |
| PROVIDER_TIMEOUT_MS | 30000 | Timeout before trying a fallback provider |
| LOG_LEVEL | INFO | Application log level |
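These defaults can be overridden without code changes, either as ENV lines in the Dockerfile or exported in the container shell. The values below are illustrative, not recommendations:

```shell
# Illustrative overrides for the configuration variables above.
export USE_HYBRID=true            # hybrid FAISS + BM25 retrieval
export TOP_K=4                    # retrieve fewer chunks per query
export PROVIDER_TIMEOUT_MS=20000  # fail over to a fallback provider sooner
export LOG_LEVEL=DEBUG            # verbose logs while troubleshooting
```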

Additional Resources