title: Hugstream Upload
emoji: 🚀
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
app_port: 7860

Hugstream Upload Server

External upload server for the Hugstream file storage system. It handles all file uploads to a Hugging Face Dataset, offloading bandwidth and processing from the main VPS.

Features

  • Proxy Upload to Hugging Face: Receives files from the main app and uploads them to the HF Dataset
  • Hash Verification: Validates the MD5 hash of each file to ensure data integrity
  • Authentication: Bearer token authentication to prevent unauthorized access
  • Deduplication: Automatic file deduplication using the MD5 hash
  • CORS Support: Configurable CORS for cross-origin requests
  • Health Check: /health endpoint for monitoring
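Hash verification and deduplication both rest on an MD5 digest of the file contents. The following is a minimal TypeScript sketch of that check, assuming Node's built-in crypto module and a hex-encoded hash; the helper names md5Hex and verifyMd5 are illustrative and are not taken from the Hugstream source.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: hex-encoded MD5 digest of a file's bytes.
function md5Hex(data: Uint8Array): string {
  return createHash("md5").update(data).digest("hex");
}

// Hypothetical helper: true when the client-supplied hash matches the bytes received.
// Assumes the client sends a hex string; surrounding whitespace and letter case are ignored.
function verifyMd5(data: Uint8Array, claimedHash: string): boolean {
  return md5Hex(data) === claimedHash.trim().toLowerCase();
}

const payload = new TextEncoder().encode("hello");
console.log(verifyMd5(payload, md5Hex(payload)));  // matching hash
console.log(verifyMd5(payload, "0".repeat(32)));   // non-matching hash
```

MD5 is adequate here for detecting corruption and identifying duplicates, but it is not collision-resistant, so it should not be relied on as a defence against deliberately crafted files.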

Deployment

Hugging Face Spaces (Recommended)

  1. Create a new Docker Space on Hugging Face
  2. Push this repository to the Space
  3. Configure Secrets in Space settings:
    • DATABASE_URL: Your Turso database URL (same as main app)
    • DATABASE_AUTH_TOKEN: Your Turso auth token
    • HF_TOKEN: Your Hugging Face API token
    • HF_DATASET_REPO: Your dataset repository (e.g., username/dataset-name)
    • UPLOAD_SERVER_TOKEN: Generate a secure random token
    • ALLOWED_ORIGIN: Your hugstream domain (optional)
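One way to produce a value for UPLOAD_SERVER_TOKEN is Node's crypto module. This is only a sketch: the helper name generateToken, the 32-byte length, and the hex encoding are assumptions, not requirements of the project.

```typescript
import { randomBytes } from "node:crypto";

// Hypothetical helper: returns a random token as a hex string (two hex characters per byte).
function generateToken(byteLength: number = 32): string {
  return randomBytes(byteLength).toString("hex");
}

// Print a fresh token to paste into the Space secrets.
console.log(generateToken());
```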

Local Development

# Install dependencies
npm install

# Create .env file from example
cp .env.example .env

# Edit .env with your credentials
nano .env

# Run development server
npm run dev

# Build for production
npm run build

# Run production server
npm start

API Endpoints

POST /upload

Upload a file to Hugging Face Dataset.

Authentication: Bearer token (UPLOAD_SERVER_TOKEN) required in the Authorization header

Request (multipart/form-data):

  • file: File to upload
  • sessionToken: User session token from the auth-session cookie (validated against the database)
  • hash: MD5 hash of the file
  • filename: Original filename

Response:

{
  "success": true,
  "hfPath": "abc123...",
  "message": "File uploaded successfully",
  "size": 12345,
  "hash": "abc123..."
}
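A client call to this endpoint might be assembled as in the sketch below. The form field names come from the request description above; the helper name buildUploadRequest, the parameter object, and the hex encoding of the hash are assumptions. It relies on the FormData, Blob, and URL globals available in browsers and in Node 18 or later.

```typescript
interface UploadParams {
  serverUrl: string;     // e.g. the value of UPLOAD_SERVER_URL
  bearerToken: string;   // token sent in the Authorization header
  sessionToken: string;  // value of the auth-session cookie
  file: Blob;
  filename: string;
  hash: string;          // MD5 hash of the file (hex encoding assumed)
}

interface UploadRequestInit {
  method: "POST";
  headers: { Authorization: string };
  body: FormData;
}

// Hypothetical helper: builds the URL and fetch options for POST /upload.
function buildUploadRequest(p: UploadParams): { url: string; init: UploadRequestInit } {
  const form = new FormData();
  form.append("file", p.file, p.filename);
  form.append("sessionToken", p.sessionToken);
  form.append("hash", p.hash);
  form.append("filename", p.filename);

  return {
    url: new URL("/upload", p.serverUrl).toString(),
    init: {
      method: "POST",
      // No Content-Type header is set: fetch adds the multipart boundary itself.
      headers: { Authorization: `Bearer ${p.bearerToken}` },
      body: form,
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildUploadRequest(params);
// const res = await fetch(url, init);
// const result = await res.json(); // { success, hfPath, message, size, hash }
```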

GET /health

Health check endpoint.

Response:

{
  "status": "ok",
  "service": "hugstream-upload",
  "timestamp": "2025-01-01T00:00:00.000Z"
}

Environment Variables

Variable              Required   Description
DATABASE_URL          Yes        Turso/LibSQL database URL for user validation
DATABASE_AUTH_TOKEN   Yes        Turso authentication token
HF_TOKEN              Yes        Hugging Face API token with write access
HF_DATASET_REPO       Yes        HF Dataset repository (format: username/repo)
UPLOAD_SERVER_TOKEN   Yes        Secure token for authentication
ALLOWED_ORIGIN        No         CORS origin (default: *)
PORT                  No         Server port (default: 7860)
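The variables above could be read and validated at startup along the following lines. This is a sketch under stated assumptions: the ServerConfig shape and the loadConfig function are hypothetical names, and only the variable names, required flags, and defaults are taken from the table.

```typescript
interface ServerConfig {
  databaseUrl: string;
  databaseAuthToken: string;
  hfToken: string;
  hfDatasetRepo: string;
  uploadServerToken: string;
  allowedOrigin: string;
  port: number;
}

const REQUIRED_VARS = [
  "DATABASE_URL",
  "DATABASE_AUTH_TOKEN",
  "HF_TOKEN",
  "HF_DATASET_REPO",
  "UPLOAD_SERVER_TOKEN",
] as const;

// Hypothetical helper: fails fast on missing required variables and applies the documented defaults.
function loadConfig(env: Record<string, string | undefined>): ServerConfig {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  const port = Number(env["PORT"] ?? "7860");
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`Invalid PORT value: ${env["PORT"]}`);
  }
  return {
    databaseUrl: env["DATABASE_URL"] as string,
    databaseAuthToken: env["DATABASE_AUTH_TOKEN"] as string,
    hfToken: env["HF_TOKEN"] as string,
    hfDatasetRepo: env["HF_DATASET_REPO"] as string,
    uploadServerToken: env["UPLOAD_SERVER_TOKEN"] as string,
    allowedOrigin: env["ALLOWED_ORIGIN"] ?? "*",
    port,
  };
}

// Usage in a Node process: const config = loadConfig(process.env);
```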

Integration with Main App

The main Hugstream application (hfStorage.ts) should be configured to send upload requests to this server instead of uploading directly to Hugging Face.

Environment variables needed in main app:

UPLOAD_SERVER_URL=https://your-space.hf.space
UPLOAD_SERVER_TOKEN=same_token_as_upload_server

Security Notes

  • Never expose HF_TOKEN to the client - it stays secure in this server
  • User Validation: Each upload validates user exists in database before processing
  • Use HTTPS in production - set ALLOWED_ORIGIN to your https:// origin rather than leaving the default (*)
  • Keep UPLOAD_SERVER_TOKEN secret - share only between main app and upload server
  • Database Access: Uses same Turso database as main app for user validation
  • Token authentication prevents unauthorized uploads
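As an illustration of the token check, the sketch below compares the presented Bearer token with the expected one in constant time using Node's crypto.timingSafeEqual. The function name isAuthorized is hypothetical, and the actual server may perform this check differently.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Hypothetical helper: checks an Authorization header value against the expected token.
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string): boolean {
  const prefix = "Bearer ";
  if (expectedToken.length === 0) return false; // never accept when no token is configured
  if (!authorizationHeader || !authorizationHeader.startsWith(prefix)) return false;

  const presented = authorizationHeader.slice(prefix.length);
  // Hash both values first so the buffers have equal length, which timingSafeEqual requires.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(expectedToken).digest();
  return timingSafeEqual(a, b);
}

console.log(isAuthorized("Bearer secret", "secret")); // matching token
console.log(isAuthorized("Bearer wrong", "secret"));  // non-matching token
```

A plain === comparison would also work functionally; the constant-time comparison simply avoids leaking how many leading characters of a guess were correct.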

Architecture

┌─────────────────┐
│   Browser       │
│   (User Client) │
└────────┬────────┘
         │ POST /upload
         │ (with auth token + sessionToken)
         ▼
┌─────────────────┐        ┌──────────────────────┐
│  Upload Server  │───────>│  Turso Database      │
│  (HF Spaces)    │        │  (Session Validation)│
└────────┬────────┘        └──────────────────────┘
         │ Upload file
         │ (with HF_TOKEN)
         ▼
┌─────────────────┐
│  Hugging Face   │
│  Dataset        │
└─────────────────┘
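The flow in the diagram can be sketched as a single handler with injected dependencies. Everything here is illustrative: the interface and function names, the status codes, the error messages, and the choice of the MD5 hash as the dataset path are assumptions, not the server's actual implementation.

```typescript
import { createHash } from "node:crypto";

// Injected stand-ins for the real services; none of these names come from the Hugstream source.
interface UploadDeps {
  isValidSession(sessionToken: string): Promise<boolean>;           // lookup in the Turso database
  existsInDataset(hfPath: string): Promise<boolean>;                // deduplication check
  uploadToDataset(hfPath: string, data: Uint8Array): Promise<void>; // uses HF_TOKEN server-side
}

interface UploadRequest {
  bearerToken: string;
  sessionToken: string;
  hash: string;
  data: Uint8Array;
}

type UploadOutcome =
  | { success: true; hfPath: string; message: string; size: number; hash: string }
  | { success: false; status: number; message: string };

async function handleUpload(
  req: UploadRequest,
  expectedToken: string,
  deps: UploadDeps,
): Promise<UploadOutcome> {
  // 1. Bearer token check (a real server would likely use a constant-time comparison).
  if (req.bearerToken !== expectedToken) {
    return { success: false, status: 401, message: "Invalid bearer token" };
  }
  // 2. Session validation against the database.
  if (!(await deps.isValidSession(req.sessionToken))) {
    return { success: false, status: 401, message: "Invalid session" };
  }
  // 3. Hash verification: recompute the MD5 digest and compare with the submitted value.
  const actual = createHash("md5").update(req.data).digest("hex");
  if (actual !== req.hash.toLowerCase()) {
    return { success: false, status: 400, message: "Hash mismatch" };
  }
  // 4. Deduplication. Assumption: the dataset path is derived from the hash.
  const hfPath = actual;
  if (await deps.existsInDataset(hfPath)) {
    return { success: true, hfPath, message: "File already exists", size: req.data.length, hash: actual };
  }
  // 5. Upload to the Hugging Face Dataset.
  await deps.uploadToDataset(hfPath, req.data);
  return { success: true, hfPath, message: "File uploaded successfully", size: req.data.length, hash: actual };
}
```

Passing the database and dataset operations in as dependencies keeps the ordering of the checks visible and lets the flow be exercised with in-memory stubs.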

Benefits

  1. Direct Browser Upload: Browser uploads directly to HF Space - zero VPS bandwidth usage
  2. Security:
    • HF_TOKEN never exposed to client, stays secure in upload server
    • Session-based authentication prevents uploads to other users' accounts
    • Upload server validates session token with database before accepting upload
  3. Global CDN: HF Spaces runs on edge network for fast uploads worldwide
  4. Deduplication: Automatic hash-based deduplication saves storage space
  5. Scalability: Zero VPS resources used for file uploads