# OllamaSpace Technical Specifications

## Project Overview

OllamaSpace is a web-based chat application that serves as a frontend interface for interacting with Ollama language models. The application provides a real-time chat interface where users can communicate with AI models through a web browser.

## Architecture

### Backend
- **Framework**: FastAPI (Python)
- **API Gateway**: Acts as a proxy between the frontend and the Ollama API
- **Streaming**: Supports real-time streaming of model responses
- **Default Model**: qwen3:4b

### Frontend
- **Technology**: Pure HTML/CSS/JavaScript (no frameworks)
- **Interface**: Simple chat interface with message history
- **Interaction**: Real-time message streaming with typing indicators
- **Styling**: Clean, minimal design with distinct user/bot message styling

## Components

### main.py
- **Framework**: FastAPI
- **Authentication**: Implements Bearer token authentication using HTTPBearer
- **Endpoints**:
  - `GET /` - Redirects to `/chat`
  - `GET /chat` - Serves the chat HTML page
  - `POST /chat_api` - API endpoint that forwards requests to Ollama (requires authentication)
- **Functionality**:
  - Proxies requests to the local Ollama API (http://localhost:11434)
  - Streams model responses back to the frontend
  - Handles error cases and validation
  - Auto-generates a secure API key if one is not provided via environment variable
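The sketch below illustrates how a proxy endpoint with this behavior can be structured. It is a minimal illustration under stated assumptions, not the actual main.py: the use of httpx for the upstream call, the Ollama `/api/generate` path, and all helper names and error messages are assumptions made for the example.

```python
# Minimal sketch of a streaming proxy endpoint as described above.
# Assumptions: httpx for the upstream request and Ollama's /api/generate
# endpoint; names and messages are illustrative, not the actual main.py.
import os
import secrets

import httpx
from fastapi import Depends, FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
security = HTTPBearer()

# Use the configured key, or fall back to a generated one (logged at startup).
API_KEY = os.environ.get("OLLAMA_API_KEY") or secrets.token_urlsafe(32)
DEFAULT_MODEL = os.environ.get("OLLAMA_MODEL", "qwen3:4b")
OLLAMA_URL = "http://localhost:11434/api/generate"


def check_key(credentials: HTTPAuthorizationCredentials = Depends(security)) -> None:
    # Reject requests whose Bearer token does not match the configured key.
    if not secrets.compare_digest(credentials.credentials, API_KEY):
        raise HTTPException(status_code=401, detail="Invalid API key")


@app.post("/chat_api")
async def chat_api(body: dict, _: None = Depends(check_key)):
    prompt = body.get("prompt")
    if not prompt:
        raise HTTPException(status_code=400, detail="Missing prompt")
    payload = {"model": body.get("model", DEFAULT_MODEL), "prompt": prompt}

    async def stream():
        # Forward the request to Ollama and relay chunks as they arrive.
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream("POST", OLLAMA_URL, json=payload) as resp:
                async for chunk in resp.aiter_bytes():
                    yield chunk

    return StreamingResponse(stream(), media_type="application/x-ndjson")
```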
### chat.html
- **Template**: HTML structure for the chat interface with API key management
- **Layout**:
  - Header with API key input and save button
  - Chat window area with message history
  - Message input field
  - Send button
- **Static Assets**: Links to the CSS and JavaScript files

### static/script.js
- **Features**:
  - Real-time message streaming from the API
  - Message display in chat format
  - Enter key support for sending messages
  - Stream parsing to handle JSON responses
  - API key management with localStorage persistence
  - API key input UI with save functionality
- **API Communication**:
  - Includes the API key in the Authorization header as a Bearer token
  - POSTs to the `/chat_api` endpoint
  - Receives streaming responses and displays them incrementally
  - Handles error cases gracefully

### static/style.css
- **Design**: Minimal, clean chat interface with an API key management section
- **Styling**:
  - Distinct colors for user vs. bot messages
  - Responsive layout
  - API key section in the header with input field and save button
  - Auto-scrolling to the latest messages

## Deployment

### Dockerfile
- **Base Image**: ollama/ollama
- **Environment**: Sets up the Ollama server and the FastAPI gateway
- **Port Configuration**: Listens on port 7860 (Hugging Face Spaces default)
- **Model Setup**: Downloads the specified model during the build process
- **Dependencies**: Installs Python, FastAPI, and related libraries

### start.sh
- **Initialization Sequence**:
  1. Starts the Ollama server in the background
  2. Health-checks the Ollama server
  3. Starts the FastAPI gateway on port 7860
- **Error Handling**: Waits for Ollama to be ready before starting the gateway
- **API Key**: If auto-generated, the API key is displayed in the console logs during startup

## Configuration

### Environment Variables
- `OLLAMA_HOST`: 0.0.0.0 (allows external connections)
- `OLLAMA_ORIGINS`: '*' (allows cross-origin requests from any origin)
- `OLLAMA_MODEL`: qwen3:4b (default model, can be overridden)
- `OLLAMA_API_KEY`: (optional) secure API key (auto-generated if not provided)

### Default Model
- **Model**: qwen3:4b
- **Fallback**: If no model is specified in the request, qwen3:4b is used

### API Key Management
- **Generation**: If no OLLAMA_API_KEY environment variable is set, a cryptographically secure random key is generated at startup
- **Access**: The generated API key is displayed in the application logs during startup
- **Frontend Storage**: The API key is stored in the browser's localStorage after being entered once
- **Authentication**: All API requests require a valid Bearer token in the Authorization header

## API Specification

### `/chat_api` Endpoint
- **Method**: POST
- **Authentication**: Requires a Bearer token in the Authorization header
- **Content-Type**: application/json
- **Request Headers**:
  - `Authorization`: Bearer {your_api_key}
  - `Content-Type`: application/json
- **Request Body**:
  ```json
  {
    "model": "string (optional, defaults to qwen3:4b)",
    "prompt": "string (required)"
  }
  ```
- **Response**: Streaming response with incremental model output
- **Error Handling**:
  - Returns 401 for an invalid API key
  - Returns 400 for a missing prompt

### Data Flow
1. Frontend sends the user message to `/chat_api`
2. Backend forwards the request to the local Ollama API
3. Ollama processes the request with the specified model
4. The response is streamed back to the frontend in real time
5. Frontend displays the response incrementally as it arrives
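As an illustration of the endpoint specification and data flow above, the following is a minimal Python client sketch. It assumes the gateway is reachable at http://localhost:7860 and that each streamed line is a JSON object carrying a `response` field (Ollama's newline-delimited format); both details are assumptions about this deployment rather than guarantees of the spec.

```python
# Minimal client sketch for the /chat_api endpoint described above.
# Assumptions: the gateway runs on http://localhost:7860 and relays
# Ollama's newline-delimited JSON chunks, each with a "response" field.
import json

import requests

API_KEY = "your_api_key"  # value printed in the startup logs or set via OLLAMA_API_KEY

resp = requests.post(
    "http://localhost:7860/chat_api",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={"model": "qwen3:4b", "prompt": "Hello!"},
    stream=True,  # read the body incrementally as tokens are generated
)
resp.raise_for_status()

# Print each token fragment as it arrives, mirroring what the web UI does.
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    print(chunk.get("response", ""), end="", flush=True)
print()
```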
## Security Considerations
- **API Key Authentication**: Required for all API access using Bearer token authentication
- **Secure Key Generation**: The API key is auto-generated using a cryptographically secure random generator (`secrets.token_urlsafe(32)`)
- **Configurable Keys**: The API key can be set via environment variable (OLLAMA_API_KEY) or auto-generated
- **Storage**: The client-side API key is stored in the browser's localStorage
- **CORS**: Enabled for all origins (a potential security concern in production)
- **Input Validation**: Validates the presence of the prompt parameter
- **Local API**: Communicates with Ollama through localhost only
- **Key Exposure**: The auto-generated API key is displayed in the console logs during startup (these logs should be secured in production)

## Performance Features
- **Streaming**: Real-time response streaming for better UX
- **Client-side Display**: Incremental message display as responses arrive
- **Efficient Communication**: Uses streaming HTTP responses to minimize latency

## Security Features
- **Authentication**: Bearer token authentication for all API endpoints
- **Key Generation**: Cryptographically secure random API key generation using the secrets module (see the sketch at the end of this document)
- **Key Storage**: API key stored in the browser's localStorage (with the option to enter it via the UI)
- **Transport Security**: API key transmitted via the Authorization header (HTTPS should be used in production)

## Technologies Used
- **Backend**: Python, FastAPI
- **Frontend**: HTML5, CSS3, JavaScript (ES6+)
- **Containerization**: Docker
- **AI Model**: Ollama with qwen3:4b by default
- **Web Server**: Uvicorn ASGI server

## File Structure
```
OllamaSpace/
├── main.py       (FastAPI application)
├── chat.html     (Chat interface)
├── start.sh      (Container startup script)
├── Dockerfile    (Container configuration)
├── README.md     (Project description)
├── static/
│   ├── script.js (Frontend JavaScript)
│   └── style.css (Frontend styling)
```

## Build Process
1. The container is built with Ollama and the Python dependencies
2. The model specified by the OLLAMA_MODEL environment variable is pre-pulled
3. Application files are copied into the container
4. FastAPI dependencies are installed
5. The container starts with the Ollama server and the FastAPI gateway

## Deployment Target
- **Platform**: Designed for Hugging Face Spaces
- **Port**: 7860 (standard for Hugging Face Spaces)
- **Runtime**: Docker container
- **Model Serving**: Ollama with a FastAPI gateway
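To make the key handling described in the Security Considerations and Security Features sections concrete, here is a minimal sketch of startup key selection. Only the use of `OLLAMA_API_KEY` and `secrets.token_urlsafe(32)` comes from the sections above; the variable names and log wording are illustrative assumptions.

```python
# Sketch of the startup key selection described in the Security sections.
# Only OLLAMA_API_KEY and secrets.token_urlsafe(32) come from the spec;
# variable names and the log message are illustrative.
import logging
import os
import secrets

logger = logging.getLogger("ollamaspace")

api_key = os.environ.get("OLLAMA_API_KEY")
if not api_key:
    # No key configured: generate a cryptographically secure random one.
    api_key = secrets.token_urlsafe(32)
    # The generated key is written to the startup logs so it can be copied
    # into the web UI; in production that log output should be protected.
    logger.warning("Generated API key: %s", api_key)
```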