--- title: Ask Jerry emoji: πŸ›‘οΈ colorFrom: purple colorTo: indigo sdk: docker pinned: false app_port: 7860 --- # Ask Jerry Single-model demo chat for the **BrainForge/Security@2026.03.18** cybersecurity model (vLLM, OpenAI-compatible API). The assistant is always **AI Jerry**; you can add optional **extra persona instructions** (similar to LLMChats freeform persona input, without renaming the bot). **Repository:** [github.com/NeonClary/AskJerry](https://github.com/NeonClary/AskJerry) UI styling follows the same general look as [LLMChats3](https://github.com/NeonClary/LLMChats3) (purple / indigo accents, light default). ## Docker (same pattern as [LLMComparisons](https://github.com/NeonClary/LLMComparisons)) The root `Dockerfile` builds the Vite frontend, copies `dist/` into `backend/static/`, and runs **FastAPI + Uvicorn** on port **7860** (Hugging Face Spaces default). The API serves the SPA and proxies chat to your vLLM. ```bash docker build -t ask-jerry . docker run --rm -p 7860:7860 \ -e VLLM_BASE_URL=https://your-host/v1 \ -e VLLM_API_KEY=your-key \ -e CHAT_MODEL_ID=BrainForge/Security@2026.03.18 \ -e CORS_ORIGINS=* \ ask-jerry ``` Open `http://localhost:7860`. ## Hugging Face Spaces 1. Create a **Docker** Space (or connect this repo). 2. In **Settings β†’ Repository secrets** (or Space variables), add: - `VLLM_BASE_URL` β€” OpenAI-compatible base URL including `/v1` - `VLLM_API_KEY` β€” bearer token for vLLM (if required) - Optional: `CHAT_MODEL_ID`, `CORS_ORIGINS` (defaults to `*` in the image for HF; override if needed) Secrets are injected as environment variables at runtime; do **not** commit `.env` to git. ### Create the Space from the Hub CLI (optional) Install the CLI (`pip install huggingface_hub`), set `HF_TOKEN` to a token with write access, then: ```bash python -m huggingface_hub.cli.hf repos create neongeckocom/AskJerry --type space --space-sdk docker --public --exist-ok ``` Replace `neongeckocom/AskJerry` with your org/username and desired Space name. Then add this repo as a **git remote** and push, or use **Settings β†’ Connect to GitHub** on the Space to deploy from [NeonClary/AskJerry](https://github.com/NeonClary/AskJerry). ## Requirements - Python 3.11+ - Node.js 20+ - A running vLLM server that exposes `POST /v1/chat/completions` for `BrainForge/Security@2026.03.18` (or override `CHAT_MODEL_ID`). ## Setup ### Backend ```bash cd backend python -m venv .venv .venv\Scripts\activate pip install -r requirements.txt copy .env.example .env ``` Edit `.env` and set: - `VLLM_BASE_URL` β€” base URL including `/v1` (example: `http://your-host:port/v1`) - `VLLM_API_KEY` β€” bearer token your vLLM expects (see `.env.example` for BrainForge/Security notes) - `CHAT_MODEL_ID` β€” defaults to `BrainForge/Security@2026.03.18` ### Frontend ```bash cd frontend npm install ``` ## Run (development) Uses **port 3006** for the Vite dev server and **8006** for the API so it does not conflict with typical 3000/5173/8080 stacks. **Terminal 1 β€” API** ```bash cd backend .venv\Scripts\activate uvicorn app.main:app --host 127.0.0.1 --port 8006 ``` **Terminal 2 β€” UI** ```bash cd frontend npm run dev ``` Open **http://localhost:3006**. The dev server proxies `/api` and `/health` to `http://127.0.0.1:8006`. ## Features - **Additional persona instructions** β€” optional text merged into the system prompt under β€œAdditional instructions from the user” (name remains AI Jerry). - **Stop** β€” aborts the current stream (partial reply is kept). - **Refresh** β€” regenerates the last assistant reply (same last user message). - **New chat** β€” clears the thread (aborts if streaming). ## Clone ```bash git clone https://github.com/NeonClary/AskJerry.git cd AskJerry ```