AskJerry / README.md
NeonClary
README: HF CLI snippet for creating Docker Space
41531de
|
raw
history blame
3.78 kB
metadata
title: Ask Jerry
emoji: 🛡️
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860

Ask Jerry

Single-model demo chat for the BrainForge/Security@2026.03.18 cybersecurity model (vLLM, OpenAI-compatible API). The assistant is always AI Jerry; you can add optional extra persona instructions (similar to LLMChats freeform persona input, without renaming the bot).

Repository: github.com/NeonClary/AskJerry

UI styling follows the same general look as LLMChats3 (purple / indigo accents, light default).

Docker (same pattern as LLMComparisons)

The root Dockerfile builds the Vite frontend, copies dist/ into backend/static/, and runs FastAPI + Uvicorn on port 7860 (Hugging Face Spaces default). The API serves the SPA and proxies chat to your vLLM.

docker build -t ask-jerry .
docker run --rm -p 7860:7860 \
  -e VLLM_BASE_URL=https://your-host/v1 \
  -e VLLM_API_KEY=your-key \
  -e CHAT_MODEL_ID=BrainForge/Security@2026.03.18 \
  -e CORS_ORIGINS=* \
  ask-jerry

Open http://localhost:7860.

Hugging Face Spaces

  1. Create a Docker Space (or connect this repo).
  2. In Settings → Repository secrets (or Space variables), add:
    • VLLM_BASE_URL — OpenAI-compatible base URL including /v1
    • VLLM_API_KEY — bearer token for vLLM (if required)
    • Optional: CHAT_MODEL_ID, CORS_ORIGINS (defaults to * in the image for HF; override if needed)

Secrets are injected as environment variables at runtime; do not commit .env to git.

Create the Space from the Hub CLI (optional)

Install the CLI (pip install huggingface_hub), set HF_TOKEN to a token with write access, then:

python -m huggingface_hub.cli.hf repos create neongeckocom/AskJerry --type space --space-sdk docker --public --exist-ok

Replace neongeckocom/AskJerry with your org/username and desired Space name. Then add this repo as a git remote and push, or use Settings → Connect to GitHub on the Space to deploy from NeonClary/AskJerry.

Requirements

  • Python 3.11+
  • Node.js 20+
  • A running vLLM server that exposes POST /v1/chat/completions for BrainForge/Security@2026.03.18 (or override CHAT_MODEL_ID).

Setup

Backend

cd backend
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
copy .env.example .env

Edit .env and set:

  • VLLM_BASE_URL — base URL including /v1 (example: http://your-host:port/v1)
  • VLLM_API_KEY — bearer token your vLLM expects (see .env.example for BrainForge/Security notes)
  • CHAT_MODEL_ID — defaults to BrainForge/Security@2026.03.18

Frontend

cd frontend
npm install

Run (development)

Uses port 3006 for the Vite dev server and 8006 for the API so it does not conflict with typical 3000/5173/8080 stacks.

Terminal 1 — API

cd backend
.venv\Scripts\activate
uvicorn app.main:app --host 127.0.0.1 --port 8006

Terminal 2 — UI

cd frontend
npm run dev

Open http://localhost:3006. The dev server proxies /api and /health to http://127.0.0.1:8006.

Features

  • Additional persona instructions — optional text merged into the system prompt under “Additional instructions from the user” (name remains AI Jerry).
  • Stop — aborts the current stream (partial reply is kept).
  • Refresh — regenerates the last assistant reply (same last user message).
  • New chat — clears the thread (aborts if streaming).

Clone

git clone https://github.com/NeonClary/AskJerry.git
cd AskJerry