Spaces:

neongeckocom
/

AskJerry

Running

App Files Files Community

AskJerry / README.md

NeonClary

Add copy-response button and harden Ask Jerry chat streaming.

658b082 15 days ago

preview code

raw

history blame contribute delete

6.66 kB

metadata

title: Ask Jerry
emoji: 🛡️
colorFrom: purple
colorTo: indigo
sdk: docker
pinned: false
app_port: 7860

Ask Jerry

Single-model demo chat for the BrainForge/Security@2026.03.18 cybersecurity model (vLLM, OpenAI-compatible API). The assistant is always AI Jerry; you can add optional extra persona instructions (similar to LLMChats freeform persona input, without renaming the bot).

Repository: github.com/NeonClary/AskJerry

UI styling follows the same general look as LLMChats3 (purple / indigo accents, light default).

Docker (same pattern as LLMComparisons)

The root Dockerfile builds the Vite frontend, copies dist/ into backend/static/, and runs FastAPI + Uvicorn on port 7860 (Hugging Face Spaces default). The API serves the SPA and proxies chat to your vLLM.

docker build -t ask-jerry .
docker run --rm -p 7860:7860 \
  -e VLLM_BASE_URL=https://your-host/v1 \
  -e VLLM_API_KEY=your-key \
  -e CHAT_MODEL_ID=BrainForge/Security@2026.03.18 \
  -e CORS_ORIGINS=* \
  ask-jerry

Open http://localhost:7860.

Hugging Face Spaces

Create a Docker Space (or connect this repo).
In Settings → Repository secrets (or Space variables), add:
- VLLM_BASE_URL — OpenAI-compatible base URL including /v1
- VLLM_API_KEY — bearer token for vLLM (if required)
- Optional: CHAT_MODEL_ID, CORS_ORIGINS (defaults to * in the image for HF; override if needed)

Secrets are injected as environment variables at runtime; do not commit .env to git.

Create the Space from the Hub CLI (optional)

Install the CLI (pip install huggingface_hub), set HF_TOKEN to a token with write access, then:

python -m huggingface_hub.cli.hf repos create neongeckocom/AskJerry --type space --space-sdk docker --public --exist-ok

Replace neongeckocom/AskJerry with your org/username and desired Space name. Then add this repo as a git remote and push, or use Settings → Connect to GitHub on the Space to deploy from NeonClary/AskJerry.

Requirements

Python 3.11+
Node.js 20+
A running vLLM server that exposes POST /v1/chat/completions for BrainForge/Security@2026.03.18 (or override CHAT_MODEL_ID).

Setup

Backend

cd backend
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt
copy .env.example .env

On Windows, if pip install reports packages under a path that is not ...\backend\.venv\..., use python -m pip install -r requirements.txt so dependencies install into the venv you activated.

Edit .env and set:

VLLM_BASE_URL — base URL including /v1 (example: http://your-host:port/v1)
VLLM_API_KEY — bearer token your vLLM expects (see .env.example for BrainForge/Security notes)
CHAT_MODEL_ID — defaults to BrainForge/Security@2026.03.18

Frontend

cd frontend
npm install

Run (development)

Canonical repo: use a single local clone of this repository (avoid duplicate checkouts serving different ports).

Uses port 3006 for the Vite dev server and 8006 for the API so it does not conflict with typical 3000/5173/8080 stacks.

Quick reference: scripts\start-dev.ps1 prints the two terminal commands.

Terminal 1 — API

cd backend
.venv\Scripts\activate
uvicorn app.main:app --host 127.0.0.1 --port 8006

Terminal 2 — UI

cd frontend
npm run dev

Open http://localhost:3006. The dev server proxies /api and /health to http://127.0.0.1:8006.

You can override ports with ASKJERRY_DEV_PORT (Vite) and ASKJERRY_API_PORT (proxy target and implied API port you pass to uvicorn).

Parallel local stacks (`FEAT_STT-TTS` + `master`)

To run the voice branch and master at the same time, use a second checkout (recommended: git worktree) and different ports so nothing collides.

1. One-time: add a worktree for master next to your main clone

From your main repo (e.g. on FEAT_STT-TTS):

cd path\to\AskJerry
git fetch origin
git worktree add ..\AskJerry-master master

2. Install dependencies in both folders (backend: venv + pip install -r requirements.txt; frontend: npm install). Copy backend\.env.example → backend\.env in each worktree; point both at the same vLLM if you like.

3. Port split (default convention)

Stack	Branch	UI	API
Main folder	`FEAT_STT-TTS`	3006	8006
`AskJerry-master`	`master`	3007	8007

4. Run

In the FEAT repo — Terminal A: cd backend → activate venv → uvicorn app.main:app --host 127.0.0.1 --port 8006. Terminal B: cd frontend → npm run dev → open http://localhost:3006.

In AskJerry-master — Terminal A: uvicorn ... --port 8007. Terminal B: npm run dev:alt → open http://localhost:3007.

In AskJerry-master backend\.env, set CORS_ORIGINS to include http://localhost:3007 and http://127.0.0.1:3007 (the FEAT stack can keep 3006 only).

The dev:alt script sets ASKJERRY_DEV_PORT=3007 and ASKJERRY_API_PORT=8007. The same port configuration is on master (pushed as of the “configurable dev ports” commit).

Voice (STT / TTS)

The API proxies Coqui TTS and Whisper STT (defaults: coqui.neonaiservices.com, whisper.neonaiservices.com). Override with COQUI_BASE_URL and WHISPER_BASE_URL in .env.

TTS: POST /api/tts with JSON { "text": "..." } returns audio/wav.
STT: POST /api/transcribe with multipart field audio (e.g. WebM from the browser); requires ffmpeg on the server (included in the Docker image).

Local dev: install backend deps (pip install -r requirements.txt) and ensure ffmpeg is on your PATH for speech-to-text conversion.

Features

Additional persona instructions — optional text merged into the system prompt under “Additional instructions from the user” (name remains AI Jerry).
Stop — aborts the current stream (partial reply is kept).
Refresh — regenerates the last assistant reply (same last user message).
New chat — clears the thread (aborts if streaming).
Read aloud — speaker icon on each assistant message (TTS); Always speak responses in Options.
Mic input — microphone next to Send records speech and appends transcribed text (STT).

Clone

git clone https://github.com/NeonClary/AskJerry.git
cd AskJerry