Speed up CPU inference: halve token limits, pre-download models, fix OMP threads 4af4003 cgoodmaker Claude Opus 4.6 committed about 4 hours ago
Use bfloat16 on CPU to halve memory (8GB vs 16GB float32) 0989643 cgoodmaker Claude Opus 4.6 committed 5 days ago
Fix MCP subprocess deadlock: use stderr=None instead of PIPE da343a7 cgoodmaker Claude Opus 4.6 committed 8 days ago
Add timeout and stderr logging to MCP subprocess to debug tool hangs c376e14 cgoodmaker Claude Opus 4.6 committed 8 days ago
Remove unused files: old Gradio frontend, dead model code, test artifacts 672ed11 cgoodmaker Claude Opus 4.6 committed 8 days ago
Force MCP tool models to CPU to avoid GPU VRAM contention with MedGemma 1a97904 cgoodmaker Claude Opus 4.6 committed 8 days ago
Add missing deps: sentence-transformers, pdfplumber, scipy for MCP tools 5157ba3 cgoodmaker Claude Opus 4.6 committed 8 days ago
Add RAG Phase 4 management guidance, rebuild guidelines index (286 chunks), post-analysis hint UI 5241b71 cgoodmaker Claude Opus 4.6 committed 8 days ago
Use dtype instead of deprecated torch_dtype in model_kwargs 82f82ac cgoodmaker Claude Opus 4.6 committed 8 days ago
Redesign chat UI and fix MedGemma generation config issues 58a4476 cgoodmaker Claude Opus 4.6 committed 8 days ago
Fix requirements: opencv-headless (no libGL needed), remove unused gradio, add faiss-cpu and huggingface_hub 4cbee96 cgoodmaker committed 10 days ago
Include mcp_server/ in Docker image: required at runtime for tool calls 2068e4f cgoodmaker committed 10 days ago
Pass HF_TOKEN explicitly to pipeline() for gated model auth b08f876 cgoodmaker committed 10 days ago
Use HF_TOKEN env var to authenticate for gated MedGemma model bb7e939 cgoodmaker committed 10 days ago
Fix Dockerfile: install curl before using it for NodeSource setup 076039f cgoodmaker committed 10 days ago
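
Note on da343a7 (stderr=None instead of PIPE): the repository's actual MCP client code is not shown in this log, so the following is only a minimal sketch of the general pattern, assuming the parent reads the child's stdout line by line and never reads its stderr. In that situation `stderr=subprocess.PIPE` can deadlock once the child fills the OS pipe buffer, whereas `stderr=None` lets the child inherit the parent's stderr so there is nothing to drain. `CHILD_CODE` and `run_child` are hypothetical names used for illustration.

```python
import subprocess
import sys

# Hypothetical child: writes N bytes to stderr, then one reply line on stdout.
CHILD_CODE = (
    "import sys\n"
    "n = int(sys.argv[1])\n"
    "sys.stderr.write('x' * n)\n"
    "sys.stderr.flush()\n"
    "print('done', flush=True)\n"
)


def run_child(stderr_bytes: int = 200_000, timeout_s: float = 30.0) -> str:
    """Start the child, read one stdout line, and wait for it to exit."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_CODE, str(stderr_bytes)],
        stdout=subprocess.PIPE,
        # Inherit the parent's stderr. With stderr=subprocess.PIPE and no
        # reader, the child could block in write() once the pipe buffer
        # fills, and the readline() below would then never return.
        stderr=None,
        text=True,
    )
    try:
        reply = proc.stdout.readline()
        proc.wait(timeout=timeout_s)  # raises TimeoutExpired if it hangs
    finally:
        if proc.poll() is None:
            # Child is still running (for example after a timeout): stop it.
            proc.kill()
            proc.wait()
        proc.stdout.close()
    return reply.strip()


if __name__ == "__main__":
    print(run_child())
```

The timeout on `wait()` loosely mirrors the intent of c376e14, though `readline()` itself is not time-limited in this sketch.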