Commit History

docs: update README — 4-tier routing, Groq LLM, integrated Prometheus, correct cost saving 96%
704951f

ninditya committed on

fix: add missing httpx import to inference.py (Grafana push NameError)
149688b

ninditya committed on

fix: add conversational_handler to frontend routing config and badge CSS
9e1c7f1

ninditya committed on

fix: add REGISTRY to prometheus_client imports (Grafana push NameError)
2c066b2

ninditya committed on

feat: add conversational pre-filter (tier 0) for greetings and farewells
844f7f7

ninditya committed on

fix: skip intent context hint for very low confidence LLM calls
c257aaa

ninditya committed on

fix: use llama-3.1-8b-instant as default Groq model
477d469

ninditya committed on

feat: add Groq provider to LLM client (free, no billing required)
2f33b1d

ninditya committed on

fix: replace prometheus-remote-write (requires py3.11) with manual protobuf+snappy
bb8fb46

ninditya committed on

feat: push metrics directly to Grafana Cloud from HF Space
dc2aee4

ninditya committed on

fix: add prometheus-client to serving requirements
d2ca5d3

ninditya committed on

feat: add /metrics endpoint for production monitoring on HF Spaces
471fac2

ninditya committed on

fix: correct Banking77 intent names mapping and add RAG for template_handler
26727cc

ninditya committed on

fix: download model from HF Hub instead of COPY artifacts
ba15b1d

ninditya committed on

initial deploy: SBERT+LinearSVC router with RAG + LLM
fc229ab

ninditya committed on