knowledge-inference / inference.py

Commit History

fix: Sage timeout — gunicorn 300s, max_tokens 150, ZeroGPU pattern, correct torch_dtype
ab04b85

Charan Suresh committed on

Fix Sage/Lens silent failures and model loading on HF Space
6180f4c

Charan Suresh committed on

Fix HF Space: switch from llama-cpp-python to transformers for better compatibility
dee67dc

charan-ml committed on

Switch to HF Space as default backend, make Ollama optional
e1b76ea

charan-ml committed on

Replace cloud inference with local Ollama
4c42b27

charan-ml committed on

feat: add anti-gaming comprehension pipeline and harden HF space integration
b3edcc3

charan-ml committed on

Add HF Space model error diagnostics
372105d

charan-ml committed on

Switch HF Space to quantized Gemma 2B GGUF
1e693b7

charan-ml committed on

Switch HF Space to transformers backend
0b7846f

charan-ml committed on

Harden HF Space llama-cpp fallback
0912d7b

charan-ml committed on

Add HF Space inference fallback
c34b1b1

charan-ml committed on

Fix HF Space llama-cpp startup crash
205a0d7

charan-ml committed on

fix: cast model_path and prompt to string to resolve ValueError('not a string')
b6392ce

charan-ml committed on

chore: migrate hf_space to CPU-friendly GGUF via llama-cpp-python
65fcecf

charan-ml committed on

Fix HF Space inference runtime and error handling
fc09209

charan-ml committed on

Update app defaults and database to use US Curriculum Guide, fix HF requirements
1b03d62

charan-ml committed on