Commit History

Auto-correct EVERY broken GDScript block in place (capped at MAX_FIX_PASSES)
635e6fb

vivekchakraverty Claude Opus 4.8 committed on

Restore max_new_tokens to 512 (4-bit gen is fast: ~25 tok/s on GPU)
6246295

vivekchakraverty committed on

Load Qwen2.5-Coder-7B in 4-bit (nf4) inside the GPU worker
2709f63

vivekchakraverty Claude Opus 4.8 committed on

ZeroGPU: load model on GPU inside @spaces.GPU (canonical), not at import
cccb7d5

vivekchakraverty Claude Opus 4.8 committed on

ZeroGPU: raise GPU budget 120->180s, cap max_new_tokens 512->256
5fa56c1

vivekchakraverty committed on

ZeroGPU: force model.to(cuda) in fn (ignore stale is_available); no cuda at import
743e3d3

vivekchakraverty committed on

diag: log cuda availability + model device + gen timing; force model.to(cuda) in fn
5ff14e5

vivekchakraverty committed on

ZeroGPU: keep model GPU-resident (canonical pattern)
8df32ec

vivekchakraverty Claude Opus 4.8 committed on

Load the LLM once at startup instead of per ZeroGPU call
043484b

vivekchakraverty Claude Opus 4.8 committed on

Hardcode chat memory to 4 turns (lock history_turns slider)
69036da

vivekchakraverty Claude Opus 4.8 committed on

Add bounded multi-turn chat memory + turns slider (app.py)
217a06b
verified

vivekchakraverty committed on

Add bounded multi-turn chat memory (prompt.py)
0298f08
verified

vivekchakraverty committed on

Fix ZeroGPU retrieval: pin jina query embedder to CPU
e48654b
verified

vivekchakraverty committed on

Add jina FAISS index (GPU build)
0f4aa1b
verified

vivekchakraverty committed on

Add jina FAISS index (GPU build)
5200bfe
verified

vivekchakraverty committed on

Fix Colab OOM: cap seq length + smaller batch
c314e63
verified

vivekchakraverty committed on

GDScript RAG assistant: app + corpus (index added later via Colab)
777ea0e
verified

vivekchakraverty committed on