rag-backend / model / loader.py

Commit History

fix: update model
ab16882

imtrt004 committed on

fix: remove exAI
6780118

imtrt004 committed on

fix: still error
ac40983

imtrt004 committed on

fix: still error
666afe0

imtrt004 committed on

fix: check_model_input issue
bd1ed9f

imtrt004 committed on

fix: update req
3e3b4c9

imtrt004 committed on

fix: update req
01693a1

imtrt004 committed on

fix: update req and docker
c227cde

imtrt004 committed on

fix: update base model
11d16b6

imtrt004 committed on

fix: exAI ropeparam
5f8085e

imtrt004 committed on

fix: update dependency
9f311d9

imtrt004 committed on

fix: update backend lib with log
2aa0b72

imtrt004 committed on

fix: update model lists
7997082

imtrt004 committed on

feat: select model from admin panel
1903740

imtrt004 committed on

fix: update model to 360m
198a583

imtrt004 committed on

feat: update base free model
67a030a

imtrt004 committed on

perf: switch default LLM to SmolLM2-1.7B - 40-50% faster tok/s, better instruction following
18dc770

imtrt004 committed on

perf: greedy decoding + dtype fix - 2-3x faster inference on CPU
d16b829

imtrt004 committed on

fix: rewrite loader.py as clean UTF-8 - remove Windows-1252 em-dashes causing SyntaxError
8210d54

imtrt004 committed on

feat: self-hosted Qwen2.5-1.5B-Instruct via transformers - no external API, no compilation
deea70e

imtrt004 committed on

feat: replace llama-cpp-python/Groq with free HF InferenceClient (zero compilation)
98e3f05

imtrt004 committed on

fix: double .gguf extension - skip symlink when path already ends in .gguf; add verbose step logging
dbce995

imtrt004 committed on

feat: LLM readiness tracking - 503 while loading, llm_ready in /health
256f0fc

imtrt004 committed on

fix: symlink blob to .gguf extension so llama.cpp C loader accepts it
915613c

imtrt004 committed on

fix: use hf_hub_download + realpath to avoid snapshot ./path crash
fd0d531

imtrt004 committed on

fix: correct GGUF filename case - Qwen3-4B-Q4_K_M.gguf
1057c73

imtrt004 committed on

Initial backend
b5be2eb

imtrt004 committed on