Commit History

fix: update model
ab16882

imtrt004 committed on

fix: remove exAI
6780118

imtrt004 committed on

fix: still error
ac40983

imtrt004 committed on

fix: still error
666afe0

imtrt004 committed on

fix: chat error
d5be08f

imtrt004 committed on

fix: chat error
1589f43

imtrt004 committed on

fix: check_model_input issue
bd1ed9f

imtrt004 committed on

fix: update app py
7a3d50f

imtrt004 committed on

fix: update req
3e3b4c9

imtrt004 committed on

fix: update req
ae136a2

imtrt004 committed on

fix: update req
01693a1

imtrt004 committed on

fix: update req and docker
c227cde

imtrt004 committed on

fix: update base model
11d16b6

imtrt004 committed on

fix: exAI ropeparam
5f8085e

imtrt004 committed on

fix: update req
44c35da

imtrt004 committed on

fix: for exai req update
e0ebda8

imtrt004 committed on

fix: update req and docker
5d9ef52

imtrt004 committed on

fix: update dockerfile
839aefc

imtrt004 committed on

fix: update req and head method
fa776dd

imtrt004 committed on

fix: update dependency
9f311d9

imtrt004 committed on

fix: update requirements for model
8fe0004

imtrt004 committed on

fix: update backend lib with log
2aa0b72

imtrt004 committed on

fix: update model lists
7997082

imtrt004 committed on

fix: model changing
11a93d7

imtrt004 committed on

feat: select model from admin panel
1903740

imtrt004 committed on

fix: update model to 360m
198a583

imtrt004 committed on

feat: update base free model
67a030a

imtrt004 committed on

fix: update tier mode
0829183

imtrt004 committed on

fix: update limit
69975bb

imtrt004 committed on

feat: add ping for uptime
113b6c1

imtrt004 committed on

fix: update context window and prompt
27128c4

imtrt004 committed on

feat: add cerebras api and super mode
a488f5e

imtrt004 committed on

feat: answering question
4d4abe9

imtrt004 committed on

feat: line number and multi docs
391fc60

imtrt004 committed on

fix: improve chunking
e2cc6a2

imtrt004 committed on

fix: upload issue with files larger than 23 MB
311fda6

imtrt004 committed on

fix: llm model info
e859e26

imtrt004 committed on

feat: add groq api and deepmind
29cfc16

imtrt004 committed on

perf: switch default LLM to SmolLM2-1.7B - 40-50% faster tok/s, better instruction following
18dc770

imtrt004 committed on

fix: JSON-encode SSE tokens to preserve newlines in markdown; reduce top_k to 3
ae897ea

imtrt004 committed on

perf: greedy decoding + dtype fix - 2-3x faster inference on CPU
d16b829

imtrt004 committed on

fix: rewrite loader.py as clean UTF-8 - remove Windows-1252 em-dashes causing SyntaxError
8210d54

imtrt004 committed on

feat: self-hosted Qwen2.5-1.5B-Instruct via transformers - no external API, no compilation
deea70e

imtrt004 committed on

feat: replace llama-cpp-python/Groq with free HF InferenceClient (zero compilation)
98e3f05

imtrt004 committed on

fix: restore build-essential+cmake, pin llama-cpp-python==0.3.16 for stable layer cache
bfaa120

imtrt004 committed on

fix: use pre-built llama-cpp-python CPU wheel - eliminates 8min C++ compile
6e6147b

imtrt004 committed on

fix: Dockerfile - pre-install CPU torch, upgrade llama-cpp-python to >=0.3.14 (qwen3 support)
5cfcd30

imtrt004 committed on

fix: upgrade llama-cpp-python >=0.3.14 for qwen3 arch support (was 0.3.8, pre-May 2025)
a0250ac

imtrt004 committed on

fix: double .gguf extension - skip symlink when path already ends in .gguf; add verbose step logging
dbce995

imtrt004 committed on

feat: LLM readiness tracking - 503 while loading, llm_ready in /health
256f0fc

imtrt004 committed on