Commit History

Refactor the C++ LLM manager into modular components, move Python modules under python/, and keep the current control-plane behavior intact. The C++ server now has clearer separation for config, model lifecycle, runtime services, request parsing, HTTP helpers, and server routing, while Docker build/runtime paths were updated to compile multiple C++ files and load Python code from the new package folder.
332826f

Dmitry Beresnev committed on

add auth, token policy, queue scheduler, and cancel flow, etc
d9ce859

Dmitry Beresnev committed on

add new endpoint to cancel all processing prompts
8ef326a

Dmitry Beresnev committed on

add new build profile
a97386f

Dmitry Beresnev committed on

fix encoding
d211568

Dmitry Beresnev committed on

fix model config
057edf0

Dmitry Beresnev committed on

fix proxied response in llm manager
53e9f39

Dmitry Beresnev committed on

fix routing in llm manager
a4ee76d

Dmitry Beresnev committed on

add cpp server
fc0860f

Dmitry Beresnev committed on

change llm model
f41621b

Dmitry Beresnev committed on

change llm model
4f2dffc

Dmitry Beresnev committed on

change model to Qwen2.5-Math-7B-Instruct-GGUF
cca3c7b

Dmitry Beresnev committed on

change llm model to qwen2 math
fe7089d

Dmitry Beresnev committed on

change llm model to mistral
97d9520

Dmitry Beresnev committed on

fix dockerfile
c33410f

Dmitry Beresnev committed on

change compilation flags
0e913e4

Dmitry Beresnev committed on

change compilation flags
1a4efad

Dmitry Beresnev committed on

reduce context and batch
34775a7

Dmitry Beresnev committed on

fix repo name of model
dc883f9

Dmitry Beresnev committed on

fix repo of model
c7c8563

Dmitry Beresnev committed on

fix cmd in dockerfile
0fbce92

Dmitry Beresnev committed on

fix dockerfile
c261631

Dmitry Beresnev committed on

switch to qwen model via cpp server
9a590ac

Dmitry Beresnev committed on

Log elapsed time and token rate when the response arrives.
a8f6b6b

Dmitry Beresnev committed on

add “slow request” logging; log when a request exceeds the prompt budget and gets compacted
130d9e3

Dmitry Beresnev committed on

add simple compacting
6381e7f

Dmitry Beresnev committed on

fix context window size
62a5a49

Dmitry Beresnev committed on

fix payload processing
e1e4b82

Dmitry Beresnev committed on

fix logger
e9b8569

Dmitry Beresnev committed on

fix app to handle exceptions
7d65cc9

Dmitry Beresnev committed on

add requests logging
90f1c82

Dmitry Beresnev committed on

fix dockerfile
950f41b

Dmitry Beresnev committed on

fix dockerfile
f64a284

Dmitry Beresnev committed on

fix gitignore, app and logger, etc
7763bf4

Dmitry Beresnev committed on

add readme
c384ef1

Dmitry Beresnev committed on

add readme
1ccd330

Dmitry Beresnev committed on

Add automatic API documentation and in-memory model caching
2295174

Dmitry Beresnev committed on

Force Docker rebuild for web search dependencies
9345f95

Dmitry Beresnev committed on

fix app, dockerfile, pyproject.toml to add web search
55e1aa1

Dmitry Beresnev committed on

fix app
110f827

Dmitry Beresnev committed on

fix app
944c08a

Dmitry Beresnev committed on

fix dockerfile
e80973f

Dmitry Beresnev committed on

fix dockerfile
84bb7ea

Dmitry Beresnev committed on

fix compiling flags in dockerfile
3fd32cf

Dmitry Beresnev committed on

fix compiling flags in dockerfile
309e664

Dmitry Beresnev committed on

fix dockerfile and app module
dde400a

Dmitry Beresnev committed on

fix dockerfile
8837f11

Dmitry Beresnev committed on

fix dockerfile
8c68c1f

Dmitry Beresnev committed on

fix dockerfile
db57dc8

Dmitry Beresnev committed on

fix dockerfile
25f92ca

Dmitry Beresnev committed on