# Agent Zero — Native HF Space

**Fixed version that loads your ACTUAL model weights** — not a proxy.

## What was wrong with the old Agent Zero

The old Agent Zero (`agent-zero`, `agent-zero-pentesting`, etc.) was designed as a **Docker Compose multi-service stack**: LiteLLM proxy + TGI endpoints + PostgreSQL + SearXNG. On HF Spaces, only a single Docker container runs. The orchestrator tries to connect to `http://localhost:4000` (the LiteLLM proxy), which **doesn't exist**, so **no models ever load**.

The `models_loaded: 3` in the logs was fake: the `service_monitor` was reporting ollama container health, not actual model availability.

## What this does

- Loads your **actual model weights** from your HF repos via `AutoModelForCausalLM.from_pretrained()`
- No LiteLLM, no TGI, no PostgreSQL, no Docker Compose
- Models load on demand and persist in an in-memory cache (see the sketch after this list)
- ZeroGPU compatible (`@spaces.GPU` decorator)
- Select any model from the catalog dropdown
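
A minimal sketch of that load-on-demand pattern, assuming a Transformers + ZeroGPU setup; the helper names (`_CACHE`, `get_model`, `generate`) and the dtype/device choices are illustrative, not the Space's literal code:

```python
import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

_CACHE = {}  # repo_id -> (tokenizer, model); keeps models warm across requests

def get_model(repo_id: str):
    """Load a model on first use, then serve it from the in-memory cache."""
    if repo_id not in _CACHE:
        tok = AutoTokenizer.from_pretrained(repo_id)
        model = AutoModelForCausalLM.from_pretrained(
            repo_id, torch_dtype=torch.bfloat16, device_map="auto"
        )
        _CACHE[repo_id] = (tok, model)
    return _CACHE[repo_id]

@spaces.GPU  # ZeroGPU attaches a GPU only for the duration of this call
def generate(repo_id: str, prompt: str, max_new_tokens: int = 256) -> str:
    tok, model = get_model(repo_id)
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
```

Because `_CACHE` lives at module scope, a repo's weights are downloaded and loaded once per process; later requests for the same model skip straight to generation.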

## Models available

| Model | Tier | Params | Repo |
|---|---|---|---|
| chatgpt5 | T0 | 494M | `ScottzillaSystems/ChatGPT-5-Chat` |
| qwen3.5-9b | T1 | 9.6B | `ScottzillaSystems/Qwen3.5-9B-Chat` |
| cydonia-24b | T2 | 24B | `ScottzillaSystems/Cydonia-24B-v4.1` |
| qwen3.5-27b | T3 | 27B | `ScottzillaSystems/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled` |
| fallen-command | T4 | 111B | `ScottzillaSystems/Fallen-Command-A-111B-Chat` |
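
The dropdown is just a mapping from these names to repo IDs. A hedged sketch of the wiring in Gradio; `respond` echoes its input here to stay self-contained, where the real app would call the `generate` helper sketched above:

```python
import gradio as gr

# The catalog above, as dropdown names mapped to HF repo IDs.
MODEL_CATALOG = {
    "chatgpt5": "ScottzillaSystems/ChatGPT-5-Chat",
    "qwen3.5-9b": "ScottzillaSystems/Qwen3.5-9B-Chat",
    "cydonia-24b": "ScottzillaSystems/Cydonia-24B-v4.1",
    "qwen3.5-27b": "ScottzillaSystems/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled",
    "fallen-command": "ScottzillaSystems/Fallen-Command-A-111B-Chat",
}

def respond(model_name: str, prompt: str) -> str:
    # Placeholder: the real Space would call generate(MODEL_CATALOG[model_name], prompt).
    return f"[{MODEL_CATALOG[model_name]}] {prompt}"

demo = gr.Interface(
    fn=respond,
    inputs=[
        gr.Dropdown(choices=list(MODEL_CATALOG), value="chatgpt5", label="Model"),
        gr.Textbox(label="Prompt"),
    ],
    outputs=gr.Textbox(label="Response"),
)

if __name__ == "__main__":
    demo.launch()
```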

## Hardware

Currently configured to start on `cpu-basic`. Upgrade to `a10g-large` or `a100-large` for the larger models. ZeroGPU (`zero-a10g`) works for models up to 24B.
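
To see at runtime what the current hardware can actually host, a small illustrative helper (the function name and the sizing comment are assumptions, not part of the Space):

```python
import torch

def gpu_memory_gib() -> float:
    """Total memory of GPU 0 in GiB, or 0.0 on cpu-basic."""
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.get_device_properties(0).total_memory / 2**30

# Rough rule of thumb: bf16 weights take ~2 bytes per parameter, so a
# 24B model needs roughly 48 GiB before activations and KV cache.
print(f"GPU memory: {gpu_memory_gib():.1f} GiB")
```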