# Agent Zero — Native HF Space

**Fixed version that loads your actual model weights** — not a proxy.

## What was wrong with the old Agent Zero

The old Agent Zero (`agent-zero`, `agent-zero-pentesting`, etc.) was designed as a **Docker Compose multi-service stack** — LiteLLM proxy + TGI endpoints + PostgreSQL + SearXNG. On HF Spaces, only a single Docker container runs. The orchestrator tries to connect to `http://localhost:4000` (the LiteLLM proxy), which **doesn't exist**, so **no models ever load**.
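
To see the failure directly, a quick probe from inside the container (a sketch; `/health` is the LiteLLM proxy's usual health route) shows the connection being refused:

```python
import requests

# The old stack expected a LiteLLM proxy on port 4000. In a single HF Spaces
# container that proxy is never started, so this probe always fails.
try:
    requests.get("http://localhost:4000/health", timeout=2)
    print("LiteLLM proxy is up")
except requests.ConnectionError:
    print("connection refused on :4000, so no models can ever be served")
```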

The `models_loaded: 3` in the logs was fake — the `service_monitor` was reporting the `ollama` container's health, not actual model availability.

## What this does

- Loads your **actual model weights** from your HF repos via `AutoModelForCausalLM.from_pretrained()`
- No LiteLLM, no TGI, no PostgreSQL, no Docker Compose
- Models load on demand and persist in an in-memory cache (see the sketch after this list)
- ZeroGPU compatible (`@spaces.GPU` decorator)
- Select any model from the catalog dropdown
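
A minimal sketch of that load-on-demand path, assuming `torch`, `transformers`, and the `spaces` package (function and variable names here are illustrative, not the Space's actual code):

```python
import spaces
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# In-memory cache: each model is downloaded and loaded once, then reused.
_LOADED: dict[str, tuple] = {}

def load_model(repo_id: str):
    """Load a model from its HF repo on first use and cache it."""
    if repo_id not in _LOADED:
        tokenizer = AutoTokenizer.from_pretrained(repo_id)
        model = AutoModelForCausalLM.from_pretrained(
            repo_id,
            torch_dtype=torch.bfloat16,  # half-precision weights to save memory
            device_map="auto",           # place weights on the GPU when one is attached
        )
        _LOADED[repo_id] = (tokenizer, model)
    return _LOADED[repo_id]

@spaces.GPU  # ZeroGPU attaches a GPU only for the duration of this call
def generate(repo_id: str, prompt: str) -> str:
    tokenizer, model = load_model(repo_id)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```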

## Models available

| Model | Tier | Params | Repo |
|---|---|---|---|
| chatgpt5 | T0 | 494M | `ScottzillaSystems/ChatGPT-5-Chat` |
| qwen3.5-9b | T1 | 9.6B | `ScottzillaSystems/Qwen3.5-9B-Chat` |
| cydonia-24b | T2 | 24B | `ScottzillaSystems/Cydonia-24B-v4.1` |
| qwen3.5-27b | T3 | 27B | `ScottzillaSystems/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled` |
| fallen-command | T4 | 111B | `ScottzillaSystems/Fallen-Command-A-111B-Chat` |
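
In code, the catalog above can be as simple as a mapping from dropdown name to repo, fed straight into the loader (a sketch; the variable name is hypothetical):

```python
# Hypothetical catalog: dropdown name -> HF repo, tiers as in the table above.
MODEL_CATALOG = {
    "chatgpt5": "ScottzillaSystems/ChatGPT-5-Chat",
    "qwen3.5-9b": "ScottzillaSystems/Qwen3.5-9B-Chat",
    "cydonia-24b": "ScottzillaSystems/Cydonia-24B-v4.1",
    "qwen3.5-27b": "ScottzillaSystems/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled",
    "fallen-command": "ScottzillaSystems/Fallen-Command-A-111B-Chat",
}

# e.g. generate(MODEL_CATALOG["chatgpt5"], "Hello!") with the sketch above
```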

## Hardware

The Space is currently configured to start on `cpu-basic`. Upgrade to `a10g-large` or `a100-large` to run the larger models; ZeroGPU (`zero-a10g`) handles models up to 24B.
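
Hardware can be changed in the Space settings UI, or programmatically via `huggingface_hub` (a sketch; the Space ID is a placeholder):

```python
from huggingface_hub import HfApi

api = HfApi()  # requires a token with write access to the Space
# Move the Space off cpu-basic onto a GPU tier (paid hardware is billed).
api.request_space_hardware(
    repo_id="your-username/agent-zero-native",  # placeholder Space ID
    hardware="a10g-large",                      # or "a100-large", "zero-a10g"
)
```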