feat: HF Space judging-readiness pass
Browse files- Hero verified-stats one-liner at top (256K, 31/31, 200K needle 3/3, 9/9 e2e, $4.12)
- Reframe Status: HF Spaces don't ship MI300X by default β CPU mock is by design,
not because the project doesn't work. Verified numbers come from real MI300X
stress test on AMD Developer Cloud (124 min, $4.12)
- Add demo video link (youtu.be/BvSBR1QazLU)
- Add Lablab project page + AMD Developer Forum thread #505 + GitHub links
- Memory-architecture comparison as table (MI300X vs H100 80GB)
- Soften like-CTA: factual rather than pleading
- README.md: parallel updates for Space card display
README.md
CHANGED
|
@@ -80,13 +80,21 @@ Full stress test on a single AMD MI300X x1 (AMD Developer Cloud, $1.99/hr, vLLM
|
|
| 80 |
- β
14.5 active continuous queriers per MI300X, or 70β140 dev seats for typical bursty engineering teams
|
| 81 |
- β
Owned MI300X ($18K) breaks even vs Cursor in 3β6 months at team-of-100 usage
|
| 82 |
|
| 83 |
-
|
| 84 |
|
| 85 |
-
|
| 86 |
-
[github.com/SRKRZ23/repomind/tree/main/benchmarks/2026-05-05-mi300x-stress-test](https://github.com/SRKRZ23/repomind/tree/main/benchmarks/2026-05-05-mi300x-stress-test)
|
| 87 |
-
Extended PHASE 1+2 narrative (24-cell matrix + AITER A/B): [extended/SUMMARY.md](https://github.com/SRKRZ23/repomind/tree/main/benchmarks/2026-05-05-mi300x-stress-test/extended).
|
| 88 |
|
| 89 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 90 |
|
| 91 |
## Author
|
| 92 |
|
|
|
|
| 80 |
- β
14.5 active continuous queriers per MI300X, or 70β140 dev seats for typical bursty engineering teams
|
| 81 |
- β
Owned MI300X ($18K) breaks even vs Cursor in 3β6 months at team-of-100 usage
|
| 82 |
|
| 83 |
+
## Demo backend
|
| 84 |
|
| 85 |
+
HF Spaces ship CPU / consumer GPUs by default β not MI300X. So this Space serves a **CPU mock for UI demonstration only**. The verified performance numbers above come from a real MI300X stress test on AMD Developer Cloud (124 min, $4.12).
|
|
|
|
|
|
|
| 86 |
|
| 87 |
+
To wire a real MI300X endpoint, set Space secrets `VLLM_BASE_URL` + `MODEL_NAME=Qwen/Qwen3-Coder-Next-FP8` against a vLLM 0.17.1 server. For a live walkthrough on a hosted MI300X, contact razikovsardor1@gmail.com.
|
| 88 |
+
|
| 89 |
+
## Evidence
|
| 90 |
+
|
| 91 |
+
- **1-minute demo video**: <https://youtu.be/BvSBR1QazLU>
|
| 92 |
+
- **Lablab project page**: <https://lablab.ai/ai-hackathons/amd-developer/repomind/repomind>
|
| 93 |
+
- **AMD Developer Forum thread #505** (AITER FP8 regression filed): <https://devcommunity.amd.com/t/repomind-open-source-repo-scale-coding-agent-on-a-single-mi300x-256k-context-fp8-31-31x-concurrency-verified/505>
|
| 94 |
+
- **Full evidence pack** (7 JSON results + 5 PNG plots + e2e prompts/answers + 2Γ rocm-smi snapshots + run logs): [github.com/SRKRZ23/repomind/tree/main/benchmarks/2026-05-05-mi300x-stress-test](https://github.com/SRKRZ23/repomind/tree/main/benchmarks/2026-05-05-mi300x-stress-test)
|
| 95 |
+
- **Extended PHASE 1+2 narrative** (24-cell matrix + AITER A/B): [extended/SUMMARY.md](https://github.com/SRKRZ23/repomind/tree/main/benchmarks/2026-05-05-mi300x-stress-test/extended)
|
| 96 |
+
|
| 97 |
+
Built for the AMD Developer Hackathon 2026 β eligible for the **Hugging Face Special Prize**. If the verified MI300X numbers are useful, a Space like is appreciated. π€
|
| 98 |
|
| 99 |
## Author
|
| 100 |
|
app.py
CHANGED
|
@@ -36,11 +36,10 @@ from ingestion.cloner import clone
|
|
| 36 |
VLLM_BASE_URL = os.environ.get("VLLM_BASE_URL", "").strip()
|
| 37 |
MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen3-Coder-Next-FP8").strip()
|
| 38 |
LIVE_BACKEND = bool(VLLM_BASE_URL)
|
| 39 |
-
BACKEND_LABEL =
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
"Set the Space secrets `VLLM_BASE_URL` + `MODEL_NAME` to wire a real MI300X backend."
|
| 44 |
)
|
| 45 |
|
| 46 |
|
|
@@ -51,23 +50,30 @@ HEADER_MD = f"""
|
|
| 51 |
Ingest a git repository (up to 256K tokens, FP8) on a single GPU and
|
| 52 |
reason across the whole codebase with multi-step tool use.
|
| 53 |
|
| 54 |
-
>
|
| 55 |
-
> π Built for the <a href="https://lablab.ai/ai-hackathons/amd-developer" target="_blank" rel="noopener noreferrer">AMD Developer Hackathon 2026</a>
|
| 56 |
-
> π€ HF Special Prize candidate Β· π‘ Conservative claim discipline applied
|
| 57 |
|
| 58 |
-
|
|
|
|
|
|
|
|
|
|
| 59 |
|
| 60 |
-
|
| 61 |
-
- 256K KV cache @ FP8 = **94.58 GiB** available (2,065,744 tokens, verified)
|
| 62 |
-
- Activations + framework overhead β peak 176/191.7 GiB β **92% utilization**
|
| 63 |
-
- NVIDIA H100 80 GB cannot accommodate this on a single card by VRAM
|
| 64 |
-
accounting (~143 GB > 80 GB); MI300X 192 GB has the headroom
|
| 65 |
|
| 66 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 67 |
|
| 68 |
**Backend right now**: {BACKEND_LABEL}
|
| 69 |
|
| 70 |
-
|
| 71 |
"""
|
| 72 |
|
| 73 |
|
|
@@ -237,8 +243,8 @@ with gr.Blocks(
|
|
| 237 |
<a href="mailto:razikovsardor1@gmail.com">razikovsardor1@gmail.com</a> Β·
|
| 238 |
<a href="mailto:razikovs777@gmail.com">razikovs777@gmail.com</a>
|
| 239 |
</p>
|
| 240 |
-
<p><em>
|
| 241 |
-
<strong>
|
| 242 |
"""
|
| 243 |
)
|
| 244 |
|
|
|
|
| 36 |
VLLM_BASE_URL = os.environ.get("VLLM_BASE_URL", "").strip()
|
| 37 |
MODEL_NAME = os.environ.get("MODEL_NAME", "Qwen/Qwen3-Coder-Next-FP8").strip()
|
| 38 |
LIVE_BACKEND = bool(VLLM_BASE_URL)
|
| 39 |
+
BACKEND_LABEL = (
|
| 40 |
+
f"π’ Live AMD MI300X (vLLM endpoint `{VLLM_BASE_URL}`, model `{MODEL_NAME}`)"
|
| 41 |
+
if LIVE_BACKEND
|
| 42 |
+
else "π‘ CPU mock β HF Spaces ship CPU/T4 by default, not MI300X"
|
|
|
|
| 43 |
)
|
| 44 |
|
| 45 |
|
|
|
|
| 50 |
Ingest a git repository (up to 256K tokens, FP8) on a single GPU and
|
| 51 |
reason across the whole codebase with multi-step tool use.
|
| 52 |
|
| 53 |
+
> **Verified on a single MI300X (2026-05-05):** 256K context Β· 31/31 concurrent users at 8Kβ64K Β· 200K needle-in-haystack 3/3 Β· 9/9 end-to-end repo questions correct Β· $4.12 total stress test cost Β· AITER FP8 attention backend regression filed for AMD review.
|
|
|
|
|
|
|
| 54 |
|
| 55 |
+
> π¬ <a href="https://youtu.be/BvSBR1QazLU" target="_blank" rel="noopener noreferrer">1-minute demo video</a> Β·
|
| 56 |
+
> π¦ <a href="https://github.com/SRKRZ23/repomind" target="_blank" rel="noopener noreferrer">GitHub source (MIT)</a> Β·
|
| 57 |
+
> π <a href="https://lablab.ai/ai-hackathons/amd-developer/repomind/repomind" target="_blank" rel="noopener noreferrer">Lablab project page</a> Β·
|
| 58 |
+
> π <a href="https://devcommunity.amd.com/t/repomind-open-source-repo-scale-coding-agent-on-a-single-mi300x-256k-context-fp8-31-31x-concurrency-verified/505" target="_blank" rel="noopener noreferrer">AMD Developer Forum thread #505</a>
|
| 59 |
|
| 60 |
+
### Why AMD MI300X β memory architecture
|
|
|
|
|
|
|
|
|
|
|
|
|
| 61 |
|
| 62 |
+
| Component | Verified on MI300X | NVIDIA H100 80 GB |
|
| 63 |
+
|---|---|---|
|
| 64 |
+
| Qwen3-Coder-Next-FP8 weights in VRAM | **77.29 GiB** | fits |
|
| 65 |
+
| 256K KV cache @ FP8 (2,065,744 tokens) | **94.58 GiB** available | cannot fit |
|
| 66 |
+
| Total peak utilization | **176 / 191.7 GiB (92%)** | cannot accommodate (~143 GB > 80 GB) |
|
| 67 |
+
|
| 68 |
+
This is a memory-architecture story. AMD MI300X 192 GB has the headroom on a single card; NVIDIA H100 80 GB cannot accommodate the same configuration by VRAM accounting.
|
| 69 |
+
|
| 70 |
+
### Demo backend
|
| 71 |
+
|
| 72 |
+
**This Space serves a CPU mock for UI demonstration only** β HF Spaces don't ship MI300X GPUs. The verified performance numbers above and in the *Verified evidence* tab come from a real MI300X stress test on AMD Developer Cloud (124 min, $4.12).
|
| 73 |
|
| 74 |
**Backend right now**: {BACKEND_LABEL}
|
| 75 |
|
| 76 |
+
To wire a real MI300X endpoint, set Space secrets `VLLM_BASE_URL` + `MODEL_NAME=Qwen/Qwen3-Coder-Next-FP8`. For a live walkthrough on a hosted MI300X, contact razikovsardor1@gmail.com.
|
| 77 |
"""
|
| 78 |
|
| 79 |
|
|
|
|
| 243 |
<a href="mailto:razikovsardor1@gmail.com">razikovsardor1@gmail.com</a> Β·
|
| 244 |
<a href="mailto:razikovs777@gmail.com">razikovs777@gmail.com</a>
|
| 245 |
</p>
|
| 246 |
+
<p><em>Built for the AMD Developer Hackathon 2026 β eligible for the
|
| 247 |
+
<strong>Hugging Face Special Prize</strong>. If the verified MI300X numbers are useful, a Space like is appreciated. π€</em></p>
|
| 248 |
"""
|
| 249 |
)
|
| 250 |
|