gary-boon, Claude Opus 4.5 committed
Commit 688efad · 1 Parent(s): 499afba
docs: Update phased plan with Phase 2/2b/2c completion status

- Mark Phase 2 (Devstral backend support) as COMPLETE
- Mark Phase 2b (Frontend dynamic layers) as COMPLETE
- Mark Phase 2c (API route conversion) as COMPLETE (partial)
- Note Spark toggle deferred until PyTorch GB10 support
- Update tag references
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
docs/devstral-spark-plan-phased.md CHANGED

@@ -1817,10 +1817,25 @@ Before marking each phase complete, verify:
 - [x] **Phase 0**: Secure GPU HF Space + verify basic routing ✅ COMPLETE
 - [x] **Phase 0.5**: Fix critical API route routing (prove GPU routing works) ✅ COMPLETE
 - [ ] **Phase 1**: Deploy CodeGen to DGX Spark ⏸️ PAUSED (see blocker below)
-- [
--
--
--
+- [x] **Phase 2**: Add Devstral backend support ✅ COMPLETE
+  - MistralAdapter added for Mistral/Devstral architecture
+  - devstral-small config with 40 layers, GQA (32 Q heads, 8 KV heads)
+  - Model-specific dtype (recommended_dtype field: codegen→fp16, devstral→bf16)
+  - Percentage-based layer classification (works for any layer count)
+  - /models and /models/current endpoints added
+  - Environment variable support (DEFAULT_MODEL, TORCH_DTYPE, MAX_CONTEXT, BATCH_SIZE)
+- [x] **Phase 2b**: Frontend dynamic layer handling ✅ COMPLETE
+  - Percentage-based stage boundaries in VerticalPipeline
+  - Dynamic vocab size from modelInfo
+  - Dynamic head_dim derived from actual matrix data
+  - Removed hardcoded "64 dimensions" in tutorial
+- [x] **Phase 2c**: API route conversion ✅ COMPLETE (partial)
+  - All 8 API routes converted to use backendFetch helper
+  - Server-side auth with HF token for private Spaces
+  - Per-user backend routing working
+  - ⏸️ Spark toggle deferred (no benefit until PyTorch supports GB10)
+  - ⏸️ GPU HF Space Devstral deployment pending (requires VRAM upgrade)
+- [ ] **Phase 3**: Deploy Devstral to DGX Spark ⏸️ BLOCKED (PyTorch sm_121 support)
 - [ ] **Phase 4**: Future enhancements (optional)

 ---
@@ -1933,9 +1948,15 @@ When PyTorch officially supports sm_121 (expected in PyTorch 2.9.x patch or 2.10

 ### Pre-Devstral Tag

-Before making
+Before making Phase 2 changes, both repos were tagged: `pre-devstral-phase2-v1`

 To restore to this state if needed:
 ```bash
-git checkout pre-devstral-v1
+git checkout pre-devstral-phase2-v1
 ```
+
+### Phase 2 Completion Tags
+
+After Phase 2/2b/2c completion (December 2024):
+- Backend: Contains MistralAdapter, devstral-small config, /models endpoints
+- Frontend: Contains dynamic layer handling, backendFetch conversion
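
The Phase 2 checklist mentions percentage-based layer classification that works for any layer count. The commit does not show how the backend implements it, so the following is only a minimal Python sketch of the general idea. The function name `classify_layer`, the stage names, and the one-third thresholds are all assumptions for illustration, not the project's actual code.

```python
from collections import Counter


def classify_layer(layer_idx: int, num_layers: int) -> str:
    """Map a layer index to a coarse stage based on its relative depth.

    Hypothetical helper: the stage names and the 1/3 and 2/3 cut points
    are illustrative assumptions, not values taken from the repository.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    if not 0 <= layer_idx < num_layers:
        raise IndexError("layer_idx out of range")

    # Integer arithmetic avoids floating-point edge cases at the boundaries:
    # 3 * idx < n is equivalent to idx / n < 1/3.
    if 3 * layer_idx < num_layers:
        return "early"
    if 3 * layer_idx < 2 * num_layers:
        return "middle"
    return "late"


def stage_counts(num_layers: int) -> Counter:
    """Count how many layers fall into each stage for a given model depth."""
    return Counter(classify_layer(i, num_layers) for i in range(num_layers))


if __name__ == "__main__":
    # 40 layers matches the devstral-small config listed in the checklist;
    # 20 is an arbitrary second depth to show the same code adapts.
    for n in (40, 20):
        print(n, dict(stage_counts(n)))
```

Because the boundaries are expressed as fractions of the total depth, the same function applies to a 40-layer model and to any other layer count without hardcoded indices.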