gary-boon Claude Opus 4.5 committed on
Commit 688efad · 1 Parent(s): 499afba

docs: Update phased plan with Phase 2/2b/2c completion status

- Mark Phase 2 (Devstral backend support) as COMPLETE
- Mark Phase 2b (Frontend dynamic layers) as COMPLETE
- Mark Phase 2c (API route conversion) as COMPLETE (partial)
- Note Spark toggle deferred until PyTorch GB10 support
- Update tag references

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Files changed (1)
  1. docs/devstral-spark-plan-phased.md +27 -6
docs/devstral-spark-plan-phased.md CHANGED
@@ -1817,10 +1817,25 @@ Before marking each phase complete, verify:
  - [x] **Phase 0**: Secure GPU HF Space + verify basic routing ✅ COMPLETE
  - [x] **Phase 0.5**: Fix critical API route routing (prove GPU routing works) ✅ COMPLETE
  - [ ] **Phase 1**: Deploy CodeGen to DGX Spark ⏸️ PAUSED (see blocker below)
- - [ ] **Phase 2**: Add Devstral backend support
- - [ ] **Phase 2b**: Frontend dynamic layer handling
- - [ ] **Phase 2c**: Wire Spark into frontend backend router + Deploy Devstral to GPU HF Space
- - [ ] **Phase 3**: Deploy Devstral to DGX Spark
+ - [x] **Phase 2**: Add Devstral backend support ✅ COMPLETE
+   - MistralAdapter added for Mistral/Devstral architecture
+   - devstral-small config with 40 layers, GQA (32 Q heads, 8 KV heads)
+   - Model-specific dtype (recommended_dtype field: codegen→fp16, devstral→bf16)
+   - Percentage-based layer classification (works for any layer count)
+   - /models and /models/current endpoints added
+   - Environment variable support (DEFAULT_MODEL, TORCH_DTYPE, MAX_CONTEXT, BATCH_SIZE)
+ - [x] **Phase 2b**: Frontend dynamic layer handling ✅ COMPLETE
+   - Percentage-based stage boundaries in VerticalPipeline
+   - Dynamic vocab size from modelInfo
+   - Dynamic head_dim derived from actual matrix data
+   - Removed hardcoded "64 dimensions" in tutorial
+ - [x] **Phase 2c**: API route conversion ✅ COMPLETE (partial)
+   - All 8 API routes converted to use backendFetch helper
+   - Server-side auth with HF token for private Spaces
+   - Per-user backend routing working
+   - ⏸️ Spark toggle deferred (no benefit until PyTorch supports GB10)
+   - ⏸️ GPU HF Space Devstral deployment pending (requires VRAM upgrade)
+ - [ ] **Phase 3**: Deploy Devstral to DGX Spark ⏸️ BLOCKED (PyTorch sm_121 support)
  - [ ] **Phase 4**: Future enhancements (optional)
 
  ---
@@ -1933,9 +1948,15 @@ When PyTorch officially supports sm_121 (expected in PyTorch 2.9.x patch or 2.10
 
  ### Pre-Devstral Tag
 
- Before making these changes, both repos were tagged: `pre-devstral-v1`
+ Before making Phase 2 changes, both repos were tagged: `pre-devstral-phase2-v1`
 
  To restore to this state if needed:
  ```bash
- git checkout pre-devstral-v1
+ git checkout pre-devstral-phase2-v1
  ```
+
+ ### Phase 2 Completion Tags
+
+ After Phase 2/2b/2c completion (December 2024):
+ - Backend: Contains MistralAdapter, devstral-small config, /models endpoints
+ - Frontend: Contains dynamic layer handling, backendFetch conversion
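
The Phase 2 notes mention percentage-based layer classification and a devstral-small config with GQA (32 Q heads, 8 KV heads), but the diff shows neither in code. A minimal Python sketch of both ideas follows. The function names, the three stage labels, the one-third and two-thirds cut-offs, and the consecutive-grouping head mapping are all illustrative assumptions, not the repository's actual implementation.

```python
def classify_layer(layer_idx: int, num_layers: int) -> str:
    """Map a layer index to a coarse stage by relative depth.

    Hypothetical helper: the labels and cut-offs are assumptions,
    chosen only to show why the scheme works for any layer count.
    """
    if num_layers <= 0:
        raise ValueError("num_layers must be positive")
    if not 0 <= layer_idx < num_layers:
        raise IndexError("layer_idx out of range")
    # Integer comparisons avoid floating-point edge cases at the boundaries.
    if 3 * layer_idx < num_layers:
        return "early"
    if 3 * layer_idx < 2 * num_layers:
        return "middle"
    return "late"


def kv_head_for_query_head(
    q_head: int, num_q_heads: int = 32, num_kv_heads: int = 8
) -> int:
    """Return the KV head shared by a query head under grouped-query attention.

    Assumes consecutive query heads form a group, which is a common
    convention; the defaults mirror the head counts quoted in the plan.
    """
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("num_q_heads must be a multiple of num_kv_heads")
    if not 0 <= q_head < num_q_heads:
        raise IndexError("q_head out of range")
    group_size = num_q_heads // num_kv_heads  # 4 for 32 Q heads and 8 KV heads
    return q_head // group_size
```

Because the cut-offs are fractions of `num_layers`, the same function can serve a 40-layer model and a model with a different depth without hardcoded indices. With the assumed thresholds, layer 13 of 40 is classified as "early" and layer 14 as "middle".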