# T-001: Bootstrap Synesthesia - Dioxus Fullstack, Tailwind v4, Kansas UI Port, Full ML Pipeline

**Type:** Task
**Autonomy:** `agent:human-led` - Do not merge without human review. Read every constraint before touching any file.
**Stack:** `stack:rust` `stack:dioxus` `stack:tailwind` `stack:ml`
**Version:** v0.1
**Iteration:** iter-1
**Effort:** L (multi-session; Jules may run for many hours across subtasks)
**Blocks:** All subsequent tasks - nothing else starts until all subtasks are green.

---
> ⚠️ **CRITICAL: READ BEFORE ANY ACTION**
>
> The default Dioxus fullstack template (`dx new`) is **already working**. The project builds and serves. Your job is to **add** to it, not restructure it. Agents routinely break this template and leave it broken. You will not do that.
>
> Before touching any file:
> 1. Run `dx serve --package desktop --features desktop`; it must succeed.
> 2. Run `cargo check --workspace`; it must succeed.
> 3. If either fails, **stop and file a blocker comment**; do not attempt fixes.
> 4. Make each subtask's changes, then re-run both checks before the next subtask.
>
> If at any point `dx serve` stops working, **revert your last change and file a blocker**. Do not push forward through a broken build.

---
## Workspace Ground Truth

These files are already correct. **Do not modify them** unless a subtask explicitly instructs it.

### Root `Cargo.toml`

```toml
[workspace]
resolver = "2"
members = [
    "packages/ui",
    "packages/web",
    "packages/desktop",
    "packages/api",
]

[workspace.dependencies]
dioxus = { version = "0.7.3" }
ui = { path = "packages/ui" }
api = { path = "packages/api" }
```
### `packages/api/Cargo.toml` (current state)

```toml
[package]
name = "api"
version = "0.1.0"
edition = "2024"

[dependencies]
dioxus = { workspace = true, features = ["fullstack"] }

[features]
server = ["dioxus/server"]
```
### `packages/ui/Cargo.toml` (current state)

```toml
[package]
name = "ui"
version = "0.1.0"
edition = "2024"

[dependencies]
dioxus = { workspace = true }
api = { workspace = true }

[features]
server = ["api/server"]
```
### `packages/desktop/Cargo.toml` (current state)

```toml
[package]
name = "desktop"
version = "0.1.0"
edition = "2024"

[dependencies]
dioxus = { workspace = true, features = ["router", "fullstack"] }
ui = { workspace = true }

[features]
default = []
desktop = ["dioxus/desktop"]
server = ["dioxus/server", "ui/server"]
```
### `packages/web/Cargo.toml` (current state)

```toml
[package]
name = "web"
version = "0.1.0"
edition = "2024"

[dependencies]
dioxus = { workspace = true, features = ["router", "fullstack"] }
ui = { workspace = true }

[features]
default = []
web = ["dioxus/web"]
server = ["dioxus/server", "ui/server"]
```

---
## Package Roles

| Package   | Role                                                 | Builds for                 |
|-----------|------------------------------------------------------|----------------------------|
| `api`     | Server functions, Burn inference stubs, HF hub pull  | Server + WASM stub         |
| `ui`      | All shared Dioxus components: panels, layout, tokens | Desktop + Web + Server     |
| `desktop` | Native window entry point                            | Desktop + Server (for LAN) |
| `web`     | WASM entry point, served to iPhone/browser           | WASM                       |

---
## File Structure After This Task

```
synesthesia/
├── Cargo.toml            ← unchanged
├── Cargo.lock            ← unchanged
├── Dioxus.toml           ← NEW: dx CLI project config
├── MODELS.md             ← NEW: model roadmap (T-001e)
├── AGENTS.md             ← keep as-is
│
├── packages/
│   ├── api/
│   │   ├── Cargo.toml    ← MODIFIED: add hf-hub, burn stubs, once_cell (T-001d)
│   │   └── src/
│   │       ├── lib.rs    ← NEW: pub mod declarations
│   │       ├── quality.rs ← NEW: Quality enum stub
│   │       └── models/
│   │           └── mod.rs ← NEW: model stub registry
│   │
│   ├── ui/
│   │   ├── Cargo.toml    ← unchanged
│   │   └── src/
│   │       ├── lib.rs    ← MODIFIED: export all panel components
│   │       ├── layout.rs ← NEW: root grid layout component
│   │       ├── header.rs ← NEW: header bar (T-001c)
│   │       ├── track_mixer.rs   ← NEW: left panel stub (T-001c)
│   │       ├── center_canvas.rs ← NEW: center panel (T-001c)
│   │       ├── pitch_intel.rs   ← NEW: right panel stub (T-001c)
│   │       └── bottom_bar.rs    ← NEW: bottom bar stub (T-001c)
│   │
│   ├── desktop/
│   │   ├── Cargo.toml    ← MODIFIED: add window-vibrancy, tokio (T-001a)
│   │   ├── assets/
│   │   │   ├── tailwind.css ← NEW: Tailwind v4 input (T-001b)
│   │   │   └── main.css     ← keep existing if present; do not delete
│   │   └── src/
│   │       └── main.rs   ← MODIFIED: window config (T-001a)
│   │
│   └── web/
│       ├── Cargo.toml    ← unchanged
│       ├── assets/
│       │   └── tailwind.css ← NEW: same Tailwind input as desktop (symlink or copy)
│       └── src/
│           └── main.rs   ← unchanged
│
└── .github/
    └── workflows/
        └── ci.yml        ← NEW: cargo check + dx check stub
```

---
## Subtask T-001a - Tailwind v4 + Window Config

### Goal

Add Tailwind v4 to both desktop and web assets. Configure the native desktop window. Do not break `dx serve`.

### 1. Tailwind v4 Setup

Install the standalone Tailwind v4 CLI (no Node/npm required):

```bash
# Windows: download from https://github.com/tailwindlabs/tailwindcss/releases/latest
# Place tailwindcss.exe in your PATH or reference it directly in tasks

# Verify
tailwindcss --version   # must be v4.x
```
Create `packages/desktop/assets/tailwind.css`:

```css
@import "tailwindcss";

/* Synesthesia design tokens as Tailwind v4 theme variables */
@theme {
  /* Backgrounds */
  --color-bg: #070b10;
  --color-surface: #0d1520;
  --color-surface-2: #111d2e;
  --color-border: #1a2d45;

  /* Text */
  --color-text: #c8ddf0;
  --color-text-dim: #5a7a9a;
  --color-text-bright: #e8f4ff;

  /* Accent */
  --color-accent: #00c8d4;
  --color-accent-dim: rgba(0, 200, 212, 0.12);
  --color-accent-2: #0090a8;

  /* Status */
  --color-red: #e05c5c;
  --color-green: #4ade80;
  --color-yellow: #fbbf24;
  --color-orange: #fb923c;

  /* Stack badge colors */
  --color-rust: #f0734a;
  --color-cpp: #5b9bd5;
  --color-ts: #3fb9e0;
  --color-python: #f5cb42;
  --color-ml: #a78bfa;

  /* Typography */
  --font-mono: "JetBrains Mono", monospace;
  --font-ui: "Space Grotesk", sans-serif;

  /* Font sizes */
  --text-xs: 11px;
  --text-sm: 12px;
  --text-base: 14px;
  --text-md: 15px;
  --text-lg: 18px;

  /* Spacing (4px grid) */
  --spacing-1: 4px;
  --spacing-2: 8px;
  --spacing-3: 12px;
  --spacing-4: 16px;
  --spacing-5: 20px;
  --spacing-6: 24px;
  --spacing-8: 32px;

  /* Radius */
  --radius-sm: 4px;
  --radius-md: 8px;
  --radius-lg: 12px;

  /* Transitions */
  --transition-fast: 120ms ease;
  --transition-base: 200ms ease;

  /* Layout */
  --panel-left-width: 240px;
  --panel-right-width: 280px;
  --header-height: 56px;
  --bottom-bar-height: 100px;
}

@layer base {
  *, *::before, *::after { box-sizing: border-box; }
  html, body { margin: 0; padding: 0; height: 100%; }
  body {
    background-color: var(--color-bg);
    color: var(--color-text);
    font-family: var(--font-ui);
    font-size: var(--text-base);
    -webkit-font-smoothing: antialiased;
  }
}
```
Copy `tailwind.css` to `packages/web/assets/tailwind.css` (identical content).

Build Tailwind output (run alongside `dx serve`):

```bash
# Desktop
tailwindcss -i packages/desktop/assets/tailwind.css -o packages/desktop/assets/output.css --watch

# Web
tailwindcss -i packages/web/assets/tailwind.css -o packages/web/assets/output.css --watch
```
### 2. desktop/Cargo.toml Additions

Add to the existing `packages/desktop/Cargo.toml`. **Do not change anything that already exists**:

```toml
[dependencies]
# ... existing entries unchanged ...
window-vibrancy = "0.7.1"
tokio = { version = "1.43.0", features = ["full"] }
```

The `[features]` section needs no changes; it is already correct:

```toml
[features]
default = []
desktop = ["dioxus/desktop"]
server = ["dioxus/server", "ui/server"]
```
### 3. desktop/src/main.rs

Replace with the following. **Critical:** The stylesheet link uses `asset!()` pointing to `output.css` (the compiled Tailwind file), not `tailwind.css` (the input):

```rust
// packages/desktop/src/main.rs
#![allow(non_snake_case)]
use dioxus::prelude::*;
use dioxus::desktop::{Config, WindowBuilder, LogicalSize};
use ui::Layout;

fn main() {
    // Spawn Axum web server for LAN access (iPhone, browser).
    // Only active with --features server.
    #[cfg(feature = "server")]
    std::thread::spawn(|| {
        tokio::runtime::Runtime::new()
            .unwrap()
            .block_on(start_server());
    });

    dioxus::LaunchBuilder::new()
        .with_cfg(
            Config::new()
                .with_window(
                    WindowBuilder::new()
                        .with_title("synesthesia")
                        .with_inner_size(LogicalSize::new(1280.0, 800.0))
                        .with_min_inner_size(LogicalSize::new(960.0, 600.0))
                        .with_transparent(true)
                        .with_decorations(false),
                )
                .with_background_color((7, 11, 16, 255)) // #070b10
                .with_on_window_ready(|window| {
                    // Apply Windows 11 acrylic vibrancy
                    #[cfg(target_os = "windows")]
                    {
                        use window_vibrancy::apply_acrylic;
                        let _ = apply_acrylic(&window, Some((7, 11, 16, 200)));
                    }
                    // wgpu surface init goes here in a later task
                })
                .with_devtools(cfg!(debug_assertions))
                .with_disable_context_menu(true),
        )
        .launch(App);
}

#[component]
fn App() -> Element {
    rsx! {
        document::Stylesheet { href: asset!("/assets/output.css") }
        Layout {}
    }
}

#[cfg(feature = "server")]
async fn start_server() {
    use dioxus::fullstack::prelude::ServeConfig;

    let router = axum::Router::new()
        .serve_static_assets("dist")
        .serve_dioxus_application(ServeConfig::new(), App);

    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await.unwrap();
    println!("[synesthesia] web server -> http://0.0.0.0:3000");
    axum::serve(listener, router).await.unwrap();
}
```
### 4. Dioxus.toml

Create at the repo root:

```toml
[application]
name = "synesthesia"
default_platform = "desktop"
out_dir = "dist"

[web.app]
base_path = "/"

[web.watcher]
reload_html = true
watch_path = ["packages/ui/src", "packages/desktop/assets", "packages/web/assets"]

[desktop.window]
title = "synesthesia"
width = 1280
height = 800
min_width = 960
min_height = 600
```
### Acceptance Criteria - T-001a

- [ ] `cargo check --workspace` passes after changes
- [ ] `dx serve --package desktop --features desktop` starts the window
- [ ] Window background is `#070b10` with no white flash
- [ ] Window is 1280×800, min-resizable to 960×600
- [ ] Tailwind `output.css` builds without errors
- [ ] `window-vibrancy` compiles; acrylic applied on Windows 11 (log error, do not crash if unavailable)

---
## Subtask T-001b - Design Token Verification

### Goal

Confirm all design tokens from the Kansas prototype are represented in Tailwind `@theme`. There is no separate `tokens.css` file; Tailwind is the token system.

The token mapping from the Kansas CSS variables to Tailwind v4 `@theme` is already done in T-001a. This subtask verifies the mapping is correct and adds Tailwind utility aliases for the most common patterns.

Add to the bottom of `tailwind.css` (after `@theme {}`):
```css
@layer utilities {
  /* Surface helpers */
  .bg-synth-bg        { background-color: var(--color-bg); }
  .bg-synth-surface   { background-color: var(--color-surface); }
  .bg-synth-surface-2 { background-color: var(--color-surface-2); }

  /* Text helpers */
  .text-synth        { color: var(--color-text); }
  .text-synth-dim    { color: var(--color-text-dim); }
  .text-synth-bright { color: var(--color-text-bright); }
  .text-accent       { color: var(--color-accent); }

  /* Border helpers */
  .border-synth { border-color: var(--color-border); }

  /* Font helpers */
  .font-mono-synth { font-family: var(--font-mono); }
  .font-ui-synth   { font-family: var(--font-ui); }

  /* Panel geometry */
  .w-panel-left  { width: var(--panel-left-width); }
  .w-panel-right { width: var(--panel-right-width); }
  .h-header      { height: var(--header-height); }
  .h-bottom-bar  { height: var(--bottom-bar-height); }

  /* Status dot */
  .status-dot {
    width: 8px;
    height: 8px;
    border-radius: 50%;
    display: inline-block;
  }
  .status-dot-green  { background-color: var(--color-green); }
  .status-dot-yellow { background-color: var(--color-yellow); }
  .status-dot-red    { background-color: var(--color-red); }
}
```
### Acceptance Criteria - T-001b

- [ ] All Kansas design token names exist in `@theme {}`
- [ ] No hardcoded hex colors in any `.rs` component file
- [ ] Tailwind builds with zero warnings

---
## Subtask T-001c - Kansas UI Port to Dioxus RSX

### Goal

Translate the Kansas prototype's five-panel layout into Dioxus RSX components in `packages/ui/src/`. All panels are **visual stubs**: correct structure and tokens, no wired functionality. Functionality comes from T-002 onward.

### Root Layout (`packages/ui/src/layout.rs`)
```rust
// packages/ui/src/layout.rs
use dioxus::prelude::*;
use crate::{Header, TrackMixer, CenterCanvas, PitchIntel, BottomBar};

#[component]
pub fn Layout() -> Element {
    rsx! {
        div {
            class: "grid h-screen w-screen overflow-hidden",
            style: "
                grid-template-rows: var(--header-height) 1fr var(--bottom-bar-height);
                grid-template-columns: var(--panel-left-width) 1fr var(--panel-right-width);
                grid-template-areas:
                    'header header header'
                    'left center right'
                    'bottom bottom bottom';
                background-color: var(--color-bg);
            ",
            Header {}
            TrackMixer {}
            CenterCanvas {}
            PitchIntel {}
            BottomBar {}
        }
    }
}
```
### Header Bar (`packages/ui/src/header.rs`)

```rust
// packages/ui/src/header.rs
use dioxus::prelude::*;

#[component]
pub fn Header() -> Element {
    // WebGPU status - placeholder; wired in a later task
    let webgpu_active = use_signal(|| false);

    rsx! {
        header {
            class: "flex items-center justify-between px-4 border-b border-synth",
            style: "grid-area: header; background-color: var(--color-surface); height: var(--header-height);",

            // Left: logo + name + version
            div { class: "flex items-center gap-3",
                span { class: "text-accent font-mono-synth", "~" }
                span {
                    class: "font-mono-synth text-synth-bright",
                    style: "font-size: var(--text-xs); letter-spacing: 0.15em;",
                    "SYNESTHESIA"
                }
                span {
                    class: "font-mono-synth text-synth-dim",
                    style: "font-size: var(--text-xs);",
                    "[V3026.4]"
                }
            }

            // Right: WebGPU status + FREEZE + settings
            div { class: "flex items-center gap-4",
                // WebGPU status pill
                div {
                    class: "flex items-center gap-1.5 px-2 py-1 rounded",
                    style: "background-color: var(--color-surface-2); font-size: var(--text-xs);",
                    span {
                        class: "status-dot",
                        style: if webgpu_active() {
                            "background-color: var(--color-green);"
                        } else {
                            "background-color: var(--color-yellow);"
                        },
                    }
                    span {
                        class: "font-mono-synth",
                        style: "color: var(--color-text-dim);",
                        if webgpu_active() { "WebGPU Active" } else { "WebGPU Pending" }
                    }
                }
                // FREEZE button stub
                button {
                    class: "flex items-center gap-1.5 px-3 py-1 rounded border border-synth",
                    style: "background: transparent; font-size: var(--text-xs); color: var(--color-text-dim); font-family: var(--font-mono); cursor: pointer;",
                    onclick: move |_| { /* T-002+ */ },
                    "❄ FREEZE"
                }
                // Settings stub
                button {
                    style: "background: transparent; border: none; color: var(--color-text-dim); cursor: pointer; font-size: var(--text-lg);",
                    onclick: move |_| { /* T-002+ */ },
                    "⚙"
                }
            }
        }
    }
}
```
### Track Mixer - Left Panel (`packages/ui/src/track_mixer.rs`)

```rust
// packages/ui/src/track_mixer.rs
use dioxus::prelude::*;

const CHANNELS: &[&str] = &["LEAD", "PAD", "BASS"];

#[component]
pub fn TrackMixer() -> Element {
    rsx! {
        aside {
            class: "flex flex-col overflow-hidden border-r border-synth",
            style: "grid-area: left; background-color: var(--color-bg); width: var(--panel-left-width);",

            // Panel header
            div {
                class: "flex items-center justify-between px-3 py-2 border-b border-synth",
                span {
                    class: "font-mono-synth text-synth-dim",
                    style: "font-size: var(--text-xs); letter-spacing: 0.1em;",
                    "TRACK MIXER"
                }
                span {
                    class: "font-mono-synth text-synth-dim",
                    style: "font-size: var(--text-xs);",
                    "[MODEL: MUSICVAE]"
                }
            }

            // Channel strips
            div { class: "flex flex-col gap-0 flex-1",
                for name in CHANNELS {
                    ChannelStrip { name: name.to_string() }
                }
            }
        }
    }
}

#[component]
fn ChannelStrip(name: String) -> Element {
    rsx! {
        div {
            class: "flex flex-col gap-2 p-3 border-b border-synth",

            // Channel name
            span {
                class: "font-mono-synth text-synth-bright",
                style: "font-size: var(--text-xs); font-weight: 700; letter-spacing: 0.1em;",
                "{name}"
            }

            // VOL fader stub
            div { class: "flex items-center gap-2",
                span {
                    class: "font-mono-synth text-synth-dim",
                    style: "font-size: var(--text-xs); min-width: 28px;",
                    "VOL"
                }
                input {
                    r#type: "range",
                    min: "0", max: "100", value: "80",
                    class: "flex-1",
                    style: "accent-color: var(--color-accent);",
                }
            }

            // PAN fader stub
            div { class: "flex items-center gap-2",
                span {
                    class: "font-mono-synth text-synth-dim",
                    style: "font-size: var(--text-xs); min-width: 28px;",
                    "PAN"
                }
                input {
                    r#type: "range",
                    min: "-50", max: "50", value: "0",
                    class: "flex-1",
                    style: "accent-color: var(--color-accent);",
                }
            }
        }
    }
}
```
### Center Canvas (`packages/ui/src/center_canvas.rs`)

```rust
// packages/ui/src/center_canvas.rs
use dioxus::prelude::*;

#[component]
pub fn CenterCanvas() -> Element {
    rsx! {
        main {
            class: "relative overflow-hidden",
            style: "grid-area: center; background-color: var(--color-bg);",

            // WebGPU canvas - wired in a later task
            canvas {
                id: "webgpu-canvas",
                style: "width: 100%; height: 100%; display: block;",
            }

            // Top-left overlay
            div {
                class: "absolute top-3 left-3 px-2 py-1 rounded",
                style: "background-color: var(--color-surface); font-size: var(--text-xs); font-family: var(--font-mono); color: var(--color-text-dim);",
                "[ENGINE: WEBGPU] REAL-TIME SKELETAL POSE"
            }

            // Top-right overlay: REC indicator
            div {
                class: "absolute top-3 right-3",
                style: "font-size: var(--text-xs); font-family: var(--font-mono); color: var(--color-red);",
                "● REC 00:00:00"
            }
        }
    }
}
```
### Pitch Intelligence - Right Panel (`packages/ui/src/pitch_intel.rs`)

```rust
// packages/ui/src/pitch_intel.rs
use dioxus::prelude::*;

#[component]
pub fn PitchIntel() -> Element {
    rsx! {
        aside {
            class: "flex flex-col overflow-hidden border-l border-synth",
            style: "grid-area: right; background-color: var(--color-surface); width: var(--panel-right-width);",

            div {
                class: "flex items-center px-3 py-2 border-b border-synth",
                span {
                    class: "font-mono-synth text-synth-dim",
                    style: "font-size: var(--text-xs); letter-spacing: 0.1em;",
                    "PITCH INTELLIGENCE"
                }
            }

            div { class: "flex flex-col gap-3 p-3",
                MetricBar { label: "AI AUTONOMY", value: 0.0 }
                MetricBar { label: "GPU COMPUTE", value: 0.0 }
            }
        }
    }
}

#[component]
fn MetricBar(label: String, value: f32) -> Element {
    rsx! {
        div { class: "flex flex-col gap-1",
            span {
                class: "font-mono-synth text-synth-dim",
                style: "font-size: var(--text-xs);",
                "{label}"
            }
            div {
                class: "w-full rounded overflow-hidden",
                style: "height: 4px; background-color: var(--color-surface-2);",
                div {
                    style: "height: 100%; width: {value * 100.0}%; background-color: var(--color-accent); transition: width var(--transition-base);",
                }
            }
        }
    }
}
```
### Bottom Bar (`packages/ui/src/bottom_bar.rs`)

```rust
// packages/ui/src/bottom_bar.rs
use dioxus::prelude::*;

const MACROS: &[&str] = &["MACRO 1", "MACRO 2", "FILTER", "REVERB", "DELAY", "DRIVE"];

#[component]
pub fn BottomBar() -> Element {
    rsx! {
        footer {
            class: "flex items-center gap-6 px-4 border-t border-synth overflow-hidden",
            style: "grid-area: bottom; background-color: var(--color-surface); height: var(--bottom-bar-height);",

            // Transport left
            div { class: "flex items-center gap-3 shrink-0",
                // DCC ORIENT button stub
                div {
                    style: "width: 40px; height: 40px; border-radius: 50%; border: 1px solid var(--color-border); display: flex; align-items: center; justify-content: center; cursor: pointer;",
                    span {
                        class: "font-mono-synth text-synth-dim",
                        style: "font-size: 8px; letter-spacing: 0.05em;",
                        "DCC"
                    }
                }
                div { class: "flex flex-col gap-0.5",
                    span {
                        class: "font-mono-synth",
                        style: "font-size: var(--text-xs); color: var(--color-green);",
                        "● LIVE · NEURAL-CORE v4"
                    }
                    div { class: "flex items-center gap-3",
                        span {
                            class: "font-mono-synth text-synth-bright",
                            style: "font-size: var(--text-md); font-weight: 700;",
                            "128 BPM"
                        }
                        span {
                            class: "font-mono-synth text-synth-dim",
                            style: "font-size: var(--text-xs);",
                            "4/4 BAR 001 00:00:00"
                        }
                    }
                }
            }

            // Macro faders
            div { class: "flex items-end gap-4 flex-1 justify-center",
                for label in MACROS {
                    MacroFader { label: label.to_string() }
                }
            }
        }
    }
}

#[component]
fn MacroFader(label: String) -> Element {
    rsx! {
        div { class: "flex flex-col items-center gap-1",
            input {
                r#type: "range",
                min: "0", max: "100", value: "50",
                style: "
                    writing-mode: vertical-lr;
                    direction: rtl;
                    height: 60px;
                    width: 20px;
                    accent-color: var(--color-accent);
                ",
            }
            span {
                class: "font-mono-synth text-synth-dim",
                style: "font-size: 9px; letter-spacing: 0.08em;",
                "{label}"
            }
        }
    }
}
```
### `packages/ui/src/lib.rs`

```rust
// packages/ui/src/lib.rs
pub mod layout;
pub mod header;
pub mod track_mixer;
pub mod center_canvas;
pub mod pitch_intel;
pub mod bottom_bar;

pub use layout::Layout;
pub use header::Header;
pub use track_mixer::TrackMixer;
pub use center_canvas::CenterCanvas;
pub use pitch_intel::PitchIntel;
pub use bottom_bar::BottomBar;
```
### Acceptance Criteria - T-001c

- [ ] `cargo check --workspace` passes with all new component files
- [ ] `dx serve --package desktop --features desktop` renders the five-panel layout
- [ ] Background is `#070b10` with no white flash
- [ ] Header shows: `~ SYNESTHESIA [V3026.4]` left, WebGPU pill + FREEZE + ⚙ right
- [ ] Left panel (240px): TRACK MIXER / [MODEL: MUSICVAE] header + LEAD/PAD/BASS channel strips with VOL/PAN sliders
- [ ] Center panel: canvas fills remaining space, [ENGINE: WEBGPU] top-left overlay, ● REC top-right
- [ ] Right panel (280px): PITCH INTELLIGENCE header + AI AUTONOMY + GPU COMPUTE metric bars
- [ ] Bottom bar (100px): DCC circle + LIVE indicator + 128 BPM + 4/4 + MACRO 1-6 faders
- [ ] Window resizes: layout holds at 960×600 minimum, no overflow
- [ ] No hardcoded hex values in any `.rs` file - all via CSS variables

---
## Subtask T-001d - API Dependencies (Full Runtime Stack)

### Goal

Wire up the complete three-tier inference runtime in `packages/api/Cargo.toml`: Burn wgpu (primary, pure Rust), llama.cpp + Vulkan (LLMs), and ORT DirectML (fallback until Burn op coverage catches up). All WASM-gated correctly. Add all 22 model stub modules matching the HF repo structure.

### Updated `packages/api/Cargo.toml`

Replace entirely with:
```toml
[package]
name = "api"
version = "0.1.0"
edition = "2024"

[dependencies]
dioxus = { workspace = true, features = ["fullstack"] }
burn = { version = "0.21.0-pre.2", default-features = false, features = ["train"] }
hf-hub = { version = "0.5.0", default-features = false }
ndarray = "0.17.2"
serde = { version = "1.0.228", features = ["derive"] }
serde_json = "1.0.149"
anyhow = "1.0.102"
once_cell = "1.21.4"

# WASM: webgpu backend + wasm-compatible TLS + getrandom
[target.'cfg(target_arch = "wasm32")'.dependencies]
burn = { version = "0.21.0-pre.2", default-features = false, features = ["train", "webgpu"] }
getrandom = { version = "0.4.2", features = ["wasm_js"] }
hf-hub = { version = "0.5.0", default-features = false, features = ["rustls-tls"] }

# Native: full runtime stack
[target.'cfg(not(target_arch = "wasm32"))'.dependencies]
burn = { version = "0.21.0-pre.2", default-features = false, features = ["train", "wgpu"] }
hf-hub = { version = "0.5.0", default-features = false, features = ["native-tls"] }
tokio = { version = "1.43.0", features = ["full"] }

# llama.cpp + Vulkan: LLM inference (Gemma-3N, future LLMs).
# Zero-config: auto-downloads pre-built Vulkan binaries from llama.cpp releases.
# Same stack as LM Studio. Vulkan works on RX 6700 XT without ROCm.
llama-cpp-v3 = { version = "*" }

# ORT DirectML: temporary fallback for models burn-import cannot yet handle
# (LSTM-heavy: Performance RNN, MusicVAE, Melody/Drums/Improv/Polyphony RNN).
# Remove per-model as Burn op coverage matures. Tracked in MODELS.md.
ort = { version = "2", features = ["directml"], optional = true }

[features]
server = ["dioxus/server"]
ort-models = ["ort"]  # activate only for models blocked on Burn LSTM support
```
### `packages/api/src/lib.rs`

```rust
// packages/api/src/lib.rs
pub mod quality;
pub mod models;

// Server-only modules - not compiled to WASM
#[cfg(not(target_arch = "wasm32"))]
pub mod backend;
```
### `packages/api/src/quality.rs`

```rust
// packages/api/src/quality.rs

/// Model quality tier - controls which ONNX variant is loaded from HF.
/// Full = fp32 (reference), Half = fp16 (recommended), Lite = int8 (low latency).
#[derive(Debug, Clone, Copy, PartialEq, Eq, serde::Serialize, serde::Deserialize, Default)]
pub enum Quality {
    Full,
    #[default]
    Half,
    Lite,
}

impl Quality {
    pub fn onnx_suffix(self) -> &'static str {
        match self {
            Quality::Full => "fp32.onnx",
            Quality::Half => "fp16.onnx",
            Quality::Lite => "int8.onnx",
        }
    }
}
```
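The tier-to-filename mapping can be exercised in isolation. A minimal sketch (the enum is redefined locally, without the serde derives, so it compiles on its own; in the crate it lives in `packages/api/src/quality.rs`):

```rust
// Standalone sketch of the Quality -> ONNX filename mapping above.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum Quality {
    Full,
    #[default]
    Half,
    Lite,
}

impl Quality {
    fn onnx_suffix(self) -> &'static str {
        match self {
            Quality::Full => "fp32.onnx",
            Quality::Half => "fp16.onnx",
            Quality::Lite => "int8.onnx",
        }
    }
}

fn main() {
    // Half (fp16) is the default tier.
    assert_eq!(Quality::default().onnx_suffix(), "fp16.onnx");

    // A model stub composes its HF path from a group prefix plus the suffix.
    let path = format!("audio/gansynth/{}", Quality::Lite.onnx_suffix());
    assert_eq!(path, "audio/gansynth/int8.onnx");
    println!("{path}");
}
```

Because `#[default]` marks `Half`, any caller that takes `Quality::default()` gets the fp16 variant, which is the recommended tier per the doc comment above.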
### `packages/api/src/models/mod.rs`

All 22 models stubbed with ID, HF path, runtime tier, and Burn compatibility note. Follow this exact pattern for every module file: one `pub struct` with `name()`, `hf_repo()`, and `hf_path()` methods. No inference logic in T-001.

```rust
// packages/api/src/models/mod.rs

// ── Magenta RT ──────────────────────────────────────────────────────────────

/// MRT-001: Magenta RT LLM - 800M param autoregressive transformer
pub mod magenta_rt;
/// MRT-002/003: SpectroStream - 48kHz stereo audio codec (25Hz, 64 RVQ)
pub mod spectrostream;
/// MRT-004/005: MusicCoCa - text + audio → 768-dim music embeddings
pub mod musiccoca;

// ── Magenta Classic MIDI ────────────────────────────────────────────────────

/// MC-001: Performance RNN - expressive MIDI performance generation
/// Runtime: ORT fallback (LSTM) → Burn when op coverage lands
pub mod perfrnn;
/// MC-002: Melody RNN - melody continuation (LSTM)
pub mod melody_rnn;
/// MC-003: Drums RNN - drum pattern generation (LSTM)
pub mod drums_rnn;
/// MC-004: Improv RNN - chord-conditioned melody (LSTM)
pub mod improv_rnn;
/// MC-005: Polyphony RNN - polyphonic generation (LSTM)
pub mod polyphony_rnn;
/// MC-006: MusicVAE - latent music VAE enc/dec (BiLSTM)
pub mod musicvae;
/// MC-007: GrooVAE - drum humanization VAE
pub mod groovae;
/// MC-008: MidiMe - personalize MusicVAE in-session
pub mod midime;
/// MC-009: Music Transformer - long-form piano generation (Attention)
pub mod music_transformer;
/// MC-010: Coconet - counterpoint by convolution
pub mod coconet;

// ── Magenta Classic Audio ───────────────────────────────────────────────────

/// MA-001: GANSynth - GAN audio synthesis (NSynth timbres)
/// Runtime: Burn wgpu (conv-heavy, good op coverage)
pub mod gansynth;
/// MA-002: NSynth - WaveNet neural audio synthesis
pub mod nsynth;
/// MA-003/004: DDSP - differentiable DSP enc/dec
pub mod ddsp;
/// MA-005: Piano Genie - 8-button → 88-key VQ-VAE
pub mod piano_genie;
/// MA-006: Onsets and Frames - polyphonic piano transcription
pub mod onsets_and_frames;
/// MA-007: SPICE - monophonic pitch extraction
pub mod spice;

// ── LLM / Vision ────────────────────────────────────────────────────────────

/// LV-001: Gemma-3N e2b-it - vision + text → mood/energy/key JSON
/// Runtime: llama.cpp + Vulkan (GGUF). NOT Burn or ORT.
pub mod gemma3n;
```
| **Stub file template** β repeat for every module above, updating the doc comment: | |
| ```rust | |
| // packages/api/src/models/gansynth.rs | |
| /// MA-001: GANSynth β GAN-based NSynth audio synthesis. | |
| /// | |
| /// HF repo: Ashiedu/Synesthesia | |
| /// HF paths: audio/gansynth/fp32.onnx | fp16.onnx | |
| /// Source: google/magenta GANSynth (ICLR 2019) | |
| /// Export: tf2onnx from SavedModel checkpoint | |
| /// Runtime: Burn wgpu (primary) β conv-heavy, good op coverage | |
| /// | |
| /// Inputs: | |
| /// z: [batch, 256] β latent vector, sample from N(0,1) | |
| /// pitch: [batch] β MIDI pitch integer, range 24β84 | |
| /// Output: | |
| /// audio: [batch, 64000] β raw PCM @ 16kHz, 4 seconds | |
| pub struct GanSynth; | |
| impl GanSynth { | |
| pub fn name() -> &'static str { "gansynth" } | |
| pub fn hf_repo() -> &'static str { "Ashiedu/Synesthesia" } | |
| pub fn hf_path(quality: crate::quality::Quality) -> String { | |
| format!("audio/gansynth/{}", quality.onnx_suffix()) | |
| } | |
| } | |
| ``` | |
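The stub's `hf_path()` helper assumes a `Quality` enum in `packages/api/src/quality.rs`. A minimal sketch of what that enum could look like - the variant names and `onnx_suffix()` method are assumptions inferred from the tier table, not the committed implementation:

```rust
// Hypothetical packages/api/src/quality.rs - the Quality enum that the stub
// template's hf_path() helper calls into. Suffixes match the ONNX tier names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Quality {
    Full, // fp32 reference quality
    Half, // fp16 default (RX 6700 XT)
    Lite, // int8, MIDI models only
}

impl Quality {
    pub fn onnx_suffix(self) -> &'static str {
        match self {
            Quality::Full => "fp32.onnx",
            Quality::Half => "fp16.onnx",
            Quality::Lite => "int8.onnx",
        }
    }
}

fn main() {
    // Mirrors what GanSynth::hf_path(Quality::Half) would produce
    println!("audio/gansynth/{}", Quality::Half.onnx_suffix());
}
```

Because it holds no runtime handles, the same enum compiles unchanged on `wasm32-unknown-unknown`, which is what the dual-target acceptance criterion below relies on.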
**Key differences per model group:**

| Group | `hf_path` prefix | Runtime note in doc |
|-------|------------------|---------------------|
| Magenta RT | `magenta_rt/` | JAX checkpoint; ONNX export via jax2onnx |
| Magenta MIDI | `midi/<name>/` | ORT fallback (LSTM) → Burn when ready |
| Magenta Audio | `audio/<name>/` | Burn wgpu primary |
| Gemma-3N | `llm/gemma3n_e2b/` | llama.cpp + Vulkan; GGUF not ONNX |
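The mapping above can live in one helper. A hypothetical sketch (function and variable names are illustrative, not part of the codebase) that derives the `hf_path` prefix from a model ID and its module/directory name:

```python
# Hypothetical helper: derive the HF path prefix for a model from its ID group
# and its module/directory name, per the table above.
def hf_prefix(model_id: str, name: str) -> str:
    group = model_id.split("-")[0]
    prefixes = {
        "MRT": "magenta_rt/",       # JAX checkpoints, jax2onnx export
        "MC": f"midi/{name}/",      # ORT fallback until Burn is ready
        "MA": f"audio/{name}/",     # Burn wgpu primary
        "LV": "llm/gemma3n_e2b/",   # llama.cpp + Vulkan, GGUF
    }
    return prefixes[group]

print(hf_prefix("MC-001", "perfrnn"))   # midi/perfrnn/
print(hf_prefix("MA-001", "gansynth"))  # audio/gansynth/
```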
### `packages/api/src/backend.rs` (native only)
```rust
// packages/api/src/backend.rs
// cfg guard is in lib.rs - this file only compiles on native targets
use burn::backend::wgpu::{Wgpu, WgpuDevice};

pub type Backend = Wgpu;

/// Returns the best available wgpu device.
/// On Windows with an RX 6700 XT, selects the AMD GPU via Vulkan/DX12.
pub fn best_device() -> WgpuDevice {
    WgpuDevice::BestAvailable
}
```
### Acceptance Criteria - T-001d
- [ ] `cargo check --workspace` passes for all targets
- [ ] `cargo check --package api --target wasm32-unknown-unknown` passes
- [ ] `cargo tree -p api --target wasm32-unknown-unknown` shows no `tokio`, `burn/wgpu`, `llama-cpp-v3`, or `ort`
- [ ] All 22 model stub files exist with correct IDs, HF paths, and runtime notes
- [ ] `Quality` enum compiles on both targets with the `hf_path()` helper
- [ ] `llama-cpp-v3` present only under `cfg(not(target_arch = "wasm32"))`
- [ ] `ort` only activates under `--features ort-models` - not compiled by default
---
## Subtask T-001e - HuggingFace Repo + MODELS.md
### Goal
Push the canonical README to `Ashiedu/Synesthesia`, create `manifest.json`,
create per-model directory README stubs, and write `MODELS.md` at the repo root.
This makes the HF repo authoritative before any model files exist in it.
### 1. Push updated README to HuggingFace
The file `HF_README.md` (provided separately) is the canonical README for
`Ashiedu/Synesthesia`. Push it as `README.md` to the HF repo:
```python
import os
from huggingface_hub import HfApi

HF_TOKEN = os.environ.get("HF_TOKEN")  # set in environment or Colab Secrets
api = HfApi(token=HF_TOKEN)
api.upload_file(
    path_or_fileobj="HF_README.md",
    path_in_repo="README.md",
    repo_id="Ashiedu/Synesthesia",
    commit_message="T-001e: canonical README - full model inventory + runtime strategy",
)
```
If running locally (not Colab), run `huggingface-cli login` first.
### 2. manifest.json
Create at the repo root and push to HF at `manifest.json`:
```json
{
  "version": "0.1.0",
  "repo": "Ashiedu/Synesthesia",
  "runtime_tiers": {
    "Full": { "suffix": "_fp32.onnx", "description": "Reference quality" },
    "Half": { "suffix": "_fp16.onnx", "description": "Default (RX 6700 XT)" },
    "Lite": { "suffix": "_int8.onnx", "description": "Lowest latency (MIDI models only)" },
    "GGUF_Q4": { "suffix": "_q4_k_m.gguf", "description": "Default LLM tier" },
    "GGUF_Q2": { "suffix": "_q2_k.gguf", "description": "Lite LLM tier" },
    "GGUF_F16": { "suffix": "_f16.gguf", "description": "Full LLM tier" }
  },
  "models": {
    "MRT-001": { "id": "MRT-001", "name": "Magenta RT LLM", "path": "magenta_rt/llm/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": false, "export": "jax2onnx" },
    "MRT-002": { "id": "MRT-002", "name": "SpectroStream Enc", "path": "magenta_rt/spectrostream/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "jax2onnx" },
    "MRT-003": { "id": "MRT-003", "name": "SpectroStream Dec", "path": "magenta_rt/spectrostream/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "jax2onnx" },
    "MRT-004": { "id": "MRT-004", "name": "MusicCoCa Text", "path": "magenta_rt/musiccoca/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "jax2onnx" },
    "MRT-005": { "id": "MRT-005", "name": "MusicCoCa Audio", "path": "magenta_rt/musiccoca/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "jax2onnx" },
    "MC-001": { "id": "MC-001", "name": "Performance RNN", "path": "midi/perfrnn/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx", "ui_role": "AI arpeggiator" },
    "MC-002": { "id": "MC-002", "name": "Melody RNN", "path": "midi/melody_rnn/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx", "ui_role": "Melody continuation" },
    "MC-003": { "id": "MC-003", "name": "Drums RNN", "path": "midi/drums_rnn/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx", "ui_role": "Beat generation" },
    "MC-004": { "id": "MC-004", "name": "Improv RNN", "path": "midi/improv_rnn/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx", "ui_role": "Live improv" },
    "MC-005": { "id": "MC-005", "name": "Polyphony RNN", "path": "midi/polyphony_rnn/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx" },
    "MC-006": { "id": "MC-006", "name": "MusicVAE", "path": "midi/musicvae/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx", "ui_role": "Latent interpolation" },
    "MC-007": { "id": "MC-007", "name": "GrooVAE", "path": "midi/groovae/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx", "ui_role": "Drum humanization" },
    "MC-008": { "id": "MC-008", "name": "MidiMe", "path": "midi/midime/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx" },
    "MC-009": { "id": "MC-009", "name": "Music Transformer", "path": "midi/music_transformer/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx" },
    "MC-010": { "id": "MC-010", "name": "Coconet", "path": "midi/coconet/", "format": "onnx", "runtime": "ort-fallback", "burn_ready": false, "export": "tf2onnx" },
    "MA-001": { "id": "MA-001", "name": "GANSynth", "path": "audio/gansynth/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": true, "export": "tf2onnx", "ui_role": "Timbre synthesis" },
    "MA-002": { "id": "MA-002", "name": "NSynth", "path": "audio/nsynth/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": true, "export": "tf2onnx" },
    "MA-003": { "id": "MA-003", "name": "DDSP Encoder", "path": "audio/ddsp/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": true, "export": "tf2onnx" },
    "MA-004": { "id": "MA-004", "name": "DDSP Decoder", "path": "audio/ddsp/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": true, "export": "tf2onnx" },
    "MA-005": { "id": "MA-005", "name": "Piano Genie", "path": "audio/piano_genie/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": true, "export": "tf2onnx", "ui_role": "Accessible performance" },
    "MA-006": { "id": "MA-006", "name": "Onsets and Frames", "path": "audio/onsets_and_frames/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": false, "export": "tf2onnx", "ui_role": "Audio → MIDI" },
    "MA-007": { "id": "MA-007", "name": "SPICE", "path": "audio/spice/", "format": "onnx", "runtime": "burn-wgpu", "burn_ready": true, "export": "tf2onnx", "ui_role": "Pitch tracking" },
    "LV-001": { "id": "LV-001", "name": "Gemma-3N e2b-it", "path": "llm/gemma3n_e2b/", "format": "gguf", "runtime": "llama-cpp", "burn_ready": false, "export": "unsloth", "ui_role": "Vision → mood/energy/key" }
  }
}
```
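Once pushed, `manifest.json` is machine-readable by both the app and CI. A small sketch of how a consumer might group model IDs by runtime tier - the `sample` dict below is a trimmed stand-in for the full manifest above, and the function name is illustrative:

```python
# Sketch of a manifest.json consumer: group model IDs by runtime so CI can
# report which models still sit on the ORT fallback.
def group_by_runtime(manifest: dict) -> dict:
    groups: dict = {}
    for key, model in manifest["models"].items():
        # manifest sanity check: each entry's "id" must match its key
        assert model["id"] == key, f"id mismatch for {key}"
        groups.setdefault(model["runtime"], []).append(key)
    return groups

# trimmed stand-in for the full manifest
sample = {"models": {
    "MC-001": {"id": "MC-001", "runtime": "ort-fallback"},
    "MA-001": {"id": "MA-001", "runtime": "burn-wgpu"},
    "LV-001": {"id": "LV-001", "runtime": "llama-cpp"},
}}
print(group_by_runtime(sample))
```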
Push to HF:
```python
api.upload_file(
    path_or_fileobj="manifest.json",
    path_in_repo="manifest.json",
    repo_id="Ashiedu/Synesthesia",
    commit_message="T-001e: manifest.json - authoritative model registry",
)
```
### 3. Per-model directory README stubs
For every model in `manifest.json`, create a `README.md` in its HF directory.
Use this template, filling in the model-specific fields:
```markdown
# <Model Name> - Synesthesia
**ID:** <MRT-001 etc>
**HF path:** `Ashiedu/Synesthesia/<path>/`
**Source:** <Magenta / Magenta RT / Google>
**Task:** <one line>
**Synesthesia role:** <UI role>
**Export method:** `<tf2onnx | jax2onnx | unsloth>`
**Runtime:** <Burn wgpu | ORT DirectML | llama.cpp Vulkan>
**Burn compatible:** <yes / not yet - blocked on <op>>
## Quality Tiers Available
| Tier | File | Status |
|------|------|--------|
| Full | `fp32.onnx` | Planned |
| Half | `fp16.onnx` | Planned |
| Lite | `int8.onnx` | Planned (MIDI only) |
## Inputs / Outputs
| Tensor | Shape | Description |
|--------|-------|-------------|
| ... | ... | ... |
## Export Command (Colab)
See T-001f Colab template. Model-specific command:
\`\`\`bash
<export command>
\`\`\`
## Fine-tuning
Train after the app is functional. Data source TBD from live usage.
```
Push all README stubs in one loop (one commit per file):
```python
import json

with open("manifest.json") as f:
    manifest = json.load(f)

for model_id, model in manifest["models"].items():
    readme_content = generate_readme(model)  # fill template above
    api.upload_file(
        path_or_fileobj=readme_content.encode(),
        path_in_repo=f"{model['path']}README.md",
        repo_id="Ashiedu/Synesthesia",
        commit_message=f"T-001e: {model_id} {model['name']} README stub",
    )
```
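`generate_readme` above is left to the implementer. A minimal sketch - field names match `manifest.json`, the layout follows the template, and only a subset of the template's fields is shown:

```python
# Minimal generate_readme sketch: fills part of the per-model template from a
# manifest.json entry. Extend with Source/Task/tier tables as needed.
def generate_readme(model: dict) -> str:
    return "\n".join([
        f"# {model['name']} - Synesthesia",
        f"**ID:** {model['id']}",
        f"**HF path:** `Ashiedu/Synesthesia/{model['path']}`",
        f"**Synesthesia role:** {model.get('ui_role', 'TBD')}",
        f"**Export method:** `{model['export']}`",
        f"**Runtime:** {model['runtime']}",
    ])

entry = {"id": "MA-001", "name": "GANSynth", "path": "audio/gansynth/",
         "export": "tf2onnx", "runtime": "burn-wgpu",
         "ui_role": "Timbre synthesis"}
print(generate_readme(entry))
```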
### 4. MODELS.md at repo root
Create `MODELS.md` in the Synesthesia GitHub repo (not HF):
```markdown
# Synesthesia - Model Roadmap
**HF Repo:** [Ashiedu/Synesthesia](https://hf.co/Ashiedu/Synesthesia)
**Authoritative registry:** `manifest.json` in the HF repo
**Pipeline docs:** `docs/ml-pipeline.md`
---
## Inference Runtime Strategy
### Tier 1 - llama.cpp + Vulkan
- **Crate:** `llama-cpp-v3` - zero-config Vulkan binaries, same stack as LM Studio
- **Models:** Gemma-3N e2b-it (GGUF). All future LLMs added here.
- **GPU:** RX 6700 XT via Vulkan - no ROCm, no CUDA needed on Windows 11
### Tier 2 - Burn wgpu (primary, pure Rust)
- **Crate:** `burn 0.21.0-pre.2`, wgpu backend - Vulkan/DX12
- **Models:** GANSynth, NSynth, DDSP, Piano Genie, SPICE (conv-heavy, good op coverage)
- **Migration:** Each ORT model moves here when `burn-onnx` succeeds in CI
### Tier 3 - ORT + DirectML (fallback, temporary)
- **Crate:** `ort = { features = ["directml"], optional = true }`, feature = `ort-models`
- **Models:** All LSTM/Attention models until Burn op coverage reaches them
- **Lifecycle:** Per-model exit tracked in the table below
---
## Model Inventory
### Magenta RT
| ID | Model | Task | Burn | ORT | llama.cpp |
|----|-------|------|------|-----|-----------|
| MRT-001 | Magenta RT LLM | Real-time stereo audio generation | 🚧 | ✅ | ❌ |
| MRT-002 | SpectroStream Enc | Audio → spectral tokens (48kHz) | 🚧 | ✅ | ❌ |
| MRT-003 | SpectroStream Dec | Spectral tokens → audio | 🚧 | ✅ | ❌ |
| MRT-004 | MusicCoCa Text | Text → 768-dim music embedding | 🚧 | ✅ | ❌ |
| MRT-005 | MusicCoCa Audio | Audio → 768-dim music embedding | 🚧 | ✅ | ❌ |
### Magenta Classic - MIDI
| ID | Model | Synesthesia Role | Burn | ORT | Export |
|----|-------|------------------|------|-----|--------|
| MC-001 | Performance RNN | AI arpeggiator | 🚧 LSTM | ✅ | tf2onnx |
| MC-002 | Melody RNN | Melody continuation | 🚧 LSTM | ✅ | tf2onnx |
| MC-003 | Drums RNN | Beat generation | 🚧 LSTM | ✅ | tf2onnx |
| MC-004 | Improv RNN | Live improv over chords | 🚧 LSTM | ✅ | tf2onnx |
| MC-005 | Polyphony RNN | Harmonic voice gen | 🚧 LSTM | ✅ | tf2onnx |
| MC-006 | MusicVAE | Latent interpolation | 🚧 BiLSTM | ✅ | tf2onnx |
| MC-007 | GrooVAE | Drum humanization | 🚧 BiLSTM | ✅ | tf2onnx |
| MC-008 | MidiMe | User-adaptive latent space | 🚧 | ✅ | tf2onnx |
| MC-009 | Music Transformer | Long-form piano gen | 🚧 Attn | ✅ | tf2onnx |
| MC-010 | Coconet | Counterpoint / harmony fill | 🚧 Conv | ✅ | tf2onnx |
### Magenta Classic - Audio
| ID | Model | Synesthesia Role | Burn | ORT | Export |
|----|-------|------------------|------|-----|--------|
| MA-001 | GANSynth | GANHarp timbre instrument | ✅ | ❌ | tf2onnx |
| MA-002 | NSynth | Neural sample synthesis | ✅ | ❌ | tf2onnx |
| MA-003 | DDSP Encoder | Audio → harmonic params | ✅ | ❌ | tf2onnx |
| MA-004 | DDSP Decoder | Harmonic params → audio | ✅ | ❌ | tf2onnx |
| MA-005 | Piano Genie | 8-button → 88-key piano | ✅ | ❌ | tf2onnx |
| MA-006 | Onsets & Frames | Audio → MIDI transcription | 🚧 Conv+LSTM | ✅ | tf2onnx |
| MA-007 | SPICE | Monophonic pitch tracking | ✅ | ❌ | tf2onnx |
### LLM / Vision
| ID | Model | Role | Runtime | Format |
|----|-------|------|---------|--------|
| LV-001 | Gemma-3N e2b-it | Camera → mood/energy/key JSON | llama.cpp + Vulkan | GGUF |
**Legend:** ✅ active · ❌ not used · 🚧 planned (blocked on op)
---
## CI - Burn Migration Tracking
A weekly CI job attempts `burn-onnx ModelGen` on each exported model.
On success: flip `burn_ready` in `manifest.json` and remove the model from the `ort-models` feature.
```bash
# .github/workflows/burn-migration.yml
# Runs weekly: cargo run --bin burn-compat-check
```
---
## Training Philosophy
**Train after the app works.** Identify real input distributions from live usage.
Fine-tune on your own audio and MIDI once the signal chain is operational.
Tentative fine-tuning sequence:
1. Performance RNN - live MIDI from Track Mixer (MC-001)
2. Melody RNN - melody continuation from Track Mixer input (MC-002)
3. MusicVAE + GrooVAE - latent interpolation + drum humanization (MC-006/007)
4. GANSynth - timbre generation from Pitch Intelligence panel (MA-001)
5. DDSP - resynthesis of GANSynth outputs (MA-003/004)
6. Magenta RT - full audio, conditioned on your own catalog (MRT-001)
7. Gemma-3N - fine-tune on camera sessions for mood/energy mapping (LV-001)
```
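The "flip `burn_ready`" step the CI section describes amounts to a small manifest edit. A hypothetical sketch of what the weekly job would do after a successful `burn-onnx` import - the function name is illustrative:

```python
# Hypothetical CI step: after burn-onnx succeeds for a model, promote its
# manifest.json entry from the ORT fallback to Burn.
def mark_burn_ready(manifest: dict, model_id: str) -> dict:
    entry = manifest["models"][model_id]
    entry["burn_ready"] = True
    entry["runtime"] = "burn-wgpu"
    return manifest

# trimmed stand-in for the full manifest
m = {"models": {"MC-001": {"id": "MC-001", "runtime": "ort-fallback",
                           "burn_ready": False}}}
mark_burn_ready(m, "MC-001")
print(m["models"]["MC-001"])
```

The updated manifest would then be re-uploaded with `api.upload_file` exactly as in T-001e step 2.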
### Acceptance Criteria - T-001e
- [ ] Updated README pushed to `Ashiedu/Synesthesia` on HuggingFace
- [ ] `manifest.json` pushed to HF with all 23 model entries
- [ ] Per-model README stub exists in every model subdirectory on HF
- [ ] `MODELS.md` exists at the GitHub repo root with full inventory + CI tracking table
- [ ] Runtime strategy section present: llama.cpp / Burn / ORT tiers documented
- [ ] Training sequence present with the "train after the app works" rule
---
## Subtask T-001f - Colab Export Template
### Goal
Create a generic Colab-ready export notebook template that covers all three
export paths (tf2onnx, jax2onnx, Unsloth GGUF) and the push-to-HF workflow.
Store it as `docs/colab_export_template.ipynb` in the GitHub repo.
Gemini on Colab can execute it directly - paste the model's README as context.
### `docs/colab_export_template.ipynb` - Key Cells
Create a Jupyter notebook with these cells as markdown + code:
**Cell 1 - Setup**
```python
# Install dependencies
!pip install -q tf2onnx onnx onnxruntime onnxconverter-common \
    huggingface_hub unsloth transformers accelerate

# HF token - set in Colab Secrets (key: HF_TOKEN)
import os
from google.colab import userdata
HF_TOKEN = userdata.get("HF_TOKEN")

from huggingface_hub import HfApi, snapshot_download
api = HfApi(token=HF_TOKEN)
REPO_ID = "Ashiedu/Synesthesia"
```
**Cell 2 - Pull the current checkpoint from HF (if updating)**
```python
snapshot_download(REPO_ID, local_dir="./models", token=HF_TOKEN)
```
**Cell 3a - Magenta Classic export (tf2onnx)**
```python
# Clone Magenta
!git clone --depth 1 https://github.com/magenta/magenta

# Convert the SavedModel checkpoint - MODEL_NAME set per model
import tf2onnx

MODEL_NAME = "performance_rnn"  # change per model
CHECKPOINT = f"./magenta_checkpoints/{MODEL_NAME}"
OUTPUT_ONNX = f"{MODEL_NAME}.onnx"

# from_saved_model picks up the serving signature directly. If a checkpoint
# only exposes a concrete function, wrap it in a tf.function and use
# tf2onnx.convert.from_function with an explicit input signature instead.
model_proto, _ = tf2onnx.convert.from_saved_model(
    CHECKPOINT,
    opset=17,
    output_path=OUTPUT_ONNX,
)
print(f"Exported: {OUTPUT_ONNX}")
```
**Cell 3b - Magenta RT export (jax2onnx)**
```python
# Clone Magenta RT
!git clone --depth 1 https://github.com/magenta/magenta-realtime
!pip install -q jax jaxlib jax2onnx

# MODEL_NAME: spectrostream_encoder | spectrostream_decoder | musiccoca_text | musiccoca_audio
MODEL_NAME = "spectrostream_encoder"
# Follow the magenta-realtime export docs for each component
```
**Cell 3c - Gemma-3N GGUF (Unsloth, free T4)**
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "google/gemma-3n-e2b-it",
    max_seq_length=512,
    load_in_4bit=True,
)
# Fine-tune if desired (LoRA), then export tiers
for quant in ["q4_k_m", "q2_k", "f16"]:
    out = f"gemma3n_e2b_{quant}"
    model.save_pretrained_gguf(out, tokenizer, quantization_method=quant)
    api.upload_folder(
        folder_path=out,
        path_in_repo="llm/gemma3n_e2b/",
        repo_id=REPO_ID,
        commit_message=f"LV-001 Gemma-3N {quant}",
    )
```
**Cell 4 - Quantize ONNX to fp16 and int8**
```python
import onnx
import onnxconverter_common as occ
from onnxruntime.quantization import quantize_dynamic, QuantType

fp32_model = onnx.load(OUTPUT_ONNX)

# fp16
fp16_model = occ.convert_float_to_float16(fp32_model, keep_io_types=True)
fp16_path = OUTPUT_ONNX.replace(".onnx", "_fp16.onnx")
onnx.save(fp16_model, fp16_path)

# int8 (MIDI models only - skip for audio models)
int8_path = OUTPUT_ONNX.replace(".onnx", "_int8.onnx")
quantize_dynamic(OUTPUT_ONNX, int8_path, weight_type=QuantType.QInt8)

print(f"fp32: {os.path.getsize(OUTPUT_ONNX) / 1e6:.1f} MB")
print(f"fp16: {os.path.getsize(fp16_path) / 1e6:.1f} MB")
print(f"int8: {os.path.getsize(int8_path) / 1e6:.1f} MB")
```
**Cell 5 - Validate with ORT before push**
```python
import onnxruntime as ort
import numpy as np

sess = ort.InferenceSession(
    fp16_path,
    providers=["CPUExecutionProvider"],  # CPU on Colab is fine for validation
)
# Dummy input matching the model's input shapes; dynamic dims (None or
# symbolic names) are replaced with 1. I/O stays float32 because of
# keep_io_types=True above.
dummy = {
    inp.name: np.zeros(
        [d if isinstance(d, int) else 1 for d in inp.shape],
        dtype=np.float32,
    )
    for inp in sess.get_inputs()
}
out = sess.run(None, dummy)
print("Validation passed:", [o.shape for o in out])
```
**Cell 6 - Push all tiers to HF**
```python
MODEL_ID = "MC-001"         # change per model
HF_PREFIX = "midi/perfrnn"  # change per model - matches manifest.json path

for local_path, repo_suffix in [
    (OUTPUT_ONNX, "fp32.onnx"),
    (fp16_path, "fp16.onnx"),
    (int8_path, "int8.onnx"),
]:
    api.upload_file(
        path_or_fileobj=local_path,
        path_in_repo=f"{HF_PREFIX}/{repo_suffix}",
        repo_id=REPO_ID,
        commit_message=f"{MODEL_ID} {repo_suffix}",
    )
    print(f"Pushed {HF_PREFIX}/{repo_suffix}")

# Update manifest.json burn_ready if applicable
```
### Gemini on Colab Workflow
Gemini does not need GitHub integration. The workflow:
1. Open a new Colab notebook
2. Paste the model's HF directory README as a Gemini context cell
3. Paste the relevant cells from `colab_export_template.ipynb`
4. Set `MODEL_NAME`, `MODEL_ID`, and `HF_PREFIX` to match the model's README
5. Run - Gemini can drive execution and debug errors
### Acceptance Criteria - T-001f
- [ ] `docs/colab_export_template.ipynb` exists in the GitHub repo
- [ ] Notebook has all 6 cells above with correct code
- [ ] All three export paths covered: tf2onnx, jax2onnx, Unsloth GGUF
- [ ] Validation cell present (ORT CPU check before push)
- [ ] Gemini workflow documented in a markdown cell in the notebook
---
## Global Acceptance Criteria
All must pass before closing T-001:
### Build
- [ ] `cargo check --workspace` - zero errors, zero warnings
- [ ] `dx serve --package desktop --features desktop` - window opens, layout renders
- [ ] `dx serve --package web --features web` - WASM builds, serves on localhost
- [ ] `cargo check --package api --target wasm32-unknown-unknown` - passes cleanly
- [ ] Tailwind `output.css` builds with zero errors
### Visual
- [ ] Five-panel layout matches the Kansas prototype structure
- [ ] All design tokens from Kansas `tokens.css` present in Tailwind `@theme`
- [ ] No white flash on window open; background is `#070b10`
- [ ] Window min-size 960×600 enforced
- [ ] No hardcoded hex values in any `.rs` file
### Codebase
- [ ] All 22 model stub files exist with correct IDs, HF paths, runtime tier, and doc comments
- [ ] `llama-cpp-v3` compiled only for native targets, never for WASM
- [ ] `ort` only activates under `--features ort-models`
- [ ] `MODELS.md` at repo root - full model inventory, CI tracking table, training sequence
- [ ] `docs/colab_export_template.ipynb` exists with all 6 cells
- [ ] `Dioxus.toml` at repo root
### HuggingFace
- [ ] `Ashiedu/Synesthesia` README updated - full model inventory, correct runtime strategy
- [ ] `manifest.json` pushed to HF with all 23 model entries
- [ ] Per-model README stub exists in every model subdirectory on HF
### Do Not Do
- [ ] Do not delete or rename any existing file from the working template
- [ ] Do not change `resolver = "2"` in the workspace root
- [ ] Do not change any existing dep versions - only add new ones
- [ ] Do not add `tokio` to `[dependencies]` in `api` - native `cfg` block only
- [ ] Do not add logic to stub components - structure + tokens only
- [ ] Do not start T-002 until all T-001 criteria are green
---
## Blocker Protocol
If any criterion cannot be met:
```markdown
## Blocker - T-001[a/b/c/d/e/f] - [date]
**Subtask:** T-001x
**Criterion failing:** [verbatim]
**Root cause:** [what happened]
**Attempted fixes:** [what was tried]
**Decision needed:** [specific question for the human]
```
Set `status:blocked` and remove `status:in-progress`. Do not guess or push forward.
---
## Subtask Order for Jules
Jules should work through the subtasks in this order. Complete and verify each
before starting the next - do not parallelize across subtasks.
```
T-001a → T-001b → T-001c → T-001d → T-001e → T-001f
UI shell  Tokens   Layout   Deps     HF repo   Colab
```
T-001e (HF repo) requires Python + `huggingface_hub` and a valid `HF_TOKEN`
in the environment. If these are not available locally, complete T-001a-d first
and flag T-001e/f as requiring Colab or a machine with HF credentials.
---
## GitHub CLI
```bash
# Create labels (safe to run if they already exist)
gh label create "stack:dioxus" --color "#DEA584" --description "Dioxus UI framework" 2>/dev/null || true
gh label create "stack:tailwind" --color "#38BDF8" --description "Tailwind CSS" 2>/dev/null || true
gh label create "stack:ml" --color "#A78BFA" --description "ML / model pipeline" 2>/dev/null || true
gh label create "agent:human-led" --color "#F9A825" --description "Requires human review before merge" 2>/dev/null || true
gh label create "status:ready" --color "#0075CA" --description "Ready to start" 2>/dev/null || true
gh label create "status:blocked" --color "#E4E669" --description "Blocked, needs human input" 2>/dev/null || true
gh label create "day:1" --color "#C5DEF5" --description "Day 1 tasks" 2>/dev/null || true

gh issue create \
  --title "T-001: Bootstrap Synesthesia - Dioxus fullstack, Tailwind v4, Kansas UI, full ML pipeline" \
  --label "type:task,stack:rust,stack:dioxus,stack:tailwind,stack:ml,agent:human-led,priority:critical,status:ready,day:1" \
  --body-file T-001.md
```
---
**Blocks:** T-002, T-003, T-004 and all subsequent tasks
**Blocked By:** none
**Version:** v0.1
**Iteration:** iter-1
**Effort:** L
**Subtasks:** T-001a · T-001b · T-001c · T-001d · T-001e · T-001f