# T-006: `synesthesia-core` Crate Scaffold - Shared Constants, Types, Module Structure

**Type:** Task
**Phase:** 0 (Foundation)
**Autonomy:** `agent:autonomous`. Pure Rust library, no external systems, no decisions required.
**Stack:** `stack:rust`
**Version:** v0.1
**Iteration:** iter-1
**Effort:** XS (1 hour)

---

> ⚠️ **Agent Scope:** Populate `crates/synesthesia-core/` with the canonical shared constants, a handful of foundational types, and stub module declarations. Do **not** implement any audio processing, model inference, or IPC logic here; those live in `kansas/` and `plugin/`. This crate is a library. Its job in v0.1 is to be the single source of truth for values that both targets must agree on.

---

## Context

T-001 created `crates/synesthesia-core/src/lib.rs` as a comment-only stub. T-005 added the crate as a dependency of `plugin/`. T-006 gives it actual content.

The core problem this crate solves: `kansas/` and `plugin/` are separate compilation units. If `kansas/` defines `SAMPLE_RATE = 48_000` locally and `plugin/` defines it separately, nothing keeps the two values in sync and they will eventually diverge. `synesthesia-core` is the single place these constants live: both crates import from here, so a change propagates everywhere.

Beyond constants, this crate will eventually hold the generator traits, shared note event types, and latent vector types that the plugin's `process()` loop and the Tauri app's model inference loop both need to share. T-006 lays the module skeleton so future tasks know where to add their types.

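
The single-source-of-truth idea can be sketched in one file. This is only an illustration: the inline `synesthesia_core` module stands in for the real crate, and `kansas_side`, `plugin_side`, `samples_per_second`, and `frames_for_ms` are hypothetical names that do not appear in the task.

```rust
// Stand-in for the real `synesthesia_core` crate, inlined so this sketch
// compiles as a single file.
mod synesthesia_core {
    pub mod audio {
        pub const SAMPLE_RATE: u32 = 48_000;
        pub const CHANNELS: u16 = 2;
    }
    // Crate-root re-exports, mirroring the `pub use` lines planned for lib.rs.
    pub use self::audio::{CHANNELS, SAMPLE_RATE};
}

// Hypothetical consumer on the `kansas/` side: imports from the crate root.
mod kansas_side {
    use super::synesthesia_core::{CHANNELS, SAMPLE_RATE};

    /// Interleaved samples in one second of audio.
    pub fn samples_per_second() -> u32 {
        SAMPLE_RATE * u32::from(CHANNELS)
    }
}

// Hypothetical consumer on the `plugin/` side: imports from the submodule.
mod plugin_side {
    use super::synesthesia_core::audio::SAMPLE_RATE;

    /// Whole frames contained in `ms` milliseconds of audio.
    pub fn frames_for_ms(ms: u32) -> u32 {
        SAMPLE_RATE * ms / 1_000
    }
}

fn main() {
    println!("samples/s: {}", kansas_side::samples_per_second());
    println!("frames in 10 ms: {}", plugin_side::frames_for_ms(10));
}
```

Both consumers read the same `SAMPLE_RATE`, so editing the one definition changes both results together.
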

---

## Prerequisites

- [ ] T-001 merged: `crates/synesthesia-core/` exists with a comment-only `lib.rs` stub
- [ ] `cargo check --workspace` passes before starting

---

## Acceptance Criteria

- [ ] `cargo check --workspace`: zero errors
- [ ] `cargo clippy --workspace -- -D warnings`: zero warnings
- [ ] `SAMPLE_RATE`, `CHANNELS`, `PPQN` are publicly exported from `synesthesia_core::audio`
- [ ] `NoteEvent`, `PitchClass` are publicly exported from `synesthesia_core::music`
- [ ] `LatentVector`, `MoodDescriptor` are publicly exported from `synesthesia_core::models`
- [ ] All types derive `Debug`, `Clone`, `PartialEq`; small value types (`NoteEvent`, `PitchClass`, `GenerationMode`) also derive `Copy`; `LatentVector` wraps a 2 KiB array and derives only the first three
- [ ] `pub use` re-exports at the crate root mean `synesthesia_core::SAMPLE_RATE` works without importing submodules
- [ ] No `#[allow(dead_code)]` suppressions: if something is unused, its doc comment names the T-NNN task that will use it
- [ ] `kansas/src/audio/engine.rs` local `SAMPLE_RATE` and `CHANNELS` constants are replaced with imports from this crate

---

## `crates/synesthesia-core/Cargo.toml`

```toml
[package]
name = "synesthesia-core"
version = "0.1.0"
edition.workspace = true

[dependencies]
# T-006 has no runtime deps: pure Rust types and constants only.
# serde will be added here when T-046+ needs to serialize model inputs across the IPC boundary.
```

---

## File Structure After This Task

```
crates/synesthesia-core/src/
├── lib.rs      ← module declarations + pub use re-exports
├── audio.rs    ← SAMPLE_RATE, CHANNELS, PPQN, DEFAULT_BUFFER_FRAMES, BPM_MIN, BPM_MAX, tick_interval_us()
├── music.rs    ← NoteEvent, PitchClass, PITCH_CLASS_NAMES, pitch_to_freq()
└── models.rs   ← LatentVector, MoodDescriptor, GenerationMode
```

---

## Implementation

### `crates/synesthesia-core/src/lib.rs`

```rust
//! # synesthesia-core
//!
//! Shared constants, types, and traits used by both the Tauri standalone app
//! (`kansas/`) and the nih-plug VST3/CLAP plugin (`plugin/`).
//!
//! ## Module layout
//!
//! - [`audio`] - sample rate, channel count, buffer constants
//! - [`music`] - note events, pitch classes, music theory primitives
//! - [`models`] - shared types for ML model inputs and outputs

pub mod audio;
pub mod models;
pub mod music;

// ── Convenience re-exports from crate root ──────────────────────────────────
// Allows `synesthesia_core::SAMPLE_RATE` without importing submodules.
pub use audio::{CHANNELS, PPQN, SAMPLE_RATE};
pub use models::{GenerationMode, LatentVector, MoodDescriptor};
pub use music::{NoteEvent, PitchClass, PITCH_CLASS_NAMES};
```

### `crates/synesthesia-core/src/audio.rs`

```rust
//! Audio system constants shared between the standalone app and the plugin.
//!
//! These values are the single source of truth. Do not redeclare them locally
//! in `kansas/` or `plugin/`; import from here.

/// System sample rate in Hz. Chosen to match Magenta RT's SpectroStream codec (48 kHz stereo).
/// All CPAL streams (T-004) and nih-plug processing (T-005+) target this rate.
pub const SAMPLE_RATE: u32 = 48_000;

/// Number of audio channels. Stereo throughout the system.
pub const CHANNELS: u16 = 2;

/// Pulses per quarter note. 96 PPQN is a common MIDI sequencer resolution
/// (4x the 24 PPQN MIDI clock), used by the transport clock (T-003)
/// and Perf RNN tick scheduling (T-063).
pub const PPQN: u32 = 96;

/// Default audio output buffer size in frames at `SAMPLE_RATE`.
/// 512 frames @ 48 kHz ≈ 10.67 ms. The VST plugin inherits the DAW's buffer size instead.
pub const DEFAULT_BUFFER_FRAMES: u32 = 512;

/// Minimum supported BPM for the transport clock.
pub const BPM_MIN: f32 = 20.0;

/// Maximum supported BPM for the transport clock.
pub const BPM_MAX: f32 = 300.0;

/// Compute the tick interval in microseconds for a given BPM.
///
/// `bpm` is used as-is (no clamping to `BPM_MIN..=BPM_MAX`), so it must be non-zero.
///
/// # Example
/// ```
/// use synesthesia_core::audio::tick_interval_us;
/// let us = tick_interval_us(128.0); // ≈ 4883 μs at 128 BPM
/// assert!((us - 4882.8).abs() < 1.0);
/// ```
#[inline]
pub fn tick_interval_us(bpm: f32) -> f32 {
    60_000_000.0 / bpm / PPQN as f32
}
```
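
The timing arithmetic behind these constants can be checked with a small standalone sketch. The constants and `tick_interval_us` are copied from `audio.rs`; `buffer_duration_ms` and `frames_per_tick` are hypothetical helpers added for illustration only and are not part of the crate.

```rust
// Values copied from audio.rs so the sketch compiles on its own.
const SAMPLE_RATE: u32 = 48_000;
const PPQN: u32 = 96;

/// Same formula as `tick_interval_us` in audio.rs.
fn tick_interval_us(bpm: f32) -> f32 {
    60_000_000.0 / bpm / PPQN as f32
}

/// Hypothetical helper: duration of `frames` audio frames in milliseconds.
fn buffer_duration_ms(frames: u32) -> f32 {
    frames as f32 * 1_000.0 / SAMPLE_RATE as f32
}

/// Hypothetical helper: audio frames that elapse during one transport tick.
fn frames_per_tick(bpm: f32) -> f32 {
    tick_interval_us(bpm) * SAMPLE_RATE as f32 / 1_000_000.0
}

fn main() {
    println!("512 frames = {:.2} ms", buffer_duration_ms(512));
    println!("tick @ 120 BPM = {:.1} us", tick_interval_us(120.0));
    println!("frames per tick @ 120 BPM = {:.1}", frames_per_tick(120.0));
}
```

At 120 BPM a tick lasts 60,000,000 / 120 / 96 ≈ 5208.3 μs, which at 48 kHz is about 250 frames, so a 512-frame buffer spans roughly two ticks.
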

### `crates/synesthesia-core/src/music.rs`

```rust
//! Music theory primitives shared between the generative engines and the UI.

/// A single MIDI-style note event produced by Performance RNN (T-062) and
/// consumed by the MIDI output (T-068) and nih-plug plugin (T-069).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct NoteEvent {
    /// MIDI pitch 0-127. Middle C = 60.
    pub pitch: u8,
    /// MIDI velocity 1-127. 0 = note-off.
    pub velocity: u8,
    /// MIDI channel 0-15.
    pub channel: u8,
    /// Position in transport ticks (PPQN resolution) from session start.
    pub tick: u64,
    /// Duration in ticks. 0 = instantaneous (gate closed immediately).
    pub duration_ticks: u32,
}

impl NoteEvent {
    /// True if this is a note-off event (velocity == 0).
    #[inline]
    pub fn is_note_off(&self) -> bool {
        self.velocity == 0
    }

    /// Convert pitch to frequency in Hz using equal temperament (A4 = 440 Hz).
    #[inline]
    pub fn frequency(&self) -> f32 {
        pitch_to_freq(self.pitch)
    }
}

/// The 12 pitch classes of the chromatic scale.
/// Used by the 12-dimensional pitch intelligence radar (UI, T-058)
/// and Steam Audio spatial positioning (T-100).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
#[repr(u8)]
pub enum PitchClass {
    C = 0, Cs, D, Ds, E, F, Fs, G, Gs, A, As, B,
}

impl PitchClass {
    /// Extract pitch class from a MIDI pitch number.
    #[inline]
    pub fn from_midi(pitch: u8) -> Self {
        match pitch % 12 {
            0 => Self::C, 1 => Self::Cs, 2 => Self::D,
            3 => Self::Ds, 4 => Self::E, 5 => Self::F,
            6 => Self::Fs, 7 => Self::G, 8 => Self::Gs,
            9 => Self::A, 10 => Self::As, _ => Self::B,
        }
    }

    /// Index 0-11.
    #[inline]
    pub fn index(self) -> usize {
        self as usize
    }
}

/// Human-readable pitch class names aligned with `PitchClass` index.
/// `PITCH_CLASS_NAMES[PitchClass::Fs as usize]` == `"F#"`.
pub const PITCH_CLASS_NAMES: [&str; 12] = [
    "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B",
];

/// Convert MIDI pitch to frequency in Hz (equal temperament, A4 = 440 Hz).
#[inline]
pub fn pitch_to_freq(pitch: u8) -> f32 {
    440.0 * 2.0_f32.powf((pitch as f32 - 69.0) / 12.0)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn middle_c_freq() {
        let f = pitch_to_freq(60);
        assert!((f - 261.63).abs() < 0.1, "Middle C should be ~261.63 Hz, got {f}");
    }

    #[test]
    fn a4_freq() {
        let f = pitch_to_freq(69);
        assert!((f - 440.0).abs() < 0.01, "A4 should be 440 Hz, got {f}");
    }

    #[test]
    fn pitch_class_from_midi() {
        assert_eq!(PitchClass::from_midi(60), PitchClass::C);
        assert_eq!(PitchClass::from_midi(69), PitchClass::A);
        assert_eq!(PitchClass::from_midi(127), PitchClass::G);
    }
}
```
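
As an illustration of how these primitives might feed a 12-bin pitch display, the sketch below counts note-ons per pitch class. It is an assumption-laden example, not the T-058 design: `NoteEvent` is trimmed to the two fields the sketch reads, and `pitch_class_histogram` is a hypothetical helper.

```rust
/// Trimmed stand-in for `NoteEvent`: only the fields this sketch reads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct NoteEvent {
    pitch: u8,
    velocity: u8,
}

/// Same table as `PITCH_CLASS_NAMES` in music.rs.
const PITCH_CLASS_NAMES: [&str; 12] = [
    "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B",
];

/// Hypothetical helper: count note-on events per pitch class.
/// Events with velocity 0 are note-offs and are skipped.
fn pitch_class_histogram(events: &[NoteEvent]) -> [u32; 12] {
    let mut bins = [0u32; 12];
    for ev in events.iter().filter(|ev| ev.velocity > 0) {
        // `pitch % 12` is the same mapping `PitchClass::from_midi` uses.
        bins[usize::from(ev.pitch % 12)] += 1;
    }
    bins
}

fn main() {
    let events = [
        NoteEvent { pitch: 60, velocity: 100 }, // C4
        NoteEvent { pitch: 64, velocity: 90 },  // E4
        NoteEvent { pitch: 67, velocity: 90 },  // G4
        NoteEvent { pitch: 72, velocity: 80 },  // C5
        NoteEvent { pitch: 60, velocity: 0 },   // note-off, ignored
    ];
    let bins = pitch_class_histogram(&events);
    for (name, count) in PITCH_CLASS_NAMES.iter().zip(bins.iter()) {
        if *count > 0 {
            println!("{name}: {count}");
        }
    }
}
```
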

### `crates/synesthesia-core/src/models.rs`

```rust
//! Shared types for ML model inputs and outputs.
//!
//! These are the Rust-native equivalents of the IPC types in `kansas/src/ipc/types.rs`.
//! The IPC types are serialized to JSON for the Tauri frontend.
//! These types are used directly in Rust, with no serialization overhead.

/// MusicVAE 512-dimensional latent vector.
/// Produced by the encoder (T-054) and consumed by the decoder (T-055)
/// and Track Mixer latent navigation (T-057).
#[derive(Debug, Clone, PartialEq)]
pub struct LatentVector(pub [f32; 512]);

impl LatentVector {
    /// The zero vector, which corresponds to the VAE prior mean (silence/root).
    pub fn zero() -> Self {
        Self([0.0; 512])
    }

    /// Linear interpolation between two latent vectors.
    /// `t = 0.0` returns `self`, `t = 1.0` returns `other`; `t` is clamped to that range.
    pub fn lerp(&self, other: &Self, t: f32) -> Self {
        let t = t.clamp(0.0, 1.0);
        let mut out = [0.0f32; 512];
        // Iterate instead of indexing so `clippy::needless_range_loop` stays quiet
        // under `-D warnings`.
        for (o, (a, b)) in out.iter_mut().zip(self.0.iter().zip(other.0.iter())) {
            *o = a * (1.0 - t) + b * t;
        }
        Self(out)
    }
}

/// Semantic mood state output from Gemma-3N visual analysis (T-052).
/// Rust-native mirror of `MoodState` in `kansas/src/ipc/types.rs`.
#[derive(Debug, Clone, PartialEq)]
pub struct MoodDescriptor {
    /// Natural language mood label e.g. "mysterious", "energetic", "serene".
    pub mood: String,
    /// Energy level 0.0-1.0.
    pub energy: f32,
    /// Timbral texture label e.g. "dense", "sparse", "glitchy".
    pub texture: String,
}

impl Default for MoodDescriptor {
    fn default() -> Self {
        Self {
            mood: "neutral".into(),
            energy: 0.5,
            texture: "balanced".into(),
        }
    }
}

/// Which generation engine is active on a channel.
/// Used by the plugin's process() loop and the standalone app's mix bus router.
/// T-032 (Perf RNN VST), T-065 (standalone), T-089 (Magenta RT) set this per channel.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum GenerationMode {
    /// No generation: the channel produces silence.
    #[default]
    Silent,
    /// Performance RNN MIDI generation (T-062).
    PerfRnn,
    /// Magenta RT continuous audio generation (T-081).
    MagentaRt,
    /// DDSP timbre transfer on Perf RNN MIDI output (T-092).
    Ddsp,
    /// Audio passthrough, used by the T-005 scaffold and test paths.
    Passthrough,
}
```
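
To show how `lerp` might be used for latent navigation, here is a minimal standalone sketch that samples evenly spaced points between two vectors. The `LatentVector` stand-in, the `LATENT_DIM` constant, and the `lerp_path` helper are assumptions made for this example; none of them is defined by T-006 or T-057.

```rust
/// Assumed name for the vector length; the crate itself hard-codes 512.
const LATENT_DIM: usize = 512;

/// Stand-in for `LatentVector`, reduced to what the sketch needs.
#[derive(Debug, Clone, PartialEq)]
struct LatentVector([f32; LATENT_DIM]);

impl LatentVector {
    /// Linear interpolation with `t` clamped to 0.0..=1.0.
    fn lerp(&self, other: &Self, t: f32) -> Self {
        let t = t.clamp(0.0, 1.0);
        let mut out = [0.0f32; LATENT_DIM];
        for (o, (a, b)) in out.iter_mut().zip(self.0.iter().zip(other.0.iter())) {
            *o = a * (1.0 - t) + b * t;
        }
        Self(out)
    }
}

/// Hypothetical helper: `steps` evenly spaced points from `a` to `b`, inclusive.
/// `steps == 0` yields an empty path; `steps == 1` yields just `a`.
fn lerp_path(a: &LatentVector, b: &LatentVector, steps: usize) -> Vec<LatentVector> {
    // Divide by at least 1 so a single-step path does not divide by zero.
    let last = steps.saturating_sub(1).max(1) as f32;
    (0..steps).map(|i| a.lerp(b, i as f32 / last)).collect()
}

fn main() {
    let a = LatentVector([0.0; LATENT_DIM]);
    let b = LatentVector([1.0; LATENT_DIM]);
    for (i, v) in lerp_path(&a, &b, 5).iter().enumerate() {
        println!("step {i}: first component = {}", v.0[0]);
    }
}
```

Each step allocates a fresh 2 KiB array, which is fine for UI-rate navigation; whether it is acceptable inside an audio callback is a separate question this sketch does not address.
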

---

## Update `kansas/src/audio/engine.rs`

With the crate populated, replace the local constants in `engine.rs` (added by T-004) with imports. If `kansas/Cargo.toml` does not yet list `synesthesia-core` as a dependency (T-005 only added it to `plugin/`), add it first.

```rust
// In kansas/src/audio/engine.rs, REPLACE these local definitions:
// const SAMPLE_RATE: u32 = 48_000;
// const CHANNELS: u16 = 2;
// const BUFFER_FRAMES: u32 = 512;
// WITH:
use synesthesia_core::audio::DEFAULT_BUFFER_FRAMES as BUFFER_FRAMES;
use synesthesia_core::{CHANNELS, SAMPLE_RATE};
```

If T-004 and T-006 land in the same PR, or T-006 merges first, apply this change directly. If T-004 merges before T-006, make the replacement in a follow-up commit on `main` after T-006 merges.

Similarly, replace the local `PPQN` constant in `kansas/src/commands/transport.rs` (T-003):

```rust
// REPLACE:
// const PPQN: u32 = 96;
// WITH:
use synesthesia_core::PPQN;
```
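
The aliasing import can be tried in isolation. In the sketch below the inline `synesthesia_core` module stands in for the real crate, and the compile-time guard plus the `interleaved_len` and `frames_per_ms` helpers are illustrative assumptions, not code that T-004 is known to contain.

```rust
// Stand-in for the real crate so this compiles as one file.
mod synesthesia_core {
    pub mod audio {
        pub const SAMPLE_RATE: u32 = 48_000;
        pub const CHANNELS: u16 = 2;
        pub const DEFAULT_BUFFER_FRAMES: u32 = 512;
    }
    pub use self::audio::{CHANNELS, SAMPLE_RATE};
}

// The alias keeps existing `BUFFER_FRAMES` call sites unchanged.
use synesthesia_core::audio::DEFAULT_BUFFER_FRAMES as BUFFER_FRAMES;
use synesthesia_core::{CHANNELS, SAMPLE_RATE};

// Compile-time guard (an assumption of this sketch): the build fails if the
// shared buffer size is ever changed to a value that is not a power of two.
const _: () = assert!(BUFFER_FRAMES.is_power_of_two());

/// Hypothetical helper: length of one interleaved output buffer in samples.
fn interleaved_len() -> usize {
    BUFFER_FRAMES as usize * usize::from(CHANNELS)
}

/// Hypothetical helper: whole frames per millisecond at the shared rate.
fn frames_per_ms() -> u32 {
    SAMPLE_RATE / 1_000
}

fn main() {
    println!("interleaved buffer length: {}", interleaved_len());
    println!("frames per ms: {}", frames_per_ms());
}
```

A `const _` assertion like this is one way for a consumer to record what it assumes about a shared constant without redeclaring the value locally.
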

---

## Testing

### Automated

- [ ] `cargo check --workspace`: zero errors
- [ ] `cargo clippy --workspace -- -D warnings`: zero warnings
- [ ] `cargo test -p synesthesia-core`: `middle_c_freq`, `a4_freq`, `pitch_class_from_midi` all pass

### Manual

```bash
# Verify both consumers compile against it
cargo check -p kansas
cargo check -p synesthesia-plugin
```

---

## GitHub CLI

```bash
gh issue create \
  --title "T-006: synesthesia-core crate scaffold - shared constants, types, module structure" \
  --label "type:task,stack:rust,agent:autonomous,priority:high,status:ready,day:1" \
  --body-file T-006.md
```

---

**Parent:** GENESIS
**Blocks:** T-003 constant migration (`PPQN`), T-004 constant migration (`SAMPLE_RATE`, `CHANNELS`), T-057 (`LatentVector` lerp), T-062 (`NoteEvent` from Perf RNN), T-068 (`NoteEvent` to MIDI out), T-100 (`PitchClass` icosahedron positions)
**Blocked By:** T-001
**Version:** v0.1 · **Iteration:** iter-1 · **Effort:** XS (1 hour)