	# T-006: `synesthesia-core` Crate Scaffold — Shared Constants, Types, Module Structure

	Type: Task
	Phase: 0 — Foundation
	Autonomy: `agent:autonomous` — Pure Rust library, no external systems, no decisions required.
	Stack: `stack:rust`
	Version: v0.1
	Iteration: iter-1
	Effort: XS (1 hour)

	---

	> ⚠️ Agent Scope: Populate `crates/synesthesia-core/` with the canonical shared constants, a handful of foundational types, and stub module declarations. Do not implement any audio processing, model inference, or IPC logic here — those live in `kansas/` and `plugin/`. This crate is a library. Its job in v0.1 is to be the single source of truth for values that both targets must agree on.

	---

	## Context

	T-001 created `crates/synesthesia-core/src/lib.rs` as a comment-only stub. T-005 added it as a dep to `plugin/`. T-006 gives it actual content.

	The core problem this crate solves: `kansas/` and `plugin/` are separate compilation units. If `kansas/` defines `SAMPLE_RATE = 48_000` locally and `plugin/` keeps its own copy, nothing forces the two values to stay equal, and an edit to one silently leaves the other stale. `synesthesia-core` is the single place these constants live: both crates import from here, so a change propagates everywhere.

	Beyond constants, this crate will eventually hold the generator traits, shared note event types, and latent vector types that the plugin's `process()` loop and the Tauri app's model inference loop both need to speak. T-006 lays the module skeleton so future tasks know where to add their types.

	---

	## Prerequisites

	- [ ] T-001 merged — `crates/synesthesia-core/` exists with empty `lib.rs`
	- [ ] `cargo check --workspace` passes before starting

	---

	## Acceptance Criteria

	- [ ] `cargo check --workspace` — zero errors
	- [ ] `cargo clippy --workspace -- -D warnings` — zero warnings
	- [ ] `SAMPLE_RATE`, `CHANNELS`, `PPQN` are publicly exported from `synesthesia_core::audio`
	- [ ] `NoteEvent`, `PitchClass` are publicly exported from `synesthesia_core::music`
	- [ ] `LatentVector`, `MoodDescriptor` are publicly exported from `synesthesia_core::models`
	- [ ] All types derive `Debug`, `Clone`, `PartialEq`; small plain-data types (`NoteEvent`, `PitchClass`, `GenerationMode`) also derive `Copy`
	- [ ] `pub use` re-exports at crate root mean `synesthesia_core::SAMPLE_RATE` works without importing submodules
	- [ ] No `#[allow(dead_code)]` suppressions; if something is unused, its doc comment names the T-NNN task that will use it
	- [ ] `kansas/src/audio/engine.rs` local `SAMPLE_RATE` and `CHANNELS` constants are replaced with imports from this crate
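
	The re-export criterion can be checked mechanically. The sketch below is an illustration only: an inline `synesthesia_core` module stands in for the real crate so the block compiles on its own, and `reexports_match` is a hypothetical helper name. It shows that a `pub use` at the crate root makes the root path and the submodule path name the same item.

	```rust
	// Inline stand-in for the real crate; in the workspace this is `crates/synesthesia-core`.
	mod synesthesia_core {
	    pub mod audio {
	        pub const SAMPLE_RATE: u32 = 48_000;
	        pub const CHANNELS: u16 = 2;
	        pub const PPQN: u32 = 96;
	    }
	    // Root-level re-exports, mirroring the `pub use` lines in `lib.rs`.
	    pub use self::audio::{CHANNELS, PPQN, SAMPLE_RATE};
	}

	/// Hypothetical check: every root path resolves to the same value as its submodule path.
	fn reexports_match() -> bool {
	    synesthesia_core::SAMPLE_RATE == synesthesia_core::audio::SAMPLE_RATE
	        && synesthesia_core::CHANNELS == synesthesia_core::audio::CHANNELS
	        && synesthesia_core::PPQN == synesthesia_core::audio::PPQN
	}
	```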

	---

	## `crates/synesthesia-core/Cargo.toml`

	```toml
	[package]
	name = "synesthesia-core"
	version = "0.1.0"
	edition.workspace = true

	[dependencies]
	# T-006 has no runtime deps — pure Rust types and constants only.
	# serde added here when T-046+ needs to serialize model inputs across the IPC boundary.
	```

	---

	## File Structure After This Task

	```
	crates/synesthesia-core/src/
	├── lib.rs ← module declarations + pub use re-exports
	├── audio.rs ← SAMPLE_RATE, CHANNELS, PPQN, DEFAULT_BUFFER_FRAMES, BPM_MIN, BPM_MAX, tick_interval_us()
	├── music.rs ← NoteEvent, PitchClass, PITCH_CLASS_NAMES, pitch_to_freq()
	└── models.rs ← LatentVector, MoodDescriptor, GenerationMode
	```

	---

	## Implementation

	### `crates/synesthesia-core/src/lib.rs`

	```rust
	//! # synesthesia-core
	//!
	//! Shared constants, types, and traits used by both the Tauri standalone app
	//! (`kansas/`) and the nih-plug VST3/CLAP plugin (`plugin/`).
	//!
	//! ## Module layout
	//!
	//! - [`audio`] — sample rate, channel count, buffer constants
	//! - [`music`] — note events, pitch classes, music theory primitives
	//! - [`models`] — shared types for ML model inputs and outputs

	pub mod audio;
	pub mod music;
	pub mod models;

	// ── Convenience re-exports from crate root ────────────────────────────────────
	// Allows `synesthesia_core::SAMPLE_RATE` without importing submodules.

	pub use audio::{CHANNELS, PPQN, SAMPLE_RATE};
	pub use models::{GenerationMode, LatentVector, MoodDescriptor};
	pub use music::{NoteEvent, PitchClass, PITCH_CLASS_NAMES};
	```

	### `crates/synesthesia-core/src/audio.rs`

	```rust
	//! Audio system constants shared between the standalone app and the plugin.
	//!
	//! These values are the single source of truth. Do not redeclare them locally
	//! in `kansas/` or `plugin/` — import from here.

	/// System sample rate in Hz. Chosen to match Magenta RT's SpectroStream codec (48 kHz stereo).
	/// All CPAL streams (T-004) and nih-plug processing (T-005+) target this rate.
	pub const SAMPLE_RATE: u32 = 48_000;

	/// Number of audio channels. Stereo throughout the system.
	pub const CHANNELS: u16 = 2;

	/// Pulses per quarter note used by the transport clock (T-003) and Perf RNN tick
	/// scheduling (T-063). 96 is a common sequencer resolution and a multiple of the
	/// 24 PPQN MIDI beat clock.
	pub const PPQN: u32 = 96;

	/// Default audio output buffer size in frames at SAMPLE_RATE.
	/// 512 frames @ 48 kHz ≈ 10.67 ms. The VST plugin inherits the DAW's buffer size instead.
	pub const DEFAULT_BUFFER_FRAMES: u32 = 512;

	/// Minimum supported BPM for the transport clock.
	pub const BPM_MIN: f32 = 20.0;
	/// Maximum supported BPM for the transport clock.
	pub const BPM_MAX: f32 = 300.0;

	/// Compute tick interval in microseconds for a given BPM.
	///
	/// # Example
	/// ```
	/// use synesthesia_core::audio::tick_interval_us;
	/// let us = tick_interval_us(128.0); // ≈ 4883 μs at 128 BPM
	/// assert!((us - 4882.8).abs() < 1.0);
	/// ```
	#[inline]
	pub fn tick_interval_us(bpm: f32) -> f32 {
	    60_000_000.0 / bpm / PPQN as f32
	}
	```
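
	For scheduling inside an audio callback, the tick interval is often more useful in frames than in microseconds. A minimal sketch of that arithmetic, assuming the constant values above (copied locally so the block stands alone); `frames_per_tick` and `buffer_duration_ms` are hypothetical helper names, not part of the crate:

	```rust
	// Values copied from `synesthesia_core::audio` so the sketch compiles on its own.
	const SAMPLE_RATE: u32 = 48_000;
	const PPQN: u32 = 96;

	/// Hypothetical helper: audio frames spanned by one transport tick at `bpm`.
	/// frames/tick = (frames/second * seconds/minute) / (beats/minute * ticks/beat)
	fn frames_per_tick(bpm: f32) -> f32 {
	    SAMPLE_RATE as f32 * 60.0 / (bpm * PPQN as f32)
	}

	/// Hypothetical helper: duration of a buffer of `frames` frames, in milliseconds.
	fn buffer_duration_ms(frames: u32) -> f32 {
	    frames as f32 * 1000.0 / SAMPLE_RATE as f32
	}
	```

	At 120 BPM one tick spans 250 frames, so a 512-frame buffer (about 10.67 ms) covers just over two ticks.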

	### `crates/synesthesia-core/src/music.rs`

	```rust
	//! Music theory primitives shared between the generative engines and the UI.

	/// A single MIDI-style note event produced by Performance RNN (T-062) and
	/// consumed by the MIDI output (T-068) and nih-plug plugin (T-069).
	#[derive(Debug, Clone, Copy, PartialEq, Eq)]
	pub struct NoteEvent {
	    /// MIDI pitch 0–127. Middle C = 60.
	    pub pitch: u8,
	    /// MIDI velocity 1–127. 0 = note-off.
	    pub velocity: u8,
	    /// MIDI channel 0–15.
	    pub channel: u8,
	    /// Position in transport ticks (PPQN resolution) from session start.
	    pub tick: u64,
	    /// Duration in ticks. 0 = instantaneous (gate closed immediately).
	    pub duration_ticks: u32,
	}

	impl NoteEvent {
	    /// True if this is a note-off event (velocity == 0).
	    #[inline]
	    pub fn is_note_off(&self) -> bool {
	        self.velocity == 0
	    }

	    /// Convert pitch to frequency in Hz using equal temperament (A4 = 440 Hz).
	    #[inline]
	    pub fn frequency(&self) -> f32 {
	        pitch_to_freq(self.pitch)
	    }
	}

	/// The 12 pitch classes of the chromatic scale.
	/// Used by the 12-dimensional pitch intelligence radar (UI, T-058)
	/// and Steam Audio spatial positioning (T-100).
	#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash, PartialOrd, Ord)]
	#[repr(u8)]
	pub enum PitchClass {
	    C = 0, Cs, D, Ds, E, F, Fs, G, Gs, A, As, B,
	}

	impl PitchClass {
	    /// Extract pitch class from a MIDI pitch number.
	    #[inline]
	    pub fn from_midi(pitch: u8) -> Self {
	        match pitch % 12 {
	            0 => Self::C, 1 => Self::Cs, 2 => Self::D,
	            3 => Self::Ds, 4 => Self::E, 5 => Self::F,
	            6 => Self::Fs, 7 => Self::G, 8 => Self::Gs,
	            9 => Self::A, 10 => Self::As, _ => Self::B,
	        }
	    }

	    /// Index 0–11.
	    #[inline]
	    pub fn index(self) -> usize {
	        self as usize
	    }
	}

	/// Human-readable pitch class names aligned with `PitchClass` index.
	/// `PITCH_CLASS_NAMES[PitchClass::Fs as usize]` == `"F#"`.
	pub const PITCH_CLASS_NAMES: [&str; 12] = [
	    "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B",
	];

	/// Convert MIDI pitch to frequency in Hz (equal temperament, A4 = 440 Hz).
	#[inline]
	pub fn pitch_to_freq(pitch: u8) -> f32 {
	    440.0 * 2.0_f32.powf((pitch as f32 - 69.0) / 12.0)
	}

	#[cfg(test)]
	mod tests {
	    use super::*;

	    #[test]
	    fn middle_c_freq() {
	        let f = pitch_to_freq(60);
	        assert!((f - 261.63).abs() < 0.1, "Middle C should be ~261.63 Hz, got {f}");
	    }

	    #[test]
	    fn a4_freq() {
	        let f = pitch_to_freq(69);
	        assert!((f - 440.0).abs() < 0.01, "A4 should be 440 Hz, got {f}");
	    }

	    #[test]
	    fn pitch_class_from_midi() {
	        assert_eq!(PitchClass::from_midi(60), PitchClass::C);
	        assert_eq!(PitchClass::from_midi(69), PitchClass::A);
	        assert_eq!(PitchClass::from_midi(127), PitchClass::G);
	    }
	}
	```
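
	One way the pitch-class index could feed a 12-bin display, such as the radar mentioned in the `PitchClass` doc comment, is a simple histogram over MIDI pitches. This is a sketch under that assumption; `pitch_class_histogram` is a hypothetical helper, and the names table is copied locally so the block compiles on its own:

	```rust
	// Copied from `music.rs` so the sketch is self-contained.
	const PITCH_CLASS_NAMES: [&str; 12] = [
	    "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B",
	];

	/// Hypothetical helper: count how many of the given MIDI pitches fall in each pitch class.
	/// Bin `i` lines up with `PITCH_CLASS_NAMES[i]`.
	fn pitch_class_histogram(pitches: &[u8]) -> [u32; 12] {
	    let mut bins = [0u32; 12];
	    for &pitch in pitches {
	        // Same reduction as `PitchClass::from_midi`: octave is discarded by `% 12`.
	        bins[usize::from(pitch % 12)] += 1;
	    }
	    bins
	}
	```

	For the pitches `[60, 64, 67, 72]` (a C major triad with the root doubled an octave up), the C bin holds 2 and the E and G bins hold 1 each.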

	### `crates/synesthesia-core/src/models.rs`

	```rust
	//! Shared types for ML model inputs and outputs.
	//!
	//! These are the Rust-native equivalents of the IPC types in `kansas/src/ipc/types.rs`.
	//! The IPC types are serialized to JSON for the Tauri frontend.
	//! These types are used directly in Rust — no serialization overhead.

	/// MusicVAE 512-dimensional latent vector.
	/// Produced by the encoder (T-054) and consumed by the decoder (T-055)
	/// and Track Mixer latent navigation (T-057).
	#[derive(Debug, Clone, PartialEq)]
	pub struct LatentVector(pub [f32; 512]);

	impl LatentVector {
	    /// The zero vector, which corresponds to the VAE prior mean (silence/root).
	    pub fn zero() -> Self {
	        Self([0.0; 512])
	    }

	    /// Linear interpolation between two latent vectors.
	    /// `t` is clamped to `[0.0, 1.0]`; `t = 0.0` returns `self`, `t = 1.0` returns `other`.
	    pub fn lerp(&self, other: &Self, t: f32) -> Self {
	        let t = t.clamp(0.0, 1.0);
	        let mut out = [0.0f32; 512];
	        // Zipped iterators instead of `for i in 0..512`, which clippy's
	        // `needless_range_loop` lint can flag under `-D warnings`.
	        for (o, (a, b)) in out.iter_mut().zip(self.0.iter().zip(other.0.iter())) {
	            *o = a * (1.0 - t) + b * t;
	        }
	        Self(out)
	    }
	}

	/// Semantic mood state output from Gemma-3N visual analysis (T-052).
	/// Rust-native mirror of `MoodState` in `kansas/src/ipc/types.rs`.
	#[derive(Debug, Clone, PartialEq)]
	pub struct MoodDescriptor {
	    /// Natural language mood label e.g. "mysterious", "energetic", "serene".
	    pub mood: String,
	    /// Energy level 0.0–1.0.
	    pub energy: f32,
	    /// Timbral texture label e.g. "dense", "sparse", "glitchy".
	    pub texture: String,
	}

	impl Default for MoodDescriptor {
	    fn default() -> Self {
	        Self {
	            mood: "neutral".into(),
	            energy: 0.5,
	            texture: "balanced".into(),
	        }
	    }
	}

	/// Which generation engine is active on a channel.
	/// Used by the plugin's process() loop and the standalone app's mix bus router.
	/// T-032 (Perf RNN VST), T-065 (standalone), T-089 (Magenta RT) set this per channel.
	#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
	pub enum GenerationMode {
	    /// No generation; the channel produces silence.
	    #[default]
	    Silent,
	    /// Performance RNN MIDI generation (T-062).
	    PerfRnn,
	    /// Magenta RT continuous audio generation (T-081).
	    MagentaRt,
	    /// DDSP timbre transfer on Perf RNN MIDI output (T-092).
	    Ddsp,
	    /// Audio passthrough, used by the T-005 scaffold and test paths.
	    Passthrough,
	}
	```
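
	`lerp` is the building block for walking between two points in latent space. The sketch below is an illustration, not part of T-006: it uses a const-generic array so it can be exercised with small dimensions, and `lerp_array` and `interpolation_path` are hypothetical names. It assumes a toolchain with `std::array::from_fn` (Rust 1.63 or later).

	```rust
	/// Element-wise linear interpolation; `t` is clamped to [0, 1].
	fn lerp_array<const N: usize>(a: &[f32; N], b: &[f32; N], t: f32) -> [f32; N] {
	    let t = t.clamp(0.0, 1.0);
	    std::array::from_fn(|i| a[i] * (1.0 - t) + b[i] * t)
	}

	/// Hypothetical helper: `steps` evenly spaced points from `a` to `b`, endpoints included.
	/// Returns an empty path for `steps == 0` and just `a` for `steps == 1`.
	fn interpolation_path<const N: usize>(
	    a: &[f32; N],
	    b: &[f32; N],
	    steps: usize,
	) -> Vec<[f32; N]> {
	    match steps {
	        0 => Vec::new(),
	        1 => vec![*a],
	        _ => (0..steps)
	            .map(|i| lerp_array(a, b, i as f32 / (steps - 1) as f32))
	            .collect(),
	    }
	}
	```

	With the real type, the same pattern would call `LatentVector::lerp` once per step; the const-generic form is used here only to keep the example small.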

	---

	## Update `kansas/src/audio/engine.rs`

	Once T-006 is merged, replace the local constants in `engine.rs` (added by T-004) with imports:

	```rust
	// In kansas/src/audio/engine.rs — REPLACE these local definitions:
	// const SAMPLE_RATE: u32 = 48_000;
	// const CHANNELS: u16 = 2;
	// const BUFFER_FRAMES: u32 = 512;

	// WITH:
	use synesthesia_core::audio::DEFAULT_BUFFER_FRAMES as BUFFER_FRAMES;
	use synesthesia_core::{CHANNELS, SAMPLE_RATE};
	```

	If T-004 and T-006 are merged in the same PR or T-006 merges first, apply this change directly. If T-004 merges before T-006, open a follow-up commit on `main` after T-006 merges.

	Similarly, replace the local `PPQN` constant in `kansas/src/commands/transport.rs` (T-003):

	```rust
	// REPLACE:
	// const PPQN: u32 = 96;

	// WITH:
	use synesthesia_core::PPQN;
	```
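
	After both migrations, consumer code reads these values only through the shared crate. A minimal sketch of what that looks like from the `kansas/` side, with an inline module standing in for the real crate so the block compiles on its own; `interleaved_len` and `ticks_per_bar` are hypothetical helpers used purely for illustration:

	```rust
	// Inline stand-in for `synesthesia-core`; the real values live in `audio.rs`.
	mod synesthesia_core {
	    pub mod audio {
	        pub const CHANNELS: u16 = 2;
	        pub const PPQN: u32 = 96;
	        pub const DEFAULT_BUFFER_FRAMES: u32 = 512;
	    }
	    pub use self::audio::{CHANNELS, PPQN};
	}

	// Same import shape as the migrated `engine.rs` and `transport.rs`.
	use synesthesia_core::audio::DEFAULT_BUFFER_FRAMES as BUFFER_FRAMES;
	use synesthesia_core::{CHANNELS, PPQN};

	/// Hypothetical helper: samples in one interleaved buffer (frames × channels).
	fn interleaved_len() -> usize {
	    BUFFER_FRAMES as usize * usize::from(CHANNELS)
	}

	/// Hypothetical helper: transport ticks in one bar of `beats_per_bar` quarter notes.
	fn ticks_per_bar(beats_per_bar: u32) -> u32 {
	    PPQN * beats_per_bar
	}
	```

	Because neither helper names a literal, changing a value in `audio.rs` changes both results without touching consumer code.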

	---

	## Testing

	### Automated
	- [ ] `cargo check --workspace` — zero errors
	- [ ] `cargo clippy --workspace -- -D warnings` — zero warnings
	- [ ] `cargo test -p synesthesia-core` — `middle_c_freq`, `a4_freq`, `pitch_class_from_midi` all pass

	### Manual
	```bash
	# Verify both consumers compile against it
	cargo check -p kansas
	cargo check -p synesthesia-plugin
	```

	---

	## GitHub CLI

	```bash
	gh issue create \
	--title "T-006: synesthesia-core crate scaffold — shared constants, types, module structure" \
	--label "type:task,stack:rust,agent:autonomous,priority:high,status:ready,day:1" \
	--body-file T-006.md
	```

	---

	Parent: GENESIS
	Blocks: T-003 constant migration (`PPQN`), T-004 constant migration (`SAMPLE_RATE`, `CHANNELS`), T-057 (`LatentVector` lerp), T-062 (`NoteEvent` from Perf RNN), T-068 (`NoteEvent` to MIDI out), T-100 (`PitchClass` icosahedron positions)
	Blocked By: T-001
	Version: v0.1 · Iteration: iter-1 · Effort: XS (1 hour)