# ๐Ÿ—๏ธ Runtime API: Music (Jazz)
The Runtime API defines the interface between the orchestration layer (Genkit) and the music engines (Resonance).
## ๐ŸŽ›๏ธ **Music Steering Spec**
All generation functions accept a `MusicSteeringSpec` as their primary steering input.
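The spec's shape, as exercised by the wiring example later in this document, can be sketched as the following TypeScript interfaces. This is an inferred sketch, not the canonical definition: any field not shown in the wiring example (e.g. optionality markers) is an assumption.

```typescript
// Hypothetical shape inferred from the wiring example; the canonical
// definition lives with the Runtime API types.
interface StyleSpec {
  genre: string;
  era?: string; // assumed optional
}

interface EmotionSpec {
  // Assumed to be normalized to the 0..1 range, as in the example.
  energy: number;
  tension: number;
  warmth: number;
  darkness: number;
}

interface MusicSteeringSpec {
  style: StyleSpec;
  emotion: EmotionSpec;
  intentions?: string; // free-text steering prompt
}
```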
### **1. Symbolic Music (Path A)**
#### `generate_midi(steering: MusicSteeringSpec) -> MidiBundle`
- **Purpose**: Generates a multi-track symbolic (MIDI) representation.
- **Engines**: Performance RNN, MusicVAE.
- **Output**: `MidiBundle` (includes tracks, metadata, and optional DDSP URL).
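The fields named above (tracks, metadata, optional DDSP URL) suggest a bundle shape roughly like the sketch below. The note- and track-level fields are illustrative assumptions, not the published schema.

```typescript
// Hypothetical MidiBundle shape; only the three top-level fields are
// taken from the description above, the rest is assumed.
interface MidiNote {
  pitch: number;     // MIDI note number, 0-127
  startTime: number; // seconds from track start
  duration: number;  // seconds
  velocity: number;  // 0-127
}

interface MidiTrack {
  name: string;
  notes: MidiNote[];
}

interface MidiBundle {
  tracks: MidiTrack[];
  metadata: Record<string, string>;
  ddspUrl?: string; // set when a DDSP timbre render is available
}
```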
### **2. Audio Music (Path B)**
#### `generate_audio(steering: MusicSteeringSpec, midi?: MidiBundle) -> AudioStems`
- **Purpose**: Generates a full audio stream or multi-track audio stems.
- **Engines**: Magenta RT, DDSP (if `midi` is provided).
- **Output**: `AudioStems` (includes audio streams and control curves).
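Likewise, the `AudioStems` output (audio streams plus control curves) can be sketched as below. Stream and curve fields are illustrative assumptions.

```typescript
// Hypothetical AudioStems shape; only "audio streams" and "control
// curves" come from the description above, the field names are assumed.
interface AudioStream {
  name: string; // e.g. "bass", "drums", or "mix"
  url: string;  // location of the rendered audio
}

interface ControlCurve {
  name: string;       // e.g. "loudness" or "f0"
  values: number[];   // sampled control signal
  sampleRateHz: number;
}

interface AudioStems {
  streams: AudioStream[];
  controlCurves: ControlCurve[];
}
```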
---
## 🧭 **Mental Model Routing**
- **Lyria Camera**: Focuses on `generate_audio` (Magenta RT) for reactive audio clips.
- **Infinite Crate**: Focuses on `generate_midi` (MusicVAE) for slow evolution + `generate_audio` (DDSP) for timbre synthesis.
- **Lyria Studio**: Uses both paths extensively, often starting with `generate_midi` for composition followed by `generate_audio` for stems.
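The routing table above can be sketched as a dispatch helper. All names here are illustrative, not part of the published API, and the engines are injected as parameters so the sketch stands alone.

```typescript
// Placeholder types so the sketch is self-contained; the real
// definitions live with the Runtime API.
type MusicSteeringSpec = { intentions?: string };
type MidiBundle = { tracks: unknown[] };
type AudioStems = { stems: unknown[] };

type MentalModel = "lyria-camera" | "infinite-crate" | "lyria-studio";

async function routeByMentalModel(
  model: MentalModel,
  spec: MusicSteeringSpec,
  generateMidi: (s: MusicSteeringSpec) => Promise<MidiBundle>,
  generateAudio: (s: MusicSteeringSpec, midi?: MidiBundle) => Promise<AudioStems>,
): Promise<AudioStems> {
  switch (model) {
    case "lyria-camera":
      // Reactive audio clips: Path B only (Magenta RT).
      return generateAudio(spec);
    case "infinite-crate":
    case "lyria-studio": {
      // Compose symbolically first (Path A), then render audio (Path B).
      const midi = await generateMidi(spec);
      return generateAudio(spec, midi);
    }
  }
}
```

Injecting the generators keeps the router free of engine bindings; in the real orchestrator these would be the Genkit-bound `generate_midi` / `generate_audio` calls.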
## ๐Ÿ› ๏ธ **Wiring Examples**
### **1. TypeScript Orchestrator (Genkit)**
```typescript
const spec: MusicSteeringSpec = {
  style: { genre: "jazz", era: "bebop" },
  emotion: { energy: 0.8, tension: 0.4, warmth: 0.6, darkness: 0.2 },
  intentions: "Generate a bebop-style jazz session with high energy.",
};

// Path A: symbolic composition, then Path B: audio rendered from the MIDI.
const midiBundle = await generate_midi(spec);
const audioStems = await generate_audio(spec, midiBundle);
```
### **2. C++ Native Core (Resonance)**
- Receives JSON-serialized `MusicSteeringSpec`.
- Loads corresponding ONNX models via WinML/DirectML.
- Executes inference and returns `MidiBundle` or `AudioStems`.
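On the orchestrator side, the JSON handoff to the native core can be sketched as below. The envelope fields (`kind`) and the helper name are assumptions; only the serialized `MusicSteeringSpec` itself is required by the description above.

```typescript
// Minimal local type so the sketch is self-contained.
type MusicSteeringSpec = {
  style: { genre: string; era?: string };
  emotion: { energy: number; tension: number; warmth: number; darkness: number };
  intentions?: string;
};

// Hypothetical serialization helper: wraps the spec in a small envelope
// before it crosses the TypeScript -> C++ boundary.
function toNativePayload(spec: MusicSteeringSpec): string {
  return JSON.stringify({ kind: "MusicSteeringSpec", spec });
}
```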