πŸ—οΈ Runtime API: Music (Jazz)

The Runtime API defines the interface between the Orchestration layer (Genkit) and the Music Engines (Resonance).

🎛️ Music Steering Spec

All generation functions accept a MusicSteeringSpec as their primary steering input.
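The shape below is a sketch of `MusicSteeringSpec`, inferred from the wiring example later in this document; it is not a normative schema, and the `clampEmotion` helper is purely illustrative.

```typescript
// Sketch of the MusicSteeringSpec shape, inferred from the wiring
// example in this document. Field names not shown there are assumptions.
interface MusicSteeringSpec {
  style: { genre: string; era?: string };
  emotion: { energy: number; tension: number; warmth: number; darkness: number };
  intentions: string;
}

// Hypothetical helper (not part of the spec): clamp emotion values
// into [0, 1] before handing the spec to an engine.
function clampEmotion(spec: MusicSteeringSpec): MusicSteeringSpec {
  const clamp = (x: number) => Math.min(1, Math.max(0, x));
  return {
    ...spec,
    emotion: {
      energy: clamp(spec.emotion.energy),
      tension: clamp(spec.emotion.tension),
      warmth: clamp(spec.emotion.warmth),
      darkness: clamp(spec.emotion.darkness),
    },
  };
}
```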

1. Symbolic Music (Path A)

generate_midi(steering: MusicSteeringSpec) -> MidiBundle

  • Purpose: Generates a multi-track symbolic (MIDI) representation.
  • Engines: Performance RNN, MusicVAE.
  • Output: MidiBundle (includes tracks, metadata, and optional DDSP URL).
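A minimal sketch of what `MidiBundle` might look like, based only on the fields this section names (tracks, metadata, optional DDSP URL); the inner track fields and the `trackNames` helper are assumptions for illustration.

```typescript
// Illustrative MidiBundle shape; only tracks, metadata, and the
// optional DDSP URL come from the document. Inner fields are assumed.
interface MidiTrack {
  name: string;
  notes: { pitch: number; start: number; duration: number; velocity: number }[];
}

interface MidiBundle {
  tracks: MidiTrack[];
  metadata: { tempoBpm: number; key?: string };
  ddspUrl?: string; // present when a DDSP render is available
}

// Hypothetical convenience accessor.
function trackNames(bundle: MidiBundle): string[] {
  return bundle.tracks.map((t) => t.name);
}
```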

2. Audio Music (Path B)

generate_audio(steering: MusicSteeringSpec, midi?: MidiBundle) -> AudioStems

  • Purpose: Generates a full audio stream or multi-track audio stems.
  • Engines: Magenta RT, DDSP (if midi is provided).
  • Output: AudioStems (includes audio streams and control curves).
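Similarly, a sketch of `AudioStems` built from the two fields this section names (audio streams and control curves); the per-stem and per-curve details, and the `findStem` helper, are assumptions.

```typescript
// Illustrative AudioStems shape; stems and control curves come from
// the document, their inner fields are assumed for the sketch.
interface ControlCurve {
  name: string;     // e.g. "f0" or "loudness"
  values: number[]; // sampled control signal
}

interface AudioStem {
  name: string; // e.g. "drums", "bass"
  url: string;  // where the rendered audio lives
}

interface AudioStems {
  stems: AudioStem[];
  controlCurves: ControlCurve[];
}

// Hypothetical helper: look up a stem by name.
function findStem(out: AudioStems, name: string): AudioStem | undefined {
  return out.stems.find((s) => s.name === name);
}
```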

🧭 Mental Model Routing

  • Lyria Camera: Focuses on generate_audio (Magenta RT) for reactive audio clips.
  • Infinite Crate: Focuses on generate_midi (MusicVAE) for slowly evolving material, plus generate_audio (DDSP) for timbre synthesis.
  • Lyria Studio: Uses both paths extensively, often starting with generate_midi for composition followed by generate_audio for stems.

🛠️ Wiring Examples

1. TypeScript Orchestrator (Genkit)

```ts
const spec: MusicSteeringSpec = {
  style: { genre: "jazz", era: "bebop" },
  emotion: { energy: 0.8, tension: 0.4, warmth: 0.6, darkness: 0.2 },
  intentions: "Generate a bebop-style jazz session with high energy.",
};

const midiBundle = await generate_midi(spec);
const audioStems = await generate_audio(spec, midiBundle);
```

2. C++ Native Core (Resonance)

  • Receives JSON-serialized MusicSteeringSpec.
  • Loads corresponding ONNX models via WinML/DirectML.
  • Executes inference and returns MidiBundle or AudioStems.
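The bullets above describe a JSON-serialized `MusicSteeringSpec` crossing the TypeScript-to-native boundary. Below is a sketch of the orchestrator side of that contract, reusing the spec shape from the wiring example; `serializeSpec` and `roundTrip` are hypothetical names, and the transport itself (IPC, FFI, etc.) is out of scope.

```typescript
// Sketch of the TS -> C++ boundary: the spec is serialized to JSON
// before crossing into the native core. The shape follows the wiring
// example earlier in this document.
interface MusicSteeringSpec {
  style: { genre: string; era?: string };
  emotion: { energy: number; tension: number; warmth: number; darkness: number };
  intentions: string;
}

function serializeSpec(spec: MusicSteeringSpec): string {
  return JSON.stringify(spec);
}

// The native core parses the payload into its own struct; a JSON
// round-trip approximates that contract check on the TS side.
function roundTrip(spec: MusicSteeringSpec): MusicSteeringSpec {
  return JSON.parse(serializeSpec(spec)) as MusicSteeringSpec;
}
```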