CodeT5+ Strudel Music Generator 🎵🎛️

An encoder-decoder transformer fine-tuned to generate Algorave music patterns in Strudel syntax from natural language descriptions of genre, mood, and musical intent.

Model Description

The model takes a natural language description such as "dark techno beat with heavy kick and industrial percussion" and generates valid Strudel code (Strudel is a JavaScript port of the TidalCycles pattern language) that can be pasted directly into strudel.cc to play the pattern.

Base Model: Salesforce/codet5p-220m (CodeT5+, 222M params)
Architecture: T5 encoder-decoder (seq2seq)
Training: Full fine-tuning with Seq2SeqTrainer, 30 epochs
Dataset: Adam-Ben-Khalifa/strudel-nl-to-code (770 train / 91 val / 46 test)

Performance

Metric	Test Set
BLEU	97.17
ROUGE-L	97.43
Exact Match	84.78%
Valid Strudel Syntax	100%
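
For reference, exact match is the share of generated patterns that are identical to their reference. With 46 test examples, 84.78% corresponds to 39 exact matches. The sketch below shows one way to compute it; `exact_match` is an illustrative helper name, and trimming surrounding whitespace before comparing is an assumption, since the normalization used for the reported score is not stated here.

```python
def exact_match(predictions, references):
    """Return the percentage of predictions identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    # Assumption: strings are compared after trimming surrounding whitespace.
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return 100.0 * hits / len(references)


# Sanity check on the reported figure: 39 of 46 test examples.
print(round(100.0 * 39 / 46, 2))
```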

Usage

from transformers import T5ForConditionalGeneration, AutoTokenizer

model = T5ForConditionalGeneration.from_pretrained("Adam-Ben-Khalifa/codet5p-strudel-music")
tokenizer = AutoTokenizer.from_pretrained("Adam-Ben-Khalifa/codet5p-strudel-music")

def generate_strudel(description, num_beams=5, max_new_tokens=384):
    # Prepend the task prefix before tokenizing; inputs longer than 256 tokens are truncated.
    input_text = "generate strudel: " + description
    inputs = tokenizer(input_text, return_tensors="pt", max_length=256, truncation=True)
    # Beam search decoding; max_new_tokens caps the length of the generated code.
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=num_beams, early_stopping=True)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Generate music patterns!
print(generate_strudel("Dark techno beat with heavy kick and industrial percussion"))
# → stack(s("bd*4"), s("~ cp ~ cp"), s("hh*8").gain(0.4), note("c3 ~ e3 ~ g3 ~ e3 ~").s("triangle").gain(0.7))

print(generate_strudel("Calm ambient drone with deep reverb and slowly evolving filter"))
# → note("<c3 eb3 g3 bb3>").s("sine").room(0.9).roomsize(4).lpf(2000).slow(4)

print(generate_strudel("Acid house with squelchy 303 bassline and resonant filter"))
# → stack(s("bd*4"), note("c2 c2 c2 eb2").s("sawtooth").lpf(sine.range(200,4000).slow(4)).resonance(15))

print(generate_strudel("Aggressive drum and bass breakbeat with wobbling bass"))
# → stack(s("[bd ~ ~ bd] [~ ~ bd ~]").fast(2), s("[~ ~ ~ ~] [sd ~ ~ ~]").fast(2).room(0.3),
#     s("hh*16").gain(0.35).lpf(6000),
#     note("[c2 ~ c2 ~] [~ eb2 ~ c2]").s("sawtooth").lpf(sine.range(200,2000).fast(4)).fast(2))

Supported Genres & Styles

The model understands prompts covering:

🏭 Techno (minimal, industrial, dark, acid)
🏠 House (deep, funky, progressive, UK garage)
🌊 Ambient (drone, atmospheric, dark ambient, meditation)
🥁 Drum & Bass (liquid, neurofunk, jungle)
🎤 Hip Hop (boom bap, trap, lo-fi, trip hop)
🎹 Electro (synthwave, italo disco, electropop)
🌀 Trance (psytrance, uplifting, progressive)
🔧 Experimental (IDM, glitch, noise, polyrhythmic)
🎸 Dub (dubstep, dub techno, reggae)
🎮 Chiptune (8-bit, retro gaming)
🌍 World (African polyrhythms)
💿 Vaporwave, Footwork, Broken beat, Nu jazz
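
Prompts can be assembled from a genre, a mood, and a few sound elements. The helper below is a hypothetical convenience, not part of the model or its training pipeline; the "mood genre with elements" template is an assumption modelled on the usage examples, and other phrasings may work as well or better.

```python
def build_description(genre, mood=None, elements=()):
    """Compose a plain-English prompt from a genre, an optional mood, and sound elements."""
    if not genre:
        raise ValueError("genre must be a non-empty string")
    head = f"{mood} {genre}" if mood else genre
    # Capitalize only the first character so names like "UK garage" keep their casing.
    head = head[0].upper() + head[1:]
    if not elements:
        return head
    # Assumption: elements are joined with "and", as in the usage examples.
    return f"{head} with {' and '.join(elements)}"


print(build_description("techno beat", mood="dark",
                        elements=["heavy kick", "industrial percussion"]))
```

The resulting string can be passed to generate_strudel() as the description.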

Strudel Syntax Output

The model generates valid Strudel JavaScript DSL including:

s() - sample triggers (drums, synths)
note() - pitched notes with synth selection
n() - scale-degree notation
stack() - layered patterns
.fast(), .slow() - tempo modifiers
.room(), .delay() - reverb/delay effects
.lpf(), .hpf() - filters
.gain(), .distort(), .crush() - gain, distortion, and bit-crushing
.every(), .rev(), .degradeBy() - pattern transformations
.cpm() - tempo in cycles per minute

Generated code can be pasted directly into strudel.cc/workshop to hear the music!
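
Before pasting generated code, a cheap structural check can catch truncated output. The sketch below only verifies that brackets and double quotes are balanced, both at the JavaScript level and inside mini-notation strings. It is not a Strudel parser and is not the validity check behind the reported syntax score; `has_balanced_delimiters` is an illustrative name, and single-quoted or backtick strings and escaped quotes are deliberately not handled.

```python
CODE_PAIRS = {")": "(", "]": "[", "}": "{"}
MINI_PAIRS = {")": "(", "]": "[", "}": "{", ">": "<"}


def has_balanced_delimiters(code):
    """Return True if brackets and double quotes in `code` are balanced."""
    code_stack = []  # open brackets at the JavaScript level
    mini_stack = []  # open brackets inside the current double-quoted string
    in_string = False
    for ch in code:
        if ch == '"':
            if in_string and mini_stack:
                return False  # string closed with an unclosed mini-notation group
            in_string = not in_string
            continue
        stack, pairs = (mini_stack, MINI_PAIRS) if in_string else (code_stack, CODE_PAIRS)
        if ch in pairs.values():
            stack.append(ch)
        elif ch in pairs:
            if not stack or stack.pop() != pairs[ch]:
                return False
    # Valid only if no string is left open and every bracket was closed.
    return not in_string and not code_stack


print(has_balanced_delimiters('note("<c3 eb3 g3 bb3>").s("sine").room(0.9)'))
print(has_balanced_delimiters('stack(s("bd*4"), s("hh*8")'))
```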

Training Details

Hardware: NVIDIA L4 GPU (24GB)
Training time: ~8 minutes (30 epochs)
Optimizer: AdamW (lr=5e-5, cosine schedule, 10% warmup)
Batch size: 8 × 4 gradient accumulation = 32 effective
Precision: FP16
Loss: Cross-entropy on decoder output tokens
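
The schedule implied by these settings can be worked out with a little arithmetic. The sketch below assumes a single GPU and counts one optimizer step per effective batch, including the final partial one. The exact step count reported by the trainer may differ slightly depending on how it handles the last incomplete accumulation group.

```python
import math

# Values taken from the training details above.
train_examples = 770
per_device_batch_size = 8
gradient_accumulation_steps = 4
num_epochs = 30
warmup_ratio = 0.10

# Assumption: a single GPU, so the effective batch size is 8 * 4.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps

# Approximation: one optimizer step per effective batch, counting the final partial one.
steps_per_epoch = math.ceil(train_examples / effective_batch_size)
total_steps = steps_per_epoch * num_epochs
warmup_steps = round(total_steps * warmup_ratio)

print(effective_batch_size, steps_per_epoch, total_steps, warmup_steps)
```

Under these assumptions the run takes about 25 optimizer steps per epoch and 750 in total, of which roughly the first 75 are learning-rate warmup.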

Limitations

The dataset is synthetic and small: about 900 patterns in total, 770 of which are used for training. The model scores highly on the test set, which comes from the same dataset, but real-world prompts with very specific or unusual requests may produce less accurate results.
Complex multi-layer compositions with many instruments may be simplified.
The model does not know the full Strudel API; it covers only the most common functions and patterns seen in training.
BPM suggestions are approximate and depend on the training data distribution.

Citation

@misc{codet5p-strudel-music,
  title={CodeT5+ Strudel Music Generator},
  author={Adam Ben Khalifa},
  year={2026},
  note={Fine-tuned from Salesforce/codet5p-220m for NL to Strudel code generation},
  url={https://huggingface.co/Adam-Ben-Khalifa/codet5p-strudel-music}
}
