---
language:
- en
license: mit
base_model: Qwen/Qwen3.5-3B-Instruct
tags:
- music
- guitar
- piano
- drums
- vocals
- music-theory
- ear-training
- songwriting
- lora
- peft
- qwen
- eq-adapter
- matrix-corp
pipeline_tag: text-generation
library_name: transformers
model_type: touchgrass
---

# Touch Grass 🎵

**A Lightweight Music AI Assistant Fine-Tuned from Qwen3.5**

Touch Grass is a specialized music AI assistant built by fine-tuning Qwen3.5 models (3B and 7B variants) with music-specific capabilities. It understands guitar, piano, drums, vocals, music theory, ear training, songwriting, and production, with the emotional intelligence to help musicians through frustration.

## 🚀 Features

- **Two Model Sizes**: TouchGrass-3B (CPU-friendly) and TouchGrass-7B (GPU-enhanced)
- **Music Tokenizer Extension**: Adds 21+ music-specific tokens to Qwen3.5's vocabulary
- **Five Specialized Modules**:
  - 🎸 **Tab & Chord Generation**: Creates and validates guitar tabs and chord diagrams
  - 🎹 **Music Theory Engine**: Scales, chords, intervals, progressions, circle of fifths
  - 👂 **Ear Training**: Interval identification with song references, solfege exercises
  - 💙 **EQ Adapter**: Frustration detection and emotional response adaptation
  - ✍️ **Songwriting Assistant**: Chord progressions, lyrics, hooks, production tips
- **LoRA Fine-Tuning**: Efficient adaptation without full model retraining
- **HuggingFace Compatible**: Production-ready with custom config and tokenizer classes
- **Ollama Support**: Run locally with Ollama modelfiles
- **Unified Inference**: Instrument context switching (guitar, piano, drums, vocals, theory, production)
- **Synthetic Data Pipeline**: 10 categories, 80+ templates covering all music domains

## 🏗️ Architecture

```
TouchGrass/
├── configs/                        # Model configurations
│   ├── touchgrass_3b_config.py    # 3B variant config
│   ├── touchgrass_7b_config.py    # 7B variant config
│   └── training_config.py         # Training hyperparameters
├── tokenizer/
│   └── music_token_extension.py   # Extends Qwen tokenizer with music tokens
├── models/                        # Specialized music modules
│   ├── tab_chord_module.py        # Guitar tabs and chords
│   ├── music_theory_module.py     # Theory knowledge
│   ├── ear_training_module.py     # Ear training exercises
│   ├── eq_adapter.py              # Emotional intelligence
│   └── songwriting_module.py      # Song creation assistance
├── data/
│   ├── music_qa_generator.py      # Synthetic dataset generator
│   ├── chat_formatter.py          # Qwen chat format converter
│   └── dataset_loader.py          # PyTorch dataset
├── training/
│   ├── losses.py                  # Multi-task loss functions
│   ├── trainer.py                 # LoRA-aware trainer
│   └── train.py                   # Main training entry point
├── inference/
│   └── inference.py               # Unified inference with context
├── benchmarks/
│   ├── evaluate_music_modules.py  # Module-level benchmarks
│   └── evaluate_inference.py      # End-to-end inference benchmarks
├── tests/                         # Comprehensive test suite
│   ├── test_*.py                  # Unit tests for each module
│   ├── conftest.py                # Pytest fixtures
│   └── run_tests.py               # Test runner
├── configuration_touchgrass.py    # HuggingFace config class
├── tokenization_touchgrass.py     # HuggingFace tokenizer wrapper
├── ollama_3b_modelfile            # Ollama config for 3B
├── ollama_7b_modelfile            # Ollama config for 7B
└── train.py                       # Main training script
```

## 📦 Installation

### Prerequisites

- Python 3.10+
- PyTorch 2.0+
- Transformers (HuggingFace)
- PEFT (LoRA)
- Datasets
- Pytest (for testing)

### Setup

```bash
# From the repository root
cd TouchGrass

# Install dependencies (CPU build of PyTorch)
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
pip install transformers peft datasets accelerate tqdm pytest

# Optional: CUDA build of PyTorch for GPU support
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
```

## 🚀 Quick Start

### 1. Generate Training Data

```bash
python -c "
from TouchGrass.data.music_qa_generator import MusicQAGenerator
from TouchGrass.data.chat_formatter import ChatFormatter

# Generate synthetic dataset
generator = MusicQAGenerator(seed=42)
dataset = generator.generate_dataset(num_samples=1000, output_path='data/music_qa.jsonl')

# Format for Qwen
formatter = ChatFormatter()
formatted = formatter.format_dataset(dataset)
train_data, val_data = formatter.create_splits(formatted, val_size=0.1)

formatter.save_dataset(train_data, 'data/train.jsonl')
formatter.save_dataset(val_data, 'data/val.jsonl')
"
```

### 2. Train the Model

```bash
# Train 3B variant
python train.py \
  --base_model Qwen/Qwen3.5-3B-Instruct \
  --train_data data/train.jsonl \
  --val_data data/val.jsonl \
  --output_dir checkpoints/touchgrass-3b \
  --lora_r 16 \
  --lora_alpha 32 \
  --batch_size 4 \
  --gradient_accumulation_steps 4 \
  --learning_rate 2e-4 \
  --num_epochs 3 \
  --mixed_precision fp16

# Train 7B variant (requires GPU with 16GB+ VRAM)
python train.py \
  --base_model Qwen/Qwen3.5-7B-Instruct \
  --train_data data/train.jsonl \
  --val_data data/val.jsonl \
  --output_dir checkpoints/touchgrass-7b \
  --lora_r 16 \
  --lora_alpha 32 \
  --batch_size 2 \
  --gradient_accumulation_steps 8 \
  --learning_rate 1e-4 \
  --num_epochs 3 \
  --mixed_precision bf16
```

### 3. Run Inference

```python
from TouchGrass.inference.inference import TouchGrassInference

# Load model
model = TouchGrassInference(
    model_path="checkpoints/touchgrass-3b",
    device="cpu"  # or "cuda"
)

# Single query with instrument context
response = model.generate(
    prompt="How do I play a G major chord?",
    instrument="guitar",
    skill_level="beginner",
    max_new_tokens=200
)
print(response)

# Interactive mode
model.chat(instrument="piano")
```

### 4. Use with Ollama

```bash
# Copy the provided template
cp ollama_3b_modelfile Modelfile

# Build and run
ollama create touchgrass-3b -f Modelfile
ollama run touchgrass-3b "How do I play a G major chord on guitar?"
```

### 5. Use with HuggingFace

```python
from transformers import AutoModelForCausalLM

# The custom classes ship at the repository root
from configuration_touchgrass import TouchGrassConfig
from tokenization_touchgrass import TouchGrassTokenizer

# Load with custom config and tokenizer
config = TouchGrassConfig.from_pretrained("checkpoints/touchgrass-3b")
tokenizer = TouchGrassTokenizer.from_pretrained("checkpoints/touchgrass-3b")
model = AutoModelForCausalLM.from_pretrained(
    "checkpoints/touchgrass-3b",
    config=config,
    device_map="auto"
)

# Generate (assumes the tokenizer wrapper keeps Qwen's chat template)
messages = [
    {"role": "system", "content": "You are a music assistant."},
    {"role": "user", "content": "How do I play a G major chord?"},
]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## 🧪 Testing

Run the comprehensive test suite:

```bash
# Run all tests
python tests/run_tests.py

# Run with coverage
python tests/run_tests.py --coverage

# Run specific test categories
pytest tests/test_music_theory_module.py -v
pytest tests/test_tokenizer.py -v
pytest tests/test_eq_adapter.py -v

# Skip slow tests
pytest -m "not slow"
```

## 📊 Benchmarking

Evaluate model performance on music-specific tasks:

```bash
# Evaluate music modules
python benchmarks/evaluate_music_modules.py --device cpu --d_model 768

# Run inference benchmarks
python benchmarks/evaluate_inference.py --model_path checkpoints/touchgrass-3b --device cpu
```

## 🎛️ Configuration

### Training Configuration

Edit `configs/training_config.py` to customize:

- **Learning rate**: 2e-4 (3B), 1e-4 (7B)
- **LoRA rank (r)**: 8-32 (higher = more capacity)
- **LoRA alpha**: Typically 2×r
- **Batch size**: Adjust based on GPU memory
- **Gradient accumulation**: Use to simulate larger batches
- **Loss weights**:
  - `lm_loss_weight=1.0` (primary language modeling)
  - `eq_loss_weight=0.1` (emotional intelligence)
  - `music_module_loss_weight=0.05` (specialized modules)
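The three weights combine into one training objective as a plain weighted sum. A minimal sketch of that combination (the real logic lives in `training/losses.py`; this standalone helper is illustrative, not the repository's code):

```python
def combined_loss(lm_loss, eq_loss, module_loss,
                  lm_w=1.0, eq_w=0.1, module_w=0.05):
    """Weighted sum of the task losses; the same arithmetic applies to tensors."""
    return lm_w * lm_loss + eq_w * eq_loss + module_w * module_loss

# With the default weights the auxiliary heads nudge training rather than dominate it:
total = combined_loss(lm_loss=2.0, eq_loss=1.0, module_loss=1.0)  # 2.0 + 0.1 + 0.05
```

Raising `eq_w` gives the emotional-intelligence head proportionally more supervision, per the list above.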

### Model Configuration

- **TouchGrass-3B**: Based on Qwen3.5-3B-Instruct, d_model=2048, num_layers=36
- **TouchGrass-7B**: Based on Qwen3.5-7B-Instruct, d_model=4096, num_layers=40

### Music Tokens

The tokenizer extension adds these special tokens:

**Domain tokens**: `[GUITAR]`, `[PIANO]`, `[DRUMS]`, `[VOCALS]`, `[THEORY]`, `[PRODUCTION]`

**Emotion tokens**: `[FRUSTRATED]`, `[CONFUSED]`, `[EXCITED]`, `[CONFIDENT]`

**Difficulty tokens**: `[EASY]`, `[MEDIUM]`, `[HARD]`

**Function tokens**: `[TAB]`, `[CHORD]`, `[SCALE]`, `[INTERVAL]`, `[PROGRESSION]`

**EQ tokens**: `[SIMPLIFY]`, `[ENCOURAGE]`

**Music notation**: All note names (C, C#, D, etc.) and chord types (m, dim, aug, 7, maj7, etc.)
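A sketch of how an extension like this is typically wired up with the `transformers` API. The bracketed token lists are copied from above; the function name and structure are assumptions, not necessarily what `tokenizer/music_token_extension.py` actually does, and the note-name/chord-type tokens are omitted for brevity:

```python
# Bracketed special tokens, grouped as listed above.
DOMAIN     = ["[GUITAR]", "[PIANO]", "[DRUMS]", "[VOCALS]", "[THEORY]", "[PRODUCTION]"]
EMOTION    = ["[FRUSTRATED]", "[CONFUSED]", "[EXCITED]", "[CONFIDENT]"]
DIFFICULTY = ["[EASY]", "[MEDIUM]", "[HARD]"]
FUNCTION   = ["[TAB]", "[CHORD]", "[SCALE]", "[INTERVAL]", "[PROGRESSION]"]
EQ         = ["[SIMPLIFY]", "[ENCOURAGE]"]

MUSIC_TOKENS = DOMAIN + EMOTION + DIFFICULTY + FUNCTION + EQ

def extend_tokenizer(tokenizer, model=None):
    """Register the music tokens and grow the embedding matrix to match."""
    added = tokenizer.add_special_tokens({"additional_special_tokens": MUSIC_TOKENS})
    if model is not None:
        model.resize_token_embeddings(len(tokenizer))
    return added
```

`add_special_tokens` and `resize_token_embeddings` are the standard Transformers calls for vocabulary extension; resizing is required or the new token ids would index past the embedding table.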

## 📚 Music Domains Covered

1. **Guitar & Bass**: Tabs, chords, fingerings, techniques, tunings
2. **Piano & Keys**: Scales, arpeggios, hand positions, pedaling
3. **Drums & Percussion**: Beats, fills, rudiments, kit setup
4. **Vocals & Singing**: Range, breathing, technique, warmups
5. **Music Theory & Composition**: Scales, chords, progressions, harmony
6. **DJ & Production**: EQ, mixing, compression, arrangement

## 💙 Emotional Intelligence

The EQ Adapter detects user frustration and adapts responses:

- **Frustration detection**: Sigmoid output in [0, 1] indicating frustration level
- **Emotion classification**: 4 classes (frustrated, confused, excited, confident)
- **Simplification gate**: Automatically simplifies explanations when frustration is high
- **Encouragement templates**: Pre-built supportive responses
- **Context-aware**: Uses conversation history to track emotional state

## 🔧 Advanced Usage

### Custom Dataset Generation

```python
from TouchGrass.data.music_qa_generator import MusicQAGenerator

# Create custom templates
custom_templates = {
    "guitar": [
        {
            "system": "You are a {instrument} specialist.",
            "user": "How do I play {chord}?",
            "assistant": "Place your fingers: {fingering}"
        }
    ]
}

generator = MusicQAGenerator(templates=custom_templates, seed=123)
dataset = generator.generate_dataset(num_samples=500)
```

### Multi-Instrument Context

```python
from TouchGrass.inference.inference import TouchGrassInference

model = TouchGrassInference(model_path="checkpoints/touchgrass-3b")

# Switch between instruments seamlessly
guitar_response = model.generate("How do I palm mute?", instrument="guitar")
piano_response = model.generate("What scales can I use in C major?", instrument="piano")
theory_response = model.generate("Explain the circle of fifths", instrument="theory")
```

### LoRA Fine-Tuning Customization

```python
from peft import LoraConfig, TaskType  # LoraConfig and TaskType come from peft, not transformers

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=32,              # Rank (higher = more trainable parameters)
    lora_alpha=64,     # Alpha (typically 2×r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # Qwen attention modules
    lora_dropout=0.1,
    bias="none"
)
```

## 🧩 Module Details

### Tab & Chord Module

- **Input**: Hidden states + string/fret indices
- **Output**:
  - `tab_validator`: Confidence score in [0, 1] for tab validity
  - `difficulty`: 3-class classification (easy/medium/hard)
- **Supports**: Multiple tunings (standard, drop D, open G), 6 strings, 24 frets
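The actual validator and difficulty head are learned layers over hidden states; the rule-of-thumb logic they approximate can be sketched in plain Python (the fret-per-string encoding and the thresholds here are illustrative assumptions):

```python
STANDARD_TUNING = ["E", "A", "D", "G", "B", "E"]  # strings, low to high
NUM_FRETS = 24

# Chord shapes as fret-per-string lists; None marks a muted string, 0 an open one.
CHORD_SHAPES = {
    "G": [3, 2, 0, 0, 0, 3],
    "C": [None, 3, 2, 0, 1, 0],
}

def validate_shape(shape):
    """Well-formed = one entry per string, every fret within range."""
    return (len(shape) == len(STANDARD_TUNING)
            and all(f is None or 0 <= f <= NUM_FRETS for f in shape))

def difficulty(shape):
    """Rough 3-way difficulty from finger count and fret span."""
    fretted = [f for f in shape if f]  # drop open (0) and muted (None) strings
    span = max(fretted) - min(fretted) if fretted else 0
    if len(fretted) <= 3 and span <= 2:
        return "easy"
    return "medium" if span <= 4 else "hard"
```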

### Music Theory Module

- **Functions**:
  - `get_scale_from_key(key, mode)`: Returns scale notes
  - `detect_chord_function(root, chord_type, key)`: Returns Roman numeral
  - `get_circle_of_fifths()`: Returns the 12-key circle
  - `construct_chord(root, chord_type)`: Returns chord notes
  - `analyze_progression(progression, key)`: Returns functional analysis
- **Knowledge**: All modes (ionian through locrian), intervals, transpositions
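Two of the listed functions are simple enough to sketch directly. This semitone-arithmetic version (sharps-only spelling; proper enharmonics need key-aware logic) shows the idea behind `get_scale_from_key` and `get_circle_of_fifths`, without claiming to match the module's implementation:

```python
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Whole/half-step patterns; the full module covers ionian through locrian.
MODE_STEPS = {
    "ionian":  [2, 2, 1, 2, 2, 2, 1],  # major
    "aeolian": [2, 1, 2, 2, 1, 2, 2],  # natural minor
}

def get_scale_from_key(key, mode="ionian"):
    idx, scale = NOTES.index(key), [key]
    for step in MODE_STEPS[mode][:-1]:  # the final step just returns to the tonic
        idx = (idx + step) % 12
        scale.append(NOTES[idx])
    return scale

def get_circle_of_fifths():
    idx, circle = NOTES.index("C"), []
    for _ in range(12):  # stepping by perfect fifths (7 semitones) visits all 12 keys
        circle.append(NOTES[idx])
        idx = (idx + 7) % 12
    return circle
```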

### Ear Training Module

- **Interval identification**: All simple intervals from unison to octave (P1 through P8)
- **Song references**: Each interval linked to famous songs (Star Wars for P5, Jaws for m2, etc.)
- **Solfege generation**: Do-Re-Mi syllables for any key/mode
- **Quiz generation**: Automatic interval quiz creation
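A plain-Python sketch of the interval lookup and quiz generation. The P5 and m2 song references come from the list above; the P8 entry, the fallback text, and the function names are illustrative assumptions:

```python
import random

# Interval names by semitone count, unison (0) through octave (12).
INTERVALS = ["P1", "m2", "M2", "m3", "M3", "P4", "TT",
             "P5", "m6", "M6", "m7", "M7", "P8"]

SONG_REFS = {
    "m2": "Jaws theme",
    "P5": "Star Wars main theme",
    "P8": "Somewhere Over the Rainbow",
}

def identify_interval(semitones):
    return INTERVALS[semitones]

def make_quiz_question(rng=random):
    semitones = rng.randint(1, 12)
    name = identify_interval(semitones)
    return {"semitones": semitones,
            "answer": name,
            "hint": SONG_REFS.get(name, "no song reference on file")}
```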

### EQ Adapter

- **Frustration detector**: Sigmoid output computed from hidden states
- **Emotion classifier**: 4-way classification
- **Simplification gate**: Context-aware response simplification
- **Encouragement embeddings**: Pre-trained supportive phrases
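The detector itself is a learned sigmoid head over hidden states, so it cannot be reproduced here; this keyword-count stand-in (the cue list, weight, and bias are made up) only illustrates the sigmoid-plus-threshold control flow that gates simplification:

```python
import math

FRUSTRATION_CUES = ("stuck", "hate", "give up", "impossible", "frustrated")

def frustration_score(message, weight=1.5, bias=-2.0):
    """Squash a cue count through a sigmoid to mimic the detector's [0, 1] output."""
    hits = sum(cue in message.lower() for cue in FRUSTRATION_CUES)
    return 1.0 / (1.0 + math.exp(-(weight * hits + bias)))

def response_mode(message, threshold=0.5):
    """Gate: switch to simplified, encouraging answers when frustration is high."""
    if frustration_score(message) >= threshold:
        return "[SIMPLIFY] [ENCOURAGE]"
    return "standard"
```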

### Songwriting Module

- **Progression suggester**: By mood (8 types) and genre (8 types)
- **Lyric generator**: With rhyme-scheme awareness (ABAB, AABB, etc.)
- **Hook generator**: Creates memorable song hooks
- **Production advisor**: Instrumentation, effects, arrangement tips
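The progression suggester's lookup shape can be sketched as a mood-keyed table rendered into chords in C. The module reportedly covers 8 moods and 8 genres; the three rows and the helper below are illustrative, not its real data:

```python
PROGRESSIONS_BY_MOOD = {
    "happy":  ["I", "V", "vi", "IV"],
    "sad":    ["vi", "IV", "I", "V"],
    "dreamy": ["I", "iii", "IV", "iv"],
}

DIATONIC_C = {"I": "C", "ii": "Dm", "iii": "Em", "IV": "F",
              "V": "G", "vi": "Am", "vii°": "Bdim"}

def suggest_progression(mood):
    """Return (Roman numerals, chords in C); unknown moods fall back to 'happy'."""
    numerals = PROGRESSIONS_BY_MOOD.get(mood, PROGRESSIONS_BY_MOOD["happy"])
    chords = [DIATONIC_C.get(n, n) for n in numerals]  # borrowed chords pass through
    return numerals, chords
```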

## 📈 Training Tips

1. **Start small**: Use the 3B variant for experimentation, 7B for production
2. **Data quality**: Ensure diverse coverage of all 10 categories
3. **Loss weights**: The defaults (1.0, 0.1, 0.05) work well; adjust if modules need more or less supervision
4. **LoRA rank**: Start with r=16; increase to 32 if underfitting
5. **Mixed precision**: Use `fp16` on older NVIDIA GPUs, `bf16` on Ampere or newer
6. **Gradient accumulation**: Essential for simulating larger batches on limited VRAM
7. **Checkpointing**: Save every 100-500 steps for safety
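For the gradient-accumulation tip, the arithmetic is worth spelling out: the batch the optimizer effectively steps with is the product of per-device batch size, accumulation steps, and device count. Both Quick Start training commands land on the same effective batch:

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Batch size the optimizer effectively steps with."""
    return per_device_batch * grad_accum_steps * num_devices

three_b = effective_batch_size(4, 4)  # 3B command: batch 4, accumulation 4
seven_b = effective_batch_size(2, 8)  # 7B command: batch 2, accumulation 8
```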

## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass (`python tests/run_tests.py`)
5. Submit a pull request

## 📄 License

MIT License - see the LICENSE file for details.

## 🙏 Acknowledgments

- **Qwen3.5**: Base model from Alibaba Cloud
- **HuggingFace**: Transformers and PEFT libraries
- **Music theory**: Traditional Western music theory principles
- **Song references**: Popular music culture for ear training

## 📞 Support

- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: See individual module docstrings

---

**Made with ❤️ for musicians everywhere.**

*Touch Grass - because even AI needs to remember to make music, not just talk about it.*