add fast captioning module (CLAP + faster-whisper + Silero VAD), update deps 4619f39 Nekochu committed on 17 days ago
random 60s crop at training time (matches Side-Step chunk-duration), remove pre-split chunking d3618ec Nekochu committed on 17 days ago
audio-level chunking (not latent), auto-scale epochs for chunk count 1ee8f1f Nekochu committed on 18 days ago
chunk latents into ~30s segments for faster CPU training, energy-aware boundaries 2e395ab Nekochu committed on 19 days ago
skip bare librosa sidecar, let preprocessing faf analysis handle caption fallback 53f6566 Nekochu committed on 20 days ago
fix adapter save path, smart LM fallback, compact training UI, remove Server Status 35fbf3e Nekochu committed on 20 days ago
cancel, captioning, preprocessing, sidecar upload, elapsed time, GeneratorExit fix 32de701 Nekochu committed on 21 days ago
fix review: debug leak, int crash, rank mismatch, 0-byte skip, log cap, understand diag 4d9a556 Nekochu committed on 22 days ago
fix: save PEFT adapter (not full model), remove random suffix from LoRA names, fix epoch cap to 1000 57df0f6 Nekochu committed on 22 days ago
remove XL checkpoint download (OOMKilled build, training uses standard turbo) 6d9fb39 Nekochu committed on 22 days ago
fix: save_every_n_epochs=0, add demucs-infer to Dockerfile, debug adapter dir 0e27e49 Nekochu committed on 22 days ago
fix all review issues: dedup sampling/unwrap, thread-safe lock, cleanup, retry, security docs 829ed0c Nekochu committed on 22 days ago
update README with final state, full pipeline inference, LM generation step a5741b1 Nekochu committed on 22 days ago
fix inference: add LM generation step, detokenize codes before DiT, full pipeline working ff9f4ad Nekochu committed on 22 days ago
add _is_space flag, block inference during training, understand clone fix 3c15b8b Nekochu committed on 22 days ago
fix understand_audio: clone tensors for inference mode, working on GPU (52s) 4b2f4ad Nekochu committed on 22 days ago
add understand_audio (LM reverse), demucs-infer fix, commit refs, dtype fixes 6bfdc38 Nekochu committed on 22 days ago
major update: PyTorch inference, Gradio 6, session isolation, /understand captioning ff239f5 Nekochu committed on 22 days ago
truncate long files to fit cap, show which files truncated/skipped bc97006 Nekochu committed on 23 days ago
accept files until total audio cap reached, skip rest with warning 956dc8c Nekochu committed on 23 days ago
add LoRA download button after training (gr.File output, like rvc-beatrice) 2d3c27c Nekochu committed on 23 days ago
remove ace-server understand proxy, captioning stays librosa + txt sidecars 5b7a56f Nekochu committed on 23 days ago
SDPA first on Blackwell, FA2 only for Ampere/Hopper, txt caption support 04ccf32 Nekochu committed on 23 days ago
add GPU/CUDA auto-detect, mixed precision, flash_attn, txt caption parser 917e4ed Nekochu committed on 23 days ago
update defaults: LR 3e-4, rank 32, alpha 2x rank (per Side-Step author) 04c031f Nekochu committed on 23 days ago
add mid/sas analysis modes (Demucs + ensemble), auto-select by dataset size b38d0b1 Nekochu committed on 23 days ago
add auto-captioning (BPM/key/signature via librosa), add librosa+mutagen deps 1d42836 Nekochu committed on 23 days ago
switch training to standard turbo (11s/epoch), auto-select standard GGUF for LoRA inference c0f2a13 Nekochu committed on 23 days ago
fix: train on XL turbo (matches XL GGUF for inference), add XL checkpoint download 372f08e Nekochu committed on 23 days ago
fix: adapter saved to clean dir, LM dropdown no 'Default', on-demand download e62602f Nekochu committed on 23 days ago
default mp3, remove format selector, increase LM timeout to 900s 882ed5c Nekochu committed on 24 days ago