Important
Please report the results you get from using this repository.
Faster-RoKAN Core
Faster-RoKAN is a next-generation hybrid architecture that integrates Faster-KAN (Kolmogorov-Arnold Networks) into the BS-Roformer audio source separation model.
Features
- Isomorphic Conversion: Convert standard BS-Roformer or MelBand-Roformer models to the RoKAN architecture with ZERO fidelity loss (MAE ≈ 0.0).
- Faster-KAN (RSWAF): Replaces linear MLP layers with Reflectional Switch Activation Functions (RSWAF) for efficient, expressive, and detailed non-linear learning. The smooth, spline-like basis curves are intended to filter out high-frequency artifacts.
- Gentle Training: Optimized for standard consumer hardware with thermal management considerations.
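To make the RSWAF idea concrete, here is a minimal sketch of the basis function as it is commonly described for Faster-KAN: one smooth bump, 1 - tanh(.)^2, per grid center. This is an illustration only and not the code shipped in this repository; the names rswaf_basis, make_grid, kan_edge, and the grid settings are hypothetical.

```python
import math

def rswaf_basis(x, centers, inv_width):
    """Evaluate one RSWAF bump per grid center at a scalar input x.

    Each bump is 1 - tanh((x - c) * inv_width)^2: it equals 1 at x == c
    and decays smoothly toward 0 as x moves away from c.
    """
    return [1.0 - math.tanh((x - c) * inv_width) ** 2 for c in centers]

def make_grid(low, high, num_centers):
    """Evenly spaced grid centers on [low, high] (needs num_centers >= 2)."""
    step = (high - low) / (num_centers - 1)
    return [low + i * step for i in range(num_centers)]

def kan_edge(x, centers, inv_width, coeffs):
    """One learnable 1-D function: a weighted sum of RSWAF bumps."""
    basis = rswaf_basis(x, centers, inv_width)
    return sum(w * b for w, b in zip(coeffs, basis))

# Hypothetical settings: 5 centers on [-1, 1], inverse width 2.0.
grid = make_grid(-1.0, 1.0, 5)
values = rswaf_basis(0.0, grid, 2.0)
```

The coefficients passed to kan_edge play the role of the trainable spline parameters: changing them reshapes the curve without touching the grid.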
Includes Base Model
To get you started immediately, we have included a pre-converted Base_Model.ckpt and Base_Model.yaml in this package.
This base model already works out of the box, so you can skip the conversion step entirely and jump straight to fine-tuning it on your own dataset!
Setup
pip install -r requirements.txt
Usage
0. (Optional) How to Make Your Own RoKAN Model
If you want to use a different checkpoint rather than the provided Base_Model, you can convert your existing standard .ckpt to the RoKAN format automatically with convert_bs_to_rokan.py.
(Note: You do NOT need to do this if you just want to use the included Base_Model.)
python convert_bs_to_rokan.py \
--src_yaml dataset/Models/your_model.yaml \
--src_ckpt dataset/Models/your_model.ckpt \
--out_yaml converted/rokan.yaml \
--out_ckpt converted/rokan.ckpt
How it works (For both BS & MelBand):
The convert_bs_to_rokan.py script automatically analyzes your .yaml configuration to determine whether it is a BS-Roformer or a MelBand-Roformer (by checking for the num_bands parameter).
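A rough sketch of that detection step, working on an already-parsed YAML config (a plain dict). It assumes that the presence of num_bands anywhere in the config means MelBand-Roformer and its absence means BS-Roformer; the function name detect_architecture and the returned labels are hypothetical, and the real script may decide differently.

```python
def detect_architecture(config):
    """Return 'mel_band_roformer' if a num_bands key appears anywhere
    in the (possibly nested) config, otherwise 'bs_roformer'."""

    def has_key(node, key):
        # Walk dicts and lists recursively; scalars cannot contain keys.
        if isinstance(node, dict):
            return key in node or any(has_key(v, key) for v in node.values())
        if isinstance(node, list):
            return any(has_key(v, key) for v in node)
        return False

    return "mel_band_roformer" if has_key(config, "num_bands") else "bs_roformer"

# Hypothetical configs, shaped like a parsed model .yaml.
mel_cfg = {"model": {"dim": 384, "num_bands": 60}}
bs_cfg = {"model": {"dim": 384, "freqs_per_bands": [2, 2, 4]}}
```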
Depending on the architecture, it seamlessly intercepts the standard linear MLP components located inside the Siamese or Standard Transformer FeedForward blocks, and replaces them with our custom FasterKANLinear blocks. All base knowledge is perfectly preserved without any fidelity loss.
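This README does not spell out how the swap keeps the output unchanged. One common way to get an exact match is to copy the original linear weights into the new block and start the added spline branch at zero, so the branch contributes nothing until it is trained. The sketch below illustrates that idea only; HybridLinear is a hypothetical stand-in, not the actual FasterKANLinear.

```python
import math

def linear(x, weight, bias):
    """Plain linear layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]

def rswaf_features(x, centers, inv_width):
    """One 1 - tanh^2 bump per (input, center) pair, flattened."""
    return [1.0 - math.tanh((xi - c) * inv_width) ** 2
            for xi in x for c in centers]

class HybridLinear:
    """Hypothetical block: copied linear weights plus a spline branch
    whose weights start at zero (assumption, not the repo's code)."""

    def __init__(self, weight, bias, centers, inv_width):
        self.weight, self.bias = weight, bias
        self.centers, self.inv_width = centers, inv_width
        n_feat = len(weight[0]) * len(centers)
        self.spline_weight = [[0.0] * n_feat for _ in weight]

    def __call__(self, x):
        base = linear(x, self.weight, self.bias)
        feats = rswaf_features(x, self.centers, self.inv_width)
        extra = [sum(w * f for w, f in zip(row, feats))
                 for row in self.spline_weight]
        return [b + e for b, e in zip(base, extra)]

def mean_abs_error(a, b):
    return sum(abs(p - q) for p, q in zip(a, b)) / len(a)

weight = [[0.5, -1.0], [2.0, 0.25]]
bias = [0.1, -0.2]
x = [0.3, -0.7]
layer = HybridLinear(weight, bias, centers=[-1.0, 0.0, 1.0], inv_width=2.0)
mae = mean_abs_error(linear(x, weight, bias), layer(x))
```

With the spline weights at zero, mae is exactly 0.0 in this toy setup; once any spline weight is changed, the output departs from the original linear layer.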
1. Fine-tuning
Train only the new KAN spline parameters on your dataset to remove high-frequency artifacts and teach the model geometric patterns. The script will automatically unfreeze only the new KAN parameters while keeping the base knowledge perfectly intact.
python train_rokan.py --ckpt_path Base_Model.ckpt --yaml_path Base_Model.yaml
(Store your vocal audio in dataset/vocals/ and instrumental audio in dataset/instrumentals/ before running).
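The freeze/unfreeze split described above can be pictured as a simple filter over parameter names. This is a sketch under assumptions: the marker string "spline" and the example parameter names are hypothetical, and train_rokan.py may identify the KAN parameters in another way.

```python
def select_trainable(param_names, marker="spline"):
    """Map each parameter name to True (train) or False (keep frozen)."""
    return {name: marker in name for name in param_names}

# Hypothetical parameter names for one converted transformer layer.
names = [
    "layers.0.ff.base_weight",
    "layers.0.ff.spline_weight",
    "layers.0.attn.to_qkv.weight",
]
plan = select_trainable(names)
trainable = [name for name, train in plan.items() if train]
```

In PyTorch, a plan like this would typically be applied by setting each parameter's requires_grad flag, so the optimizer only updates the selected tensors.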
2. Inference
Run source separation using the pre-tuned or fine-tuned model:
python run_infer_rokan.py \
--model_path Base_Model.ckpt \
--config_path Base_Model.yaml \
--input_audio your_song.wav
Credits, Contact & Disclaimer
All methods created by Himadayon.
IMPORTANT: If you release or distribute any models that use this architecture or are fine-tuned using this repository, you must explicitly credit Himadayon in your release notes or repository.
Contact: If you have any questions about this project, please send an email to Joker200702@gmail.com. (Please include a clear subject line and a detailed message.)
Disclaimer: For experimental verification and architectural testing, existing base models originally developed by unwa and Aname were used during the development of this project. I have not contacted these two GOATs; I just used their models without asking them first. Big shoutout to unwa and Aname!