Important

Please report the results you get from using this repository.

Faster-RoKAN Core

Faster-RoKAN is a next-generation hybrid architecture that integrates Faster-KAN (Kolmogorov-Arnold Networks) into the BS-Roformer audio source separation model.

Features

  • Isomorphic Conversion: Convert standard BS-Roformer or MelBand-Roformer models to the RoKAN architecture with ZERO fidelity loss (MAE ≈ 0.0).
  • Faster-KAN (RSWAF): Replaces linear MLP layers with Reflectional Switch Wavelet Activation Functions for efficient, expressive, and detailed non-linear learning. High-frequency artifacts are filtered out through smooth geometric spline curves.
  • Gentle Training: Optimized for standard consumer hardware with thermal management considerations.
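To make the RSWAF idea more concrete, here is a minimal sketch of the basis function as it is usually written in public FasterKAN code: one smooth bump, 1 - tanh²((x - c) / h), per grid center c. Whether this repository uses exactly this form is an assumption, and the names rswaf_basis, centers, and width are illustrative only.

```python
import math

def rswaf_basis(x, centers, width):
    """Evaluate one bump 1 - tanh(u)**2 per grid center, with u = (x - c) / width.

    Assumed form of the RSWAF basis; not confirmed by this repository's source.
    """
    return [1.0 - math.tanh((x - c) / width) ** 2 for c in centers]

# Illustrative grid: 5 centers evenly spaced on [-1, 1].
centers = [-1.0, -0.5, 0.0, 0.5, 1.0]
values = rswaf_basis(0.0, centers, width=0.5)
print([round(v, 4) for v in values])
```

Each bump peaks at 1 when the input sits on its center and decays smoothly on both sides, which is what makes the learned activation a smooth curve rather than a piecewise one.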

Includes Base Model

To get you started immediately, we have included a pre-converted Base_Model.ckpt and Base_Model.yaml in this package. This base model already works as-is, so you can skip the conversion step entirely and jump straight to fine-tuning it on your own dataset!

Setup

pip install -r requirements.txt

Usage

0. (Optional) How to Make Your Own RoKAN Model

If you want to use a different checkpoint rather than the provided Base_Model, you can convert your existing standard .ckpt to the RoKAN format automatically with convert_bs_to_rokan.py. (Note: You do NOT need to do this if you just want to use the included Base_Model.)

python convert_bs_to_rokan.py \
    --src_yaml dataset/Models/your_model.yaml \
    --src_ckpt dataset/Models/your_model.ckpt \
    --out_yaml converted/rokan.yaml \
    --out_ckpt converted/rokan.ckpt
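If you want to check the "MAE ≈ 0.0" claim on your own conversion, one option is to run the source and converted models on the same input and compare the outputs. The sketch below only shows the comparison step on plain lists of samples; how you obtain those outputs from each model is left out, and the variable names are hypothetical.

```python
def mean_absolute_error(reference, converted):
    """Average absolute difference between two equal-length sequences of samples."""
    if len(reference) != len(converted):
        raise ValueError("outputs must have the same length")
    if not reference:
        raise ValueError("outputs must not be empty")
    return sum(abs(r - c) for r, c in zip(reference, converted)) / len(reference)

# Hypothetical flattened outputs of the source and converted models on the same input.
source_out = [0.10, -0.25, 0.40]
rokan_out = [0.10, -0.25, 0.40]
print(mean_absolute_error(source_out, rokan_out))  # 0.0 for identical outputs
```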

How it works (For both BS & MelBand): The convert_bs_to_rokan.py script automatically analyzes your .yaml configuration to determine whether it is a BS-Roformer or a MelBand-Roformer (by checking for the num_bands parameter). Depending on the architecture, it seamlessly intercepts the standard linear MLP components located inside the Siamese or Standard Transformer FeedForward blocks, and replaces them with our custom FasterKANLinear blocks. All base knowledge is perfectly preserved without any fidelity loss.
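The detection step described above could look roughly like the sketch below. It works on a configuration that has already been parsed from YAML into a dictionary. The direction of the check (num_bands present means MelBand-Roformer) is an assumption, as are the function name and the returned labels; the actual logic lives in convert_bs_to_rokan.py.

```python
def detect_architecture(model_config):
    """Return a label based on whether `num_bands` appears in the model config."""
    # Assumption: MelBand-Roformer configs define `num_bands`, BS-Roformer configs do not.
    if "num_bands" in model_config:
        return "mel_band_roformer"
    return "bs_roformer"

# Hypothetical configs, already parsed from YAML into dictionaries.
print(detect_architecture({"dim": 384, "depth": 12, "num_bands": 60}))
print(detect_architecture({"dim": 384, "depth": 12}))
```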

1. Fine-tuning

Train only the new KAN spline parameters on your dataset to reduce high-frequency artifacts in the separated output. The script automatically unfreezes only the new KAN parameters and keeps all base weights frozen, so the base knowledge stays intact.
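For readers curious what selective unfreezing amounts to, here is a minimal sketch that filters parameters by name. It is not the code in train_rokan.py: the "kan" name marker and the parameter names are hypothetical, and stand-in objects replace real tensors so the example runs without PyTorch.

```python
from types import SimpleNamespace

def unfreeze_only(named_params, marker):
    """Enable gradients only for parameters whose name contains `marker`.

    Works with any iterable of (name, param) pairs where `param` has a
    `requires_grad` attribute, such as PyTorch's `model.named_parameters()`.
    Returns the names left trainable.
    """
    trainable = []
    for name, param in named_params:
        param.requires_grad = marker in name
        if param.requires_grad:
            trainable.append(name)
    return trainable

# Stand-in parameters so the sketch runs without PyTorch; names are hypothetical.
params = {
    "layers.0.attn.to_qkv.weight": SimpleNamespace(requires_grad=True),
    "layers.0.ff.kan.spline_weight": SimpleNamespace(requires_grad=False),
}
print(unfreeze_only(params.items(), marker="kan"))
```

With a real model you would pass model.named_parameters() and hand only the trainable parameters to the optimizer.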

python train_rokan.py --ckpt_path Base_Model.ckpt --yaml_path Base_Model.yaml

(Before running, store your vocal audio in dataset/vocals/ and your instrumental audio in dataset/instrumentals/.)
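Before starting a long training run, it can help to confirm that both folders actually contain audio. The sketch below counts files by extension. The list of extensions is an assumption, since the formats train_rokan.py accepts are not stated here.

```python
from pathlib import Path

AUDIO_SUFFIXES = {".wav", ".flac", ".mp3"}  # assumed formats; adjust as needed

def count_audio_files(folder):
    """Count files with a known audio suffix directly inside `folder` (0 if missing)."""
    path = Path(folder)
    if not path.is_dir():
        return 0
    return sum(
        1 for f in path.iterdir()
        if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
    )

for folder in ("dataset/vocals", "dataset/instrumentals"):
    print(f"{folder}: {count_audio_files(folder)} audio file(s)")
```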

2. Inference

Run source separation using the pre-tuned or fine-tuned model:

python run_infer_rokan.py \
    --model_path Base_Model.ckpt \
    --config_path Base_Model.yaml \
    --input_audio your_song.wav

Credits, Contact & Disclaimer

All methods were developed by Himadayon. IMPORTANT: If you release or distribute any models that use this architecture or are fine-tuned using this repository, you must explicitly credit Himadayon in your release notes or repository.

Contact: If you have any questions about this project, please send an email to: 📧 Joker200702@gmail.com (Please include a clear subject line and a detailed message.)

Disclaimer: For experimental verification and architectural testing, existing base models originally developed by unwa and Aname were used during the development of this project. I have not been in contact with either of them; I used their models without asking first. Big shoutout to unwa and Aname.
