| # Important |
| Please report your results from using this repository. |
|
|
|
|
| # Faster-RoKAN Core |
|
|
| Faster-RoKAN is a next-generation hybrid architecture that integrates Faster-KAN (Kolmogorov-Arnold Networks) into the BS-Roformer audio source separation model. |
|
|
| ## Features |
| - **Isomorphic Conversion**: Convert standard BS-Roformer or MelBand-Roformer models to the RoKAN architecture with no fidelity loss (MAE ≈ 0.0). |
| - **Faster-KAN (RSWAF)**: Replaces linear MLP layers with Reflectional Switch Wavelet Activation Functions for efficient, expressive, and detailed non-linear learning. High-frequency artifacts are filtered out through smooth geometric spline curves. |
| - **Gentle Training**: Optimized for standard consumer hardware with thermal management considerations. |
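
The "MAE ≈ 0.0" figure in the first bullet can be checked by comparing the outputs of the original and converted models on the same input. The following is a minimal sketch of such a check, not the repository's own verification code: the function name and the sample values are hypothetical, and plain Python lists stand in for the separated audio samples.

```python
from typing import Sequence


def mean_absolute_error(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Average absolute difference between two equal-length sample sequences."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    if len(reference) == 0:
        raise ValueError("sequences must not be empty")
    return sum(abs(r - c) for r, c in zip(reference, candidate)) / len(reference)


# Hypothetical outputs of the original and the converted model on the same input.
original_out = [0.10, -0.25, 0.40, 0.00]
converted_out = [0.10, -0.25, 0.40, 0.00]

mae = mean_absolute_error(original_out, converted_out)
print(f"MAE = {mae:.6f}")
```

A value at or near zero indicates that the converted model reproduces the original output. In practice, floating-point rounding may leave a tiny non-zero residual.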
|
|
| ## Includes Base Model |
| To get you started immediately, this package includes a pre-converted **`Base_Model.ckpt`** and **`Base_Model.yaml`**. |
| The base model works as-is, so you can skip the conversion step entirely and go straight to fine-tuning it on your own dataset. |
|
|
| ## Setup |
| ```bash |
| pip install -r requirements.txt |
| ``` |
|
|
| ## Usage |
|
|
| ### 0. (Optional) How to Make Your Own RoKAN Model |
| If you want to use a checkpoint other than the provided `Base_Model`, you can convert an existing standard `.ckpt` to the RoKAN format with `convert_bs_to_rokan.py`. |
| **(Note: You do NOT need to do this if you just want to use the included `Base_Model`.)** |
| |
| ```bash |
| python convert_bs_to_rokan.py \ |
| --src_yaml dataset/Models/your_model.yaml \ |
| --src_ckpt dataset/Models/your_model.ckpt \ |
| --out_yaml converted/rokan.yaml \ |
| --out_ckpt converted/rokan.ckpt |
| ``` |
| |
| **How it works (For both BS & MelBand):** |
| The `convert_bs_to_rokan.py` script reads your `.yaml` configuration and determines whether the model is a **BS-Roformer** or a **MelBand-Roformer** by checking for the `num_bands` parameter. |
| Depending on the architecture, it locates the standard linear MLP components inside the Siamese or Standard Transformer FeedForward blocks and replaces them with our custom `FasterKANLinear` blocks. The base model's knowledge is preserved without fidelity loss. |
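
The detection step described above can be sketched roughly as follows. This illustrates the idea only and is not the actual code in `convert_bs_to_rokan.py`. It makes three assumptions: the configuration has already been parsed into a dictionary, the model hyperparameters sit under a top-level `model` key, and the presence of `num_bands` indicates MelBand-Roformer. The function name and return labels are hypothetical.

```python
from typing import Any, Dict


def detect_architecture(config: Dict[str, Any]) -> str:
    """Return "mel_band_roformer" if `num_bands` is present, else "bs_roformer"."""
    # Assumption: hyperparameters live under a "model" key; if that key is
    # missing or is not a dict, look at the top level instead.
    model_cfg = config.get("model", config)
    if not isinstance(model_cfg, dict):
        model_cfg = config
    # Assumption: only MelBand-Roformer configs define `num_bands`.
    return "mel_band_roformer" if "num_bands" in model_cfg else "bs_roformer"


# Hypothetical, heavily trimmed configurations for illustration only.
mel_config = {"model": {"dim": 384, "num_bands": 60}}
bs_config = {"model": {"dim": 384}}

print(detect_architecture(mel_config))
print(detect_architecture(bs_config))
```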
|
|
| ### 1. Fine-tuning |
| Train only the new KAN spline parameters on your dataset to remove high-frequency artifacts and teach the model geometric patterns. The script automatically unfreezes *only* the new KAN parameters, so the base knowledge stays intact. |
|
|
| ```bash |
| python train_rokan.py --ckpt_path Base_Model.ckpt --yaml_path Base_Model.yaml |
| ``` |
| *(Before running, store your vocal audio in `dataset/vocals/` and your instrumental audio in `dataset/instrumentals/`.)* |
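
The selective unfreezing described in this step can be sketched as below. This is not the logic of `train_rokan.py`: the helper name, the `"kan"` name marker, and the parameter names are all hypothetical. `SimpleNamespace` objects stand in for real parameters so that the example runs without PyTorch.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List, Tuple


def unfreeze_only(named_params: Iterable[Tuple[str, Any]], marker: str) -> List[str]:
    """Set requires_grad=True only for parameters whose name contains `marker`."""
    trainable = []
    for name, param in named_params:
        # Freeze everything except parameters matching the marker.
        param.requires_grad = marker in name
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-ins for (name, parameter) pairs; the names are hypothetical.
params = [
    ("layers.0.attn.to_qkv.weight", SimpleNamespace(requires_grad=True)),
    ("layers.0.ff.kan.spline_weight", SimpleNamespace(requires_grad=True)),
]

trainable_names = unfreeze_only(params, marker="kan")
print(trainable_names)
```

With a PyTorch model, the same helper could be called as `unfreeze_only(model.named_parameters(), "kan")`, provided the KAN parameters really do carry such a marker in their names.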
|
|
| ### 2. Inference |
| Run source separation using the included base model or your fine-tuned model: |
| ```bash |
| python run_infer_rokan.py \ |
| --model_path Base_Model.ckpt \ |
| --config_path Base_Model.yaml \ |
| --input_audio your_song.wav |
| ``` |
|
|
| --- |
|
|
| ## Credits, Contact & Disclaimer |
|
|
| **All methods were created by Himadayon.** |
| **IMPORTANT:** If you release or distribute any models that use this architecture or were fine-tuned using this repository, you **must** explicitly credit `Himadayon` in your release notes or repository. |
|
|
| **Contact:** |
| If you have any questions or inquiries regarding this project, please send an email to: |
| 📧 **Joker200702@gmail.com** |
| *(Please make sure to include a clear subject line and detailed contents in your email).* |
|
|
| **Disclaimer:** |
| For experimental verification and architectural testing, existing base models originally developed by **unwa** and **Aname** were used during the development of this project. I did not contact either of them; I used their models without asking. Big shoutout to unwa and Aname, the two GOATs. |
|
|