# Important
Please report your results from using this repository.
# Faster-RoKAN Core
Faster-RoKAN is a next-generation hybrid architecture that integrates Faster-KAN (Kolmogorov-Arnold Networks) into the BS-Roformer audio source separation model.
## Features
- **Isomorphic Conversion**: Convert standard BS-Roformer or MelBand-Roformer models to the RoKAN architecture with ZERO fidelity loss (MAE ≈ 0.0).
- **Faster-KAN (RSWAF)**: Replaces linear MLP layers with Reflectional Switch Activation Functions (RSWAF) for efficient, expressive, and detailed non-linear learning. High-frequency artifacts are filtered out through smooth geometric spline curves.
- **Gentle Training**: Optimized for standard consumer hardware with thermal management considerations.
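
To make the RSWAF idea concrete, here is a minimal sketch of the basis as it is commonly written for Faster-KAN, `1 - tanh((x - c) / h)^2`, evaluated on a small grid. The grid centers, width, weights, and function names are illustrative assumptions, not values or code taken from this repository.

```python
import math


def rswaf_basis(x, centers, width):
    """Return one basis value per grid center: 1 - tanh((x - c) / width)^2."""
    return [1.0 - math.tanh((x - c) / width) ** 2 for c in centers]


def rswaf_unit(x, centers, width, weights):
    """Weighted sum of the basis responses: a single learnable 1-D function."""
    basis = rswaf_basis(x, centers, width)
    return sum(w * b for w, b in zip(weights, basis))


# Hypothetical grid: 5 evenly spaced centers on [-2, 2] (assumption).
centers = [-2.0, -1.0, 0.0, 1.0, 2.0]
width = 1.0
# Hypothetical learned weights, one per center (assumption).
weights = [0.0, 0.5, 1.0, 0.5, 0.0]

for x in (-1.0, 0.0, 1.0):
    print(x, round(rswaf_unit(x, centers, width, weights), 4))
```

Each basis function is a smooth bump that peaks at its center and decays on both sides, so a weighted sum of them stays smooth as well.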
## Includes Base Model
To get you started immediately, we have included a pre-converted **`Base_Model.ckpt`** and **`Base_Model.yaml`** in this package.
This base model already works out of the box, so you can skip the conversion step entirely and jump straight to fine-tuning it on your own dataset!
## Setup
```bash
pip install -r requirements.txt
```
## Usage
### 0. (Optional) How to Make Your Own RoKAN Model
If you want to use a different checkpoint rather than the provided `Base_Model`, you can convert your existing standard `.ckpt` to the RoKAN format automatically with `convert_bs_to_rokan.py`.
**(Note: You do NOT need to do this if you just want to use the included `Base_Model`.)**
```bash
python convert_bs_to_rokan.py \
--src_yaml dataset/Models/your_model.yaml \
--src_ckpt dataset/Models/your_model.ckpt \
--out_yaml converted/rokan.yaml \
--out_ckpt converted/rokan.ckpt
```
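
If you want to check the "MAE ≈ 0.0" claim for your own conversion, one option is to run the same input through the original and converted models and compare the outputs. The sketch below only shows the comparison step on plain lists of samples; the sample values and the tolerance are placeholders, and how you obtain the two outputs depends on your setup.

```python
def mean_absolute_error(reference, converted):
    """Average absolute difference between two equally long sample sequences."""
    if len(reference) != len(converted):
        raise ValueError("outputs must have the same length")
    if not reference:
        raise ValueError("outputs must not be empty")
    total = sum(abs(r - c) for r, c in zip(reference, converted))
    return total / len(reference)


# Placeholder outputs (assumption): in practice these would be the separated
# waveforms produced by the original and the converted checkpoint.
original_out = [0.12, -0.40, 0.33, 0.05]
converted_out = [0.12, -0.40, 0.33, 0.05]

mae = mean_absolute_error(original_out, converted_out)
print(f"MAE = {mae:.2e}")

# Hypothetical tolerance (assumption); pick one that suits your precision.
assert mae < 1e-6
```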
**How it works (For both BS & MelBand):**
The `convert_bs_to_rokan.py` script automatically analyzes your `.yaml` configuration to determine whether it is a **BS-Roformer** or a **MelBand-Roformer** (by checking for the `num_bands` parameter).
Depending on the architecture, it seamlessly intercepts the standard linear MLP components located inside the Siamese or Standard Transformer FeedForward blocks, and replaces them with our custom `FasterKANLinear` blocks. All base knowledge is perfectly preserved without any fidelity loss.
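
The architecture check described above can be sketched roughly as follows. This is not the code from `convert_bs_to_rokan.py`: it assumes that the presence of `num_bands` indicates MelBand-Roformer, that the model settings may sit under a `model` key, and that the config has already been loaded into a dictionary. The function name and example values are made up for illustration.

```python
def detect_architecture(config):
    """Guess the model family from a parsed YAML config (plain dict)."""
    # Assumption: model settings live under "model"; otherwise use the top level.
    model_cfg = config.get("model", config)
    # Assumption: only MelBand-Roformer configs define `num_bands`.
    if "num_bands" in model_cfg:
        return "mel_band_roformer"
    return "bs_roformer"


# Made-up example configs (assumption), not taken from the included YAML files.
mel_cfg = {"model": {"dim": 384, "num_bands": 60}}
bs_cfg = {"model": {"dim": 384, "freqs_per_bands": [2, 2, 4]}}

print(detect_architecture(mel_cfg))
print(detect_architecture(bs_cfg))
```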
### 1. Fine-tuning
Train only the new KAN spline parameters on your dataset to remove high-frequency artifacts and teach the model geometric patterns. The script will automatically unfreeze *only* the new KAN parameters while keeping the base knowledge perfectly intact.
```bash
python train_rokan.py --ckpt_path Base_Model.ckpt --yaml_path Base_Model.yaml
```
*(Store your vocal audio in `dataset/vocals/` and instrumental audio in `dataset/instrumentals/` before running).*
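
Unfreezing only the KAN parameters usually comes down to toggling `requires_grad` by parameter name. The sketch below shows that pattern with small stand-in classes so it runs without PyTorch; the same function would work on any object exposing `named_parameters()`. The `"kan"` keyword and the parameter names are hypothetical, and `train_rokan.py` may select parameters differently.

```python
class Param:
    """Minimal stand-in for a tensor parameter with a requires_grad flag."""

    def __init__(self):
        self.requires_grad = True


class ToyModel:
    """Stand-in that yields (name, parameter) pairs like torch.nn.Module."""

    def __init__(self, names):
        self._params = {name: Param() for name in names}

    def named_parameters(self):
        return self._params.items()


def unfreeze_only(model, keyword):
    """Freeze every parameter except those whose name contains `keyword`."""
    trainable = []
    for name, param in model.named_parameters():
        param.requires_grad = keyword in name
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Hypothetical parameter names (assumption), chosen only for the demo.
model = ToyModel([
    "layers.0.attn.to_qkv.weight",
    "layers.0.ff.kan.spline_weight",
    "layers.0.ff.kan.grid",
])

print(unfreeze_only(model, "kan"))
```

Printing the returned list before training is a cheap way to confirm that only the intended parameters will receive gradients.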
### 2. Inference
Run source separation using the pre-tuned or fine-tuned model:
```bash
python run_infer_rokan.py \
--model_path Base_Model.ckpt \
--config_path Base_Model.yaml \
--input_audio your_song.wav
```
---
## Credits, Contact & Disclaimer
**All methods made by Himadayon.**
**IMPORTANT:** If you release or distribute any models that use this architecture or are fine-tuned using this repository, you **must** explicitly credit `Himadayon` in your release notes or repository.
**Contact:**
If you have any questions or inquiries regarding this project, please send an email to:
📧 **Joker200702@gmail.com**
*(Please make sure to include a clear subject line and detailed contents in your email).*
**Disclaimer:**
For the purpose of experimental verification and architectural testing, existing base models originally developed by **unwa** and **Aname** were used during the development of this project. I have not contacted either of them; I just used their models without asking. Big shoutout to unwa and Aname!