# Important
Please report your results from using this repository.


# Faster-RoKAN Core

Faster-RoKAN is a next-generation hybrid architecture that integrates Faster-KAN (Kolmogorov-Arnold Networks) into the BS-Roformer audio source separation model.

## Features
- **Isomorphic Conversion**: Convert standard BS-Roformer or MelBand-Roformer models to the RoKAN architecture with ZERO fidelity loss (MAE ≈ 0.0).
- **Faster-KAN (RSWAF)**: Replaces linear MLP layers with Reflectional Switch Wavelet Activation Functions for efficient, expressive, and detailed non-linear learning. High-frequency artifacts are filtered out through smooth geometric spline curves.
- **Gentle Training**: Optimized for standard consumer hardware with thermal management considerations.
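
For intuition about the Faster-KAN bullet above, here is a minimal sketch of an RSWAF-style basis. It assumes the form usually described for Faster-KAN, `1 - tanh((x - c) / h)^2`, evaluated at a set of grid centers `c`. The function name `rswaf_basis`, the grid, and the width `h` are illustrative assumptions and are not taken from this repository's `FasterKANLinear`.

```python
import math


def rswaf_basis(x, centers, h=1.0):
    """Return one smooth bump value per grid center for the scalar input x.

    Assumed form: 1 - tanh((x - c) / h) ** 2, which peaks at 1.0 when x == c
    and decays toward 0.0 as x moves away from c.
    """
    values = []
    for c in centers:
        t = math.tanh((x - c) / h)
        values.append(1.0 - t * t)
    return values


# Hypothetical grid: 5 centers evenly spaced on [-2, 2].
centers = [-2.0, -1.0, 0.0, 1.0, 2.0]
basis = rswaf_basis(0.0, centers)
print([round(v, 4) for v in basis])
```

In a KAN-style layer, a learned weight per center would scale each of these bumps, and the weighted sum forms the layer's non-linear response.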

## Includes Base Model
To get you started immediately, we have included a pre-converted **`Base_Model.ckpt`** and **`Base_Model.yaml`** in this package. 
This base model is already fully functional, so you can skip the conversion step entirely and jump straight to fine-tuning it on your own dataset!

## Setup
```bash
pip install -r requirements.txt
```

## Usage

### 0. (Optional) How to Make Your Own RoKAN Model
If you want to use a different checkpoint instead of the provided `Base_Model`, you can convert your existing standard `.ckpt` to the RoKAN format automatically with `convert_bs_to_rokan.py`.
**(Note: You do NOT need to do this if you just want to use the included `Base_Model`.)**

```bash
python convert_bs_to_rokan.py \
    --src_yaml dataset/Models/your_model.yaml \
    --src_ckpt dataset/Models/your_model.ckpt \
    --out_yaml converted/rokan.yaml \
    --out_ckpt converted/rokan.ckpt
```

**How it works (For both BS & MelBand):**
The `convert_bs_to_rokan.py` script automatically analyzes your `.yaml` configuration to determine whether it is a **BS-Roformer** or a **MelBand-Roformer** (by checking for the `num_bands` parameter). 
Depending on the architecture, it seamlessly intercepts the standard linear MLP components located inside the Siamese or Standard Transformer FeedForward blocks, and replaces them with our custom `FasterKANLinear` blocks. All base knowledge is perfectly preserved without any fidelity loss.
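
The architecture check described above can be sketched roughly as follows. This is not the code from `convert_bs_to_rokan.py`. It assumes that the presence of `num_bands` in the parsed model config means MelBand-Roformer and its absence means BS-Roformer. The function name `detect_architecture` and the example config keys other than `num_bands` are hypothetical.

```python
def detect_architecture(model_cfg):
    """Guess the Roformer variant from an already-parsed model config mapping.

    Assumption: MelBand-Roformer configs carry a `num_bands` key,
    BS-Roformer configs do not.
    """
    if "num_bands" in model_cfg:
        return "mel_band_roformer"
    return "bs_roformer"


# Hypothetical config fragments, as they might look after loading the YAML.
mel_cfg = {"dim": 384, "depth": 6, "num_bands": 60}
bs_cfg = {"dim": 384, "depth": 6, "freqs_per_bands": [2, 2, 4, 4]}

print(detect_architecture(mel_cfg))
print(detect_architecture(bs_cfg))
```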

### 1. Fine-tuning
Train only the new KAN spline parameters on your dataset to remove high-frequency artifacts and teach the model geometric patterns. The script will automatically unfreeze *only* the new KAN parameters while keeping the base knowledge perfectly intact.

```bash
python train_rokan.py --ckpt_path Base_Model.ckpt --yaml_path Base_Model.yaml
```
*(Store your vocal audio in `dataset/vocals/` and instrumental audio in `dataset/instrumentals/` before running).*
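
The "unfreeze only the KAN parameters" behaviour described in this section can be pictured with the sketch below. It is not taken from `train_rokan.py`. It assumes the new parameters can be recognised by a substring in their names; the marker `"kan"` and the example parameter names are hypothetical. The function only relies on `(name, param)` pairs with a `requires_grad` attribute, which is what PyTorch's `model.named_parameters()` yields, so a small stand-in class is used here to keep the example dependency-free.

```python
def unfreeze_only(named_params, marker="kan"):
    """Freeze every parameter, except those whose name contains `marker`.

    Returns the list of names that stay trainable.
    """
    trainable = []
    for name, param in named_params:
        param.requires_grad = marker in name.lower()
        if param.requires_grad:
            trainable.append(name)
    return trainable


class DemoParam:
    """Stand-in for a torch.nn.Parameter; only the requires_grad flag matters."""

    def __init__(self):
        self.requires_grad = True


# Hypothetical parameter names, for illustration only.
demo = [
    ("layers.0.attn.to_qkv.weight", DemoParam()),
    ("layers.0.ff.kan.spline_weight", DemoParam()),
]
trainable = unfreeze_only(demo)
print(trainable)
```

With a real model, the call would look like `unfreeze_only(model.named_parameters())`, after which only the still-trainable parameters need to be handed to the optimizer.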

### 2. Inference
Run source separation using the pre-tuned or fine-tuned model:
```bash
python run_infer_rokan.py \
    --model_path Base_Model.ckpt \
    --config_path Base_Model.yaml \
    --input_audio your_song.wav
```
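
To separate several songs in one go, the command above can be generated per file. The sketch below only builds argument lists using the flags shown in this section. The helper names `build_infer_command` and `build_batch` and the `*.wav` pattern are assumptions, not part of this repository.

```python
import sys
from pathlib import Path


def build_infer_command(input_audio, model_path="Base_Model.ckpt",
                        config_path="Base_Model.yaml"):
    """Build the argument list for one run_infer_rokan.py call."""
    return [
        sys.executable, "run_infer_rokan.py",
        "--model_path", str(model_path),
        "--config_path", str(config_path),
        "--input_audio", str(input_audio),
    ]


def build_batch(folder, pattern="*.wav"):
    """Return one command per matching file in `folder`, in sorted order."""
    return [build_infer_command(p) for p in sorted(Path(folder).glob(pattern))]


cmd = build_infer_command("your_song.wav")
print(" ".join(cmd[1:]))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.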

---

## Credits, Contact & Disclaimer

**All methods by Himadayon.**
**IMPORTANT:** If you release or distribute any models that use this architecture or are fine-tuned using this repository, you **must** explicitly credit `Himadayon` in your release notes or repository.

**Contact:**
If you have any questions or inquiries regarding this project, please send an email to:
📧 **Joker200702@gmail.com**
*(Please make sure to include a clear subject line and detailed contents in your email).*

**Disclaimer:** 
For the purpose of experimental verification and architectural testing, existing base models originally developed by **unwa** and **Aname** were used during the development of this project. I have not contacted these two GOATs; I just used their models without asking them. Big shoutout to unwa and Aname!