	# Important
	Please report the results you get from using this repository.


	# Faster-RoKAN Core

	Faster-RoKAN is a next-generation hybrid architecture that integrates Faster-KAN (Kolmogorov-Arnold Networks) into the BS-Roformer audio source separation model.

	## Features
	- Isomorphic Conversion: Convert standard BS-Roformer or MelBand-Roformer models to the RoKAN architecture with ZERO fidelity loss (MAE ≈ 0.0).
	- Faster-KAN (RSWAF): Replaces linear MLP layers with Reflectional Switch Activation Functions (RSWAF) for efficient, expressive, and detailed non-linear learning. High-frequency artifacts are filtered out through smooth, spline-like curves.
	- Gentle Training: Optimized for standard consumer hardware with thermal management considerations.
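
To make the RSWAF bullet concrete, here is a minimal sketch of the basis function as it is usually described for Faster-KAN, `1 - tanh((x - c) / h)^2`, evaluated on a grid of centers. The grid range, the number of centers, and the names `make_grid` and `rswaf_basis` are assumptions for illustration; they are not taken from this repository's `FasterKANLinear`.

```python
import math


def make_grid(low: float, high: float, num_centers: int) -> list[float]:
    """Evenly spaced centers on [low, high] (hypothetical helper)."""
    if num_centers < 2:
        raise ValueError("num_centers must be at least 2")
    step = (high - low) / (num_centers - 1)
    return [low + i * step for i in range(num_centers)]


def rswaf_basis(x: float, centers: list[float], h: float) -> list[float]:
    """Return 1 - tanh((x - c) / h)**2 for every center c."""
    return [1.0 - math.tanh((x - c) / h) ** 2 for c in centers]


# Assumed grid: five centers on [-2, 2] with width h = 1.
centers = make_grid(-2.0, 2.0, 5)
values = rswaf_basis(0.0, centers, h=1.0)
```

Each basis value peaks at 1 when `x` sits on its center and decays smoothly toward 0 on both sides, which is what makes the response symmetric around the center.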

	## Includes Base Model
	To get you started immediately, we have included a pre-converted `Base_Model.ckpt` and `Base_Model.yaml` in this package.
	This base model is ready to use as provided, so you can skip the conversion step entirely and jump straight to fine-tuning it on your own dataset.

	## Setup
	```bash
	pip install -r requirements.txt
	```

	## Usage

	### 0. (Optional) How to Make Your Own RoKAN Model
	If you want to use a different checkpoint rather than the provided `Base_Model`, you can convert your existing standard `.ckpt` to the RoKAN format automatically with `convert_bs_to_rokan.py`.
	(Note: You do NOT need to do this if you just want to use the included `Base_Model`.)

	```bash
	python convert_bs_to_rokan.py \
	--src_yaml dataset/Models/your_model.yaml \
	--src_ckpt dataset/Models/your_model.ckpt \
	--out_yaml converted/rokan.yaml \
	--out_ckpt converted/rokan.ckpt
	```
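
After converting, one way to check the MAE ≈ 0.0 claim is to run the same input through the original and the converted model and compare the outputs sample by sample. The sketch below shows only the comparison. How you obtain the two outputs depends on your setup, and `mean_absolute_error` is a hypothetical helper, not part of this repository.

```python
def mean_absolute_error(reference, candidate):
    """Mean of |reference[i] - candidate[i]| over two equal-length lists."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have the same length")
    if len(reference) == 0:
        raise ValueError("sequences must not be empty")
    total = sum(abs(r - c) for r, c in zip(reference, candidate))
    return total / len(reference)


# Placeholder samples standing in for the two models' separated audio.
original_out = [0.10, -0.25, 0.40, 0.00]
converted_out = [0.10, -0.25, 0.40, 0.00]
mae = mean_absolute_error(original_out, converted_out)
```

A value at or very near zero on real audio would support the conversion being lossless; a clearly non-zero value suggests the checkpoint and config do not match.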

	How it works (For both BS & MelBand):
	The `convert_bs_to_rokan.py` script automatically analyzes your `.yaml` configuration to determine whether it is a BS-Roformer or a MelBand-Roformer (by checking for the `num_bands` parameter).
	Depending on the architecture, it seamlessly intercepts the standard linear MLP components located inside the Siamese or Standard Transformer FeedForward blocks, and replaces them with our custom `FasterKANLinear` blocks. All base knowledge is perfectly preserved without any fidelity loss.
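
The detection step can be sketched as a lookup on the parsed config. The README only says that the script checks for `num_bands`. Treating its presence as the MelBand-Roformer case, and looking under a `model` key before falling back to the top level, are assumptions here, and `detect_architecture` is a hypothetical name.

```python
def detect_architecture(config: dict) -> str:
    """Guess the model family from a parsed YAML config (as a dict)."""
    # Assumption: hyperparameters may be nested under a "model" key.
    model_cfg = config.get("model", config)
    if not isinstance(model_cfg, dict):
        model_cfg = config
    # Assumption: only MelBand-Roformer configs define `num_bands`.
    if "num_bands" in model_cfg:
        return "mel_band_roformer"
    return "bs_roformer"


# Made-up config fragments for illustration.
mel_kind = detect_architecture({"model": {"num_bands": 60}})
bs_kind = detect_architecture({"model": {"dim": 384}})
```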

	### 1. Fine-tuning
	Train only the new KAN spline parameters on your dataset to remove high-frequency artifacts and teach the model geometric patterns. The script will automatically unfreeze only the new KAN parameters while keeping the base knowledge perfectly intact.

	```bash
	python train_rokan.py --ckpt_path Base_Model.ckpt --yaml_path Base_Model.yaml
	```
	(Store your vocal audio in `dataset/vocals/` and instrumental audio in `dataset/instrumentals/` before running).
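
The selective unfreezing described above can be sketched as a name-based filter over `(name, parameter)` pairs, the shape that PyTorch's `model.named_parameters()` yields. The substring `"kan"` used to recognize the new parameters and the example parameter names are assumptions; check the actual names in your checkpoint before relying on this.

```python
from types import SimpleNamespace


def unfreeze_only(named_params, keyword="kan"):
    """Enable gradients only for parameters whose name contains `keyword`.

    Returns the names that were left trainable.
    """
    trainable = []
    for name, param in named_params:
        param.requires_grad = keyword in name.lower()
        if param.requires_grad:
            trainable.append(name)
    return trainable


# Stand-ins with made-up names so the sketch runs without PyTorch.
# With a real model, pass model.named_parameters() instead.
params = [
    ("layers.0.attn.to_qkv.weight", SimpleNamespace(requires_grad=True)),
    ("layers.0.ff.kan.spline_weight", SimpleNamespace(requires_grad=False)),
]
trainable = unfreeze_only(params)
```

Printing the returned names before training is a cheap way to confirm that only the intended parameters will be updated.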

	### 2. Inference
	Run source separation using the included base model or your own fine-tuned model:
	```bash
	python run_infer_rokan.py \
	--model_path Base_Model.ckpt \
	--config_path Base_Model.yaml \
	--input_audio your_song.wav
	```
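
To separate several files, the command above can be wrapped in a loop. This sketch only builds the argument lists from the flags shown above. The `songs` folder is a placeholder, `build_infer_command` is a hypothetical helper, and the sketch assumes the script takes one `--input_audio` per call.

```python
from pathlib import Path


def build_infer_command(audio_path, model_path="Base_Model.ckpt",
                        config_path="Base_Model.yaml"):
    """Return the argv list for one run_infer_rokan.py invocation."""
    return [
        "python", "run_infer_rokan.py",
        "--model_path", str(model_path),
        "--config_path", str(config_path),
        "--input_audio", str(audio_path),
    ]


# "songs" is a placeholder folder; sorted() keeps the order stable.
commands = [build_infer_command(p) for p in sorted(Path("songs").glob("*.wav"))]
# To run them: import subprocess, then subprocess.run(cmd, check=True) per cmd.
```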

	---

	## Credits, Contact & Disclaimer

	All methods were developed by Himadayon.
	IMPORTANT: If you release or distribute any models that use this architecture or are fine-tuned using this repository, you must explicitly credit `Himadayon` in your release notes or repository.

	Contact:
	If you have any questions or inquiries regarding this project, please send an email to:
	📧 Joker200702@gmail.com
	(Please make sure to include a clear subject line and detailed contents in your email).

	Disclaimer:
	For the purpose of experimental verification and architectural testing, existing base models originally developed by unwa and Aname were used during the development of this project. I have not contacted these two GOATs; I just used their models without asking them. Big shoutout to unwa and Aname.