# VAE Audio Evaluation

This directory contains the script and resources for evaluating model performance on audio reconstruction tasks. The primary script, `eval_compare_matrix.py`, computes a suite of objective metrics that compare the audio reconstructed by a model against the original ground-truth audio.

## Features

- **Comprehensive Metrics**: Calculates a wide range of industry-standard and research-grade metrics (a computation sketch follows this list):
  - **Time-Domain**: Scale-Invariant Signal-to-Distortion Ratio (SI-SDR).
  - **Frequency-Domain**: Multi-Resolution STFT Loss and Multi-Resolution Mel-Spectrogram Loss.
  - **Phase**: Multi-Resolution Phase Coherence (both per-channel and inter-channel for stereo).
  - **Loudness**: Integrated Loudness (LUFS-I), Loudness Range (LRA), and True Peak, analyzed using `ffmpeg`.
- **Batch Processing**: Automatically discovers and processes multiple model output directories.
- **File Matching**: Pairs reconstructed audio files (e.g., `*_vae_rec.wav`) with their corresponding ground-truth files (e.g., `*.wav`).
- **Robust & Resilient**: Handles missing files, audio processing errors, and varying sample rates gracefully.
- **Organized Output**: Saves aggregated results in both machine-readable (`.json`) and human-readable (`.txt`) formats for each model.
- **Command-Line Interface**: Easy-to-use CLI for specifying the input directory and other options.
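As a reference point, here is a minimal sketch of how the time- and frequency-domain metrics above can be computed with `torch` and `auraloss`. The loss configurations are illustrative assumptions, not necessarily the settings used by `eval_compare_matrix.py`:

```python
import torch
import auraloss

# Dummy stereo pair with shape (batch, channels, samples)
gen = torch.randn(1, 2, 48000)
gt = torch.randn(1, 2, 48000)

# Time domain: auraloss returns losses to minimize, so the SI-SDR
# value in dB is the negated loss (per auraloss's convention)
sisdr = -auraloss.time.SISDRLoss()(gen, gt)

# Frequency domain: multi-resolution STFT distance
# (FFT/hop/window sizes here are assumptions)
mrstft = auraloss.freq.MultiResolutionSTFTLoss(
    fft_sizes=[512, 1024, 2048],
    hop_sizes=[128, 256, 512],
    win_lengths=[512, 1024, 2048],
)
stft_distance = mrstft(gen, gt)

print(f"SI-SDR: {sisdr.item():.2f} dB | MR-STFT: {stft_distance.item():.4f}")
```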
## Prerequisites

### 1. Python Environment

Ensure you have a Python environment (3.8 or newer recommended) with the required packages installed. You can install them using pip:

```bash
pip install torch torchaudio auraloss numpy
```
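You can verify that the dependencies import correctly with a quick check:

```bash
python -c "import torch, torchaudio, auraloss, numpy; print('dependencies OK')"
```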
### 2. FFmpeg

The script relies on `ffmpeg` for loudness analysis. You must have `ffmpeg` installed and accessible in your system's PATH.

**On Ubuntu/Debian:**

```bash
sudo apt update && sudo apt install ffmpeg
```

**On macOS (using Homebrew):**

```bash
brew install ffmpeg
```

**On Windows:**

Download the executable from the [official FFmpeg website](https://ffmpeg.org/download.html) and add its `bin` directory to your system's PATH environment variable.

**With Conda (any platform):**

```bash
conda install -c conda-forge 'ffmpeg<7'
```

You can verify the installation by running:

```bash
ffmpeg -version
```
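For reference, the loudness statistics the script aggregates (LUFS-I, LRA, True Peak) are the kind reported by `ffmpeg`'s `loudnorm` filter; the exact invocation used by `eval_compare_matrix.py` may differ:

```bash
# Print integrated loudness (input_i), loudness range (input_lra),
# and true peak (input_tp) as JSON, without writing any output audio
ffmpeg -hide_banner -i song1.wav -af loudnorm=print_format=json -f null -
```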
## Directory Structure

The script expects a specific directory structure for the evaluation data. The root input directory should contain subdirectories, where each subdirectory represents a different model or experiment to be evaluated.

Inside each model's subdirectory, place the pairs of ground-truth and reconstructed audio files. The script identifies pairs based on a naming convention:

- **Ground Truth**: `your_audio_file.wav`
- **Reconstructed**: `your_audio_file_vae_rec.wav`

Here is an example structure:

```
/path/to/your/evaluation_data/
├── model_A/
│   ├── song1.wav              # Ground Truth 1
│   ├── song1_vae_rec.wav      # Reconstructed 1
│   ├── song2.wav              # Ground Truth 2
│   ├── song2_vae_rec.wav      # Reconstructed 2
│   └── ...
├── model_B/
│   ├── trackA.wav
│   ├── trackA_vae_rec.wav
│   └── ...
└── ...
```
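A minimal sketch of the pairing logic this convention implies, assuming glob-based discovery (the actual code in `eval_compare_matrix.py` may differ):

```python
from pathlib import Path

def find_pairs(model_dir: Path):
    """Yield (ground_truth, reconstruction) path pairs for one model directory."""
    for rec in sorted(model_dir.glob("*_vae_rec.wav")):
        # Strip the "_vae_rec" suffix to recover the ground-truth filename
        gt = rec.with_name(rec.name.replace("_vae_rec.wav", ".wav"))
        if gt.exists():
            yield gt, rec

for gt, rec in find_pairs(Path("/path/to/your/evaluation_data/model_A")):
    print(f"{gt.name}  <->  {rec.name}")
```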
## Usage

Run the evaluation script from the command line, pointing it to the root directory containing your model outputs.

```bash
python eval_compare_matrix.py --input_dir /path/to/your/evaluation_data/
```

### Command-Line Arguments

- `--input_dir` (required): The path to the root directory containing the model folders (e.g., `/path/to/your/evaluation_data/`).
- `--force` (optional): If specified, the script re-runs the evaluation for all models, even if results files (`evaluation_results.json`) already exist. By default, it skips models that have already been evaluated.
- `--echo` (optional): If specified, the script prints the detailed evaluation metrics for each individual audio pair during processing. By default, only the progress bar and final summary are shown.

### Example

```bash
python eval/eval_compare_matrix.py --input_dir ./results/
```
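To re-run all evaluations and print per-file metrics, combine the optional flags:

```bash
python eval/eval_compare_matrix.py --input_dir ./results/ --force --echo
```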
## Output

After running, the script generates two files inside each model's directory:

1. **`evaluation_results.json`**: A JSON file containing the aggregated average of all computed metrics. This is ideal for programmatic analysis (see the sketch at the end of this section).

```json
{
    "model_name": "model_A",
    "file_count": 50,
    "avg_sisdr": 15.78,
    "avg_mel_distance": 0.45,
    "avg_stft_distance": 0.89,
    "avg_per_channel_coherence": 0.95,
    "avg_interchannel_coherence": 0.92,
    "avg_gen_lufs-i": -14.2,
    "avg_gt_lufs-i": -14.0,
    ...
}
```
2. **`evaluation_summary.txt`**: A human-readable text file summarizing the results.

```
model_name: model_A
file_count: 50
avg_sisdr: 15.78...
avg_mel_distance: 0.45...
...
```

This allows for quick inspection of a model's performance without needing to parse the JSON.
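When you do want to work with the JSON programmatically, here is a minimal sketch of collecting the per-model results and ranking them by average SI-SDR (field names taken from the sample above; assumes the directory layout shown earlier):

```python
import json
from pathlib import Path

root = Path("/path/to/your/evaluation_data")

# Gather one result dict per evaluated model
results = [
    json.loads(p.read_text())
    for p in root.glob("*/evaluation_results.json")
]

# Higher SI-SDR is better, so sort descending
for r in sorted(results, key=lambda r: r["avg_sisdr"], reverse=True):
    print(f'{r["model_name"]}: {r["avg_sisdr"]:.2f} dB SI-SDR')
```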