---
datasets:
- guilinhu/libri_conversation
language:
- en
license: mit
pipeline_tag: audio-to-audio
---

# Proactive Hearing Assistants that Isolate Egocentric Conversations

## More Information

This is the model implementation for the paper **[Proactive Hearing Assistants that Isolate Egocentric Conversations](https://www.arxiv.org/abs/2511.11473)**  
*Hu et al., 2025*.

For more information, please refer to our website: [https://proactivehearing.cs.washington.edu/](https://proactivehearing.cs.washington.edu/).
The code is available at: [https://github.com/guilinhu/proactive_hearing_assistant](https://github.com/guilinhu/proactive_hearing_assistant).

## Training and Evaluation

### 1. Installing Requirements

Before training or evaluating the model, please create an environment and install all dependencies:

```
pip install -r requirements.txt
```

### 2. Dataset (Required Before Training)

The synthetic conversation data used in this project is publicly available as the **LibriConversation** dataset:

Dataset link: https://huggingface.co/datasets/guilinhu/libri_conversation
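
If you prefer to fetch the archives from a script, the sketch below uses `snapshot_download` from the `huggingface_hub` package. The helper name `download_libri_conversation` is hypothetical, and the `*.tar.gz` filter assumes the archives in the dataset repository are named like the `train.tar.gz` and `val.tar.gz` examples; adjust the pattern if the repository layout differs.

```python
from pathlib import Path


def download_libri_conversation(local_dir, patterns=("*.tar.gz",)):
    """Download matching files from the dataset repo into local_dir.

    Hypothetical helper; the default pattern is an assumption about
    how the archives in the repository are named.
    """
    # Imported lazily so this module loads even if the package is missing.
    from huggingface_hub import snapshot_download

    target = Path(local_dir)
    target.mkdir(parents=True, exist_ok=True)
    return snapshot_download(
        repo_id="guilinhu/libri_conversation",
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=str(target),
        allow_patterns=list(patterns),
    )


# Example (requires network access and `pip install huggingface_hub`):
# download_libri_conversation("/path/to/download_directory")
```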

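The dataset files are distributed as compressed tar archives. After downloading them (for example, `train.tar.gz` and `val.tar.gz`), extract each archive into a directory of your choice. The output directory must already exist, because `tar -C` does not create it: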

```
tar -xvf train.tar.gz -C /path/to/output_directory/
tar -xvf val.tar.gz -C /path/to/output_directory/
```

Extraction produces directories containing the audio mixtures, target signals, and metadata used for model training. If you create your own dataset, make sure it follows the same layout.
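
The same extraction step can be done from Python with the standard-library `tarfile` module, which is convenient on systems without a `tar` binary. This is a minimal sketch: `extract_archive` is a hypothetical helper, not part of the repository, and it creates the output directory if it is missing.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, output_dir):
    """Extract a .tar or .tar.gz archive and return the top-level entry names."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    # Use the safer "data" extraction filter where this Python version supports it.
    kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
    # Mode "r:*" lets tarfile detect the compression (none, gzip, ...) itself.
    with tarfile.open(archive_path, "r:*") as tar:
        tar.extractall(out, **kwargs)
    return sorted(p.name for p in out.iterdir())


# Example:
# extract_archive("train.tar.gz", "/path/to/output_directory")
# extract_archive("val.tar.gz", "/path/to/output_directory")
```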

Once extracted, provide the directory paths in your model config file:

```
"train_data_args": {
  "input_dir": ["/absolute/path/to/train_directory"]
},

"val_data_args": {
  "input_dir": ["/absolute/path/to/validation_directory"]
}
```
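
A wrong path in the config usually surfaces only after training starts, so it can help to check the directories first. The sketch below assumes the config file is plain JSON with the two keys shown above; `missing_input_dirs` and `check_config_file` are hypothetical helpers, not part of the repository.

```python
import json
from pathlib import Path


def missing_input_dirs(config):
    """Return (section, path) pairs whose input_dir entry is not a directory."""
    missing = []
    for section in ("train_data_args", "val_data_args"):
        for directory in config.get(section, {}).get("input_dir", []):
            if not Path(directory).is_dir():
                missing.append((section, directory))
    return missing


def check_config_file(config_path):
    """Load a config file (assumed to be JSON) and report missing directories."""
    with open(config_path) as f:
        return missing_input_dirs(json.load(f))


# Example:
# for section, directory in check_config_file("<path_to_config>"):
#     print(f"{section}: directory not found: {directory}")
```

An empty result means every listed directory exists; it does not verify that the contents follow the expected dataset format.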

### 3. Model Training

To train the model, run:

```
python src/train_joint.py --config <path_to_config> --run_dir <path_to_model_checkpoint>
```

To resume training, make sure that `<path_to_model_checkpoint>` points to the same directory used previously, and rerun the command above.


### 4. Model Evaluation

To evaluate the model, run:

```
python eval.py <path_to_testing_dataset> <path_to_model_checkpoint> --use_cuda --save
```


## Citation

If you use our work, please cite:

```
@inproceedings{hu2025proactive,
  title={Proactive Hearing Assistants that Isolate Egocentric Conversations},
  author={Hu, Guilin and Itani, Malek and Chen, Tuochao and Gollakota, Shyamnath},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing},
  pages={25377--25394},
  year={2025}
}
```