metadata
license: cc-by-4.0
tags:
- audio-visual-learning
- joint-embedding
- acoustics
- room-geometry
- computer-vision
- pytorch
library_name: pytorch
AGREE
Official model weights for AGREE, introduced in the paper "Few-shot Acoustic Synthesis with Multimodal Flow Matching" by Amandine Brunetto (CVPR 2026).
AGREE is a joint embedding model for acoustics and room geometry. It learns a shared representation between room impulse responses (RIRs) and panoramic depth maps captured at the receiver position.
The model can be used for:
- Evaluating geometry consistency of generated RIRs (via retrieval metrics and Fréchet distance, as done in the paper)
- Downstream multimodal learning tasks involving acoustics and geometry
- Audio-visual representation learning
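As a rough illustration of the retrieval-style evaluation mentioned above, the sketch below computes recall@k between paired RIR and depth embeddings. It is a dependency-free toy and not the paper's evaluation code: the function names `l2_normalize` and `recall_at_k`, the use of cosine similarity, and the convention that row `i` of both inputs comes from the same receiver position are assumptions made for this example. In practice the embeddings would come from the AGREE encoders in the official codebase.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def recall_at_k(rir_embeddings, depth_embeddings, k=1):
    """Fraction of RIR queries whose paired depth embedding is in the top-k.

    Assumption: rir_embeddings[i] and depth_embeddings[i] form a true pair.
    Ranking uses cosine similarity (hypothetical choice for this sketch).
    """
    if len(rir_embeddings) != len(depth_embeddings):
        raise ValueError("expected one depth embedding per RIR embedding")
    if not rir_embeddings:
        raise ValueError("need at least one pair")

    queries = [l2_normalize(v) for v in rir_embeddings]
    gallery = [l2_normalize(v) for v in depth_embeddings]

    hits = 0
    for i, query in enumerate(queries):
        # Cosine similarity between this RIR and every depth embedding.
        sims = [sum(a * b for a, b in zip(query, g)) for g in gallery]
        # Indices of gallery items, most similar first.
        ranked = sorted(range(len(gallery)), key=lambda j: sims[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(queries)


if __name__ == "__main__":
    # Toy 2-D embeddings; real AGREE embeddings are higher dimensional.
    rir = [[1.0, 0.0], [0.0, 1.0]]
    depth = [[2.0, 0.0], [0.0, 3.0]]
    print("recall@1:", recall_at_k(rir, depth, k=1))
```

A generated RIR that is consistent with the room geometry should retrieve its own depth map more often, so higher recall@k indicates better geometry consistency under these assumptions.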
This repository contains the pretrained weights. To run AGREE, please use the official codebase.
Available checkpoints
| file | description |
|---|---|
| AGREE_AR.ckpt | Model trained on the Acoustic Rooms (AR) training set. Intended for downstream tasks. |
| AGREE_fullAR.ckpt | Model trained on the full AR dataset. Used in the paper for evaluation of RIR generation. |
| AGREE_fullHAA.ckpt | Model fine-tuned on the full HAA dataset. Used for evaluation of RIR generation. |
Download
Weights can be downloaded with:
huggingface-cli download AmandineBtto/AGREE --local-dir weights/AGREE
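For scripted setups, a single checkpoint can also be fetched from Python. The sketch below uses `hf_hub_download` from the `huggingface_hub` package. The helper name `download_checkpoint` is hypothetical, and it assumes the checkpoint files listed in the table sit at the root of the repository under exactly those names.

```python
REPO_ID = "AmandineBtto/AGREE"

# File names taken from the checkpoint table; assumed to be at the repo root.
CHECKPOINTS = ("AGREE_AR.ckpt", "AGREE_fullAR.ckpt", "AGREE_fullHAA.ckpt")


def download_checkpoint(filename, local_dir="weights/AGREE"):
    """Download one AGREE checkpoint and return its local path.

    Hypothetical helper for illustration; requires `pip install huggingface_hub`.
    """
    if filename not in CHECKPOINTS:
        raise ValueError(f"unknown checkpoint {filename!r}; expected one of {CHECKPOINTS}")

    # Imported lazily so the module can be imported without the dependency.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=filename, local_dir=local_dir)


if __name__ == "__main__":
    path = download_checkpoint("AGREE_AR.ckpt")
    print("checkpoint saved to", path)
```

Downloading only the checkpoint you need avoids pulling all three files. Loading the weights into a model is still done through the official codebase.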