---
license: mit
task_categories:
- audio-to-audio
language:
- en
datasets:
- Blinorot/lensless_mic_librispeech
- Blinorot/lensless_mic_random
- Blinorot/lensless_mic_songdescriber
---

# Model Card for LenslessMic Reconstruction Algorithms

## Models Summary

Reconstruction algorithms from the paper
["LenslessMic: Audio Encryption and Authentication via Lensless Computational Imaging"](https://arxiv.org/abs/2509.16418).

To download the models and work with them, use our [official repository](https://github.com/Blinorot/LenslessMic).

## Models Details

The models are saved in the following format:

```
.
└── checkpoint_tag
    ├── checkpoint_name.pth # PyTorch checkpoint with model state dict under 'state_dict' key.
    └── config.yaml # Hydra config used to train the model
```
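
As an illustration of the layout above, the following minimal sketch locates the files inside one checkpoint directory. The helper name `find_checkpoint_files` is hypothetical and not part of the official repository, which remains the supported way to download and run the models. Loading the weights is only indicated in a comment, because it requires PyTorch and the model class from the repository.

```python
from pathlib import Path


def find_checkpoint_files(tag_dir):
    """Return (list of .pth files, path to config.yaml) for one checkpoint_tag directory.

    Hypothetical helper: it only assumes the layout shown in this model card.
    """
    tag_dir = Path(tag_dir)
    config_path = tag_dir / "config.yaml"
    if not config_path.is_file():
        raise FileNotFoundError(f"No config.yaml found in {tag_dir}")

    # The card shows one checkpoint per directory; a sorted list is returned
    # so the caller can decide what to do if there are several.
    checkpoints = sorted(tag_dir.glob("*.pth"))
    if not checkpoints:
        raise FileNotFoundError(f"No .pth checkpoint found in {tag_dir}")
    return checkpoints, config_path


# With PyTorch installed, the weights could then be read roughly like this
# (the model itself must be built from config.yaml via the official repository):
#   import torch
#   checkpoint = torch.load(checkpoints[0], map_location="cpu")
#   state_dict = checkpoint["state_dict"]
```

The helper fails early when either file is missing, so a wrong or incomplete download is reported before any model code runs.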

The checkpoint tag has the following format:

```
{latent_size}_{training_dataset}_{loss_functions_used}_{reconstruction_algorithm}
```

1. The `latent_size` is either 16x16 or 32x32, depending on the neural audio codec used in the dataset.
2. The `training_dataset` is either `random` or `librispeech`. For `librispeech`, a grouped version can be used, tagged as
`group_n_m_r_c` (see [LenslessMic Version of Librispeech](https://huggingface.co/datasets/Blinorot/lensless_mic_librispeech)),
with 288x288 after `group` if the sensor image size is not the default 256x256. The version of the model that is
fine-tuned using `train-other` is tagged with `librispeech_other` and has `_ft` at the end.
3. The `loss_functions_used` part is usually MSE, SSIM, and Raw SSIM, as in the paper. We also provide checkpoints trained with MSE only,
with MSE and SSIM, and with all three plus an L1 waveform or Mel loss.
4. The `reconstruction_algorithm` is either `PSF_Unet4M_U5_Unet4M`, which corresponds to the Learned and R-Learned methods from the paper,
or `Unet8M`, which is the `NoPSF` method.

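
The naming scheme above can be split programmatically. The sketch below is hypothetical and not part of the official repository. It assumes that the latent size appears literally as `16x16` or `32x32`, that the algorithm name is the last component, and that `_ft` (when present) is the very last suffix of the tag. Because the exact spelling of the dataset and loss parts is not specified here, they are kept together as one unsplit string.

```python
from dataclasses import dataclass

# Algorithm names as listed in this model card.
KNOWN_ALGORITHMS = ("PSF_Unet4M_U5_Unet4M", "Unet8M")
# Dataset families as listed in this model card.
KNOWN_DATASETS = ("librispeech", "random")
LATENT_SIZES = ("16x16", "32x32")


@dataclass
class CheckpointTag:
    latent_size: str
    dataset_family: str    # "librispeech" or "random"
    dataset_and_loss: str  # unsplit middle part (dataset variant + loss tags)
    algorithm: str
    fine_tuned: bool


def parse_checkpoint_tag(tag: str) -> CheckpointTag:
    """Split a checkpoint tag into its parts (assumed format, see lead-in)."""
    # Assumption: a fine-tuned model carries "_ft" as the final suffix.
    fine_tuned = tag.endswith("_ft")
    core = tag[: -len("_ft")] if fine_tuned else tag

    # The algorithm is the last component; match it against the known names.
    algorithm = next((a for a in KNOWN_ALGORITHMS if core.endswith("_" + a)), None)
    if algorithm is None:
        raise ValueError(f"Unknown reconstruction algorithm in tag: {tag!r}")
    core = core[: -(len(algorithm) + 1)]

    # The latent size is the first underscore-separated component.
    latent_size, _, rest = core.partition("_")
    if latent_size not in LATENT_SIZES:
        raise ValueError(f"Unexpected latent size in tag: {tag!r}")

    dataset_family = next((d for d in KNOWN_DATASETS if rest.startswith(d)), None)
    if dataset_family is None:
        raise ValueError(f"Unknown training dataset in tag: {tag!r}")

    return CheckpointTag(latent_size, dataset_family, rest, algorithm, fine_tuned)
```

For a made-up tag such as `32x32_librispeech_mse_ssim_PSF_Unet4M_U5_Unet4M`, the parser reports the latent size `32x32`, the dataset family `librispeech`, and the algorithm `PSF_Unet4M_U5_Unet4M`. The tags of the published checkpoints may be spelled differently, so check them against the repository before relying on this.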
## Citation

If you use these models, please cite the paper as follows:

```bibtex
@article{grinberg2025lenslessmic,
  title = {LenslessMic: Audio Encryption and Authentication via Lensless Computational Imaging},
  author = {Grinberg, Petr and Bezzam, Eric and Prandoni, Paolo and Vetterli, Martin},
  journal = {arXiv preprint arXiv:2509.16418},
  year = {2025},
}
```