---
pipeline_tag: other
language: en
library_name: pytorch
license: apache-2.0
tags:
- music
- midi
- mir
- deduplication
- caugbert
model-index:
- name: LMD Deduplication - CAugBERT
  results:
  - task:
      type: representation-learning
      name: symbolic music representation learning
    dataset:
      type: midi
      name: Lakh MIDI Dataset
    metrics:
    - type: F1
      value: 0.493
---
# LMD Deduplication Supplements
This repository provides the pre-trained CAugBERT model checkpoint used in:

**"On the De-duplication of the Lakh MIDI Dataset" (ISMIR 2025)**

[[Paper]](https://ismir2025program.ismir.net/poster_188.html) | [[GitHub Code]](https://github.com/jech2/LMD_Deduplication)

---
# Usage
You can either integrate this checkpoint into the main repository for inference, or load it directly:
```bash
# Option 1: Run inference in the main repo
poetry run python inference.py # make sure yamls/inference.yaml paths are correct
```
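```python
# Option 2: Load checkpoint manually
import torch
from contrastive_musicbert.model.BERT import BERT_Lightning

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint_path = "path/to/checkpoint"  # placeholder: point this at your local copy of the checkpoint

model = BERT_Lightning(...).to(device)  # see .hydra/config.yaml for arguments
checkpoint = torch.load(checkpoint_path, map_location="cpu")  # load tensors on CPU first
model.load_state_dict(checkpoint['state_dict'])
model.eval()  # switch to inference mode (disables dropout)
```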
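If the constructor arguments do not match the ones used for training, `load_state_dict` will typically raise an error about missing or unexpected keys. The helper below is a minimal sketch for diagnosing such a mismatch before loading. The function name `diff_state_dict_keys` and the toy parameter names are hypothetical and not part of the repository; in practice you would pass `model.state_dict().keys()` and `checkpoint['state_dict'].keys()`.

```python
from typing import Dict, Iterable, List


def diff_state_dict_keys(
    model_keys: Iterable[str], checkpoint_keys: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare parameter names a model expects with those stored in a checkpoint."""
    model_set, ckpt_set = set(model_keys), set(checkpoint_keys)
    return {
        # Names the model defines but the checkpoint does not contain.
        "missing": sorted(model_set - ckpt_set),
        # Names stored in the checkpoint that the model does not define.
        "unexpected": sorted(ckpt_set - model_set),
    }


# Toy key names for illustration only (hypothetical, not the real CAugBERT parameters).
report = diff_state_dict_keys(
    model_keys=["encoder.weight", "encoder.bias", "head.weight"],
    checkpoint_keys=["encoder.weight", "encoder.bias", "proj.weight"],
)
print(report)
```

Two empty lists suggest that the parameter names line up; shape mismatches would still only surface when `load_state_dict` is called.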
# Note
If you have any questions regarding the checkpoint, please contact:
Eunjin Choi (jech@kaist.ac.kr)