userx1 commited on
Commit
fa9b889
·
verified ·
1 Parent(s): d962890

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +3 -0
README.md CHANGED
@@ -14,6 +14,9 @@ v2 (continued) - https://github.com/alanngnet/CoverHunterMPS
14
  v1 (original) - https://github.com/Liu-Feng-deeplearning/CoverHunter
15
 
16
  >This is the official PyTorch implementation of the following paper:
 
17
  >https://arxiv.org/abs/2306.09025
 
 
18
  >COVERHUNTER: COVER SONG IDENTIFY WITH REFINED ATTENTION AND ALIGNMENTS Feng Liu, Deyi Tuo, Yinan Xu, Xintong Han
19
  >Abstract: Cover song identification (CSI) focuses on finding the same music with different versions in reference anchors given a query track. In this paper, we propose a novel system named CoverHunter that overcomes the shortcomings of existing detection schemes by exploring richer features with refined attention and alignments. CoverHunter contains three key modules: 1) A convolution-augmented transformer (i.e., Conformer) structure that captures both local and global feature interactions in contrast to previous methods mainly relying on convolutional neural networks; 2) An attention-based time pooling module that further exploits the attention in the time dimension; 3) A novel coarse-to-fine training scheme that first trains a network to roughly align the song chunks and then refines the network by training on the aligned chunks. At the same time, we also summarize some important training tricks used in our system that help achieve better results. Experiments on several standard CSI datasets show that our method significantly improves over state-of-the-art methods with an embedding size of 128 (2.3% on SHS100K-TEST and 17.7% on DaTacos).
 
14
  v1 (original) - https://github.com/Liu-Feng-deeplearning/CoverHunter
15
 
16
  >This is the official PyTorch implementation of the following paper:
17
+ >
18
  >https://arxiv.org/abs/2306.09025
19
+ >
20
+ >
21
  >COVERHUNTER: COVER SONG IDENTIFY WITH REFINED ATTENTION AND ALIGNMENTS Feng Liu, Deyi Tuo, Yinan Xu, Xintong Han
22
  >Abstract: Cover song identification (CSI) focuses on finding the same music with different versions in reference anchors given a query track. In this paper, we propose a novel system named CoverHunter that overcomes the shortcomings of existing detection schemes by exploring richer features with refined attention and alignments. CoverHunter contains three key modules: 1) A convolution-augmented transformer (i.e., Conformer) structure that captures both local and global feature interactions in contrast to previous methods mainly relying on convolutional neural networks; 2) An attention-based time pooling module that further exploits the attention in the time dimension; 3) A novel coarse-to-fine training scheme that first trains a network to roughly align the song chunks and then refines the network by training on the aligned chunks. At the same time, we also summarize some important training tricks used in our system that help achieve better results. Experiments on several standard CSI datasets show that our method significantly improves over state-of-the-art methods with an embedding size of 128 (2.3% on SHS100K-TEST and 17.7% on DaTacos).