haeylee committed · Commit 0ff7f54 (verified) · 1 Parent(s): 03f4184

Create README.md

# SSL-FT-PRON: Fine-tuned SSL Models for Automatic Pronunciation Assessment (APA)

**Author:** Haeyoung Lee (haeylee)
**Paper:** *Analysis of Various Self-Supervised Learning Models for Automatic Pronunciation Assessment (APSIPA ASC 2024)*
**Code:** https://github.com/hy310/ssl_finetuning

This repository on the Hub is a **collection of sub-checkpoints** for different SSL backbones (Wav2Vec2.0, HuBERT, WavLM) under three training strategies:

- **CTC**: ASR-style head with Connectionist Temporal Classification
- **Freeze**: feature extractor (CNN frontend) frozen during fine-tuning
- **General**: no CTC head; a small regression head predicts four APA scores
*(Accuracy, Fluency, Prosody, Total)*
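
The model card metadata in this commit lists `pearsonr` as the metric. As a rough illustration of how the four APA scores could be evaluated against human ratings, the sketch below computes one Pearson correlation per score dimension in plain Python. The names `SCORE_NAMES`, `pearson`, and `per_score_pearson` are hypothetical helpers, and the score order (Accuracy, Fluency, Prosody, Total) is an assumption taken from the list above, not a confirmed output layout of the checkpoints.

```python
import math

# Assumed output order of the regression head; hypothetical, not confirmed by the repo.
SCORE_NAMES = ("accuracy", "fluency", "prosody", "total")


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def per_score_pearson(predictions, targets):
    """Correlate predictions with targets separately for each score dimension.

    Both arguments are sequences of 4-tuples, one tuple per utterance.
    """
    return {
        name: pearson([p[i] for p in predictions], [t[i] for t in targets])
        for i, name in enumerate(SCORE_NAMES)
    }
```

Reporting one coefficient per dimension, rather than a single pooled value, keeps a weak Prosody predictor from being hidden by a strong Accuracy predictor.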

> Each variant lives in a **subdirectory**. Load it using the full path
> (e.g., `haeylee/ssl_ft_pron/wav2vec2/general/02_wav2vec2-large-960h`).
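
A minimal sketch of one way to load a variant, assuming the full path is split into a Hub repo id (`haeylee/ssl_ft_pron`) and a subfolder that is passed through the `subfolder` argument of `from_pretrained` in `transformers`. The helpers `split_hub_path` and `load_backbone` are hypothetical. `AutoModel` is also an assumption: it may restore only the SSL backbone, and the CTC or regression heads may require the model classes from the linked code repository.

```python
def split_hub_path(full_path):
    """Split 'owner/repo/sub/dir' into ('owner/repo', 'sub/dir')."""
    parts = full_path.strip("/").split("/")
    if len(parts) < 2:
        raise ValueError("expected at least 'owner/repo'")
    repo_id = "/".join(parts[:2])
    subfolder = "/".join(parts[2:])  # empty string when no subdirectory is given
    return repo_id, subfolder


def load_backbone(full_path):
    """Load a checkpoint stored in a repo subdirectory (downloads from the Hub)."""
    # Imported lazily so the path helper works without transformers installed.
    from transformers import AutoModel

    repo_id, subfolder = split_hub_path(full_path)
    # Assumption: the subfolder holds a standard config.json plus weights.
    return AutoModel.from_pretrained(repo_id, subfolder=subfolder)
```

For example, `load_backbone("haeylee/ssl_ft_pron/wav2vec2/general/02_wav2vec2-large-960h")` would request the `wav2vec2/general/02_wav2vec2-large-960h` directory of the `haeylee/ssl_ft_pron` repository.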

---

Files changed (1)
  1. README.md +21 -0
README.md ADDED
@@ -0,0 +1,21 @@
+ ---
+ base_model:
+ - facebook/wav2vec2-large
+ - facebook/wav2vec2-large-960h
+ - facebook/wav2vec2-large-lv60
+ - facebook/wav2vec2-large-xlsr-53
+ - facebook/wav2vec2-xls-r-300m
+ - facebook/hubert-large-ll60k
+ - facebook/hubert-base-ls960
+ - facebook/hubert-xlarge-ll60k
+ - facebook/hubert-xlarge-ls960-ft
+ - microsoft/wavlm-large
+ - microsoft/wavlm-base-plus
+ - microsoft/wavlm-base-plus-sv
+ tags:
+ - self-supervised-learning
+ - pronunciation-assessment
+ - speech
+ metrics:
+ - pearsonr
+ ---