| --- |
| license: mit |
| datasets: |
| - KorQuAD/squad_kor_v1 |
| language: |
| - ko |
| metrics: |
| - accuracy |
| --- |
| |
| # DPR-KO |
|
|
| ## 1. Intro |
|
|
| This is a **Korean DPR model (Question Encoder)**. |
| It was trained with new code that is entirely different from Facebook's DPR code. |
| It can be used for dense-vector-based semantic search. |
| Questions are encoded with the Question Encoder, and texts with the Context Encoder. |
|
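As a rough illustration of how the two encoders are used together, the sketch below ranks contexts by the inner product between a question vector and each context vector, which is the similarity used in the original DPR formulation. The toy 3-dimensional vectors stand in for real encoder outputs, and the helper names `dot` and `rank_contexts` are hypothetical; the actual retrieval code lives in the GitHub repository and may differ.

```python
from typing import List, Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return sum(a * b for a, b in zip(u, v))


def rank_contexts(
    question_vec: Sequence[float],
    context_vecs: Sequence[Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (context_index, score) pairs, highest inner product first."""
    scored = [(i, dot(question_vec, c)) for i, c in enumerate(context_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors (assumption): in practice these come from the
# Question Encoder and the Context Encoder respectively.
question = [0.9, 0.1, 0.0]
contexts = [
    [0.1, 0.8, 0.1],  # index 0
    [0.8, 0.2, 0.0],  # index 1
    [0.0, 0.1, 0.9],  # index 2
]
ranking = rank_contexts(question, contexts, top_k=2)
print(ranking)  # context 1 ranks first, then context 0
```

For a corpus of this size, the exhaustive loop would normally be replaced by a vector index, but the scoring idea is the same.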
|
| - Github: [https://github.com/snumin44/DPR-KO](https://github.com/snumin44/DPR-KO) |
| - Context Encoder: [https://huggingface.co/snumin44/biencoder-ko-bert-context](https://huggingface.co/snumin44/biencoder-ko-bert-context) |
|
|
|
|
| ## 2. Experiment settings |
|
|
| - Base model: klue/bert-base |
| - Dataset: KorQuAD v1 |
| - Wiki dump: kowiki-latest-pages-articles.xml.bz2 (2024/07/23) |
| - Sentences per chunk: 5 |
| - Total chunks: about 1.6 million |
| - BM25 weight: 0.3 |
| - 1 A100 GPU |
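
The chunking setting above (5 sentences per chunk) can be sketched as follows. This is a minimal illustration that assumes sentences are already split and that chunks are consecutive and non-overlapping; the function name `chunk_sentences` is hypothetical, and the repository's actual preprocessing (sentence splitter, overlap, title handling) is not specified here.

```python
from typing import List


def chunk_sentences(sentences: List[str], sentences_per_chunk: int = 5) -> List[str]:
    """Join consecutive sentences into chunks of at most `sentences_per_chunk`."""
    if sentences_per_chunk < 1:
        raise ValueError("sentences_per_chunk must be at least 1")
    chunks = []
    for start in range(0, len(sentences), sentences_per_chunk):
        group = sentences[start:start + sentences_per_chunk]
        chunks.append(" ".join(group))
    return chunks


# 12 toy sentences standing in for one Wikipedia article.
article = [f"Sentence {n}." for n in range(1, 13)]
chunks = chunk_sentences(article)
print(len(chunks))  # the last chunk holds the 2 leftover sentences
```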
|
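One plausible reading of the BM25 weight of 0.3 is a weighted sum of dense and BM25 scores. The sketch below min-max normalizes each score set and adds the BM25 score scaled by the weight. Both the normalization and the exact combination formula are assumptions made for illustration, as are the names `min_max` and `hybrid_scores`; consult the repository for the formula actually used.

```python
from typing import Dict


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; empty input or all-equal scores map to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def hybrid_scores(
    dense: Dict[str, float],
    bm25: Dict[str, float],
    bm25_weight: float = 0.3,
) -> Dict[str, float]:
    """Assumed combination: normalized dense score + weight * normalized BM25 score."""
    dense_n, bm25_n = min_max(dense), min_max(bm25)
    docs = set(dense_n) | set(bm25_n)
    return {
        d: dense_n.get(d, 0.0) + bm25_weight * bm25_n.get(d, 0.0)
        for d in docs
    }


# Toy scores for three chunks (made-up numbers).
dense = {"a": 80.0, "b": 78.0, "c": 70.0}
bm25 = {"a": 2.0, "b": 12.0, "c": 7.0}
combined = hybrid_scores(dense, bm25)
best = max(combined, key=combined.get)
print(best)  # chunk "b" overtakes "a" once BM25 evidence is added
```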
|
| ## 3. Performance |
|
|
| |(%)|BM25 (w/o DPR-KO)|DPR-KO (w/o BM25)|DPR-KO (with BM25)| |
| |:---:|:---:|:---:|:---:| |
| |Top1 Acc|36.25 |**48.98** |71.16 | |
| |Top5 Acc|51.61 |**71.16** |86.75 | |
| |Top10 Acc|57.34 |**77.05** |90.28 | |
| |Top20 Acc|62.40 |**82.09** |92.66 | |
| |Top50 Acc|68.46 |**87.03** |94.86 | |
| |Top100 Acc|72.48 |**90.23** |96.02 | |
|
|
| ※ The BM25 model was trained on the full text of Korean Wikipedia. |
| ※ See the GitHub repository for the detailed code. |
|
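For readers unfamiliar with the metric in the table, Top-k accuracy in retrieval is commonly computed as the percentage of questions for which at least one of the top k retrieved chunks contains the answer. The sketch below assumes simple substring matching and uses made-up data; the function name `top_k_accuracy` is hypothetical, and the repository's evaluation script may match answers differently.

```python
from typing import List


def top_k_accuracy(retrieved: List[List[str]], answers: List[str], k: int) -> float:
    """Percentage of questions whose answer string appears in a top-k chunk."""
    if len(retrieved) != len(answers):
        raise ValueError("retrieved and answers must have the same length")
    if not answers:
        return 0.0
    hits = sum(
        any(answer in chunk for chunk in chunks[:k])
        for chunks, answer in zip(retrieved, answers)
    )
    return 100.0 * hits / len(answers)


# Retrieved chunks per question, best match first (toy data).
retrieved = [
    ["Seoul is the capital of South Korea.", "Busan is a port city."],
    ["Hangul has 24 basic letters.", "King Sejong created Hangul in 1443."],
]
answers = ["Seoul", "King Sejong"]

top1 = top_k_accuracy(retrieved, answers, k=1)
top2 = top_k_accuracy(retrieved, answers, k=2)
print(top1, top2)
```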
|
| ## Citing |
| ``` |
| ``` |
|
|