# KoChatBART

[**BART**](https://arxiv.org/pdf/1910.13461.pdf) (**B**idirectional and **A**uto-**R**egressive **T**ransformers) is trained as an `autoencoder`: noise is added to part of the input text, and the model learns to reconstruct the original. Korean Chat BART (hereafter **KoChatBART**) is a Korean `encoder-decoder` language model trained on roughly **10GB** or more of Korean dialogue text, using the `Text Infilling` noise function from the paper. We release the resulting `KoChatBART-base`, a model that is robust for dialogue generation.

<img src=https://user-images.githubusercontent.com/55969260/205434343-b72641e9-d0f9-4b88-a334-9f904e0a35c5.png>
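To make the `Text Infilling` objective concrete, here is a minimal sketch of the noise function as the BART paper describes it: span lengths are drawn from a Poisson distribution (λ = 3), each span is replaced by a single mask token, and a zero-length span inserts a mask. The 30% masking budget and λ follow the paper; whether KoChatBART used the same values is an assumption, as are the `<mask>` string, the function names, and the simple rule for choosing where a span starts.

```python
import math
import random

MASK = "<mask>"  # assumed mask token string


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def text_infilling(tokens, mask_ratio=0.3, lam=3.0, seed=0):
    """Replace randomly chosen spans with one MASK token each."""
    rng = random.Random(seed)
    budget = int(round(len(tokens) * mask_ratio))  # max tokens to hide
    out, i = [], 0
    while i < len(tokens):
        # Simplification: start a span at each position with prob. mask_ratio.
        if budget > 0 and rng.random() < mask_ratio:
            span = min(sample_poisson(lam, rng), budget, len(tokens) - i)
            out.append(MASK)  # a zero-length span only inserts a mask
            i += span
            budget -= span
        else:
            out.append(tokens[i])
            i += 1
    return out


print(text_infilling("a b c d e f g h i j k l".split(), seed=1))
```

During pre-training the model sees the corrupted sequence as encoder input and is asked to produce the original sequence, so it has to infer both the content and the length of every hidden span.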
## Quick tour

```python
from transformers import AutoTokenizer, BartForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("BM-K/KoChatBART")
model = BartForConditionalGeneration.from_pretrained("BM-K/KoChatBART")

inputs = tokenizer("안녕 세상아!", return_tensors="pt")  # "Hello, world!"
outputs = model(**inputs)
```
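The snippet above runs a single forward pass. To produce text, the usual route in `transformers` is `generate`. The helper below is a hedged sketch, not part of the released code: `generate_text` and its default decoding settings are illustrative choices, and the pre-trained checkpoint is a denoising model, so fine-tuning on a dialogue dataset is assumed before its outputs read as replies.

```python
def generate_text(model, tokenizer, text, max_new_tokens=32, num_beams=4):
    """Encode `text`, run beam search, and decode the best sequence."""
    inputs = tokenizer(text, return_tensors="pt")
    # Pass only input_ids, in case the tokenizer returns extra fields
    # that the model's generate() does not accept.
    output_ids = model.generate(
        inputs["input_ids"],
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
        no_repeat_ngram_size=3,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

With the `model` and `tokenizer` loaded in the Quick tour, a call looks like `generate_text(model, tokenizer, "안녕 세상아!")`.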
## Pre-training data preprocessing

Datasets used:
- [Topic-specific everyday text conversation data](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=543)
- [Small-business customer order question-answer text](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=102)
- [Korean SNS](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=114)
- [AI language data for civil-complaint task automation](https://aihub.or.kr/aihubdata/data/view.do?currMenu=115&topMenu=100&aihubDataSe=realm&dataSetSn=619)

To train KoChatBART, we preprocessed these Korean dialogue datasets and merged them into a large Korean dialogue corpus.
1. To reduce redundancy, any expression repeated more than twice in a row, such as 'ㅋㅋㅋㅋㅋㅋ', was shortened to two repetitions ('ㅋㅋ').
2. Because very short examples can hurt training, we kept only examples whose total length exceeds 3 tokens under the KoBART tokenizer.
3. Pseudonymized data was removed.
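The first two preprocessing steps can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: it treats "repeated expression" as a single repeated character, takes the tokenizer as a `tokenize` callable (the README uses the KoBART tokenizer; the demo substitutes `str.split`), and leaves out step 3 because the README does not say how pseudonymized data is marked. The names `collapse_repeats` and `preprocess` are hypothetical.

```python
import re

# Any single character repeated three or more times in a row.
_REPEAT = re.compile(r"(.)\1{2,}")


def collapse_repeats(text):
    """Shorten runs of one repeated character to exactly two."""
    return _REPEAT.sub(r"\1\1", text)


def preprocess(utterances, tokenize, min_tokens=3):
    """Collapse repeats, then keep utterances longer than `min_tokens`."""
    kept = []
    for utt in utterances:
        utt = collapse_repeats(utt.strip())
        if len(tokenize(utt)) > min_tokens:
            kept.append(utt)
    return kept


corpus = ["ㅋㅋㅋㅋㅋㅋ 진짜 너무 웃기다", "네", "오늘 저녁 뭐 먹을까요"]
# Whitespace splitting stands in for the KoBART tokenizer here.
print(preprocess(corpus, tokenize=str.split))
```

Passing the tokenizer in as a callable keeps the length filter independent of any particular library; with a Hugging Face tokenizer, `tokenizer.tokenize` would be the natural argument.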
## Model

| Model | # of params | vocab size | Type | # of layers | # of heads | ffn_dim | hidden_dims |
| ------------- | :---------: | :-----: | :----------: | ---------: | ------: | ----------: | ----------: |
| `KoChatBART` | 139M | 50265 | Encoder | 6 | 16 | 3072 | 768 |
| | | | Decoder | 6 | 16 | 3072 | 768 |
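As a sanity check, the 139M figure can be reproduced from the table with a short calculation. The sketch below assumes the standard Hugging Face BART layout, none of which is stated in this README: token embeddings shared between encoder, decoder, and output layer; learned position embeddings for 1024 positions plus an offset of 2; biases on every linear layer; and one LayerNorm after each embedding block. The number of heads does not change the count.

```python
def count_bart_params(vocab=50265, d_model=768, ffn_dim=3072,
                      enc_layers=6, dec_layers=6, max_positions=1024):
    """Estimate the parameter count of a BART-style encoder-decoder."""
    def linear(n_in, n_out):
        return n_in * n_out + n_out  # weights plus bias

    layer_norm = 2 * d_model                      # scale and shift
    attention = 4 * linear(d_model, d_model)      # q, k, v, output projections
    ffn = linear(d_model, ffn_dim) + linear(ffn_dim, d_model)

    token_emb = vocab * d_model                   # assumed shared (tied)
    pos_emb = (max_positions + 2) * d_model       # assumed offset of 2

    enc_layer = attention + ffn + 2 * layer_norm
    dec_layer = 2 * attention + ffn + 3 * layer_norm  # self- and cross-attention

    encoder = pos_emb + layer_norm + enc_layers * enc_layer
    decoder = pos_emb + layer_norm + dec_layers * dec_layer
    return token_emb + encoder + decoder


print(f"{count_bart_params() / 1e6:.1f}M")  # 139.4M
```

Under these assumptions the vocabulary embedding alone accounts for about 38.6M parameters, more than a quarter of the model.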
## Dialogue generation evaluation

Each model was fine-tuned using the following code: [Dialogue Generator](https://github.com/2unju/KoBART_Dialogue_Generator). To evaluate dialogue generation, the tokenized responses produced at inference time were first decoded back to text. Both the reference and the generated responses were then tokenized with a BPE tokenizer, and overlap (BLEU) and distinct (Dist) scores were computed between them.

> **Warning** <br>
> Because the model was pre-trained mostly on short dialogue data, it is weak on tasks that require processing long text, such as summarization.
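For readers unfamiliar with the distinct metric mentioned above, here is a minimal sketch of Dist-n: the share of unique n-grams among all n-grams in the generated responses. It assumes the score is computed over the whole set of responses and reported as a percentage; the evaluation code may aggregate differently. `ngrams` and `distinct_n` are illustrative names, and the inputs are already-tokenized responses.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(responses, n):
    """Unique n-grams / total n-grams over all responses, in percent."""
    all_ngrams = [g for tokens in responses for g in ngrams(tokens, n)]
    if not all_ngrams:
        return 0.0
    return 100.0 * len(set(all_ngrams)) / len(all_ngrams)


generated = [["i", "am", "fine"], ["i", "am", "tired"]]
print(distinct_n(generated, 1), distinct_n(generated, 2))
```

A higher Dist-n means the model repeats itself less across responses, which is why it is reported alongside BLEU: BLEU rewards overlap with the reference, while Dist-n penalizes generic, repetitive replies.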
### Experimental results

- [Emotional dialogue data](https://github.com/songys/Chatbot_data)

|Training|Validation|Test|
|:----:|:----:|:----:|
|9,458|1,182|1,183|

| Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
|------------------------|:----:|:----:|:----:|:----:|:----:|
| KoBART | 124M | 8.73 | 7.12 | 16.85 | 34.89 |
| KoChatBART | 139M | **12.97** | **11.23** | **19.64** | **44.53** |
| KoT5-ETRI | 324M | 12.10 | 10.14 | 16.97 | 40.09 |
- [Small-business dialogue data](https://github.com/2unju/AIHub_Chitchat_dataset_parser)

|Training|Validation|Test|
|:----:|:----:|:----:|
|29,093|1,616|1,616|

| Model | Param | BLEU-3 | BLEU-4 | Dist-1 | Dist-2 |
|------------------------|:----:|:----:|:----:|:----:|:----:|
| KoBART | 124M | 10.04 | 7.24 | 13.76 | 42.09 |
| KoChatBART | 139M | **10.11** | **7.26** | **15.12** | **46.08** |
| KoT5-ETRI | 324M | 9.45 | 6.66 | 14.50 | 45.46 |
## Contributors

<a href="https://github.com/BM-K/KoChatBART/graphs/contributors">
<img src="https://contrib.rocks/image?repo=BM-K/KoChatBART" />
</a>
## Reference

- [KoBART](https://github.com/SKT-AI/KoBART)