Paper: [Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study](https://arxiv.org/abs/2212.10233)
```
@article{wu2022pretrained,
  doi       = {10.48550/ARXIV.2212.10233},
  url       = {https://arxiv.org/abs/2212.10233},
  author    = {Wu, Di and Ahmad, Wasi Uddin and Chang, Kai-Wei},
  keywords  = {Computation and Language (cs.CL), FOS: Computer and information sciences},
  title     = {Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study},
  publisher = {arXiv},
  year      = {2022},
  copyright = {Creative Commons Attribution 4.0 International}
}
```
Pre-training Corpus: [RealNews](https://github.com/rowanz/grover/tree/master/realnews)
Pre-training Details (a configuration sketch follows the list):
- Resume from bert-base-uncased
- Batch size: 512
- Total steps: 250k
- Learning rate: 1e-4
- LR schedule: linear with 4k warmup steps
- Masking ratio: 15% dynamic masking
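
For concreteness, the recipe above maps onto a standard Hugging Face Transformers masked-language-modeling loop roughly as follows. This is a minimal sketch under stated assumptions, not the authors' released training code: the RealNews data path (`realnews/*.txt`), the 512-token truncation length, and the per-device batch size / gradient-accumulation split are illustrative assumptions.

```python
# Sketch of continued MLM pre-training from bert-base-uncased with the
# hyperparameters listed above (assumptions noted inline).
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")  # resume from BERT

# Assumption: RealNews has been preprocessed into plain-text files, one document per line.
raw = load_dataset("text", data_files={"train": "realnews/*.txt"}, split="train")
dataset = raw.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
    remove_columns=["text"],
)

# 15% dynamic masking: masks are re-sampled each time an example is collated.
collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm=True, mlm_probability=0.15
)

args = TrainingArguments(
    output_dir="bert-realnews-mlm",
    max_steps=250_000,               # total steps: 250k
    learning_rate=1e-4,              # peak learning rate
    lr_scheduler_type="linear",      # linear decay ...
    warmup_steps=4_000,              # ... after 4k warmup steps
    per_device_train_batch_size=64,  # 64 x 8 accumulation = 512 effective batch
    gradient_accumulation_steps=8,   # (assumed split for a single GPU)
    save_steps=10_000,
    logging_steps=100,
)

Trainer(
    model=model, args=args, train_dataset=dataset, data_collator=collator
).train()
```

The effective batch size of 512 comes from `per_device_train_batch_size * gradient_accumulation_steps * num_gpus`; adjust the split to match the available hardware.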