Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)

Example usage

First download and preprocess the data following the main language modeling README.

Then to train a convolutional LM using the fconv_lm_dauphin_wikitext103 architecture:

fairseq-train --task language_modeling \
    data-bin/wikitext-103 \
    --save-dir checkpoints/fconv_wikitext-103 \
    --arch fconv_lm_dauphin_wikitext103 \
    --adaptive-softmax-cutoff 10000,20000,200000 \
    --dropout 0.2 \
    --criterion adaptive_loss \
    --optimizer nag --clip-norm 0.1 --weight-decay 5e-06 \
    --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 \
    --max-tokens 1024 --tokens-per-sample 1024 \
    --ddp-backend no_c10d \
    --max-epoch 35

And evaluate with:

fairseq-eval-lm data-bin/wikitext-103 --path checkpoints/fconv_wikitext-103/checkpoint_best.pt
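
The evaluation command has to read from the same directory that training passed to --save-dir. One way to keep the two in sync is sketched below, assuming a POSIX shell. The variable names SAVE_DIR and DATA_DIR and the helper functions best_checkpoint and eval_best are hypothetical and not part of fairseq; only the fairseq-eval-lm invocation itself comes from the commands above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of fairseq): keep the checkpoint directory in
# one variable so that --save-dir (training) and --path (evaluation) agree.
SAVE_DIR="${SAVE_DIR:-checkpoints/fconv_wikitext-103}"
DATA_DIR="${DATA_DIR:-data-bin/wikitext-103}"

# Print the path of the best checkpoint expected under SAVE_DIR.
best_checkpoint() {
    printf '%s/checkpoint_best.pt\n' "$SAVE_DIR"
}

# Run evaluation only if the checkpoint file exists; otherwise report and fail.
eval_best() {
    ckpt="$(best_checkpoint)"
    if [ ! -f "$ckpt" ]; then
        echo "checkpoint not found: $ckpt" >&2
        return 1
    fi
    fairseq-eval-lm "$DATA_DIR" --path "$ckpt"
}
```

With these definitions sourced into the shell, pass "$SAVE_DIR" to --save-dir when training and call eval_best once training has written checkpoint_best.pt.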

Citation

@inproceedings{dauphin2017language,
  title={Language Modeling with Gated Convolutional Networks},
  author={Dauphin, Yann N and Fan, Angela and Auli, Michael and Grangier, David},
  booktitle={Proceedings of the 34th International Conference on Machine Learning-Volume 70},
  pages={933--941},
  year={2017},
  organization={JMLR}
}