codelion/gpt-2-70m • Text Generation • 64.1M • 609 • 18
A collection of pre-training dataset samples in sizes of 10M, 100M, and 1B tokens. Ideal for quick experimentation and ablations.