For use with frontends that support GGML-quantized GPT-NeoX models, such as KoboldCpp and Oobabooga (with the CTransformers loader).
Last updated on 2023-05-25.
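For loading one of these files directly from Python instead of through a frontend, the following is a minimal sketch using the `ctransformers` library (the same backend as the CTransformers loader). It assumes `ctransformers` is installed and that its `gpt_neox` model type accepts these GGMLv3 files; the helper name `load_pythia`, the local file path, and the prompt are illustrative only and not part of this repository.

```python
import os


def load_pythia(model_path: str):
    """Load a GGML-quantized Pythia (GPT-NeoX) file with ctransformers.

    `load_pythia` is a hypothetical helper written for this example.
    """
    # Fail early with a clear error if the file was not downloaded yet.
    if not os.path.isfile(model_path):
        raise FileNotFoundError(f"Model file not found: {model_path}")

    # Imported lazily so the path check works without the library installed.
    from ctransformers import AutoModelForCausalLM

    # Pythia uses the GPT-NeoX architecture, hence model_type="gpt_neox".
    return AutoModelForCausalLM.from_pretrained(model_path, model_type="gpt_neox")


if __name__ == "__main__":
    # Assumed local path; point this at whichever file you downloaded.
    llm = load_pythia("ggmlv3-pythia-70m-deduped-q4_0.bin")
    print(llm("The quick brown fox", max_new_tokens=32))
```

The model type is passed explicitly because a bare `.bin` file carries no configuration that would let the library pick the architecture on its own.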
For other versions of the models, see here:
Description:
| Model | RAM usage |
|---|---|
| Unloaded | 41.3 MiB |
| ggmlv3-pythia-70m-deduped-q4_0.bin | 95.5 MiB |
| ggmlv3-pythia-160m-deduped-q4_0.bin | 201.1 MiB |
| ggmlv3-pythia-410m-deduped-q4_0.bin | 415.1 MiB |
| ggmlv3-pythia-1b-deduped-q4_0.bin | 762.2 MiB |
| ggmlv3-pythia-1.4b-deduped-q4_0.bin | 1.0 GiB |
| ggmlv3-pythia-2.8b-deduped-q4_0.bin | 1.9 GiB |
| ggmlv3-pythia-70m-deduped-q5_1.bin | 108.7 MiB |
| ggmlv3-pythia-160m-deduped-q5_1.bin | 226.9 MiB |
| ggmlv3-pythia-410m-deduped-q5_1.bin | 494.0 MiB |
| ggmlv3-pythia-1b-deduped-q5_1.bin | 943.9 MiB |
| ggmlv3-pythia-1.4b-deduped-q5_1.bin | 1.3 GiB |
| ggmlv3-pythia-2.8b-deduped-q5_1.bin | 2.3 GiB |
Tested on KoboldCpp with OpenBLAS enabled.
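
As a rough aid for choosing a file, the sketch below copies the figures from the table and returns the entry with the highest RAM usage that still fits a given budget. It assumes 1 GiB = 1024 MiB for the rows listed in GiB (which are rounded in the table, so treat them as approximate), and it uses RAM usage only as a proxy for model size and quantization precision. The function name `largest_fitting` and the `headroom_mib` parameter are illustrative, and real usage will vary with context length, backend, and settings.

```python
from typing import Optional

# RAM usage in MiB, copied from the table above.
# GiB rows are converted at 1 GiB = 1024 MiB and are approximate.
RAM_MIB = {
    "ggmlv3-pythia-70m-deduped-q4_0.bin": 95.5,
    "ggmlv3-pythia-160m-deduped-q4_0.bin": 201.1,
    "ggmlv3-pythia-410m-deduped-q4_0.bin": 415.1,
    "ggmlv3-pythia-1b-deduped-q4_0.bin": 762.2,
    "ggmlv3-pythia-1.4b-deduped-q4_0.bin": 1.0 * 1024,
    "ggmlv3-pythia-2.8b-deduped-q4_0.bin": 1.9 * 1024,
    "ggmlv3-pythia-70m-deduped-q5_1.bin": 108.7,
    "ggmlv3-pythia-160m-deduped-q5_1.bin": 226.9,
    "ggmlv3-pythia-410m-deduped-q5_1.bin": 494.0,
    "ggmlv3-pythia-1b-deduped-q5_1.bin": 943.9,
    "ggmlv3-pythia-1.4b-deduped-q5_1.bin": 1.3 * 1024,
    "ggmlv3-pythia-2.8b-deduped-q5_1.bin": 2.3 * 1024,
}


def largest_fitting(budget_mib: float, headroom_mib: float = 0.0) -> Optional[str]:
    """Return the file with the highest listed RAM usage that fits the budget.

    `headroom_mib` reserves extra memory for everything else on the machine.
    Returns None when no file fits.
    """
    fits = {
        name: ram
        for name, ram in RAM_MIB.items()
        if ram + headroom_mib <= budget_mib
    }
    if not fits:
        return None
    return max(fits, key=fits.get)


if __name__ == "__main__":
    # Example: pick a file for a 2 GiB budget, keeping 256 MiB free.
    print(largest_fitting(2048, headroom_mib=256))
```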