tim-lawson/token-weights_gpt2_HuggingFaceFW_fineweb_sample-10BT_gpt2_KLdiv_gpt2-large_gpt2-medium • Text Generation • 0.1B • Updated Jul 22, 2025 • 1
tim-lawson/token-weights_gpt2_HuggingFaceFW_fineweb_sample-10BT_gpt2_KLdiv_gpt2-xl_gpt2 • Text Generation • 0.1B • Updated Jul 22, 2025 • 1
tim-lawson/token-weights_gpt2_HuggingFaceFW_fineweb_sample-10BT_gpt2_KLdiv_gpt2-large_gpt2 • Text Generation • 0.1B • Updated Jul 22, 2025 • 1
tim-lawson/token-weights_gpt2_HuggingFaceFW_fineweb_sample-10BT_gpt2_KLdiv_gpt2-medium_gpt2 • Text Generation • 0.1B • Updated Jul 22, 2025 • 1
tim-lawson/notoken-weights_gpt2_HuggingFaceFW_fineweb_sample-10BT_gpt2_KLdiv_gpt2-medium_gpt2 • Text Generation • 0.1B • Updated Jul 22, 2025 • 1
tim-lawson/skip-middle-fineweb-gated-target-end-0.4 • Text Generation • 0.2B • Updated Jun 27, 2025 • 8
tim-lawson/skip-middle-fineweb-gated-target-end-0.1 • Text Generation • 0.2B • Updated Jun 27, 2025 • 3
tim-lawson/skip-middle-fineweb-nocontrol-2-layers • Text Generation • 96.1M • Updated Jun 27, 2025 • 13
tim-lawson/skip-middle-fineweb-gated-target-end-0.8 • Text Generation • 0.2B • Updated Jun 27, 2025 • 5
tim-lawson/skip-middle-fineweb-gated-target-end-0.9 • Text Generation • 0.2B • Updated Jun 27, 2025 • 5
tim-lawson/skip-middle-fineweb-gated-target-end-0.3 • Text Generation • 0.2B • Updated Jun 27, 2025 • 12