---
license: apache-2.0
pipeline_tag: text-generation
datasets:
- HuggingFaceFW/fineweb-edu
---
```py
config = {
    "d_model": 1180,
    "n_rwkv_layers": 16,
    "n_attn_layers": 4,
    "n_heads": 10,
    "seq_len": 1024,
    "batch_size": 4,
    "accum_steps": 8,
    "lr": 4e-4,
}
```
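
The config values imply a few derived quantities that are easy to check by hand. The sketch below is illustrative only and rests on assumptions not stated in this card: that `accum_steps` means gradient accumulation steps, that `batch_size` counts sequences per micro-batch, and that `n_heads` splits `d_model` evenly. The helper name `derived_quantities` is hypothetical.

```python
# Illustrative sketch: derive a few quantities from the config above.
# Assumptions (not confirmed by the model card): accum_steps is the number
# of gradient accumulation steps, batch_size is sequences per micro-batch,
# and n_heads divides d_model evenly.
config = {
    "d_model": 1180,
    "n_rwkv_layers": 16,
    "n_attn_layers": 4,
    "n_heads": 10,
    "seq_len": 1024,
    "batch_size": 4,
    "accum_steps": 8,
    "lr": 4e-4,
}


def derived_quantities(cfg):
    """Hypothetical helper: compute sizes implied by the config."""
    if cfg["d_model"] % cfg["n_heads"] != 0:
        raise ValueError("d_model must be divisible by n_heads")
    # Sequences seen per optimizer update under gradient accumulation.
    sequences_per_step = cfg["batch_size"] * cfg["accum_steps"]
    return {
        "head_dim": cfg["d_model"] // cfg["n_heads"],
        "n_layers_total": cfg["n_rwkv_layers"] + cfg["n_attn_layers"],
        "sequences_per_step": sequences_per_step,
        "tokens_per_step": sequences_per_step * cfg["seq_len"],
    }


derived = derived_quantities(config)
for name, value in derived.items():
    print(f"{name}: {value}")
```

Under those assumptions, each head has width 118, the stack has 20 layers in total, and one optimizer update covers 32 sequences, or 32,768 tokens.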