
An implementation of the Relaxed Recursive Transformer, uptrained with distillation on OpenWebText2. Paper: https://arxiv.org/abs/2410.20672
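As a rough illustration of what "uptraining with distillation" can mean, the sketch below computes a temperature-scaled KL divergence between a teacher's and a student's next-token distributions. It is only a minimal sketch under assumptions: the function names `log_softmax` and `distillation_loss`, the `temperature` parameter, and the choice of forward KL are all hypothetical and are not taken from this repository, which may use a different objective.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_sum_exp for x in scaled]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """Forward KL(teacher || student) for a single token position.

    Assumption: forward KL on softened distributions is used here for
    illustration only; the repository's actual loss is not documented.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


if __name__ == "__main__":
    teacher = [2.0, 1.0, 0.1]
    student = [1.5, 1.2, 0.3]
    print(distillation_loss(student, teacher, temperature=2.0))
```

In a training loop, a loss of this kind would typically be averaged over all token positions in a batch and minimized with respect to the student's parameters while the teacher stays frozen.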