An implementation of the Relaxed Recursive Transformer, uptrained with distillation on OpenWebText2.
Paper: https://arxiv.org/abs/2410.20672
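
The two ideas named above can be illustrated with a small, dependency-free sketch. It is not this repository's code: the class and function names (`RelaxedSharedLinear`, `recursive_forward`, `distillation_loss`) are hypothetical, the block is a toy residual layer rather than a full transformer block, and the loss is a plain temperature-scaled KL divergence chosen for illustration. The sketch assumes that "relaxed" means one set of weights is reused on every loop, with a small per-loop low-rank (LoRA-style) correction, and that distillation means training the student to match a teacher's output distribution.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class RelaxedSharedLinear:
    """One weight matrix shared by all loops, plus a low-rank delta per loop."""

    def __init__(self, dim, rank, num_loops, seed=0):
        rng = random.Random(seed)
        # Shared weights: the same matrix is used on every loop.
        self.W = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
        # Per-loop low-rank factors: delta_W = B @ A, with B initialized to zero
        # so every loop starts out identical to the shared layer.
        self.A = [[[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(rank)]
                  for _ in range(num_loops)]
        self.B = [[[0.0] * rank for _ in range(dim)] for _ in range(num_loops)]

    def forward(self, x, loop):
        base = matvec(self.W, x)                       # shared path
        delta = matvec(self.B[loop], matvec(self.A[loop], x))  # loop-specific path
        return [b + d for b, d in zip(base, delta)]


def recursive_forward(layer, x, num_loops):
    """Apply the same layer num_loops times with a residual connection."""
    for loop in range(num_loops):
        h = layer.forward(x, loop)
        # ReLU plus residual: a stand-in for a real transformer block.
        x = [xi + max(0.0, hi) for xi, hi in zip(x, h)]
    return x


def softmax(logits, temperature=1.0):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp((l - m) / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=1.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q) if pi > 0)


if __name__ == "__main__":
    layer = RelaxedSharedLinear(dim=4, rank=2, num_loops=3)
    out = recursive_forward(layer, [1.0, -2.0, 0.5, 3.0], num_loops=3)
    print(out)
    print(distillation_loss([0.2, 1.0, -0.5], [0.0, 2.0, -1.0], temperature=2.0))
```

Because the `B` factors start at zero, every loop initially behaves exactly like the shared layer; training the low-rank factors is what lets each loop diverge from the tied weights at a small parameter cost.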