metadata
tags:
- Distillation
After fine-tuning, the student model outperforms the 410m-deduped model on WSC accuracy.