
Model Details

  • Architecture: default GPT-2, decoder-only
  • Number of parameters: ~50M
  • Number of tokens seen: ~1.31B
  • Dataset: USPTO subset of The Pile
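
To put the numbers above in context, here is a rough sketch of how the parameter count of a standard GPT-2 can be computed from its dimensions, and what ~1.31B tokens means relative to ~50M parameters. The model card does not state the layer count, hidden size, vocabulary size, or context length, so the `hypothetical` configuration below (hidden size 512, 8 layers, the stock GPT-2 vocabulary and context length) is only an assumption chosen because it lands near 50M. It is not necessarily the configuration of this model.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Count parameters of a standard GPT-2 with tied input/output embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d), two LayerNorms (4*d).
    per_block = 12 * n_embd**2 + 13 * n_embd
    # Final LayerNorm (weight and bias).
    final_ln = 2 * n_embd
    return embeddings + n_layer * per_block + final_ln


# Sanity check against the published GPT-2 small dimensions.
gpt2_small = gpt2_param_count(50257, 1024, 768, 12)

# ASSUMPTION: illustrative dimensions only; not taken from the model card.
hypothetical = gpt2_param_count(50257, 1024, 512, 8)

# Figures from the model card: ~1.31B tokens seen, ~50M parameters.
tokens_per_param = 1.31e9 / 50e6

print(f"GPT-2 small:         {gpt2_small:,} parameters")
print(f"Hypothetical config: {hypothetical:,} parameters")
print(f"Tokens per parameter: {tokens_per_param:.1f}")
```

With these assumed dimensions the count comes to roughly 51M, and the card's figures work out to about 26 training tokens per parameter.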