Hugging Face
Catherine Arnett
catherinearnett
105 followers · 34 following
https://catherinearnett.github.io/
linguist_cat
catherinearnett
catherinearnett.bsky.social
AI & ML interests
multilingual NLP, tokenization
Recent Activity
liked a dataset 12 days ago: aaparajit02/punjabi-asr
liked a dataset 12 days ago: aznlp/azerbaijani-blogs
liked a dataset 12 days ago: MWirelabs/assamese-monolingual-corpus
catherinearnett's datasets (3)
catherinearnett/montok • Updated Sep 19, 2025 • 15.4k • 3
catherinearnett/morphscore • Updated Jul 10, 2025 • 5.09M • 443 • 3
catherinearnett/monolingual-tokenizer-data • Updated May 15, 2025 • 139M • 76 • 1