Hugging Face
104.8 TFLOPS
Jeffrey Quesnelle
PRO
emozilla
4,587 followers · 12 following
https://jeffq.com
theemozilla
jquesnelle
AI & ML interests
None yet
Recent Activity
authored a paper about 1 month ago: Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
upvoted a paper about 1 month ago: Decoupling the Benefits of Subword Tokenization for Language Model Training via Byte-level Simulation
submitted a paper about 1 month ago: Targeted Neuron Modulation via Contrastive Pair Search
Organizations
emozilla's datasets (53), sorted by recently updated
emozilla/Hermes-3-Preprocessed-Llama3-2samples • Viewer • Updated Jul 23, 2025 • 2 • 6
emozilla/Hermes-3-Preprocessed-Llama3-100samples • Viewer • Updated Jul 23, 2025 • 100 • 4 • 1
emozilla/Hermes-3-Preprocessed-Llama3 • Viewer • Updated Jul 23, 2025 • 91.1k • 7 • 1
emozilla/dolma-v1_7-30B-tokenized-llama2-nanoset • Updated Jul 9, 2024 • 44 • 1
emozilla/fineweb-10bt-tokenized-datatrove-llama2 • Updated Jul 8, 2024 • 94 • 3
emozilla/fineweb-350bt-tokenized-datatrove-llama2 • Updated Jul 7, 2024 • 86
emozilla/dolma-v1_7-305B-tokenized-llama2-nanoset • Updated Jun 5, 2024 • 128
emozilla/proofpile-test-tokenized-llama3 • Viewer • Updated Jun 5, 2024 • 46.3k • 47
emozilla/PaulGrahamEssays • Viewer • Updated Jun 1, 2024 • 49 • 14
emozilla/dolma-v1_7-cc_en_head • Viewer • Updated May 30, 2024 • 475M • 339 • 1
emozilla/dolma-v1_7-c4 • Viewer • Updated May 29, 2024 • 250M • 52 • 2
emozilla/dolma-v1_7-305B-tokenized-llama3-nanoset • Updated May 29, 2024 • 197 • 1
emozilla/dolma-v1_7-books • Viewer • Updated May 29, 2024 • 56k • 34 • 2
emozilla/dolma-v1_7-arxiv • Viewer • Updated May 29, 2024 • 1.55M • 275 • 3
emozilla/dolma-v1_7-algebraic-stack-train • Viewer • Updated May 29, 2024 • 2.83M • 77 • 1
emozilla/dolma-v1_7-30B • Viewer • Updated May 23, 2024 • 34.5M • 152 • 1
emozilla/dolma-v1_7-3B • Viewer • Updated May 23, 2024 • 3.4M • 360 • 1
emozilla/dolma-v1_7-3B-tokenized-llama3-nanoset • Updated May 23, 2024 • 5 • 1
emozilla/dolma-v1_7-30B-tokenized-llama3-nanoset • Updated May 20, 2024 • 108 • 1
emozilla/dolma-v1_7-305B • Viewer • Updated May 13, 2024 • 343M • 478 • 11
emozilla/c4-validation.00000-of-00008 • Viewer • Updated Apr 11, 2024 • 45.6k • 8
emozilla/hermes2-tokenized-llama-alpaca • Viewer • Updated Mar 13, 2024 • 1M • 65
emozilla/yarn-train-tokenized-8k-mistral • Viewer • Updated Jan 6, 2024 • 417k • 208 • 2
emozilla/story-summary-training-mistral-9k-1_4_24 • Viewer • Updated Jan 4, 2024 • 751 • 25 • 4
emozilla/yarn-train-tokenized-8k-llama • Viewer • Updated Nov 16, 2023 • 213k • 744 • 1
emozilla/yarn-train-tokenized-32k-mistral • Viewer • Updated Oct 21, 2023 • 104k • 84 • 3
emozilla/yarn-train-tokenized-16k-mistral • Viewer • Updated Oct 11, 2023 • 208k • 668 • 14
emozilla/pg19 • Viewer • Updated Oct 9, 2023 • 13.8k • 12.3k • 18
emozilla/Long-Data-Collections-Fine-Tune • Viewer • Updated Oct 9, 2023 • 98.6k • 446 • 4
emozilla/Long-Data-Collections-Pretrain-Without-Books • Viewer • Updated Oct 9, 2023 • 9.38M • 960 • 2