Hugging Face
magibu's Collections
Pretrain Datasets
Updated Jan 3
Datasets we use for pretraining large language models
omarkamali/wikipedia-monthly • Updated Mar 14 • 195M • 11.3k • 65
alibayram/hukuk_soru_cevap • Updated Nov 6, 2024 • 2.08k • 67 • 14
umutertugrul/turkish-hospital-medical-articles • Updated Oct 2, 2025 • 24.6k • 25 • 8
umutertugrul/turkish-medical-articles • Updated Oct 2, 2025 • 42.8k • 12 • 3
alibayram/tr-books • Updated Dec 17, 2025 • 3.7k • 5
selimfirat/bilkent-turkish-writings-dataset • Updated Mar 25 • 25.1k • 189 • 16
umutertugrul/turkish-academic-theses-dataset • Updated Aug 18, 2025 • 649k • 42 • 9
alibayram/onedio_haberler • Updated Jun 18, 2024 • 66.7k • 3 • 5
habanoz/news-tr-1.8M • Updated Oct 6, 2024 • 1.85M • 596 • 7
alibayram/hepsiburada_yorumlar • Updated Jun 18, 2024 • 2.66M • 15 • 14
alibayram/kitapyurdu_yorumlar • Updated Jun 18, 2024 • 405k • 6 • 1
alibayram/beyazperde_yorumlar • Updated Jun 18, 2024 • 192k • 8 • 5
BILGEM-AI/BILGE-Synthetic-Stories • Updated Nov 20, 2025 • 2.87M • 718 • 5