github-code

community
Activity Feed

Recent Activity

nick007x updated a dataset about 14 hours ago
github-code/github-archive
nick007x published a dataset 2 days ago
github-code/github-archive
nick007x published a bucket 2 days ago
github-code/github-archive

nick007x posted an update 7 months ago
👋 Hey, I have just uploaded 2 new datasets for code and scientific reasoning models:

1. ArXiv Papers (4.6 TB): a massive scientific corpus with papers and metadata across all domains. Well suited to training models on academic reasoning, literature review, and scientific knowledge mining. 🔗 Link: nick007x/arxiv-papers

2. GitHub Code 2025 (1 TB): a comprehensive code dataset for code generation and analysis tasks. It mostly contains GitHub's high-quality top 1 million repos with more than 2 stars. 🔗 Link: nick007x/github-code-2025