---
title: README
emoji: ๐
colorFrom: indigo
colorTo: purple
sdk: static
pinned: true
short_description: Explore Common Crawl's metadata and experimental datasets
---

# Common Crawl

Welcome to the Common Crawl Foundation's Hugging Face page!

We aim to provide metadata and experimental versions of our latest data products here.

### Useful Links

- [Common Crawl's official website](https://commoncrawl.org/)
- [Our existing statistics webpages](https://commoncrawl.github.io/cc-crawl-statistics/) ([GitHub repo](https://github.com/commoncrawl/cc-crawl-statistics))
- [AWS infrastructure status page](https://status.commoncrawl.org/)

### Datasets

Explore our datasets hosted on Hugging Face:

- [Common Crawl Citations](https://huggingface.co/datasets/commoncrawl/citations)
- [Common Crawl Citations, Annotated](https://huggingface.co/datasets/commoncrawl/citations-annotated)
- [Common Crawl Statistics](https://huggingface.co/datasets/commoncrawl/statistics)
- [EOT 2024 Host-Level Logs](https://huggingface.co/datasets/commoncrawl/eot2024_hostlevel_logs) (only available to EOT collaborators)

We look forward to supporting the research and development community with these resources.