File size: 1,913 Bytes
9537200 |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
# Dataset Sources & References
This document tracks official and related visual reasoning datasets.
## Primary Dataset
### Zebra-CoT
- **Source**: [Hugging Face](https://huggingface.co/datasets/multimodal-reasoning-lab/Zebra-CoT)
- **Paper**: [arXiv:2507.16746](https://arxiv.org/abs/2507.16746)
- **Size**: 182,384 samples (58.9 GB)
- **License**: CC BY-NC 4.0
- **Modalities**: Image, Text
- **Use Case**: Training multimodal models for Visual Chain of Thought reasoning
## Related Datasets
### Visual-CoT (NeurIPS'24 Spotlight)
- **Source**: [Hugging Face](https://huggingface.co/datasets/deepcs233/Visual-CoT)
- **GitHub**: [deepcs233/Visual-CoT](https://github.com/deepcs233/Visual-CoT)
- **Size**: 438K question-answer pairs
- **Description**: Multi-turn processing pipeline for MLLMs with intermediate bounding boxes
### LLaVA-CoT-100k (ICCV 2025)
- **Source**: [Hugging Face](https://huggingface.co/datasets/Xkev/LLaVA-CoT-100k)
- **GitHub**: [PKU-YuanGroup/LLaVA-CoT](https://github.com/PKU-YuanGroup/LLaVA-CoT)
- **Description**: Visual language model with spontaneous systematic reasoning
### MM-CoT (Amazon Science)
- **GitHub**: [amazon-science/mm-cot](https://github.com/amazon-science/mm-cot)
- **Description**: Multimodal CoT with decoupled training framework for rationale generation
### MME-CoT Benchmark
- **GitHub**: [CaraJ7/MME-CoT](https://github.com/CaraJ7/MME-CoT)
- **Description**: Benchmark for evaluating CoT reasoning across math, science, OCR, logic, space-time
## Pre-trained Models
| Model | Base | Link |
|-------|------|------|
| Anole-Zebra-CoT | Anole-7B | [HuggingFace](https://huggingface.co/multimodal-reasoning-lab/Anole-Zebra-CoT) |
| Bagel-Zebra-CoT | Bagel-7B | [HuggingFace](https://huggingface.co/multimodal-reasoning-lab/Bagel-Zebra-CoT) |
## Adoption Status
- [x] Zebra-CoT integrated
- [ ] Visual-CoT samples pending
- [ ] Custom samples in development
|