ByteDance/Dolphin
Tags: Image-Text-to-Text, Transformers, Safetensors, custom, Chinese, English, vision-encoder-decoder, image-to-text, document-parsing, document-understanding, document-intelligence, ocr, layout-analysis, table-extraction, multimodal, vision-language-model
License: mit
Repository: Dolphin (807 MB), 2 contributors
History: 4 commits
Latest commit: 4e22f46 (verified), "Update README.md" by HaoFeng2025, 8 months ago
File                       Size            Last commit        Updated
.gitattributes             1.52 kB         Initial commit     8 months ago
README.md                  3.54 kB         Update README.md   8 months ago
config.json                4.77 kB         Initial commit     8 months ago
generation_config.json     160 Bytes       Initial commit     8 months ago
model.safetensors          796 MB (xet)    Initial commit     8 months ago
preprocessor_config.json   803 Bytes       Initial commit     8 months ago
special_tokens_map.json    277 Bytes       Initial commit     8 months ago
tokenizer.json             6.42 MB         Initial commit     8 months ago
tokenizer_config.json      4.05 MB         Initial commit     8 months ago