sbintuitions/sarashina2.2-ocr
Image-to-Text
Transformers
Safetensors
Japanese
English
sarashina2_vision
text-generation
multimodal
ocr
document-understanding
vision-language
custom_code
arxiv:2503.09208
License: mit
main
sarashina2.2-ocr
7.81 GB
2 contributors
History: 6 commits
Latest commit: tkmtakada-sbint, "Update README.md" (eafb8d4, verified), 2 days ago
File                                Scan  Size       Storage  Last commit       Updated
assets                              -     -          -        Initial commit    8 days ago
.gitattributes                      Safe  2.46 kB    -        Initial commit    8 days ago
LICENSE                             Safe  1.07 kB    -        Initial commit    8 days ago
README.md                           Safe  10.8 kB    -        Update README.md  2 days ago
chat_template.jinja                 Safe  1.72 kB    -        Initial commit    8 days ago
config.json                         Safe  1.84 kB    -        Initial commit    8 days ago
configuration_sarashina2_vision.py  Safe  3.5 kB     -        Initial commit    8 days ago
generation_config.json              Safe  154 Bytes  -        Initial commit    8 days ago
model.safetensors                   Safe  7.8 GB     xet      Initial commit    8 days ago
modeling_sarashina2_vision.py       Safe  36.8 kB    -        Initial commit    8 days ago
preprocessor_config.json            Safe  646 Bytes  -        Initial commit    8 days ago
processing_sarashina2_vision.py     Safe  32.2 kB    -        Initial commit    8 days ago
processor_config.json               Safe  152 Bytes  -        Initial commit    8 days ago
special_tokens_map.json             Safe  968 Bytes  -        Initial commit    8 days ago
tokenizer.json                      Safe  6.72 MB    -        Initial commit    8 days ago
tokenizer.model                     Safe  1.83 MB    xet      Initial commit    8 days ago
tokenizer_config.json               Safe  3.93 kB    -        Initial commit    8 days ago
video_preprocessor_config.json      Safe  1.11 kB    -        Initial commit    8 days ago