PaDT

PaDT-MLLM

Team members: eehaojiezhang, lans1ng, quantum11

AI & ML interests

MultiModal Large Language Models

PaDT-MLLM's collections (3)

Multi-modal model series based on the Patch-as-Decodable-Token framework.

PaDT-MLLM/PaDT_OVD_3B
Any-to-Any • 4B • Updated Oct 10, 2025 • 2

PaDT-MLLM/PaDT_Pro_3B
Any-to-Any • 4B • Updated Oct 10, 2025 • 10 • 3

PaDT-MLLM/PaDT_Pro_7B
Any-to-Any • 8B • Updated Oct 10, 2025 • 6 • 2

PaDT-MLLM/PaDT_REC_7B
Any-to-Any • 8B • Updated Oct 10, 2025 • 2

Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

Paper • 2510.01954 • Published Oct 2, 2025 • 14

Preprocessed datasets used to train the PaDT framework.

PaDT-MLLM/ReferringImageCaptioning
Viewer • Updated Oct 10, 2025 • 575k • 228 • 3

PaDT-MLLM/COCO
Viewer • Updated Oct 10, 2025 • 123k • 211 • 1

PaDT-MLLM/RefCOCO
Viewer • Updated Oct 10, 2025 • 357k • 529 • 4
