---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: image-text-to-text
library_name: transformers
---
<h3>PyPE: Advancing General Multimodal Capability of Vision-language Models with Pyramid-descent Visual Position Encoding</h3>
For more details, please refer to the GitHub repository: [PyPE](https://github.com/SakuraTroyChen/PyPE).