---
license: apache-2.0
language:
- fa
library_name: hezar
tags:
- hezar
---
A vision encoder decoder model initialized from `hezarai/roberta-base-fa` and `google/vit-base-patch16-224` weights.

**This model cannot perform image-to-text inference out of the box without finetuning.**