mlpc-lab/BLIVA_Vicuna

Visual Question Answering


Update model card for ESTR-CoT

#2

by nielsr HF Staff - opened Jul 7, 2025

base: refs/heads/main ← from: refs/pr/2


This pull request rewrites the model card for this repository so that it describes the ESTR-CoT model, as presented in the paper "ESTR-CoT: Towards Explainable and Accurate Event Stream based Scene Text Recognition with Chain-of-Thought Reasoning".

The update includes:

Replacing all outdated information related to the previous BLIVA model with details about ESTR-CoT.
Setting the pipeline_tag to image-text-to-text to better categorize the model on the Hub, making it discoverable for visual-to-text tasks.
Confirming the library_name as transformers based on the model's components (Llama tokenizer, Vicuna-7B).
Adding license: cc-by-nc-4.0, a common non-commercial license, as a placeholder until an official license is released with the code.
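The three metadata fields listed above live in the YAML front matter at the top of a Hub model card (`README.md`). The diff itself is not shown in this discussion, so the following is only a minimal sketch of what that block might look like, using the values named in the list. The helper `render_front_matter` is hypothetical and written for illustration only; it is not part of any Hugging Face library.

```python
# Metadata values taken from the PR description above.
# Assumption: the model card stores them as plain key/value pairs
# in a YAML front-matter block at the top of README.md.
metadata = {
    "pipeline_tag": "image-text-to-text",
    "library_name": "transformers",
    "license": "cc-by-nc-4.0",  # described in the PR as a placeholder
}


def render_front_matter(fields: dict) -> str:
    """Render simple string key/value pairs as a YAML front-matter block.

    Hypothetical helper: it handles only flat string values, which is
    enough for the three fields discussed in this pull request.
    """
    lines = ["---"]
    lines += [f"{key}: {value}" for key, value in fields.items()]
    lines.append("---")
    return "\n".join(lines)


front_matter = render_front_matter(metadata)
print(front_matter)
```

If the official code release ships under a different license, only the `license` value would need to change.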

Update model card for ESTR-CoT (d992dba0)


Ready to merge

This branch is ready to get merged automatically.
