duongtruongbinh committed on
Commit 697ec82 · 1 Parent(s): 0287e70

Update README.md to include details about ViVQA-X LSTM-Generative model, usage instructions, and relevant metadata

Files changed (1)
  1. README.md +53 -0
README.md CHANGED
@@ -1,3 +1,56 @@
  ---
  license: apache-2.0
+ language: vi
+ library_name: pytorch
+ tags:
+ - visual-question-answering
+ - text-generation
+ - image-to-text
+ - vietnamese
+ - vivqa-x
+ datasets:
+ - VLAI-AIVN/ViVQA-X
  ---
+
+ # ViVQA-X: LSTM-Generative Baseline Model
+
+ This repository contains the `LSTM-Generative` baseline model from the research paper **"An Automated Pipeline for Constructing a Vietnamese VQA-NLE Dataset"**. The model is designed for Visual Question Answering with Natural Language Explanations (VQA-NLE) in Vietnamese.
+
+ Given an image and a question in Vietnamese, it predicts an answer and generates a sentence explaining its reasoning.
+
+ **Main Project Repository:** [duongtruongbinh/ViVQA-X](https://github.com/duongtruongbinh/ViVQA-X)
+
+ ---
+
+ ## 🚀 How to Use
+
+ This model is a custom PyTorch checkpoint and requires the model class definition from the main project. The intended use is to download the checkpoint and load it using the project's source code.
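
Because the checkpoint is a plain dictionary rather than a self-describing model file, it can help to confirm that it carries the entries the loading code expects before constructing the model. The following is a minimal sketch under stated assumptions: the key names `word2idx`, `config`, and `model_state_dict` are the ones used in this README's loading example, while the helper `missing_checkpoint_keys` and the small stand-in dictionary are hypothetical and exist only for illustration. The real `best_model.pth` may contain additional or differently named entries.

```python
# Key names assumed from the loading example in this README; the real
# checkpoint may hold additional or differently named entries.
EXPECTED_KEYS = ("word2idx", "config", "model_state_dict")


def missing_checkpoint_keys(checkpoint, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent from a checkpoint dict."""
    return [key for key in expected if key not in checkpoint]


# Hypothetical stand-in for the dict returned by torch.load(...), so the
# sketch runs without downloading anything or importing torch.
# It deliberately omits "model_state_dict" to show a failing check.
stand_in = {
    "word2idx": {"<pad>": 0, "<unk>": 1},
    "config": {"model": {"embed_size": 8}},
}

missing = missing_checkpoint_keys(stand_in)
if missing:
    print(f"Checkpoint is missing expected keys: {missing}")
else:
    vocab_size = len(stand_in["word2idx"])
    embed_size = stand_in["config"]["model"]["embed_size"]
    print(f"vocab_size={vocab_size}, embed_size={embed_size}")
```

With the real file, the same helper would be called on the dictionary returned by `torch.load(checkpoint_path, map_location='cpu')`; an empty result suggests the model constructor arguments can be read from the checkpoint as the README describes.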
+
+ ```python
+ import torch
+ from huggingface_hub import hf_hub_download
+ # You need to have the model definition file, for example: from src.models.baseline_model.vivqax_model import ViVQAX_Model
+
+ # Download checkpoint from Hub
+ checkpoint_path = hf_hub_download(
+     repo_id="VLAI-AIVN/ViVQA-X_LSTM-Generative", # <-- REPLACE WITH YOUR MODEL REPO NAME
+     filename="best_model.pth" # Checkpoint file name
+ )
+
+ # Load checkpoint
+ checkpoint = torch.load(checkpoint_path, map_location='cpu')
+
+ # Initialize model with structure and vocab from checkpoint
+ # (This is an example, you need to adjust it to your code)
+ # model = ViVQAX_Model(
+ #     vocab_size=len(checkpoint['word2idx']),
+ #     embed_size=checkpoint['config']['model']['embed_size'],
+ #     # ... other parameters
+ # )
+
+ # Load trained weights
+ # model.load_state_dict(checkpoint['model_state_dict'])
+ # model.eval()
+
+ print("Checkpoint loaded successfully!")
+ # Once the model is initialized above, you can use it to predict