ModalityDance/IVTLR_QWEN_M3COT


FYYDCC committed 16 days ago

Commit 0331d97 · verified · 1 Parent(s): 1e53b01

Update README.md

Files changed (1)

README.md +7 -4


@@ -10,7 +10,7 @@ Interleaved Vision-Text Latent Reasoning (IVT-LR) is the first VLM framework tha
 ## Usage
-This repository provides pretrained Qwen2-VL models for IVT-LR.
 To see detailed usage, including inference code and scripts for training, please refer to the [GitHub repository](https://github.com/ModalityDance/IVT-LR).
@@ -20,10 +20,13 @@ To see detailed usage, including inference code and scripts for training, please
 You can download the models directly from Hugging Face using `huggingface_hub`:
 from huggingface_hub import hf_hub_download
-# Qwen2-VL on M3CoT
 qwen_m3cot_path = hf_hub_download("ModalityDance/IVTLR_QWEN_M3COT", "model.pth")
-# Qwen2-VL on ScienceQA
-qwen_sqa_path = hf_hub_download("ModalityDance/IVTLR_QWEN_SQA", "model.pth")

 ## Usage
+This repository provides pretrained Qwen2-VL models for IVT-LR on **M3CoT** and **ScienceQA** datasets.
 To see detailed usage, including inference code and scripts for training, please refer to the [GitHub repository](https://github.com/ModalityDance/IVT-LR).
 You can download the models directly from Hugging Face using `huggingface_hub`:
+```python
 from huggingface_hub import hf_hub_download
+# Download Qwen2-VL model trained on M3CoT
 qwen_m3cot_path = hf_hub_download("ModalityDance/IVTLR_QWEN_M3COT", "model.pth")
+# Download Qwen2-VL model trained on ScienceQA
+qwen_sqa_path = hf_hub_download("ModalityDance/IVTLR_QWEN_SQA", "model.pth")
+```
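
The two downloads in the updated README differ only in the repository ID, so they could be wrapped in one small helper. The following is a minimal sketch, not part of the commit: `IVTLR_QWEN_REPOS`, its `"m3cot"` / `"scienceqa"` keys, and `download_checkpoint` are hypothetical names introduced here for illustration. The repository IDs and the `model.pth` filename are taken from the README, and the only library call used is `hf_hub_download(repo_id, filename)`.

```python
from typing import Callable, Optional

# Repository IDs as listed in the README; the dictionary keys are
# hypothetical labels chosen for this sketch.
IVTLR_QWEN_REPOS = {
    "m3cot": "ModalityDance/IVTLR_QWEN_M3COT",
    "scienceqa": "ModalityDance/IVTLR_QWEN_SQA",
}


def download_checkpoint(
    dataset: str,
    downloader: Optional[Callable[[str, str], str]] = None,
) -> str:
    """Return the local path of model.pth for the given dataset label.

    `downloader` is a hypothetical injection point (repo_id, filename) -> path,
    useful for offline testing; by default hf_hub_download is used.
    """
    key = dataset.lower()
    if key not in IVTLR_QWEN_REPOS:
        raise ValueError(
            f"Unknown dataset {dataset!r}; expected one of {sorted(IVTLR_QWEN_REPOS)}"
        )
    if downloader is None:
        # Imported lazily so the mapping works without huggingface_hub installed.
        from huggingface_hub import hf_hub_download

        downloader = hf_hub_download
    return downloader(IVTLR_QWEN_REPOS[key], "model.pth")
```

Calling `download_checkpoint("scienceqa")` with the default downloader needs network access and the `huggingface_hub` package. How the resulting `model.pth` is loaded is covered by the GitHub repository linked in the README and is not assumed here.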