helehan/topic-overwrite-llava-7b-lora

@@ -1,35 +1,49 @@
 ---
-license: apache-2.0
 datasets:
 - helehan/topic-overwrite
 language:
 - en
 ---
-# Model Card for Model ID
-[GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) | [Paper](https://arxiv.org/abs/2411.17265)
 ## Model Details
-The model, trained using the RLHF/RLAIF methods proposed in the [TPO paper](https://arxiv.org/abs/2411.17265) by llava, has enhanced trustworthiness and reduced hallucinations.
 ## Model Description
-- **Trained from model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
-- **Lora Config:** [llava-v1.5-7B-lora](https://huggingface.co/liuhaotian/llava-v1.5-7b-lora)
-- **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
 ## Usage
-Please look at [GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) for more details about usage.
 ## Citation
 ```bibtex
 @article{he2024topic,
   title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},
-  author={He, Lehan and Chen, Zeren and Shi, Zhelun and Yu, Tianyu and Shao, Jing and Sheng, Lu},
   journal={arXiv preprint arXiv:2411.17265},
   year={2024}
 }
 ```

 ---
 datasets:
 - helehan/topic-overwrite
 language:
 - en
+license: apache-2.0
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
+# Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
+[Project Page](https://tpr-dpo.github.io) | [GitHub](https://github.com/tpr-dpo/tpr-dpo) | [Paper](https://arxiv.org/abs/2411.17265)
 ## Model Details
+This model is a Vision Language Model (VLM) specifically designed to mitigate hallucinations. It is trained using the Topic-level Preference Overwriting (TPO) approach, an RLHF/RLAIF method that systematically optimizes reward gaps in preference pairs during data curation. TPO achieves topic-level control over fine-grained semantic details by selectively replacing semantic topics in VLM responses with resampled candidates, leading to enhanced trustworthiness and reduced hallucinations.
 ## Model Description
+-   **Trained from base model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
+-   **LoRA Config:** [llava-v1.5-7B-lora](https://huggingface.co/liuhaotian/llava-v1.5-7b-lora)
+-   **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
 ## Usage
+Here's a simple example demonstrating how to use the TPO model for inference:
+```python
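+# `chat.py` comes from the GitHub repository and must be on the import path.
+from chat import TPOChat
+# Load the TPO checkpoint.
+chat_model = TPOChat('helehan/topic-overwrite-llava-7b-full')
+# Pair a local image with a question about it.
+image_path = "Your_Image_Path.jpg"
+question = "Describe in detail the people in the picture."
+inputs = {"image": image_path, "question": question}
+# Generate and print the model's answer.
+answer = chat_model.chat(inputs)
+print(answer)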
+```
+You can also run inference by executing the `chat.py` script from the GitHub repository.
 ## Citation
 ```bibtex
 @article{he2024topic,
   title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},
+  author={He, Lehan and Chen, Zeren and Shi, Zhelun and Yu, Tianyu and Shao, Jing and Sheng, Lu},
   journal={arXiv preprint arXiv:2411.17265},
   year={2024}
 }
 ```
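
The topic-level overwriting described under Model Details can be pictured with a toy sketch. It is not the repository's implementation: `Candidate`, `overwrite_topics`, and `build_preference_pair` are hypothetical names, the topic labels and scores are made up, and the sketch assumes that each resampled candidate already carries a trustworthiness score. Taking the best-scoring candidate per topic for the chosen response and the worst-scoring one for the rejected response is one simple way to widen the reward gap of a preference pair.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str     # one phrasing of a topic, taken from a resampled response
    score: float  # hypothetical trustworthiness score, higher is better


def overwrite_topics(response_topics, candidates, pick):
    """Replace each topic's text with the candidate selected by `pick` (max or min)."""
    rewritten = []
    for topic, original_text in response_topics:
        options = candidates.get(topic, [])
        if options:
            rewritten.append(pick(options, key=lambda c: c.score).text)
        else:
            # No resampled alternative for this topic: keep the original text.
            rewritten.append(original_text)
    return " ".join(rewritten)


def build_preference_pair(response_topics, candidates):
    """Return (chosen, rejected) built from the best and worst candidates per topic."""
    chosen = overwrite_topics(response_topics, candidates, max)
    rejected = overwrite_topics(response_topics, candidates, min)
    return chosen, rejected


# A response already split into (topic, text) pairs; the split itself is assumed.
response = [
    ("animal", "A dog sits on the grass."),
    ("object", "A red ball lies nearby."),
]
candidates = {
    "animal": [
        Candidate("A dog sits on the grass.", 0.9),
        Candidate("A cat sits on the grass.", 0.1),
    ],
    "object": [
        Candidate("A red ball lies nearby.", 0.8),
        Candidate("A blue frisbee lies nearby.", 0.2),
    ],
}

chosen, rejected = build_preference_pair(response, candidates)
print("chosen:  ", chosen)
print("rejected:", rejected)
```

Because both responses share the same topic structure and differ only in the overwritten details, the pair isolates the hallucinated content rather than differences in style or length.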