helehan/topic-overwrite-llava-7b-full

@@ -1,14 +1,44 @@
 ---
-license: apache-2.0
 datasets:
 - helehan/topic-overwrite
 language:
 - en
 ---
-# Model Card for Model ID
-[GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) | [Paper](https://arxiv.org/abs/2411.17265)
 ## Model Details
@@ -16,15 +46,56 @@ The model, trained using the RLHF/RLAIF methods proposed in the [TPO paper](http
 ## Model Description
-- **Trained from model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
-- **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
 ## Usage
-Please look at [GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) for more details about usage.
 ## Citation
 ```bibtex
 @article{he2024topic,
   title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},

 ---
 datasets:
 - helehan/topic-overwrite
 language:
 - en
+license: apache-2.0
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
+# TPO: A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs
+This repository contains the **TPO-LLaVA-7B-Full** model, trained with Topic-level Preference Overwriting (TPO). TPO is a framework for systematically optimizing reward gap configuration to mitigate hallucinations in Vision Language Models (VLMs), as presented in the paper:
+[**Systematic Reward Gap Optimization for Mitigating VLM Hallucinations**](https://arxiv.org/abs/2411.17265)
+[Project Page](https://tpr-dpo.github.io) | [GitHub Repository](https://github.com/tpr-dpo/tpr-dpo) | [Hugging Face Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
+<div align="center" style="font-size: 15pt">
+<a href='https://arxiv.org/abs/2411.17265'><img src='https://img.shields.io/badge/Paper-PDF-purple'></a>
+<a href='https://huggingface.co/datasets/helehan/topic-overwrite'><img src='https://img.shields.io/badge/Dataset-HF-Green'></a>
+<a href='https://huggingface.co/helehan/topic-overwrite-llava-7b-full'><img src='https://img.shields.io/badge/Model-7B-orange'></a>
+<a href='https://huggingface.co/helehan/topic-overwrite-llava-7b-lora'><img src='https://img.shields.io/badge/Model-Lora-orange'></a>
+</div>
+## 🎉 News
+- [2024.12.08] We open-sourced the code, weights ([7B](https://huggingface.co/helehan/topic-overwrite-llava-7b-full), [LoRA](https://huggingface.co/helehan/topic-overwrite-llava-7b-lora)), and [data](https://huggingface.co/datasets/helehan/topic-overwrite) of TPO!
+- [2024.11.26] Our paper is now accessible on [arXiv](https://arxiv.org/abs/2411.17265)!
+## 📜 Overview
+We propose Topic-level Preference Overwriting (TPO), a topic-level self-correctional paradigm for reducing hallucinations. TPO adopts a deconfounded algorithm that replaces every topic in a complex response with the best or worst alternative among several candidates resampled from the reference model itself on the same topic.
+<table align="center">
+    <p align="center">
+      <img src="https://github.com/tpr-dpo/tpr-dpo/raw/main/examples/intro1.png" width="95%" alt="intro1" />
+    </p>
+</table>
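The overwriting step described in the overview can be illustrated with a toy sketch. This is not the repository's implementation: the function name `overwrite_topics`, the `(topic, text)` segment format, the candidate dictionary, and the scoring function are all hypothetical stand-ins, and the fallback to the original segment when no candidates exist is an assumption made for the example.

```python
from typing import Callable, Dict, List, Tuple


def overwrite_topics(
    segments: List[Tuple[str, str]],      # hypothetical: (topic, text) pairs of one response
    candidates: Dict[str, List[str]],     # hypothetical: topic -> alternatives resampled for it
    score: Callable[[str], float],        # hypothetical: higher means less hallucinated
) -> Tuple[str, str]:
    """Build a (preferred, rejected) response pair by per-topic substitution."""
    preferred, rejected = [], []
    for topic, text in segments:
        # Assumption: keep the original segment if nothing was resampled for this topic.
        pool = candidates.get(topic) or [text]
        preferred.append(max(pool, key=score))  # best alternative on this topic
        rejected.append(min(pool, key=score))   # worst alternative on this topic
    return " ".join(preferred), " ".join(rejected)


# Toy data: scores are hard-coded purely for illustration.
toy_scores = {
    "A brown dog sits.": 0.9,
    "A cat sits.": 0.1,
    "A blue ball lies nearby.": 0.5,
}
segments = [("dog", "A dog sits."), ("ball", "A red ball.")]
candidates = {
    "dog": ["A brown dog sits.", "A cat sits."],
    "ball": ["A blue ball lies nearby."],
}
preferred, rejected = overwrite_topics(segments, candidates, toy_scores.__getitem__)
print(preferred)
print(rejected)
```

The two returned strings differ only on topics where the resampled candidates disagree in score, which is the sense in which the resulting preference pair is built at the topic level rather than from two unrelated whole responses.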
 ## Model Details
 ## Model Description
+-   **Trained from model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
+-   **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
 ## Usage
+We provide a simple example to show how to use TPO for inference.
+First, ensure you have the necessary packages installed (refer to the [GitHub repository](https://github.com/tpr-dpo/tpr-dpo) for `requirements.txt`):
+```bash
+conda create -n tpo python=3.10 -y
+conda activate tpo
+pip install -r requirements.txt
+```
+Then, you can use the following Python snippet:
+```python
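+# Requires chat.py (the script run below) to be on the import path
+from chat import TPOChat
+# Load the TPO-LLaVA-7B-Full checkpoint
+chat_model = TPOChat('helehan/topic-overwrite-llava-7b-full')
+image_path = "Your_Image_Path.jpg"  # Replace with the path to your image
+msgs = "Describe in detail the people in the picture."
+# The chat interface takes the image path and the question in one dict
+inputs = {"image": image_path, "question": msgs}
+answer = chat_model.chat(inputs)
+print(answer)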
+```
+You can also run inference by executing the script directly:
+```bash
+python chat.py
+```
+For more detailed usage, including training and evaluation instructions, please refer to the [GitHub repository](https://github.com/tpr-dpo/tpr-dpo).
+## Dialogue Examples
+<div align="center">
+  <img src="https://github.com/tpr-dpo/tpr-dpo/raw/main/examples/test1.png" width="70%">
+</div>
+<div align="center">
+  <img src="https://github.com/tpr-dpo/tpr-dpo/raw/main/examples/test2.png" width="70%">
+</div>
 ## Citation
+If you find our work helpful or inspiring, please feel free to cite it:
 ```bibtex
 @article{he2024topic,
   title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},