Improve model card: Add pipeline tag, library name, project page, and usage example

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +77 -6
README.md CHANGED
@@ -1,14 +1,44 @@
1
  ---
2
- license: apache-2.0
3
  datasets:
4
  - helehan/topic-overwrite
5
  language:
6
  - en
7
  ---
8
 
9
- # Model Card for Model ID
10
 
11
- [GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) | [Paper](https://arxiv.org/abs/2411.17265)
12
 
13
  ## Model Details
14
 
@@ -16,15 +46,56 @@ The model, trained using the RLHF/RLAIF methods proposed in the [TPO paper](http
16
 
17
  ## Model Description
18
 
19
- - **Trained from model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
20
- - **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
21
 
22
  ## Usage
23
 
24
- Please look at [GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) for more details about usage.
25
 
26
  ## Citation
27
 
28
  ```bibtex
29
  @article{he2024topic,
30
  title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},
 
1
  ---
2
  datasets:
3
  - helehan/topic-overwrite
4
  language:
5
  - en
6
+ license: apache-2.0
7
+ pipeline_tag: image-text-to-text
8
+ library_name: transformers
9
  ---
10
 
11
+ # TPO: A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs
12
+
13
+ This repository contains the **TPO-LLaVA-7B-Full** model, trained using the Topic-level Preference Overwriting (TPO) method. TPO is a novel framework designed for the systematic optimization of reward gap configuration to mitigate hallucinations in Vision Language Models (VLMs), as presented in the paper:
14
+
15
+ [**Systematic Reward Gap Optimization for Mitigating VLM Hallucinations**](https://arxiv.org/abs/2411.17265)
16
+
17
+ [Project Page](https://tpr-dpo.github.io) | [GitHub Repository](https://github.com/tpr-dpo/tpr-dpo) | [Hugging Face Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
18
+
19
+ <div align="center" style="font-size: 15pt">
20
+
21
+ <a href='https://arxiv.org/abs/2411.17265'><img src='https://img.shields.io/badge/Paper-PDF-purple'></a>
22
+ <a href='https://huggingface.co/datasets/helehan/topic-overwrite'><img src='https://img.shields.io/badge/Dataset-HF-Green'></a>
23
+ <a href='https://huggingface.co/helehan/topic-overwrite-llava-7b-full'><img src='https://img.shields.io/badge/Model-7B-orange'></a>
24
+ <a href='https://huggingface.co/helehan/topic-overwrite-llava-7b-lora'><img src='https://img.shields.io/badge/Model-Lora-orange'></a>
25
+
26
+ </div>
27
+
28
+ ## 🎉 News
29
+
30
+ - [2024.12.08] We open-source the code, weights ([7B](https://huggingface.co/helehan/topic-overwrite-llava-7b-full), [LoRA](https://huggingface.co/helehan/topic-overwrite-llava-7b-lora)), and [data](https://huggingface.co/datasets/helehan/topic-overwrite) of TPO!
31
+ - [2024.11.26] Our paper is now available on [arXiv](https://arxiv.org/abs/2411.17265)!
32
+
33
+ ## 📜 Overview
34
+
35
+ We propose Topic-level Preference Overwriting (TPO), a topic-level self-correctional paradigm tailored for reducing hallucinations. TPO adopts a deconfounded algorithm that replaces every topic in a complex response with the best or worst of several alternatives resampled from the reference model itself on the same topic.
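
The overwriting step described above can be illustrated with a minimal sketch. It assumes the response has already been split into per-topic segments and that each resampled alternative carries a quality score; the `Candidate` class, the `overwrite_topics` function, and the scores are hypothetical names and values chosen for illustration, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str     # one resampled alternative for a topic
    score: float  # hypothetical quality score (higher = more faithful)


def overwrite_topics(segments, candidates):
    """Build a (chosen, rejected) pair by overwriting every topic segment.

    segments:   list of (topic, original_text) pairs in response order
    candidates: dict mapping topic -> list of Candidate for that topic
    """
    chosen, rejected = [], []
    for topic, original in segments:
        alts = candidates.get(topic, [])
        if not alts:
            # No alternatives sampled: keep the original text on both sides.
            chosen.append(original)
            rejected.append(original)
            continue
        # Best alternative goes into the preferred response,
        # worst alternative into the dispreferred one.
        chosen.append(max(alts, key=lambda c: c.score).text)
        rejected.append(min(alts, key=lambda c: c.score).text)
    return " ".join(chosen), " ".join(rejected)


# Toy data, invented for illustration only.
segments = [
    ("dog", "A dog sits on the grass."),
    ("ball", "It holds a red ball."),
]
candidates = {
    "dog": [Candidate("A brown dog sits on the grass.", 0.9),
            Candidate("A cat sits on the sofa.", 0.1)],
    "ball": [Candidate("A blue ball lies next to it.", 0.8),
             Candidate("It is holding a frisbee.", 0.2)],
}
chosen, rejected = overwrite_topics(segments, candidates)
print(chosen)
print(rejected)
```

Because both responses cover the same topics in the same order, the pair differs only in per-topic quality, which is the property the overview attributes to the deconfounded construction.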
36
 
37
+ <table align="center">
38
+ <p align="center">
39
+ <img src="https://github.com/tpr-dpo/tpr-dpo/raw/main/examples/intro1.png" width="95%" alt="intro1" />
40
+ </p>
41
+ </table>
42
 
43
  ## Model Details
44
 
 
46
 
47
  ## Model Description
48
 
49
+ - **Trained from model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
50
+ - **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
51
 
52
  ## Usage
53
 
54
+ We provide a simple example to show how to use TPO for inference.
55
+
56
+ First, ensure you have the necessary packages installed (refer to the [GitHub repository](https://github.com/tpr-dpo/tpr-dpo) for `requirements.txt`):
57
+
58
+ ```bash
59
+ conda create -n tpo python=3.10 -y
60
+ conda activate tpo
61
+ pip install -r requirements.txt
62
+ ```
63
+
64
+ Then, you can use the following Python snippet:
65
+
66
+ ```python
67
+ from chat import TPOChat, img2base64
68
+
69
+ chat_model = TPOChat('helehan/topic-overwrite-llava-7b-full')
70
+ image_path = "Your_Image_Path.jpg"  # Replace with the path to your image
71
+ msgs = "Describe in detail the people in the picture."
72
+ inputs = {"image": image_path, "question": msgs}
73
+ answer = chat_model.chat(inputs)
74
+ print(answer)
75
+ ```
76
+
77
+ You can also run inference by executing the `chat.py` script directly:
78
+
79
+ ```bash
80
+ python chat.py
81
+ ```
82
+
83
+ For more detailed usage, including training and evaluation instructions, please refer to the [GitHub repository](https://github.com/tpr-dpo/tpr-dpo).
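
As a small convenience, the input dictionary from the Python snippet above can be validated before it is passed to `chat_model.chat`. The sketch below assumes only the `{"image": ..., "question": ...}` shape shown in that snippet; `build_inputs` is a hypothetical helper and is not part of the repository's `chat.py`.

```python
from pathlib import Path


def build_inputs(image_path, question):
    """Return the input dict used in the snippet above, after basic checks."""
    path = Path(image_path)
    if not path.is_file():
        # Fail early instead of passing a bad path to the model.
        raise FileNotFoundError(f"Image not found: {path}")
    question = question.strip()
    if not question:
        raise ValueError("Question must not be empty")
    return {"image": str(path), "question": question}


try:
    inputs = build_inputs("Your_Image_Path.jpg",
                          "Describe in detail the people in the picture.")
    print(inputs)
except FileNotFoundError as err:
    # Reached when the placeholder path has not been replaced.
    print(err)
```

The resulting `inputs` can then be handed to `chat_model.chat(inputs)` exactly as in the snippet above.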
84
+
85
+ ## Dialogue Examples
86
+
87
+ <div align="center">
88
+ <img src="https://github.com/tpr-dpo/tpr-dpo/raw/main/examples/test1.png" width="70%">
89
+ </div>
90
+
91
+ <div align="center">
92
+ <img src="https://github.com/tpr-dpo/tpr-dpo/raw/main/examples/test2.png" width="70%">
93
+ </div>
94
 
95
  ## Citation
96
 
97
+ If you find our work helpful or inspiring, please feel free to cite it:
98
+
99
  ```bibtex
100
  @article{he2024topic,
101
  title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},