Improve model card: Add pipeline tag, library, project page, usage example, and update title/links

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +23 -9
README.md CHANGED
@@ -1,35 +1,49 @@
  ---
- license: apache-2.0
  datasets:
  - helehan/topic-overwrite
  language:
  - en
  ---
 
- # Model Card for Model ID
 
- [GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) | [Paper](https://arxiv.org/abs/2411.17265)
 
  ## Model Details
 
- The model, trained using the RLHF/RLAIF methods proposed in the [TPO paper](https://arxiv.org/abs/2411.17265) by llava, has enhanced trustworthiness and reduced hallucinations.
 
  ## Model Description
 
- - **Trained from model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
- - **Lora Config:** [llava-v1.5-7B-lora](https://huggingface.co/liuhaotian/llava-v1.5-7b-lora)
- - **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
 
  ## Usage
 
- Please look at [GitHub](https://github.com/topic-overwrite/topic-level-overwrite/tree/main) for more details about usage.
 
  ## Citation
 
  ```bibtex
  @article{he2024topic,
  title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},
- author={He, Lehan and Chen, Zeren and Shi, Zhelun and Yu, Tianyu and Shao, Jing and Sheng, Lu},
  journal={arXiv preprint arXiv:2411.17265},
  year={2024}
  }
 
  ---
  datasets:
  - helehan/topic-overwrite
  language:
  - en
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ library_name: transformers
  ---
 
+ # Systematic Reward Gap Optimization for Mitigating VLM Hallucinations
 
+ [Project Page](https://tpr-dpo.github.io) | [GitHub](https://github.com/tpr-dpo/tpr-dpo) | [Paper](https://arxiv.org/abs/2411.17265)
 
  ## Model Details
 
+ This model is a Vision Language Model (VLM) specifically designed to mitigate hallucinations. It is trained using the Topic-level Preference Overwriting (TPO) approach, an RLHF/RLAIF method that systematically optimizes reward gaps in preference pairs during data curation. TPO achieves topic-level control over fine-grained semantic details by selectively replacing semantic topics in VLM responses with resampled candidates, leading to enhanced trustworthiness and reduced hallucinations.
 
  ## Model Description
 
+ - **Trained from base model:** [llava-v1.5-7B](https://huggingface.co/liuhaotian/llava-v1.5-7b)
+ - **LoRA Config:** [llava-v1.5-7B-lora](https://huggingface.co/liuhaotian/llava-v1.5-7b-lora)
+ - **Trained on data:** [TPO-Dataset](https://huggingface.co/datasets/helehan/topic-overwrite)
 
  ## Usage
 
+ Here's a simple example demonstrating how to use the TPO model for inference:
+
+ ```python
+ from chat import TPOChat, img2base64
+
+ chat_model = TPOChat('helehan/topic-overwrite-llava-7b-full')
+ image_path = "Your_Image_Path.jpg"
+ msgs = "Describe in detail the people in the picture."
+ inputs = {"image": image_path, "question": msgs}
+ answer = chat_model.chat(inputs)
+ print(answer)
+ ```
+ You can also run inference by executing the `chat.py` script from the GitHub repository.
 
  ## Citation
 
  ```bibtex
  @article{he2024topic,
  title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},
+ author={He, Lehan and Zeren Chen and Shi, Zhelun and Yu, Tianyu and Shao, Jing and Sheng, Lu},
  journal={arXiv preprint arXiv:2411.17265},
  year={2024}
  }
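
The updated Model Details paragraph describes TPO as replacing semantic topics in a response with resampled candidates so that the resulting preference pair has a controlled reward gap. The toy sketch below illustrates that idea only; it is not the authors' implementation. The `Candidate` class, the `build_preference_pair` function, and the numeric scores are all hypothetical stand-ins, and the real pipeline in the linked repository may decompose, resample, and score topics quite differently.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Candidate:
    """One resampled wording for a single semantic topic (hypothetical type)."""
    text: str
    score: float  # assumed quality score; higher means more faithful to the image


def build_preference_pair(topics: List[List[Candidate]]) -> Tuple[str, str]:
    """Assemble a (chosen, rejected) pair by overwriting each topic.

    For every topic, the chosen response takes the highest-scoring candidate
    and the rejected response takes the lowest-scoring one, which widens the
    gap between the two responses topic by topic.
    """
    chosen = [max(cands, key=lambda c: c.score).text for cands in topics]
    rejected = [min(cands, key=lambda c: c.score).text for cands in topics]
    return " ".join(chosen), " ".join(rejected)


# Two topics, each with two resampled candidates (made-up data).
topics = [
    [Candidate("A man rides a bicycle.", 0.9), Candidate("A man rides a horse.", 0.2)],
    [Candidate("He wears a red helmet.", 0.8), Candidate("He wears a blue hat.", 0.3)],
]

chosen, rejected = build_preference_pair(topics)
print("chosen:  ", chosen)
print("rejected:", rejected)
```

Pairs built this way could then feed a preference-optimization step; how the candidates are actually scored is left open here on purpose.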