---
datasets:
  - helehan/topic-overwrite
language:
  - en
license: apache-2.0
pipeline_tag: image-text-to-text
library_name: transformers
---

TPO: A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs

This repository contains the TPO-LLaVA-7B-Full model, trained using the Topic-level Preference Overwriting (TPO) method. TPO is a novel framework designed for the systematic optimization of reward gap configuration to mitigate hallucinations in Vision Language Models (VLMs), as presented in the paper:

Systematic Reward Gap Optimization for Mitigating VLM Hallucinations

Project Page | GitHub Repository | Hugging Face Dataset

πŸŽ‰ News

  • [2024.12.08] We open-source the code, weights (7B, LoRA), and data of TPO!
  • [2024.11.26] Our paper is now accessible on arXiv!

πŸ“œ Overview

We propose Topic-level Preference Overwriting (TPO), a topic-level self-correctional paradigm tailored for reducing hallucinations. We adopt a deconfounded algorithm that replaces each topic involved in a complex response with the best or worst alternative, resampled multiple times on the same topic from the reference model itself.
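The overwriting step above can be sketched roughly as follows. This is an illustrative sketch only: `split_into_topics`, `resample`, and `score` are hypothetical stand-ins for the actual topic segmentation, reference-model resampling, and topic-scoring components of the TPO pipeline (see the GitHub repository for the real implementation).

```python
# Illustrative sketch of topic-level preference overwriting.
# All helper names here are hypothetical stand-ins, not the real TPO API.

def split_into_topics(response):
    """Toy segmentation: treat each sentence as one topic."""
    return [s.strip() for s in response.split(".") if s.strip()]

def build_preference_pair(response, resample, score, k=3):
    """Overwrite every topic with its best / worst resampled alternative,
    producing one (chosen, rejected) preference pair."""
    chosen, rejected = [], []
    for topic in split_into_topics(response):
        candidates = [resample(topic) for _ in range(k)]  # k resamples per topic
        ranked = sorted(candidates, key=score)
        chosen.append(ranked[-1])   # highest-scoring alternative -> chosen
        rejected.append(ranked[0])  # lowest-scoring alternative -> rejected
    return ". ".join(chosen) + ".", ". ".join(rejected) + "."
```

Because every topic in the pair differs only in its resampled alternative, the reward gap between chosen and rejected responses is localized to the topics themselves rather than confounded by unrelated parts of the response.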


Model Details

This model is based on LLaVA and trained with the RLHF/RLAIF-style preference-learning method proposed in the TPO paper, yielding enhanced trustworthiness and reduced hallucinations.
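Preference pairs of this kind are commonly optimized with a DPO-style objective; the exact training objective used for this checkpoint is defined in the paper and repository, so the following is only a generic sketch of such a loss on one pair of sequence log-probabilities.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Generic DPO-style loss for one preference pair.

    pi_* are sequence log-probs under the policy being trained,
    ref_* under the frozen reference model; beta scales the implicit reward.
    """
    # Implicit reward margin: policy log-ratio of chosen minus rejected,
    # each measured relative to the reference model.
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    # Negative log-sigmoid of the scaled margin.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

When the policy assigns a larger relative log-probability to the chosen response, the margin is positive and the loss drops below log 2; at a zero margin the loss is exactly log 2.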

Model Description

Usage

We provide a simple example to show how to use TPO for inference.

First, ensure you have the necessary packages installed (refer to the GitHub repository for requirements.txt):

conda create -n tpo python=3.10 -y
conda activate tpo
pip install -r requirements.txt

Then, you can use the following Python snippet:

from chat import TPOChat, img2base64

# Load the TPO-trained model from the Hub.
chat_model = TPOChat('helehan/topic-overwrite-llava-7b-full')
image_path = "Your_Image_Path.jpg"  # replace with the path to your image
msgs = "Describe in detail the people in the picture."
inputs = {"image": image_path, "question": msgs}
answer = chat_model.chat(inputs)
print(answer)

You can also run inference by executing the script directly:

python chat.py

For more detailed usage, including training and evaluation instructions, please refer to the GitHub repository.

Dialogue Examples

Citation

If you find our work helpful or inspiring, please feel free to cite it:

@article{he2024topic,
  title={A Topic-level Self-Correctional Approach to Mitigate Hallucinations in MLLMs},
  author={He, Lehan and Chen, Zeren and Shi, Zhelun and Yu, Tianyu and Shao, Jing and Sheng, Lu},
  journal={arXiv preprint arXiv:2411.17265},
  year={2024}
}