---
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
tags:
- rag
---

<div align="center">
<b style="font-size: 40px;">Ext2Gen-8B-R2</b>
</div>

Note: We are still working on this.

Are you looking for a more robust and reliable generation model for your RAG system?

Ext2Gen-8B-R2 is a model that effectively mitigates hallucinations caused by retrieval noise and information overload.

See the details in our [paper](https://arxiv.org/pdf/2503.04789).

### What is Ext2Gen-8B-R2?
Ext2Gen-8B-R2 is built upon Llama-3.1-8B-Instruct and is fine-tuned with preference alignment through pairwise feedback learning.

This training strategy enables the model to:
- Extract highly relevant sentences from retrieved chunks before generating an answer.
- Filter out irrelevant or misleading information, reducing hallucinations.
- Align generation with human preferences by optimizing for faithfulness, completeness, and conciseness.

### Why does Ext2Gen-8B-R2 outperform standard RAG models?
Standard RAG models often struggle due to:
- Uncertain Placement – Relevant information may appear in unpredictable locations within retrieved chunks, making it difficult for LLMs to utilize it effectively.
- Information Overload – The presence of irrelevant chunks can distract the model, leading to errors or hallucinations.
- Lack of Alignment – Most generation models are not explicitly trained to prioritize relevant content over noise.
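
The first two failure modes can be reproduced on purpose when probing a generator's robustness. The snippet below is a minimal sketch, not part of the Ext2Gen release: `build_noisy_context` and `format_context` are hypothetical helper names, and the `Chunk i:` layout is an assumption rather than the model's actual prompt format. It mixes relevant chunks with distractors (information overload) and shuffles them (uncertain placement).

```python
import random
from typing import List, Sequence


def build_noisy_context(
    relevant: Sequence[str],
    distractors: Sequence[str],
    num_distractors: int,
    seed: int = 0,
) -> List[str]:
    """Mix relevant chunks with distractor chunks and shuffle their order.

    Hypothetical helper for illustration only.
    """
    if num_distractors > len(distractors):
        raise ValueError("not enough distractor chunks to sample from")
    rng = random.Random(seed)
    # Information overload: add irrelevant chunks alongside the relevant ones.
    noise = rng.sample(list(distractors), num_distractors)
    chunks = list(relevant) + noise
    # Uncertain placement: a relevant chunk may land at any position.
    rng.shuffle(chunks)
    return chunks


def format_context(chunks: Sequence[str]) -> str:
    """Number the chunks for pasting into a prompt (the layout is an assumption)."""
    return "\n".join(f"Chunk {i}: {text}" for i, text in enumerate(chunks, start=1))


if __name__ == "__main__":
    context = build_noisy_context(
        relevant=["Paris is the capital of France."],
        distractors=[
            "Berlin is in Germany.",
            "The Nile is a river.",
            "Tokyo hosts many museums.",
        ],
        num_distractors=2,
        seed=42,
    )
    print(format_context(context))
```

Raising `num_distractors` increases the noise level, and changing `seed` moves the relevant chunk to a different position, so the same question can be tested under several placements.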

### Prompt

TBD

### Performance Benchmark
Our evaluations show that Ext2Gen-8B-R2 improves the robustness of RAG systems:
* We run a QA task with RAG systems on the NQ, MS-MARCO, and HotpotQA datasets.
* The only difference between the compared systems is the generation backbone: Llama-3.1-8B-Instruct vs. Ext2Gen-8B-R2.
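
The comparison setup described above, where retrieval is held fixed and only the generation backbone changes, could be sketched roughly as follows. This is an illustrative harness under stated assumptions, not the evaluation code behind the reported results: `compare_backbones` is a hypothetical name, each generator is any callable you supply, and `contains_answer` is a toy metric that is not claimed to match the paper's metrics.

```python
from typing import Callable, Dict, Sequence

Generator = Callable[[str, Sequence[str]], str]  # (question, chunks) -> answer
Scorer = Callable[[str, str], float]             # (prediction, reference) -> score


def compare_backbones(
    examples: Sequence[dict],
    generators: Dict[str, Generator],
    score: Scorer,
) -> Dict[str, float]:
    """Return the average score per backbone on identical questions and chunks."""
    if not examples:
        raise ValueError("examples must not be empty")
    totals = {name: 0.0 for name in generators}
    for ex in examples:
        for name, generate in generators.items():
            # Every backbone receives the same chunks, so retrieval is held fixed.
            prediction = generate(ex["question"], ex["chunks"])
            totals[name] += score(prediction, ex["answer"])
    return {name: total / len(examples) for name, total in totals.items()}


def contains_answer(prediction: str, reference: str) -> float:
    """Toy metric (an assumption): 1.0 if the reference string appears in the prediction."""
    return float(reference.lower() in prediction.lower())
```

In practice each entry in `generators` would wrap a call to one backbone, and `examples` would hold the questions, retrieved chunks, and reference answers from a dataset.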

See the results in the figure below:

