---
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
tags:
- rag
---
<div align="center">
<b style="font-size: 40px;">Ext2Gen-8B-R2</b>
</div>
Looking for a more robust and reliable generation model for your RAG system?
Ext2Gen-8B-R2 effectively mitigates hallucinations caused by retrieval noise and information overload.
See the details in our paper [Link](https://arxiv.org/pdf/2503.04789)
### What is Ext2Gen-8B-R2?
Ext2Gen-8B-R2 is built upon Llama-3.1-8B-Instruct and incorporates preference-aligned fine-tuning through pairwise feedback learning.
This training strategy enables the model to:
- Extract highly relevant sentences from retrieved chunks before generating an answer.
- Filter out irrelevant or misleading information, reducing hallucinations.
- Align generation with human preferences by optimizing for faithfulness, completeness, and conciseness.
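The extract-then-generate idea above can be sketched as a prompt-assembly step. Note that the model's official prompt template is still listed as TBD below, so the instruction wording and chunk formatting here are illustrative assumptions, not the actual template:

```python
# Hypothetical sketch of assembling an extract-then-generate RAG prompt.
# The official Ext2Gen-8B-R2 prompt template is not yet published (TBD),
# so this wording and layout are assumptions for illustration only.

def build_rag_prompt(query: str, chunks: list[str]) -> str:
    """Format a query and retrieved chunks into a single prompt string."""
    numbered = "\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "First extract the sentences relevant to the query from the chunks "
        "below, then answer the query using only the extracted sentences.\n\n"
        f"Query: {query}\n\nRetrieved chunks:\n{numbered}\n"
    )

prompt = build_rag_prompt(
    "Who wrote Hamlet?",
    [
        "Hamlet is a tragedy written by William Shakespeare.",
        "The Globe Theatre opened in 1599.",  # distractor chunk
    ],
)
print(prompt)
```

The resulting string would be passed to the model as the user message; the distractor chunk illustrates the retrieval noise the model is trained to filter out.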
### Why does Ext2Gen-8B-R2 outperform standard RAG models?
Standard RAG models often struggle due to:
- Uncertain Placement – Relevant information may appear in unpredictable locations within retrieved chunks, making it difficult for LLMs to utilize it effectively.
- Information Overload – The presence of irrelevant chunks can distract the model, leading to errors or hallucinations.
- Lack of Alignment – Most generation models are not explicitly trained to prioritize relevant content over noise.
### Prompt
TBD
### Performance Benchmark
Our evaluations demonstrate that Ext2Gen-8B-R2 significantly enhances robustness in RAG systems:
* We conduct a QA task using RAG systems on the NQ, MS-MARCO, and HotpotQA datasets.
* The only difference between the systems is the generation backbone: Llama-3.1-8B-Instruct vs. Ext2Gen-8B-R2.
See the results in the figure below:
