Are you looking for a summarizer that outperforms the nearly 10x larger Llama3-70B-Instruct while offering much faster inference speed?
Our SummLlama3-8B could be exactly what you need!
SummLlama3 is initialized from Llama3-8B-Instruct, with additional training using Direct Preference Optimization (DPO) based on human-like summarization feedback.
Please refer to our paper to learn how to exploit LLM-generated feedback in the context of text summarization.
We have also released a larger model, SummLlama3-70B; please follow the Hugging Face link for that model.
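As a minimal usage sketch, the prompt below follows the standard Llama 3 chat template (the repository id and the exact wording of the instruction are our assumptions; verify both against the model card before relying on them):

```python
# Sketch of preparing a summarization prompt for SummLlama3-8B.
# The special tokens below are the standard Llama 3 chat-template markers.

def build_summarization_prompt(document: str) -> str:
    """Wrap a document in a Llama 3-style chat prompt asking for a summary."""
    return (
        "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
        "Please summarize the following text:\n\n"
        f"{document}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# With Hugging Face transformers (not run here; "DISLab/SummLlama3-8B" is an
# assumed repository id), generation would look roughly like:
#
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained("DISLab/SummLlama3-8B")
#   model = AutoModelForCausalLM.from_pretrained("DISLab/SummLlama3-8B")
#   inputs = tok(build_summarization_prompt(doc), return_tensors="pt")
#   summary_ids = model.generate(**inputs, max_new_tokens=256)
```

In practice, `tokenizer.apply_chat_template` achieves the same formatting without hand-writing the special tokens.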
Here is a brief overview of our summarizer:
Rather than relying on expensive human feedback, we utilize high-quality, multi-dimensional, and fine-grained feedback generated by large language models (LLMs).
This model excels at faithfulness, completeness, and conciseness, the three dimensions humans most value when judging the quality of a summary.
- Faithfulness: a summarizer does not manipulate the information in the input text or add any information not directly inferable from the input text.
- Completeness: a summarizer ensures the inclusion of all key information from the input text in the output summary.
- Conciseness: a summarizer refrains from incorporating information outside the key information in the output, maintaining a succinct and focused summary.
Based on our comprehensive evaluation, which included both human and automated assessments of summary quality, SummLlama3 demonstrated significant improvements over the original Llama3 series.
Here are the results: