For more details, please refer to our GitHub repo: https://github.com/FlagOpen/FlagEmbedding
# Lore-Bge3: Logic-ORiented Retriever Enhancement for BGE-M3
This model is a fine-tuned version of BAAI/bge-m3 trained with the LORE (Logic-ORiented Retriever Enhancement) method. It significantly improves retrieval performance on queries that contain complex logical expressions.
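Because Lore-Bge3 keeps the BGE-M3 architecture, it should load through the standard FlagEmbedding interface. Below is a minimal sketch; the Hub ID `your-org/Lore-Bge3` is a placeholder for the actual checkpoint path, and the example queries and documents are illustrative only.

```python
from FlagEmbedding import BGEM3FlagModel

# Placeholder Hub ID -- replace with the actual path of this checkpoint.
model = BGEM3FlagModel("your-org/Lore-Bge3", use_fp16=True)

queries = ["Which methods use contrastive learning but avoid mined hard negatives?"]
docs = [
    "We train a contrastive objective without any mined hard negatives.",
    "Our retriever depends on BM25-mined hard negatives during training.",
]

# Dense embeddings; sparse and multi-vector outputs are also available,
# as with the original BGE-M3 model.
q_emb = model.encode(queries)["dense_vecs"]
d_emb = model.encode(docs)["dense_vecs"]

# Similarity via inner product of the (normalized) dense vectors.
print(q_emb @ d_emb.T)
```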
## LORE Method Overview
LORE is a novel embedding enhancement method that improves retrieval performance through fine-grained contrastive learning:
- Three-tier Contrastive Learning: Fine-grained classification of training samples into P (positive), N1 (distractor), and N2 (negative)
- Dual Encoder Architecture: A frozen document encoder M_d paired with a trainable query encoder M_q
- InfoNCE-based Loss: Differentiated weights enforce the hierarchical separation P ≻ N1 ≻ N2 (see the sketch after this list)
- Query Rewriting: LLM-assisted dataset construction with discourse relations from Rhetorical Structure Theory (RST)
- No External Dependencies: Requires no external supervision, resources, or pre-retrieval analysis
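The exact weighting scheme of the loss is not spelled out here, so the following PyTorch sketch shows only one plausible reading: an InfoNCE objective in which distractor (N1) and negative (N2) terms enter the denominator with different weights, with document embeddings coming from the frozen encoder. The function name, weight values, and tensor shapes are assumptions for illustration.

```python
import math
import torch
import torch.nn.functional as F

def lore_style_infonce(q, pos, n1, n2, temperature=0.05, w_n1=2.0, w_n2=1.0):
    """Weighted InfoNCE sketch for the hierarchy P > N1 > N2 (assumed form).

    q:   (B, D)      query embeddings from the trainable encoder M_q
    pos: (B, D)      positive document embeddings from the frozen encoder M_d
    n1:  (B, K1, D)  distractor embeddings (harder, weighted more heavily)
    n2:  (B, K2, D)  negative embeddings (easier, weighted less)
    """
    q, pos = F.normalize(q, dim=-1), F.normalize(pos, dim=-1)
    n1, n2 = F.normalize(n1, dim=-1), F.normalize(n2, dim=-1)

    s_pos = (q * pos).sum(-1, keepdim=True) / temperature    # (B, 1)
    s_n1 = torch.einsum("bd,bkd->bk", q, n1) / temperature   # (B, K1)
    s_n2 = torch.einsum("bd,bkd->bk", q, n2) / temperature   # (B, K2)

    # Adding log-weights to the logits multiplies the corresponding exp terms
    # in the InfoNCE denominator, so N1 and N2 contribute with different weights.
    logits = torch.cat([s_pos, s_n1 + math.log(w_n1), s_n2 + math.log(w_n2)], dim=-1)
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)


# Tiny smoke test with random tensors; in practice pos/n1/n2 would come from
# the frozen M_d and carry no gradient.
B, D, K1, K2 = 4, 16, 2, 3
loss = lore_style_infonce(
    torch.randn(B, D, requires_grad=True),
    torch.randn(B, D),
    torch.randn(B, K1, D),
    torch.randn(B, K2, D),
)
print(loss.item())
```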
## Key Improvements
- Enhanced Logical Reasoning: Improved ability to handle complex logical expressions in queries
- Fine-grained Discrimination: Better distinction between relevant content and distractors
- Maintained Efficiency: Preserves the computational efficiency of the original model