Model Card: BGE-Reranker-VietFinance
Overview
This is a cross-encoder reranker fine-tuned from BAAI/bge-reranker-v2-m3 to score (query, passage) pairs for reranking in Vietnamese financial and news search systems.
Intended Use
- Primary: improve Hit@K by reranking candidate passages produced by an upstream hybrid retriever (BM25 + dense embeddings).
- Not for: standalone generation, non-Vietnamese domains, or high-stakes automated decisions without human review.
Essential Statistics
- Base model: BAAI/bge-reranker-v2-m3
- Embedding model (used for retrieval/hard-negative mining): BAAI/bge-m3
- Max sequence length for reranking: 1536 tokens (inputs longer than this are truncated)
- Retrieval strategy: temporal-aware hybrid (BM25 + dense embeddings with temporal boosting)
- Saved artifacts in this folder: model.safetensors, tokenizer.json, tokenizer_config.json, config.json.
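The card does not specify how the temporal-aware hybrid retriever combines its signals. As a purely illustrative sketch, one common pattern is to min-max normalize the BM25 and dense scores, mix them with a weight, and add a recency term that decays with document age. The function names, the weights `alpha` and `beta`, and the 30-day half-life below are all assumptions, not values taken from this model's pipeline.

```python
from datetime import date

def min_max(scores):
    """Scale a list of scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def recency_boost(doc_date, query_date, half_life_days=30.0):
    """Exponential decay: 1.0 for a same-day document, 0.5 at one half-life."""
    age_days = abs((query_date - doc_date).days)
    return 0.5 ** (age_days / half_life_days)

def hybrid_scores(bm25, dense, doc_dates, query_date, alpha=0.5, beta=0.2):
    """Hypothetical fusion: alpha * BM25 + (1 - alpha) * dense + beta * recency."""
    bm25_n, dense_n = min_max(bm25), min_max(dense)
    return [
        alpha * b + (1 - alpha) * d + beta * recency_boost(dt, query_date)
        for b, d, dt in zip(bm25_n, dense_n, doc_dates)
    ]
```

The candidates with the highest fused scores would then be passed to the reranker; any real deployment should use the fusion rule and weights of its own retriever.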
Evaluation (concise)
- Procedure: retrieve candidate passages (temporal-aware hybrid) → rerank with cross-encoder → compute Hit@K for K ∈ {1, 3, 5, 10, 20}.
- Results: Hit@K values are stored in the run outputs (summary CSV / JSONL files) and are not embedded in this card.
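For readers reproducing the metric, here is a minimal sketch of Hit@K. It assumes the common definition: the fraction of queries for which at least one gold passage appears among the top K reranked candidates. The function names and the passage ids are made up for illustration, and the evaluation scripts behind this card may differ in detail.

```python
def hit_at_k(ranked_ids, gold_ids, k):
    """Return 1.0 if any gold passage id appears in the top-k ranked ids, else 0.0."""
    return 1.0 if set(ranked_ids[:k]) & set(gold_ids) else 0.0

def mean_hit_at_k(runs, ks=(1, 3, 5, 10, 20)):
    """Average Hit@K over queries; `runs` is a list of (ranked_ids, gold_ids) pairs."""
    if not runs:
        raise ValueError("runs must contain at least one query")
    return {k: sum(hit_at_k(r, g, k) for r, g in runs) / len(runs) for k in ks}

# Two toy queries: the gold passage is ranked 1st for the first, 3rd for the second.
runs = [
    (["p1", "p3", "p7"], {"p1"}),
    (["p2", "p9", "p4"], {"p4"}),
]
print(mean_hit_at_k(runs, ks=(1, 3)))  # {1: 0.5, 3: 1.0}
```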
Limitations & Risks
- Domain-specific: optimized for Vietnamese financial/news passages; generalization outside this domain/language is uncertain.
- Retrieval dependency: reranker cannot recover gold passages not present among retrieval candidates.
- Truncation risk: 1536-token truncation may drop important context for long passages.
- Data & license: dataset provenance and license are not specified here; verify before public distribution.
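To gauge how often the truncation risk above applies to a given corpus, one option is to tokenize each pair without truncation and compare its length to the 1536-token limit. This sketch assumes `tokenizer` is the Hugging Face tokenizer saved with this checkpoint (or any callable with the same interface); the helper name is hypothetical.

```python
def exceeds_max_length(tokenizer, query, passage, max_length=1536):
    """Return True if the encoded (query, passage) pair is longer than max_length tokens."""
    # truncation=False keeps the full sequence so its true length can be measured.
    encoded = tokenizer(query, passage, truncation=False, add_special_tokens=True)
    return len(encoded["input_ids"]) > max_length
```

Pairs flagged this way could be chunked into shorter passages upstream instead of being silently cut off at scoring time.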
Bias & Safety
- Model reflects biases in the source news corpus (topic/regional biases).
- Temporal heuristics can misinterpret ambiguous locale-specific dates and cause incorrect boosts.
- Do not rely on reranker outputs alone for automated financial, legal, or medical decisions.
Quick usage (inference)
Load this checkpoint with AutoTokenizer / AutoModelForSequenceClassification, tokenize (query, passage) pairs, score in eval mode, and sort candidates by descending score (higher = more relevant).
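The steps above can be sketched roughly as follows. This assumes `torch` and `transformers` are installed and that the checkpoint, like its base model, emits a single relevance logit per pair. `load_scorer`, `rerank`, and the batch size of 16 are illustrative choices, not part of the released code.

```python
def load_scorer(model_path, max_length=1536, batch_size=16):
    """Return score_fn(query, passages) -> list of floats, backed by the checkpoint."""
    # Imported lazily so the rerank helper below works without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
    model.eval()  # disable dropout for deterministic scoring

    def score_fn(query, passages):
        scores = []
        for start in range(0, len(passages), batch_size):
            batch = passages[start:start + batch_size]
            inputs = tokenizer(
                [query] * len(batch), batch,
                padding=True, truncation=True,
                max_length=max_length, return_tensors="pt",
            )
            with torch.no_grad():
                # Assumed logits shape: (batch, 1), one relevance score per pair.
                logits = model(**inputs).logits
            scores.extend(logits.view(-1).float().tolist())
        return scores

    return score_fn

def rerank(query, passages, score_fn):
    """Return (passage, score) pairs sorted by descending score (higher = more relevant)."""
    scores = score_fn(query, passages)
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
```

A call would look like `rerank(query, candidates, load_scorer(model_path))`, where `model_path` is the local folder holding the saved artifacts or the model's Hub id.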
License & Citation
- License: not specified in the checkpoint; confirm before redistribution.
- Cite the base models BAAI/bge-reranker-v2-m3 and BAAI/bge-m3 when reporting results.