# BCT L12 Reranker

This is a cross-encoder reranker fine-tuned for retrieval-augmented generation (RAG) over Banque Centrale de Tunisie (BCT) regulatory documents. It scores (query, passage) pairs and is intended to reorder the candidate chunks returned by a first-stage hybrid retriever.

The model is fine-tuned from cross-encoder/ms-marco-MiniLM-L12-v2.

## Model Details

- Developed by: slim0001
- Model type: Cross-encoder sequence classification reranker
- Language(s): French primarily, with some Arabic/English metadata and queries
- Base model: cross-encoder/ms-marco-MiniLM-L12-v2
- Task: Passage reranking for BCT regulatory RAG
- License: Other / internal-use unless changed by owner

## Intended Use

### Direct Use

Use this model to score candidate regulatory text chunks for a user query. Higher scores indicate higher relevance.

Typical pipeline:

1. Retrieve candidates using dense/BM25/hybrid retrieval.
2. Score the top candidates with this reranker.
3. Sort candidates by reranker score.
4. Send the top passages to an answer-generation model.

Recommended settings from development experiments:

- rerank_top_k = 10
- final_top_k = 5-8
- max_length = 512
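The pipeline steps and settings above can be sketched as a small helper. This is a minimal illustration, not the author's pipeline code: `rerank_candidates` and `toy_score_fn` are hypothetical names, and the toy scorer only stands in for the real cross-encoder so that the control flow can run without downloading a model.

```python
from typing import Callable, List, Tuple


def rerank_candidates(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, List[str]], List[float]],
    rerank_top_k: int = 10,
    final_top_k: int = 5,
) -> List[Tuple[float, str]]:
    """Score the first rerank_top_k candidates and keep the best final_top_k."""
    pool = candidates[:rerank_top_k]   # candidates in first-stage order
    scores = score_fn(query, pool)     # one score per (query, passage) pair
    ranked = sorted(zip(scores, pool), key=lambda pair: pair[0], reverse=True)
    return ranked[:final_top_k]


def toy_score_fn(query: str, passages: List[str]) -> List[float]:
    # Hypothetical stand-in for the reranker: counts shared lowercase tokens.
    query_tokens = set(query.lower().split())
    return [float(len(query_tokens & set(p.lower().split()))) for p in passages]


top = rerank_candidates(
    "obligations de communication",
    ["taux directeur", "obligations de communication à la BCT", "communication annuelle"],
    toy_score_fn,
    final_top_k=2,
)
print(top)
```

In a real deployment, `score_fn` would wrap the tokenizer and model calls, and `final_top_k` would be chosen in the 5-8 range listed above.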

### Out-of-Scope Use

This model is not an answer-generation model. It should not be used to generate legal, banking, or compliance advice directly. It only ranks passages.

It is not validated outside the BCT regulatory corpus and may perform poorly on unrelated domains.

## Training Data

The model was trained using generated and curated question/chunk supervision derived from a BCT regulatory document corpus.

Main training experiment:

- training/eval source: qwen_bct_questions_5k.jsonl
- split: 80% train / 10% validation / 10% test
- train queries: ~4000
- validation queries: ~500
- test queries: ~500
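A query-level 80/10/10 split of this kind could look like the sketch below. The card does not say how the split was drawn, so the shuffling, the fixed seed, and the `split_queries` helper are all assumptions made for illustration.

```python
import random


def split_queries(queries, seed=0, train_frac=0.8, val_frac=0.1):
    """Shuffle once with a fixed seed, then cut into train/validation/test."""
    shuffled = list(queries)
    random.Random(seed).shuffle(shuffled)  # assumed: seeded shuffle for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


train, val, test = split_queries([f"q{i}" for i in range(5000)])
print(len(train), len(val), len(test))
```

With 5000 placeholder queries this yields 4000, 500, and 500 items, matching the approximate counts listed above.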

Gold chunks were resolved from the indexed BCT corpus. For each training query, the candidate list was the top-10 output of the hybrid retriever; when the gold chunk was missing from that list, it was injected.
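Gold injection can be sketched as follows. The card does not state where the gold chunk is placed or whether another candidate is dropped to make room, so replacing the lowest-ranked candidate is an assumption here, and `build_training_candidates` is a hypothetical helper.

```python
def build_training_candidates(retrieved_ids, gold_id, top_k=10):
    """Return at most top_k candidate ids that are guaranteed to contain gold_id."""
    candidates = list(retrieved_ids[:top_k])
    if gold_id not in candidates:
        # Assumption: keep the list at top_k by dropping the lowest-ranked candidate.
        candidates = candidates[:top_k - 1] + [gold_id]
    return candidates


# Gold already retrieved: the list is returned unchanged.
hit = build_training_candidates(["c1", "c2", "c3"], "c2", top_k=3)
# Gold missing: it replaces the last candidate.
miss = build_training_candidates(["c1", "c2", "c3"], "c9", top_k=3)
print(hit, miss)
```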

## Training Procedure

The model was trained with a hybrid reranker distillation objective:

total_loss =
  0.1 * teacher_distillation_loss
  + 3.0 * gold_pairwise_loss
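The weighting above can be illustrated for a single query in plain Python. Only the weights 0.1 and 3.0 come from this card. The form of each term is not documented, so mean squared error for the distillation term and a logistic pairwise loss for the gold term are assumptions chosen for the sketch, and `hybrid_loss` is a hypothetical name. A training loop would compute the same quantities on tensors.

```python
import math


def hybrid_loss(student, teacher, gold_index, w_distill=0.1, w_gold=3.0):
    """Combine a distillation term and a gold pairwise term for one query.

    student, teacher: score lists for the same candidates (at least two).
    gold_index: position of the gold chunk in those lists.
    """
    # Assumed distillation term: mean squared error against the teacher scores.
    distill = sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)
    # Assumed pairwise term: logistic loss pushing the gold score above each negative.
    gold = student[gold_index]
    negatives = [s for i, s in enumerate(student) if i != gold_index]
    pairwise = sum(math.log1p(math.exp(n - gold)) for n in negatives) / len(negatives)
    return w_distill * distill + w_gold * pairwise


good = hybrid_loss([10.0, -10.0], [10.0, -10.0], gold_index=0)  # gold ranked first
bad = hybrid_loss([10.0, -10.0], [10.0, -10.0], gold_index=1)   # gold ranked last
print(good < bad)
```

Under these assumptions, the large weight on the gold term means that a misranked gold chunk dominates the loss even when the student matches the teacher exactly.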

- Teacher model: BAAI/bge-reranker-v2-m3
- Student/base model: cross-encoder/ms-marco-MiniLM-L12-v2

Important hyperparameters:

- hybrid_candidate_top_k: 10
- learning_rate: 1e-5
- num_epochs: 5
- max_length: 512
- batch_size: 16
- selection_metric: hit@1

## Evaluation

Evaluation was performed on a strict top-10 candidate split without injected gold chunks or injected hard negatives. This better reflects real retrieval behavior, where the reranker cannot recover chunks that the first-stage retriever did not retrieve.
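For reference, the per-query quantities behind Hit@k, MRR@k, and NDCG@k can be computed as in the sketch below. It assumes binary relevance (a chunk is either gold or not) and is not the evaluation script used for this card. `rank_metrics` is a hypothetical helper, and the reported figures would be averages of these values over all test queries.

```python
import math


def rank_metrics(ranked_ids, gold_ids, k=10):
    """Hit@k, reciprocal rank@k and NDCG@k for one query with binary relevance."""
    top = ranked_ids[:k]
    gains = [1.0 if doc_id in gold_ids else 0.0 for doc_id in top]
    hit = 1.0 if any(gains) else 0.0
    # Reciprocal rank of the first gold chunk, or 0 if none appears in the top k.
    rr = next((1.0 / (i + 1) for i, g in enumerate(gains) if g), 0.0)
    dcg = sum(g / math.log2(i + 2) for i, g in enumerate(gains))
    # Ideal DCG places every gold chunk at the top of the list.
    ideal = sum(1.0 / math.log2(i + 2) for i in range(min(len(gold_ids), k)))
    ndcg = dcg / ideal if ideal > 0 else 0.0
    return {"hit": hit, "rr": rr, "ndcg": ndcg}


m = rank_metrics(["c7", "c2", "c5"], {"c2"}, k=10)
print(m)
```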

### 5k Strict Test Results

┌───────────────────┬────────┬────────┬────────┬────────┬────────┬─────────┐
│ System            │  Hit@1 │  Hit@3 │  Hit@5 │ Hit@10 │ MRR@10 │ NDCG@10 │
├───────────────────┼────────┼────────┼────────┼────────┼────────┼─────────┤
│ No reranker       │ 0.7120 │ 0.8760 │ 0.9060 │ 0.9480 │ 0.8009 │  0.8290 │
│ BGE v2-m3 teacher │ 0.8380 │ 0.9220 │ 0.9400 │ 0.9480 │ 0.8827 │  0.8913 │
│ This model        │ 0.8260 │ 0.9180 │ 0.9360 │ 0.9480 │ 0.8744 │  0.8838 │
└───────────────────┴────────┴────────┴────────┴────────┴────────┴─────────┘

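On this generated 5k strict test split, the model comes close to the BGE v2-m3 teacher: Hit@1 is 0.8260 versus 0.8380, and MRR@10 is 0.8744 versus 0.8827. Hit@10 is 0.9480 for all three systems because reranking only reorders the same top-10 candidates.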

### Latency

Measured on Kaggle GPU over top-10 reranking:

┌───────────────────┬───────────┬───────────┬───────────┬───────────┐
│ Model             │ Avg/query │ P50/query │ P95/query │ Pairs/sec │
├───────────────────┼───────────┼───────────┼───────────┼───────────┤
│ BGE v2-m3 teacher │   0.1920s │   0.2013s │   0.2153s │     52.10 │
│ This model        │   0.0998s │   0.1014s │   0.1063s │    100.24 │
└───────────────────┴───────────┴───────────┴───────────┴───────────┘

Approximate speedup: 1.92x faster than the BGE v2-m3 teacher (0.0998 s versus 0.1920 s average per query).
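The speedup figure follows from the average per-query latencies in the table. The short check below reproduces the arithmetic, assuming 10 pairs per query as stated for the top-10 setting.

```python
teacher_avg_s = 0.1920   # BGE v2-m3 teacher, average seconds per query (from the table)
student_avg_s = 0.0998   # this model, average seconds per query (from the table)
pairs_per_query = 10     # top-10 reranking

speedup = teacher_avg_s / student_avg_s
student_pairs_per_sec = pairs_per_query / student_avg_s

print(f"{speedup:.2f}x, {student_pairs_per_sec:.1f} pairs/sec")
# prints "1.92x, 100.2 pairs/sec"
```

The derived throughput is slightly below the 100.24 pairs/sec reported in the table, which is consistent with the latencies being rounded to four decimals.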

## Limitations

- Evaluation relies mainly on the 5k generated questions; additional testing on real or manually written queries is recommended.
- The model may incorrectly demote relevant chunks when the query is ambiguous or when multiple similar regulatory articles exist.
- It should be used with source display and human verification for compliance-sensitive workflows.
- It is optimized for BCT regulatory retrieval, not general legal or financial retrieval.

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "slim0001/bct-l12-reranker"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # inference mode: disables dropout

query = "Quelles sont les obligations de communication à la BCT ?"
passages = [
    "Article ... texte réglementaire ...",
    "Autre passage ...",
]

# Tokenize each (query, passage) pair; inputs longer than 512 tokens are truncated.
inputs = tokenizer(
    [query] * len(passages),
    passages,
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt",
)

with torch.no_grad():
    logits = model(**inputs).logits
    # A single-logit head gives one relevance score per pair; otherwise use the last class logit.
    scores = logits.squeeze(-1) if logits.shape[-1] == 1 else logits[:, -1]

# Sort by score only, highest first.
ranked = sorted(zip(scores.tolist(), passages), key=lambda pair: pair[0], reverse=True)
print(ranked)
```

## Recommended Production Use

Use as a reranker inside a RAG system:

- first-stage retrieval: BGE-M3 dense + BM25 + RRF
- rerank_top_k: 10
- final_top_k: 5-8
- answer generation: separate LLM
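The RRF step in the first stage can be sketched as standard reciprocal rank fusion, where each chunk scores the sum of `1 / (k + rank)` over the rankings it appears in. The constant `k = 60` is a commonly used default and an assumption here, since this card does not state the value used. `rrf_fuse` is a hypothetical helper.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=10):
    """Fuse several ranked id lists with reciprocal rank fusion."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)  # higher ranks contribute more
    fused = sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
    return fused[:top_n]


dense = ["c3", "c1", "c2"]   # placeholder dense-retriever ranking
bm25 = ["c1", "c4", "c3"]    # placeholder BM25 ranking
fused = rrf_fuse([dense, bm25])
print(fused)
```

The fused top-10 list is what the reranker would then score, per the `rerank_top_k` setting above.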

## Contact

Model owner: slim0001
