explainable-book-reranker-ko
v2.0.0
A Korean book-recommendation reranker that returns both a relevance score and the evidence sentences behind it — not just a number. It is a LoRA adaptation of BAAI/bge-reranker-v2-m3 trained with the select-then-predict architecture, distilled from an LLM teacher.
What it does
Given a query and a candidate book (with its synopsis/review sentences), the model runs two stages in order:
- Generator (LoRA on the encoder + a selection head) picks the 1–3 sentences that justify the match — the explanation.
- Predictor (LoRA + classification head) scores the book using only those selected sentences.
Because the score is computed only from the selected evidence, the highlighted sentences are a faithful reason for the ranking — the model cannot secretly rank on something it didn't show.
This is not a drop-in CrossEncoder: it has two adapters and a custom inference path (see Usage).
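The control flow described above can be illustrated with a minimal sketch. It is not the repository's implementation: `select_then_predict`, `overlap`, `toy_predict`, the 0.5 threshold, and the word-overlap scoring are all hypothetical stand-ins for the two LoRA-adapted stages. The only point it makes is that the predictor receives the selected sentences and nothing else.

```python
from typing import Callable, List, Sequence, Tuple


def select_then_predict(
    query: str,
    sentences: Sequence[str],
    select_score: Callable[[str, str], float],
    predict_score: Callable[[str, Sequence[str]], float],
    max_evidence: int = 3,
    threshold: float = 0.5,
) -> Tuple[float, List[str]]:
    """Pick 1..max_evidence sentences, then score the book from those alone."""
    if not sentences:
        raise ValueError("need at least one candidate sentence")
    scored = [(select_score(query, s), i) for i, s in enumerate(sentences)]
    # Highest selection score first; ties broken by original position.
    ranked = sorted(scored, key=lambda t: (-t[0], t[1]))
    keep = [i for sc, i in ranked[:max_evidence] if sc >= threshold]
    if not keep:
        keep = [ranked[0][1]]  # assumption: always keep at least one sentence
    keep.sort()  # restore reading order
    evidence = [sentences[i] for i in keep]
    # The predictor never sees the unselected sentences.
    return predict_score(query, evidence), evidence


def overlap(query: str, text: str) -> float:
    """Toy scorer: fraction of query words that appear in the text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / max(len(q), 1)


def toy_predict(query: str, evidence: Sequence[str]) -> float:
    """Toy predictor: best overlap among the selected sentences."""
    return max(overlap(query, s) for s in evidence)


book = [
    "A slow burn mystery set in Busan.",
    "Hardcover, 320 pages.",
    "The mystery unfolds slowly.",
]
score, evidence = select_then_predict("slow burn mystery", book, overlap, toy_predict)
print(score, evidence)
```

Because `predict_score` is called with `evidence` only, any sentence that influences the score is by construction one of the sentences shown to the user.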
Results
Evaluated against held-out teacher labels, so the numbers measure how well the distillation generalizes to unseen queries: the gold standard is the teacher's judgment, not human relevance.
| split | NDCG@5 | NDCG@10 | MRR | rationale F1 | rationale IoU |
|---|---|---|---|---|---|
| test (unbiased) | 0.581 | 0.606 | 0.880 | 0.434 | 0.317 |
| valid (selected epoch 4) | 0.597 | 0.618 | 0.928 | 0.432 | 0.312 |
Compared with v1.0.0 (989 labeled queries): test NDCG@10 rose 0.577 → 0.606 (+5%), rationale F1 0.197 → 0.434 (2.2×), and rationale IoU 0.142 → 0.317 (2.2×).
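For readers who want to reproduce this kind of ranking metric on their own data, here is a minimal sketch of NDCG@k and reciprocal rank over graded labels. The card does not say which gain function or relevance cutoff the evaluation uses, so the exponential gain `2**grade - 1` and the `min_grade=2` threshold below are assumptions; the repository's evaluation code may differ.

```python
import math
from typing import Sequence


def dcg_at_k(grades: Sequence[float], k: int) -> float:
    """Discounted cumulative gain with exponential gain (an assumption)."""
    return sum((2 ** g - 1) / math.log2(rank + 2) for rank, g in enumerate(grades[:k]))


def ndcg_at_k(grades_in_model_order: Sequence[float], k: int) -> float:
    """NDCG@k, where grades are listed in the order the model ranked the books."""
    ideal = dcg_at_k(sorted(grades_in_model_order, reverse=True), k)
    return dcg_at_k(grades_in_model_order, k) / ideal if ideal > 0 else 0.0


def reciprocal_rank(grades_in_model_order: Sequence[float], min_grade: float = 2) -> float:
    """1 / rank of the first book graded at least `min_grade` (cutoff is an assumption)."""
    for rank, g in enumerate(grades_in_model_order, start=1):
        if g >= min_grade:
            return 1.0 / rank
    return 0.0


# Teacher grades (0-3) of one query's candidates, in model-ranked order.
ranked_grades = [3, 0, 2, 1, 0]
print(ndcg_at_k(ranked_grades, 5), reciprocal_rank(ranked_grades))
```

MRR is then the mean of `reciprocal_rank` over all evaluated queries, and the reported NDCG values would be the corresponding per-query means.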
Two levers drove the gain: (1) roughly doubling the teacher data (989 → 1,907 labeled queries) lifted ranking, and the data-quality curve had not yet flattened; (2) capping each training query's pool to its top-24 candidates plus hard negatives roughly doubled rationale faithfulness at equal ranking quality, because it sharpens the generator's evidence selection.
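The candidate cap can be sketched as a small pool-filtering step. This is an illustration under assumptions, not the training code: `cap_training_pool` is a hypothetical helper, and it assumes hard negatives are appended on top of the 24 highest-scored candidates (the card does not say whether they count toward the cap).

```python
from typing import Dict, List, Set


def cap_training_pool(
    teacher_scores: Dict[str, float],
    hard_negatives: Set[str],
    cap: int = 24,
) -> List[str]:
    """Keep the `cap` highest-scored candidates, then add missing hard negatives."""
    # Sort by descending teacher score; candidate id breaks ties deterministically.
    by_score = sorted(teacher_scores, key=lambda c: (-teacher_scores[c], c))
    kept = by_score[:cap]
    kept_set = set(kept)
    # Hard negatives are in-pool, so ignore ids the teacher never scored.
    extra = sorted(h for h in hard_negatives if h in teacher_scores and h not in kept_set)
    return kept + extra


# A 50-candidate pool with distinct toy scores and two hard negatives.
scores = {f"b{i}": float(50 - i) for i in range(50)}
pool = cap_training_pool(scores, hard_negatives={"b3", "b40"})
print(len(pool), pool[-1])
```

Validation and test queries would skip this step and keep the full pool.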
Model selection: trained 5 epochs with per-epoch checkpoints; epoch 4 maximized validation NDCG@5. The final epoch reliably collapses (degenerate ranking overfit), so best-epoch selection is required — never use a fixed last epoch.
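Best-epoch selection amounts to an argmax over per-epoch validation scores. The sketch below uses a hypothetical helper, `select_best_epoch`, and placeholder numbers that are not the model's measured per-epoch values; they only mimic the described pattern of a late peak followed by a collapse.

```python
from typing import Dict


def select_best_epoch(valid_ndcg5_by_epoch: Dict[int, float]) -> int:
    """Return the epoch with the highest validation NDCG@5 (earliest epoch on ties)."""
    if not valid_ndcg5_by_epoch:
        raise ValueError("no checkpoints were evaluated")
    return min(valid_ndcg5_by_epoch, key=lambda e: (-valid_ndcg5_by_epoch[e], e))


# Placeholder values for illustration only, not measured results.
placeholder_curve = {1: 0.40, 2: 0.50, 3: 0.55, 4: 0.60, 5: 0.30}
best = select_best_epoch(placeholder_curve)
print(f"use the checkpoint from epoch {best}")
```

Selecting on the validation split and reporting on the test split keeps the test numbers unbiased by the selection.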
Files
- generator_adapter/: LoRA_G, evidence-sentence selection (PEFT)
- generator_head.pt: selection head (fp32 linear)
- predictor_adapter/: LoRA_P, relevance scoring (PEFT)
- lora_target_modules.yaml: LoRA config required by the loader
Usage
The architecture lives in the source repo. Install it, download this model, and load with the select-then-predict loader:
```bash
git clone https://github.com/reranker-master/explainable-reranker
cd explainable-reranker && pip install -e '.[gpu]'
huggingface-cli download zettascope/explainable-book-reranker-ko --local-dir ./ckpt
```
```python
from explainable_reranker.models.select_predict.neural_model import load_neural_model

# select_fp32=True makes the rationale fully deterministic (no bf16/padding wobble) at
# ~50% extra latency; default False keeps ranking identical and is faster.
model = load_neural_model("./ckpt", "./ckpt/lora_target_modules.yaml", device="cuda")
# model.rerank_batch(batch) -> ranked books with score + selected evidence sentences
```
Or serve the /rerank HTTP endpoint:
```bash
PYTHONPATH=src python3 scripts/serve_rerank.py \
  --checkpoint ./ckpt --lora-config ./ckpt/lora_target_modules.yaml
# add --select-fp32 for deterministic rationale
```
Latency (GB10, bf16, 50-candidate pool): ~1.8 s/query, scaling ~linearly at ~38 ms/candidate. Candidates are scored independently, so batching them is safe (ranking unchanged).
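The linear scaling makes it easy to budget pool size against a latency target. The sketch below is back-of-the-envelope arithmetic only: `estimated_latency_s` and `chunked` are hypothetical helpers, and the estimate simply multiplies the approximate per-candidate figure quoted above, which gives about 1.9 s for 50 candidates, in line with the measured ~1.8 s given that both numbers are rounded.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

MS_PER_CANDIDATE = 38.0  # approximate per-candidate cost quoted above (GB10, bf16)


def estimated_latency_s(num_candidates: int, ms_per_candidate: float = MS_PER_CANDIDATE) -> float:
    """Rough per-query latency, assuming purely linear scaling in pool size."""
    return num_candidates * ms_per_candidate / 1000.0


def chunked(candidates: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Split a candidate pool into batches; order within the pool is preserved."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(candidates), batch_size):
        yield list(candidates[start:start + batch_size])


print(f"{estimated_latency_s(50):.2f} s for a 50-candidate pool")
print([len(b) for b in chunked(list(range(50)), 16)])
```

Since each candidate is scored independently, the batch size chosen here affects throughput and memory but not the final ranking.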
Training
- Base: BAAI/bge-reranker-v2-m3 (XLM-RoBERTa-large cross-encoder), LoRA r=16.
- Teacher labels: 1,907 Korean book-recommendation queries, each with ~50 candidates graded 0–3, top-10 grounded rationale sentences, and in-pool hard negatives. Labeled by an LLM teacher via grounded 2-pass prompting.
- Recipe: 5 epochs, lr 1e-4, train-candidate cap 24 (top-by-score + hard negatives; valid/test use the full pool), best-epoch selection on validation NDCG@5.
- Losses: listwise KD (ranking) + binary selection (which sentences are evidence) + hard-negative anchor + sparsity/continuity. Generator and Predictor trained jointly with a teacher→generator selection-packing schedule.
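As a rough illustration of the listwise KD term only, one common formulation is a KL divergence between the softmax of the teacher's grades and the softmax of the student's scores over a query's candidate list. Whether the repository uses exactly this form, a temperature, or a different target distribution is not stated in this card, so `listwise_kd_loss` below is an assumed, pure-Python sketch rather than the training loss.

```python
import math
from typing import List, Sequence


def softmax(xs: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over one candidate list."""
    m = max(xs)
    exps = [math.exp((x - m) / temperature) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def listwise_kd_loss(
    student_scores: Sequence[float],
    teacher_grades: Sequence[float],
    temperature: float = 1.0,
) -> float:
    """KL(teacher || student) over the candidates of a single query."""
    if len(student_scores) != len(teacher_grades):
        raise ValueError("one score and one grade per candidate are required")
    p = softmax(teacher_grades, temperature)  # teacher target distribution
    q = softmax(student_scores)               # student distribution
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))


teacher = [3, 2, 0, 0]  # teacher grades on the 0-3 scale
print(listwise_kd_loss([3.0, 2.0, 0.0, 0.0], teacher))  # student agrees with teacher
print(listwise_kd_loss([0.0, 0.0, 2.0, 3.0], teacher))  # student ranks in reverse
```

The loss is zero when the two distributions coincide and grows as the student's ordering departs from the teacher's; the selection, hard-negative, and sparsity/continuity terms would be added on top of it.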
Limitations
- Quality is capped by the teacher (LLM-distilled); metrics measure agreement with the teacher on unseen queries, not absolute human relevance.
- Trained on 1,907 queries — the data curve was still rising, so more/better-teacher data should raise quality further.
- rationale IoU (0.32) means the selected sentences are reasonable evidence but do not exactly match the teacher's picks.
- Korean book-recommendation domain; not validated elsewhere.
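To make the rationale-overlap numbers concrete, here is a minimal sketch of set-level F1 and IoU between the sentences the model selected and the sentences the teacher marked. It assumes the metrics are computed over sentence indices per candidate and then averaged, which this card does not spell out; `rationale_overlap` is a hypothetical helper.

```python
from typing import Iterable, Tuple


def rationale_overlap(predicted: Iterable[int], teacher: Iterable[int]) -> Tuple[float, float]:
    """Return (F1, IoU) between two sets of selected sentence indices."""
    pred, gold = set(predicted), set(teacher)
    inter = len(pred & gold)
    union = len(pred | gold)
    if union == 0:
        return 1.0, 1.0  # assumption: two empty selections count as perfect agreement
    f1 = 2 * inter / (len(pred) + len(gold))
    iou = inter / union
    return f1, iou


# Model picked sentences 0 and 2; the teacher marked sentences 2 and 5.
f1, iou = rationale_overlap([0, 2], [2, 5])
print(f1, iou)
```

In this example one of two selected sentences matches, giving F1 = 0.5 and IoU = 1/3, which is roughly the level of agreement the reported averages describe.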
Versions
- v2.0.0: 1,907 LLM-distilled teacher labels (2×), train-candidate cap 24, epoch-4 (valid-NDCG@5-selected). Test NDCG@10 = 0.606, rationale F1 = 0.434. Adds a select_fp32 option for deterministic rationale.
- v1.0.0: first release. bge-reranker-v2-m3 + LoRA, 989 LLM-distilled teacher labels, epoch-3 (valid-NDCG@5-selected). Test NDCG@5 = 0.550, NDCG@10 = 0.577.
License
MIT, inheriting BAAI/bge-reranker-v2-m3 (MIT). Verify the base license before redistribution.