BGE-M3 Sentence Scorer

This repository contains a fine-tuned BGE-M3 encoder and a linear regression head that together score sentence salience.

Files

  • encoder/: fine-tuned BGE-M3 encoder saved with AutoModel.save_pretrained.
  • regression_head.pt: linear regression head state dict.
  • bge_m3_sentence_scorer_config.json: scorer settings used during training.
  • tokenizer files: tokenizer needed to encode input text.
  • sentence_scorer_inference.py: minimal PyTorch inference helper.

Usage

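from sentence_scorer_inference import load_sentence_scorer

# Load the scorer, tokenizer, and target device from the Hub repository.
model, tokenizer, device = load_sentence_scorer("OneFly7/bge_m3_pointwise_bs16_lr1e5")
# Each input is one string; this example pairs an article sentence with a summary candidate.
texts = ["[S001] Example article sentence. Summary candidate: example summary."]
# max_length=4096 matches the value listed under the scorer settings.
scores = model.predict(texts, tokenizer, device=device, max_length=4096)
print(scores)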
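
Once scores are available, a common next step is to rank the inputs by salience. The snippet below is a minimal sketch of that step, not part of this repository: `rank_by_score` is a hypothetical helper, the texts and scores are made up, and it assumes that `model.predict` returns one float-convertible value per input text and that a higher score means a more salient sentence.

```python
def rank_by_score(texts, scores, top_k=None):
    """Pair each text with its score and sort from highest to lowest."""
    # float() also accepts 0-d tensors and NumPy scalars (assumed output types).
    pairs = [(text, float(score)) for text, score in zip(texts, scores)]
    pairs.sort(key=lambda pair: pair[1], reverse=True)
    return pairs if top_k is None else pairs[:top_k]


# Made-up inputs and scores, for illustration only.
example_texts = [
    "[S001] First sentence.",
    "[S002] Second sentence.",
    "[S003] Third sentence.",
]
example_scores = [0.12, 0.87, 0.45]

for text, score in rank_by_score(example_texts, example_scores, top_k=2):
    print(f"{score:.2f}  {text}")
```

If the scorer was trained so that lower values mean higher salience, drop `reverse=True`.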

Training source directory:

/lustre/fswork/projects/rech/ges/uuy33zj/multi_lingual_self_distill/multi_lingual_self_distll/models/sentence_scorer_runs/bge_m3_pointwise_jz_sxxx_bs16_lr1e5

Scorer settings:

pooling=cls
dropout=0.1
max_length=4096
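
To make these settings concrete, here is a rough sketch of how CLS pooling, dropout, and a linear head can combine into one score per input. It illustrates the general technique and is not the code in sentence_scorer_inference.py: the class name `ClsRegressionHead` is hypothetical, a random tensor stands in for the encoder output, the hidden size of 1024 is an assumption about the encoder, and the parameter names will not necessarily match the keys in regression_head.pt.

```python
import torch
from torch import nn


class ClsRegressionHead(nn.Module):
    """Hypothetical head: CLS pooling -> dropout -> linear layer -> one score."""

    def __init__(self, hidden_size: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.linear = nn.Linear(hidden_size, 1)

    def forward(self, last_hidden_state: torch.Tensor) -> torch.Tensor:
        # pooling=cls: keep only the first token's vector for each sequence.
        cls_vector = last_hidden_state[:, 0]
        # Project to a single value and drop the trailing dimension.
        return self.linear(self.dropout(cls_vector)).squeeze(-1)


# Random tensor standing in for encoder output: (batch, tokens, hidden).
# hidden=1024 is assumed here; read the real value from the encoder config.
fake_hidden = torch.randn(2, 8, 1024)

head = ClsRegressionHead(hidden_size=1024, dropout=0.1).eval()  # eval() disables dropout
with torch.no_grad():
    scores = head(fake_hidden)

print(scores.shape)  # one score per input sequence
```

In real use, `last_hidden_state` would come from the model stored in encoder/, and dropout would be active only during training.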

Base model:

BAAI/bge-m3