SentenceTransformer based on BAAI/bge-base-en-v1.5

This is a sentence-transformers model finetuned from BAAI/bge-base-en-v1.5. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-base-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True, 'architecture': 'BertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
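The three modules turn raw text into a unit-length sentence vector: the Transformer produces per-token embeddings, the Pooling module keeps only the [CLS] token (pooling_mode_cls_token: True), and Normalize L2-normalizes the result. A minimal sketch of the pooling and normalization steps, using a toy token-embedding matrix in place of real BERT output:

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Mimic the Pooling (CLS) and Normalize modules for one sentence.

    token_embeddings: list of per-token vectors; index 0 is the [CLS] token.
    Returns the L2-normalized [CLS] vector (the sentence embedding).
    """
    cls = token_embeddings[0]                  # pooling_mode_cls_token=True
    norm = math.sqrt(sum(x * x for x in cls))  # Normalize(): divide by L2 norm
    return [x / norm for x in cls]

# Toy example: 3 tokens, 4-dimensional embeddings (the real model uses 768 dims)
tokens = [[3.0, 4.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
embedding = cls_pool_and_normalize(tokens)
print(embedding)  # [0.6, 0.8, 0.0, 0.0] -- unit length
```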

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("andmmenendez/bge-base-warhammer")
# Run inference
sentences = [
    'Represent this sentence for searching relevant passages: When a unit is set up, what conditions must be met regarding coherency?',
    'Any time a unit is **set up** or **ends a move**, it must be in a single group. A unit is considered to be in a coherent group if each model in that unit is within **coherency range**, measured horizontally, of **at least 1 other model** in that unit (ignore differences in height between the two models).  \n\nFor the majority of units, **coherency range is 1/2"**, though some units (particularly those with large models with overhanging parts) have a longer coherency range noted on their warscroll for ease of play. While there are **7 or more** models in a unit, that unit is considered to be in a coherent group if each model in that unit is within coherency range of **at least 2 other models** in that unit.  \n\nIf it is not possible for a unit to end a move in a single coherent group, that move cannot be made.  \n\n\n*   After finishing a move, a unit must be in a single group.\n*   Coherency range is 1/2" horizontally.\n*   Each model must be within coherency range of a different model from the same unit.\n*   While a unit has 7+ models, each model must be in coherency with 2 other models in the unit.',
    'Armies are made up of one or more **regiments**, each of which is led by a Hero. You must have **at least 1** regiment in your army, and you can include a **maximum of 5** regiments. To add a regiment, pick 1 HERO from your faction, then pick up to **3** non‑HERO units to accompany them.\n\nEach HERO’s battle profile lists which units can be added to their regiment, and each non-HERO unit’s battle profile lists any relevant keywords it has. The battle profiles of some HEROES may say that they can be added to the regiment of another HERO in place of a non-HERO unit in that regiment.\n\nQ: If a HERO is able to join another HERO’s regiment (e.g. The Shadow Queen joining Morathi-Khaine or an Assassin joining a Dreadlord on Black Dragon), do they take the place of a non-HERO unit in that regiment?\nA: Yes.\n\nQ: Can I add units from other factions to my HERO’s regiments?\nA: No. The only way to add units from other factions to your army is by taking an eligible Regiment of Renown.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6561, 0.0181],
#         [0.6561, 1.0000, 0.1562],
#         [0.0181, 0.1562, 1.0000]])
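Because the final Normalize module makes every embedding unit length, the cosine similarity computed by model.similarity() reduces to a plain dot product. A sketch of that computation in pure Python, using hypothetical unit-length vectors in place of real model.encode() output:

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def similarity_matrix(embeddings):
    """Cosine similarity for unit-length vectors is just the dot product."""
    return [[dot(u, v) for v in embeddings] for u in embeddings]

# Hypothetical unit-length embeddings standing in for model.encode() output
embs = [
    [1.0, 0.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.0, 0.0, 1.0],
]
sims = similarity_matrix(embs)
print(sims[0])  # [1.0, 0.8, 0.0]: each row holds one query's scores
```

As in the tensor printed above, the matrix is symmetric and its diagonal is 1, since each embedding has perfect similarity with itself.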

Evaluation

Metrics

Information Retrieval

Metric               Value
cosine_accuracy@1    0.689
cosine_accuracy@3    0.8819
cosine_accuracy@5    0.9331
cosine_accuracy@10   0.9646
cosine_precision@1   0.689
cosine_precision@3   0.294
cosine_precision@5   0.1866
cosine_precision@10  0.0965
cosine_recall@1      0.689
cosine_recall@3      0.8819
cosine_recall@5      0.9331
cosine_recall@10     0.9646
cosine_ndcg@10       0.8356
cosine_mrr@10        0.7932
cosine_map@100       0.7954
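Note that accuracy@k and recall@k coincide in this table because each query has exactly one relevant passage. A sketch of how such rank-based metrics are computed from retrieval results (the ranks below are synthetic, not from this evaluation):

```python
def accuracy_at_k(ranks, k):
    """Fraction of queries whose (single) relevant passage is in the top k.

    ranks: 1-based rank of the relevant passage for each query.
    With one relevant passage per query, this equals recall@k.
    """
    return sum(1 for r in ranks if r <= k) / len(ranks)

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting only hits within the top k."""
    return sum(1.0 / r for r in ranks if r <= k) / len(ranks)

# Synthetic example: rank of the correct passage for 4 queries
ranks = [1, 2, 1, 5]
print(accuracy_at_k(ranks, 1))  # 0.5
print(accuracy_at_k(ranks, 3))  # 0.75
print(mrr_at_k(ranks, 10))      # averages 1/r: (1 + 1/2 + 1 + 1/5) / 4
```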

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,014 training samples
  • Columns: sentence_0 and sentence_1
  • Approximate statistics based on the first 1000 samples:
             sentence_0            sentence_1
    type     string                string
    details  min: 15 tokens        min: 40 tokens
             mean: 25.73 tokens    mean: 191.46 tokens
             max: 50 tokens        max: 408 tokens
  • Samples:
    1. sentence_0: Represent this sentence for searching relevant passages: If a friendly unit has Strike-First, and a friendly unit without Strike-First can fight after, who fights first?
       sentence_1: Q: Can I use an ability that allows a friendly unit that does not have STRIKE-FIRST to fight immediately after a friendly unit that has STRIKE-FIRST if there are one or more enemy units with STRIKE-FIRST that have not yet been picked to fight?
       A: No. As mentioned in the sidebar next to 19.0, abilities that allow a unit to use a FIGHT ability immediately after another unit do not override the STRIKE-FIRST constraints, so you cannot pick a unit that does not have STRIKE-FIRST to fight until all other units that have STRIKE-FIRST have fought.
    2. sentence_0: Represent this sentence for searching relevant passages: When do I check if I gain control of an objective?
       sentence_1: * An objective marker is a 40mm round marker.
       * A model contests an objective if the objective marker is within its combat range.
       * A player gains control of an objective if the sum of the Control characteristics of friendly models contesting that objective is higher than that of enemy models.
       * Check if you gain control of objectives at the start of the first battle round and at the end of each turn.
       * An objective remains in your control until your opponent gains control of it.
       * Terrain features are controlled in the same way as objective markers but do not remain in your control if no friendly models are contesting them.
    3. sentence_0: Represent this sentence for searching relevant passages: Is setting up faction terrain in the deployment phase optional?
       sentence_1: Q: Is it mandatory for players to set up a faction terrain feature (if one is included on their roster) during the deployment phase?
       A: No. A player can choose not to use the ‘Deploy Faction Terrain’ ability. However, if both players choose to set up a faction terrain feature, the player who begins deployment must set up their faction terrain features first (as specified in Step 1 of 10.0).
  • Loss: main.TrackingLoss

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • num_train_epochs: 8
  • multi_dataset_batch_sampler: round_robin
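These non-default values map directly onto the trainer's configuration object. A minimal sketch using SentenceTransformerTrainingArguments (the output path is illustrative, not from the original run):

```python
from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import MultiDatasetBatchSamplers

# Reproduce the non-default hyperparameters listed above;
# all other arguments keep their library defaults.
args = SentenceTransformerTrainingArguments(
    output_dir="output",  # illustrative path, not from the original run
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    num_train_epochs=8,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)
```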

All Hyperparameters

  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 8
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: False
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step warhammer-eval_cosine_ndcg@10
-1 -1 0.8356

Framework Versions

  • Python: 3.12.12
  • Sentence Transformers: 5.2.2
  • Transformers: 5.0.0
  • PyTorch: 2.9.0+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}