---
license: apache-2.0
pipeline_tag: sentence-similarity
language:
  - fr
tags:
  - embeddings
  - french
  - feature-extraction
  - bfloat16
  - sentence-similarity
  - text-embeddings
---

| Evaluation task | Embeddings-Francais-BF16-BASE-50M | Test-Train-Avant-Main-Train |
|---|---|---|
| SICKFr | 0.519713 | 0.699325 |
| SyntecReranking | 0.313680 | 0.328360 |
| SummEvalFr | 0.306903 | 0.305028 |
| AlloProfClusteringS2S | 0.213383 | 0.209503 |
| SyntecRetrieval | 0.051370 | 0.123900 |
| HALClusteringS2S | Failed | 0.042094 |

| Hyperparameter | Embeddings-Francais-BF16-BASE-50M | Test-Train-Avant-Main-Train |
|---|---|---|
| Training tokens seen | 2.46B | 61.44M + SFT |
| Parameters | 169,896,960 | 21,240,576 |
| Context length | 4096 | 4096 |
| Embedding dimension | 1536 | 384 |
| Vocabulary size | 32768 | 32768 |
| Layers | 4 | 4 |
| Heads | 12 | 4 |
| Head dimension | 128 | 96 |
| Precision | bfloat16 | bfloat16 |
| Attention backend | SageAttention | SageAttention |
| Pooling | Mean pooling | Mean pooling |
| Normalization | L2 normalize | L2 normalize |
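
The pooling and normalization rows describe how per-token outputs become a single sentence vector. The following is a minimal, dependency-free sketch of that idea, not the model's actual implementation: the function names `mean_pool`, `l2_normalize`, and `similarity`, the use of an attention mask to skip padding, and the toy 2-dimensional vectors are all assumptions made for illustration.

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length (eps guards against division by zero)."""
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / max(norm, eps) for v in vector]


def similarity(a, b):
    """Dot product of two vectors; equals cosine similarity when both are unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy input (hypothetical): three token vectors, the last one is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]

sentence_vector = l2_normalize(mean_pool(tokens, mask))
print(sentence_vector)
print(similarity(sentence_vector, sentence_vector))
```

Because the vectors are L2-normalized, a plain dot product is enough to compare two sentences, which is why the two steps are usually applied together for sentence-similarity tasks.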