ferrotorch/all-MiniLM-L6-v2

all-MiniLM-L6-v2 (upstream: sentence-transformers/all-MiniLM-L6-v2) is a BERT-family encoder-only sentence-embedding model: 22M parameters, 6 layers, hidden=384, intermediate=1536, num_attention_heads=12, vocab=30522, type_vocab_size=2, max_position_embeddings=512, post-norm residuals, GELU FFN. The sentence pipeline mean-pools token states over the attention mask, then L2-normalizes. Apache-2.0 licensed. Pinned as the real-artifact baseline for sentence-embedding parity against sentence_transformers==5.4.1 (issue #1148).
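The mean-pool + L2-normalize step can be sketched in plain Rust. This is an illustrative standalone function, not the ferrotorch API; the token states and attention mask are assumed inputs:

```rust
/// Mean-pool token embeddings over the attention mask, then L2-normalize.
/// `hidden`: one Vec<f32> of length `dim` per token; `mask`: 1 = real token, 0 = padding.
fn mean_pool_normalize(hidden: &[Vec<f32>], mask: &[u32]) -> Vec<f32> {
    let dim = hidden[0].len();
    let mut sum = vec![0.0f32; dim];
    let mut count = 0.0f32;
    for (tok, &m) in hidden.iter().zip(mask) {
        if m != 0 {
            for (s, &v) in sum.iter_mut().zip(tok) {
                *s += v;
            }
            count += 1.0;
        }
    }
    // Mean over non-padding tokens (guard against an all-zero mask).
    for s in sum.iter_mut() {
        *s /= count.max(1e-9);
    }
    // L2 normalize so cosine similarity reduces to a dot product.
    let norm = sum.iter().map(|v| v * v).sum::<f32>().sqrt().max(1e-12);
    sum.iter().map(|v| v / norm).collect()
}

fn main() {
    let hidden = vec![vec![1.0, 0.0], vec![3.0, 4.0], vec![9.0, 9.0]];
    let mask = vec![1, 1, 0]; // third token is padding and must be ignored
    let emb = mean_pool_normalize(&hidden, &mask);
    // mean of first two tokens = [2.0, 2.0]; normalized -> [0.7071, 0.7071]
    assert!((emb[0] - 0.70710677).abs() < 1e-5);
    assert!((emb[1] - 0.70710677).abs() < 1e-5);
}
```

Note that padding tokens are excluded from both the sum and the divisor; averaging over all positions would silently change the embedding for padded batches.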

Provenance

  • Upstream: sentence-transformers/all-MiniLM-L6-v2 (apache-2.0).
  • Conversion script: ferrotorch/scripts/pin_pretrained_text_weights.py.
  • Ferrotorch issue: https://github.com/dollspace/ferrotorch/issues/1148.
  • SHA-256 of model.safetensors (pinned in ferrotorch-hub/src/registry.rs): 53aa51172d142c89d9012cce15ae4d6cc0ca6895895114379cacb4fab128d9db.
  • Number of trainable parameters: 22,565,376.
  • Embedding dimension: 384.
  • Config snapshot: hidden=384, layers=6, heads=12, intermediate=1536, vocab=30522, max_position_embeddings=512, type_vocab_size=2, hidden_act=gelu, layer_norm_eps=1e-12.
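The trainable-parameter figure above follows directly from the config snapshot via standard BERT parameter accounting; a quick verification sketch:

```rust
/// Total trainable parameters implied by the pinned all-MiniLM-L6-v2 config.
fn minilm_param_count() -> u64 {
    let (h, layers, inter, vocab, max_pos, types) =
        (384u64, 6u64, 1536u64, 30522u64, 512u64, 2u64);
    // Embeddings: word + position + token-type tables, plus one LayerNorm (gamma + beta).
    let embeddings = vocab * h + max_pos * h + types * h + 2 * h;
    // Per layer: Q/K/V/O projections (weights + biases) and the attention-output LayerNorm.
    let attn = 4 * (h * h + h) + 2 * h;
    // Per layer: FFN up- and down-projections (weights + biases) and the FFN-output LayerNorm.
    let ffn = (h * inter + inter) + (inter * h + h) + 2 * h;
    embeddings + layers * (attn + ffn)
}

fn main() {
    assert_eq!(minilm_param_count(), 22_565_376);
    println!("total trainable parameters: {}", minilm_param_count());
}
```

This matches the pinned 22,565,376 exactly, which is a useful sanity check after any weight conversion.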

Value-parity probe

Three extra files are uploaded so the ferrotorch-side harness can reproduce the parity verdict without re-running the upstream sentence-transformers model:

  • _value_parity_input.txt — verbatim sentence ("The quick brown fox jumps over the lazy dog.").
  • _value_parity_token_ids.json — upstream tokenizer(...) output for that sentence with add_special_tokens=True.
  • _value_parity_output.bin — float32 sentence embedding dumped from SentenceTransformer.encode(..., normalize_embeddings=True). Format: [u32 ndim][u32 × ndim shape][f32 × prod(shape) data] little-endian (matches the vision / causal-LM dumps).
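A minimal standalone reader for the dump format described above (a sketch for verification only; the ferrotorch harness may use its own helper):

```rust
use std::fs::File;
use std::io::{self, Read};

/// Read a parity dump: [u32 ndim][u32 x ndim shape][f32 x prod(shape) data], little-endian.
fn read_parity_dump(path: &str) -> io::Result<(Vec<u32>, Vec<f32>)> {
    let mut bytes = Vec::new();
    File::open(path)?.read_to_end(&mut bytes)?;
    let le_u32 = |b: &[u8]| u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
    let ndim = le_u32(&bytes[0..4]) as usize;
    let shape: Vec<u32> = (0..ndim)
        .map(|i| le_u32(&bytes[4 + 4 * i..8 + 4 * i]))
        .collect();
    let n: usize = shape.iter().map(|&d| d as usize).product();
    let data_off = 4 + 4 * ndim;
    let data: Vec<f32> = (0..n)
        .map(|i| {
            f32::from_le_bytes(
                bytes[data_off + 4 * i..data_off + 4 * i + 4].try_into().unwrap(),
            )
        })
        .collect();
    Ok((shape, data))
}

fn main() -> io::Result<()> {
    // Round-trip check against a dump written in-process: shape [2], data [0.5, -1.0].
    let mut buf = Vec::new();
    buf.extend_from_slice(&1u32.to_le_bytes()); // ndim
    buf.extend_from_slice(&2u32.to_le_bytes()); // shape[0]
    buf.extend_from_slice(&0.5f32.to_le_bytes());
    buf.extend_from_slice(&(-1.0f32).to_le_bytes());
    std::fs::write("parity_demo.bin", &buf)?;
    let (shape, data) = read_parity_dump("parity_demo.bin")?;
    assert_eq!(shape, vec![2]);
    assert_eq!(data, vec![0.5, -1.0]);
    Ok(())
}
```

For _value_parity_output.bin the expected shape is [384] (or [1, 384] depending on how the upstream dump batches), so check the shape header before comparing values.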

How to load

use ferrotorch_bert::{BertConfig, HfBertConfig, load_sentence_transformer};
use ferrotorch_hub::{HubCache, hf_download_model};

let cache = HubCache::with_default_dir();
let repo_dir = hf_download_model("ferrotorch/all-MiniLM-L6-v2", "main", &cache)?;
let hf_cfg = HfBertConfig::from_file(repo_dir.join("config.json"))?;
let cfg = BertConfig::from_hf(&hf_cfg)?;
let (st, _report) = load_sentence_transformer::<f32>(
    &repo_dir.join("model.safetensors"),
    cfg,
    /* normalize = */ true,
    /* strict = */ false, // non-strict: upstream ships extra pooler.* and position_ids tensors
)?;
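Once the ferrotorch embedding and the dumped upstream embedding are both in hand, the parity verdict is a plain numeric comparison. A sketch, with illustrative tolerances rather than the harness's actual thresholds:

```rust
/// Max absolute element-wise difference between two embeddings of equal length.
fn max_abs_diff(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

/// Cosine similarity; for two L2-normalized vectors this is just the dot product.
fn cosine(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn main() {
    // Illustrative 2-d vectors standing in for the two 384-d embeddings.
    let ours = [0.6f32, 0.8];
    let upstream = [0.6f32, 0.8];
    assert!(max_abs_diff(&ours, &upstream) < 1e-5);
    assert!(cosine(&ours, &upstream) > 1.0 - 1e-6);
}
```

Since both sides are L2-normalized, cosine similarity and max-abs-diff together give a tight picture: the former catches direction drift, the latter catches any single badly converted coordinate.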

Upstream license

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    https://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.