---
license: apache-2.0
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - feature-extraction
  - semantic-search
  - automotive-parts
  - synonyms
base_model: sentence-transformers/all-MiniLM-L6-v2
---

# Automotive Parts Synonym Model

A fine-tuned SentenceTransformer model specialized for finding synonyms and related terms in automotive parts and service descriptions.

## Model Description

This model is fine-tuned from sentence-transformers/all-MiniLM-L6-v2 specifically for automotive parts synonym detection. It can identify when different part names refer to the same or similar components.

**Base Model:** sentence-transformers/all-MiniLM-L6-v2

## Training Details

**Training Strategy:** three-phase approach

1. **Contextual Training** (30 epochs): full phrases with synonyms/antonyms
2. **Foundation Training** (15 epochs): word-to-word synonym/antonym pairs
3. **Real-world Fine-tuning** (4 epochs): search phrases and repair descriptions

**Loss Function:** OnlineContrastiveLoss with per-phase margins (0.6 → 0.4 → 0.4)

**Training Data:** automotive parts synonym/antonym pairs with contextual repair descriptions

**LoRA Integration:** To improve general semantic performance significantly, LoRA adapters were attached to the attention and feed-forward layers of the MiniLM encoder.

- **Config:** r=64, alpha=256, dropout=0.05
- Only the adapter parameters were trained, while the majority of the base model remained frozen.
- Adapters were enabled during training phases, disabled for baseline evaluations, and saved separately at checkpoints.

## Performance

Evaluated on 15-state synonym phrases, DeepSeek general queries, and the STS-B and MTB datasets:

- Synonym phrases: Top-200 recall 15192/15330 (99.1%), MRR@200 0.6183
- DeepSeek queries: Top-200 recall 140844/142325 (99.0%), MRR@200 0.4436
- STS-B Spearman: 0.867
- MTB Spearman: 0.724
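
For reference, the per-query retrieval metrics above can be computed as follows (a minimal sketch; the function names and example part names are illustrative):

```python
def recall_at_k(ranked_ids, relevant_id, k=200):
    """1 if the relevant item appears in the top-k ranking, else 0."""
    return int(relevant_id in ranked_ids[:k])

def mrr_at_k(ranked_ids, relevant_id, k=200):
    """Reciprocal rank of the relevant item within the top-k, else 0.0."""
    for rank, item in enumerate(ranked_ids[:k], start=1):
        if item == relevant_id:
            return 1.0 / rank
    return 0.0

# Example: the relevant synonym is ranked 2nd, so recall is 1 and MRR is 0.5.
ranking = ["brake shoe", "brake lining", "spark plug"]
assert recall_at_k(ranking, "brake lining") == 1
assert mrr_at_k(ranking, "brake lining") == 0.5
```

The corpus-level numbers in the list above are these per-query values summed (recall) or averaged (MRR) over all evaluation queries.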

## Limitations

- Optimized specifically for automotive parts and repair terminology
- May not perform well on general-domain text