---
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: peft
license: mit
tags:
  - lora
  - peft
  - scientific
  - research
  - academic
  - domain-adaptation
  - sentence-embeddings
language:
  - en
---

# Scientific LoRA Adapter for DomainEmbedder-v2.6

Domain-specific LoRA adapter for scientific/research text embeddings.

## Model Details

| Property | Value |
|---|---|
| Base Model | sentence-transformers/all-MiniLM-L6-v2 |
| Parent System | DomainEmbedder-v2.6 |
| Domain | Scientific / Research |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Target Modules | query, value |
| Trainable Params | 147,456 (0.645%) |
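The trainable-parameter count follows directly from the LoRA shape: rank-16 adapters on the query and value projections of each of the six 384-dimensional attention layers in all-MiniLM-L6-v2 give exactly 147,456 parameters. A back-of-the-envelope check (layer count and hidden size taken from the base model's published configuration):

```python
# LoRA adds two low-rank matrices per target module:
# A with shape (rank, d_in) and B with shape (d_out, rank).
rank = 16
hidden = 384           # all-MiniLM-L6-v2 hidden size
layers = 6             # all-MiniLM-L6-v2 has 6 transformer layers
modules_per_layer = 2  # query and value projections

params_per_module = rank * hidden + hidden * rank  # A + B
trainable = params_per_module * modules_per_layer * layers
print(trainable)  # 147456
```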

## Training Data

Trained on 40,000 scientific text pairs from:

- arXiv (document-level)
- arXiv (section-level)
- PubMed Artificial
- Scientific Papers

> **Note:** 87.3% real data + 12.7% augmented data (the scientific domain had fewer available pairs).

## Training Configuration

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch Size | 32 |
| Learning Rate | 2e-4 |
| Loss | Contrastive (InfoNCE) |
| Best Val Loss | 0.0016 |
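The InfoNCE objective treats each anchor's paired text as its positive and the other texts in the batch as in-batch negatives. A minimal NumPy sketch of the loss (the temperature value here is an illustrative assumption, not necessarily the training value):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """In-batch InfoNCE: row i of `anchors` pairs with row i of `positives`."""
    # L2-normalise so the dot product equals cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature  # (batch, batch) similarity matrix
    # cross-entropy with the matching pair on the diagonal as the target
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()
```

Lower loss means each anchor ranks its own positive above the other batch members.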

## Usage

This adapter is part of the DomainEmbedder-v2.6 system. It is selected automatically by the RL policy when scientific content is detected.

```python
from peft import PeftModel
from transformers import AutoModel

# Load the base encoder
base_encoder = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

# Apply the scientific LoRA adapter on top of it
scientific_model = PeftModel.from_pretrained(base_encoder, 'path/to/scientific_lora')
```
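all-MiniLM-L6-v2 is a sentence-transformers model, so sentence embeddings come from mean-pooling the encoder's token outputs over the attention mask, excluding padding positions. A minimal sketch of that pooling step in NumPy (the array shapes are illustrative):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions.

    token_embeddings: (seq_len, hidden) encoder output for one sentence
    attention_mask:   (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(float)  # (seq_len, 1)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()
```

With the real model, `token_embeddings` would be the adapter-augmented encoder's `last_hidden_state` for one input.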

## Author

Zain Asad

## License

MIT License

## Framework Versions

- PEFT 0.18.1
- Transformers 4.x
- PyTorch 2.x