---
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: peft
license: mit
tags:
  - lora
  - peft
  - scientific
  - research
  - academic
  - domain-adaptation
  - sentence-embeddings
language:
  - en
---

# Scientific LoRA Adapter for DomainEmbedder-v2.6

Domain-specific LoRA adapter for scientific/research text embeddings.

## Model Details

| Property | Value |
|---|---|
| Base Model | sentence-transformers/all-MiniLM-L6-v2 |
| Parent System | DomainEmbedder-v2.6 |
| Domain | Scientific / Research |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Target Modules | query, value |
| Trainable Params | 147,456 (0.645%) |
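The trainable-parameter count follows directly from the LoRA shape: rank-16 adapters on the query and value projections of each of the six 384-dimensional attention layers in all-MiniLM-L6-v2 give exactly 147,456 parameters. A back-of-the-envelope check (layer count and hidden size taken from the base model's published configuration):

```python
# LoRA adds two low-rank matrices per target module:
# A with shape (rank, d_in) and B with shape (d_out, rank).
rank = 16
hidden = 384           # all-MiniLM-L6-v2 hidden size
layers = 6             # all-MiniLM-L6-v2 has 6 transformer layers
modules_per_layer = 2  # query and value projections

params_per_module = rank * hidden + hidden * rank  # A + B
trainable = params_per_module * modules_per_layer * layers
print(trainable)  # 147456
```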

## Training Data

Trained on 40,000 scientific text pairs from:

- arXiv (document-level)
- arXiv (section-level)
- PubMed Artificial
- Scientific Papers

> **Note:** 87.3% real data + 12.7% augmented data (the scientific domain had fewer available pairs).

## Training Configuration

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch Size | 32 |
| Learning Rate | 2e-4 |
| Loss | Contrastive (InfoNCE) |
| Best Val Loss | 0.0016 |
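The InfoNCE objective treats each anchor's paired text as its positive and the other texts in the batch as in-batch negatives. A minimal NumPy sketch of the loss (the temperature value here is an illustrative assumption, not necessarily the training value):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """In-batch InfoNCE: row i of `anchors` pairs with row i of `positives`."""
    # L2-normalise so the dot product equals cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature  # (batch, batch) similarity matrix
    # cross-entropy with the matching pair on the diagonal as the target
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()
```

Lower loss means each anchor ranks its own positive above the other batch members.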

## Usage

This adapter is part of the DomainEmbedder-v2.6 system. It is selected automatically by the RL policy when scientific content is detected.

```python
from peft import PeftModel
from transformers import AutoModel

# Load the base encoder
base_encoder = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

# Apply the scientific LoRA adapter on top of it
scientific_model = PeftModel.from_pretrained(base_encoder, 'path/to/scientific_lora')
```
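all-MiniLM-L6-v2 is a sentence-transformers model, so sentence embeddings come from mean-pooling the encoder's token outputs over the attention mask, excluding padding positions. A minimal sketch of that pooling step in NumPy (the array shapes are illustrative):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, ignoring padding positions.

    token_embeddings: (seq_len, hidden) encoder output for one sentence
    attention_mask:   (seq_len,) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, None].astype(float)  # (seq_len, 1)
    return (token_embeddings * mask).sum(axis=0) / mask.sum()
```

With the real model, `token_embeddings` would be the adapter-augmented encoder's `last_hidden_state` for one input.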

## Author

Zain Asad

## License

MIT License

## Framework Versions

- PEFT 0.18.1
- Transformers 4.x
- PyTorch 2.x