---
base_model: sentence-transformers/all-MiniLM-L6-v2
library_name: peft
license: mit
tags:
- lora
- peft
- code
- programming
- software
- domain-adaptation
- sentence-embeddings
language:
- en
---

# Code LoRA Adapter for DomainEmbedder-v2.6

Domain-specific LoRA adapter for code/programming text embeddings.

## Model Details

| Property | Value |
|----------|-------|
| **Base Model** | sentence-transformers/all-MiniLM-L6-v2 |
| **Parent System** | DomainEmbedder-v2.6 |
| **Domain** | Code / Programming |
| **LoRA Rank** | 16 |
| **LoRA Alpha** | 32 |
| **Target Modules** | query, value |
| **Trainable Params** | 147,456 (0.645%) |
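
The trainable-parameter count follows directly from the adapter shape. A quick sanity check in plain Python (the layer count and hidden size below are those of all-MiniLM-L6-v2, which has 6 transformer layers with hidden size 384; they are not stated in this card):

```python
# Sanity-check the 147,456 trainable-parameter figure.
# A rank-16 LoRA on each query and value projection adds two
# matrices per module: A (hidden x rank) and B (rank x hidden).
hidden_size = 384       # all-MiniLM-L6-v2 hidden dimension
rank = 16               # LoRA rank from the table above
num_layers = 6          # transformer layers in the base model
modules_per_layer = 2   # query and value projections

params_per_module = hidden_size * rank + rank * hidden_size
trainable = params_per_module * num_layers * modules_per_layer
print(trainable)  # 147456
```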

## Training Data

Trained on 40,000 code-related text pairs from:

- Code Alpaca
- MBPP (Mostly Basic Python Problems)
- Code Contests
- Python Instructions

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Epochs | 3 |
| Batch Size | 32 |
| Learning Rate | 2e-4 |
| Loss | Contrastive (InfoNCE) |
| Best Val Loss | 0.0039 |
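
For reference, the contrastive (InfoNCE) objective named in the table is typically implemented as an in-batch cross-entropy over similarity logits. The following is a minimal sketch of that objective, not the exact training code; the function name and the temperature value are illustrative assumptions, not values from this card:

```python
import torch
import torch.nn.functional as F

def info_nce_loss(anchors, positives, temperature=0.05):
    """In-batch InfoNCE: for each anchor, the same-index row of
    `positives` is its positive pair and every other row in the
    batch serves as a negative."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = (a @ p.T) / temperature  # cosine-similarity logits
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)
```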

## Usage

This adapter is part of the DomainEmbedder-v2.6 system. It is selected automatically by the RL policy when code-related content is detected.

```python
from peft import PeftModel
from transformers import AutoModel

# Load base encoder
base_encoder = AutoModel.from_pretrained('sentence-transformers/all-MiniLM-L6-v2')

# Apply code LoRA
code_model = PeftModel.from_pretrained(base_encoder, 'path/to/code_lora')
```
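
Note that the adapted model returns per-token hidden states, so a pooling step is still needed to obtain one embedding per sentence. A sketch of the mean pooling used by the base sentence-transformers model (the helper name `mean_pool` is my own, not part of this repository):

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, ignoring padding positions,
    mirroring the pooling of the base sentence-transformers model."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # avoid division by zero
    return summed / counts
```

With a tokenized batch `inputs`, this would be applied as `emb = mean_pool(code_model(**inputs).last_hidden_state, inputs['attention_mask'])`.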

## Author

**Zain Asad**

## License

MIT License

## Framework Versions

- PEFT 0.18.1
- Transformers 4.x
- PyTorch 2.x