# Trained LoRA Adapter for SGLang Embedding LoRA Testing
This is a **fine-tuned** LoRA adapter for testing SGLang's embedding LoRA implementation.
Unlike randomly initialized adapters, this one produces coherent text outputs.
## Configuration
- **Base model:** `TinyLlama/TinyLlama-1.1B-Chat-v1.0`
- **LoRA rank (r):** 8
- **LoRA alpha:** 16
- **Target modules:** embed_tokens, lm_head, q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- **Training steps:** 500
- **Training data:** Alpaca dataset (instruction following)
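
The same setup can be expressed as a PEFT `LoraConfig` (a minimal sketch, assuming the adapter was trained with the `peft` library; only the values listed above are taken from this repo, everything else is left at its default):

```python
from peft import LoraConfig

# Sketch of the configuration above; parameters not listed in this README
# (e.g. dropout, bias handling) are assumptions left at their defaults.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "embed_tokens", "lm_head",
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```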
## Weight Shapes
```
down_proj.lora_A: (8, 5632)
down_proj.lora_B: (2048, 8)
embed_tokens.lora_embedding_A: (8, 32000)
embed_tokens.lora_embedding_B: (2048, 8)
gate_proj.lora_A: (8, 2048)
gate_proj.lora_B: (5632, 8)
k_proj.lora_A: (8, 2048)
k_proj.lora_B: (256, 8)
lm_head.lora_A: (8, 2048)
lm_head.lora_B: (32000, 8)
o_proj.lora_A: (8, 2048)
o_proj.lora_B: (2048, 8)
q_proj.lora_A: (8, 2048)
q_proj.lora_B: (2048, 8)
up_proj.lora_A: (8, 2048)
up_proj.lora_B: (5632, 8)
v_proj.lora_A: (8, 2048)
v_proj.lora_B: (256, 8)
```
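
These follow the standard LoRA convention: `lora_A` is `(r, in_features)`, `lora_B` is `(out_features, r)`, and the embedding layer stores `lora_embedding_A` as `(r, vocab_size)`. A quick sanity check over the checkpoint (a sketch; the file name `adapter_model.safetensors` is the usual PEFT output name and an assumption here):

```python
from safetensors.torch import load_file

# Sketch: confirm every A matrix has the rank (8) as its first dimension
# and every B matrix has it as its second, matching the table above.
weights = load_file("adapter_model.safetensors")  # assumed standard PEFT file name
for name, tensor in weights.items():
    if "lora_A" in name or "lora_embedding_A" in name:
        assert tensor.shape[0] == 8
    elif "lora_B" in name or "lora_embedding_B" in name:
        assert tensor.shape[1] == 8
```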
## Purpose
This adapter tests that SGLang's `ChunkedSgmvLoRABackend.run_lora_a_embedding()` correctly
handles embedding LoRA layers (`embed_tokens`, `lm_head`).
**Key:** `embed_tokens` is in `target_modules` (LoRA decomposition), NOT `modules_to_save` (full weights).
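
Conceptually, an embedding LoRA is applied as the base lookup plus a low-rank correction, rather than by swapping in a full replacement weight. A minimal sketch of that math (illustration only, not SGLang's fused kernel):

```python
import torch

# Shapes match the table above; scaling = lora_alpha / r = 16 / 8 = 2.
vocab_size, hidden_size, r = 32000, 2048, 8
base_embed = torch.nn.Embedding(vocab_size, hidden_size)
lora_embedding_A = torch.randn(r, vocab_size)    # (r, vocab_size)
lora_embedding_B = torch.randn(hidden_size, r)   # (hidden_size, r)
scaling = 16 / 8

token_ids = torch.tensor([1, 2, 3])
# Base embedding lookup plus the low-rank delta for the selected tokens.
delta = (lora_embedding_B @ lora_embedding_A[:, token_ids]).T * scaling
hidden = base_embed(token_ids) + delta
```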
## Usage with SGLang
```python
# Used by: test/srt/lora/test_lora_hf_sgl_logprob_diff.py
# The adapter produces coherent outputs for meaningful CI/CD verification.
```
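
For manual experimentation, the adapter can be loaded through SGLang's offline engine roughly as follows (a sketch, assuming the `Engine` `lora_paths` argument and a local checkout of this adapter at `./trained-embedding-lora-adapter`; the authoritative invocation is in the test file referenced above):

```python
import sglang as sgl

# Sketch, not the exact CI invocation. The adapter directory and the
# "embedding_lora" label are assumptions for illustration.
llm = sgl.Engine(
    model_path="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    lora_paths={"embedding_lora": "./trained-embedding-lora-adapter"},
)
out = llm.generate(
    "Give one tip for staying focused while studying.",
    sampling_params={"max_new_tokens": 32},
    lora_path="embedding_lora",
)
print(out["text"])
llm.shutdown()
```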
## Created with
```bash
python scripts/playground/lora/train_embedding_lora_adapter.py --num_train_steps 500
```