---
license: apache-2.0
base_model: meta-llama/Llama-3.1-8B
tags:
- lora
- peft
- polynomial
- regression
- grokking
---

# LoRA adapter: poly_5_medium on meta-llama/Llama-3.1-8B

Rank-8 **MLP-only LoRA** adapter for **polynomial regression** (sequence classification head) on the `poly_5_medium` polynomial dataset.

## Adapter config

| Setting | Value |
|----------------|-------|
| Base model | `meta-llama/Llama-3.1-8B` |
| Polynomial | `poly_5_medium` |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Target modules | MLP layers only (e.g. `gate_proj`, `up_proj`, `down_proj` for LLaMA) |
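The table above can be expressed as a PEFT-style adapter configuration. The dict below is a reconstruction from the listed hyperparameters (field names follow `peft`'s `adapter_config.json` conventions), not the exact file shipped in this repo:

```python
# Reconstruction of the adapter hyperparameters from the table above.
# Illustrative only -- consult the repo's adapter_config.json for the
# authoritative values.
adapter_config = {
    "base_model_name_or_path": "meta-llama/Llama-3.1-8B",
    "peft_type": "LORA",
    "task_type": "SEQ_CLS",  # sequence classification head
    "r": 8,                  # LoRA rank
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": ["gate_proj", "up_proj", "down_proj"],  # MLP-only
}

# Effective LoRA scaling factor is alpha / r.
scaling = adapter_config["lora_alpha"] / adapter_config["r"]
print(scaling)  # 2.0
```

With alpha = 16 and r = 8, the adapter updates are applied with a scaling factor of 2.0, a common choice of alpha = 2r.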

## How to load and run

From this repo (requires `transformers`, `peft`, and `torch`):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "meta-llama/Llama-3.1-8B"
adapter_repo_id = "AnonymousForReview2/script_poly_5_medium_Llama-3.1-8B_r8_mlp"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base = AutoModelForSequenceClassification.from_pretrained(
    base_model_id, num_labels=1, problem_type="regression"
)
model = PeftModel.from_pretrained(base, adapter_repo_id)
model.eval()

# Example: predict the target for a single input vector.
# Note: Llama tokenizers ship without a pad token; for batched inference
# set one first, e.g. tokenizer.pad_token = tokenizer.eos_token and
# model.config.pad_token_id = tokenizer.pad_token_id.
text = "input: [1.0, 2.0, 3.0, 4.0] target:"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs).logits.item()
print(out)
```
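To score several input vectors, the prompt template from the example above can be built programmatically. The helper below is our own illustration (the name `format_poly_input` is not part of this repo), but the `input: [...] target:` template matches the example in this card:

```python
def format_poly_input(xs):
    """Format a vector of floats into the 'input: [...] target:' prompt.

    Illustrative helper -- the template matches the example in this card,
    but the function itself is not shipped with the repo.
    """
    body = ", ".join(f"{x:.1f}" for x in xs)
    return f"input: [{body}] target:"

prompts = [
    format_poly_input(v)
    for v in ([1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.5, 3.0])
]
print(prompts[0])  # input: [1.0, 2.0, 3.0, 4.0] target:
```

When tokenizing a batch of such prompts with padding, remember to set a pad token first (Llama tokenizers do not define one by default).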

Or use the provided script from the project root:

```bash
python huggingface/load_from_hub.py --repo_id AnonymousForReview2/script_poly_5_medium_Llama-3.1-8B_r8_mlp
```