---
license: apache-2.0
base_model: meta-llama/Llama-3.1-8B
tags:
- lora
- peft
- polynomial
- regression
- grokking
---
# LoRA adapter: poly_5_medium on meta-llama/Llama-3.1-8B
Rank-8 **MLP-only LoRA** adapter for **polynomial regression** (via a sequence-classification head with a single regression label) on the `poly_5_medium` dataset.
## Adapter config
| Setting | Value |
|------------|-------|
| Base model | `meta-llama/Llama-3.1-8B` |
| Polynomial | `poly_5_medium` |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Target modules | MLP layers only (e.g. `gate_proj`, `up_proj`, `down_proj` for LLaMA) |
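
For reference, the table above corresponds roughly to the following PEFT `LoraConfig`. This is a minimal sketch; the `task_type` and exact module list are inferred from the table rather than read from the adapter's `adapter_config.json`:

```python
from peft import LoraConfig, TaskType

# Sketch of the adapter configuration implied by the table above.
# task_type and target_modules are assumptions, not copied from the shipped config.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,       # sequence-classification head used for regression
    r=8,                              # LoRA rank
    lora_alpha=16,                    # LoRA scaling factor
    lora_dropout=0.05,                # dropout applied to the LoRA layers
    target_modules=["gate_proj", "up_proj", "down_proj"],  # MLP-only targets for LLaMA
)
```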
## How to load and run
From this repo (requires `transformers`, `peft`, `torch`):
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "meta-llama/Llama-3.1-8B"
adapter_repo_id = "AnonymousForReview2/script_poly_5_medium_Llama-3.1-8B_r8_mlp"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Single-label classification head used as a regression output.
base = AutoModelForSequenceClassification.from_pretrained(
    base_model_id, num_labels=1, problem_type="regression"
)

# Attach the LoRA adapter (MLP projections only) on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_repo_id)
model.eval()

# Example: predict the target value for one input vector.
text = "input: [1.0, 2.0, 3.0, 4.0] target:"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    out = model(**inputs).logits.item()
print(out)
```
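
If you prefer to drop the PEFT wrapper at inference time, the adapter weights can be folded into the base model. This is an optional convenience step, sketched here with PEFT's `merge_and_unload`; the output directory name is just an example:

```python
# Optional: merge the LoRA weights into the base model for adapter-free inference.
# Not required by the loading example above.
merged = model.merge_and_unload()
merged.save_pretrained("poly_5_medium_merged")
tokenizer.save_pretrained("poly_5_medium_merged")
```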
Or use the provided script from the project root:
```bash
python huggingface/load_from_hub.py --repo_id AnonymousForReview2/script_poly_5_medium_Llama-3.1-8B_r8_mlp
```