---
license: apache-2.0
base_model: meta-llama/Llama-3.1-8B
tags:
- lora
- peft
- polynomial
- regression
- grokking
---

# LoRA adapter: poly_5_medium on meta-llama/Llama-3.1-8B

Rank-1 **MLP-only LoRA** adapter for **polynomial regression** (via a sequence-classification head) on the `poly_5_medium` dataset.

## Adapter config

| Setting        | Value |
|----------------|-------|
| Base model     | `meta-llama/Llama-3.1-8B` |
| Polynomial     | `poly_5_medium` |
| LoRA rank      | 1 |
| LoRA alpha     | 16 |
| LoRA dropout   | 0.05 |
| Target modules | MLP layers only (e.g. `gate_proj`, `up_proj`, `down_proj` for LLaMA) |

## How to load and run

From this repo (requires `transformers`, `peft`, and `torch`):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel
import torch

base_model_id = "meta-llama/Llama-3.1-8B"
adapter_repo_id = "AnonymousForReview2/script_poly_5_medium_Llama-3.1-8B_r1_mlp"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base = AutoModelForSequenceClassification.from_pretrained(
    base_model_id, num_labels=1, problem_type="regression"
)

# Llama tokenizers ship without a pad token; the sequence-classification head
# needs one to locate the last non-padding position in each sequence.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
base.config.pad_token_id = tokenizer.pad_token_id

model = PeftModel.from_pretrained(base, adapter_repo_id)
model.eval()

# Example: predict the target value for an input vector
text = "input: [1.0, 2.0, 3.0, 4.0] target:"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    prediction = model(**inputs).logits.item()
print(prediction)
```

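The `"input: [...] target:"` prompt layout used above can be produced with a small helper. This is an illustrative assumption about the format; the exact training-time serialization (e.g. decimal precision) is defined by the project's dataset scripts:

```python
def format_prompt(xs):
    """Format a feature vector into the 'input: [...] target:' prompt layout.

    Illustrative helper only -- the precise formatting used at training
    time (number of decimals, spacing) may differ.
    """
    body = ", ".join(f"{x:.1f}" for x in xs)
    return f"input: [{body}] target:"

print(format_prompt([1.0, 2.0, 3.0, 4.0]))
# -> input: [1.0, 2.0, 3.0, 4.0] target:
```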
Or use the provided script from the project root:

```bash
python huggingface/load_from_hub.py --repo_id AnonymousForReview2/script_poly_5_medium_Llama-3.1-8B_r1_mlp
```