Update README.md
README.md
CHANGED
@@ -70,3 +70,5 @@ python applyweights.py \
 | `--layers` | A space-separated list of specific layer indices to fuse (e.g., `--layers 5 11 17`), skipping the rest. | None |
 | `--alpha` | Controls the variance scale multiplier for the `down_proj` update. | `0.02` |
 | `--gamma-cap` | Sets the maximum fractional adjustment allowed for the `gate_proj`. | `0.05` |
+
+# Note: the default values for Alpha and Gamma-cap are extremely conservative and will not influence model behavior much; they can be pushed to around Alpha ~32 and Gamma-cap ~25 without breaking the model.
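
For readers who want to see how the three documented flags fit together, here is a minimal sketch of how they could be parsed. It assumes an `argparse`-based CLI; `build_parser` is a hypothetical name, and none of this is taken from `applyweights.py` itself. Only the flag names and defaults come from the table above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser covering only the three flags documented in the README table."""
    parser = argparse.ArgumentParser(
        description="Sketch of the fusion options (assumed, not the real script)."
    )
    # Space-separated layer indices; None means no layer filter was given.
    parser.add_argument("--layers", type=int, nargs="+", default=None,
                        help="Specific layer indices to fuse, e.g. --layers 5 11 17")
    # Variance scale multiplier for the down_proj update (README default: 0.02).
    parser.add_argument("--alpha", type=float, default=0.02)
    # Maximum fractional adjustment for gate_proj (README default: 0.05).
    parser.add_argument("--gamma-cap", type=float, default=0.05)
    return parser


if __name__ == "__main__":
    # Values near the upper range mentioned in the README note.
    args = build_parser().parse_args(
        ["--layers", "5", "11", "17", "--alpha", "32", "--gamma-cap", "25"]
    )
    print(args.layers, args.alpha, args.gamma_cap)
```

`argparse` converts `--gamma-cap` to the attribute name `gamma_cap`, and `nargs="+"` is what lets `--layers` accept a space-separated list.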