Upload folder using huggingface_hub
README.md CHANGED

@@ -30,7 +30,7 @@ trained with a knowledge cutoff of **January 2017**, from the
 ## Usage
 
 ```python
-from transformers import AutoModelForCausalLM
+from transformers import AutoModelForCausalLM
 import torch
 
 model = AutoModelForCausalLM.from_pretrained(
@@ -43,6 +43,7 @@ model = AutoModelForCausalLM.from_pretrained(
 ## Conversion Notes
 
 - Converted from the original Open LM `.pt` checkpoint to HuggingFace `LlamaForCausalLM` format.
+- Architecture dimensions are auto-detected from checkpoint weights.
 - The original model uses **QK norm** (RMSNorm on Q and K projections), which is not natively
   supported by HF Llama. QK norm weights are dropped during conversion. For exact numerical
   equivalence, use the [open_lm](https://github.com/mlfoundations/open_lm) library.
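To make the QK-norm note concrete: the sketch below (plain PyTorch, written for illustration and not taken from the open_lm codebase) shows RMSNorm applied to the query and key projections after the linear layers. The `q_norm`/`k_norm` parameters are the weights that have no counterpart in HF's Llama attention and are therefore dropped by the conversion.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    # Standard RMSNorm: scale by the reciprocal root-mean-square, then a learned gain.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

class QKNormSketch(nn.Module):
    # Illustrative module: "QK norm" means normalizing Q and K *after* projection.
    # q_norm and k_norm below are the extra weights dropped during conversion.
    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.q_norm = RMSNorm(dim)
        self.k_norm = RMSNorm(dim)

    def forward(self, x: torch.Tensor):
        q = self.q_norm(self.q_proj(x))
        k = self.k_norm(self.k_proj(x))
        return q, k
```

Dropping `q_norm`/`k_norm` changes the attention logits slightly, which is why the note recommends the open_lm library when exact numerical equivalence matters.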