Upload folder using huggingface_hub
README.md CHANGED

@@ -30,7 +30,7 @@ trained with a knowledge cutoff of **January 2017**, from the
 ## Usage
 
 ```python
-from transformers import AutoModelForCausalLM
+from transformers import AutoModelForCausalLM
 import torch
 
 model = AutoModelForCausalLM.from_pretrained(
@@ -43,6 +43,7 @@ model = AutoModelForCausalLM.from_pretrained(
 ## Conversion Notes
 
 - Converted from the original Open LM `.pt` checkpoint to HuggingFace `LlamaForCausalLM` format.
+- Architecture dimensions are auto-detected from checkpoint weights.
 - The original model uses **QK norm** (RMSNorm on Q and K projections), which is not natively
   supported by HF Llama. QK norm weights are dropped during conversion. For exact numerical
   equivalence, use the [open_lm](https://github.com/mlfoundations/open_lm) library.
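To make the QK-norm note concrete: the sketch below (plain PyTorch, written for illustration and not taken from the open_lm codebase) shows RMSNorm applied to the query and key projections after the linear layers. The `q_norm`/`k_norm` parameters are the weights that have no counterpart in HF's Llama attention and are therefore dropped by the conversion.

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    # Standard RMSNorm: scale by the reciprocal root-mean-square, then a learned gain.
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * rms * self.weight

class QKNormSketch(nn.Module):
    # Illustrative module: "QK norm" means normalizing Q and K *after* projection.
    # q_norm and k_norm below are the extra weights dropped during conversion.
    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim, bias=False)
        self.k_proj = nn.Linear(dim, dim, bias=False)
        self.q_norm = RMSNorm(dim)
        self.k_norm = RMSNorm(dim)

    def forward(self, x: torch.Tensor):
        q = self.q_norm(self.q_proj(x))
        k = self.k_norm(self.k_proj(x))
        return q, k
```

Dropping `q_norm`/`k_norm` changes the attention logits slightly, which is why the note recommends the open_lm library when exact numerical equivalence matters.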