smithblack-0/SHRAM

smithblack-0 committed on Apr 16

Commit 8d94cd0 · verified · 1 Parent(s): a5f76da

Update architecture and tokenizer

Files changed (1)

architecture_core/README.md +16 -6


@@ -34,10 +34,16 @@ All other components follow the Llama 3 baseline (RMSNorm, SwiGLU FFN, RoPE).
 ## Usage
 ```python
 from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
-# Pull architecture config — override any parameter at instantiation time
 config = AutoConfig.from_pretrained(
     "smithblack-0/SHRAM",
     trust_remote_code=True,
@@ -45,15 +51,19 @@ config = AutoConfig.from_pretrained(
     num_mosrah_heads=32,        # example override
 )
-# Instantiate with fresh random weights — no checkpoint required
 model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
-# Load tokenizer
 tokenizer = AutoTokenizer.from_pretrained("smithblack-0/SHRAM")
-# Save and reload after training
-model.save_pretrained("./checkpoint")
-model = AutoModelForCausalLM.from_pretrained("./checkpoint", trust_remote_code=True)
 ```
 ## Constructor Defaults

 ## Usage
+This repository contains no pretrained weights. The intended workflow is: pull the
+architecture config from the Hub, instantiate a model with fresh random weights, then
+train it yourself.
 ```python
 from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
+# Step 1: pull the architecture config from the Hub.
+# AutoConfig.from_pretrained downloads config.json only — no weights are loaded.
+# Override any parameter via kwargs.
 config = AutoConfig.from_pretrained(
     "smithblack-0/SHRAM",
     trust_remote_code=True,
     num_mosrah_heads=32,        # example override
 )
+# Step 2: instantiate with fresh random weights.
+# from_config never loads a checkpoint — it always produces a randomly initialised model.
 model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)
+# Step 3: load the tokenizer.
 tokenizer = AutoTokenizer.from_pretrained("smithblack-0/SHRAM")
+```
+After training your own checkpoint, save and reload it in the standard way:
+```python
+model.save_pretrained("./my-checkpoint")
+model = AutoModelForCausalLM.from_pretrained("./my-checkpoint", trust_remote_code=True)
 ```
 ## Constructor Defaults