smithblack-0 committed on
Commit 8d94cd0 · verified · 1 Parent(s): a5f76da

Update architecture and tokenizer

Files changed (1)
  1. architecture_core/README.md +16 -6
architecture_core/README.md CHANGED
@@ -34,10 +34,16 @@ All other components follow the Llama 3 baseline (RMSNorm, SwiGLU FFN, RoPE).

  ## Usage

+ This repository contains no pretrained weights. The intended workflow is: pull the
+ architecture config from the Hub, instantiate a model with fresh random weights, then
+ train it yourself.
+
  ```python
  from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

- # Pull architecture config override any parameter at instantiation time
+ # Step 1: pull the architecture config from the Hub.
+ # AutoConfig.from_pretrained downloads config.json only — no weights are loaded.
+ # Override any parameter via kwargs.
  config = AutoConfig.from_pretrained(
      "smithblack-0/SHRAM",
      trust_remote_code=True,
@@ -45,15 +51,19 @@ config = AutoConfig.from_pretrained(
      num_mosrah_heads=32, # example override
  )

- # Instantiate with fresh random weights — no checkpoint required
+ # Step 2: instantiate with fresh random weights.
+ # from_config never loads a checkpoint — it always produces a randomly initialised model.
  model = AutoModelForCausalLM.from_config(config, trust_remote_code=True)

- # Load tokenizer
+ # Step 3: load the tokenizer.
  tokenizer = AutoTokenizer.from_pretrained("smithblack-0/SHRAM")
+ ```

- # Save and reload after training
- model.save_pretrained("./checkpoint")
- model = AutoModelForCausalLM.from_pretrained("./checkpoint", trust_remote_code=True)
+ After training your own checkpoint, save and reload it in the standard way:
+
+ ```python
+ model.save_pretrained("./my-checkpoint")
+ model = AutoModelForCausalLM.from_pretrained("./my-checkpoint", trust_remote_code=True)
  ```

  ## Constructor Defaults
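
Note on the diff above: Step 1 of the updated Usage section says that kwargs passed to `AutoConfig.from_pretrained` override the values stored in `config.json`. That precedence rule can be illustrated without the Hub. The sketch below is a plain-Python stand-in, not the `transformers` implementation. `load_config_with_overrides` is a hypothetical helper, and the stored value `16` for `num_mosrah_heads` and the `hidden_size` field are made-up placeholders, not SHRAM's real defaults.

```python
import json
import tempfile
from pathlib import Path


def load_config_with_overrides(config_path, **overrides):
    """Read a JSON config file and let keyword arguments replace stored values.

    Hypothetical helper: it only mimics the precedence rule (kwargs win over
    the file); it is not how transformers implements AutoConfig.
    """
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    config.update(overrides)  # caller-supplied values take precedence
    return config


def demo():
    """Write a placeholder config.json, then load it with and without an override."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"
        # Placeholder values, NOT the real SHRAM defaults.
        placeholder = {"num_mosrah_heads": 16, "hidden_size": 512}
        path.write_text(json.dumps(placeholder), encoding="utf-8")
        stored = load_config_with_overrides(path)
        overridden = load_config_with_overrides(path, num_mosrah_heads=32)
    return stored, overridden


if __name__ == "__main__":
    stored, overridden = demo()
    print(stored["num_mosrah_heads"], overridden["num_mosrah_heads"])
```

With the real library, the corresponding call is the `AutoConfig.from_pretrained("smithblack-0/SHRAM", trust_remote_code=True, num_mosrah_heads=32)` form shown in the diff. Fields that are not overridden keep the values stored in the file.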