Inoob/Null-GPT2-Large


Inoob committed on Sep 3, 2024

Commit d39f1b0 · verified · 1 Parent(s): 6692a04

Update README.md

Files changed (1)

README.md +46 -3


@@ -1,3 +1,46 @@
----
-license: mit
----

+---
+license: mit
+base_model: openai-community/gpt2-large
+---
+# GPT2-LARGE ARCHITECTURE MODEL
+## Description
+This is a GPT-2 Large model that provides the architecture only: it contains no pretrained weights, biases, or attention parameters.
+It is intended for researchers who want to train the model from scratch rather than fine-tune it.
+It was generated with the GitHub repo [Model Architecture Generator](https://github.com/ivanhe123/Model-Architecture-Generator).
+## Use
+Change into the directory that contains the model, then run:
+```
+from transformers import AutoModel, AutoTokenizer
+import torch
+import os
+# Paths to the architecture-only checkpoint (input) and the re-initialized copy (output)
+model_name = "./gpt2-large-architecture"
+output_dir = "./gpt2-large-reset"
+model = AutoModel.from_pretrained(model_name)
+tokenizer = AutoTokenizer.from_pretrained(model_name)
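+for name, param in model.named_parameters():
+    if param.dim() > 1:
+        # Weight matrices and embeddings: Xavier-uniform initialization
+        torch.nn.init.xavier_uniform_(param)
+    elif "ln_" in name and name.endswith("weight"):
+        # LayerNorm scales start at 1; zeroing them would zero every LayerNorm output
+        torch.nn.init.ones_(param)
+    else:
+        # Biases start at zero
+        torch.nn.init.zeros_(param)
+os.makedirs(output_dir, exist_ok=True)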
+model.save_pretrained(output_dir)
+tokenizer.save_pretrained(output_dir)
+print(f"Model with randomized parameters saved to: {output_dir}")
+```
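The README script above relies on `torch.nn.init.xavier_uniform_` for every parameter with more than one dimension. As a rough illustration of what that call does, the sketch below reproduces the sampling rule in plain Python: values are drawn uniformly from `[-a, a]` with `a = gain * sqrt(6 / (fan_in + fan_out))`. The helper names `xavier_uniform_bound` and `xavier_uniform_matrix` are hypothetical and not part of the repository, and the shapes (hidden size 1280, vocabulary size 50257) are assumed from the published GPT-2 Large configuration rather than read from this checkpoint.

```python
import math
import random


def xavier_uniform_bound(fan_in: int, fan_out: int, gain: float = 1.0) -> float:
    """Return the bound a such that Xavier-uniform samples lie in [-a, a]."""
    return gain * math.sqrt(6.0 / (fan_in + fan_out))


def xavier_uniform_matrix(rows: int, cols: int, seed: int = 0) -> list:
    """Sample a rows x cols matrix with the Xavier-uniform rule (hypothetical helper)."""
    rng = random.Random(seed)
    # For a 2-D matrix the bound depends only on rows + cols.
    bound = xavier_uniform_bound(cols, rows)
    return [[rng.uniform(-bound, bound) for _ in range(cols)] for _ in range(rows)]


# Assumed GPT-2 Large shapes (not read from the checkpoint).
hidden_size = 1280
vocab_size = 50257

# Fused query/key/value projection: hidden_size x (3 * hidden_size).
attn_bound = xavier_uniform_bound(hidden_size, 3 * hidden_size)
# Token embedding table: vocab_size x hidden_size.
embed_bound = xavier_uniform_bound(hidden_size, vocab_size)

print(f"attention projection bound: {attn_bound:.5f}")
print(f"token embedding bound:      {embed_bound:.5f}")

# A small matrix to show that every sample respects the bound.
small = xavier_uniform_matrix(4, 8)
small_bound = xavier_uniform_bound(8, 4)
```

Because the bound shrinks as the matrix grows, the large embedding table is initialized with smaller values than the attention projections under this rule.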