Upload folder using huggingface_hub

Files changed (3)

README.md CHANGED

@@ -7,12 +7,12 @@ library_name: transformers
 Chain of Thought (CoT) transformer model trained to do multi-step integer arithmetic.
 Model details:
- - **Vocabulary Size**: 35 (Character Tokenization)
  - **Layer Count**: 8
  - **Attention Head Count**: 4
  - **Residual Stream Size**: 256
  - **Context Length**: 256
- - **Tokens Trained on**: 419,649,024
 Training Score During Training

 Chain of Thought (CoT) transformer model trained to do multi-step integer arithmetic.
 Model details:
+ - **Vocabulary Size**: 40 (Character Tokenization)
  - **Layer Count**: 8
  - **Attention Head Count**: 4
  - **Residual Stream Size**: 256
  - **Context Length**: 256
+ - **Tokens Trained on**: 419,612,160
 Training Score During Training

hyperparameters.json ADDED

@@ -0,0 +1 @@
+ {"MIN_DIFFICULTY": 2, "MAX_DIFFICULTY": 4, "TRAINING_SAMPLES": 3000000, "CONTEXT_LENGTH": 256, "RESIDUAL_EMBEDDING_SIZE": 256, "MLP_EMBEDDING_SIZE": 1024, "NUM_ATTENTION_HEADS": 4, "NUM_LAYERS": 8, "VOCAB_SIZE": 40, "TOTAL_TOKENS": 419612160}
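
As a quick sanity check on the values in hyperparameters.json, the sketch below parses the JSON and derives a few quantities from it. The per-head dimension assumes the common convention of splitting the residual stream evenly across attention heads; the diff does not confirm that this model does so. The variable names are illustrative only.

```python
import json

# Contents of hyperparameters.json as added in this commit.
HYPERPARAMETERS_JSON = (
    '{"MIN_DIFFICULTY": 2, "MAX_DIFFICULTY": 4, "TRAINING_SAMPLES": 3000000, '
    '"CONTEXT_LENGTH": 256, "RESIDUAL_EMBEDDING_SIZE": 256, '
    '"MLP_EMBEDDING_SIZE": 1024, "NUM_ATTENTION_HEADS": 4, "NUM_LAYERS": 8, '
    '"VOCAB_SIZE": 40, "TOTAL_TOKENS": 419612160}'
)

hp = json.loads(HYPERPARAMETERS_JSON)

# Assumption: heads split the residual stream evenly (standard, but not stated).
head_dim = hp["RESIDUAL_EMBEDDING_SIZE"] // hp["NUM_ATTENTION_HEADS"]

# Ratio of MLP hidden width to residual width.
mlp_ratio = hp["MLP_EMBEDDING_SIZE"] // hp["RESIDUAL_EMBEDDING_SIZE"]

# How many full context windows TOTAL_TOKENS corresponds to, and what is left over.
full_windows, remainder = divmod(hp["TOTAL_TOKENS"], hp["CONTEXT_LENGTH"])

print(f"head_dim={head_dim}, mlp_ratio={mlp_ratio}")
print(f"full_windows={full_windows}, remainder={remainder}")
```

With these values the token count divides evenly by the context length, which is consistent with (though not proof of) training on fixed-length windows of 256 tokens.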

tokenizer.json CHANGED Viewed

@@ -112,8 +112,34 @@
       "t": 36,
       "u": 37,
       "Ċ": 38,
-      "Ġ": 39
     },
-    "merges": []
   }
 }

       "t": 36,
       "u": 37,
       "Ċ": 38,
+      "Ġ": 39,
+      "Ġ-": 40,
+      "(-": 41,
+      "Ġ1": 42,
+      "St": 43,
+      "ep": 44
     },
+    "merges": [
+      [
+        "Ġ",
+        "-"
+      ],
+      [
+        "(",
+        "-"
+      ],
+      [
+        "Ġ",
+        "1"
+      ],
+      [
+        "S",
+        "t"
+      ],
+      [
+        "e",
+        "p"
+      ]
+    ]
   }
 }
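
To illustrate what the five new merge rules do, here is a minimal sketch of applying them to a character sequence. It is a simplified stand-in for the BPE procedure, not the `tokenizers` library's implementation: it walks the merges in rank order and fuses every adjacent occurrence of each pair. The function name `apply_merges` and the sample inputs are hypothetical; `Ġ` (U+0120) is the byte-level marker for a space, as in the vocabulary above.

```python
from typing import List, Tuple

G = "\u0120"  # "Ġ", the byte-level stand-in for a space character

# The merges added to tokenizer.json in this commit, in rank order.
MERGES: List[Tuple[str, str]] = [
    (G, "-"),
    ("(", "-"),
    (G, "1"),
    ("S", "t"),
    ("e", "p"),
]


def apply_merges(symbols: List[str], merges: List[Tuple[str, str]]) -> List[str]:
    """Fuse adjacent symbol pairs, one merge rule at a time, in rank order."""
    for left, right in merges:
        merged: List[str] = []
        i = 0
        while i < len(symbols):
            # Fuse the pair if it starts at position i, otherwise copy one symbol.
            if i + 1 < len(symbols) and symbols[i] == left and symbols[i + 1] == right:
                merged.append(left + right)
                i += 2
            else:
                merged.append(symbols[i])
                i += 1
        symbols = merged
    return symbols


# "Step 1" with the space already mapped to Ġ.
step_tokens = apply_merges(list("Step" + G + "1"), MERGES)
# "(-3 - 1)" with spaces mapped to Ġ.
expr_tokens = apply_merges(list("(-3" + G + "-" + G + "1)"), MERGES)

print(step_tokens)
print(expr_tokens)
```

Under this sketch, `Step 1` shrinks from six character tokens to three (`St`, `ep`, `Ġ1`). The highest id in the updated vocabulary is 44, which is larger than the `VOCAB_SIZE` of 40 recorded in hyperparameters.json and README.md; the diff does not say whether that difference is intended.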