Upload JiRack_H12_L6_V50257_D768_MSL8192_FF768x4.pt
This file is intended strictly for storing the initial weights (checkpoint) of the JiRack GPT model. The model is "clean," meaning it contains no data and has never undergone pre-training.
It is designed to be a maximally safe and robust base for training specialized, smaller models from scratch, such as:
Spam detection systems
Fraud detection models
Background check (BG check) models
A product of CMS Manhattan.
VOCAB_SIZE = 50257
MODEL_DIM = 768
NUM_HEADS = 12
NUM_LAYERS = 6
MAX_SEQ_LEN = 8192
FFN_HIDDEN_DIM = 4 * MODEL_DIM
HEAD_DIM = MODEL_DIM // NUM_HEADS
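
The two derived values above can be checked directly. The following is a minimal sketch that evaluates the listed constants; the divisibility assertion is an added assumption (the usual multi-head attention requirement that the model dimension splits evenly across heads) and is not part of the original listing.

```python
# Hyperparameters exactly as listed for this checkpoint.
VOCAB_SIZE = 50257
MODEL_DIM = 768
NUM_HEADS = 12
NUM_LAYERS = 6
MAX_SEQ_LEN = 8192
FFN_HIDDEN_DIM = 4 * MODEL_DIM
HEAD_DIM = MODEL_DIM // NUM_HEADS

# Assumed sanity check (not in the original listing): integer division
# would silently drop dimensions if MODEL_DIM were not a multiple of NUM_HEADS.
assert MODEL_DIM % NUM_HEADS == 0

print(FFN_HIDDEN_DIM)  # 3072
print(HEAD_DIM)        # 64
```

The same values appear in the file name: H12 (heads), L6 (layers), V50257 (vocabulary), D768 (model dimension), MSL8192 (maximum sequence length), and FF768x4 (feed-forward width).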
JiRack_H12_L6_V50257_D768_MSL8192_FF768x4.pt ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:97aadb0eecd322942343d7aae105f3ce9a477aa42a1d81f82e6148bfcb0da9ee
+size 349530076
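
The three added lines are a Git LFS pointer: the repository stores only this small text file, while the checkpoint itself is identified by its SHA-256 digest and byte size. As a hedged sketch, a downloaded copy could be checked against the pointer as shown here. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and written for this illustration; they are not part of Git LFS or any library.

```python
import hashlib
import os

# Pointer contents as shown in the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:97aadb0eecd322942343d7aae105f3ce9a477aa42a1d81f82e6148bfcb0da9ee
size 349530076
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<hash algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as f:
        # Read in chunks so a large checkpoint is never fully held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
```

After downloading the checkpoint, `matches_pointer("JiRack_H12_L6_V50257_D768_MSL8192_FF768x4.pt", pointer)` should return `True` for an intact copy, assuming the file sits in the current directory.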