kgrabko committed
Commit d63f819 · verified · 1 Parent(s): 960c6b4

Upload JiRack_H12_L6_V50257_D768_MSL8192_FF768x4.pt


This file stores only the initial weights (checkpoint) of the JiRack GPT model. The model is "clean": it has never undergone pre-training, so its weights contain no learned data.

It is designed to be a maximally safe and robust base for training specialized, smaller models from scratch, such as:

SPAM Detection Systems

FRAUD Detection Models

Background Check (BG Check) Models

A product of CMS Manhattan.

VOCAB_SIZE = 50257
MODEL_DIM = 768
NUM_HEADS = 12
NUM_LAYERS = 6
MAX_SEQ_LEN = 8192
FFN_HIDDEN_DIM = 4 * MODEL_DIM
HEAD_DIM = MODEL_DIM // NUM_HEADS
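
The hyperparameters above can be sanity-checked with a short Python sketch. It derives the two computed values and gives a rough weight count. The helper `estimate_weight_count` is hypothetical, and the count rests on assumptions that this commit does not confirm: learned positional embeddings, an output head tied to the token embedding, and no bias or normalization parameters counted.

```python
# Configuration values copied from the commit description.
VOCAB_SIZE = 50257
MODEL_DIM = 768
NUM_HEADS = 12
NUM_LAYERS = 6
MAX_SEQ_LEN = 8192
FFN_HIDDEN_DIM = 4 * MODEL_DIM
HEAD_DIM = MODEL_DIM // NUM_HEADS

# The model dimension must split evenly across the attention heads.
assert MODEL_DIM % NUM_HEADS == 0


def estimate_weight_count() -> int:
    """Rough weight-matrix count (hypothetical helper, see assumptions above)."""
    token_embedding = VOCAB_SIZE * MODEL_DIM
    # Assumption: positions are a learned table of MAX_SEQ_LEN rows.
    position_embedding = MAX_SEQ_LEN * MODEL_DIM
    # Assumption: four square projections per layer (Q, K, V, output).
    attention = 4 * MODEL_DIM * MODEL_DIM
    # Assumption: one up-projection and one down-projection per layer.
    feed_forward = 2 * MODEL_DIM * FFN_HIDDEN_DIM
    per_layer = attention + feed_forward
    return token_embedding + position_embedding + NUM_LAYERS * per_layer


if __name__ == "__main__":
    weights = estimate_weight_count()
    print("HEAD_DIM:", HEAD_DIM)
    print("FFN_HIDDEN_DIM:", FFN_HIDDEN_DIM)
    print("estimated weights:", weights)
    print("estimated float32 bytes:", weights * 4)
```

Under these assumptions the estimate is 87,356,160 weights, or roughly 349.4 MB in float32, which is close to the 349,530,076 bytes recorded for the uploaded file. The actual layer layout of the checkpoint may differ.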

JiRack_H12_L6_V50257_D768_MSL8192_FF768x4.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:97aadb0eecd322942343d7aae105f3ce9a477aa42a1d81f82e6148bfcb0da9ee
+ size 349530076
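
The three added lines are a Git LFS pointer: the repository stores only the SHA-256 digest and byte size of the checkpoint, while the file itself lives in LFS storage. A minimal Python sketch for checking a downloaded copy against that pointer could look like the following. The function names `parse_lfs_pointer` and `matches_pointer` and the local file path are illustrative assumptions, not part of any Git LFS tooling.

```python
import hashlib
from pathlib import Path

# Pointer contents as shown in the diff above (without the leading "+").
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:97aadb0eecd322942343d7aae105f3ce9a477aa42a1d81f82e6148bfcb0da9ee
size 349530076
"""


def parse_lfs_pointer(text: str) -> dict:
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer_text: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    expected_size = int(fields["size"])

    path = Path(path)
    # Cheap check first: a size mismatch means the file cannot match.
    if path.stat().st_size != expected_size:
        return False

    # Hash in chunks so a large checkpoint is never fully loaded into memory.
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


if __name__ == "__main__":
    # Hypothetical local path; adjust to wherever the file was downloaded.
    local_file = Path("JiRack_H12_L6_V50257_D768_MSL8192_FF768x4.pt")
    if local_file.exists():
        print("matches pointer:", matches_pointer(local_file, POINTER))
```

Comparing the size before hashing avoids reading the whole file when a download was truncated.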