fim-pp / model_architecture.txt

	===============================================================================================
	Layer (type:depth-idx)                          Output Shape            Param #
	===============================================================================================
	FIMHawkes                                       --                      256
	├─SineTimeEncoding: 1-1                         [6, 1, 100, 256]        --
	│    └─Linear: 2-1                              [6, 1, 100, 1]          2
	│    └─Sequential: 2-2                          [6, 1, 100, 255]        --
	│    │    └─Linear: 3-1                         [6, 1, 100, 255]        510
	│    │    └─SinActivation: 3-2                  [6, 1, 100, 255]        --
	├─SineTimeEncoding: 1-2                         [6, 1, 100, 256]        --
	│    └─Linear: 2-3                              [6, 1, 100, 1]          2
	│    └─Sequential: 2-4                          [6, 1, 100, 255]        --
	│    │    └─Linear: 3-3                         [6, 1, 100, 255]        510
	│    │    └─SinActivation: 3-4                  [6, 1, 100, 255]        --
	├─Linear: 1-3                                   [600, 256]              5,888
	├─LayerNorm: 1-4                                [6, 1, 100, 256]        512
	├─SineTimeEncoding: 1-5                         [6, 1999, 100, 256]     (recursive)
	│    └─Linear: 2-5                              [6, 1999, 100, 1]       (recursive)
	│    └─Sequential: 2-6                          [6, 1999, 100, 255]     (recursive)
	│    │    └─Linear: 3-5                         [6, 1999, 100, 255]     (recursive)
	│    │    └─SinActivation: 3-6                  [6, 1999, 100, 255]     --
	├─SineTimeEncoding: 1-6                         [6, 1999, 100, 256]     (recursive)
	│    └─Linear: 2-7                              [6, 1999, 100, 1]       (recursive)
	│    └─Sequential: 2-8                          [6, 1999, 100, 255]     (recursive)
	│    │    └─Linear: 3-7                         [6, 1999, 100, 255]     (recursive)
	│    │    └─SinActivation: 3-8                  [6, 1999, 100, 255]     --
	├─Linear: 1-7                                   [1199400, 256]          (recursive)
	├─LayerNorm: 1-8                                [6, 1999, 100, 256]     (recursive)
	├─TransformerEncoder: 1-9                       [11994, 100, 256]       --
	│    └─ModuleList: 2-9                          --                      --
	│    │    └─TransformerEncoderLayer: 3-9        [11994, 100, 256]       1,315,072
	│    │    └─TransformerEncoderLayer: 3-10       [11994, 100, 256]       1,315,072
	│    │    └─TransformerEncoderLayer: 3-11       [11994, 100, 256]       1,315,072
	│    │    └─TransformerEncoderLayer: 3-12       [11994, 100, 256]       1,315,072
	├─AttentionOperator: 1-10                       [11994, 1, 256]         --
	│    └─ModuleList: 2-10                         --                      --
	│    │    └─ResidualAttentionLayer: 3-13        [11994, 1, 256]         1,315,072
	├─TransformerEncoder: 1-11                      [6, 1999, 256]          --
	│    └─ModuleList: 2-11                         --                      --
	│    │    └─TransformerEncoderLayer: 3-14       [6, 1999, 256]          1,315,072
	│    │    └─TransformerEncoderLayer: 3-15       [6, 1999, 256]          1,315,072
	├─TransformerDecoder: 1-12                      [6, 100, 256]           --
	│    └─ModuleList: 2-12                         --                      --
	│    │    └─TransformerDecoderLayer: 3-16       [6, 100, 256]           1,578,752
	│    │    └─TransformerDecoderLayer: 3-17       [6, 100, 256]           1,578,752
	│    │    └─TransformerDecoderLayer: 3-18       [6, 100, 256]           1,578,752
	│    │    └─TransformerDecoderLayer: 3-19       [6, 100, 256]           1,578,752
	├─Linear: 1-13                                  [1, 256]                5,888
	├─MLP: 1-14                                     [600, 1]                --
	│    └─Sequential: 2-13                         [600, 1]                --
	│    │    └─Linear: 3-20                        [600, 256]              131,328
	│    │    └─GELU: 3-21                          [600, 256]              --
	│    │    └─Dropout: 3-22                       [600, 256]              --
	│    │    └─Linear: 3-23                        [600, 256]              65,792
	│    │    └─GELU: 3-24                          [600, 256]              --
	│    │    └─Dropout: 3-25                       [600, 256]              --
	│    │    └─Linear: 3-26                        [600, 1]                257
	├─MLP: 1-15                                     [600, 1]                --
	│    └─Sequential: 2-14                         [600, 1]                --
	│    │    └─Linear: 3-27                        [600, 256]              131,328
	│    │    └─GELU: 3-28                          [600, 256]              --
	│    │    └─Dropout: 3-29                       [600, 256]              --
	│    │    └─Linear: 3-30                        [600, 256]              65,792
	│    │    └─GELU: 3-31                          [600, 256]              --
	│    │    └─Dropout: 3-32                       [600, 256]              --
	│    │    └─Linear: 3-33                        [600, 1]                257
	├─MLP: 1-16                                     [600, 1]                --
	│    └─Sequential: 2-15                         [600, 1]                --
	│    │    └─Linear: 3-34                        [600, 256]              131,328
	│    │    └─GELU: 3-35                          [600, 256]              --
	│    │    └─Dropout: 3-36                       [600, 256]              --
	│    │    └─Linear: 3-37                        [600, 256]              65,792
	│    │    └─GELU: 3-38                          [600, 256]              --
	│    │    └─Dropout: 3-39                       [600, 256]              --
	│    │    └─Linear: 3-40                        [600, 1]                257
	===============================================================================================
	Total params: 16,126,211
	Trainable params: 16,126,211
	Non-trainable params: 0
	Total mult-adds (G): 70.54
	===============================================================================================
	Input size (MB): 28.96
	Forward/backward pass size (MB): 118787.71
	Params size (MB): 48.71
	Estimated Total Size (MB): 118865.38
	===============================================================================================
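
The per-layer counts in the summary can be cross-checked by hand. The sketch below recomputes them in plain Python, assuming every layer uses the standard PyTorch parameter layout (dense layers with bias, attention with four 256 x 256 projections, layer norms with scale and shift). The feed-forward width of 2048 and the input widths 22 and 512 are not stated anywhere in the summary; they are inferred here because they reproduce the printed numbers, so treat them as assumptions rather than the model's confirmed configuration.

```python
# Parameter-count bookkeeping for the summary above (plain Python, no PyTorch needed).
# Assumption: layers follow the standard PyTorch parameter layouts. The widths
# D_FF, 22 and 512 are inferred from the printed counts, not stated in the summary.

D_MODEL = 256   # embedding width, read off the output shapes
D_FF = 2048     # assumed feed-forward width (PyTorch's default)


def linear(n_in: int, n_out: int) -> int:
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out


def layer_norm(d: int) -> int:
    """Scale and shift vectors."""
    return 2 * d


def attention(d: int) -> int:
    """Q, K, V and output projections, each d x d with bias."""
    return 4 * linear(d, d)


def encoder_layer(d: int, d_ff: int) -> int:
    """Self-attention, two-layer feed-forward block, two layer norms."""
    return attention(d) + linear(d, d_ff) + linear(d_ff, d) + 2 * layer_norm(d)


def decoder_layer(d: int, d_ff: int) -> int:
    """Encoder layer plus cross-attention and a third layer norm."""
    return encoder_layer(d, d_ff) + attention(d) + layer_norm(d)


def sine_time_encoding() -> int:
    """Linear 1->1 plus Linear 1->255 (the sine activation has no parameters)."""
    return linear(1, 1) + linear(1, D_MODEL - 1)


def mlp_head() -> int:
    """Linear 512->256, Linear 256->256, Linear 256->1 (input width 512 is inferred)."""
    return linear(2 * D_MODEL, D_MODEL) + linear(D_MODEL, D_MODEL) + linear(D_MODEL, 1)


enc = encoder_layer(D_MODEL, D_FF)
dec = decoder_layer(D_MODEL, D_FF)

total = (
    256                         # parameters held directly by FIMHawkes
    + 2 * sine_time_encoding()  # 1-1 and 1-2 (1-5 and 1-6 are reused, hence "recursive")
    + linear(22, D_MODEL)       # Linear 1-3 (input width 22 is inferred)
    + layer_norm(D_MODEL)       # LayerNorm 1-4
    + 4 * enc                   # TransformerEncoder 1-9
    + enc                       # ResidualAttentionLayer 3-13 (same count as an encoder layer)
    + 2 * enc                   # TransformerEncoder 1-11
    + 4 * dec                   # TransformerDecoder 1-12
    + linear(22, D_MODEL)       # Linear 1-13
    + 3 * mlp_head()            # MLP 1-14, 1-15, 1-16
)

print(enc, dec, total)  # 1315072 1578752 16126211
```

The three printed values match the 1,315,072 and 1,578,752 per-layer entries and the 16,126,211 total. The flattened leading dimensions are consistent in the same way: 600 = 6 x 100, 11994 = 6 x 1999, and 1199400 = 6 x 1999 x 100.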