Upload pickle due to diff vs. safetensors

I noticed small differences (e.g. in attention heatmaps) when loading from .safetensors versus the full model pickle (.pt), so I am sharing the original torch.save model. All evals and benchmarks were run on the pickle.

Files changed (2)

ViT-L-14-REG-GATED-balanced-ckpt12.pt +3 -0
ViT-L-14-REG-GATED-xtreme-ckpt20.pt +3 -0

ViT-L-14-REG-GATED-balanced-ckpt12.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3302e00a9fd691783423e20b4b35184546baae1d8203d4fff23d4956ebe9f8e1
+size 1811525394

ViT-L-14-REG-GATED-xtreme-ckpt20.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2b10359f2428b7d6edb8cf5ea88271d20c6d0c230a4fed7c62f8c884c910bbd6
+size 1811525394
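
The two hunks above are Git LFS pointers: each records the SHA-256 and byte size of the real .pt file. A minimal sketch for checking a downloaded checkpoint against its pointer follows, using only the Python standard library. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and not part of this repo, and the sketch assumes the .pt file has already been downloaded locally.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the first hunk above (leading '+' is diff markup).
BALANCED_POINTER = """\
+version https://git-lfs.github.com/spec/v1
+oid sha256:3302e00a9fd691783423e20b4b35184546baae1d8203d4fff23d4956ebe9f8e1
+size 1811525394
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate diff '+' prefixes
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's byte size and SHA-256 match the pointer."""
    path = Path(path)
    # Cheap check first: a size mismatch means we can skip hashing.
    if path.stat().st_size != int(pointer["size"]):
        return False
    algo, _, expected = pointer["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in chunks so a ~1.8 GB checkpoint is never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected
```

For example, `matches_pointer("ViT-L-14-REG-GATED-balanced-ckpt12.pt", parse_lfs_pointer(BALANCED_POINTER))` should return True for an intact download. This only confirms the file is byte-identical to the upload; it says nothing about the .safetensors discrepancy described above.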