Synthyra/FastESM2_650


lhallee committed on Dec 4, 2024

Commit 4f950cb · verified · 1 Parent(s): 1542bb0

Update README.md

Files changed (1)

README.md +4 -0

@@ -26,4 +26,8 @@ with torch.no_grad():
 print(embeddings.shape) # (1, 11, 1280)
 ```

+Because we trained in mixed-precision float16, running the model in float16 gives outputs closer to those of the float32 weights than running it in bfloat16.
+Average MSE over 1000 sequences, measured against the outputs of the float32 weights:
+Average MSE for FP16: 0.00000140
+Average MSE for BF16: 0.00004125
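
The following is a minimal sketch, not the evaluation used for the numbers above. It illustrates one likely contributor to the gap: float16 keeps 10 mantissa bits while bfloat16 keeps 7, so for values well inside float16's range, rounding a float32 value to float16 loses less precision. It uses only the Python standard library and random values rather than model outputs; the helper names `to_fp16`, `to_bf16`, `to_fp32`, and `mse` are hypothetical and do not come from the repository.

```python
import random
import struct


def to_fp32(x: float) -> float:
    """Round a Python float (float64) to float32 and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def to_fp16(x: float) -> float:
    """Round x to IEEE half precision (float16) and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def to_bf16(x: float) -> float:
    """Round x to bfloat16 by keeping the top 16 bits of its float32
    encoding, using round-to-nearest-even. Assumes x is finite."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round to nearest, ties to even
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Stand-in for float32 reference values; these are NOT real embeddings.
reference = [to_fp32(random.gauss(0.0, 1.0)) for _ in range(1000)]

mse_fp16 = mse(reference, [to_fp16(v) for v in reference])
mse_bf16 = mse(reference, [to_bf16(v) for v in reference])

print(f"Rounding MSE for FP16: {mse_fp16:.10f}")
print(f"Rounding MSE for BF16: {mse_bf16:.10f}")
```

The sketch only measures rounding error on individual values. The README's figures compare full model outputs, where the training precision also plays a role, so the two sets of numbers are not directly comparable.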