LaughingLogits committed (verified)
Commit f173236 · 1 Parent(s): 92ff78f

Update README.md

Files changed (1): README.md +8 -0
README.md CHANGED
@@ -4,3 +4,11 @@ This Model is currently anonymized during the paper review process.
  The AP-MAE transformer model design and configuration is available at: https://github.com/LaughingLogits/attention-astronaut

  This version of AP-MAE is trained on attention heads generated by StarCoder2-3B during inference. The inference task used for generating attention outputs is FiM token prediction for a randomly chosen masked section of 3-10 tokens of Java code, with exactly 256 tokens of surrounding context.
+
+ # Usage:
+ ```
+ from ap_mae import APMAE
+ model = APMAE.from_pretrained(
+     "LaughingLogits/AP-MAE-SC2-3B"
+ )
+ ```
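The FiM setup described above (a random 3-10 token masked span with surrounding context) can be sketched as follows. This is a minimal illustration, not the authors' actual preprocessing code: the special tokens `<fim_prefix>`/`<fim_suffix>`/`<fim_middle>` follow the StarCoder-family convention, and the even prefix/suffix split of the 256 context tokens as well as the helper name `make_fim_example` are assumptions.

```python
import random

def make_fim_example(tokens, span_len_range=(3, 10), context=256):
    """Mask a random 3-10 token span and build a FiM-style prompt.

    Sketch only: the real pipeline's tokenization and context-window
    handling are not shown in this repo.
    """
    span_len = random.randint(*span_len_range)
    start = random.randrange(0, len(tokens) - span_len)
    # Assumption: the 256 context tokens are split evenly around the span.
    prefix = tokens[max(0, start - context // 2):start]
    suffix = tokens[start + span_len:start + span_len + context // 2]
    middle = tokens[start:start + span_len]
    prompt = ("<fim_prefix>" + " ".join(prefix)
              + "<fim_suffix>" + " ".join(suffix)
              + "<fim_middle>")
    return prompt, " ".join(middle)
```

During inference the base model would complete the `<fim_middle>` portion, and AP-MAE is trained on the attention heads produced while doing so.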