---
library_name: transformers
tags: []
---

# AP-MAE-SC2-7B

This model is currently anonymized during the paper review process.
|
The AP-MAE transformer model design and configuration are available in the reproduction package attached to the submission.
|
This version of AP-MAE is trained on attention heads generated by StarCoder2-7B during inference. The inference task used to generate the attention outputs is fill-in-the-middle (FiM) token prediction over a randomly chosen masked span of 3-10 tokens of Java code, with exactly 256 tokens of surrounding context.
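The masking procedure described above can be sketched as follows. This is a minimal illustrative sketch, not the exact pipeline used for AP-MAE: the function name and the even prefix/suffix split of the 256 context tokens are assumptions.

```python
import random

def make_fim_example(tokens, context_len=256, rng=random):
    """Hypothetical sketch: split `tokens` into (prefix, middle, suffix)
    for FiM prediction, masking a random 3-10 token span and keeping at
    most `context_len` tokens of surrounding context."""
    span_len = rng.randint(3, 10)                  # masked span of 3-10 tokens
    start = rng.randint(0, len(tokens) - span_len)
    middle = tokens[start:start + span_len]
    # Assumption: context is split evenly between prefix and suffix.
    half = context_len // 2
    prefix = tokens[max(0, start - half):start]
    suffix = tokens[start + span_len:start + span_len + half]
    return prefix, middle, suffix
```

The prefix and suffix are then presented to the model around a fill-in-the-middle sentinel, and the masked `middle` tokens are the prediction target.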
|
# Usage

```python
from ap_mae import APMAE

model = APMAE.from_pretrained("LaughingLogits/AP-MAE-SC2-7B")
```