---
license: eupl-1.1
language: code
---

ARM64BERT 🦾
------------
[GitHub repository](https://github.com/NetherlandsForensicInstitute/asmtransformers)
## General
### What is the purpose of the model?
The model is a BERT model for ARM64 assembly code. This specific model has NOT been finetuned for semantic similarity; for that, you most likely want to use our [other model](https://huggingface.co/NetherlandsForensicInstitute/ARM64bert-embedding). The main purpose of ARM64BERT is to serve as a baseline to compare the finetuned model against.
### What does the model architecture look like?
The model architecture is inspired by [jTrans](https://github.com/vul337/jTrans) (Wang et al., 2022). It is a BERT model (Devlin et al., 2019), although the typical Next Sentence Prediction task has been replaced with Jump Target Prediction, as proposed by Wang et al.
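Roughly, Jump Target Prediction asks the model to predict, for each branch instruction, the position of the instruction it jumps to. The supervision signal can be sketched with a toy example (illustrative only — the helper name, label handling, and branch mnemonic set are assumptions, not the actual training code):

```python
def jump_target_pairs(instructions):
    """Return (branch_position, target_position) pairs for label operands.

    `instructions` is a list of ARM64 instruction strings; labels are
    lines ending in ':'.
    """
    # First pass: map each label to the position where it is defined.
    label_pos = {}
    for i, ins in enumerate(instructions):
        if ins.endswith(":"):
            label_pos[ins[:-1]] = i
    # Second pass: pair each branch with the position of its target label.
    pairs = []
    for i, ins in enumerate(instructions):
        parts = ins.split()
        if parts and parts[0] in {"b", "b.eq", "b.ne", "bl", "cbz", "cbnz"}:
            target = parts[-1]  # the label operand is the last token
            if target in label_pos:
                pairs.append((i, label_pos[target]))
    return pairs

code = [
    "cmp w0, #0",
    "b.eq .Lzero",
    "mov w0, #1",
    "b .Lend",
    ".Lzero:",
    "mov w0, #0",
    ".Lend:",
    "ret",
]
print(jump_target_pairs(code))  # → [(1, 4), (3, 6)]
```

During pretraining, these positions stand in for the "next sentence" label: instead of deciding whether two sentences are adjacent, the model learns where control flow lands.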
### What is the output of the model?
The model is a BERT base model, whose outputs are not meant to be used directly.
### How does the model perform?

[…] either the train or the test set, not both. We have not performed any deduplication.
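A minimal sketch of the kind of split that sends each function to either the train or the test set, never both (the helper name and hashing scheme are hypothetical — the actual split code is not reproduced in this card):

```python
import hashlib

def split_by_function(names, test_fraction=0.2):
    """Deterministically assign each function to train or test by hashing
    its name, so the same function can never end up in both sets."""
    train, test = [], []
    for name in names:
        bucket = int(hashlib.sha256(name.encode()).hexdigest(), 16) % 100
        (test if bucket < test_fraction * 100 else train).append(name)
    return train, test

train, test = split_by_function([f"func_{i}" for i in range(1000)])
assert set(train).isdisjoint(test)      # no function appears in both sets
assert len(train) + len(test) == 1000   # every function is assigned exactly once
```

Hashing the function name (rather than shuffling rows) keeps every compiled variant of a function on the same side of the split.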
The dataset was collected by our team. The annotation of similar/non-similar functions comes from the different compilation levels, i.e. what we consider "similar functions" is in fact the same function compiled in a different way.
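The positive-pair construction can be illustrated with a toy sketch (the record layout, function names, and helper are hypothetical; the actual pipeline is not reproduced here):

```python
from itertools import combinations

# Each record: (function_name, optimization_level, assembly). Two records with
# the same function name but different optimization levels form a "similar" pair.
corpus = [
    ("memcpy_small", "-O0", "ldrb w3, [x1] ..."),
    ("memcpy_small", "-O2", "ldp x2, x3, [x1] ..."),
    ("checksum", "-O0", "add w0, w0, w1 ..."),
    ("checksum", "-O3", "add w0, w0, w1 ..."),  # unchanged by optimization
]

def similar_pairs(records):
    pairs = []
    for a, b in combinations(records, 2):
        if a[0] == b[0] and a[1] != b[1]:
            pairs.append((a, b))
    return pairs

pairs = similar_pairs(corpus)  # two pairs: memcpy_small and checksum
```

Note that the "checksum" pair has identical assembly at both optimization levels — exactly the kind of duplicate the data-quality remark in this card refers to.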
### Any remarks on data quality and bias?
The way we classify functions as similar may have implications. For example, two different ways of compiling the same function sometimes do not result in different code. We did not remove duplicates from the data during training, but we did implement checks in the evaluation stage, and it seems that the model has not suffered from the simple training