---
model-index:
- name: netFound-640M-base
  results:
  - task:
      type: fill-mask
    metrics:
    - name: Macro MLM F1
      type: f1
      value: 0.4038
    - name: Weighted MLM F1
      type: f1
      value: 0.8451
    - name: MLM Accuracy
      type: accuracy
      value: 0.8514
    - name: Swapped Weighted F1
      type: f1
      value: 0.9605
    - name: Perplexity
      type: perplexity
      value: 6.5842
---

# netFound-640M-base

## Description
netFound is a network traffic foundation model that uses a transformer architecture and includes a pretraining phase on unlabeled data to achieve high performance.

Key features:
- netFound takes raw PCAP data as input
- netFound can (and needs to) be pretrained on an unlabeled dataset
- netFound uses a hierarchical transformer architecture to take into account packet burst and flow behavior
- netFound uses burst metadata (inter-arrival time, number of bytes per burst, etc.)

## Source code
https://github.com/SNL-UCSB/netfound

## Pretraining dataset
For pretraining, we used a private real-world dataset consisting of more than 450 million network flows.
The model was pretrained for approximately 1 epoch (iterating through ~480 million flows).

## Checkpoint
Model: Large (16 heads, 24 hidden layers, 1024 hidden size)

Total params: 643,825,672

January 17, 2025

## Paper
https://arxiv.org/abs/2310.17025
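
## Reading the reported metrics
The metadata above reports a macro MLM F1 (0.4038) well below the weighted MLM F1 (0.8451). The sketch below illustrates how such a gap can arise when a few token classes dominate: macro F1 averages per-class scores equally, while weighted F1 weights each class by its support. It also shows the conventional relation between perplexity and mean cross-entropy. This is not the authors' evaluation code; the helper `f1_scores` and the toy labels are hypothetical, and the perplexity relation assumes the reported value is the exponential of the mean cross-entropy in nats.

```python
import math
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, weighted_f1) over the classes present in y_true."""
    labels = sorted(set(y_true))
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        per_class[label] = 2 * tp / denom if denom else 0.0
    # Macro: every class counts equally, however rare it is.
    macro = sum(per_class.values()) / len(labels)
    # Weighted: each class counts in proportion to its number of true samples.
    weighted = sum(per_class[l] * support[l] for l in labels) / len(y_true)
    return macro, weighted


# Hypothetical toy labels: token 0 is very common, tokens 1 and 2 are rare.
y_true = [0] * 90 + [1] * 5 + [2] * 5
y_pred = [0] * 90 + [0] * 5 + [2] * 2 + [0] * 3

macro, weighted = f1_scores(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")

# Assumption: perplexity = exp(mean cross-entropy in nats), so the reported
# perplexity corresponds to a mean MLM loss of ln(6.5842).
reported_perplexity = 6.5842
implied_loss = math.log(reported_perplexity)
print(f"implied mean cross-entropy = {implied_loss:.4f} nats")
```

In the toy data the rare classes are predicted poorly, which pulls the macro score down while barely affecting the weighted score. A similar class imbalance among masked tokens would be one plausible explanation for the gap between the two reported MLM F1 values.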