---
model-index:
- name: netFound-640M-base
  results:
  - task:
      type: fill-mask
    metrics:
    - name: Macro MLM F1
      type: f1
      value: 0.4038
    - name: Weighted MLM F1
      type: f1
      value: 0.8451
    - name: MLM Accuracy
      type: accuracy
      value: 0.8514
    - name: Swapped Weighted F1
      type: f1
      value: 0.9605
    - name: Perplexity
      type: perplexity
      value: 6.5842
---

# netFound-640M-base

## Description

netFound is a foundation model for network traffic. It is built on a transformer architecture and relies on a pretraining phase over unlabeled data to achieve strong results.

Key features:

- netFound takes raw PCAP data as input
- netFound can (and needs to) be pretrained on an unlabeled dataset
- netFound uses a hierarchical transformer architecture to account for packet burst and flow behavior
- netFound uses burst metadata (inter-arrival time, number of bytes per burst, etc.)
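
As a rough illustration of the burst metadata mentioned above, the sketch below groups a flow's packets into bursts and derives the number of bytes per burst and the inter-arrival time between bursts. It is not taken from the netFound code base: the `Packet` and `Burst` types, the `split_into_bursts` helper, and the rule that a burst is a maximal run of consecutive packets in the same direction are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Packet:
    """Hypothetical minimal packet record (not the netFound input format)."""
    timestamp: float   # seconds since the start of the capture
    size: int          # packet length in bytes
    outgoing: bool     # True for client-to-server, False for the reverse


@dataclass
class Burst:
    """Hypothetical per-burst metadata."""
    start: float          # timestamp of the first packet in the burst
    num_packets: int
    num_bytes: int
    inter_arrival: float  # gap since the previous burst started (0.0 for the first)


def split_into_bursts(packets: List[Packet]) -> List[Burst]:
    """Group packets into bursts, ASSUMING a burst is a maximal run of
    consecutive packets travelling in the same direction."""
    bursts: List[Burst] = []
    direction = None
    for pkt in sorted(packets, key=lambda p: p.timestamp):
        if bursts and pkt.outgoing == direction:
            # Same direction as the current burst: extend it.
            bursts[-1].num_packets += 1
            bursts[-1].num_bytes += pkt.size
        else:
            # Direction changed (or first packet): open a new burst.
            gap = pkt.timestamp - bursts[-1].start if bursts else 0.0
            bursts.append(Burst(pkt.timestamp, 1, pkt.size, gap))
            direction = pkt.outgoing
    return bursts


flow = [
    Packet(0.00, 60, True),
    Packet(0.01, 1500, False),
    Packet(0.02, 1500, False),
    Packet(0.30, 80, True),
]
bursts = split_into_bursts(flow)
for burst in bursts:
    print(burst)
```

For the four-packet flow above this yields three bursts, with the middle one carrying 3000 bytes in two packets.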

## Source code

https://github.com/SNL-UCSB/netfound

## Pretraining dataset

For pretraining, we used a private real-world dataset of more than 450 million network flows. The model was pretrained for approximately one epoch (about 480 million flows processed).

## Checkpoint

- Model: Large (16 heads, 24 hidden layers, 1024 hidden size)
- Total params: 643,825,672
- Date: January 17, 2025
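
The hyperparameters listed above imply a few derived quantities that are handy when planning to load the checkpoint. The sketch below is plain arithmetic on the published numbers. The `CheckpointSpec` class and its field names are hypothetical and do not correspond to a netFound or Hugging Face configuration class, the per-head dimension assumes standard multi-head attention, and the memory figures cover the weights only (no activations or optimizer state).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointSpec:
    """Hypothetical container for the numbers stated in this model card."""
    num_attention_heads: int = 16
    num_hidden_layers: int = 24
    hidden_size: int = 1024
    total_params: int = 643_825_672

    @property
    def head_dim(self) -> int:
        # ASSUMPTION: standard multi-head attention, where the hidden size
        # is split evenly across the attention heads.
        assert self.hidden_size % self.num_attention_heads == 0
        return self.hidden_size // self.num_attention_heads

    def weight_gib(self, bytes_per_param: int) -> float:
        """Approximate size of the weights alone, in GiB."""
        return self.total_params * bytes_per_param / 2**30


spec = CheckpointSpec()
print(f"per-head dimension: {spec.head_dim}")
print(f"fp32 weights: {spec.weight_gib(4):.2f} GiB")
print(f"fp16/bf16 weights: {spec.weight_gib(2):.2f} GiB")
```

With these numbers the per-head dimension is 64, and the weights occupy roughly 2.4 GiB in 32-bit precision or about 1.2 GiB in 16-bit precision.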

## Paper

https://arxiv.org/abs/2310.17025