JonasDornbusch committed 64f3d3a (verified) · Parent: 4440194

Update README.md

Files changed (1): README.md (+44 −3)
---
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- HuggingFaceH4/ultrachat_200k
- walledai/HarmBench
language:
- en
new_version: ASSELab/DAT-Llama-3-8B-Instruct
tags:
- pytorch
- llama
- llama-3
- DAT
- robust
- adversarial
library_name: transformers
---

# DAT - Distributional Adversarial Training

[![arXiv](https://img.shields.io/badge/arXiv-2511.04316-b31b1b.svg)](...)
[![GitHub](https://img.shields.io/badge/GitHub-DAT-181717?logo=github&logoColor=white)](https://github.com/ASSELab/DAT)

DAT applies [continuous adversarial training](https://arxiv.org/abs/2405.15589) to [diffusion-based](https://arxiv.org/abs/2511.00203v1) adversarial examples in order to close the gap between the empirical and the population robust risk. This model is a fine-tune of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).

#### This model is <u>**NOT**</u> adversarially trained! It is an ablation/baseline fine-tuned only on the diffusion-generated data.

For further information, consult our paper []() or the repository [https://github.com/ASSELab/DAT](https://github.com/ASSELab/DAT).
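Since the card lists `library_name: transformers` and a conversational Llama-3 base, a short usage sketch may help. The helper below is a plain-Python illustration of the Llama 3 instruct prompt layout (the format that the tokenizer's `apply_chat_template` produces); the function name is hypothetical, and in practice you would load this repo's tokenizer and call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` rather than formatting by hand.

```python
def build_llama3_prompt(messages):
    """Illustrative sketch of the Llama 3 instruct chat layout.

    `messages` is a list of {"role": ..., "content": ...} dicts, as used by
    transformers' chat templating.
    """
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        # Each turn is wrapped in header tokens and terminated by <|eot_id|>.
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open an assistant header so the model generates the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
```

For actual inference, prefer the tokenizer's built-in chat template so the special tokens always match the checkpoint's vocabulary.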

## Citation

```tex
@misc{,
  title={},
  author={},
  year={2026},
  eprint={},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```