---
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B-Instruct
datasets:
- HuggingFaceH4/ultrachat_200k
- walledai/HarmBench
language:
- en
new_version: ASSELab/Diffusion-Llama-3-8B-Instruct
tags:
- pytorch
- llama
- llama-3
- DAT
- robust
- adversarial
library_name: transformers
---

# DAT - Distributional Adversarial Training

[](...)
[](https://github.com/ASSELab/DAT)


DAT applies [continuous adversarial training](https://arxiv.org/abs/2405.15589) to [diffusion-based](https://arxiv.org/abs/2511.00203v1) adversarial examples to close the gap between the empirical and the population robust risk.

We fine-tune [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).

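Since the card declares `library_name: transformers`, the checkpoint can be loaded like any Llama-3 instruct model. The sketch below is our own illustration, not an official snippet from the authors: the repository id is an assumption taken from the `new_version` field above, and the helper names (`build_messages`, `generate`) are hypothetical; adjust them to the checkpoint you actually want to load.

```python
# Hypothetical usage sketch for this model card; the repository id below is
# an assumption (taken from the card's `new_version` field), not confirmed.
MODEL_ID = "ASSELab/Diffusion-Llama-3-8B-Instruct"


def build_messages(user_prompt: str) -> list[dict]:
    """Assemble a conversation in the Llama-3 instruct chat format."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]


def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a reply for a single user prompt."""
    # Imported lazily so the prompt helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # apply_chat_template inserts the Llama-3 special tokens and the
    # assistant header so the model continues as the assistant.
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the newly generated reply.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Loading the full 8B model requires the weights and sufficient GPU/CPU memory; the `build_messages` helper can be used on its own to inspect the chat format.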
#### This model does <u>**NOT**</u> use adversarial training! It is an ablation/baseline fine-tuned only on the diffusion data.

For further information, consult our paper []() or our repository: [https://github.com/ASSELab/DAT](https://github.com/ASSELab/DAT).

## Citation

```bibtex
@misc{,
  title={},
  author={},
  year={2026},
  eprint={},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}
```