hfuserh/LLaMA-3.1-8B-JailbreakSafe
Text Generation
Transformers
Safetensors
allenai/wildjailbreak
English
llama
jailbreak
safety
alignment
prompt-injection
dpo
lora
large-language-models
arxiv:2406.18510
arxiv:2502.18935
License: llama3.1
main
Commit History
e62884c  add DPO Dataset Construction (verified), hfuserh committed on Feb 3
4ef129f  add DPO Dataset Construction (verified), hfuserh committed on Feb 3
83b7785  first commit (verified), hfuserh committed on Feb 2
711ecc0  first commit (verified), hfuserh committed on Feb 2
37f222c  Upload 7 files (verified), hfuserh committed on Feb 2
d18785b  first commit (verified), hfuserh committed on Feb 2
80c5e84  first commit (verified), hfuserh committed on Feb 2
2bd9dfb  initial commit (verified), hfuserh committed on Feb 2