Jim Miller

ironhide

JimMiller-0

AI & ML interests

Security

Recent Activity

liked a model 2 months ago

openai/privacy-filter

liked a model 2 months ago

openai/gpt-oss-safeguard-20b

updated a model 4 months ago

ironhide/shieldgemma-awq-2b

Organizations

None yet

liked 2 models 2 months ago

openai/privacy-filter

Token Classification • 1B • Updated Apr 22 • 304k • 1.67k

openai/gpt-oss-safeguard-20b

Text Generation • 22B • Updated Jan 14 • 106k • 234

updated a model 4 months ago

ironhide/shieldgemma-awq-2b

Text Generation • 3B • Updated Mar 4 • 8 • 1

liked a model 4 months ago

ironhide/shieldgemma-awq-2b

Text Generation • 3B • Updated Mar 4 • 8 • 1

published a model 4 months ago

ironhide/shieldgemma-awq-2b

Text Generation • 3B • Updated Mar 4 • 8 • 1

upvoted 2 papers 5 months ago

Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails

Paper • 2504.11168 • Published Apr 15, 2025 • 3

Defeating Prompt Injections by Design

Paper • 2503.18813 • Published Mar 24, 2025 • 25

liked a model 8 months ago

google/vaultgemma-1b

Text Generation • 1B • Updated Sep 12, 2025 • 3.79k • 411

liked a model 10 months ago

rogue-security/prompt-injection-jailbreak-sentinel-v2

Text Classification • 0.6B • Updated Mar 11 • 13.2k • 36

liked 8 models about 1 year ago

nvidia/domain-classifier

0.2B • Updated Sep 22, 2025 • 6.66k • 98

garak-llm/toxic-comment-model

67M • Updated Aug 27, 2024 • 26.8k • 2

nvidia/Aegis-AI-Content-Safety-LlamaGuard-Defensive-1.0

Text Classification • Updated Sep 22, 2025 • 3.39k • 33

nvidia/Aegis-AI-Content-Safety-LlamaGuard-Permissive-1.0

Text Classification • Updated Sep 22, 2025 • 1.36k • 19

MerlynMind/merlyn-education-safety

Text Generation • Updated Jun 27, 2023 • 44 • 16

google/shieldgemma-2-4b-it

Image-Text-to-Text • 4B • Updated Apr 4, 2025 • 4.97k • 165

google/shieldgemma-27b

Text Generation • 27B • Updated Sep 6, 2024 • 135 • 29

meta-llama/Llama-Guard-4-12B

Image-Text-to-Text • 12B • Updated Apr 29, 2025 • 152k • 106

upvoted a collection about 1 year ago

ShieldGemma

ShieldGemma is a family of models for text and image content moderation. • 4 items • Updated Mar 12 • 15

liked 2 datasets about 1 year ago

rogue-security/prompt-injections-benchmark

Viewer • Updated Apr 2 • 5k • 1.26k • 37

facebook/cyberseceval3-visual-prompt-injection

Viewer • Updated Mar 13, 2025 • 1k • 2.03k • 9