Spaces: Jaswanth-K / Inject-Arena

Commit History

Upload frontend/src/index.css with huggingface_hub

343e8dc
verified

Jaswanth-K committed on Apr 26

Upload frontend/src/App.jsx with huggingface_hub

b5b3db2
verified

Jaswanth-K committed on Apr 26

feat: instant demo mode, inline results after attack, 3-option launch modal

3191547

Jaswanth1210 Claude Opus 4.7 committed on Apr 26

feat: training results dashboard with stats cards, comparison table, and plots

474639f

Jaswanth1210 Claude Opus 4.7 committed on Apr 26

feat: launch mode modal with demo vs live options and time estimate

d882fcd

Jaswanth1210 Claude Opus 4.7 committed on Apr 26

feat: battlefield visualization — attacker/firewalls/agent with live reward graph

ff4fe4a

Jaswanth1210 Claude Opus 4.7 committed on Apr 26

feat: fill in real results, fix openenv.yaml Space URL

bc3c044

Jaswanth1210 Claude Opus 4.7 committed on Apr 26

feat: real continuous GRPO reward curve + 5-plot dashboard

723a54a

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 26

feat: launch-mode modal (recorded vs live) + reframed success card

32513fb

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: shorten HF short_description to <= 60 chars

c884702

Jaswanth1210 committed on Apr 25

fix: HF Space URLs point to Jaswanth-K/Inject-Arena (the actual Space)

46d858d

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

feat: wire frontend to real API, ship plots, multi-stage Docker build

5cceafb

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Phase 7: replay traces for HF Space demo

e60c2d2

colab committed on Apr 25

Phase 7: replay traces for HF Space demo

a4470c7

colab committed on Apr 25

feat: HF Space replay backend — trace store, /api endpoints, Docker, Cell 9

8c536e6

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

feat: 5 training plots (reward±std, KL/loss, completion stats) + Drive backup in Cell 8

ff63792

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: plots gitignore, trainer_state fallback for reward curve, Cell 8 --trainer-state arg

2fe5366

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: Cell 8 — pull before plots, GH_TOKEN push, copy Drive logs

19d8929

Jaswanth1210 committed on Apr 25

Phase 7/8: full README, video script, make_plots.py, Cell 8 implementation

5244e53

Jaswanth1210 Claude Opus 4.7 committed on Apr 25

feat: fill in Cell 7 for full training run (Phase 6)

f94c60c

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: use huggingface_hub.login() before LoRA download to fix 401

013046f

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: pre-download SecAlign LoRA before vLLM init to avoid 401 in subprocess

b5c4619

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: dtype kwarg (torch_dtype deprecated), vLLM max_model_len=4096

aaa7c61

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: max_completion_length 512→128, firewall circuit-breaker

b7d3a14

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: drop BnB 4-bit, load attacker in plain bf16

6aebb94

Jaswanth1210 Claude Opus 4.7 committed on Apr 25

fix: dtype kwarg + cast lm_head/embeds to bfloat16 to fix GRPO generate

33bf00a

Jaswanth1210 committed on Apr 25

fix: add torch_dtype=bfloat16 to prevent Float/BFloat16 mismatch in GRPO

b42adcc

Jaswanth1210 committed on Apr 25

fix: GRPO batch_size must be divisible by num_generations (1→4)

0d411fb

Jaswanth1210 committed on Apr 25

fix: skip Unsloth in GRPO trainer (grpo_accumulated_loss signature mismatch)

17a9ff7

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: stub GuidedDecodingParams for vLLM 0.19+ / TRL compatibility

deab900

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Phase 5: training pipeline — client, GRPO trainer, eval, baselines (23 handcrafted attacks)

550a83e

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Phase 4: InjectArenaEnv + FastAPI server + Dockerfile + env tests (81 passing)

b54a031

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: load SecAlign before PG2 in Cell 3 to avoid vLLM CUDA init conflict

3d180ef

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: add %cd /content/injectarena to cells 3+4 (lost after runtime restart)

6207128

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: FirewallWrapper falls back to pg2 instance when llamafirewall scanner fails

d089589

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: LlamaFirewall async handling + conditional AgentAlignment (TOGETHER_API_KEY)

90afc08

Jaswanth1210 committed on Apr 25

fix: SecAlign fallback uses transformers+4bit instead of vLLM (T4 CUDA init conflict)

730165b

Jaswanth1210 committed on Apr 25

fix: PG2 label check — model returns LABEL_0/LABEL_1, not MALICIOUS/BENIGN

4baabbe

Jaswanth1210 committed on Apr 25

Phase 3: defense wrappers + Colab smoke/benchmark cells

a9424d2

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Fix: widen requires-python to include 3.12

0c220a9

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Notebook: hard-code REPO_URL to Inject-Arena GitHub

1e979cd

Jaswanth1210 Claude Opus 4.7 committed on Apr 25

Phase 2: verifiers, embedding cache, reward function

c59510c

Jaswanth1210 Claude Opus 4.7 committed on Apr 24

Phase 1: schemas, safety filter, scenario bank

383f8a5

Jaswanth1210 Claude Opus 4.7 committed on Apr 24

Phase 0: bootstrap

15bf5e6

Jaswanth1210 Claude Opus 4.7 committed on Apr 24
