Spaces: Jaswanth-K / Inject-Arena

Inject-Arena / train

Commit History

fix: dtype kwarg (torch_dtype deprecated), vLLM max_model_len=4096

aaa7c61

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: max_completion_length 512→128, firewall circuit-breaker

b7d3a14

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: drop BnB 4-bit, load attacker in plain bf16

6aebb94

Jaswanth1210 Claude Opus 4.7 committed on Apr 25

fix: dtype kwarg + cast lm_head/embeds to bfloat16 to fix GRPO generate

33bf00a

Jaswanth1210 committed on Apr 25

fix: add torch_dtype=bfloat16 to prevent Float/BFloat16 mismatch in GRPO

b42adcc

Jaswanth1210 committed on Apr 25

fix: GRPO batch_size must be divisible by num_generations (1→4)

0d411fb

Jaswanth1210 committed on Apr 25
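
The divisibility constraint named in this commit can be sketched in plain Python. This is only an illustration, not the trainer's actual check: the helper name `validate_grpo_batch` is hypothetical, and the values 1 and 4 are read from the commit title under the assumption that the batch size was raised from 1 to 4 with `num_generations` equal to 4. Which batch quantity the GRPO trainer actually validates may differ by library version.

```python
def validate_grpo_batch(batch_size: int, num_generations: int) -> int:
    """Return how many distinct prompts fit in one batch.

    Hypothetical helper: GRPO samples `num_generations` completions per
    prompt, so the batch must split evenly into groups of that size.
    """
    if num_generations < 1:
        raise ValueError("num_generations must be at least 1")
    if batch_size % num_generations != 0:
        raise ValueError(
            f"batch_size ({batch_size}) must be divisible by "
            f"num_generations ({num_generations})"
        )
    return batch_size // num_generations


# Assumed values for illustration: batch size 4 with 4 generations per prompt.
prompts_per_batch = validate_grpo_batch(4, 4)

# The assumed pre-fix configuration (batch size 1) fails the same check.
try:
    validate_grpo_batch(1, 4)
    old_config_ok = True
except ValueError:
    old_config_ok = False
```

Under these assumed values, a batch of 4 holds exactly one prompt group, while a batch of 1 cannot hold even one complete group.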

fix: skip Unsloth in GRPO trainer (grpo_accumulated_loss signature mismatch)

17a9ff7

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

fix: stub GuidedDecodingParams for vLLM 0.19+ / TRL compatibility

deab900

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Phase 5: training pipeline — client, GRPO trainer, eval, baselines (23 handcrafted attacks)

550a83e

Jaswanth1210 Claude Sonnet 4.6 committed on Apr 25

Phase 0: bootstrap

15bf5e6

Jaswanth1210 Claude Opus 4.7 committed on Apr 24