Jordan Legg PRO
takarajordan
AI & ML interests
Chief AI Officer @takara.ai. Diffusion, inference optimisation, and all things multimodal.
Recent Activity
reacted to danielhanchen's post with 🔥 10 days ago
You can now do reinforcement learning training with 7× longer context and no accuracy loss, via our new batching algorithms.
Long reasoning chains in RL are costly, but now we enable you to train gpt-oss with GRPO & reach 380K context on a 192GB GPU.
Blog: https://unsloth.ai/docs/new/grpo-long-context