hanoz bhathena
bh9052
AI & ML interests
None yet
Recent Activity
updated a collection about 5 hours ago: Post training
updated a collection 1 day ago: Post training
updated a collection 1 day ago: Post training
Organizations
None yet
CUA
- OpenComputer: Verifiable Software Worlds for Computer-Use Agents
  Paper • 2605.19769 • Published • 56
- MementoGUI: Learning Agentic Multimodal Memory Control for Long-Horizon GUI Agents
  Paper • 2605.18652 • Published • 7
- ToolCUA: Towards Optimal GUI-Tool Path Orchestration for Computer Use Agents
  Paper • 2605.12481 • Published • 27
Post training
- Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation
  Paper • 2603.19220 • Published • 69
- Not Every Rubric Teaches Equally: Policy-Aware Rubric Rewards for RLVR
  Paper • 2605.20164 • Published • 5
- GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment
  Paper • 2605.19577 • Published • 55
- EnvFactory: Scaling Tool-Use Agents via Executable Environments Synthesis and Robust RL
  Paper • 2605.18703 • Published • 46
Agent harness
Continual learning
- AutoResearchClaw: Self-Reinforcing Autonomous Research with Human-AI Collaboration
  Paper • 2605.20025 • Published • 112
- OpenComputer: Verifiable Software Worlds for Computer-Use Agents
  Paper • 2605.19769 • Published • 56
- WildClawBench: A Benchmark for Real-World, Long-Horizon Agent Evaluation
  Paper • 2605.10912 • Published • 45
- EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents
  Paper • 2605.13941 • Published • 24
Evaluation