Saurabh Jha

saurabhjha1

5 5

AI & ML interests

None yet

Recent Activity

upvoted an article about 1 month ago

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration

liked a dataset about 1 month ago

ibm-research/ITBench-Trajectories

liked a dataset about 1 month ago

ibm-research/ITBench-Lite

Organizations

upvoted an article about 1 month ago

Article

ScarfBench: Benchmarking AI Agents for Enterprise Java Framework Migration

ibm-research • Jun 30 • 26

liked 3 datasets about 1 month ago

upvoted 2 articles 2 months ago

Article

Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic

ibm-research • Jun 1 • 89

Article

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

ibm-research • May 27 • 18

published an article 2 months ago

Article

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

ibm-research • May 27 • 18

liked a Space 5 months ago

ITBench-Lite-Space

🚀

Develop and run interactive code notebooks with JupyterLab

upvoted an article 5 months ago

Article

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST

ibm-research • Feb 18 • 19

published an article 5 months ago

Article

IBM and UC Berkeley Diagnose Why Enterprise Agents Fail Using IT-Bench and MAST

ibm-research • Feb 18 • 19

liked a model over 1 year ago

ibm-granite/granite-3.0-8b-base

Text Generation • 8B • Updated Dec 19, 2024 • 5.51k • 26

upvoted a paper over 2 years ago

Larimar: Large Language Models with Episodic Memory Control

Paper • 2403.11901 • Published Mar 18, 2024 • 33

updated a model over 3 years ago

saurabhjha1/ppo-LunarLander-v2

Reinforcement Learning • Updated Jan 16, 2023
