Bhavya

bhavya24

AI & ML interests

None yet

Recent Activity

liked a Space about 4 hours ago

analogy-evaluation/Analogy-Evaluation-Challenge

liked a Space 4 days ago

ibm-research/cuga-apps

liked a dataset 6 days ago

ArtificialAnalysis/ITBench-AA

Organizations

liked a Space about 4 hours ago

Analogy Evaluation Challenge

🏆

Shared task from INLG Generation Challenges

liked a Space 4 days ago

CUGA Apps

🚀

Explore and manage your CUGA apps from a web dashboard

liked a dataset 6 days ago

ArtificialAnalysis/ITBench-AA

Viewer • Updated May 27 • 40 • 37.3k • 43

upvoted 2 articles 28 days ago

Article

ITBench-AA: Frontier Models Score Below 50% on the First Benchmark for Agentic Enterprise IT Tasks — by Artificial Analysis and IBM

ibm-research • May 27 • 17

Article

Beyond LLMs: Why Scalable Enterprise AI Adoption Depends on Agent Logic

ibm-research • 28 days ago • 88

New activity in ibm-research/ITBench-Trajectories 3 months ago

Task description defined twice in the input

#2 opened 3 months ago by kyzor

New activity in ibm-research/ITBench-Lite 5 months ago

activities

#1 opened 5 months ago by bhavya24
