HKBU NLP Lab

university

AI & ML interests

None defined yet.

Recent Activity

danielhzlin submitted a paper 5 days ago

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

danielhzlin submitted a paper 11 days ago

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Ziyang updated a Space 4 months ago

HKBU-NLP/README

Papers

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

MM-CRITIC: A Holistic Evaluation of Large Multimodal Models as Multimodal Critique

danielhzlin submitted a paper to Daily Papers 5 days ago

From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms

Paper • 2605.06716 • Published 9 days ago • 5

danielhzlin submitted a paper to Daily Papers 11 days ago

Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling

Paper • 2604.23586 • Published 20 days ago • 3

Ziyang updated a Space 4 months ago

HKBU-NLP/README

submitted a paper to Daily Papers 4 months ago

Towards Comprehensive Stage-wise Benchmarking of Large Language Models in Fact-Checking

Paper • 2601.02669 • Published Jan 6 • 4

authored a paper 4 months ago

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Paper • 2601.03559 • Published Jan 7 • 14

submitted a paper to Daily Papers 4 months ago

DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Paper • 2601.03559 • Published Jan 7 • 14

authored 4 papers 6 months ago

ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use

Paper • 2504.07981 • Published Apr 4, 2025 • 5

AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness

Paper • 2507.01702 • Published Jul 2, 2025 • 4

FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models

Paper • 2502.17924 • Published Feb 25, 2025

AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries?

Paper • 2508.09631 • Published Aug 13, 2025

authored a paper 7 months ago

EvolProver: Advancing Automated Theorem Proving by Evolving Formalized Problems via Symmetry and Difficulty

Paper • 2510.00732 • Published Oct 1, 2025 • 6

authored a paper 9 months ago

MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers

Paper • 2508.14704 • Published Aug 20, 2025 • 43

authored 4 papers 10 months ago

Aria-UI: Visual Grounding for GUI Instructions

Paper • 2412.16256 • Published Dec 20, 2024 • 1

ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use

Paper • 2504.07981 • Published Apr 4, 2025 • 5

Mercury: Ultra-Fast Language Models Based on Diffusion

Paper • 2506.17298 • Published Jun 17, 2025 • 10

GTA1: GUI Test-time Scaling Agent

Paper • 2507.05791 • Published Jul 8, 2025 • 27

authored a paper about 1 year ago

ScratchEval: Are GPT-4o Smarter than My Child? Evaluating Large Multimodal Models with Visual Programming Challenges

Paper • 2411.18932 • Published Nov 28, 2024 • 1

in HKBU-NLP/GOAT-Bench about 1 year ago

Add task category

#3 opened about 1 year ago by