Charles Paul

plywood1643

AI & ML interests

None yet

Organizations

None yet

upvoted 20 papers about 1 month ago

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

Paper • 2604.23781 • Published Apr 26 • 33

SketchVLM: Vision language models can annotate images to explain thoughts and guide users

Paper • 2604.22875 • Published Apr 23 • 35

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published Apr 27 • 71

Why Fine-Tuning Encourages Hallucinations and How to Fix It

Paper • 2604.15574 • Published Apr 16 • 25

Rewarding the Scientific Process: Process-Level Reward Modeling for Agentic Data Analysis

Paper • 2604.24198 • Published Apr 27 • 22

UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models

Paper • 2604.17565 • Published Apr 19 • 10

Sapiens2

Paper • 2604.21681 • Published Apr 23 • 19

For-Value: Efficient Forward-Only Data Valuation for finetuning LLMs and VLMs

Paper • 2508.10180 • Published Apr 25 • 18

Efficient Agent Evaluation via Diversity-Guided User Simulation

Paper • 2604.21480 • Published Apr 23 • 15

Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment

Paper • 2604.19548 • Published Apr 21 • 16

Zero-to-CAD: Agentic Synthesis of Interpretable CAD Programs at Million-Scale Without Real Data

Paper • 2604.24479 • Published Apr 27 • 9

PageGuide: Browser extension to assist users in navigating a webpage and locating information

Paper • 2604.23772 • Published Apr 26 • 7

Learning to Identify Out-of-Distribution Objects for 3D LiDAR Anomaly Segmentation

Paper • 2604.23604 • Published Apr 26 • 6

ATTN-FIQA: Interpretable Attention-based Face Image Quality Assessment with Vision Transformers

Paper • 2604.22841 • Published Apr 21 • 5

RaV-IDP: A Reconstruction-as-Validation Framework for Faithful Intelligent Document Processing

Paper • 2604.23644 • Published Apr 26 • 5

Discovering Agentic Safety Specifications from 1-Bit Danger Signals

Paper • 2604.23210 • Published Apr 25 • 4

EX-FIQA: Leveraging Intermediate Early eXit Representations from Vision Transformers for Face Image Quality Assessment

Paper • 2604.22842 • Published Apr 21 • 3

Towards Understanding the Robustness of Sparse Autoencoders

Paper • 2604.18756 • Published Apr 20 • 10

BARRED: Synthetic Training of Custom Policy Guardrails via Asymmetric Debate

Paper • 2604.25203 • Published Apr 28 • 8

MAIC-UI: Making Interactive Courseware with Generative UI

Paper • 2604.25806 • Published Apr 28 • 8