Flax Community

non-profit

https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects

AI & ML interests

JAX, Flax, TPU, 🤗

Recent Activity

shahrukhx01 authored a paper 15 days ago

Transformers for molecular property prediction: Domain adaptation efficiently improves performance

gagan3012 authored a paper about 1 month ago

Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025

gagan3012 submitted a paper about 1 month ago

Who Annotates in NLP? A Large-scale Assessment of Human Annotation Reporting between 2018 and 2025

authored a paper 15 days ago

Transformers for molecular property prediction: Domain adaptation efficiently improves performance

Paper • 2503.03360 • Published Mar 5, 2025 • 1

authored a paper 18 days ago

Position: Hippocampal Explicit Memory Is the Cornerstone for AGI

Paper • 2606.11245 • Published 29 days ago

authored 14 papers about 1 month ago

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

Paper • 2402.11597 • Published Feb 18, 2024

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Paper • 2406.05761 • Published Jun 9, 2024 • 3

BenchHub: A Unified Benchmark Suite for Holistic and Customizable LLM Evaluation

Paper • 2506.00482 • Published May 31, 2025 • 8

From KMMLU-Redux to KMMLU-Pro: A Professional Korean Benchmark Suite for LLM Evaluation

Paper • 2507.08924 • Published Jul 11, 2025 • 18

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

Paper • 2510.04230 • Published Oct 5, 2025 • 27

Revisiting the Uniform Information Density Hypothesis in LLM Reasoning Traces

Paper • 2510.06953 • Published Oct 8, 2025 • 9

Ko-PIQA: A Korean Physical Commonsense Reasoning Dataset with Cultural Context

Paper • 2509.11303 • Published Sep 14, 2025

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Paper • 2601.06165 • Published Jan 7 • 16

Global PIQA: Evaluating Physical Commonsense Reasoning Across 100+ Languages and Cultures

Paper • 2510.24081 • Published Oct 28, 2025 • 24

Judging What We Cannot Solve: A Consequence-Based Approach for Oracle-Free Evaluation of Research-Level Math

Paper • 2602.06291 • Published Feb 6 • 24

KMMMU: Evaluation of Massive Multi-discipline Multimodal Understanding in Korean Language and Context

Paper • 2604.13058 • Published Mar 18 • 2

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published May 9 • 82

Self-Improving CAD Generation Agents with Finite Element Analysis as Feedback

Paper • 2605.17448 • Published May 17 • 19

ResearchMath-14K: Scaling Research-Level Mathematics via Agents

Paper • 2605.28003 • Published May 27 • 50

submitted 2 papers to Daily Papers about 1 month ago

ResearchMath-14K: Scaling Research-Level Mathematics via Agents

Paper • 2605.28003 • Published May 27 • 50

Chartographer: Counterfactual Chart Generation for Evaluating Vision-Language Models

Paper • 2605.27311 • Published May 26 • 3

submitted a paper to Daily Papers about 2 months ago

GLiNER-Relex: A Unified Framework for Joint Named Entity Recognition and Relation Extraction

Paper • 2605.10108 • Published May 11 • 1

submitted a paper to Daily Papers about 2 months ago

Soohak: A Mathematician-Curated Benchmark for Evaluating Research-level Math Capabilities of LLMs

Paper • 2605.09063 • Published May 9 • 82