Papers
arxiv:2606.09004

LATTEArena: An Evaluation Framework for LLM-powered Tabular Feature Engineering (Extended Version)

Published on Jun 16
Authors:
,
,
,

Abstract

Latent feature engineering methods are systematically analyzed and benchmarked through a standardized framework that evaluates cost-effectiveness and deployment recommendations.

Feature engineering remains a cornerstone of tabular data analysis, and Large Language Models (LLMs) have emerged as a promising paradigm for its automation, giving rise to LLM-powered Automated Tabular Feature Engineering (LATTE). However, the field lacks standardized, cost-aware evaluation platforms, and the combinatorial explosion of design choices obscures true algorithmic progress. To bridge these gaps, we systematically deconstruct 15 representative LATTE methods into a unified 6-dimensional taxonomy. Based on this abstraction, we introduce LATTEArena, a standardized, modular, and extensible benchmarking framework that decouples monolithic pipelines into reusable execution blocks. By distilling the massive combinatorial space, we evaluate 24 core LATTE configurations across 7 research questions. Our head-to-head benchmarking goes beyond predictive accuracy to quantify token efficiency and execution robustness, yielding 17 empirical findings on cost-effectiveness trade-offs. Furthermore, we provide 3 concrete recommendations for optimal real-world deployment. By enabling controlled component-level comparisons, LATTEArena shifts the paradigm from ad-hoc prompt engineering to systematic context management. All code, datasets, and over 4,000 execution logs are publicly available to foster a dynamic, community-driven benchmark. Our framework, leaderboard, and all artifacts are hosted on the LATTEArena project website at https://goodenhak.github.io/LATTEArena.

Community

Sign up or log in to comment

Get this paper in your agent:

hf papers read 2606.09004
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2606.09004 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2606.09004 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2606.09004 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.