---
license: mit
---
# Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning
Orion-MSP is a tabular foundation model for in-context learning. It uses multi-scale sparse attention and Perceiver-style memory to process tabular data at multiple granularities, capturing both local feature interactions and global dataset-level patterns.
Orion-MSP can be used either directly via its own Python package or through [TabTune](https://github.com/Lexsi-Labs/TabTune), which provides a unified interface over several tabular foundation models.
## Key Features
- **Multi-Scale Sparse Attention:** Processes features at three levels (scales 1, 4, 16) using windowed, global, and random attention patterns, reducing quadratic complexity to near-linear.
- **Hierarchical Feature Understanding:** Captures patterns from individual cells to feature groups through scale-aware attention.
- **Perceiver-Style Memory:** Cross-component memory that compresses dataset information for efficient processing across samples.
- **Memory-Efficient:** Block-sparse masking enables efficient processing of large tabular datasets.
- **Scikit-learn Compatible:** Drop-in replacement with `.fit()` and `.predict()` methods.
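To make the sparse-attention bullet concrete, here is a minimal NumPy sketch (not the actual Orion-MSP implementation) of how windowed, global, and random patterns can be combined into one boolean attention mask whose per-row cost stays roughly constant, so total work grows near-linearly with sequence length:

```python
import numpy as np

def sparse_attention_mask(n, window=4, n_global=2, n_random=2, seed=0):
    """Toy block-sparse mask: mask[i, j] is True if position i attends to j.

    Combines the three patterns described above (illustrative only):
      - windowed: each position attends to its local neighborhood,
      - global:   a few designated tokens attend to / are attended by all,
      - random:   a handful of random long-range links per row.
    """
    rng = np.random.default_rng(seed)
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        mask[i, lo:hi] = True                         # local window (includes self)
        mask[i, rng.choice(n, size=n_random)] = True  # random long-range links
    mask[:n_global, :] = True                         # global tokens see everything
    mask[:, :n_global] = True                         # everything sees global tokens
    return mask

mask = sparse_attention_mask(64)
density = mask.mean()  # fraction of attended pairs, well below the dense 1.0
```

Each row attends to roughly `2 * window + 1 + n_random + n_global` positions regardless of `n`, which is the sense in which such masking reduces quadratic attention cost to near-linear.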
## Architecture
Orion-MSP consists of four main components:
- **Column-wise Embedding:** Distribution-aware feature embeddings using Induced Set Attention Blocks (ISAB)
- **Multi-Scale Row Interaction:** Sparse attention with windowed, global, and random patterns across multiple scales
- **Cross-Component Memory:** Perceiver-style memory for efficient dataset-level context
- **Dataset-wise ICL:** Enhanced predictor leveraging enriched representations for few-shot tabular classification
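The multiple granularities mentioned above (scales 1, 4, 16) can be pictured as coarsened views of the feature-token sequence. The following is a hypothetical mean-pooling sketch, not the model's actual operator: each coarser view averages contiguous groups of `s` feature tokens, zero-padding the tail so the length divides evenly.

```python
import numpy as np

def multi_scale_views(tokens, scales=(1, 4, 16)):
    """Mean-pool a (n_features, d) token matrix at several granularities.

    Illustrative only: Orion-MSP attends over representations at scales
    1, 4, and 16; here the view at scale s averages each contiguous block
    of s feature tokens (zero-padding the tail to a multiple of s).
    """
    n, d = tokens.shape
    views = {}
    for s in scales:
        pad = (-n) % s                        # pad so n is divisible by s
        x = np.pad(tokens, ((0, pad), (0, 0)))
        views[s] = x.reshape(-1, s, d).mean(axis=1)
    return views

tokens = np.random.default_rng(0).normal(size=(20, 8))
views = multi_scale_views(tokens)
# views[1] keeps all 20 tokens, views[4] has 5 rows, views[16] has 2
```

Scale 1 preserves individual cells, while scales 4 and 16 expose feature-group and near-dataset-level structure, which is the hierarchy the row-interaction component attends over.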
## Performance
Performance comparison across three benchmark suites: TALENT, OpenML-CC18, and TabZilla. Ranks are mean ranks based on accuracy (lower is better). Metrics: ACC = Accuracy, F1 = Weighted F1.
| Models | All Rank | TALENT Rank | TALENT ACC | TALENT F1 | OpenML-CC18 Rank | OpenML-CC18 ACC | OpenML-CC18 F1 | TabZilla Rank | TabZilla ACC | TabZilla F1 |
|---|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 6.70 | 6.02 | 0.8403 | 0.8360 | 5.89 | 0.8558 | 0.8537 | 6.07 | 0.8612 | 0.8326 |
| CatBoost | 6.43 | 5.57 | 0.8336 | 0.8259 | 6.25 | 0.8588 | 0.8520 | 7.13 | 0.8579 | 0.8384 |
| Random Forest | 7.38 | 6.15 | 0.8285 | 0.8209 | 6.36 | 0.8547 | 0.8497 | 8.42 | 0.8358 | 0.8399 |
| LightGBM | 6.78 | 6.11 | 0.8331 | 0.8245 | 6.18 | 0.8581 | 0.8493 | 5.25 | 0.8618 | 0.8211 |
| TabICL | 4.96 | 4.09 | 0.8471 | 0.8379 | 4.69 | 0.8667 | 0.8623 | 5.89 | 0.8734 | 0.8698 |
| OrionBiX | 5.37 | 4.59 | 0.8346 | 0.8260 | 4.98 | 0.8653 | 0.8596 | 4.89 | 0.8728 | 0.8628 |
| OrionMSP | 3.58 | 3.26 | 0.8461 | 0.8360 | 4.12 | 0.8722 | 0.8676 | 3.84 | 0.8821 | 0.8786 |
| TabPFN | 4.61 | 3.72 | 0.8514 | 0.8412 | 4.76 | 0.8714 | 0.8663 | 4.86 | 0.8752 | 0.8716 |
| Mitra | 11.77 | 10.38 | 0.3921 | 0.2868 | 10.52 | 0.3614 | 0.2522 | 11.21 | 0.3152 | 0.1830 |
| ContextTab | 9.70 | 9.84 | 0.5474 | 0.4596 | 6.28 | 0.8639 | 0.8581 | 7.13 | 0.8389 | 0.8334 |
| TabDPT | 5.42 | 5.19 | 0.8408 | 0.8318 | 4.64 | 0.8672 | 0.8625 | 3.94 | 0.8814 | 0.8775 |
Orion-MSP is the most consistent top performer across all three benchmarks, achieving the best overall rank.
- On TALENT, Orion-MSP ranks 1st overall, while TabPFN edges out the highest ACC/F1 by a hair.
- On OpenML-CC18, Orion-MSP attains the top ACC/F1 (0.8722/0.8676), narrowly ahead of TabPFN and TabDPT.
- On TabZilla, it leads with the highest ACC/F1 and the best rank.
- Classical baselines (XGBoost/LightGBM/CatBoost/RF) trail noticeably, highlighting Orion-MSP’s robustness across diverse tabular tasks.
Performance variation by dataset size across all benchmark suites. Rank = mean rank by accuracy (lower is better).
ACC = Accuracy; F1 = Weighted F1. Size buckets: Small (<1K), Medium (1K–10K), Large (>10K).
| Models | Small Rank | Small ACC | Small F1 | Medium Rank | Medium ACC | Medium F1 | Large Rank | Large ACC | Large F1 |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 7.70 | 0.8168 | 0.7964 | 6.88 | 0.8363 | 0.8314 | 5.41 | 0.8969 | 0.8920 |
| CatBoost | 7.88 | 0.8124 | 0.7935 | 6.47 | 0.8340 | 0.8264 | 5.48 | 0.8797 | 0.8733 |
| Random Forest | 8.55 | 0.7988 | 0.8187 | 7.16 | 0.8285 | 0.8221 | 7.30 | 0.8694 | 0.8628 |
| LightGBM | 7.80 | 0.8143 | 0.7789 | 6.94 | 0.8314 | 0.8226 | 5.63 | 0.8827 | 0.8764 |
| TabICL | 6.04 | 0.8301 | 0.8338 | 4.77 | 0.8486 | 0.8398 | 4.61 | 0.8802 | 0.8743 |
| OrionBiX | 6.32 | 0.8330 | 0.8150 | 5.48 | 0.8348 | 0.8260 | 4.42 | 0.8729 | 0.8670 |
| OrionMSP | 5.93 | 0.8232 | 0.8194 | 3.70 | 0.8494 | 0.8402 | 3.04 | 0.8843 | 0.8768 |
| TabPFN | 6.50 | 0.8325 | 0.8131 | 3.81 | 0.8557 | 0.8462 | 5.73 | 0.8783 | 0.8713 |
| Mitra | 13.88 | 0.4334 | 0.3236 | 11.59 | 0.3600 | 0.2553 | 11.11 | 0.3837 | 0.2754 |
| ContextTab | 9.60 | 0.7578 | 0.7363 | 9.52 | 0.6210 | 0.5566 | 10.22 | 0.6388 | 0.5638 |
| TabDPT | 5.48 | 0.8333 | 0.8271 | 5.40 | 0.8424 | 0.8339 | 5.26 | 0.8831 | 0.8765 |
Orion-MSP is the most consistent top-ranked model as data grows (especially on Medium and Large datasets), while TabPFN peaks on Medium and GBDTs (e.g., XGBoost) catch up in raw ACC/F1 on Large.
Performance vs. feature dimensionality. Rank = mean accuracy rank (lower is better). ACC = Accuracy; F1 = Weighted F1. Groups: Narrow (<10), Medium (10–100), Wide (>100).
| Models | Narrow Rank | Narrow ACC | Narrow F1 | Medium Rank | Medium ACC | Medium F1 | Wide Rank | Wide ACC | Wide F1 |
|---|---|---|---|---|---|---|---|---|---|
| XGBoost | 6.77 | 0.8222 | 0.8159 | 6.90 | 0.8482 | 0.8410 | 4.79 | 0.9140 | 0.9039 |
| CatBoost | 5.63 | 0.8145 | 0.8067 | 6.88 | 0.8441 | 0.8344 | 5.50 | 0.9157 | 0.9084 |
| Random Forest | 7.15 | 0.8005 | 0.7044 | 7.44 | 0.8410 | 0.8235 | 7.52 | 0.9034 | 0.8936 |
| LightGBM | 6.15 | 0.8128 | 0.7907 | 6.92 | 0.8458 | 0.8326 | 7.47 | 0.8999 | 0.8908 |
| TabICL | 5.14 | 0.8208 | 0.8119 | 4.61 | 0.8627 | 0.8549 | 6.46 | 0.9101 | 0.8936 |
| OrionBiX | 4.64 | 0.8112 | 0.8043 | 5.46 | 0.8510 | 0.8417 | 6.73 | 0.8859 | 0.8849 |
| OrionMSP | 3.76 | 0.8394 | 0.8314 | 4.09 | 0.8572 | 0.8478 | 5.69 | 0.8860 | 0.8837 |
| TabPFN | 5.30 | 0.8187 | 0.8092 | 4.07 | 0.8676 | 0.8589 | 6.141 | 0.9129 | 0.9111 |
| Mitra | 11.25 | 0.3737 | 0.2683 | 11.84 | 0.3886 | 0.2781 | 13.03 | 0.2521 | 0.1497 |
| ContextTab | 9.52 | 0.6391 | 0.5719 | 9.59 | 0.6480 | 0.5843 | 10.97 | 0.6017 | 0.5651 |
| TabDPT | 4.66 | 0.8262 | 0.8189 | 5.45 | 0.8566 | 0.8483 | 7.23 | 0.8845 | 0.8820 |
Orion-MSP leads on narrow feature spaces and stays strong at medium width, while TabPFN narrowly edges ahead on medium-width features and GBDTs (XGBoost/CatBoost) shine on wide feature spaces.
## Usage
### Direct (OrionMSP Python package)
```python
from orion_msp.sklearn import OrionMSPClassifier
# Initialize and use
clf = OrionMSPClassifier()
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
```
This code will automatically download the pre-trained model from Hugging Face and use a GPU if available.
### Via TabTune (unified TFM library)
```python
from tabtune import TabularPipeline
pipeline = TabularPipeline(
    model_name="OrionMSP",         # use OrionMSP through TabTune
    tuning_strategy="inference",   # zero-shot / in-context mode
    tuning_params={"device": "cuda"}  # or "cpu"
)
pipeline.fit(X_train, y_train)
predictions = pipeline.predict(X_test)
```
When used through TabTune, the OrionMSP weights are automatically downloaded from this Hugging Face repository on first use, and TabTune handles model-aware preprocessing for you.
## Installation
### Via TabTune (recommended if you want multiple tabular FMs)
```bash
pip install tabtune
```
This installs TabTune and its built-in OrionMSP integration; no separate orion-msp install is required.
### From the OrionMSP source
#### Option 1: From the local clone
```bash
cd orion-msp
pip install -e .
```
#### Option 2: From the Git Remote
```bash
pip install git+https://github.com/Lexsi-Labs/Orion-MSP.git
```
## Citation
If you use Orion-MSP, please cite our [paper](https://arxiv.org/abs/2511.02818):
```bibtex
@article{bouadi25orionmsp,
  title={Orion-MSP: Multi-Scale Sparse Attention for Tabular In-Context Learning},
  author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
  year={2025},
  eprint={2511.02818},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2511.02818},
}
```