---
license: mit
pipeline_tag: table-question-answering
---

This model is presented in the paper [Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning](https://huggingface.co/papers/2512.00181).
Authors: Mohamed Bouadi, Pratinav Seth, Aditya Tanna, Vinay Kumar Sankarapu
Project Page: https://www.lexsi.ai/

<div align="center">
  <img src="logo.png" alt="Orion-BiX Logo" width="700"/>
</div>

<div align="center">
  <a href="https://lexsi.ai/">
    <img src="https://img.shields.io/badge/Lexsi-Homepage-FF6B6B?style=for-the-badge" alt="Homepage"/>
  </a>
  <a href="https://huggingface.co/Lexsi/Orion-BiX">
    <img src="https://img.shields.io/badge/πŸ€—%20Hugging%20Face-Lexsi AI-FFD21E?style=for-the-badge" alt="Hugging Face"/>
  </a>
  <a href="https://discord.gg/dSB62Q7A">
    <img src="https://img.shields.io/badge/Discord-Join-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord"/>
  </a>
  <a href="https://github.com/Lexsi-Labs/Orion-BiX">
    <img src="https://img.shields.io/badge/GitHub-Repository-181717?style=for-the-badge&logo=github&logoColor=white" alt="GitHub"/>
  </a>
</div>

[![Python 3.9+](https://img.shields.io/badge/python-3.9+-blue.svg)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.2+-orange.svg)](https://pytorch.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

# Orion-BiX: Bi-Axial Meta-Learning for Tabular In-Context Learning

**[Orion-BiX](https://arxiv.org/abs/2512.00181)** is an advanced tabular foundation model that combines **Bi-Axial Attention** with **Meta-Learning** for few-shot tabular classification. The model extends the TabICL architecture with alternating attention patterns and episode-based training, achieving state-of-the-art performance on domain-specific benchmarks such as healthcare and finance.

## πŸ—οΈ Approach and Architecture

### Key Innovations

Orion-BiX introduces four key innovations:

1.  **Bi-Axial Attention**: Alternating attention patterns (Standard → Grouped → Hierarchical → Relational) that capture multi-scale feature interactions
2.  **Meta-Learning**: Episode-based training with k-NN support selection for few-shot learning (see the sketch below)
3.  **Configurable Architecture**: Flexible design supporting various attention mechanisms and training modes
4.  **Production Ready**: Memory optimization, distributed training support, and a scikit-learn interface
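
As an illustration of the meta-learning point above, the sketch below shows how a k-NN support set for a single query could be assembled. It is hypothetical: `build_episode` and its parameters are illustrative, not Orion-BiX's actual training API.

```python
# Hypothetical sketch of k-NN support selection for one episode; the
# function and its parameters are illustrative, not Orion-BiX's actual API.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_episode(X_pool, y_pool, x_query, k=16):
    """Pick the k nearest pooled rows as the support set for a single query."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_pool)
    _, idx = nn.kneighbors(x_query.reshape(1, -1))
    return X_pool[idx[0]], y_pool[idx[0]]   # support examples and labels

rng = np.random.default_rng(0)
X_pool = rng.normal(size=(200, 8))          # 200 pooled rows, 8 features
y_pool = rng.integers(0, 2, size=200)
support_X, support_y = build_episode(X_pool, y_pool, X_pool[0])
print(support_X.shape, support_y.shape)     # (16, 8) (16,)
```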

### Component Details

Orion-BiX follows a three-component architecture:

```
Input → Column Embedder (Set Transformer) → Bi-Axial Attention → ICL Predictor → Output
```

1.  **Column Embedder**: Set Transformer, inherited from TabICL, that learns statistical distributions across features
2.  **Bi-Axial Attention**: Replaces TabICL's standard `RowInteraction` with alternating attention patterns:
    -   **Standard Cross-Feature Attention**: Direct attention between features
    -   **Grouped Feature Attention**: Attention within feature groups
    -   **Hierarchical Feature Attention**: Attention over hierarchical feature patterns
    -   **Relational Feature Attention**: Full feature-to-feature attention
    -   **CLS Token Aggregation**: Multiple CLS tokens (default: 4) for feature summarization
3.  **ICL Predictor (`tf_icl`)**: In-context learning module for few-shot prediction

Each `BiAxialAttentionBlock` applies the four attention patterns in sequence, followed by CLS aggregation:

```
Standard → Grouped → Hierarchical → Relational → CLS Aggregation
```
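
To make the alternating pattern concrete, here is a minimal PyTorch sketch of one such block. It is an illustration only: the module names, dimensions, and residual wiring are assumptions, and the masking/grouping that distinguishes the Grouped, Hierarchical, and Relational patterns is omitted for brevity.

```python
# Toy sketch of a bi-axial block: four attention passes over feature
# tokens, then CLS pooling. Not the actual Orion-BiX implementation.
import torch
import torch.nn as nn

class BiAxialBlockSketch(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_cls=4):
        super().__init__()
        # One attention module per pattern; in the real model the patterns
        # differ in which feature tokens each pass may mix.
        self.patterns = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            for _ in range(4)  # standard, grouped, hierarchical, relational
        )
        self.cls_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cls_tokens = nn.Parameter(torch.randn(n_cls, d_model))

    def forward(self, x):                       # x: (batch, n_features, d_model)
        for attn in self.patterns:              # the four patterns, in sequence
            h, _ = attn(x, x, x)
            x = x + h                           # residual connection
        cls = self.cls_tokens.unsqueeze(0).expand(x.size(0), -1, -1)
        out, _ = self.cls_attn(cls, x, x)       # CLS tokens aggregate features
        return out                              # (batch, n_cls, d_model)

x = torch.randn(2, 10, 128)                     # 2 rows, 10 embedded features
print(BiAxialBlockSketch()(x).shape)            # torch.Size([2, 4, 128])
```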

## Installation

### Prerequisites

-   Python 3.9-3.12
-   PyTorch 2.2+ (with CUDA support recommended)
-   CUDA-capable GPU (recommended for training)

### From source

#### Option 1: From a local clone

```bash
git clone https://github.com/Lexsi-Labs/Orion-BiX.git
cd Orion-BiX
pip install -e .
```

#### Option 2: From the Git remote

```bash
pip install git+https://github.com/Lexsi-Labs/Orion-BiX.git
```

## Usage

Orion-BiX provides a scikit-learn compatible interface for easy integration:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

from orion_bix.sklearn import OrionBixClassifier

# Example data: any feature matrix with labels works here
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Initialize the classifier
clf = OrionBixClassifier()

# Fit the model (prepares data transformations)
clf.fit(X_train, y_train)

# Make predictions
predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
```

## Preprocessing

Orion-BiX includes automatic preprocessing that handles:

1.  **Categorical Encoding**: Automatically encodes categorical features using ordinal encoding
2.  **Missing Value Imputation**: Handles missing values using median imputation for numerical features
3.  **Feature Normalization**: Supports multiple normalization methods:
    -   `"none"`: No normalization
    -   `"power"`: Yeo-Johnson power transform
    -   `"quantile"`: Quantile transformation to normal distribution
    -   `"quantile_rtdl"`: RTDL-style quantile transform
    -   `"robust"`: Robust scaling using median and quantiles
4.  **Outlier Handling**: Clips outliers beyond a specified Z-score threshold (default: 4.0)
5.  **Feature Permutation**: Applies systematic feature shuffling for ensemble diversity:
    -   `"none"`: Original feature order
    -   `"shift"`: Circular shifting
    -   `"random"`: Random permutation
    -   `"latin"`: Latin square patterns (recommended)

The preprocessing is automatically applied during `fit()` and `predict()`, so no manual preprocessing is required.
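
For intuition, the standalone scikit-learn sketch below mirrors several of these steps: median imputation, the Yeo-Johnson `"power"` transform, ordinal encoding, Z-score clipping, and the `"shift"` permutation. Orion-BiX performs the equivalent internally; the column names and data here are illustrative.

```python
# Standalone sketch mirroring the automatic preprocessing; Orion-BiX
# runs the equivalent internally, so this is for illustration only.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder, PowerTransformer

X = pd.DataFrame({
    "age": [23.0, 41.0, 300.0],        # 300.0 is an outlier
    "score": [5.0, np.nan, 7.0],       # missing value to impute
    "city": ["paris", "mumbai", "paris"],
})
num_cols, cat_cols = ["age", "score"], ["city"]

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),      # median imputation
    ("norm", PowerTransformer(method="yeo-johnson")),  # the "power" option
])
categorical = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)

pre = ColumnTransformer([("num", numeric, num_cols), ("cat", categorical, cat_cols)])
Xt = pre.fit_transform(X)              # numeric columns first, then categorical

# Outlier handling: clip numeric columns beyond a |z| = 4.0 threshold
mu, sd = Xt[:, :2].mean(axis=0), Xt[:, :2].std(axis=0) + 1e-9
Xt[:, :2] = np.clip(Xt[:, :2], mu - 4.0 * sd, mu + 4.0 * sd)

# "shift" feature permutation: circular column shift for ensemble diversity
Xt_shifted = np.roll(Xt, shift=1, axis=1)
```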

## Performance

<div align="center">
  <img src="figures/accuracy_ranking_talent.png" alt="Accuracy Ranking TALENT" width="700"/>
</div>

<div align="center">
  <img src="figures/accuracy_ranking_tabzilla.png" alt="Accuracy Ranking TabZilla" width="700"/>
</div>

<div align="center">
  <img src="figures/accuracy_ranking_openml-cc18.png" alt="Accuracy Ranking OPENML-CC18" width="700"/>
</div>

<div align="center">
  <img src="figures/relative_acc_improvement_over_tabzilla.png" alt="Relative Improvement over XGBoost on TabZilla" width="700"/>
</div>

## Citation

If you use Orion-BiX in your research, please cite our [paper](https://arxiv.org/abs/2512.00181):

```bibtex
@article{bouadi2025orionbix,
      title={Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning}, 
      author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
      year={2025},
      eprint={2512.00181},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2512.00181}, 
}
```

## License

This project is released under the MIT License. See [LICENSE](LICENSE) for details.

## Contact

For questions, issues, or contributions, please:
- Open an issue on [GitHub](https://github.com/Lexsi-Labs/Orion-BiX/issues)
- Join our [Discord](https://discord.gg/dSB62Q7A) community


## 🙏 Acknowledgments

Orion-BiX is built on top of [TabICL](https://github.com/soda-inria/tabicl), a tabular foundation model for in-context learning. We gratefully acknowledge the TabICL authors for their foundational work and for making their codebase publicly available.