---
license: mit
pipeline_tag: table-question-answering
---
This model is presented in the paper [Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning](https://huggingface.co/papers/2512.00181).
Authors: Mohamed Bouadi, Pratinav Seth, Aditya Tanna, Vinay Kumar Sankarapu
Project Page: https://www.lexsi.ai/
<div align="center">
<img src="logo.png" alt="Orion-BiX Logo" width="700"/>
</div>
<div align="center">
<a href="https://lexsi.ai/">
<img src="https://img.shields.io/badge/Lexsi-Homepage-FF6B6B?style=for-the-badge" alt="Homepage"/>
</a>
<a href="https://huggingface.co/Lexsi/Orion-BiX">
<img src="https://img.shields.io/badge/🤗%20Hugging%20Face-Lexsi%20AI-FFD21E?style=for-the-badge" alt="Hugging Face"/>
</a>
<a href="https://discord.gg/dSB62Q7A">
<img src="https://img.shields.io/badge/Discord-Join-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord"/>
</a>
<a href="https://github.com/Lexsi-Labs/Orion-BiX">
<img src="https://img.shields.io/badge/GitHub-Repository-181717?style=for-the-badge&logo=github&logoColor=white" alt="GitHub"/>
</a>
</div>
[![Python](https://img.shields.io/badge/Python-3.9--3.12-blue)](https://www.python.org/downloads/)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.2%2B-EE4C2C)](https://pytorch.org/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
# Orion-BiX: Bi-Axial Meta-Learning for Tabular In-Context Learning
**[Orion-BiX](https://arxiv.org/abs/2512.00181)** is an advanced tabular foundation model that combines **Bi-Axial Attention** with **Meta-Learning** capabilities for few-shot tabular classification. The model extends the TabICL architecture with alternating attention patterns and episode-based training, achieving state-of-the-art performance on domain-specific benchmarks such as healthcare and finance.
## 🏗️ Approach and Architecture
### Key Innovations
Orion-BiX introduces four key architectural innovations:
1. **Bi-Axial Attention**: Alternating attention patterns (Standard → Grouped → Hierarchical → Relational) that capture multi-scale feature interactions
2. **Meta-Learning**: Episode-based training with k-NN support selection for few-shot learning
3. **Configurable Architecture**: Flexible design supporting various attention mechanisms and training modes
4. **Production Ready**: Memory optimization, distributed training support, and scikit-learn interface
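To make the meta-learning idea concrete, the sketch below shows what k-NN support selection for an episode can look like. This is a simplified illustration in NumPy, not the actual Orion-BiX training code; the function name `knn_support_selection`, the Euclidean metric, and the toy shapes are all assumptions for exposition.

```python
import numpy as np

def knn_support_selection(X_query, X_pool, y_pool, k=5):
    """Select the k nearest pool rows per query row as in-context support.

    Illustrative sketch only: the real episode builder may use a
    different distance metric, sampling strategy, and batching.
    """
    # Pairwise Euclidean distances: (n_query, n_pool)
    d = np.linalg.norm(X_query[:, None, :] - X_pool[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest pool rows
    # Support set per query: (n_query, k, n_feat) and (n_query, k)
    return X_pool[idx], y_pool[idx]

# Toy episode: 2 query rows drawing support from a 10-row pool
rng = np.random.default_rng(0)
pool_X = rng.normal(size=(10, 4))
pool_y = rng.integers(0, 2, size=10)
sup_X, sup_y = knn_support_selection(rng.normal(size=(2, 4)), pool_X, pool_y, k=3)
print(sup_X.shape, sup_y.shape)  # (2, 3, 4) (2, 3)
```

Each query row thus carries its own small labeled context, which is what the ICL predictor conditions on at inference time.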
### Component Details
Orion-BiX follows a three-component architecture:
```
Input → Column Embedder (Set Transformer) → Bi-Axial Attention → ICL Predictor → Output
```
1. **Column Embedder**: Set Transformer (from TabICL) that learns the statistical distribution of each feature
2. **Bi-Axial Attention**: Replaces standard RowInteraction with alternating attention patterns:
- **Standard Cross-Feature Attention**: Direct attention between features
- **Grouped Feature Attention**: Attention within feature groups
- **Hierarchical Feature Attention**: Hierarchical feature patterns
- **Relational Feature Attention**: Full feature-to-feature attention
- **CLS Token Aggregation**: Multiple CLS tokens (default: 4) for feature summarization
3. **ICL Predictor (`tf_icl`)**: In-context learning module for few-shot prediction
Each `BiAxialAttentionBlock` applies four attention patterns in sequence:
```
Standard → Grouped → Hierarchical → Relational → CLS Aggregation
```
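The alternation above can be illustrated with a minimal NumPy sketch of axis-alternating attention over a `(rows, features, dim)` table. This is not the Orion-BiX implementation (which uses learned projections, multiple heads, and the grouped/hierarchical/relational variants); it only demonstrates the core idea of attending along one table axis, then the other.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Scaled dot-product self-attention over the second-to-last axis,
    with identity Q/K/V projections and a single head (sketch only)."""
    d = x.shape[-1]
    scores = x @ np.swapaxes(x, -1, -2) / np.sqrt(d)
    return softmax(scores, axis=-1) @ x

def bi_axial_block(table):
    """One illustrative bi-axial pass over a (rows, features, dim) tensor:
    attend across features within each row, then across rows within each
    feature column, and return to the original layout."""
    table = self_attention(table)         # attention along the feature axis
    table = np.swapaxes(table, 0, 1)      # (features, rows, dim)
    table = self_attention(table)         # attention along the row axis
    return np.swapaxes(table, 0, 1)       # back to (rows, features, dim)

x = np.random.default_rng(0).normal(size=(8, 5, 16))  # 8 rows, 5 features, dim 16
print(bi_axial_block(x).shape)  # (8, 5, 16)
```

The shape is preserved end to end, so such blocks can be stacked, with CLS-token aggregation applied after the final block.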
## Installation
### Prerequisites
- Python 3.9-3.12
- PyTorch 2.2+ (with CUDA support recommended)
- CUDA-capable GPU (recommended for training)
### From source
#### Option 1: From a local clone
```bash
cd orion-bix
pip install -e .
```
#### Option 2: From the Git Remote
```bash
pip install git+https://github.com/Lexsi-Labs/Orion-BiX.git
```
## Usage
Orion-BiX provides a scikit-learn compatible interface for easy integration:
```python
from orion_bix.sklearn import OrionBixClassifier
# Initialize the classifier
clf = OrionBixClassifier()
# Fit the model (prepares data transformations)
clf.fit(X_train, y_train)
# Make predictions
predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
```
## Preprocessing
Orion-BiX includes automatic preprocessing that handles:
1. **Categorical Encoding**: Automatically encodes categorical features using ordinal encoding
2. **Missing Value Imputation**: Handles missing values using median imputation for numerical features
3. **Feature Normalization**: Supports multiple normalization methods:
- `"none"`: No normalization
- `"power"`: Yeo-Johnson power transform
- `"quantile"`: Quantile transformation to normal distribution
- `"quantile_rtdl"`: RTDL-style quantile transform
- `"robust"`: Robust scaling using median and quantiles
4. **Outlier Handling**: Clips outliers beyond a specified Z-score threshold (default: 4.0)
5. **Feature Permutation**: Applies systematic feature shuffling for ensemble diversity:
- `"none"`: Original feature order
- `"shift"`: Circular shifting
- `"random"`: Random permutation
- `"latin"`: Latin square patterns (recommended)
The preprocessing is automatically applied during `fit()` and `predict()`, so no manual preprocessing is required.
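Although no manual preprocessing is required, the steps above are easy to picture. The NumPy sketch below mirrors three of them (median imputation, Z-score clipping at the default threshold of 4.0, and Latin-square feature permutations); the function names and details are illustrative assumptions, not the library's internals.

```python
import numpy as np

def median_impute(X):
    """Replace NaNs with the per-column median (cf. step 2 above)."""
    med = np.nanmedian(X, axis=0)
    return np.where(np.isnan(X), med, X)

def zscore_clip(X, threshold=4.0):
    """Clip values beyond `threshold` standard deviations (cf. step 4)."""
    mu, sd = X.mean(axis=0), X.std(axis=0) + 1e-12
    return np.clip(X, mu - threshold * sd, mu + threshold * sd)

def latin_permutations(n_features, n_members):
    """Latin-square-style feature orders for an ensemble (cf. step 5):
    member i sees the features cyclically shifted by i, so across
    n_features members every feature visits every position once."""
    base = np.arange(n_features)
    return [np.roll(base, -i) for i in range(n_members)]

X = np.array([[1.0, np.nan], [2.0, 100.0], [3.0, 0.5]])
X = zscore_clip(median_impute(X))
print(latin_permutations(4, 3))
# [array([0, 1, 2, 3]), array([1, 2, 3, 0]), array([2, 3, 0, 1])]
```

Cycling the feature order this way gives each ensemble member a systematically different view of the table, which is why the Latin pattern is recommended over purely random shuffles.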
## Performance
<div align="center">
<img src="figures/accuracy_ranking_talent.png" alt="Accuracy Ranking TALENT" width="700"/>
</div>
<div align="center">
<img src="figures/accuracy_ranking_tabzilla.png" alt="Accuracy Ranking TabZilla" width="700"/>
</div>
<div align="center">
<img src="figures/accuracy_ranking_openml-cc18.png" alt="Accuracy Ranking OPENML-CC18" width="700"/>
</div>
<div align="center">
<img src="figures/relative_acc_improvement_over_tabzilla.png" alt="Relative Improvement over XGBoost on TabZilla" width="700"/>
</div>
## Citation
If you use Orion-BiX in your research, please cite our [paper](https://arxiv.org/abs/2512.00181):
```bibtex
@article{bouadi2025orionbix,
  title={Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning},
  author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
  year={2025},
  eprint={2512.00181},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.00181},
}
```
## License
This project is released under the MIT License. See [LICENSE](LICENSE) for details.
## Contact
For questions, issues, or contributions, please:
- Open an issue on [GitHub](https://github.com/Lexsi-Labs/Orion-BiX/issues)
- Join our [Discord](https://discord.gg/dSB62Q7A) community
## 🙏 Acknowledgments
Orion-BiX is built on top of [TabICL](https://github.com/soda-inria/tabicl), a tabular foundation model for in-context learning. We gratefully acknowledge the TabICL authors for their foundational work and for making their codebase publicly available. |