---
license: mit
pipeline_tag: table-question-answering
---

This model is presented in the paper [Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning](https://arxiv.org/abs/2512.00181).

Authors: Mohamed Bouadi, Pratinav Seth, Aditya Tanna, Vinay Kumar Sankarapu

Project page: https://www.lexsi.ai/


# Orion-BiX: Bi-Axial Meta-Learning for Tabular In-Context Learning

Orion-BiX is a tabular foundation model that combines bi-axial attention with meta-learning for few-shot tabular classification. The model extends the TabICL architecture with alternating attention patterns and episode-based training, achieving state-of-the-art performance on domain-specific benchmarks such as healthcare and finance.

πŸ—οΈ Approach and Architecture

### Key Innovations

Orion-BiX introduces four key innovations:

1. **Bi-Axial Attention**: Alternating attention patterns (Standard → Grouped → Hierarchical → Relational) that capture multi-scale feature interactions
2. **Meta-Learning**: Episode-based training with k-NN support selection for few-shot learning (see the sketch after this list)
3. **Configurable Architecture**: Flexible design supporting various attention mechanisms and training modes
4. **Production Ready**: Memory optimization, distributed training support, and a scikit-learn interface
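To make the meta-learning idea concrete, the sketch below builds a few-shot episode by selecting the k training rows nearest to a query as its in-context support set. All names here (`build_episode`, `k`) are illustrative assumptions, not the actual Orion-BiX training code.

```python
# Illustrative sketch of k-NN support selection for episode-based meta-learning.
# Not the actual Orion-BiX API; names and defaults are assumptions.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def build_episode(X_train, y_train, x_query, k=32):
    """Select the k training rows closest to the query as the support set."""
    nn = NearestNeighbors(n_neighbors=k).fit(X_train)
    _, idx = nn.kneighbors(x_query.reshape(1, -1))
    support_X = X_train[idx[0]]   # (k, n_features) in-context examples
    support_y = y_train[idx[0]]   # (k,) in-context labels
    return support_X, support_y
```

The in-context learning predictor then conditions on `(support_X, support_y)` to classify `x_query`, so each prediction gets a support set tailored to that query.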

### Component Details

Orion-BiX follows a three-component architecture:

```
Input → Column Embedder (Set Transformer) → Bi-Axial Attention → ICL Predictor → Output
```

1. **Column Embedder**: Set Transformer from TabICL that learns statistical distributions across features
2. **Bi-Axial Attention**: Replaces TabICL's standard `RowInteraction` module with alternating attention patterns:
   - **Standard Cross-Feature Attention**: Direct attention between features
   - **Grouped Feature Attention**: Attention within feature groups
   - **Hierarchical Feature Attention**: Hierarchical feature patterns
   - **Relational Feature Attention**: Full feature-to-feature attention
   - **CLS Token Aggregation**: Multiple CLS tokens (default: 4) for feature summarization
3. **ICL Predictor (`tf_icl`)**: In-context learning module for few-shot prediction
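A minimal sketch of how the three components compose (module names and shapes are assumptions for illustration, not the released code):

```python
# Illustrative composition of the three components; not the released code.
import torch.nn as nn

class OrionBiXPipelineSketch(nn.Module):
    def __init__(self, col_embedder, biaxial_blocks, icl_predictor):
        super().__init__()
        self.col_embedder = col_embedder      # Set Transformer over columns (from TabICL)
        self.biaxial_blocks = biaxial_blocks  # stack of bi-axial attention blocks
        self.icl_predictor = icl_predictor    # tf_icl in-context learning head

    def forward(self, x, support_labels):
        h = self.col_embedder(x)              # (rows, features, d_model) embeddings
        for block in self.biaxial_blocks:
            h = block(h)                      # alternating attention patterns
        return self.icl_predictor(h, support_labels)  # few-shot prediction
```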

Each `BiAxialAttentionBlock` applies four attention patterns in sequence:

```
Standard → Grouped → Hierarchical → Relational → CLS Aggregation
```
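For intuition, here is a heavily simplified PyTorch sketch of one such block. Layer names, the grouping scheme, and the treatment of the hierarchical and relational patterns are assumptions for illustration (the latter two collapse to full self-attention here), not the actual implementation.

```python
# Heavily simplified BiAxialAttentionBlock sketch; not the released implementation.
import torch
import torch.nn as nn

class BiAxialAttentionBlockSketch(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_cls_tokens=4, n_groups=4):
        super().__init__()
        self.n_groups = n_groups
        # One attention module per pattern: Standard, Grouped, Hierarchical, Relational
        self.attns = nn.ModuleList(
            nn.MultiheadAttention(d_model, n_heads, batch_first=True) for _ in range(4)
        )
        self.cls = nn.Parameter(torch.randn(n_cls_tokens, d_model))
        self.cls_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, h):  # h: (batch, n_features, d_model); n_features % n_groups == 0
        for i, attn in enumerate(self.attns):
            if i == 1:  # Grouped: restrict attention to within feature groups
                b, f, d = h.shape
                g = h.reshape(b * self.n_groups, f // self.n_groups, d)
                g, _ = attn(g, g, g)
                h = h + g.reshape(b, f, d)    # residual connection
            else:       # Standard / Hierarchical / Relational (simplified to full attention)
                out, _ = attn(h, h, h)
                h = h + out
        # CLS aggregation: learned tokens attend over the feature sequence
        cls = self.cls.unsqueeze(0).expand(h.size(0), -1, -1)
        summary, _ = self.cls_attn(cls, h, h)  # (batch, n_cls_tokens, d_model)
        return h
```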

## Installation

### Prerequisites

- Python 3.9-3.12
- PyTorch 2.2+ (with CUDA support recommended)
- CUDA-capable GPU (recommended for training)

### From Source

**Option 1: from a local clone**

```bash
cd orion-bix
pip install -e .
```

**Option 2: from the Git remote**

```bash
pip install git+https://github.com/Lexsi-Labs/Orion-BiX.git
```

## Usage

Orion-BiX provides a scikit-learn compatible interface for easy integration:

```python
from orion_bix.sklearn import OrionBixClassifier

# Initialize the classifier
clf = OrionBixClassifier()

# Fit the model (prepares data transformations)
clf.fit(X_train, y_train)

# Make predictions
predictions = clf.predict(X_test)
probabilities = clf.predict_proba(X_test)
```
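Because the interface follows scikit-learn conventions, the classifier should drop into standard scikit-learn workflows. A small end-to-end sketch on a public dataset (assuming full estimator compliance):

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

from orion_bix.sklearn import OrionBixClassifier

# Binary classification on a small public dataset
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = OrionBixClassifier()
clf.fit(X_train, y_train)
print("test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```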

## Preprocessing

Orion-BiX includes automatic preprocessing that handles:

1. **Categorical Encoding**: Automatically encodes categorical features using ordinal encoding
2. **Missing Value Imputation**: Handles missing values using median imputation for numerical features
3. **Feature Normalization**: Supports multiple normalization methods:
   - `"none"`: No normalization
   - `"power"`: Yeo-Johnson power transform
   - `"quantile"`: Quantile transformation to a normal distribution
   - `"quantile_rtdl"`: RTDL-style quantile transform
   - `"robust"`: Robust scaling using median and quantiles
4. **Outlier Handling**: Clips outliers beyond a specified Z-score threshold (default: 4.0)
5. **Feature Permutation**: Applies systematic feature shuffling for ensemble diversity:
   - `"none"`: Original feature order
   - `"shift"`: Circular shifting
   - `"random"`: Random permutation
   - `"latin"`: Latin square patterns (recommended)

The preprocessing is automatically applied during `fit()` and `predict()`, so no manual preprocessing is required.
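For intuition, most of the normalization options above have close scikit-learn analogues. The mapping below is an approximation for illustration, not Orion-BiX's internal implementation (the `quantile_rtdl` variant has no one-line analogue and is omitted):

```python
# Approximate scikit-learn analogues of the normalization options above,
# plus Z-score outlier clipping; not Orion-BiX's internal implementation.
import numpy as np
from sklearn.preprocessing import PowerTransformer, QuantileTransformer, RobustScaler

NORMALIZERS = {
    "none": None,
    "power": PowerTransformer(method="yeo-johnson"),
    "quantile": QuantileTransformer(output_distribution="normal"),
    "robust": RobustScaler(),  # median/IQR scaling
}

def normalize_and_clip(X, method="power", z_threshold=4.0):
    """Normalize numerical features, then clip outliers beyond the Z-score threshold."""
    tf = NORMALIZERS[method]
    X = tf.fit_transform(X) if tf is not None else np.asarray(X, dtype=float)
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
    return np.clip(X, mu - z_threshold * sigma, mu + z_threshold * sigma)
```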

## Performance

Figures: accuracy rankings on the TALENT, TabZilla, and OpenML-CC18 benchmarks, and relative improvement over XGBoost on TabZilla.

## Citation

If you use Orion-BiX in your research, please cite our paper:

```bibtex
@article{bouadi2025orionbix,
  title={Orion-Bix: Bi-Axial Attention for Tabular In-Context Learning},
  author={Mohamed Bouadi and Pratinav Seth and Aditya Tanna and Vinay Kumar Sankarapu},
  year={2025},
  eprint={2512.00181},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2512.00181},
}
```

## License

This project is released under the MIT License. See LICENSE for details.

## Contact

For questions, issues, or contributions, please open an issue or pull request on the [GitHub repository](https://github.com/Lexsi-Labs/Orion-BiX).

πŸ™ Acknowledgments

Orion-BiX is built on top of TabICL, a tabular foundation model for in-context learning. We gratefully acknowledge the TabICL authors for their foundational work and for making their codebase publicly available.