xuange committed on
Commit be7b2b4 · verified · 1 Parent(s): d232ecb

Update README.md

Files changed (1)
  1. README.md +115 -95
README.md CHANGED
@@ -2,10 +2,11 @@
2
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/LimiX-Logo.png" alt="LimiX summary" width="89%">
3
  </div>
4
 
5
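- # 💥 News
6
- - 2025-11-10: LimiX-2M is officially released! Compared with LimiX-16M, this version significantly reduces GPU memory usage while delivering faster inference. LimiX's sample retrieval mechanism has also been optimized, further improving model performance and reducing inference time and GPU memory consumption in retrieval mode.
7
- - 2025-08-29: LimiX V1.0 released
8
- # ⚡ Latest Evaluation Results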
 
9
  <div align="center">
10
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/BCCO-CLS.png" width="30%">
11
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/TabArena-CLS.png" width="30%">
@@ -17,64 +18,72 @@
17
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/CTR23-REG.png" width="30%">
18
  </div>
19
 
20
- # ➤ Overview
 
21
  <div align="center">
22
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/LimiX_Summary.png" alt="LimiX summary" width="89%">
23
  </div>
24
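- We introduce LimiX (极数), the first model in the LDM series. LimiX aims to further improve generality: within a unified training and inference framework, it handles classification, regression, missing value imputation, feature selection, sample selection, and causal inference, moving tabular learning from bespoke pipelines toward the foundation-model paradigm.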
 
 
25
 
26
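- LimiX is built on a Transformer architecture optimized for structured-data modeling and task generalization. The model first maps the features 𝑋 and targets 𝑌 from the prior knowledge base into token representations. In the core module, attention operates along both the sample dimension and the feature dimension to capture salient patterns in key samples and features. The resulting high-dimensional representations are then fed into regression and classification modules, which support outputs for multiple tasks.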
27
 
28
- For details, see the technical report: [LimiX:Unleashing Structured-Data Modeling Capability for Generalist Intelligence](https://arxiv.org/abs/2509.03505) / [LimiX_Technical_Report.pdf](https://github.com/limix-ldm/LimiX/blob/main/LimiX_Technical_Report.pdf)
 
29
 
30
- # Comparative Evaluation
31
- The LimiX model achieves state-of-the-art performance across multiple tasks.
32
- ## ➩ Classification
33
  <div align="center">
34
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/Classifier.png" alt="Classification" width="80%">
35
  </div>
36
 
37
- ## ➩ Regression
38
  <div align="center">
39
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/Regression.png" alt="Regression" width="60%">
40
  </div>
41
 
42
- ## ➩ Missing Value Imputation
43
  <div align="center">
44
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/MissingValueImputation.png" alt="Missing value imputation" width="80%">
45
  </div>
46
 
47
-
48
- # Tutorials
49
- ## Installation
50
- ### Option 1: Use the Dockerfile (recommended)
51
- Download the [Dockerfile](https://github.com/limix-ldm/LimiX/blob/main/Dockerfile)
52
  ```bash
53
  docker build --network=host -t limix/infe:v1 --build-arg FROM_IMAGES=nvidia/cuda:12.2.0-base-ubuntu22.04 -f Dockerfile .
54
  ```
55
- ### Option 2: Build the environment manually
56
- Download the prebuilt flash_attn wheel
 
57
  ```bash
58
  wget -O flash_attn-2.8.0.post2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.0.post2/flash_attn-2.8.0.post2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
59
  ```
60
- Install the environment
61
  ```bash
62
  # Python 3.12.7 must already be installed; pip cannot install the interpreter itself
  pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1
63
  pip install flash_attn-2.8.0.post2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
64
  pip install scikit-learn einops huggingface-hub matplotlib networkx numpy pandas scipy tqdm typing_extensions xgboost kditransform hyperopt
65
  ```
66
 
67
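- # Inference Tasks
68
- LimiX supports tasks such as classification, regression, and missing value imputation.
69
- ## Model Download
70
- | Model size | Download link | Supported tasks |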
 
 
 
 
 
 
71
  | --- | --- | --- |
72
- | LimiX-16M | [LimiX-16M.ckpt](https://huggingface.co/stableai-org/LimiX-16M/tree/main) | ✅ Classification ✅ Regression ✅ Missing value imputation |
73
- | LimiX-2M | [LimiX-2M.ckpt](https://huggingface.co/stableai-org/LimiX-2M/tree/main) | ✅ Classification ✅ Regression ✅ Missing value imputation |
74
 
 
75
 
76
- ## Interface Description
77
- ### Model Creation
78
  ```python
79
  class LimiXPredictor:
80
  def __init__(self,
@@ -90,97 +99,108 @@ class LimiXPredictor:
90
  inference_with_DDP: bool = False,
91
  seed:int=0)
92
  ```
93
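- | Parameter | Data type | Description |
94
  |--------|----------|----------|
95
- | device | torch.device | Hardware on which the model runs |
96
- | model_path | str | Path to the model to load |
97
- | mix_precision | bool | Whether to enable mixed-precision computation |
98
- | inference_config | list/str | Configuration file used for inference |
99
- | categorical_features_indices | list | Indices of the categorical columns in the data table |
100
- | outlier_remove_std | float | Threshold, in multiples of the standard deviation, used when removing outliers |
101
- | softmax_temperature | float | Softmax temperature coefficient |
102
- | task_type | str | Task type; one of: Classification, Regression |
103
- | mask_prediction | bool | Whether to enable missing value imputation |
104
- | inference_with_DDP | bool | Whether to enable DDP during inference |
105
- | seed | int | Random seed |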
106
- ### Inference on Data
107
  ```python
108
  def predict(self, x_train:np.ndarray, y_train:np.ndarray, x_test:np.ndarray) -> np.ndarray:
109
  ```
110
- | Parameter | Data type | Description |
111
  | ------- | ---------- | ----------------- |
112
- | x_train | np.ndarray | Features of the training set |
113
- | y_train | np.ndarray | Prediction targets of the training set |
114
- | x_test | np.ndarray | Features of the test set |
115
 
116
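- ## Inference Configuration Files
117
- | Configuration file | Description | Difference |
118
  | ------- | ---------- | ----- |
119
- | cls_default_retrieval.json | Default inference configuration file for **classification tasks**, **with retrieval** | Better classification performance |
120
- | cls_default_noretrieval.json | Default inference configuration file for **classification tasks**, **without retrieval** | Faster, with lower GPU memory requirements |
121
- | reg_default_retrieval.json | Default inference configuration file for **regression tasks**, **with retrieval** | Better regression performance |
122
- | reg_default_noretrieval.json | Default inference configuration file for **regression tasks**, **without retrieval** | Faster, with lower GPU memory requirements |
123
- | reg_default_noretrieval_MVI.json | Default inference configuration file for the **missing value imputation** task | |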
124
-
125
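- ## ➩ Ensemble Inference Based on Sample Retrieval
126
- For a detailed technical description of ensemble inference based on sample retrieval, see the [LimiX technical report](https://github.com/limix-ldm/LimiX/blob/main/LimiX_Technical_Report.pdf)
127
- Because of its inference time and GPU memory requirements, ensemble inference based on sample retrieval currently supports only hardware with specifications above those of the NVIDIA RTX 4090 GPU.
128
- ### Classification Task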
 
 
 
 
129
  ```
130
  python inference_classifier.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
131
  ```
132
 
133
- ### Regression Task
 
134
  ```
135
  python inference_regression.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
136
  ```
137
 
138
- ### Customizing Data Preprocessing for Inference Tasks
139
- #### First, generate the inference_config file
 
140
  ```python
141
- generate_infenerce_config()
142
  ```
143
 
144
- ### Classification Task
145
- #### Single GPU or CPU
 
146
  ```
147
  python inference_classifier.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
148
  ```
149
- #### Multi-GPU Distributed Inference
 
 
150
  ```
151
  torchrun --nproc_per_node=8 inference_classifier.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data --inference_with_DDP
152
  ```
153
 
154
- ### Regression Task
155
- #### Single GPU or CPU
 
156
  ```
157
  python inference_regression.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
158
  ```
159
- #### Multi-GPU Distributed Inference
 
 
160
  ```
161
  torchrun --nproc_per_node=8 inference_regression.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data --inference_with_DDP
162
  ```
163
 
164
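- ### Hyperparameter Search for Sample Retrieval
165
- This project provides a hyperparameter search suite for sample retrieval. To achieve the best performance, we use Optuna to optimize the retrieval parameters.
166
- #### Installation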
 
167
  ```
168
  pip install optuna
169
  ```
170
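- #### Usage
171
- To run standard inference with hyperparameter optimization, refer to the code below. It launches an Optuna search for the best combination of retrieval parameters for a specific dataset and use case: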
172
  ```
173
  searchInference = RetrievalSearchHyperparameters(
174
- dict(device_id=0, model_path=model_path), X_train, y_train, X_test, y_test,
175
  )
176
- config, result = searchInference.search(
177
- n_trials=10, metric="AUC",
178
- inference_config='config/cls_default_retrieval.json',
179
- task_type="cls")
180
  ```
 
181
 
182
- ## ➩ Classification
183
-
184
  ```python
185
  from sklearn.datasets import load_breast_cancer
186
  from sklearn.metrics import accuracy_score, roc_auc_score
@@ -210,9 +230,9 @@ prediction = clf.predict(X_train, y_train, X_test)
210
  print("roc_auc_score:", roc_auc_score(y_test, prediction[:, 1]))
211
  print("accuracy_score:", accuracy_score(y_test, np.argmax(prediction, axis=1)))
212
  ```
213
- For more detailed examples, see [inference_classifier.py](./inference_classifier.py)
214
 
215
- ## ➩ Regression
216
  ```python
217
  from functools import partial
218
 
@@ -246,7 +266,6 @@ y_std = y_train.std()
246
  y_train_normalized = (y_train - y_mean) / y_std
247
  y_test_normalized = (y_test - y_mean) / y_std
248
 
249
- data_device = f'cuda:0'
250
  model_path = hf_hub_download(repo_id="stableai-org/LimiX-16M", filename="LimiX-16M.ckpt", local_dir="./cache")
251
 
252
  model = LimiXPredictor(device='cuda', model_path=model_path, inference_config='config/reg_default_retrieval.json')
@@ -260,21 +279,22 @@ r2 = r2_score(y_test_normalized, y_pred)
260
  print(f'RMSE: {rmse}')
261
  print(f'R2: {r2}')
262
  ```
263
- For more detailed examples, see [inference_regression.py](./inference_regression.py)
264
 
265
- ## ➩ Missing Value Imputation
266
- For an example, see [examples/demo_missing_value_imputation.py](examples/inference_regression.py)
267
 
268
- # ➤ Links
269
- - LimiX (极数), a general-purpose large model for structured data: [LimiX:Unleashing Structured-Data Modeling Capability for Generalist Intelligence](https://arxiv.org/abs/2509.03505)
270
- - LimiX technical report: [LimiX_Technical_Report.pdf](https://github.com/limix-ldm/LimiX/blob/main/LimiX_Technical_Report.pdf)
271
- - A balanced, comprehensive, challenging, cross-domain classification dataset: [bcco_cls](https://huggingface.co/datasets/stableai-org/bcco_cls)
272
- - A balanced, comprehensive, challenging, cross-domain regression dataset: [bcco_reg](https://huggingface.co/datasets/stableai-org/bcco_reg)
 
273
 
274
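- # ➤ License
275
- The code in this repository is open-sourced under the [Apache-2.0](LICENSE.txt) license, while use of the LimiX model weights is subject to the Model License. The LimiX weights are fully open for academic research, and commercial use is permitted once authorization has been obtained.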
276
 
277
- # ➤ Citation
278
  ```
279
  @article{LimiX,
280
  title={LimiX:Unleashing Structured-Data Modeling Capability for Generalist Intelligence},
@@ -282,4 +302,4 @@ print(f'R2: {r2}')
282
  journal={arXiv preprint arXiv:2509.03505},
283
  year={2025}
284
  }
285
- ```
 
2
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/LimiX-Logo.png" alt="LimiX summary" width="89%">
3
  </div>
4
 
5
+ # :boom: News
6
+ - 2025-11-10: LimiX-2M is officially released! Compared to LimiX-16M, this smaller variant offers significantly lower GPU memory usage and faster inference speed. The retrieval mechanism has also been enhanced, further improving model performance while reducing both inference time and memory consumption.
7
+ - 2025-08-29: LimiX V1.0 Released.
8
+
9
+ # ⚡ Latest Results Compared with SOTA Models
10
  <div align="center">
11
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/BCCO-CLS.png" width="30%">
12
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/TabArena-CLS.png" width="30%">
 
18
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/CTR23-REG.png" width="30%">
19
  </div>
20
 
21
+
22
+ # ➤ Overview
23
  <div align="center">
24
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/LimiX_Summary.png" alt="LimiX summary" width="89%">
25
  </div>
26
+ We introduce LimiX, the first installment of our LDM series. LimiX aims to push generality further: a single model that handles classification, regression, missing-value imputation, feature selection, sample selection, and causal inference under one training and inference recipe, advancing the shift from bespoke pipelines to unified, foundation-style tabular learning.
27
+
28
+ LimiX adopts a transformer architecture optimized for structured data modeling and task generalization. The model first embeds features X and targets Y from the prior knowledge base into token representations. Within the core modules, attention mechanisms are applied across both sample and feature dimensions to identify salient patterns in key samples and features. The resulting high-dimensional representations are then passed to regression and classification heads, enabling the model to support diverse predictive tasks.
29
 
30
+ For details, see the technical report: [LimiX:Unleashing Structured-Data Modeling Capability for Generalist Intelligence](https://arxiv.org/abs/2509.03505) or [LimiX_Technical_Report.pdf](https://github.com/limix-ldm/LimiX/blob/main/LimiX_Technical_Report.pdf).
31
 
32
+ # Superior Performance
33
+ The LimiX model achieved SOTA performance across multiple tasks.
34
 
35
+ ## ➩ Classification (Tech Report)
 
 
36
  <div align="center">
37
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/Classifier.png" alt="Classification" width="80%">
38
  </div>
39
 
40
+ ## ➩ Regression (Tech Report)
41
  <div align="center">
42
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/Regression.png" alt="Regression" width="60%">
43
  </div>
44
 
45
+ ## ➩ Missing Value Imputation (Tech Report)
46
  <div align="center">
47
  <img src="https://github.com/limix-ldm/LimiX/raw/main/doc/MissingValueImputation.png" alt="Missing value imputation" width="80%">
48
  </div>
49
 
50
+ # ➤ Tutorials
51
+ ## Installation
52
+ ### Option 1 (recommended): Use the Dockerfile
53
+ Download the [Dockerfile](https://github.com/limix-ldm/LimiX/blob/main/Dockerfile), then build the image:
 
54
  ```bash
55
  docker build --network=host -t limix/infe:v1 --build-arg FROM_IMAGES=nvidia/cuda:12.2.0-base-ubuntu22.04 -f Dockerfile .
56
  ```
57
+
58
+ ### Option 2: Build manually
59
+ Download the prebuilt flash_attn wheel:
60
  ```bash
61
  wget -O flash_attn-2.8.0.post2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl https://github.com/Dao-AILab/flash-attention/releases/download/v2.8.0.post2/flash_attn-2.8.0.post2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
62
  ```
63
+ Install Python dependencies
64
  ```bash
65
  # Python 3.12.7 must already be installed; pip cannot install the interpreter itself
  pip install torch==2.7.1 torchvision==0.22.1 torchaudio==2.7.1
66
  pip install flash_attn-2.8.0.post2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
67
  pip install scikit-learn einops huggingface-hub matplotlib networkx numpy pandas scipy tqdm typing_extensions xgboost kditransform hyperopt
68
  ```
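
After the manual install, it can help to confirm that the environment matches what the commands above expect. The script below is a minimal sketch using only the standard library: the module list is derived from the `pip install` lines above (with the assumption that `scikit-learn` imports as `sklearn` and `huggingface-hub` as `huggingface_hub`), and the helper names are illustrative, not part of LimiX.

```python
import importlib.util
import sys

# Import names for the packages listed in the install commands above.
# Assumption: scikit-learn -> "sklearn", huggingface-hub -> "huggingface_hub".
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "flash_attn", "sklearn", "einops",
    "huggingface_hub", "matplotlib", "networkx", "numpy", "pandas", "scipy",
    "tqdm", "typing_extensions", "xgboost", "kditransform", "hyperopt",
]


def missing_modules(names):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_matches_wheel(version_info=sys.version_info):
    """The flash_attn wheel above is tagged cp312, i.e. built for CPython 3.12."""
    return tuple(version_info[:2]) == (3, 12)


if __name__ == "__main__":
    print("Python 3.12:", python_matches_wheel())
    print("Missing modules:", missing_modules(REQUIRED_MODULES) or "none")
```

The check only looks modules up without importing them, so it does not verify CUDA availability or package versions.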
69
 
70
+ ### Download source code
71
+ ```bash
72
+ git clone https://github.com/limix-ldm/LimiX.git
73
+ cd LimiX
74
+ ```
75
+
76
+ # ➤ Inference
77
+ LimiX supports tasks such as classification, regression, and missing value imputation.
78
+ ## ➩ Model download
79
+ | Model size | Download link | Tasks supported |
80
  | --- | --- | --- |
81
+ | LimiX-16M | [LimiX-16M.ckpt](https://huggingface.co/stableai-org/LimiX-16M/tree/main) | ✅ classification ✅ regression ✅ missing value imputation |
82
+ | LimiX-2M | [LimiX-2M.ckpt](https://huggingface.co/stableai-org/LimiX-2M/tree/main) | ✅ classification ✅ regression ✅ missing value imputation |
83
 
84
+ ## ➩ Interface description
85
 
86
+ ### Model Creation
 
87
  ```python
88
  class LimiXPredictor:
89
  def __init__(self,
 
99
  inference_with_DDP: bool = False,
100
  seed:int=0)
101
  ```
102
+ | Parameter | Data Type | Description |
103
  |--------|----------|----------|
104
+ | device | torch.device | The device on which the model runs |
105
+ | model_path | str | The path to the model that needs to be loaded |
106
+ | mix_precision | bool | Whether to enable mixed-precision inference |
107
+ | inference_config | list/str | Configuration file used for inference |
108
+ | categorical_features_indices | list | The indices of categorical columns in the tabular data |
109
+ | outlier_remove_std | float | The threshold, expressed in multiples of the standard deviation, used when removing outliers |
110
+ | softmax_temperature | float | The temperature applied in the softmax operator |
111
+ | task_type | str | The task type which can be either "Classification" or "Regression" |
112
+ | mask_prediction | bool | Whether to enable missing value imputation |
113
+ | inference_with_DDP | bool | Whether to enable DDP during inference |
114
+ | seed | int | The seed to control random states |
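
Two of the parameters above, `softmax_temperature` and `outlier_remove_std`, are easier to reason about with a small numeric illustration. The sketch below is not LimiX's implementation: it shows the standard temperature-scaled softmax and, as an assumption, treats values outside mean ± `n_std` standard deviations as outliers and clips them. How LimiX actually treats such values is not stated here.

```python
import math
import statistics


def softmax(logits, temperature=1.0):
    """Softmax over `logits` after dividing each logit by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def outlier_bounds(values, n_std):
    """Interval mean +/- n_std * std (population standard deviation)."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return mean - n_std * std, mean + n_std * std


def clip_outliers(values, n_std):
    """Illustrative policy (an assumption): clip values to the interval above."""
    low, high = outlier_bounds(values, n_std)
    return [min(max(v, low), high) for v in values]
```

A higher temperature flattens the softmax output and a lower one sharpens it, while a smaller `n_std` treats more values as outliers.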
115
+ ### Predict
116
  ```python
117
  def predict(self, x_train:np.ndarray, y_train:np.ndarray, x_test:np.ndarray) -> np.ndarray:
118
  ```
119
+ | Parameter | Data Type | Description |
120
  | ------- | ---------- | ----------------- |
121
+ | x_train | np.ndarray | The input features of the training set |
122
+ | y_train | np.ndarray | The target variable of the training set |
123
+ | x_test | np.ndarray | The input features of the test set |
124
 
125
+ ## Inference Configuration File Description
126
+ | Configuration File Name | Description | Difference |
127
  | ------- | ---------- | ----- |
128
+ | cls_default_retrieval.json | Default **classification task** inference configuration file **with retrieval** | Better classification performance |
129
+ | cls_default_noretrieval.json | Default **classification task** inference configuration file **without retrieval** | Faster speed, lower memory requirements |
130
+ | reg_default_retrieval.json | Default **regression task** inference configuration file **with retrieval** | Better regression performance |
131
+ | reg_default_noretrieval.json | Default **regression task** inference configuration file **without retrieval** | Faster speed, lower memory requirements |
132
+ | reg_default_noretrieval_MVI.json | Default inference configuration file for **missing value imputation task** | |
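
The table above can be encoded as a small lookup so that scripts pick a configuration consistently. This is a sketch only: the task labels (`"classification"`, `"regression"`, `"imputation"`) and the `pick_config` helper are hypothetical, and the `config/` directory is assumed from the paths used in the examples elsewhere in this README.

```python
# File names are taken from the table above; the keys are hypothetical labels.
CONFIG_FILES = {
    ("classification", True): "cls_default_retrieval.json",
    ("classification", False): "cls_default_noretrieval.json",
    ("regression", True): "reg_default_retrieval.json",
    ("regression", False): "reg_default_noretrieval.json",
    ("imputation", False): "reg_default_noretrieval_MVI.json",
}


def pick_config(task, use_retrieval=False, config_dir="config"):
    """Return the path of the default config for `task`, or raise ValueError."""
    key = (task.lower(), use_retrieval)
    if key not in CONFIG_FILES:
        raise ValueError(f"no default config for task={task!r}, retrieval={use_retrieval}")
    return f"{config_dir}/{CONFIG_FILES[key]}"
```

For example, `pick_config("regression", use_retrieval=True)` yields `config/reg_default_retrieval.json`, the path used in the regression example in this README.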
133
+
134
+ ## ➩ Ensemble Inference Based on Sample Retrieval
135
+
136
+ For a detailed technical introduction to Ensemble Inference Based on Sample Retrieval, please refer to the [technical report](https://github.com/limix-ldm/LimiX/blob/main/LimiX_Technical_Report.pdf).
137
+
138
+ Because of its inference time and memory requirements, ensemble inference based on sample retrieval currently supports only hardware with specifications above those of the NVIDIA RTX 4090 GPU.
139
+
140
+ ### Classification Task
141
+
142
  ```
143
  python inference_classifier.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
144
  ```
145
 
146
+ ### Regression Task
147
+
148
  ```
149
  python inference_regression.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
150
  ```
151
 
152
+ ### Customizing Data Preprocessing for Inference Tasks
153
+ #### First, Generate the Inference Configuration File
154
+
155
  ```python
156
+ generate_inference_config()
157
  ```
158
 
159
+ ### Classification Task
160
+ #### Single GPU or CPU
161
+
162
  ```
163
  python inference_classifier.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
164
  ```
165
+
166
+ #### Multi-GPU Distributed Inference
167
+
168
  ```
169
  torchrun --nproc_per_node=8 inference_classifier.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data --inference_with_DDP
170
  ```
171
 
172
+ ### Regression Task
173
+ #### Single GPU or CPU
174
+
175
  ```
176
  python inference_regression.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data
177
  ```
178
+
179
+ #### Multi-GPU Distributed Inference
180
+
181
  ```
182
  torchrun --nproc_per_node=8 inference_regression.py --save_name your_save_name --inference_config_path path_to_retrieval_config --data_dir path_to_data --inference_with_DDP
183
  ```
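
The single-GPU and multi-GPU commands above differ only in the launcher and the `--inference_with_DDP` flag, so they can be assembled programmatically. The helper below is a sketch: it uses only the flags shown in the commands above, and the function name and its arguments are illustrative.

```python
import shlex


def build_inference_command(script, save_name, config_path, data_dir, n_gpus=1):
    """Build the argument list for a single-device or torchrun-based launch."""
    args = [
        script,
        "--save_name", save_name,
        "--inference_config_path", config_path,
        "--data_dir", data_dir,
    ]
    if n_gpus > 1:
        # Multi-GPU: launch through torchrun and enable DDP, as in the commands above.
        return ["torchrun", f"--nproc_per_node={n_gpus}", *args, "--inference_with_DDP"]
    return ["python", *args]


if __name__ == "__main__":
    cmd = build_inference_command(
        "inference_regression.py", "your_save_name",
        "path_to_retrieval_config", "path_to_data", n_gpus=8,
    )
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.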
184
 
185
+ ### Hyperparameter Search for Sample Retrieval
186
+ This project provides a hyperparameter search suite for sample retrieval. To achieve the best performance, we use Optuna to tune the retrieval parameters.
187
+ #### Installation
188
+ Ensure you have the required dependencies installed:
189
  ```
190
  pip install optuna
191
  ```
192
+ #### Usage
193
+ To run standard inference with hyperparameter optimization, refer to the code below:
194
  ```
195
  searchInference = RetrievalSearchHyperparameters(
196
+ dict(device_id=0,model_path=model_path), X_train, y_train, X_test, y_test,
197
  )
198
+ config, result = searchInference.search(n_trials=10, metric="AUC",
199
+ inference_config='config/cls_default_retrieval.json',task_type="cls")
 
 
200
  ```
201
+ This will launch an Optuna study to find the best combination of retrieval parameters for your specific dataset and use case.
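
To make the role of `n_trials` concrete, the sketch below shows the general shape of a trial-based search: sample a parameter set, score it, and keep the best. It is a plain random search written with the standard library, not Optuna's algorithm (Optuna's default sampler is more sample-efficient), and the parameter names and toy objective are hypothetical stand-ins for real retrieval parameters and a validation metric such as AUC.

```python
import random


def random_search(objective, search_space, n_trials=10, seed=0):
    """Evaluate `n_trials` random parameter sets and return the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(choices) for name, choices in search_space.items()}
        score = objective(params)  # higher is better
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Hypothetical score that peaks at n_neighbors=32, temperature=0.9."""
    return -abs(params["n_neighbors"] - 32) - 10 * abs(params["temperature"] - 0.9)


if __name__ == "__main__":
    space = {"n_neighbors": [8, 16, 32, 64], "temperature": [0.7, 0.9, 1.1]}
    print(random_search(toy_objective, space, n_trials=10))
```

In a real run the objective would perform inference with the sampled parameters and return the chosen metric on held-out data.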
202
 
203
+ ## ➩ Classification
 
204
  ```python
205
  from sklearn.datasets import load_breast_cancer
206
  from sklearn.metrics import accuracy_score, roc_auc_score
 
230
  print("roc_auc_score:", roc_auc_score(y_test, prediction[:, 1]))
231
  print("accuracy_score:", accuracy_score(y_test, np.argmax(prediction, axis=1)))
232
  ```
233
+ For additional examples, refer to [inference_classifier.py](./inference_classifier.py)
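
The classification example scores the positive-class column `prediction[:, 1]` with ROC AUC and takes the argmax over columns for accuracy. As a dependency-free sketch of what those two metrics compute (not a replacement for scikit-learn's implementations), the helpers below work on plain Python lists; the function names are illustrative.

```python
def accuracy_from_probs(y_true, probs):
    """Accuracy when the predicted class is the column with the highest probability."""
    predicted = [max(range(len(row)), key=row.__getitem__) for row in probs]
    return sum(p == t for p, t in zip(predicted, y_true)) / len(y_true)


def roc_auc_binary(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This is O(n_pos * n_neg), which is
    fine for small checks but slow for large test sets.
    """
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

With `probs` as a list of `[p_class0, p_class1]` rows, `roc_auc_binary(y, [row[1] for row in probs])` mirrors the `prediction[:, 1]` usage in the example above.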
234
 
235
+ ## ➩ Regression
236
  ```python
237
  from functools import partial
238
 
 
266
  y_train_normalized = (y_train - y_mean) / y_std
267
  y_test_normalized = (y_test - y_mean) / y_std
268
 
 
269
  model_path = hf_hub_download(repo_id="stableai-org/LimiX-16M", filename="LimiX-16M.ckpt", local_dir="./cache")
270
 
271
  model = LimiXPredictor(device='cuda', model_path=model_path, inference_config='config/reg_default_retrieval.json')
 
279
  print(f'RMSE: {rmse}')
280
  print(f'R2: {r2}')
281
  ```
282
+ For additional examples, refer to [inference_regression.py](https://github.com/limix-ldm/LimiX/raw/main/inference_regression.py)
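
The regression example standardizes the targets with the training-set mean and standard deviation and evaluates on the normalized scale. If predictions are needed in the original units, they can be mapped back with the same two statistics. The sketch below uses only the standard library and illustrative function names; it is an addition for clarity, not part of the example script.

```python
import statistics


def fit_target_scaler(y_train):
    """Mean and population standard deviation of the training targets.

    NumPy's ndarray.std() defaults to the population standard deviation
    (ddof=0), which corresponds to statistics.pstdev here.
    """
    return statistics.fmean(y_train), statistics.pstdev(y_train)


def normalize(values, mean, std):
    """Standardize values with the training-set statistics."""
    return [(v - mean) / std for v in values]


def denormalize(values, mean, std):
    """Map normalized values or predictions back to the original units."""
    return [v * std + mean for v in values]
```

An RMSE computed on normalized targets can likewise be converted to original units by multiplying it by the training standard deviation.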
283
 
284
+ ## ➩ Missing value imputation
285
+ For the demo file, see [examples/demo_missing_value_imputation.py](https://github.com/limix-ldm/LimiX/raw/main/examples/inference_regression.py)
286
 
287
+ # ➤ Links
288
+ - LimiX paper (arXiv): [LimiX:Unleashing Structured-Data Modeling Capability for Generalist Intelligence](https://arxiv.org/abs/2509.03505)
289
+ - LimiX Technical Report: [LimiX_Technical_Report.pdf](https://github.com/limix-ldm/LimiX/blob/main/LimiX_Technical_Report.pdf)
290
+ - Detailed instructions for using LimiX: [Visit the official LimiX documentation](https://www.limix.ai/doc/)
291
+ - Balanced, Comprehensive, Challenging, Omni-domain Classification Benchmark: [bcco_cls](https://huggingface.co/datasets/stableai-org/bcco_cls)
292
+ - Balanced, Comprehensive, Challenging, Omni-domain Regression Benchmark: [bcco_reg](https://huggingface.co/datasets/stableai-org/bcco_reg)
293
 
294
+ # ➤ License
295
+ The code in this repository is open-sourced under the [Apache-2.0](LICENSE.txt) license, while the usage of the LimiX model weights is subject to the Model License. The LimiX weights are fully available for academic research and may be used commercially upon obtaining proper authorization.
296
 
297
+ # ➤ Citation
298
  ```
299
  @article{LimiX,
300
  title={LimiX:Unleashing Structured-Data Modeling Capability for Generalist Intelligence},
 
302
  journal={arXiv preprint arXiv:2509.03505},
303
  year={2025}
304
  }
305
+ ```