Improve language tag

#1
by lbourdois - opened
Files changed (1)
  1. README.md +165 -153
README.md CHANGED
@@ -1,154 +1,166 @@
1
- ---
2
- license: mit
3
- datasets:
4
- - SIA-IDE/MBHM
5
- language:
6
- - en
7
- base_model:
8
- - Qwen/Qwen2.5-1.5B-Instruct
9
- ---
10
- <div align="center">
11
- <a href="https://github.com/SIA-IDE/BearLLM">
12
- <img src="https://raw.githubusercontent.com/SIA-IDE/BearLLM/refs/heads/main/docs/images/logo.svg" width="200" alt="logo"/>
13
- </a>
14
- <h1>BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation</h1>
15
-
16
- <a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/Python-3.12-blue"></a>
17
- <a href="https://pytorch.org/"><img alt="PyTorch" src="https://img.shields.io/badge/Pytorch-latest-orange"></a>
18
- <a href="https://arxiv.org/abs/2408.11281"><img alt="arXiv" src="https://img.shields.io/badge/Paper-arXiv-B31B1B"></a>
19
- <a href="https://huggingface.co/datasets/SIA-IDE/MBHM"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-🤗-FFFDF5"></a>
20
- <a href="https://github.com/SIA-IDE/BearLLM"><img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/SIA-IDE/BearLLM"></a>
21
- </div>
22
-
23
- <h4 align="center">
24
- <p>
25
- <b>English</b> |
26
- <a href="https://github.com/SIA-IDE/BearLLM/blob/main/docs/README_zh.md">简体中文</a>
27
- </p>
28
- </h4>
29
-
30
- ## 🔥 NEWS
31
- - **[2025-04-11]** 🎉 The [AAAI-25 Proceedings](https://aaai.org/proceeding/aaai-39-2025/) are now officially published! Our [conference paper](https://ojs.aaai.org/index.php/AAAI/article/view/34188) is included. We welcome you to read and cite it!
32
- - **[2025-03-06]** 🌟 The complete dataset and code are now officially open source!
33
- - **[2024-12-11]** ⏫ We are now working on making the code of BearLLM public. Stay tuned!
34
- - **[2024-12-10]** 🎉 The BearLLM paper is accepted by the Thirty-Ninth AAAI Conference on Artificial Intelligence ([AAAI-25](https://aaai.org/conference/aaai/aaai-25/)).
35
- - **[2024-08-21]** 📝 The preprint of the BearLLM paper is available on arXiv. Check the [paper page](https://arxiv.org/abs/2408.11281) for more details.
36
-
37
- ## 📅 TODO
38
- - [ ] Improve related comments and documentation.
39
- - [x] Upload the complete BearLLM demo code.
40
- - [x] Upload the health management corpus of the MBHM dataset.
41
- - [x] Collect the codes for pre-training and fine-tuning BearLLM.
42
- - [x] Collect the codes of BearLLM's classification network and other comparison models.
43
- - [x] Upload the vibration signal portion of the MBHM dataset.
44
-
45
- ## 📚 Introduction
46
- The [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM) dataset is the first multimodal dataset designed for the study of bearing health management. It is divided into two parts: vibration signals and health management corpus. The vibration signals and condition information are derived from 9 publicly available datasets, and are still under continuous updating and improvement. The thousands of working conditions pose more difficult challenges for the identification model and better represent real-world usage scenarios.
47
-
48
- [BearLLM](https://github.com/SIA-IDE/BearLLM) is a prior knowledge-enhanced bearing health management framework with a unified vibration signal representation. This framework transforms the signal to be tested into the frequency domain, enabling effective identification of spectral differences compared to the vibration signal under fault-free conditions. By aligning the vibration signal with the fault semantic embedding, we achieve a unified natural language response for various health management tasks through a fine-tuned language model with low computational overhead. Experiments demonstrate that this framework achieves leading performance under thousands of working conditions.
49
-
50
- ## 💻 Requirements
51
-
52
- The code is implemented in Python 3.12. The required packages are listed in the `requirements.txt` file. You can install the required packages by running the following command:
53
-
54
- ```bash
55
- conda create --name bearllm python=3.12
56
- conda activate bearllm
57
- pip install -r requirements.txt
58
- ```
59
-
60
-
61
- ## 🚀 Quick Start
62
-
63
- ### 1. Download Demo Data / Use Your Own Data
64
-
65
- First, you need to download the `demo_data.json` from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset.
66
- For users in mainland China, you can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download:
67
-
68
- Or, you can also build your own test data in the same format:
69
- `instruction`: Text instruction for health management task.
70
- `vib_data`: Vibration signal data to be identified, with a required duration of 1 second.
71
- `ref_data`: Reference vibration signal data without faults, with a required duration of 1 second.
72
-
73
- ```json
74
- {
75
- "instruction": "xxx.",
76
- "vib_data": [1.0, 0.0, 1.0, ...],
77
- "ref_data": [1.0, 0.0, 1.0, ...],
78
- }
79
- ```
80
-
81
- ### 2. Download Weights
82
-
83
- You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
84
-
85
- Additionally, you need to download the weights of [BearLLM](https://huggingface.co/SIA-IDE/BearLLM/tree/main).
86
-
87
- ### 3. Organize Files
88
-
89
- It is recommended to organize the weights and test data as follows:
90
-
91
- ```
92
- BearLLM/
93
- ├── qwen_weights/
94
- │   ├── model.safetensors
95
- │   ├── tokenizer.json
96
- │   ├── config.json
97
- │   └── other files...
98
- ├── bearllm_weights/
99
- │   ├── vibration_adapter.pth
100
- │   ├── adapter_config.json
101
- │   └── adapter_model.safetensors
102
- └── mbhm_dataset/
103
-     └── demo_data.json
104
- ```
105
-
106
- ### 4. Run Code
107
- First, copy the `.env.example` file to `.env` and modify the data paths inside.
108
- Then, you can run the code using the following command:
109
-
110
- ```bash
111
- python run_demo.py
112
- ```
113
-
114
- ## ⚙️ Development
115
-
116
- ### 1. Download Dataset
117
-
118
- First, you need to download the following files from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset. For users in mainland China, you can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download:
119
-
120
- - `data.hdf5`: Contains the vibration signal data.
121
- - `corpus.json`: Contains the health management corpus.
122
- - `metadata.sqlite`: Contains metadata information of the dataset.
123
-
124
- ### 2. Download Weights
125
-
126
- You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
127
-
128
- ### 3. Modify Environment Variables
129
-
130
- Copy the `.env.example` file to `.env` and modify the data paths inside.
131
-
132
- ### 4. Pre-train and Fine-tune Model
133
-
134
- Pre-train according to `src/pre_training.py`.
135
- Fine-tune according to `src/fine_tuning.py`.
136
-
137
- ## 📖 Citation
138
- Please cite the following paper if you use this study in your research:
139
-
140
- ```
141
- @article{pengBearLLMPriorKnowledgeEnhanced2025,
142
- title = {{{BearLLM}}: {{A Prior Knowledge-Enhanced Bearing Health Management Framework}} with {{Unified Vibration Signal Representation}}},
143
- author = {Peng, Haotian and Liu, Jiawei and Du, Jinsong and Gao, Jie and Wang, Wei},
144
- year = {2025},
145
- month = apr,
146
- journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
147
- volume = {39},
148
- number = {19},
149
- pages = {19866--19874},
150
- issn = {2374-3468},
151
- doi = {10.1609/aaai.v39i19.34188},
152
- urldate = {2025-04-11},
153
- }
154
  ```
 
1
+ ---
2
+ license: mit
3
+ datasets:
4
+ - SIA-IDE/MBHM
5
+ language:
6
+ - zho
7
+ - eng
8
+ - fra
9
+ - spa
10
+ - por
11
+ - deu
12
+ - ita
13
+ - rus
14
+ - jpn
15
+ - kor
16
+ - vie
17
+ - tha
18
+ - ara
19
+ base_model:
20
+ - Qwen/Qwen2.5-1.5B-Instruct
21
+ ---
22
+ <div align="center">
23
+ <a href="https://github.com/SIA-IDE/BearLLM">
24
+ <img src="https://raw.githubusercontent.com/SIA-IDE/BearLLM/refs/heads/main/docs/images/logo.svg" width="200" alt="logo"/>
25
+ </a>
26
+ <h1>BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation</h1>
27
+
28
+ <a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/Python-3.12-blue"></a>
29
+ <a href="https://pytorch.org/"><img alt="PyTorch" src="https://img.shields.io/badge/Pytorch-latest-orange"></a>
30
+ <a href="https://arxiv.org/abs/2408.11281"><img alt="arXiv" src="https://img.shields.io/badge/Paper-arXiv-B31B1B"></a>
31
+ <a href="https://huggingface.co/datasets/SIA-IDE/MBHM"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-🤗-FFFDF5"></a>
32
+ <a href="https://github.com/SIA-IDE/BearLLM"><img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/SIA-IDE/BearLLM"></a>
33
+ </div>
34
+
35
+ <h4 align="center">
36
+ <p>
37
+ <b>English</b> |
38
+ <a href="https://github.com/SIA-IDE/BearLLM/blob/main/docs/README_zh.md">简体中文</a>
39
+ </p>
40
+ </h4>
41
+
42
+ ## 🔥 NEWS
43
+ - **[2025-04-11]** 🎉 The [AAAI-25 Proceedings](https://aaai.org/proceeding/aaai-39-2025/) are now officially published! Our [conference paper](https://ojs.aaai.org/index.php/AAAI/article/view/34188) is included. We welcome you to read and cite it!
44
+ - **[2025-03-06]** 🌟 The complete dataset and code are now officially open source!
45
+ - **[2024-12-11]** ⏫ We are now working on making the code of BearLLM public. Stay tuned!
46
+ - **[2024-12-10]** 🎉 The BearLLM paper is accepted by the Thirty-Ninth AAAI Conference on Artificial Intelligence ([AAAI-25](https://aaai.org/conference/aaai/aaai-25/)).
47
+ - **[2024-08-21]** 📝 The preprint of the BearLLM paper is available on arXiv. Check the [paper page](https://arxiv.org/abs/2408.11281) for more details.
48
+
49
+ ## 📅 TODO
50
+ - [ ] Improve related comments and documentation.
51
+ - [x] Upload the complete BearLLM demo code.
52
+ - [x] Upload the health management corpus of the MBHM dataset.
53
+ - [x] Collect the codes for pre-training and fine-tuning BearLLM.
54
+ - [x] Collect the codes of BearLLM's classification network and other comparison models.
55
+ - [x] Upload the vibration signal portion of the MBHM dataset.
56
+
57
+ ## 📚 Introduction
58
+ The [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM) dataset is the first multimodal dataset designed for the study of bearing health management. It is divided into two parts: vibration signals and health management corpus. The vibration signals and condition information are derived from 9 publicly available datasets, and are still under continuous updating and improvement. The thousands of working conditions pose more difficult challenges for the identification model and better represent real-world usage scenarios.
59
+
60
+ [BearLLM](https://github.com/SIA-IDE/BearLLM) is a prior knowledge-enhanced bearing health management framework with a unified vibration signal representation. This framework transforms the signal to be tested into the frequency domain, enabling effective identification of spectral differences compared to the vibration signal under fault-free conditions. By aligning the vibration signal with the fault semantic embedding, we achieve a unified natural language response for various health management tasks through a fine-tuned language model with low computational overhead. Experiments demonstrate that this framework achieves leading performance under thousands of working conditions.
61
+
62
+ ## 💻 Requirements
63
+
64
+ The code is implemented in Python 3.12. The required packages are listed in the `requirements.txt` file. You can install the required packages by running the following command:
65
+
66
+ ```bash
67
+ conda create --name bearllm python=3.12
68
+ conda activate bearllm
69
+ pip install -r requirements.txt
70
+ ```
71
+
72
+
73
+ ## 🚀 Quick Start
74
+
75
+ ### 1. Download Demo Data / Use Your Own Data
76
+
77
+ First, you need to download the `demo_data.json` from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset.
78
+ For users in mainland China, you can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download:
79
+
80
+ Or, you can also build your own test data in the same format:
81
+ `instruction`: Text instruction for health management task.
82
+ `vib_data`: Vibration signal data to be identified, with a required duration of 1 second.
83
+ `ref_data`: Reference vibration signal data without faults, with a required duration of 1 second.
84
+
85
+ ```json
86
+ {
87
+ "instruction": "xxx.",
88
+ "vib_data": [1.0, 0.0, 1.0, ...],
89
+ "ref_data": [1.0, 0.0, 1.0, ...],
90
+ }
91
+ ```
92
+
93
+ ### 2. Download Weights
94
+
95
+ You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
96
+
97
+ Additionally, you need to download the weights of [BearLLM](https://huggingface.co/SIA-IDE/BearLLM/tree/main).
98
+
99
+ ### 3. Organize Files
100
+
101
+ It is recommended to organize the weights and test data as follows:
102
+
103
+ ```
104
+ BearLLM/
105
+ ├── qwen_weights/
106
+ │   ├── model.safetensors
107
+ │   ├── tokenizer.json
108
+ │   ├── config.json
109
+ │   └── other files...
110
+ ├── bearllm_weights/
111
+ │   ├── vibration_adapter.pth
112
+ │   ├── adapter_config.json
113
+ │   └── adapter_model.safetensors
114
+ └── mbhm_dataset/
115
+     └── demo_data.json
116
+ ```
117
+
118
+ ### 4. Run Code
119
+ First, copy the `.env.example` file to `.env` and modify the data paths inside.
120
+ Then, you can run the code using the following command:
121
+
122
+ ```bash
123
+ python run_demo.py
124
+ ```
125
+
126
+ ## ⚙️ Development
127
+
128
+ ### 1. Download Dataset
129
+
130
+ First, you need to download the following files from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset. For users in mainland China, you can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download:
131
+
132
+ - `data.hdf5`: Contains the vibration signal data.
133
+ - `corpus.json`: Contains the health management corpus.
134
+ - `metadata.sqlite`: Contains metadata information of the dataset.
135
+
136
+ ### 2. Download Weights
137
+
138
+ You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
139
+
140
+ ### 3. Modify Environment Variables
141
+
142
+ Copy the `.env.example` file to `.env` and modify the data paths inside.
143
+
144
+ ### 4. Pre-train and Fine-tune Model
145
+
146
+ Pre-train according to `src/pre_training.py`.
147
+ Fine-tune according to `src/fine_tuning.py`.
148
+
149
+ ## 📖 Citation
150
+ Please cite the following paper if you use this study in your research:
151
+
152
+ ```
153
+ @article{pengBearLLMPriorKnowledgeEnhanced2025,
154
+ title = {{{BearLLM}}: {{A Prior Knowledge-Enhanced Bearing Health Management Framework}} with {{Unified Vibration Signal Representation}}},
155
+ author = {Peng, Haotian and Liu, Jiawei and Du, Jinsong and Gao, Jie and Wang, Wei},
156
+ year = {2025},
157
+ month = apr,
158
+ journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
159
+ volume = {39},
160
+ number = {19},
161
+ pages = {19866--19874},
162
+ issn = {2374-3468},
163
+ doi = {10.1609/aaai.v39i19.34188},
164
+ urldate = {2025-04-11},
165
+ }
166
  ```
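
Reader's note on the README text shown in this diff (not part of the PR): the "build your own test data" step in Quick Start could be sketched in Python roughly as follows. The field names `instruction`, `vib_data`, and `ref_data` come from the README. The sampling rate, the synthetic sine signals, and the helper names `sine_wave` and `build_record` are hypothetical, since the README only says each signal must last 1 second.

```python
import json
import math

# Assumption: the README does not state a sampling rate; 12 kHz is a placeholder.
SAMPLE_RATE_HZ = 12_000


def sine_wave(freq_hz, amplitude=1.0, duration_s=1.0, sample_rate_hz=SAMPLE_RATE_HZ):
    """Return a synthetic signal as a list of floats (stand-in for real sensor data)."""
    n_samples = int(round(duration_s * sample_rate_hz))
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz)
        for i in range(n_samples)
    ]


def build_record(instruction, vib_data, ref_data, sample_rate_hz=SAMPLE_RATE_HZ):
    """Assemble one record with the three fields named in the README."""
    for name, signal in (("vib_data", vib_data), ("ref_data", ref_data)):
        # One second of data at the assumed rate means exactly sample_rate_hz samples.
        if len(signal) != sample_rate_hz:
            raise ValueError(
                f"{name} must cover 1 second ({sample_rate_hz} samples), got {len(signal)}"
            )
    return {
        "instruction": instruction,
        "vib_data": [float(x) for x in vib_data],
        "ref_data": [float(x) for x in ref_data],
    }


# Hypothetical example: a fault-free reference plus a test signal with an extra component.
ref = sine_wave(30.0)
vib = [a + b for a, b in zip(ref, sine_wave(157.0, amplitude=0.3))]
record = build_record("Describe the health state of this bearing.", vib, ref)

# json.dumps emits strict JSON: no trailing comma and no "..." placeholder.
payload = json.dumps(record)
```

The README's JSON snippet is illustrative only (it ends with a trailing comma and uses `...`), so generating the file with a JSON serializer avoids syntax problems. Whether `run_demo.py` expects a single object or a list of records is not stated in this diff, so check `demo_data.json` before writing `payload` to disk.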