Commit 3cba222 by KarlRaphel (verified)
1 Parent(s): 9745246

Upload README.md

  1. README.md +140 -0
README.md ADDED
@@ -0,0 +1,140 @@
+ <div align="center">
+ <a href="https://github.com/SIA-IDE/BearLLM">
+ <img src="https://raw.githubusercontent.com/SIA-IDE/BearLLM/refs/heads/main/docs/images/logo.svg" width="200" alt="logo"/>
+ </a>
+ <h1>BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation</h1>
+
+ <a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/Python-3.12-blue"></a>
+ <a href="https://pytorch.org/"><img alt="PyTorch" src="https://img.shields.io/badge/Pytorch-latest-orange"></a>
+ <a href="https://arxiv.org/abs/2408.11281"><img alt="arXiv" src="https://img.shields.io/badge/Paper-arXiv-B31B1B"></a>
+ <a href="https://huggingface.co/datasets/SIA-IDE/MBHM"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-🤗-FFFDF5"></a>
+ <a href="https://github.com/SIA-IDE/BearLLM"><img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/SIA-IDE/BearLLM"></a>
+ </div>
+
+ <h4 align="center">
+ <p>
+ <b>English</b> |
+ <a href="https://github.com/SIA-IDE/BearLLM/blob/main/docs/README_zh.md">简体中文</a>
+ </p>
+ </h4>
+
+ ## 🔥 NEWS
+ - **[2025-03-06]** 🌟 The complete dataset and code are now officially open source!
+ - **[2024-12-11]** ⏫ We are now working on making the code of BearLLM public. Stay tuned!
+ - **[2024-12-10]** 🎉 The BearLLM paper is accepted by the Thirty-Ninth AAAI Conference on Artificial Intelligence ([AAAI-25](https://aaai.org/conference/aaai/aaai-25/)).
+ - **[2024-08-21]** 📝 The preprint of the BearLLM paper is available on arXiv. Check the [paper page](https://arxiv.org/abs/2408.11281) for more details.
+
+ ## 📅 TODO
+ - [ ] Improve related comments and documentation.
+ - [x] Upload the complete BearLLM demo code.
+ - [x] Upload the health management corpus of the MBHM dataset.
+ - [x] Collect the code for pre-training and fine-tuning BearLLM.
+ - [x] Collect the code of BearLLM's classification network and other comparison models.
+ - [x] Upload the vibration signal portion of the MBHM dataset.
+
+ ## 📚 Introduction
+ The [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM) dataset is the first multimodal dataset designed for the study of bearing health management. It has two parts: vibration signals and a health management corpus. The vibration signals and condition information are derived from 9 publicly available datasets, and the collection is still being updated and improved. Its thousands of working conditions make identification harder for models and better represent real-world usage scenarios.
+
+ [BearLLM](https://github.com/SIA-IDE/BearLLM) is a prior knowledge-enhanced bearing health management framework with a unified vibration signal representation. The framework transforms the signal under test into the frequency domain, so that its spectral differences from a vibration signal recorded under fault-free conditions can be identified effectively. By aligning the vibration signal with fault semantic embeddings, a fine-tuned language model produces unified natural language responses for various health management tasks with low computational overhead. Experiments demonstrate that the framework achieves leading performance under thousands of working conditions.
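
To make the idea of comparing a test signal with a fault-free reference in the frequency domain more concrete, here is a toy sketch. It is not the BearLLM implementation: the function names, the 64 Hz sampling rate, and the synthetic sine signals are all assumptions chosen for illustration, and a naive DFT is used only so that the example runs with the Python standard library.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """One-sided DFT magnitude spectrum of a real signal (naive O(N^2) DFT)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_difference(vib_data, ref_data):
    """Per-bin magnitude difference between a test signal and a fault-free reference."""
    if len(vib_data) != len(ref_data):
        raise ValueError("signals must have the same length")
    vib = magnitude_spectrum(vib_data)
    ref = magnitude_spectrum(ref_data)
    return [v - r for v, r in zip(vib, ref)]


# Toy signals: 1 second sampled at a hypothetical 64 Hz.
fs = 64
ref_data = [math.sin(2 * math.pi * 5 * t / fs) for t in range(fs)]
# The "faulty" signal adds a 12 Hz component that the reference lacks.
vib_data = [r + 0.5 * math.sin(2 * math.pi * 12 * t / fs) for t, r in enumerate(ref_data)]

diff = spectral_difference(vib_data, ref_data)
# With a 1-second window, the bin index equals the frequency in Hz.
peak_bin = max(range(len(diff)), key=lambda k: diff[k])
print(peak_bin)  # 12
```

In practice an FFT routine from a numerical library would replace the naive DFT; the point here is only that the extra component stands out once the reference spectrum is subtracted.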
+
+ ## 💻 Requirements
+
+ The code is implemented in Python 3.12. The required packages are listed in `requirements.txt`. Install them by running:
+
+ ```bash
+ conda create --name bearllm python=3.12
+ conda activate bearllm
+ pip install -r requirements.txt
+ ```
+
+ ## 🚀 Quick Start
+
+ ### 1. Download Demo Data / Use Your Own Data
+
+ First, download `demo_data.json` from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset.
+ Users in mainland China can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download.
+
+ Alternatively, build your own test data in the same format:
+
+ - `instruction`: Text instruction for the health management task.
+ - `vib_data`: Vibration signal data to be identified, with a required duration of 1 second.
+ - `ref_data`: Reference vibration signal data without faults, with a required duration of 1 second.
+
+ ```json
+ {
+     "instruction": "xxx.",
+     "vib_data": [1.0, 0.0, 1.0, ...],
+     "ref_data": [1.0, 0.0, 1.0, ...]
+ }
+ ```
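
As a rough illustration of building your own test data in this format, the sketch below assembles one sample and checks the 1-second requirement. The sampling rate, the instruction text, and the `build_sample` helper are hypothetical; the README does not state which sampling rate the demo expects, so substitute the rate your signals were actually recorded at.

```python
import json
import math

SAMPLE_RATE_HZ = 12_000  # hypothetical; use the rate of your own recordings


def build_sample(instruction, vib_data, ref_data, sample_rate_hz=SAMPLE_RATE_HZ):
    """Assemble one test sample, checking that both signals span exactly 1 second."""
    for name, data in (("vib_data", vib_data), ("ref_data", ref_data)):
        if len(data) != sample_rate_hz:
            raise ValueError(
                f"{name} has {len(data)} points; expected {sample_rate_hz} for 1 second"
            )
    return {
        "instruction": instruction,
        "vib_data": [float(x) for x in vib_data],
        "ref_data": [float(x) for x in ref_data],
    }


# Synthetic placeholder signals standing in for real measurements.
ref = [math.sin(2 * math.pi * 30 * t / SAMPLE_RATE_HZ) for t in range(SAMPLE_RATE_HZ)]
vib = [r + 0.1 * math.sin(2 * math.pi * 160 * t / SAMPLE_RATE_HZ) for t, r in enumerate(ref)]

sample = build_sample("Is this bearing healthy?", vib, ref)  # placeholder instruction
text = json.dumps(sample)  # write this string to a .json file to use it as test data
```

Whether the demo expects a single object or a list of such objects per file is not stated here, so compare the result against the downloaded `demo_data.json` before relying on it.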
+
+ ### 2. Download Weights
+
+ You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
+
+ Additionally, you need to download the weights of [BearLLM](https://huggingface.co/SIA-IDE/BearLLM/tree/main).
+
+ ### 3. Organize Files
+
+ It is recommended to organize the weights and test data as follows:
+
+ ```
+ BearLLM/
+ ├── qwen_weights/
+ │   ├── model.safetensors
+ │   ├── tokenizer.json
+ │   ├── config.json
+ │   └── other files...
+ ├── bearllm_weights/
+ │   ├── vibration_adapter.pth
+ │   ├── adapter_config.json
+ │   └── adapter_model.safetensors
+ └── mbhm_dataset/
+     └── demo_data.json
+ ```
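
Before running the demo, it can help to confirm that the files named in the tree above are in place. The following is a minimal sketch of such a check; the `missing_files` helper is hypothetical and not part of the BearLLM repository, and it only covers the files listed explicitly (not the "other files" entry).

```python
from pathlib import Path

# Relative paths taken from the recommended layout above.
EXPECTED_FILES = [
    "qwen_weights/model.safetensors",
    "qwen_weights/tokenizer.json",
    "qwen_weights/config.json",
    "bearllm_weights/vibration_adapter.pth",
    "bearllm_weights/adapter_config.json",
    "bearllm_weights/adapter_model.safetensors",
    "mbhm_dataset/demo_data.json",
]


def missing_files(root):
    """Return the expected relative paths that do not exist as files under root."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Report anything missing under the project root (here assumed to be "BearLLM").
for rel in missing_files("BearLLM"):
    print(f"missing: {rel}")
```

If you store the weights elsewhere, the paths configured in `.env` are what the demo actually reads, so adjust the root or the list accordingly.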
+
+ ### 4. Run Code
+ First, copy the `.env.example` file to `.env` and modify the data paths inside.
+ Then, you can run the code using the following command:
+
+ ```bash
+ python run_demo.py
+ ```
+
+ ## ⚙️ Development
+
+ ### 1. Download Dataset
+
+ First, download the following files from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset (users in mainland China can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download):
+
+ - `data.hdf5`: Contains the vibration signal data.
+ - `corpus.json`: Contains the health management corpus.
+ - `metadata.sqlite`: Contains metadata information of the dataset.
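
The schema of `metadata.sqlite` is not described here, so a reasonable first step after downloading is to list its tables. The sketch below does that with the standard-library `sqlite3` module; the `list_tables` helper is hypothetical, and it opens the file read-only so that a mistyped path raises an error instead of silently creating an empty database.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in an SQLite file, opened read-only."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling `list_tables("metadata.sqlite")` from the directory that holds the downloaded file returns the table names, which you can then inspect with ordinary `SELECT` queries.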
+
+ ### 2. Download Weights
+
+ You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
+
+ ### 3. Modify Environment Variables
+
+ Copy the `.env.example` file to `.env` and modify the data paths inside.
+
+ ### 4. Pre-train and Fine-tune Model
+
+ Pre-train according to `src/pre_training.py`.
+ Fine-tune according to `src/fine_tuning.py`.
+
+ ## 📖 Citation
+ Please cite the following paper if you use this work in your research:
+
+ ```
+ @misc{peng2024bearllmpriorknowledgeenhancedbearing,
+     title={BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation},
+     author={Haotian Peng and Jiawei Liu and Jinsong Du and Jie Gao and Wei Wang},
+     year={2024},
+     eprint={2408.11281},
+     archivePrefix={arXiv},
+     primaryClass={cs.AI},
+     url={https://arxiv.org/abs/2408.11281},
+ }
+ ```