---
license: mit
datasets:
- SIA-IDE/MBHM
language:
- en
base_model:
- Qwen/Qwen2.5-1.5B-Instruct
---
<div align="center">
<a href="https://github.com/SIA-IDE/BearLLM">
<img src="https://raw.githubusercontent.com/SIA-IDE/BearLLM/refs/heads/main/docs/images/logo.svg" width="200" alt="logo"/>
</a>
<h1>BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation</h1>
<a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/Python-3.12-blue"></a>
<a href="https://pytorch.org/"><img alt="PyTorch" src="https://img.shields.io/badge/Pytorch-latest-orange"></a>
<a href="https://arxiv.org/abs/2408.11281"><img alt="arXiv" src="https://img.shields.io/badge/Paper-arXiv-B31B1B"></a>
<a href="https://huggingface.co/datasets/SIA-IDE/MBHM"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-🤗-FFFDF5"></a>
<a href="https://github.com/SIA-IDE/BearLLM"><img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/SIA-IDE/BearLLM"></a>
</div>
<h4 align="center">
<p>
<b>English</b> |
<a href="https://github.com/SIA-IDE/BearLLM/blob/main/docs/README_zh.md">简体中文</a>
</p>
</h4>
## NEWS
- **[2025-04-11]** The [AAAI-25 Proceedings](https://aaai.org/proceeding/aaai-39-2025/) are now officially published! Our [conference paper](https://ojs.aaai.org/index.php/AAAI/article/view/34188) is included. We welcome you to read and cite it!
- **[2025-03-06]** The complete dataset and code are now officially open source!
- **[2024-12-11]** We are now working on making the code of BearLLM public. Stay tuned!
- **[2024-12-10]** The BearLLM paper is accepted by the Thirty-Ninth AAAI Conference on Artificial Intelligence ([AAAI-25](https://aaai.org/conference/aaai/aaai-25/)).
- **[2024-08-21]** The preprint of the BearLLM paper is available on arXiv. Check the [paper page](https://arxiv.org/abs/2408.11281) for more details.
## TODO
- [ ] Improve related comments and documentation.
- [x] Upload the complete BearLLM demo code.
- [x] Upload the health management corpus of the MBHM dataset.
- [x] Collect the code for pre-training and fine-tuning BearLLM.
- [x] Collect the code for BearLLM's classification network and other comparison models.
- [x] Upload the vibration signal portion of the MBHM dataset.
## Introduction
The [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM) dataset is the first multimodal dataset designed for the study of bearing health management. It has two parts: vibration signals and a health management corpus. The vibration signals and condition information are derived from 9 publicly available datasets and are continuously updated and improved. Its thousands of working conditions pose a harder challenge for identification models and better represent real-world usage scenarios.
[BearLLM](https://github.com/SIA-IDE/BearLLM) is a prior knowledge-enhanced bearing health management framework with a unified vibration signal representation. This framework transforms the signal to be tested into the frequency domain, enabling effective identification of spectral differences compared to the vibration signal under fault-free conditions. By aligning the vibration signal with the fault semantic embedding, we achieve a unified natural language response for various health management tasks through a fine-tuned language model with low computational overhead. Experiments demonstrate that this framework achieves leading performance under thousands of working conditions.
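The idea of comparing a test signal with a fault-free reference in the frequency domain can be illustrated with a minimal sketch. This is not BearLLM's actual signal representation: the naive DFT, the sampling rate `FS`, and the synthetic 10 Hz and 35 Hz tones are all assumptions chosen for illustration.

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Return normalized DFT magnitudes for bins 0..N/2 (naive O(N^2) DFT, fine for a small demo)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_difference(vib, ref):
    """Per-bin magnitude difference between a test signal and a fault-free reference."""
    return [v - r for v, r in zip(magnitude_spectrum(vib), magnitude_spectrum(ref))]


FS = 200  # hypothetical sampling rate in Hz; a 1-second signal then has FS samples

# Hypothetical fault-free reference: a single 10 Hz component.
ref = [math.sin(2 * math.pi * 10 * t / FS) for t in range(FS)]
# Hypothetical signal under test: the same component plus an extra 35 Hz component.
vib = [r + 0.5 * math.sin(2 * math.pi * 35 * t / FS) for t, r in enumerate(ref)]

diff = spectral_difference(vib, ref)
# For a 1-second signal, the bin index equals the frequency in Hz.
peak_bin = max(range(len(diff)), key=lambda k: abs(diff[k]))
print(peak_bin)  # 35
```

The shared 10 Hz component cancels in the difference, so only the component absent from the reference stands out.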
## Requirements
The code is implemented in Python 3.12. The required packages are listed in the `requirements.txt` file. You can install the required packages by running the following command:
```bash
conda create --name bearllm python=3.12
conda activate bearllm
pip install -r requirements.txt
```
## Quick Start
### 1. Download Demo Data / Use Your Own Data
First, download `demo_data.json` from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset.
Users in mainland China can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download.
Alternatively, build your own test data in the same format, with the following fields:
- `instruction`: Text instruction for the health management task.
- `vib_data`: Vibration signal to be identified; it must be 1 second long.
- `ref_data`: Fault-free reference vibration signal; it must be 1 second long.
```json
{
  "instruction": "xxx.",
  "vib_data": [1.0, 0.0, 1.0, ...],
  "ref_data": [1.0, 0.0, 1.0, ...]
}
```
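If you build your own test data, a short script can assemble the three fields and write them to disk. The sketch below is only an illustration: the sampling rate `FS`, the synthetic sine waves, the placeholder instruction text, the equal-length check, and the output file name `my_test_data.json` are all assumptions, so substitute your real signals and settings.

```python
import json
import math

FS = 12000  # hypothetical sampling rate in Hz; use your sensor's actual rate


def one_second_tone(freq_hz, fs=FS):
    """Synthesize one second of a sine wave as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * t / fs) for t in range(fs)]


def build_sample(instruction, vib_data, ref_data):
    """Assemble one test sample with the three fields described above."""
    # Assumption: both signals are sampled at the same rate, so 1 second
    # of each has the same number of points.
    if len(vib_data) != len(ref_data):
        raise ValueError("vib_data and ref_data should have the same length")
    return {
        "instruction": instruction,
        "vib_data": list(vib_data),
        "ref_data": list(ref_data),
    }


sample = build_sample(
    "Placeholder instruction text.",   # hypothetical; write your own task instruction
    vib_data=one_second_tone(160.0),   # stand-in for the signal under test
    ref_data=one_second_tone(30.0),    # stand-in for the fault-free reference
)

with open("my_test_data.json", "w", encoding="utf-8") as f:
    json.dump(sample, f)
```

Unlike the template above, the written file is strict JSON: the arrays are complete and there is no trailing comma.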
### 2. Download Weights
You can download the pre-trained weights of [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
Additionally, you need to download the weights of [BearLLM](https://huggingface.co/SIA-IDE/BearLLM/tree/main).
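As an alternative to downloading through the browser, both repositories can be fetched with the `huggingface_hub` package. This is a sketch, assuming `huggingface_hub` is installed and that you want the directory names used in the layout of step 3; the helper name `download_all` is hypothetical.

```python
# (repo_id, repo_type, local_dir) triples; the directory names follow the layout in step 3.
DOWNLOADS = [
    ("Qwen/Qwen2.5-1.5B-Instruct", "model", "qwen_weights"),
    ("SIA-IDE/BearLLM", "model", "bearllm_weights"),
]


def download_all(downloads=DOWNLOADS):
    """Fetch each repository snapshot into its local directory."""
    # Imported lazily so this file can be loaded without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for repo_id, repo_type, local_dir in downloads:
        snapshot_download(repo_id=repo_id, repo_type=repo_type, local_dir=local_dir)


# Usage: call download_all() once, from inside the BearLLM/ directory.
```

Nothing is downloaded until `download_all()` is called, so you can adjust the target directories first.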
### 3. Organize Files
It is recommended to organize the weights and test data as follows:
```
BearLLM/
├── qwen_weights/
│   ├── model.safetensors
│   ├── tokenizer.json
│   ├── config.json
│   └── other files...
├── bearllm_weights/
│   ├── vibration_adapter.pth
│   ├── adapter_config.json
│   └── adapter_model.safetensors
└── mbhm_dataset/
    └── demo_data.json
```
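Before running the demo, it can help to confirm that the files are where the layout above expects them. The following is a small sketch of such a check; the helper `missing_files` is hypothetical and not part of the BearLLM code, and it only covers the files named explicitly in the layout.

```python
from pathlib import Path

# Relative paths taken from the recommended layout above.
EXPECTED_FILES = [
    "qwen_weights/model.safetensors",
    "qwen_weights/tokenizer.json",
    "qwen_weights/config.json",
    "bearllm_weights/vibration_adapter.pth",
    "bearllm_weights/adapter_config.json",
    "bearllm_weights/adapter_model.safetensors",
    "mbhm_dataset/demo_data.json",
]


def missing_files(root):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files(".")  # run from the BearLLM/ directory
    for rel in missing:
        print(f"missing: {rel}")
    if not missing:
        print("layout looks complete")
```

If you store the files elsewhere, the paths in `.env` are what matter; this check only mirrors the recommended layout.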
### 4. Run Code
First, copy the `.env.example` file to `.env` and modify the data paths inside.
Then, you can run the code using the following command:
```bash
python run_demo.py
```
## Development
### 1. Download Dataset
Users in mainland China can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download. Download the following files from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset:
- `data.hdf5`: Contains the vibration signal data.
- `corpus.json`: Contains the health management corpus.
- `metadata.sqlite`: Contains metadata information of the dataset.
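Since the schema of `metadata.sqlite` is not described here, one way to get oriented after downloading is to list its tables with Python's built-in `sqlite3` module. This is a minimal sketch; the helper `list_tables` is hypothetical and makes no assumption about table or column names.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    # sqlite3.connect would silently create a missing file, so check first.
    if not Path(db_path).is_file():
        raise FileNotFoundError(db_path)
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


# Usage: print(list_tables("metadata.sqlite"))
```

From there, `PRAGMA table_info(<table>)` shows the columns of any table you want to inspect.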
### 2. Download Weights
You can download the pre-trained weights of [Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
### 3. Modify Environment Variables
Copy the `.env.example` file to `.env` and modify the data paths inside.
### 4. Pre-train and Fine-tune Model
Pre-train according to `src/pre_training.py`.
Fine-tune according to `src/fine_tuning.py`.
## Citation
Please cite the following paper if you use this work in your research:
```
@article{pengBearLLMPriorKnowledgeEnhanced2025,
title = {{{BearLLM}}: {{A Prior Knowledge-Enhanced Bearing Health Management Framework}} with {{Unified Vibration Signal Representation}}},
author = {Peng, Haotian and Liu, Jiawei and Du, Jinsong and Gao, Jie and Wang, Wei},
year = {2025},
month = apr,
journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
volume = {39},
number = {19},
pages = {19866--19874},
issn = {2374-3468},
doi = {10.1609/aaai.v39i19.34188},
urldate = {2025-04-11},
}
```