| | ---
|
| | license: mit
|
| | datasets:
|
| | - SIA-IDE/MBHM
|
| | language:
|
| | - zho
|
| | - eng
|
| | - fra
|
| | - spa
|
| | - por
|
| | - deu
|
| | - ita
|
| | - rus
|
| | - jpn
|
| | - kor
|
| | - vie
|
| | - tha
|
| | - ara
|
| | base_model:
|
| | - Qwen/Qwen2.5-1.5B-Instruct
|
| | ---
|
| | <div align="center">
|
| | <a href="https://github.com/SIA-IDE/BearLLM">
|
| | <img src="https://raw.githubusercontent.com/SIA-IDE/BearLLM/refs/heads/main/docs/images/logo.svg" width="200" alt="logo"/>
|
| | </a>
|
| | <h1>BearLLM: A Prior Knowledge-Enhanced Bearing Health Management Framework with Unified Vibration Signal Representation</h1>
|
| |
|
| | <a href="https://www.python.org/"><img alt="Python" src="https://img.shields.io/badge/Python-3.12-blue"></a>
|
| | <a href="https://pytorch.org/"><img alt="PyTorch" src="https://img.shields.io/badge/Pytorch-latest-orange"></a>
|
| | <a href="https://arxiv.org/abs/2408.11281"><img alt="arXiv" src="https://img.shields.io/badge/Paper-arXiv-B31B1B"></a>
|
| | <a href="https://huggingface.co/datasets/SIA-IDE/MBHM"><img alt="Dataset" src="https://img.shields.io/badge/Dataset-๐ค-FFFDF5"></a>
|
| | <a href="https://github.com/SIA-IDE/BearLLM"><img alt="GitHub Repo stars" src="https://img.shields.io/github/stars/SIA-IDE/BearLLM"></a>
|
| | </div>
|
| |
|
| | <h4 align="center">
|
| | <p>
|
| | <b>English</b> |
|
| | <a href="https://github.com/SIA-IDE/BearLLM/blob/main/docs/README_zh.md">็ฎไฝไธญๆ</a>
|
| | </p>
|
| | </h4>
|
| |
|
| | ## ๐ฅ NEWS
|
| | - **[2025-04-11]** ๐ The [AAAI-25 Proceedings](https://aaai.org/proceeding/aaai-39-2025/) are now officially published! Our [conference paper](https://ojs.aaai.org/index.php/AAAI/article/view/34188) is included. We welcome you to read and cite it!
|
| | - **[2025-03-06]** ๐ The complete dataset and code are now officially open source!
|
| | - **[2024-12-11]** โซ We are now working on making the code of BearLLM public. Stay tuned!
|
| | - **[2024-12-10]** ๐ The BearLLM paper is accepted by the Thirty-Ninth AAAI Conference on Artificial Intelligence ([AAAI-25](https://aaai.org/conference/aaai/aaai-25/)).
|
| | - **[2024-08-21]** ๐ The preprint of the BearLLM paper is available on arXiv. Check the [paper page](https://arxiv.org/abs/2408.11281) for more details.
|
| |
|
| | ## ๐
TODO
|
| | - [ ] Improve related comments and documentation.
|
| | - [x] Upload the complete BearLLM demo code.
|
| | - [x] Upload the health management corpus of the MBHM dataset.
|
| | - [x] Collect the codes for pre-training and fine-tuning BearLLM.
|
| | - [x] Collect the codes of BearLLM's classification network and other comparison models.
|
| | - [x] Upload the vibration signal portion of the MBHM dataset.
|
| |
|
| | ## ๐ Introduction
|
| | The [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM) dataset is the first multimodal dataset designed for the study of bearing health management. It is divided into two parts: vibration signals and health management corpus. The vibration signals and condition information are derived from 9 publicly available datasets, and are still under continuous updating and improvement. The thousands of working conditions pose more difficult challenges for the identification model and better represent real-world usage scenarios.
|
| |
|
| | [BearLLM](https://github.com/SIA-IDE/BearLLM) is a prior knowledge-enhanced bearing health management framework with a unified vibration signal representation. This framework transforms the signal to be tested into the frequency domain, enabling effective identification of spectral differences compared to the vibration signal under fault-free conditions. By aligning the vibration signal with the fault semantic embedding, we achieve a unified natural language response for various health management tasks through a fine-tuned language model with low computational overhead. Experiments demonstrate that this framework achieves leading performance under thousands of working conditions.
|
| |
|
| | ## ๐ป Requirements
|
| |
|
| | The code is implemented in Python 3.12. The required packages are listed in the `requirements.txt` file. You can install the required packages by running the following command:
|
| |
|
| | ```bash
|
| | conda create --name bearllm python=3.12
|
| | conda activate bearllm
|
| | pip install -r requirements.txt
|
| | ```
|
| |
|
| |
|
| | ## ๐ Quick Start
|
| |
|
| | ### 1. Download Demo Data / Use Your Own Data
|
| |
|
| | First, you need to download the `demo_data.json` from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset.
|
| | For users in mainland China, you can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download:
|
| |
|
| | Or, you can also build your own test data in the same format:
|
| | `instruction`: Text instruction for health management task.
|
| | `vib_data`: Vibration signal data to be identified, with a required duration of 1 second.
|
| | `ref_data`: Reference vibration signal data without faults, with a required duration of 1 second.
|
| |
|
| | ```json
|
| | {
|
| | "instruction": "xxx.",
|
| | "vib_data": [1.0, 0.0, 1.0, ...],
|
| | "ref_data": [1.0, 0.0, 1.0, ...],
|
| | }
|
| | ```
|
| |
|
| | ### 2. Download Weights
|
| |
|
| | You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
|
| |
|
| | Additionally, you need to download the weights of [BearLLM](https://huggingface.co/SIA-IDE/BearLLM/tree/main).
|
| |
|
| | ### 3. Organize Files
|
| |
|
| | It is recommended to organize the weights and test data as follows:
|
| |
|
| | ```
|
| | BearLLM/
|
| | โโโ qwen_weights/
|
| | โ โโโ model.safetensors
|
| | โ โโโ tokenizer.json
|
| | โ โโโ config.json
|
| | โ โโโ other files...
|
| | โโโ bearllm_weights/
|
| | โ โโโ vibration_adapter.pth
|
| | โ โโโ adapter_config.json
|
| | โ โโโ adapter_model.safetensors
|
| | โโโ mbhm_dataset/
|
| | โโโ demo_data.json
|
| | ```
|
| |
|
| | ### 4. Run Code
|
| | First, copy the `.env.example` file to `.env` and modify the data paths inside.
|
| | Then, you can run the code using the following command:
|
| |
|
| | ```bash
|
| | python run_demo.py
|
| | ```
|
| |
|
| | ## โ๏ธ Development
|
| |
|
| | ### 1. Download Dataset
|
| |
|
| | First, you need to download the following files from the [MBHM](https://huggingface.co/datasets/SIA-IDE/MBHM/tree/main) dataset. For users in mainland China, you can use the [mirror link](https://hf-mirror.com/datasets/SIA-IDE/MBHM/tree/main) to speed up the download:
|
| |
|
| | - `data.hdf5`: Contains the vibration signal data.
|
| | - `corpus.json`: Contains the health management corpus.
|
| | - `metadata.sqlite`: Contains metadata information of the dataset.
|
| |
|
| | ### 2. Download Weights
|
| |
|
| | You can download the pre-trained weights of [Qwen2.5-1.5B](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/tree/main) from Hugging Face.
|
| |
|
| | ### 3. Modify Environment Variables
|
| |
|
| | Copy the `.env.example` file to `.env` and modify the data paths inside.
|
| |
|
| | ### 4. Pre-train and Fine-tune Model
|
| |
|
| | Pre-train according to `src/pre_training.py`.
|
| | Fine-tune according to `src/fine_tuning.py`.
|
| |
|
| | ## ๐ Citation
|
| | Please cite the following paper if you use this study in your research:
|
| |
|
| | ```
|
| | @article{pengBearLLMPriorKnowledgeEnhanced2025,
|
| | title = {{{BearLLM}}: {{A Prior Knowledge-Enhanced Bearing Health Management Framework}} with {{Unified Vibration Signal Representation}}},
|
| | author = {Peng, Haotian and Liu, Jiawei and Du, Jinsong and Gao, Jie and Wang, Wei},
|
| | year = {2025},
|
| | month = apr,
|
| | journal = {Proceedings of the AAAI Conference on Artificial Intelligence},
|
| | volume = {39},
|
| | number = {19},
|
| | pages = {19866--19874},
|
| | issn = {2374-3468},
|
| | doi = {10.1609/aaai.v39i19.34188},
|
| | urldate = {2025-04-11},
|
| | }
|
| | ``` |