Marek Bukowicki committed
Commit 64b4096 · 1 Parent(s): 12f8c5c

add shimnet code
Dockerfile ADDED
@@ -0,0 +1,13 @@
+ FROM python:3.10
+
+ WORKDIR /usr/src/app
+
+ COPY requirements-cpu.txt requirements-gui.txt ./
+ RUN pip install --no-cache-dir -r requirements-cpu.txt -r requirements-gui.txt --extra-index-url https://download.pytorch.org/whl/cpu
+
+ COPY . .
+
+ RUN python download_files.py --overwrite
+
+ CMD [ "python", "./predict-gui.py", "--server_name", "0.0.0.0" ]
+
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 center4ml
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md CHANGED
@@ -9,4 +9,224 @@ license: mit
  short_description: ShimNet Spectra Correction
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # ShimNet
+ ShimNet is a data-driven AI solution to improve high-resolution nuclear magnetic resonance (NMR) spectra
+ distorted by an inhomogeneous magnetic field (less than optimal shimming). To use it, experimental training data has to be collected (see **Training data collection** below).
+ Example data can also be downloaded (see below).
+
+ Paper: [ShimNet: A neural network for post-acquisition improvement of NMR spectra distorted by magnetic-field inhomogeneity](https://chemrxiv.org/engage/chemrxiv/article-details/67ef866)
+
+ ## Installation
+
+ Python 3.9+ (3.10+ for GUI)
+
+ GPU version (for training and inference):
+ ```
+ pip install -r requirements-gpu.txt
+ ```
+
+ CPU version (for inference; not recommended for training):
+ ```
+ pip install -r requirements-cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu
+ ```
+
+ ## Usage
+ To correct the spectra presented in the paper:
+ 1. Download the weights (model parameters):
+ ```
+ python download_files.py
+ ```
+ or download them directly from [Google Drive 700MHz](https://drive.google.com/uc?export=download&id=17fTNWl7YW6mPbbZWga0EfdoF_6S8fCke) and [Google Drive 600MHz](https://drive.google.com/uc?export=download&id=1_VxOpFGJcFsOa5DHOW2GJbP8RvHCmC1N) and place them in the `weights` directory.
+
+ 2. Run the correction (e.g. on `Azarone_20ul_700MHz.csv`):
+ ```
+ python predict.py sample_data/Azarone_20ul_700MHz.csv -o output --config configs/shimnet_700.yaml --weights weights/shimnet_700MHz.pt
+ ```
+ The output will be the file `output/Azarone_20ul_700MHz_processed.csv`.
+
+ Multiple files may be processed using the "*" glob syntax:
+ ```
+ python predict.py sample_data/*700MHz.csv -o output --config configs/shimnet_700.yaml --weights weights/shimnet_700MHz.pt
+ ```
+
+ For 600 MHz data use `--config configs/shimnet_600.yaml` and `--weights weights/shimnet_600MHz.pt`, e.g.:
+
+ ```
+ python predict.py sample_data/CresolRed_after_styrene_600MHz.csv -o output --config configs/shimnet_600.yaml --weights weights/shimnet_600MHz.pt
+ ```
+
+ ### Input format
+
+ The spectrum file for reconstruction should contain two columns separated by a space, with no newline character after the last line of the file (example below):
+ ```csv
+ -1.97134 0.0167137
+ -1.97085 -0.00778748
+ -1.97036 -0.0109595
+ -1.96988 0.00825978
+ -1.96939 0.0133886
+ ```
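Such a file can be read with NumPy's `np.loadtxt` (the same routine the GUI uses internally); a quick sketch with the example rows above inlined:

```python
import numpy as np
from io import StringIO

# The example rows above, inlined for illustration
text = """-1.97134 0.0167137
-1.97085 -0.00778748
-1.97036 -0.0109595
-1.96988 0.00825978
-1.96939 0.0133886"""

data = np.loadtxt(StringIO(text))
freqs, intensities = data[:, 0], data[:, 1]
print(data.shape)  # (5, 2)
```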
+
+ ## Train on your data
+
+ For the model to function properly, it should be trained on calibration data from the spectrometer used for the measurements. To train a model on data from your spectrometer, please follow the instructions below.
+
+ ### Training data collection
+
+ Below we describe training data collection for Agilent/Varian spectrometers. A similar procedure can be implemented for machines from other vendors.
+ To collect ShimNet training data, use the Python script `sweep_shims_lineshape_Z1Z2.py` from the `calibration_loop` folder to drive the spectrometer:
+ 1. Install the TReNDS package (trends.spektrino.com)
+ 2. Open VnmrJ and type: 'listenon'
+ 3. Put in the lineshape sample (1% CHCl3 in deuterated acetone), set standard PROTON parameters, and set nt=1 (do not modify sw and at!)
+ 4. Shim the sample and collect the data. Save the optimally shimmed dataset
+ 5. Edit the sweep_shims_lineshape_Z1Z2.py script
+ 6. Put the optimum z1 and z2 shim values in as optiz1 and optiz2
+ 7. Define the calibration range as range_z1 and range_z2 (the default is OK)
+ 8. Start the Python script:
+ ```
+ python3 ./sweep_shims_lineshape_Z1Z2.py
+ ```
+ The spectrometer will then start collecting spectra.
+
+ ### SCRF extraction
+ Shim Coil Response Functions (SCRFs) should be extracted from the spectra with the `extract_scrf_from_fids.py` script:
+ ```
+ python extract_scrf_from_fids.py
+ ```
+
+ The script uses hardcoded paths to the NMR signals (fids) in Agilent/Varian format: a directory with the optimal measurement (`opti_fid_path`) and a directory with the calibration loop measurements (`data_dir`):
+ ```python
+ # input
+ data_dir = "../../sample_run/loop"
+ opti_fid_path = "../../sample_run/opti.fid"
+ ```
+
+ The output files are also hardcoded:
+ ```python
+ # output
+ spectra_file = "../../sample_run/total.npy"
+ spectra_file_names = "../../sample_run/total.csv"
+ opi_spectrum_file = "../../sample_run/opti.npy"
+ responses_file = "../../sample_run/scrf_61.pt"
+ ```
+ where only the `responses_file` is used in ShimNet training.
+
+ If the measurements are stored in a format other than Varian, you may need to change this line:
+ ```python
+ dic, data = ng.varian.read(varian_fid_path)
+ ```
+ (see the nmrglue package documentation for details)
+
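The underlying model is simple: a distorted spectrum is approximately the well-shimmed spectrum convolved with the SCRF. A synthetic NumPy illustration of this forward model (all line widths and sizes made up, not taken from the scripts):

```python
import numpy as np

x = np.arange(-128, 129)
optimal = np.exp(-0.5 * (x / 1.5) ** 2)            # sharp, well-shimmed line
scrf = np.exp(-0.5 * (np.arange(-30, 31) / 8.0) ** 2)
scrf /= scrf.sum()                                  # normalized response function

distorted = np.convolve(optimal, scrf, mode="same")

# Convolution with the SCRF preserves the integral but broadens the peak
print(optimal.max() > distorted.max())  # True
```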
+ ### Training
+
+ 1. Download the multiplets database:
+ ```
+ python download_files.py --multiplets
+ ```
+ 2. Configure the run:
+ - create a run directory, e.g. `runs/my_lab_spectrometer_2025`
+ - create a configuration file:
+ 1. copy `configs/shimnet_template.yaml` to the run directory and rename it to `config.yaml`
+ ```bash
+ cp configs/shimnet_template.yaml runs/my_lab_spectrometer_2025/config.yaml
+ ```
+ 2. edit the SCRF path in the config file:
+ ```yaml
+ response_functions_files:
+ - path/to/srcf_file
+ ```
+ e.g.
+ ```yaml
+ response_functions_files:
+ - ../../sample_run/scrf_61.pt
+ ```
+ 3. adjust the spectrometer frequency step `frq_step` to match your data (spectrometer range in Hz divided by the number of points in the spectrum):
+ ```yaml
+ frq_step: 0.34059797
+ ```
+ 4. adjust the spectrometer frequency in the metadata:
+ ```yaml
+ metadata: # additional metadata, not used in the training process
+   spectrometer_frequency: 700.0 # MHz
+ ```
+ 3. Run training:
+ ```
+ python train.py runs/my_lab_spectrometer_2025
+ ```
+ Training results will appear in the `runs/my_lab_spectrometer_2025` directory.
+ Model parameters are stored in the `runs/my_lab_spectrometer_2025/model.pt` file.
+ 4. Use the trained model:
+
+ use the `--config runs/my_lab_spectrometer_2025/config.yaml` and `--weights runs/my_lab_spectrometer_2025/model.pt` flags, e.g.
+ ```
+ python predict.py my_sample1.csv -o my_output --config runs/my_lab_spectrometer_2025/config.yaml --weights runs/my_lab_spectrometer_2025/model.pt
+ ```
+
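The `frq_step` quotient can be checked in a couple of lines; the spectral width and point count below are hypothetical placeholders for your own acquisition parameters:

```python
# frq_step = spectral width (Hz) / number of points in the spectrum
spectral_width_hz = 11161.0   # hypothetical value from your acquisition
n_points = 32768              # hypothetical number of spectrum points

frq_step = spectral_width_hz / n_points
print(round(frq_step, 5))  # 0.34061
```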
+ ## Repeat training on our data
+
+ If you want to train the network using the calibration data from our paper, follow the procedure below.
+
+ 1. Download the multiplets database and our SCRF files:
+ ```
+ python download_files.py --multiplets --SCRF --no-weights
+ ```
+ or download them directly from Google Drive and store them in the `data/` directory: [Response Functions 600MHz](https://drive.google.com/file/d/1J-DsPtaITXU3TFrbxaZPH800U1uIiwje/view?usp=sharing), [Response Functions 700MHz](https://drive.google.com/file/d/113al7A__yYALx_2hkESuzFIDU3feVtNY/view?usp=sharing), [Multiplets data](https://drive.google.com/file/d/1QGvV-Au50ZxaP1vFsmR_auI299Dw-Wrt/view?usp=sharing)
+
+ 2. Configure the run
+ - For a 600 MHz spectrometer:
+ ```bash
+ mkdir -p runs/repeat_paper_training_600MHz
+ cp configs/shimnet_600.yaml runs/repeat_paper_training_600MHz/config.yaml
+ ```
+ - For a 700 MHz spectrometer:
+ ```bash
+ mkdir -p runs/repeat_paper_training_700MHz
+ cp configs/shimnet_700.yaml runs/repeat_paper_training_700MHz/config.yaml
+ ```
+ 3. Run training:
+ ```
+ python train.py runs/repeat_paper_training_600MHz
+ ```
+ or
+ ```
+ python train.py runs/repeat_paper_training_700MHz
+ ```
+ Training results will appear in the `runs/repeat_paper_training_600MHz` or `runs/repeat_paper_training_700MHz` directory.
+
+ ## GUI
+
+ ### Installation
+
+ To use the ShimNet GUI, ensure you have Python 3.10 installed (not tested with Python 3.11+). After installing the ShimNet requirements (CPU/GPU), install the additional dependencies for the GUI:
+
+ ```bash
+ pip install -r requirements-gui.txt
+ ```
+
+ ### Launching the GUI
+
+ The ShimNet GUI is built with Gradio. To start the application, run:
+
+ ```bash
+ python predict-gui.py
+ ```
+
+ Once the application starts, open your browser and navigate to:
+
+ ```
+ http://127.0.0.1:7860
+ ```
+
+ to access the GUI locally.
+
+ ### Sharing the GUI
+
+ To make the GUI accessible over the internet, use the `--share` flag:
+
+ ```bash
+ python predict-gui.py --share
+ ```
+
+ A public web address will be displayed in the terminal, which you can use to access the GUI remotely or share with others.
calibration_loop/maclib/shimator_loop ADDED
@@ -0,0 +1,33 @@
+ " macro shimnet loop "
+ " arguments: optimal z1, optimal z2, step in z1 loop, step in z2 loop, half number of steps in z1, half number of steps in z2"
+
+ $svfdir='/home/nmrbox/kkazimierczuk/shimnet/data/'
+
+ $opti_z1=$1
+ $opti_z2=$2
+ $step_z1=$3
+ $step_z2=$4
+ $steps_z1=$5
+ $steps_z2=$6
+ $nn=''
+ $bb=''
+ $msg=''
+ format((($steps_z1*2+1)*($steps_z2*2+1)*(d1+at)*nt)/3600,5,1):$msg
+ $msg='Time: '+$msg+' h'
+ banner($msg)
+ $j=0
+ repeat
+ $i=0
+ $z2=$opti_z2-$steps_z2*$step_z2+$j*$step_z2
+ format($z2,5,1):$bb
+ repeat
+ $z1=$opti_z1-$steps_z1*$step_z1+$i*$step_z1
+ "su"
+ "go"
+ format($z1,5,1):$nn
+ $filepath=$svfdir + 'z1_'+$nn+'z2_'+$bb+'.fid'
+ svf($filepath,'force')
+ $i=$i+1
+ until $z1>$opti_z1+$steps_z1*$step_z1-1
+ $j=$j+1
+ until $z2>$opti_z2+$steps_z2*$step_z2-1
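The macro's banner estimates the total acquisition time as `(2*steps_z1+1)*(2*steps_z2+1)*(d1+at)*nt/3600` hours; a quick check of that arithmetic with made-up acquisition parameters:

```python
# All parameter values below are hypothetical examples
steps_z1, steps_z2 = 50, 50   # half number of steps per shim
d1, at = 1.0, 2.0             # relaxation delay and acquisition time (s)
nt = 1                        # scans per spectrum

hours = (2 * steps_z1 + 1) * (2 * steps_z2 + 1) * (d1 + at) * nt / 3600
print(round(hours, 1))  # 8.5
```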
calibration_loop/sweep_shims_lineshape_Z1Z2.py ADDED
@@ -0,0 +1,39 @@
+ from TReND.send2vnmr import *
+ import shutil
+ import time
+ import os
+ import numpy as np
+ from subprocess import call
+
+ # This is the script to collect ShimNet training data on Agilent spectrometers. Execution:
+ # 1. Open VnmrJ and type: 'listenon'
+ # 2. Put the lineshape sample, set standard PROTON parameters and set one scan (do not modify sw and at!)
+ # 3. Shim the sample and collect the data. Save the optimally shimmed dataset
+ # 4. Put optimum z1 and z2 shim values as optiz1 and optiz2 below
+ # 5. Define the calibration range as range_z1 and range_z2 (default is ok)
+ # 6. Start the python script:
+ #    python3 ./sweep_shims_lineshape_Z1Z2.py
+ #    the spectrometer will start collecting spectra
+
+ Classic_path = '/home/nmr700/shimnet6/lshp'
+ optiz1 = 8868  # put optimum shim values here
+ optiz2 = -297
+
+ range_z1 = 100  # put shim calibration ranges here
+ range_z2 = 100
+
+ z1_sweep = np.arange(optiz1-range_z1, optiz1+range_z1+1, 2.0)
+ z2_sweep = np.arange(optiz2-range_z2, optiz2+range_z2+1, 2.0)
+
+ for i in range(1, np.shape(z1_sweep)[0]+1, 1):
+     for j in range(1, np.shape(z2_sweep)[0]+1, 1):
+         wait_until_idle()
+
+         Run_macro("sethw('z1'," + str(z1_sweep[i-1]) + ")")
+         Run_macro("sethw('z2'," + str(z2_sweep[j-1]) + ")")
+
+         go_if_idle()
+         wait_until_idle()
+         time.sleep(0.5)
+
+         Save_experiment(Classic_path + '_z1_' + str(int(z1_sweep[i-1])) + '_z2_' + str(int(z2_sweep[j-1])) + '.fid')
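With the defaults above (range ±100, step 2), the nested loop acquires a 101 × 101 grid of spectra; the sweep sizes can be verified directly:

```python
import numpy as np

optiz1, optiz2 = 8868, -297   # optimum shim values from the script
range_z1 = range_z2 = 100

z1_sweep = np.arange(optiz1 - range_z1, optiz1 + range_z1 + 1, 2.0)
z2_sweep = np.arange(optiz2 - range_z2, optiz2 + range_z2 + 1, 2.0)

print(len(z1_sweep), len(z2_sweep), len(z1_sweep) * len(z2_sweep))  # 101 101 10201
```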
configs/shimnet_600.yaml ADDED
@@ -0,0 +1,46 @@
+ model:
+   name: ShimNetWithSCRF
+   kwargs:
+     rensponse_length: 81
+     resnponse_head_dims:
+     - 128
+ training:
+ - batch_size: 64
+   learning_rate: 0.001
+   max_iters: 1600000
+ - batch_size: 512
+   learning_rate: 0.001
+   max_iters: 25600000
+ - batch_size: 512
+   learning_rate: 0.0005
+   max_iters: 12800000
+ losses_weights:
+   clean: 1.0
+   noised: 1.0
+   response: 1.0
+ data:
+   response_functions_files:
+   # Paste path to your SCRF file here
+   # - Can be absolute path
+   # - Can be relative to repository root
+   - data/scrf_81_600MHz.pt
+   atom_groups_data_file: data/multiplets_10000_parsed.txt
+   response_function_stretch_min: 1.0
+   response_function_stretch_max: 1.0
+   response_function_noise: 0.0
+   multiplicity_j1_min: 0.0
+   multiplicity_j1_max: 15
+   multiplicity_j2_min: 0.0
+   multiplicity_j2_max: 15
+   number_of_signals_min: 2
+   number_of_signals_max: 5
+   thf_min: 0.5
+   thf_max: 2
+   relative_height_min: 0.5
+   relative_height_max: 4
+   frq_step: 0.30048
+ logging:
+   step: 1000000
+   num_plots: 32
+ metadata:
+   spectrometer_frequency: 600.0
configs/shimnet_700.yaml ADDED
@@ -0,0 +1,46 @@
+ model:
+   name: ShimNetWithSCRF
+   kwargs:
+     rensponse_length: 61
+     resnponse_head_dims:
+     - 128
+ training:
+ - batch_size: 64
+   learning_rate: 0.001
+   max_iters: 1600000
+ - batch_size: 512
+   learning_rate: 0.001
+   max_iters: 6400000
+ - batch_size: 512
+   learning_rate: 0.0005
+   max_iters: 12800000
+ - batch_size: 800
+   learning_rate: 0.0005
+   max_iters: 12800000
+ losses_weights:
+   clean: 1.0
+   noised: 1.0
+   response: 1.0
+ data:
+   response_functions_files:
+   - data/scrf_61_700MHz.pt
+   atom_groups_data_file: data/multiplets_10000_parsed.txt
+   response_function_stretch_min: 1.0
+   response_function_stretch_max: 1.0
+   response_function_noise: 0.0
+   multiplicity_j1_min: 0.0
+   multiplicity_j1_max: 15
+   multiplicity_j2_min: 0.0
+   multiplicity_j2_max: 15
+   number_of_signals_min: 2
+   number_of_signals_max: 5
+   thf_min: 0.5
+   thf_max: 2
+   relative_height_min: 0.5
+   relative_height_max: 4
+   frq_step: 0.34059797
+ logging:
+   step: 1000000
+   num_plots: 32
+ metadata:
+   spectrometer_frequency: 700.0
configs/shimnet_template.yaml ADDED
@@ -0,0 +1,50 @@
+ model:
+   name: ShimNetWithSCRF
+   kwargs:
+     rensponse_length: 61
+     resnponse_head_dims:
+     - 128
+ training:
+ - batch_size: 64
+   learning_rate: 0.001
+   max_iters: 1600000
+ - batch_size: 512
+   learning_rate: 0.001
+   max_iters: 6400000
+ - batch_size: 512
+   learning_rate: 0.0005
+   max_iters: 12800000
+ - batch_size: 800
+   learning_rate: 0.0005
+   max_iters: 12800000
+ losses_weights:
+   clean: 1.0
+   noised: 1.0
+   response: 1.0
+ data:
+   response_functions_files:
+   # Specify the path to your SCRF file/files here.
+   # - It can be an absolute path.
+   # - It can be relative to the repository root.
+   # Multiple files can be listed in the following rows (YAML list format).
+   - path/to/srcf_file
+   atom_groups_data_file: data/multiplets_10000_parsed.txt
+   response_function_stretch_min: 1.0 # relative
+   response_function_stretch_max: 1.0 # relative
+   response_function_noise: 0.0 # arbitrary height units
+   multiplicity_j1_min: 0.0 # Hz
+   multiplicity_j1_max: 15 # Hz
+   multiplicity_j2_min: 0.0 # Hz
+   multiplicity_j2_max: 15 # Hz
+   number_of_signals_min: 2
+   number_of_signals_max: 5
+   thf_min: 0.5 # arbitrary height units
+   thf_max: 2 # arbitrary height units
+   relative_height_min: 0.5 # relative
+   relative_height_max: 4 # relative
+   frq_step: 0.34059797 # Hz per point
+ logging:
+   step: 1000000
+   num_plots: 32
+ metadata: # additional metadata, not used in the training process
+   spectrometer_frequency: 700.0 # MHz
download_files.py ADDED
@@ -0,0 +1,95 @@
+ import urllib.request
+ from pathlib import Path
+ import argparse
+
+ ALL_FILES_TO_DOWNLOAD = {
+     "weights": [{
+         "url": "https://drive.google.com/uc?export=download&id=17fTNWl7YW6mPbbZWga0EfdoF_6S8fCke",
+         "destination": "weights/shimnet_700MHz.pt"
+     },
+     {
+         "url": "https://drive.google.com/uc?export=download&id=1_VxOpFGJcFsOa5DHOW2GJbP8RvHCmC1N",
+         "destination": "weights/shimnet_600MHz.pt"
+     }],
+     "SCRF": [{
+         "url": "https://drive.google.com/uc?export=download&id=113al7A__yYALx_2hkESuzFIDU3feVtNY",
+         "destination": "data/scrf_61_700MHz.pt"
+     },
+     {
+         "url": "https://drive.google.com/uc?export=download&id=1J-DsPtaITXU3TFrbxaZPH800U1uIiwje",
+         "destination": "data/scrf_81_600MHz.pt"
+     }],
+     "multiplets": [{
+         "url": "https://drive.google.com/uc?export=download&id=1QGvV-Au50ZxaP1vFsmR_auI299Dw-Wrt",
+         "destination": "data/multiplets_10000_parsed.txt"
+     }],
+     "development": []
+ }
+
+ def parse_args():
+     parser = argparse.ArgumentParser(
+         description='Download files: weights (default), SCRF (optional), multiplet data (optional)',
+     )
+     parser.add_argument('--overwrite', action='store_true', help='Overwrite existing files')
+     parser.add_argument(
+         '--weights',
+         action='store_true',
+         default=True,
+         help='Download weights file (default behavior). Use --no-weights to opt out.',
+     )
+     parser.add_argument(
+         '--no-weights',
+         action='store_false',
+         dest='weights',
+         help='Do not download weights file.',
+     )
+     parser.add_argument('--SCRF', action='store_true', help='Download SCRF files - Shim Coil Response Functions')
+     parser.add_argument('--multiplets', action='store_true', help='Download multiplets data file')
+     parser.add_argument('--development', action='store_true', help='Download development weights file')
+
+     parser.add_argument('--all', action='store_true', help='Download all available files')
+
+     args = parser.parse_args()
+     # Set all individual flags if --all is specified
+     if args.all:
+         args.weights = True
+         args.SCRF = True
+         args.multiplets = True
+         args.development = True
+
+     return args
+
+ def download_file(url, target, overwrite=False):
+     target = Path(target)
+     if target.exists() and not overwrite:
+         response = input(f"File {target} already exists. Overwrite? (y/n): ")
+         if response.lower() != 'y':
+             print(f"Download of {target} cancelled")
+             return
+     target.parent.mkdir(parents=True, exist_ok=True)
+     try:
+         urllib.request.urlretrieve(url, target)
+         print(f"Downloaded {target}")
+     except Exception as e:
+         print(f"Failed to download file from {url}:\n {e}")
+
+
+ if __name__ == "__main__":
+     args = parse_args()
+
+     main_dir = Path(__file__).parent
+     if args.weights:
+         for file_data in ALL_FILES_TO_DOWNLOAD["weights"]:
+             download_file(file_data["url"], main_dir / file_data["destination"], args.overwrite)
+
+     if args.SCRF:
+         for file_data in ALL_FILES_TO_DOWNLOAD["SCRF"]:
+             download_file(file_data["url"], main_dir / file_data["destination"], args.overwrite)
+
+     if args.multiplets:
+         for file_data in ALL_FILES_TO_DOWNLOAD["multiplets"]:
+             download_file(file_data["url"], main_dir / file_data["destination"], args.overwrite)
+
+     if args.development:
+         for file_data in ALL_FILES_TO_DOWNLOAD["development"]:
+             download_file(file_data["url"], main_dir / file_data["destination"], args.overwrite)
extract_scrf_from_fids.py ADDED
@@ -0,0 +1,132 @@
+ import os
+ import shutil
+ from pathlib import Path
+ import matplotlib.pyplot as plt
+ import numpy as np
+ import pandas as pd
+ import torch
+ import nmrglue as ng
+ from tqdm import tqdm
+
+ # input
+ data_dir = "../../../dane/600Hz_20241227/loop"
+ opti_fid_path = "../../../dane/600Hz_20241227/opti"
+
+ # output
+ spectra_file = "data/600Hz_20241227-Lnative/total.npy"
+ spectra_file_names = "data/600Hz_20241227-Lnative/total.csv"
+ opi_spectrum_file = "data/600Hz_20241227-Lnative/opti.npy"
+ responses_file = "data/600Hz_20241227-Lnative/scrf_61.pt"
+ losses_file = "data/600Hz_20241227-Lnative/losses_scrf_61.pt"
+
+ # create directories for output files
+ for file_path in [spectra_file, spectra_file_names, opi_spectrum_file, responses_file]:
+     Path(file_path).parent.mkdir(parents=True, exist_ok=True)
+
+ # settings: fid to spectrum
+ ph0_correction = -190.49
+ ph1_correction = 0
+ autophase_fn = "acme"
+ target_length = None
+
+ # settings: SCRF extraction
+ calibration_peak_center = "auto"  # "auto" or an index (e.g. 12277): the center of the calibration peak
+ calibration_window_halfwidth = 128  # the half width of the calibration window
+ steps = 6000
+ kernel_size = 61
+ kernel_sqrt = False  # True to allow only positive values
+
+ # fid to spectra processing
+
+ def fid_to_spectrum(varian_fid_path, ph0_correction, ph1_correction, autophase_fn, target_length=None, sin_pod=False):
+     dic, data = ng.varian.read(varian_fid_path)
+     data[0] *= 0.5
+     if sin_pod:
+         data = ng.proc_base.sp(data, end=0.98)
+
+     if target_length is not None:
+         if (pad_length := target_length - len(data)) > 0:
+             data = ng.proc_base.zf(data, pad_length)
+         else:
+             data = data[:target_length]
+
+     spec = ng.proc_base.fft(data)
+     spec = ng.process.proc_autophase.autops(spec, autophase_fn, p0=ph0_correction, p1=ph1_correction, disp=False)
+
+     return spec
+
+ # process optimal measurement fid to spectrum
+ opti_spectrum_full = fid_to_spectrum(opti_fid_path, ph0_correction, ph1_correction, autophase_fn, target_length=target_length)
+
+ if calibration_peak_center == "auto":
+     calibration_peak_center = np.argmax(abs(opti_spectrum_full))
+ fitting_range = (calibration_peak_center - calibration_window_halfwidth, calibration_peak_center + calibration_window_halfwidth + 1)
+
+ opti_spectrum = opti_spectrum_full[fitting_range[0]:fitting_range[1]]
+ np.save(opi_spectrum_file, opti_spectrum)
+ print(f"Optimal spectrum extracted to {opi_spectrum_file}")
+
+ # process loop fids to spectra
+ spec_list = []
+ spec_names = []
+
+ print("Extracting spectra from fids...")
+ for fid_path in tqdm(list(Path(data_dir).rglob('*.fid'))):
+     spec = fid_to_spectrum(fid_path, ph0_correction, ph1_correction, autophase_fn, target_length=target_length)[fitting_range[0]:fitting_range[1]]
+
+     spec_list.append(spec)
+     spec_names.append(fid_path.name)
+
+ total = np.array(spec_list)
+ np.save(spectra_file, total)
+ pd.DataFrame(spec_names).to_csv(spectra_file_names, header=False)
+ # total = np.load(spectra_file)
+ print(f"Spectra extracted to {spectra_file}")
+
+ # process SCRF extraction
+ def fit_kernel(base, target, kernel_size, kernel_sqrt=True, steps=20000, verbose=False):
+
+     kernel = torch.ones((1, 1, kernel_size), dtype=base.dtype)
+     if kernel_sqrt:
+         kernel /= torch.sqrt(torch.sum(kernel**2))
+     else:
+         kernel /= kernel_size
+     kernel.requires_grad = True
+
+     optimizer = torch.optim.Adam([kernel])
+
+     for epoch in range(steps):
+         if kernel_sqrt:
+             spe_est = torch.conv1d(base, kernel**2, padding='same')
+         else:
+             spe_est = torch.conv1d(base, kernel, padding='same')
+         loss = torch.mean(abs(target - spe_est)**2)  # torch.nn.functional.mse_loss(spe_est, target)
+         loss.backward()
+         optimizer.step()
+         optimizer.zero_grad()
+
+         if verbose and (epoch+1) % 100 == 0:
+             print(epoch, loss.item())
+     if kernel_sqrt:
+         return kernel.detach()**2, loss.item()
+     else:
+         return kernel.detach(), loss.item()
+
+ responses = torch.empty(len(total), 1, 1, 1, 1, kernel_size)
+ losses = torch.empty(len(total))
+ base = torch.tensor(opti_spectrum.real).unsqueeze(0)
+ targets = torch.tensor(total.real)
+
+ # normalization
+ base /= base.sum()
+ targets /= targets.sum(dim=(-1,), keepdim=True)
+
+ print("\nExtracting SCRFs...")
+ for i, target in tqdm(enumerate(targets), total=len(targets)):
+     kernel, loss = fit_kernel(base, target.unsqueeze(0), kernel_size, kernel_sqrt=kernel_sqrt, steps=steps)
+     responses[i, 0, 0] = kernel
+     losses[i] = loss
+
+ torch.save(responses, responses_file)
+ torch.save(losses, losses_file)
+ print(f"SCRFs extracted to {responses_file}, losses saved to {losses_file}")
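The `fit_kernel` routine above fits each SCRF by gradient descent on a convolution model. As a sanity check of the idea, the same kind of kernel can be recovered in closed form by least squares on synthetic data (a NumPy-only sketch; all sizes and values are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(257)
# Sharp "optimal" line plus a small random baseline (for well-conditioned columns)
base = np.exp(-0.5 * ((x - 128) / 2.0) ** 2) + 0.05 * rng.standard_normal(257)
true_kernel = np.array([0.1, 0.2, 0.4, 0.2, 0.1])     # a made-up 5-point "SCRF"
target = np.convolve(base, true_kernel, mode="same")   # simulated distorted spectrum

# Build A such that A @ kernel == np.convolve(base, kernel, mode="same"),
# then recover the kernel by ordinary least squares
k = len(true_kernel)
pad = np.pad(base, (k // 2, k // 2))
A = np.stack([pad[i:i + len(base)] for i in range(k)], axis=1)[:, ::-1]
kernel_est, *_ = np.linalg.lstsq(A, target, rcond=None)

print(np.round(kernel_est, 3))  # ≈ [0.1 0.2 0.4 0.2 0.1]
```

The repository's Adam-based loop plays the same role but scales to long kernels and supports the positivity constraint (`kernel_sqrt`).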
predict-gui.py ADDED
@@ -0,0 +1,197 @@
+ import torch
+ torch.set_grad_enabled(False)
+ import numpy as np
+ from pathlib import Path
+ from omegaconf import OmegaConf
+ import gradio as gr
+ import plotly.graph_objects as go
+
+ from src.models import ShimNetWithSCRF, Predictor
+ from predict import Defaults, resample_input_spectrum, resample_output_spectrum, initialize_predictor
+
+ # silence deprecation warnings
+ import warnings
+ warnings.filterwarnings('ignore', category=UserWarning, message='TypedStorage is deprecated')
+
+ import argparse
+
+ # Add argument parsing for server_name
+ parser = argparse.ArgumentParser(description="Launch ShimNet Spectra Correction App")
+ parser.add_argument(
+     "--server_name",
+     type=str,
+     default="127.0.0.1",
+     help="Server name to bind the app (default: 127.0.0.1). Use 0.0.0.0 for external access."
+ )
+ parser.add_argument(
+     "--share",
+     action="store_true",
+     help="If set, generates a public link to share the app."
+ )
+ args = parser.parse_args()
+
+ def process_file(input_file, config_file, weights_file, input_spectrometer_frequency=None, reference_spectrum=None):
+     if input_spectrometer_frequency == 0:
+         input_spectrometer_frequency = None
+     # Load configuration and initialize predictor
+     config = OmegaConf.load(config_file)
+     model_ppm_per_point = config.data.frq_step / config.metadata.spectrometer_frequency
+     predictor = initialize_predictor(config, weights_file)
+
+     # Load input data
+     input_data = np.loadtxt(input_file)
+     input_freqs_input_ppm, input_spectrum = input_data[:, 0], input_data[:, 1]
+
+     # Convert input frequencies to model's frequency
+     if input_spectrometer_frequency is not None:
+         input_freqs_model_ppm = input_freqs_input_ppm * input_spectrometer_frequency / config.metadata.spectrometer_frequency
+     else:
+         input_freqs_model_ppm = input_freqs_input_ppm
+
+     # Resample input spectrum
+     freqs, spectrum = resample_input_spectrum(input_freqs_model_ppm, input_spectrum, model_ppm_per_point)
+
+     # Scale and process spectrum
+     spectrum_tensor = torch.tensor(spectrum).float()
+     scaling_factor = Defaults.SCALE / spectrum_tensor.max()
+     spectrum_tensor *= scaling_factor
+     prediction = predictor(spectrum_tensor).numpy()
+     prediction /= scaling_factor
+
+     # Resample output spectrum
+     output_prediction = resample_output_spectrum(input_freqs_model_ppm, freqs, prediction)
+
+     # Prepare output data for download
+     output_data = np.column_stack((input_freqs_input_ppm, output_prediction))
+     output_file = f"{Path(input_file).stem}_processed{Path(input_file).suffix}"
+     np.savetxt(output_file, output_data)
+
+     # Create Plotly figure
+     fig = go.Figure()
+
+     # Add Input Spectrum and Corrected Spectrum (always visible)
+     normalization_value = input_spectrum.max()
+     fig.add_trace(go.Scatter(x=input_freqs_input_ppm, y=input_spectrum/normalization_value, mode='lines', name='Input Spectrum', visible=True, line=dict(color='#EF553B')))  # red
+     fig.add_trace(go.Scatter(x=input_freqs_input_ppm, y=output_prediction/normalization_value, mode='lines', name='Corrected Spectrum', visible=True, line=dict(color='#00cc96')))  # green
+
+     if reference_spectrum is not None:
+         reference_spectrum_freqs, reference_spectrum_intensity = np.loadtxt(reference_spectrum).T
+         reference_spectrum_intensity /= reference_spectrum_intensity.max()
+         n_zooms = 50
+         zooms = np.geomspace(0.01, 100, 2 * n_zooms + 1)
+
+         # Add Reference Data traces (initially invisible)
+         for zoom in zooms:
+             fig.add_trace(
+                 go.Scatter(
+                     x=reference_spectrum_freqs,
+                     y=reference_spectrum_intensity * zoom,
+                     mode='lines',
+                     name=f'Reference Data (Zoom: {zoom:.2f})',
+                     visible=False,
+                     line=dict(color='#636efa')
+                 )
+             )
+         # Make the middle zoom level visible by default
+         fig.data[2 * n_zooms // 2 + 2].visible = True
+
+         # Create and add slider
+         steps = []
+         for i in range(2, len(fig.data)):  # Start from the reference data traces
+             step = dict(
+                 method="update",
+                 args=[{"visible": [True, True] + [False] * (len(fig.data) - 2)}],  # Keep first two traces visible
+             )
+             step["args"][0]["visible"][i] = True  # Toggle i'th reference trace to "visible"
+             steps.append(step)
+
+         sliders = [dict(
+             active=n_zooms,
+             currentvalue={"prefix": "Reference zoom: "},
+             pad={"t": 50},
+             steps=steps
+         )]
+
+         fig.update_layout(
+             sliders=sliders
+         )
+
+     fig.update_layout(
+         title="Spectrum Visualization",
+         xaxis_title="Frequency (ppm)",
+         yaxis_title="Intensity"
+     )
+
+     return fig, output_file
+
+ # Gradio app
+ with gr.Blocks() as app:
+     gr.Markdown("# ShimNet Spectra Correction")
+     gr.Markdown("[ShimNet: A neural network for post-acquisition improvement of NMR spectra distorted by magnetic-field inhomogeneity](https://chemrxiv.org/engage/chemrxiv/article-details/67ef86686dde43c90860d315)")
+     gr.Markdown("Upload your input file, configuration, and weights to process the NMR spectrum.")
+
+     with gr.Row():
+         with gr.Column():
+             model_selection = gr.Radio(
+                 label="Select Model",
+                 choices=["600 MHz", "700 MHz", "Custom"],
+                 value="600 MHz"
+             )
+             config_file = gr.File(label="Custom Config File (.yaml)", visible=False, height=120)
+             weights_file = gr.File(label="Custom Weights File (.pt)", visible=False, height=120)
+
+         with gr.Column():
+             input_file = gr.File(label="Input File (.txt | .csv)", height=120)
+             input_spectrometer_frequency = gr.Number(label="Input Spectrometer Frequency (MHz) (0 or empty if the same as in the loaded model)", value=None)
+             gr.Markdown("Upload reference spectrum files (optional). Reference spectrum will be plotted for comparison.")
147
+ reference_spectrum_file = gr.File(label="Reference Spectra File (.txt | .csv)", height=120)
148
+
149
+ process_button = gr.Button("Process File")
150
+ plot_output = gr.Plot(label="Spectrum Visualization")
151
+ download_button = gr.File(label="Download Processed File", interactive=False, height=120)
152
+
153
+ # Update visibility of config and weights fields based on model selection
154
+ def update_visibility(selected_model):
155
+ if selected_model == "Custom":
156
+ return gr.update(visible=True), gr.update(visible=True)
157
+ else:
158
+ return gr.update(visible=False), gr.update(visible=False)
159
+
160
+ model_selection.change(
161
+ update_visibility,
162
+ inputs=[model_selection],
163
+ outputs=[config_file, weights_file]
164
+ )
165
+
166
+ # Process button click logic
167
+ def process_file_with_model(input_file, model_selection, config_file, weights_file, input_spectrometer_frequency, reference_spectrum_file):
168
+ if model_selection == "600 MHz":
169
+ config_file = "configs/shimnet_600.yaml"
170
+ weights_file = "weights/shimnet_600MHz.pt"
171
+ elif model_selection == "700 MHz":
172
+ config_file = "configs/shimnet_700.yaml"
173
+ weights_file = "weights/shimnet_700MHz.pt"
174
+ else:
175
+ config_file = config_file.name
176
+ weights_file = weights_file.name
177
+
178
+ return process_file(input_file.name, config_file, weights_file, input_spectrometer_frequency, reference_spectrum_file.name if reference_spectrum_file else None)
179
+
180
+ process_button.click(
181
+ process_file_with_model,
182
+ inputs=[input_file, model_selection, config_file, weights_file, input_spectrometer_frequency, reference_spectrum_file],
183
+ outputs=[plot_output, download_button]
184
+ )
185
+
186
+ app.launch(share=args.share, server_name=args.server_name)
187
+
188
+ # '#636efa',
189
+ # '#EF553B',
190
+ # '#00cc96',
191
+ # '#ab63fa',
192
+ # '#FFA15A',
193
+ # '#19d3f3',
194
+ # '#FF6692',
195
+ # '#B6E880',
196
+ # '#FF97FF',
197
+ # '#FECB52'
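A side note on the slider logic above: the GUI builds `2 * n_zooms + 1` geometrically spaced zoom levels, so the middle level is the geometric mean of the endpoints (here exactly 1.0), and `2 * n_zooms // 2 + 2` is simply `n_zooms + 2` — the middle reference trace offset by the two always-visible traces. A minimal numpy sketch:

```python
import numpy as np

n_zooms = 50
# 2*n_zooms + 1 zoom levels, geometrically spaced between 0.01x and 100x
zooms = np.geomspace(0.01, 100, 2 * n_zooms + 1)

# the middle level is the geometric mean of the endpoints: sqrt(0.01 * 100) = 1.0
middle = float(zooms[n_zooms])

# index of the default-visible trace: the two always-visible traces come first
default_trace_index = 2 * n_zooms // 2 + 2
```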
predict.py ADDED
@@ -0,0 +1,89 @@
+ import torch
+ torch.set_grad_enabled(False)
+ import numpy as np
+ import argparse
+ from pathlib import Path
+ import sys, os
+ from omegaconf import OmegaConf
+
+ from src.models import ShimNetWithSCRF, Predictor
+
+ # silence deprecation warnings
+ # https://github.com/pytorch/pytorch/issues/97207#issuecomment-1494781560
+ import warnings
+ warnings.filterwarnings('ignore', category=UserWarning, message='TypedStorage is deprecated')
+
+ class Defaults:
+     SCALE = 16.0
+
+ def parse_args():
+     parser = argparse.ArgumentParser()
+     parser.add_argument("input_files", help="Input files", nargs="+")
+     parser.add_argument("--config", help="config file .yaml")
+     parser.add_argument("--weights", help="model weights")
+     parser.add_argument("-o", "--output_dir", default=".", help="Output directory")
+     parser.add_argument("--input_spectrometer_frequency", default=None, type=float, help="spectrometer frequency in MHz (input sample collection frequency). Empty if the same as in the training data")
+     args = parser.parse_args()
+     return args
+
+ # functions
+ def resample_input_spectrum(input_freqs, input_spectrum, ppm_per_point):
+     """Resample the input spectrum onto the model's uniform frequency grid."""
+     freqs = np.arange(input_freqs.min(), input_freqs.max(), ppm_per_point)
+     spectrum = np.interp(freqs, input_freqs, input_spectrum)
+     return freqs, spectrum
+
+ def resample_output_spectrum(input_freqs, freqs, prediction):
+     """Resample the prediction back onto the input spectrum's frequency grid."""
+     prediction = np.interp(input_freqs, freqs, prediction)
+     return prediction
+
+ def initialize_predictor(config, weights_file):
+     model = ShimNetWithSCRF(**config.model.kwargs)
+     predictor = Predictor(model, weights_file)
+     return predictor
+
+ # run
+ if __name__ == "__main__":
+     args = parse_args()
+     output_dir = Path(args.output_dir)
+     output_dir.mkdir(exist_ok=True, parents=True)
+
+     config = OmegaConf.load(args.config)
+     model_ppm_per_point = config.data.frq_step / config.metadata.spectrometer_frequency
+     predictor = initialize_predictor(config, args.weights)
+
+     for input_file in args.input_files:
+         print(f"processing {input_file} ...")
+
+         # load data
+         input_data = np.loadtxt(input_file)
+         input_freqs_input_ppm, input_spectrum = input_data[:, 0], input_data[:, 1]
+
+         # convert input frequencies to the model's frequency scale - correct for zero filling and spectrometer frequency
+         if args.input_spectrometer_frequency is not None:
+             input_freqs_model_ppm = input_freqs_input_ppm * args.input_spectrometer_frequency / config.metadata.spectrometer_frequency
+         else:
+             input_freqs_model_ppm = input_freqs_input_ppm
+
+         freqs, spectrum = resample_input_spectrum(input_freqs_model_ppm, input_spectrum, model_ppm_per_point)
+
+         spectrum = torch.tensor(spectrum).float()
+         # scale the height of the spectrum
+         scaling_factor = Defaults.SCALE / spectrum.max()
+         spectrum *= scaling_factor
+
+         # correct spectrum
+         prediction = predictor(spectrum).numpy()
+
+         # rescale height
+         prediction /= scaling_factor
+
+         # resample the output to match the input spectrum
+         output_prediction = resample_output_spectrum(input_freqs_model_ppm, freqs, prediction)
+
+         # save result
+         output_file = output_dir / f"{Path(input_file).stem}_processed{Path(input_file).suffix}"
+
+         np.savetxt(output_file, np.column_stack((input_freqs_input_ppm, output_prediction)))
+         print(f"saved to {output_file}")
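The two resampling helpers in `predict.py` are plain `np.interp` wrappers: the spectrum is interpolated onto the model's uniform ppm grid before inference and back onto the original grid afterwards. A self-contained sketch (helper functions copied from above, with a made-up Gaussian "spectrum") shows the round trip approximately recovers the input:

```python
import numpy as np

def resample_input_spectrum(input_freqs, input_spectrum, ppm_per_point):
    """Resample the input spectrum onto a uniform frequency grid."""
    freqs = np.arange(input_freqs.min(), input_freqs.max(), ppm_per_point)
    spectrum = np.interp(freqs, input_freqs, input_spectrum)
    return freqs, spectrum

def resample_output_spectrum(input_freqs, freqs, prediction):
    """Resample the prediction back onto the original frequency grid."""
    return np.interp(input_freqs, freqs, prediction)

# hypothetical smooth spectrum on a coarser grid
input_freqs = np.linspace(-5.0, 5.0, 513)
input_spectrum = np.exp(-input_freqs**2)

freqs, spectrum = resample_input_spectrum(input_freqs, input_spectrum, 0.001)
roundtrip = resample_output_spectrum(input_freqs, freqs, spectrum)
max_error = np.abs(roundtrip - input_spectrum).max()
```

Because `np.interp` is piecewise linear, the round-trip error is bounded by the curvature of the spectrum over one grid step; for a dense model grid it is negligible.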
requirements-cpu.txt ADDED
@@ -0,0 +1,9 @@
+ torch==2.4.1+cpu
+ torchaudio==2.4.1+cpu
+ nmrglue==0.11
+ torchdata==0.9.0
+ numpy==2.0.2
+ matplotlib==3.9.3
+ pandas==2.2.3
+ tqdm==4.67.1
+ hydra-core==1.3.2
requirements-gpu.txt ADDED
@@ -0,0 +1,9 @@
+ torch==2.4.1
+ torchaudio==2.4.1
+ nmrglue==0.11
+ torchdata==0.9.0
+ numpy==2.0.2
+ matplotlib==3.9.3
+ pandas==2.2.3
+ tqdm==4.67.1
+ hydra-core==1.3.2
requirements-gui.txt ADDED
@@ -0,0 +1,2 @@
+ gradio==5.23.2
+ plotly==6.0.1
sample_data/2ethylonaphthalene_bestshims_700MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/2ethylonaphthalene_up_1mm_700MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Azarone_20ul_700MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Azarone_X_supressed_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Azarone_Z1Z2Z3Z4_supressed_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Azarone_Z1Z2_supressed_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Azarone_besteshims_supressed_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Azarone_bestshims_700MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/CresolRed_after_styrene_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/CresolRed_after_styrene_cut_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/CresolRed_bestshims_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/CresolRed_cut_bestshims_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Geraniol_bestshims_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/Geraniol_up_1mm_600MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/SodiumButyrate_after_glucose_cut_700MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
sample_data/SodiumButyrate_cut_bestshims_700MHz.csv ADDED
The diff for this file is too large to render. See raw diff
 
src/generators.py ADDED
@@ -0,0 +1,280 @@
+ import numpy as np
+ import torch
+ import torchdata
+ # from itertools import islice
+
+ def random_value(min_value, max_value):
+     return (min_value + torch.rand(1) * (max_value - min_value)).item()
+
+ def random_loguniform(min_value, max_value):
+     return (min_value * torch.exp(torch.rand(1) * (torch.log(torch.tensor(max_value)) - torch.log(torch.tensor(min_value))))).item()
+
+ def calculate_theoretical_spectrum(peaks_parameters: dict, frq_frq: torch.Tensor):
+     # extract parameters
+     tff_lin = peaks_parameters["tff_lin"]
+     twf_lin = peaks_parameters["twf_lin"]
+     thf_lin = peaks_parameters["thf_lin"]
+     trf_lin = peaks_parameters["trf_lin"]
+
+     lwf_lin = twf_lin
+     lhf_lin = thf_lin * (1. - trf_lin)
+     gwf_lin = twf_lin
+     gdf_lin = gwf_lin / torch.tensor(2.).log().mul(2.).sqrt()
+     ghf_lin = thf_lin * trf_lin
+     # calculate Lorentzian peak contributions
+     lsf_linfrq = lwf_lin[:, None] ** 2 / (lwf_lin[:, None] ** 2 + (frq_frq - tff_lin[:, None]) ** 2) * lhf_lin[:, None]
+     # calculate Gaussian peak contributions
+     gsf_linfrq = torch.exp(-(frq_frq - tff_lin[:, None]) ** 2 / gdf_lin[:, None] ** 2 / 2.) * ghf_lin[:, None]
+     tsf_linfrq = lsf_linfrq + gsf_linfrq
+     # sum peak contributions
+     tsf_frq = tsf_linfrq.sum(0, keepdim=True)
+     return tsf_frq
+
+
+ pascal_triangle = [(1,), (1,1), (1,2,1), (1,3,3,1), (1,4,6,4,1), (1,5,10,10,5,1), (1,6,15,20,15,6,1), (1,7,21,35,35,21,7,1)]
+ normalized_pascal_triangle = [torch.tensor(x)/sum(x) for x in pascal_triangle]
+
+ def pascal_multiplicity(multiplicity):
+     intensities = normalized_pascal_triangle[multiplicity-1]
+     n_peaks = len(intensities)
+     shifts = torch.arange(n_peaks) - ((n_peaks-1)/2)
+     return shifts, intensities
+
+ def double_multiplicity(multiplicity1, multiplicity2, j1=1, j2=1):
+     shifts1, intensities1 = pascal_multiplicity(multiplicity1)
+     shifts2, intensities2 = pascal_multiplicity(multiplicity2)
+
+     shifts = (j1*shifts1.reshape(-1, 1) + j2*shifts2.reshape(1, -1)).flatten()
+     intensities = (intensities1.reshape(-1, 1) * intensities2.reshape(1, -1)).flatten()
+     return shifts, intensities
+
+ def generate_multiplet_parameters(multiplicity, tff_lin, thf_lin, twf_lin, trf_lin, j1, j2):
+     shifts, intensities = double_multiplicity(multiplicity[0], multiplicity[1], j1, j2)
+     n_peaks = len(shifts)
+
+     return {
+         "tff_lin": shifts + tff_lin,
+         "thf_lin": intensities * thf_lin,
+         "twf_lin": torch.full((n_peaks,), twf_lin),
+         "trf_lin": torch.full((n_peaks,), trf_lin),
+     }
+
+ def value_to_index(values, table):
+     span = table[-1] - table[0]
+     indices = ((values - table[0])/span * (len(table)-1))  # .round().type(torch.int64)
+     return indices
+
+ def generate_theoretical_spectrum(
+     number_of_signals_min, number_of_signals_max,
+     spectrum_width_min, spectrum_width_max,
+     relative_width_min, relative_width_max,
+     tff_min, tff_max,
+     thf_min, thf_max,
+     trf_min, trf_max,
+     relative_height_min, relative_height_max,
+     multiplicity_j1_min, multiplicity_j1_max,
+     multiplicity_j2_min, multiplicity_j2_max,
+     atom_groups_data,
+     frq_frq
+ ):
+     number_of_signals = torch.randint(number_of_signals_min, number_of_signals_max+1, [])
+     atom_group_indices = torch.randint(0, len(atom_groups_data), [number_of_signals])
+     width_spectrum = random_loguniform(spectrum_width_min, spectrum_width_max)
+     height_spectrum = random_loguniform(thf_min, thf_max)
+
+     peak_parameters_data = []
+     theoretical_spectrum = None
+     for atom_group_index in atom_group_indices:
+         relative_intensity, multiplicity1, multiplicity2 = atom_groups_data[atom_group_index]
+         position = random_value(tff_min, tff_max)
+         j1 = random_value(multiplicity_j1_min, multiplicity_j1_max)
+         j2 = random_value(multiplicity_j2_min, multiplicity_j2_max)
+         width = width_spectrum*random_loguniform(relative_width_min, relative_width_max)
+         height = height_spectrum*relative_intensity*random_loguniform(relative_height_min, relative_height_max)
+         gaussian_contribution = random_value(trf_min, trf_max)
+
+         peaks_parameters = generate_multiplet_parameters(multiplicity=(multiplicity1, multiplicity2), tff_lin=position, thf_lin=height, twf_lin=width, trf_lin=gaussian_contribution, j1=j1, j2=j2)
+         peaks_parameters["tff_relative"] = value_to_index(peaks_parameters["tff_lin"], frq_frq)
+         peak_parameters_data.append(peaks_parameters)
+         spectrum_contribution = calculate_theoretical_spectrum(peaks_parameters, frq_frq)
+         if theoretical_spectrum is None:
+             theoretical_spectrum = spectrum_contribution
+         else:
+             theoretical_spectrum += spectrum_contribution
+     return theoretical_spectrum, peak_parameters_data
+
+
+ def theoretical_generator(
+     atom_groups_data,
+     pixels=2048, frq_step=11160.7142857 / 32768,
+     number_of_signals_min=1, number_of_signals_max=8,
+     spectrum_width_min=0.2, spectrum_width_max=1,
+     relative_width_min=1, relative_width_max=2,
+     relative_height_min=1, relative_height_max=1,
+     relative_frequency_min=-0.4, relative_frequency_max=0.4,
+     thf_min=1/16, thf_max=16,
+     trf_min=0, trf_max=1,
+     multiplicity_j1_min=0, multiplicity_j1_max=15,
+     multiplicity_j2_min=0, multiplicity_j2_max=15,
+ ):
+     tff_min = relative_frequency_min * pixels * frq_step
+     tff_max = relative_frequency_max * pixels * frq_step
+     frq_frq = torch.arange(-pixels // 2, pixels // 2) * frq_step
+
+     while True:
+         yield generate_theoretical_spectrum(
+             number_of_signals_min=number_of_signals_min,
+             number_of_signals_max=number_of_signals_max,
+             spectrum_width_min=spectrum_width_min,
+             spectrum_width_max=spectrum_width_max,
+             relative_width_min=relative_width_min,
+             relative_width_max=relative_width_max,
+             relative_height_min=relative_height_min,
+             relative_height_max=relative_height_max,
+             tff_min=tff_min, tff_max=tff_max,
+             thf_min=thf_min, thf_max=thf_max,
+             trf_min=trf_min, trf_max=trf_max,
+             multiplicity_j1_min=multiplicity_j1_min,
+             multiplicity_j1_max=multiplicity_j1_max,
+             multiplicity_j2_min=multiplicity_j2_min,
+             multiplicity_j2_max=multiplicity_j2_max,
+             atom_groups_data=atom_groups_data,
+             frq_frq=frq_frq
+         )
+
+ class ResponseLibrary:
+     def __init__(self, response_files, normalize=True):
+         self.data = [torch.load(f, map_location='cpu', weights_only=True).flatten(0, -4) for f in response_files]
+         if normalize:
+             self.data = [data/torch.sum(data, dim=(-1,), keepdim=True) for data in self.data]
+         lengths = [len(data) for data in self.data]
+         self.start_indices = torch.cumsum(torch.tensor([0] + lengths[:-1]), 0)
+         self.total_length = sum(lengths)
+
+     def __getitem__(self, idx):
+         if idx >= self.total_length:
+             raise ValueError(f'index {idx} out of range')
+         tensor_index = torch.searchsorted(self.start_indices, idx, right=True) - 1
+         return self.data[tensor_index][idx - self.start_indices[tensor_index]]
+
+     def __len__(self):
+         return self.total_length
+
+ def generator(
+     theoretical_generator_params,
+     response_function_library,
+     response_function_stretch_min=0.5,
+     response_function_stretch_max=2.0,
+     response_function_noise=0.,
+     spectrum_noise_min=0.,
+     spectrum_noise_max=1/64,
+     include_spectrum_data=False,
+     include_peak_mask=False,
+     include_response_function=False,
+     flip_response_function=False
+ ):
+     for theoretical_spectrum, theoretical_spectrum_data in theoretical_generator(**theoretical_generator_params):
+         # get response function
+         response_function = response_function_library[torch.randint(0, len(response_function_library), [1])][0]
+         # stretch response function
+         padding_size = (response_function.shape[-1] - 1)//2
+         padding_size = round(random_loguniform(response_function_stretch_min, response_function_stretch_max)*padding_size)
+         response_function = torch.nn.functional.interpolate(response_function, size=2*padding_size+1, mode='linear')
+         response_function /= response_function.sum()  # normalize the sum of the response function to 1
+         # add noise to the response function
+         response_function += torch.randn(response_function.shape) * response_function_noise
+         response_function /= response_function.sum()  # renormalize after adding noise
+         if flip_response_function and (torch.rand(1).item() < 0.5):
+             response_function = response_function.flip(-1)
+         # disturbed spectrum
+         disturbed_spectrum = torch.nn.functional.conv1d(theoretical_spectrum, response_function, padding=padding_size)
+         # add noise
+         noised_spectrum = disturbed_spectrum + torch.randn(disturbed_spectrum.shape) * random_value(spectrum_noise_min, spectrum_noise_max)
+
+         out = {
+             'theoretical_spectrum': theoretical_spectrum,
+             'disturbed_spectrum': disturbed_spectrum,
+             'noised_spectrum': noised_spectrum,
+         }
+         if include_response_function:
+             out['response_function'] = response_function
+         if include_spectrum_data:
+             out["theoretical_spectrum_data"] = theoretical_spectrum_data
+         if include_peak_mask:
+             all_peaks_rel = torch.cat([peak_data["tff_relative"] for peak_data in theoretical_spectrum_data])
+             peaks_indices = all_peaks_rel.round().type(torch.int64)
+             out["peaks_mask"] = torch.scatter(torch.zeros(out["theoretical_spectrum"].shape[1]), 0, peaks_indices, 1.).unsqueeze(0)
+
+         yield out
+
+
+ def collate_with_spectrum_data(batch):
+     tensor_keys = set(batch[0].keys())
+     tensor_keys.remove('theoretical_spectrum_data')
+     out = {k: torch.stack([item[k] for item in batch]) for k in tensor_keys}
+     out["theoretical_spectrum_data"] = [item["theoretical_spectrum_data"] for item in batch]
+     return out
+
+ def get_datapipe(
+     response_functions_files,
+     atom_groups_data_file=None,
+     batch_size=64,
+     pixels=2048, frq_step=11160.7142857 / 32768,
+     number_of_signals_min=1, number_of_signals_max=8,
+     spectrum_width_min=0.2, spectrum_width_max=1,
+     relative_width_min=1, relative_width_max=2,
+     relative_height_min=1, relative_height_max=1,
+     relative_frequency_min=-0.4, relative_frequency_max=0.4,
+     thf_min=1/16, thf_max=16,
+     trf_min=0, trf_max=1,
+     multiplicity_j1_min=0, multiplicity_j1_max=15,
+     multiplicity_j2_min=0, multiplicity_j2_max=15,
+     response_function_stretch_min=0.5,
+     response_function_stretch_max=2.0,
+     response_function_noise=0.,
+     spectrum_noise_min=0.,
+     spectrum_noise_max=1/64,
+     include_spectrum_data=False,
+     include_peak_mask=False,
+     include_response_function=False,
+     flip_response_function=False
+ ):
+     # default: singlets only
+     if atom_groups_data_file is None:
+         atom_groups_data = np.ones((1, 3), dtype=int)
+     else:
+         atom_groups_data = np.loadtxt(atom_groups_data_file, usecols=(1, 2, 3), dtype=int)
+     response_function_library = ResponseLibrary(response_functions_files)
+     g = generator(
+         theoretical_generator_params=dict(
+             atom_groups_data=atom_groups_data,
+             pixels=pixels, frq_step=frq_step,
+             number_of_signals_min=number_of_signals_min, number_of_signals_max=number_of_signals_max,
+             spectrum_width_min=spectrum_width_min, spectrum_width_max=spectrum_width_max,
+             relative_width_min=relative_width_min, relative_width_max=relative_width_max,
+             relative_height_min=relative_height_min, relative_height_max=relative_height_max,
+             relative_frequency_min=relative_frequency_min, relative_frequency_max=relative_frequency_max,
+             thf_min=thf_min, thf_max=thf_max,
+             trf_min=trf_min, trf_max=trf_max,
+             multiplicity_j1_min=multiplicity_j1_min, multiplicity_j1_max=multiplicity_j1_max,
+             multiplicity_j2_min=multiplicity_j2_min, multiplicity_j2_max=multiplicity_j2_max
+         ),
+         response_function_library=response_function_library,
+         response_function_stretch_min=response_function_stretch_min,
+         response_function_stretch_max=response_function_stretch_max,
+         response_function_noise=response_function_noise,
+         spectrum_noise_min=spectrum_noise_min,
+         spectrum_noise_max=spectrum_noise_max,
+         include_spectrum_data=include_spectrum_data,
+         include_peak_mask=include_peak_mask,
+         include_response_function=include_response_function,
+         flip_response_function=flip_response_function
+     )
+
+     pipe = torchdata.datapipes.iter.IterableWrapper(g, deepcopy=False)
+     pipe = pipe.batch(batch_size)
+     pipe = pipe.collate(collate_fn=collate_with_spectrum_data if include_spectrum_data else None)
+
+     return pipe
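`double_multiplicity` in `src/generators.py` builds a multiplet as the outer product of two Pascal-triangle patterns: shifts add, intensities multiply. A torch-free sketch of the same combinatorics (made-up `J` values, e.g. a doublet of triplets):

```python
from itertools import product

# unnormalized Pascal-triangle intensities for 1..4 lines
pascal = {1: [1], 2: [1, 1], 3: [1, 2, 1], 4: [1, 3, 3, 1]}

def multiplet(multiplicity, j):
    """Line offsets (in units of j) and normalized intensities of one n-plet."""
    intensities = pascal[multiplicity]
    total = sum(intensities)
    n = len(intensities)
    shifts = [(i - (n - 1) / 2) * j for i in range(n)]
    return shifts, [x / total for x in intensities]

def double_multiplet(m1, m2, j1, j2):
    """Combine two splittings: shifts add, intensities multiply."""
    s1, i1 = multiplet(m1, j1)
    s2, i2 = multiplet(m2, j2)
    shifts = [a + b for a, b in product(s1, s2)]
    intensities = [a * b for a, b in product(i1, i2)]
    return shifts, intensities

# doublet (J1 = 8 Hz) of triplets (J2 = 2 Hz): 2 * 3 = 6 lines
shifts, intensities = double_multiplet(2, 3, 8.0, 2.0)
```

Because both factors are normalized, the combined intensities still sum to 1, which is why `generate_multiplet_parameters` can scale them by a single overall height.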
src/models.py ADDED
@@ -0,0 +1,105 @@
+ import torch
+
+ class ConvEncoder(torch.nn.Module):
+     def __init__(self, hidden_dim=64, output_dim=None, dropout=0, kernel_size=7):
+         super().__init__()
+         if output_dim is None:
+             output_dim = hidden_dim
+         self.conv4 = torch.nn.Conv1d(1, hidden_dim, kernel_size)
+         self.conv3 = torch.nn.Conv1d(hidden_dim, hidden_dim, kernel_size)
+         self.conv2 = torch.nn.Conv1d(hidden_dim, hidden_dim, kernel_size)
+         self.conv1 = torch.nn.Conv1d(hidden_dim, output_dim, kernel_size)
+         self.dropout = torch.nn.Dropout(dropout)
+
+     def forward(self, feature):  # (samples, 1, 2048)
+         feature = self.dropout(self.conv4(feature))  # (samples, 64, 2042)
+         feature = feature.relu()
+         feature = self.dropout(self.conv3(feature))  # (samples, 64, 2036)
+         feature = feature.relu()
+         feature = self.dropout(self.conv2(feature))  # (samples, 64, 2030)
+         feature = feature.relu()
+         feature = self.dropout(self.conv1(feature))  # (samples, 64, 2024)
+         return feature
+
+ class ConvDecoder(torch.nn.Module):
+     def __init__(self, input_dim=None, hidden_dim=64, output_dim=None, dropout=0, kernel_size=7):
+         super().__init__()
+         if output_dim is None:
+             output_dim = hidden_dim
+         self.convTranspose1 = torch.nn.ConvTranspose1d(input_dim, hidden_dim, kernel_size)
+         self.convTranspose2 = torch.nn.ConvTranspose1d(hidden_dim, hidden_dim, kernel_size)
+         self.convTranspose3 = torch.nn.ConvTranspose1d(hidden_dim, hidden_dim, kernel_size)
+         self.convTranspose4 = torch.nn.ConvTranspose1d(hidden_dim, 1, kernel_size)
+
+     def forward(self, feature):  # (samples, 128, 2024)
+         feature = self.convTranspose1(feature)  # (samples, 64, 2030)
+         feature = feature.relu()
+         feature = self.convTranspose2(feature)  # (samples, 64, 2036)
+         feature = feature.relu()
+         feature = self.convTranspose3(feature)  # (samples, 64, 2042)
+         feature = feature.relu()
+         feature = self.convTranspose4(feature)  # (samples, 1, 2048)
+         return feature
+
+ class ResponseHead(torch.nn.Module):
+     def __init__(self, input_dim, output_length, hidden_dims=[128]):
+         super().__init__()
+         response_head_dims = [input_dim] + hidden_dims + [output_length]
+         response_head_layers = [torch.nn.Linear(response_head_dims[0], response_head_dims[1])]
+         for dims_in, dims_out in zip(response_head_dims[1:-1], response_head_dims[2:]):
+             response_head_layers.extend([
+                 torch.nn.GELU(),
+                 torch.nn.Linear(dims_in, dims_out)
+             ])
+         self.response_head = torch.nn.Sequential(*response_head_layers)
+
+     def forward(self, feature):
+         return self.response_head(feature)
+
+ class ShimNetWithSCRF(torch.nn.Module):
+     def __init__(self,
+                  encoder_hidden_dims=64,
+                  encoder_dropout=0,
+                  bottleneck_dim=64,
+                  rensponse_length=61,
+                  resnponse_head_dims=[128],
+                  decoder_hidden_dims=64
+                  ):
+         super().__init__()
+         self.encoder = ConvEncoder(hidden_dim=encoder_hidden_dims, output_dim=bottleneck_dim, dropout=encoder_dropout)
+         self.query = torch.nn.Parameter(torch.empty(1, 1, bottleneck_dim))
+         torch.nn.init.xavier_normal_(self.query)
+
+         self.decoder = ConvDecoder(input_dim=2*bottleneck_dim, hidden_dim=decoder_hidden_dims)
+
+         self.rensponse_length = rensponse_length
+         self.response_head = ResponseHead(bottleneck_dim, rensponse_length, resnponse_head_dims)
+
+     def forward(self, feature):  # (samples, 1, 2048)
+         feature = self.encoder(feature)  # (samples, 64, 2024)
+         energy = self.query @ feature  # (samples, 1, 2024)
+         weight = torch.nn.functional.softmax(energy, 2)  # (samples, 1, 2024)
+         global_features = feature @ weight.transpose(1, 2)  # (samples, 64, 1)
+
+         response = self.response_head(global_features.squeeze(-1))
+
+         feature, global_features = torch.broadcast_tensors(feature, global_features)  # (samples, 64, 2024)
+         feature = torch.cat([feature, global_features], 1)  # (samples, 128, 2024)
+         denoised_spectrum = self.decoder(feature)  # (samples, 1, 2048)
+
+         return {
+             'denoised': denoised_spectrum,
+             'response': response,
+             'attention': weight.squeeze(1)
+         }
+
+ class Predictor:
+     def __init__(self, model=None, weights_file=None):
+         self.model = model
+         if weights_file is not None:
+             self.model.load_state_dict(torch.load(weights_file, map_location='cpu', weights_only=True))
+
+     def __call__(self, nsf_frq):
+         with torch.no_grad():
+             msf_frq = self.model(nsf_frq[None, None])["denoised"]
+         return msf_frq[0, 0]
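`ShimNetWithSCRF` pools its convolutional features with a single learned query: a softmax over positions gives attention weights, and the weighted sum of features is the global feature vector fed to the response head. A numpy sketch of that pooling step (random stand-ins for the learned query and the encoded features):

```python
import numpy as np

rng = np.random.default_rng(0)
dim, length = 64, 2024                  # bottleneck_dim and encoded sequence length
query = rng.normal(size=(1, dim))       # stands in for the learned query parameter
feature = rng.normal(size=(dim, length))

energy = query @ feature                         # (1, length) attention logits
energy = energy - energy.max()                   # shift for numerical stability
weight = np.exp(energy) / np.exp(energy).sum()   # softmax over positions
global_features = feature @ weight.T             # (dim, 1) attention-pooled features
```

The same attention weights are exposed as the `'attention'` output, which is what `train.py` plots during evaluation.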
train.py ADDED
@@ -0,0 +1,141 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ import torch, torchaudio
2
+ import numpy as np
3
+ from pathlib import Path
4
+ from omegaconf import OmegaConf
5
+ from hydra.utils import instantiate
6
+ import datetime
7
+ import sys
8
+ import matplotlib.pyplot as plt
9
+
10
+
11
+ import matplotlib
12
+ matplotlib.use('Agg')
13
+
14
+ # silent deprecation_warning() from datapipes
15
+ import warnings
16
+ warnings.filterwarnings("ignore", category=UserWarning, module='torchdata')
17
+
18
+ from src import models
19
+ from src.generators import get_datapipe
20
+
21
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'
22
+ if len(sys.argv) < 2:
23
+ print("Please provide the run directory as an argument.")
24
+ sys.exit(1)
25
+ run_dir = Path(sys.argv[1])
26
+
27
+ config = OmegaConf.load(run_dir / "config.yaml")
28
+
29
+ if (run_dir / "train.txt").is_file():
30
+ minimum = np.min(np.loadtxt(run_dir / "train.txt")[:,2])
31
+ else:
32
+ minimum = float("inf")
33
+
34
+ # initialization
35
+ model = instantiate({"_target_": f"__main__.models.{config.model.name}", **config.model.kwargs}).to(device)
36
+ model_weights_file = run_dir / f'model.pt'
37
+ optimizer = torch.optim.Adam(model.parameters())
38
+ optimizer_weights_file = run_dir / f'optimizer.pt'
39
+
40
+ def evaluate_model(stage=0, epoch=0):
41
+ plot_dir = run_dir / "plots" / f"{stage}_{epoch}"
42
+ plot_dir.mkdir(exist_ok=True, parents=True)
43
+
44
+ torch.save(model.state_dict(), plot_dir / "model.pt")
45
+ torch.save(optimizer.state_dict(), plot_dir / "optimizer.pt")
46
+
47
+ num_plots = config.logging.num_plots
48
+ pipe = get_datapipe(
49
+ **config.data,
50
+ include_response_function=True,
51
+ batch_size=num_plots
52
+ )
53
+ batch = next(iter(pipe))
54
+
55
+ with torch.no_grad():
56
+ out = model(batch['noised_spectrum'].to(device))
57
+ noised_est = torchaudio.functional.convolve(out['denoised'], out['response'].flip(dims=(-1,)).unsqueeze(1), mode="same").cpu()
58
+
59
+ for i in range(num_plots):
60
+ plt.figure(figsize=(30,6))
61
+ plt.plot(batch['theoretical_spectrum'].cpu().numpy()[i,0])
62
+ plt.plot(out['denoised'].cpu().numpy()[i,0])
63
+ plt.savefig(plot_dir / f"{i:03d}_spectrum_clean.png")
64
+
65
+ plt.figure(figsize=(30,6))
66
+ plt.plot(batch['noised_spectrum'].cpu().numpy()[i,0])
67
+ plt.plot(noised_est.cpu().numpy()[i,0])
68
+ plt.savefig(plot_dir / f"{i:03d}_spectrum_noise.png")
69
+
70
+ plt.figure(figsize=(10,6))
71
+ plt.plot(batch['response_function'].cpu().numpy()[i,0,0])
72
+ plt.plot(out['response'].cpu().numpy()[i])
73
+ plt.savefig(plot_dir / f"{i:03d}_response.png")
74
+
75
+ if "attention" in out:
76
+ plt.figure(figsize=(10, 6))
77
+ plt.plot(out['attention'].cpu().numpy()[i])
78
+ plt.savefig(plot_dir / f"{i:03d}_attention.png")
79
+
80
+ plt.close("all")
81
+
82
+ for i_stage, training_stage in enumerate(config.training):
+     if model_weights_file.is_file():
+         model.load_state_dict(torch.load(model_weights_file, weights_only=True))
+ 
+     if optimizer_weights_file.is_file():
+         optimizer.load_state_dict(torch.load(optimizer_weights_file, weights_only=True))
+     optimizer.param_groups[0]['lr'] = training_stage.learning_rate
+ 
+     pipe = get_datapipe(
+         **config.data,
+         include_response_function=True,
+         batch_size=training_stage.batch_size
+     )
+ 
+     losses_history = []
+     losses_history_limit = 64 * 100 // training_stage.batch_size
+ 
+     last_evaluation = 0
+     for epoch, batch in pipe.enumerate():
+ 
+         # logging
+         iters_done = epoch * training_stage.batch_size
+         if (iters_done - last_evaluation) > config.logging.step:
+             evaluate_model(i_stage, epoch)
+             last_evaluation = iters_done
+ 
+         if iters_done > training_stage.max_iters:
+             evaluate_model(i_stage, epoch)
+             break
+ 
+         # run model
+         out = model(batch['noised_spectrum'].to(device))
+         # calculate losses
+         loss_response = torch.nn.functional.mse_loss(out['response'], batch['response_function'].squeeze(dim=(1, 2)).to(device))
+         loss_clean = torch.nn.functional.mse_loss(out['denoised'], batch['theoretical_spectrum'].to(device))
+         noised_est = torchaudio.functional.convolve(out['denoised'], out['response'].flip(dims=(-1,)).unsqueeze(1), mode="same")
+         loss_noised = torch.nn.functional.mse_loss(noised_est, batch['noised_spectrum'].to(device))
+         loss = config.losses_weights.response * loss_response + config.losses_weights.clean * loss_clean + config.losses_weights.noised * loss_noised
+ 
+         # logging
+         losses_history.append(loss_clean.item())
+         losses_history = losses_history[-losses_history_limit:]
+         loss_avg = sum(losses_history) / len(losses_history)
+         message = f"{epoch:7d} {loss:0.3e} {loss_avg:0.3e} {loss_clean:0.3e} {loss_response:0.3e} {loss_noised:0.3e}"
+         with open(run_dir / 'train.txt', 'a') as f:
+             f.write(message + '\n')
+         print(message, flush=True)
+ 
+         # save best (running average of the clean-spectrum loss)
+         if loss_avg < minimum:
+             minimum = loss_avg
+             torch.save(model.state_dict(), model_weights_file)
+             torch.save(optimizer.state_dict(), optimizer_weights_file)
+ 
+         # update weights
+         loss.backward()
+         torch.nn.utils.clip_grad_norm_(model.parameters(), 1.)
+         optimizer.step()
+         optimizer.zero_grad()
weights/shimnet_600MHz.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8bac4bbff78b4d898c07ac0daab30203ead33a896d5803060593746bbb15dd10
+ size 889765
weights/shimnet_700MHz.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b42c9a2dfd4cb3ed0ce3aa328ea09cce03b9b3713e92b728229f24fff3781835
+ size 880746