IDPLab committed on
Commit
9f0ecbd
·
verified ·
1 Parent(s): 809d87a

Update Tesei-trained_Model/README.txt

  1. Tesei-trained_Model/README.txt +53 -53
Tesei-trained_Model/README.txt CHANGED
@@ -1,53 +1,53 @@
1
- This readme file was generated on 2024-07-23 by Lilianna Houston
2
-
3
- GENERAL INFORMATION
4
-
5
- Title of Project: PML, Tesei-trained Model
6
-
7
- Principal Investigator Information
8
- Name: Kingshuk Ghosh
9
- Institution: University of Denver
10
- Email: kingshuk.ghosh@du.edu
11
-
12
- Author Information
13
- Name: Lilianna Houston
14
- Institution: University of Denver
15
- Email: lili.houston@du.edu
16
-
17
- DATA & FILE OVERVIEW
18
-
19
- File List:
20
-
21
- "weights" -> Folder containing weights from the Tesei-trained CNN that predicts omega_2 from sequence. We trained the model 10 separate times on all omega_2 calculations from the Tesei 2023 dataset and provide all 10 resulting weights.
22
-
23
- "Tesei_w2_Ree_preds" -> CSV containing calculated and ML-predicted omega_2 (w2) (predicted using 10-fold cross-validation), as well as reported and predicted R_ee for the Tesei 2023 dataset. Sequences where the omega_2 calculation failed are omitted.
24
-
25
- "exper_seqs_master" -> CSV of our compiled experimental sequences, including source, sequences, salt, pH, temperature, reported R_g, and our predicted R_g (our value is averaged across the results of all 10 trained models). This is used as the input file for extract_w2.py. [Use a different CSV if you want to use a different sequence or set of sequences.]
26
-
27
- "extract_w2" -> .py file that extracts the omega_2s of a specified list of sequences using a specified set of weights. If you change the input file, make sure the script points to the correct CSV, and also change the output file name at the end of the code.
28
-
29
- "exper_seqs_w2preds" -> CSV file. Same content as "exper_seqs_master," with the addition of predicted w2s using weights_0 from the "weights" folder. This is used as the input for extract_Rg.
30
-
31
- "extract_Rg" -> .py file that extracts the x, R_ee, and R_gs of a specified list of sequences using omega_2. Currently w2 is read from exper_seqs_w2preds; use a different file if you wrote a different output above.
32
-
33
- "OBfmt_5-1500.npy" -> Helper file for "extract_Rg" containing precalculated terms.
34
-
35
- "theory_functions" -> .py helper file for "extract_Rg" containing constants and functions needed for R_g calculation.
36
-
37
- "environment.yml" -> yml file used to create conda environment in which to run "extract_w2".
38
-
39
- USAGE
40
-
41
- To run "extract_w2," you must first create a conda environment on Linux to ensure you have the necessary ML packages installed.
42
- Step 1) clone the Hugging Face repository: git clone https://huggingface.co/IDPLab/IDPconformation
43
- Step 2) create the conda environment: conda env create -f environment.yml
44
- Step 3) activate the environment: conda activate kingml
45
- Step 4) make sure to have the following modules loaded:
46
- module load compilers/anaconda-3.8-2020.11
47
- module load cuda11.8/toolkit/11.8.0
48
- module load libraries/cuDNN/7.6.5
49
- Step 5) run "extract_w2": python extract_w2.py
50
-
51
-
52
-
53
-
 
1
+ This readme file was generated on 2024-07-23 by Lilianna Houston
2
+
3
+ GENERAL INFORMATION
4
+
5
+ Title of Project: PML, Tesei-trained Model
6
+
7
+ Principal Investigator Information
8
+ Name: Kingshuk Ghosh
9
+ Institution: University of Denver
10
+ Email: kingshuk.ghosh@du.edu
11
+
12
+ Author Information
13
+ Name: Lilianna Houston
14
+ Institution: University of Denver
15
+ Email: lili.houston@du.edu
16
+
17
+ DATA & FILE OVERVIEW
18
+
19
+ File List:
20
+
21
+ "weights" -> Folder containing weights from the Tesei-trained CNN that predicts omega_2 from sequence. We trained the model 10 separate times on all omega_2 calculations from the Tesei 2023 dataset and provide all 10 resulting weights.
22
+
23
+ "Tesei_w2_Ree_preds" -> CSV containing calculated and ML-predicted omega_2 (w2) (predicted using 10-fold cross-validation), as well as reported and predicted R_ee for the Tesei 2023 dataset. Sequences where the omega_2 calculation failed are omitted.
24
+
25
+ "exper_seqs_master" -> CSV of our compiled experimental sequences, including source, sequences, salt, pH, temperature, reported R_g, and our predicted R_g (our value is averaged across the results of all 10 trained models). This is used as the input file for extract_w2.py. [Use a different CSV if you want to use a different sequence or set of sequences.]
26
+
27
+ "extract_w2" -> .py file that extracts the omega_2s of a specified list of sequences using a specified set of weights. If you change the input file, make sure the script points to the correct CSV, and also change the output file name at the end of the code.
28
+
29
+ "exper_seqs_Rg_and_w2_preds_Tesei_model.csv" -> CSV file. Experimental sequences with the addition of average predicted w2s, obtained using the weights in the "weights" folder, and average R_gs calculated from the predicted w2s using "extract_Rg".
30
+
31
+ "extract_Rg" -> .py file that extracts the x, R_ee, and R_gs of a specified list of sequences using omega_2.
32
+
33
+ "OBfmt_5-1500.npy" -> Helper file for "extract_Rg" containing precalculated terms.
34
+
35
+ "theory_functions" -> .py helper file for "extract_Rg" containing constants and functions needed for R_g calculation.
36
+
37
+ "environment.yml" -> yml file used to create conda environment in which to run "extract_w2".
38
+
39
+ USAGE
40
+
41
+ To run "extract_w2," you must first create a conda environment on Linux to ensure you have the necessary ML packages installed.
42
+ Step 1) clone the Hugging Face repository: git clone https://huggingface.co/IDPLab/IDPconformation
43
+ Step 2) create the conda environment: conda env create -f environment.yml
44
+ Step 3) activate the environment: conda activate kingml
45
+ Step 4) make sure to have the following modules loaded:
46
+ module load compilers/anaconda-3.8-2020.11
47
+ module load cuda11.8/toolkit/11.8.0
48
+ module load libraries/cuDNN/7.6.5
49
+ Step 5) run "extract_w2": python extract_w2.py
50
+
51
+
52
+
53
+
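
The file list notes that the predicted values in the CSVs are averaged across the results of all 10 trained models. A minimal sketch of that averaging step follows, assuming each model's predictions are available as a plain list in the same sequence order. The function name and the toy numbers are hypothetical and are not taken from extract_w2 or extract_Rg.

```python
from statistics import fmean


def average_across_models(per_model_preds):
    """Average per-sequence predictions over several trained models.

    per_model_preds: one inner list per model, each holding one predicted
    value per sequence, with the same sequence order in every list.
    (Hypothetical helper for illustration only.)
    """
    if not per_model_preds:
        raise ValueError("need predictions from at least one model")
    n_seqs = len(per_model_preds[0])
    if any(len(preds) != n_seqs for preds in per_model_preds):
        raise ValueError("every model must predict the same number of sequences")
    # zip(*...) regroups the values so that each tuple holds all models'
    # predictions for a single sequence.
    return [fmean(values) for values in zip(*per_model_preds)]


# Toy example: 3 models, 2 sequences (made-up numbers, not real predictions).
preds = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
averaged = average_across_models(preds)
```

With the 10 weight sets in the "weights" folder, the outer list would hold 10 inner lists, one per trained model.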