Commit 62cd458 · verified · committed by eacortes · 1 parent: 506d698

Upload README.md

  1. README.md +13 -13
README.md CHANGED
@@ -149,19 +149,6 @@ fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
  print(fill("c1ccccc1[MASK]"))
  ```
 
- ## Intended Use
- * Primary: Research and development for molecular property prediction, experimentation with pooling strategies, and as a foundational model for downstream applications.
- * Appropriate for: Binary / multi-class classification (e.g., toxicity, activity) and single-task or multi-task regression (e.g., solubility, clearance) after fine-tuning.
- * Not intended for generating novel molecules.
-
- ## Limitations
- - Out-of-domain performance may degrade for: very long (>128 token) SMILES, inorganic / organometallic compounds, polymers, or charged / enumerated tautomers are not well represented in training.
- - No guarantee of synthesizability, safety, or biological efficacy.
-
- ## Ethical Considerations & Responsible Use
- - Potential biases arise from training corpora skewed to drug-like space.
- - Do not deploy in clinical or regulatory settings without rigorous, domain-specific validation.
-
  ## Architecture
  - Backbone: ModernBERT
  - Hidden size: 768
@@ -292,6 +279,19 @@ Optimal parameters (per dataset) for the `MLM + DAPT + TAFT OPT` merged model:
 
  </details>
 
+ ## Intended Use
+ * Primary: Research and development for molecular property prediction, experimentation with pooling strategies, and as a foundational model for downstream applications.
+ * Appropriate for: Binary / multi-class classification (e.g., toxicity, activity) and single-task or multi-task regression (e.g., solubility, clearance) after fine-tuning.
+ * Not intended for generating novel molecules.
+
+ ## Limitations
+ - Out-of-domain performance may degrade for: very long (>128 token) SMILES, inorganic / organometallic compounds, polymers, or charged / enumerated tautomers are not well represented in training.
+ - No guarantee of synthesizability, safety, or biological efficacy.
+
+ ## Ethical Considerations & Responsible Use
+ - Potential biases arise from training corpora skewed to drug-like space.
+ - Do not deploy in clinical or regulatory settings without rigorous, domain-specific validation.
+
  ## Hardware
  Training and experiments were performed on 2 NVIDIA RTX 3090 GPUs.
 