Update README.md
README.md CHANGED
@@ -68,19 +68,19 @@ Use the model as a reference for research support and hypothesis generation in p

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

-The model
+The model is trained on a curated collection of scientific literature, experimental datasets, and publicly available resources related to perovskite solar cell precursor additives. The dataset includes research articles and drug databases, focusing on synthesis, additive effects, and device performance. All training data has been uploaded and is documented for transparency and reproducibility.

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->

-The model
-Training
-Fine-tuning
+The model is trained using a transformer-based architecture optimized for scientific text.
+Training is performed on high-performance GPUs with gradient accumulation.
+Fine-tuning is conducted on curated perovskite precursor additive datasets.

#### Training Hyperparameters

-- **Training regime:**
+- **Training regime:** During instruction tuning, we use QwQ-32B as the base model and apply LoRA on all weight matrices with rank 16, alpha 32, and dropout 0.1. Training is conducted in bfloat16 precision with a per-device batch size of 1 and gradient accumulation of 8; the initial learning rate of 1e-4 decays via cosine annealing with a 5% warmup, over a total of 10 epochs. FlashAttention2 is employed to improve efficiency and memory usage. <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->


## Evaluation
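The training regime added in this commit maps onto a standard `peft` + `transformers` setup. The sketch below is illustrative only, not the training script from this commit: it assumes "all weight matrices" means the Qwen2-style attention and MLP projections, and `output_dir` is a placeholder.

```python
# Minimal sketch of the stated LoRA instruction-tuning setup.
# Assumptions (not from this commit): `target_modules` interprets
# "all weight matrices" as the Qwen2-style projections, and
# `output_dir` is a placeholder name.
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/QwQ-32B",                           # base model named in the card
    torch_dtype=torch.bfloat16,               # bf16 precision
    attn_implementation="flash_attention_2",  # FlashAttention2 for efficiency
)

lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,              # rank 16
    lora_alpha=32,     # alpha 32
    lora_dropout=0.1,  # dropout 0.1
    target_modules=[   # assumed reading of "all weight matrices"
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
model = get_peft_model(model, lora_config)

args = TrainingArguments(
    output_dir="qwq32b-perovskite-lora",  # placeholder
    per_device_train_batch_size=1,        # batch size 1 per device
    gradient_accumulation_steps=8,        # accumulate gradients over 8 steps
    learning_rate=1e-4,                   # initial learning rate
    lr_scheduler_type="cosine",           # cosine annealing decay
    warmup_ratio=0.05,                    # 5% warmup
    num_train_epochs=10,                  # 10 epochs
    bf16=True,
)
```

With these arguments the effective batch size is 8 sequences per device (1 × 8 accumulation steps), which is how the "gradient accumulation" line in the Training Procedure is typically realized in practice.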