The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version analyzes neural network architectures and generates hyperparameter configurations, such as learning rate, batch size, dropout, and momentum, for a given task. This approach offers a competitive alternative to traditional optimization methods such as the Optuna framework.
## Model Details
- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Load the model and tokenizer with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ABrain/DeepSeek-R1-Distill-Qwen-7B-R"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
## Prompt Example

```python
"""
Generate only the values (do not provide any explanation) of the hyperparameters ({prm_names}) of a given model:
{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, with transformation: {entry['transform_code']},
so that the model achieves accuracy = {entry['accuracy']} with number of training epochs = {entry['epoch']}.
Code of that model: {entry['nn_code']}
"""
```

Replace placeholders such as `{entry['name']}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values.
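
For instance, the template can be filled with an ordinary Python f-string. A minimal sketch, where the `entry` record, the `prm_names` string, and every value in them are purely illustrative placeholders, not data from this model's training set:

```python
# Hypothetical example entry -- substitute your own task description.
entry = {
    "metric": "accuracy",
    "task": "image classification",
    "dataset": "CIFAR-10",
    "transform_code": "transforms.ToTensor()",
    "accuracy": 0.92,
    "epoch": 50,
    "nn_code": "class Net(nn.Module): ...",
}
prm_names = "learning rate, batch size, dropout, momentum"

# Fill the prompt template from above with the concrete values.
prompt = (
    f"Generate only the values (do not provide any explanation) of the "
    f"hyperparameters ({prm_names}) of a given model:\n"
    f"{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, "
    f"with transformation: {entry['transform_code']},\n"
    f"so that the model achieves accuracy = {entry['accuracy']} with number of "
    f"training epochs = {entry['epoch']}.\n"
    f"Code of that model: {entry['nn_code']}"
)
print(prompt)
```

The resulting `prompt` string is what you tokenize and pass to the model for generation.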
## Model Sources
Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R