The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version analyzes a neural network architecture and generates a hyperparameter configuration, such as learning rate, batch size, dropout, and momentum, for a given task. This approach offers a competitive alternative to traditional optimization methods such as the Optuna framework.
## Model Details

- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Model type: Causal Language Model (Transformer-based)
- Language(s) (NLP): Primarily English
- License: MIT

A large language model used in the <a href='https://github.com/ABrain-One/NN-GPT'>NNGPT</a> project for generating training hyperparameters for neural networks from the <a href='https://github.com/ABrain-One/NN-Dataset'>LEMUR NN Dataset</a>.

## How to Use

```
Code of that model: {entry['nn_code']}
```

Replace placeholders such as `{entry['name']}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values.
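
For instance, assembling the prompt from a dataset entry might look like the sketch below. The full prompt template is not shown on this card, so every field line other than `Code of that model:` is hypothetical, and the `entry` values are made up for illustration:

```python
# Hypothetical entry from the LEMUR NN Dataset; field names follow the
# placeholders mentioned above, the values are invented for illustration.
entry = {
    "name": "ResNet-18",
    "task": "image-classification",
    "dataset": "CIFAR-10",
    "nn_code": "class Net(nn.Module): ...",
}

# Substitute the entry's fields into the prompt template.
prompt = (
    f"Task: {entry['task']}\n"
    f"Dataset: {entry['dataset']}\n"
    f"Model: {entry['name']}\n"
    f"Code of that model: {entry['nn_code']}"
)
print(prompt)
```

The resulting string is what you would pass to the fine-tuned model as its input.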
## Model Sources
Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
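
The card does not specify the format of the model's output. Assuming the model emits its hyperparameter suggestion as a JSON object with keys like those listed in the introduction (learning rate, batch size, dropout, momentum), a response could be consumed like this; the `response` string below is invented for illustration:

```python
import json

# Invented example of a generated response; the real output format may differ.
response = '{"learning_rate": 0.001, "batch_size": 64, "dropout": 0.2, "momentum": 0.9}'

# Parse the suggested hyperparameters and hand them to your training loop.
hyperparams = json.loads(response)
print(hyperparams["learning_rate"], hyperparams["batch_size"])
```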