mind2cloud committed
Commit ae3c46c · verified · 1 Parent(s): dbd64c9

Update README.md

Files changed (1):
1. README.md +7 -7
README.md CHANGED
@@ -8,13 +8,6 @@ library_name: peft
 
 The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version can analyze neural network architectures and generate optimal hyperparameter configurations — such as learning rate, batch size, dropout, momentum, and so on — for a given task. This approach offers a competitive alternative to traditional optimization methods like the Optuna Framework.
 
-## Model Details
-- Developed by: [Roman Kochnev / ABrain]
-- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
-- Model type: Causal Language Model (Transformer-based)
-- Language(s) (NLP): Primarily English (or multilingual, if applicable)
-- License: MIT
-
 A large language model used in the <a href='https://github.com/ABrain-One/NN-GPT'>NNGPT</a> project for generating training hyperparameters for neural networks from the <a href='https://github.com/ABrain-One/NN-Dataset'>LEMUR NN Dataset</a>
 
 # How to Use
@@ -38,5 +31,12 @@ Code of that model: {entry['nn_code']}
 ```
 Replace placeholders such as `{entry['name']}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values.
 
+## Model Details
+- Developed by: [Roman Kochnev / ABrain]
+- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+- Model type: Causal Language Model (Transformer-based)
+- Language(s) (NLP): Primarily English (or multilingual, if applicable)
+- License: MIT
+
 ## Model Sources
 Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
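
The placeholder substitution the README describes can be sketched in Python. The dict keys (`name`, `task`, `dataset`, `nn_code`) are the fields the README names; the example values and the surrounding prompt wording are illustrative assumptions, since the full prompt template is elided in this diff view:

```python
# Illustrative sketch: fill the README's placeholder fields into a prompt.
# Keys come from the README; values and prompt wording are assumptions.
entry = {
    "name": "SimpleCNN",
    "task": "image classification",
    "dataset": "CIFAR-10",
    "nn_code": "class SimpleCNN(nn.Module): ...",
}

prompt = (
    f"Model name: {entry['name']}\n"
    f"Task: {entry['task']}\n"
    f"Dataset: {entry['dataset']}\n"
    f"Code of that model: {entry['nn_code']}"
)
print(prompt)
```

The resulting string would then be tokenized and passed to the fine-tuned model as described in the How to Use section.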