mind2cloud committed · verified
Commit 2b6bb46 · Parent(s): 2f67fe5

Update README.md

Files changed (1):
  1. README.md (+10 −8)

README.md CHANGED
@@ -4,6 +4,15 @@ base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
 library_name: peft
 ---
 
+The fine-tuning of the DeepSeek-R1-Distill-Qwen-7B model was conducted to explore the feasibility of using large language models (LLMs) for hyperparameter optimization in deep learning. The goal was to assess whether LLMs can effectively predict optimal hyperparameters for various neural network architectures, providing a competitive alternative to traditional optimization methods like the Optuna framework.
+
+## Model Details
+- Developed by: [Roman Kochnev / ABrain]
+- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
+- Model type: Causal Language Model (Transformer-based)
+- Language(s) (NLP): Primarily English (or multilingual, if applicable)
+- License: MIT
+
 A large language model used in the <a href='https://github.com/ABrain-One/NN-GPT'>NNGPT</a> project for generating training hyperparameters for neural networks from the <a href='https://github.com/ABrain-One/NN-Dataset'>LEMUR NN Dataset</a>
 
 # Model Card for DeepSeek-R1-Distill-Qwen-7B-R
@@ -17,13 +26,6 @@ tokenizer = AutoTokenizer.from_pretrained(model_path)
 model = AutoModelForCausalLM.from_pretrained(model_path)
 ```
 
-## Model Details
-The fine-tuning of the DeepSeek-R1-Distill-Qwen-7B model was conducted to explore the feasibility of using large language models (LLMs) for hyperparameter optimization in deep learning. The goal was to assess whether LLMs can effectively predict optimal hyperparameters for various neural network architectures, providing a competitive alternative to traditional optimization methods like Optuna.
-- Developed by: [Roman Kochnev / ABrain]
-- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
-- Model type: Causal Language Model (Transformer-based)
-- Language(s) (NLP): Primarily English (or multilingual, if applicable)
-- License: MIT
-
 ## Model Sources
 Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
+
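The usage snippet visible in the diff context loads the checkpoint with plain `AutoModelForCausalLM`; since the card's `library_name` is `peft`, the repository presumably holds an adapter to be applied on top of the base model. A minimal sketch of that loading path, assuming the adapter is published as `ABrain/DeepSeek-R1-Distill-Qwen-7B-R` and that `PeftModel.from_pretrained` is the intended attach mechanism (neither is confirmed by the diff itself):

```python
def load_finetuned_model(
    adapter_path="ABrain/DeepSeek-R1-Distill-Qwen-7B-R",
    base_path="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
):
    """Load the base model and attach the PEFT adapter on top of it.

    Imports are kept local so the sketch can be read (and its signature
    inspected) without the heavy transformers/peft dependencies installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Tokenizer and frozen base weights come from the upstream checkpoint
    tokenizer = AutoTokenizer.from_pretrained(base_path)
    base_model = AutoModelForCausalLM.from_pretrained(base_path)
    # Apply the fine-tuned adapter weights from this repository
    model = PeftModel.from_pretrained(base_model, adapter_path)
    return tokenizer, model
```

Calling `load_finetuned_model()` downloads both checkpoints, so it is best run once and cached; the default paths above are assumptions taken from the card's metadata and repository name.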