mind2cloud committed
Commit bbc7cac · verified · 1 Parent(s): 2b6bb46

Update README.md

Files changed (1): README.md (+2 -3)
README.md CHANGED
@@ -4,7 +4,7 @@ base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
 library_name: peft
 ---
 
-The fine-tuning of the DeepSeek-R1-Distill-Qwen-7B model was conducted to explore the feasibility of using large language models (LLMs) for hyperparameter optimization in deep learning. The goal was to assess whether LLMs can effectively predict optimal hyperparameters for various neural network architectures, providing a competitive alternative to traditional optimization methods like Optuna Framework.
+The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. By leveraging large language models (LLMs), this fine-tuned version can analyze neural network architectures and generate optimal hyperparameter configurations (learning rate, batch size, dropout, momentum, etc.) for a given task. This approach provides a competitive alternative to traditional optimization methods such as the Optuna framework.
 
 ## Model Details
 - Developed by: [Roman Kochnev / ABrain]
@@ -27,5 +27,4 @@ model = AutoModelForCausalLM.from_pretrained(model_path)
 ```
 
 ## Model Sources
-Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
-
+Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
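
A minimal sketch of how the fine-tuned adapter described above might be queried for hyperparameter suggestions. The prompt wording and the `build_prompt`/`suggest_hyperparameters` helpers are assumptions for illustration (the card does not specify a prompt format); the repo ids come from the card itself, and loading the adapter follows the standard `peft` pattern implied by `library_name: peft`:

```python
# Hypothetical usage sketch, assuming the model answers free-form prompts
# asking for hyperparameters. Prompt format is NOT specified by the card.

def build_prompt(architecture_description: str) -> str:
    """Compose a hypothetical prompt asking for hyperparameter suggestions."""
    return (
        "Suggest optimal hyperparameters (learning rate, batch size, "
        "dropout, momentum) for the following neural network:\n"
        f"{architecture_description}\n"
        "Answer:"
    )

def suggest_hyperparameters(architecture_description: str) -> str:
    """Load the base model plus the PEFT adapter and generate a suggestion.

    Requires `transformers` and `peft`; downloads several GB of weights,
    so the call is kept out of module import.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
    adapter = "ABrain/DeepSeek-R1-Distill-Qwen-7B-R"  # repository from the card

    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)
    model = PeftModel.from_pretrained(model, adapter)

    inputs = tokenizer(build_prompt(architecture_description),
                       return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Cheap to run: only shows the prompt, no model download.
    print(build_prompt("ResNet-18 on CIFAR-10"))
```

The adapter-on-base loading pattern (`AutoModelForCausalLM.from_pretrained` followed by `PeftModel.from_pretrained`) is the usual way to run a PEFT checkpoint without merging weights.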