mind2cloud committed
Commit bbc7cac · verified · 1 Parent(s): 2b6bb46

Update README.md

Files changed (1): README.md (+2 -3)
README.md CHANGED
@@ -4,7 +4,7 @@ base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
 library_name: peft
 ---
 
-The fine-tuning of the DeepSeek-R1-Distill-Qwen-7B model was conducted to explore the feasibility of using large language models (LLMs) for hyperparameter optimization in deep learning. The goal was to assess whether LLMs can effectively predict optimal hyperparameters for various neural network architectures, providing a competitive alternative to traditional optimization methods like Optuna Framework.
+The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. By leveraging large language models (LLMs), this fine-tuned version can analyze neural network architectures and generate optimal hyperparameter configurations (learning rate, batch size, dropout, momentum, etc.) for a given task. This approach provides a competitive alternative to traditional optimization methods such as the Optuna framework.
 
 ## Model Details
 - Developed by: [Roman Kochnev / ABrain]
@@ -27,5 +27,4 @@ model = AutoModelForCausalLM.from_pretrained(model_path)
 ```
 
 ## Model Sources
-Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
-
+Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
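
A minimal sketch of how the fine-tuned adapter described above might be queried for hyperparameter suggestions. The prompt wording and the `build_prompt`/`suggest_hyperparameters` helpers are assumptions for illustration (the card does not specify a prompt format); the repo ids come from the card itself, and loading the adapter follows the standard `peft` pattern implied by `library_name: peft`:

```python
# Hypothetical usage sketch, assuming the model answers free-form prompts
# asking for hyperparameters. Prompt format is NOT specified by the card.

def build_prompt(architecture_description: str) -> str:
    """Compose a hypothetical prompt asking for hyperparameter suggestions."""
    return (
        "Suggest optimal hyperparameters (learning rate, batch size, "
        "dropout, momentum) for the following neural network:\n"
        f"{architecture_description}\n"
        "Answer:"
    )

def suggest_hyperparameters(architecture_description: str) -> str:
    """Load the base model plus the PEFT adapter and generate a suggestion.

    Requires `transformers` and `peft`; downloads several GB of weights,
    so the call is kept out of module import.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
    adapter = "ABrain/DeepSeek-R1-Distill-Qwen-7B-R"  # repository from the card

    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)
    model = PeftModel.from_pretrained(model, adapter)

    inputs = tokenizer(build_prompt(architecture_description),
                       return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Cheap to run: only shows the prompt, no model download.
    print(build_prompt("ResNet-18 on CIFAR-10"))
```

The adapter-on-base loading pattern (`AutoModelForCausalLM.from_pretrained` followed by `PeftModel.from_pretrained`) is the usual way to run a PEFT checkpoint without merging weights.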