Update README.md

---
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
library_name: peft
---
A large language model used in the <a href='https://github.com/ABrain-One/NN-GPT'>NNGPT</a> project for generating training hyperparameters for neural networks from the <a href='https://github.com/ABrain-One/NN-Dataset'>LEMUR NN Dataset</a>.

# Model Card for DeepSeek-R1-Distill-Qwen-7B-R
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Adapter repository listed under Model Sources below
model_path = "ABrain/DeepSeek-R1-Distill-Qwen-7B-R"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
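Once loaded, the model is queried with a description of the target network. The prompt wording below is an illustrative assumption (the actual NNGPT prompt template lives in the NN-GPT repository), and the generation calls are commented out because they require downloading the checkpoint:

```python
# Hypothetical prompt builder; the real NNGPT template may differ.
def build_prompt(model_code: str) -> str:
    return (
        "Suggest training hyperparameters (learning rate, batch size, "
        "momentum, epochs) for the following neural network:\n"
        + model_code
    )

prompt = build_prompt("class Net(nn.Module): ...")

# Generation (requires the ~7B checkpoint loaded above):
# inputs = tokenizer(prompt, return_tensors="pt")
# outputs = model.generate(**inputs, max_new_tokens=256)
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```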

## Model Details

The fine-tuning of the DeepSeek-R1-Distill-Qwen-7B model was conducted to explore the feasibility of using large language models (LLMs) for hyperparameter optimization in deep learning. The goal was to assess whether LLMs can effectively predict optimal hyperparameters for various neural network architectures, providing a competitive alternative to traditional optimization methods such as Optuna.

- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Model type: Causal Language Model (Transformer-based)
- Language(s) (NLP): Primarily English
- License: MIT
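Because the model emits hyperparameters as free text, a downstream step has to parse them into numeric values. A minimal sketch, assuming a simple `name: value` response format (the actual NNGPT output format may differ):

```python
import re

# Hypothetical parser: extracts "name: value" or "name = value"
# hyperparameter pairs from a model response.
def parse_hyperparameters(text: str) -> dict:
    params = {}
    for name, value in re.findall(r"(\w[\w ]*?)\s*[:=]\s*([0-9.eE+-]+)", text):
        params[name.strip().lower()] = float(value)
    return params

response = "learning rate: 0.01\nbatch size: 64\nmomentum: 0.9"
print(parse_hyperparameters(response))
# → {'learning rate': 0.01, 'batch size': 64.0, 'momentum': 0.9}
```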

## Model Sources

- Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R