The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version analyzes a neural network architecture and generates a hyperparameter configuration, such as learning rate, batch size, dropout, and momentum, for a given task. This approach offers a competitive alternative to traditional optimization methods such as the Optuna framework.
## Model Details

- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Model type: Causal Language Model (Transformer-based)
- Language(s) (NLP): Primarily English
- License: MIT

A large language model used in the <a href='https://github.com/ABrain-One/NN-GPT'>NNGPT</a> project for generating training hyperparameters for neural networks from the <a href='https://github.com/ABrain-One/NN-Dataset'>LEMUR NN Dataset</a>.

## How to Use

```
Code of that model: {entry['nn_code']}
```

Replace placeholders such as `{entry['name']}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values.
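
For instance, assembling the prompt from a dataset entry might look like the sketch below. The full prompt template is not shown on this card, so every field line other than `Code of that model:` is hypothetical, and the `entry` values are made up for illustration:

```python
# Hypothetical entry from the LEMUR NN Dataset; field names follow the
# placeholders mentioned above, the values are invented for illustration.
entry = {
    "name": "ResNet-18",
    "task": "image-classification",
    "dataset": "CIFAR-10",
    "nn_code": "class Net(nn.Module): ...",
}

# Substitute the entry's fields into the prompt template.
prompt = (
    f"Task: {entry['task']}\n"
    f"Dataset: {entry['dataset']}\n"
    f"Model: {entry['name']}\n"
    f"Code of that model: {entry['nn_code']}"
)
print(prompt)
```

The resulting string is what you would pass to the fine-tuned model as its input.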
## Model Sources
Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R
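
The card does not specify the format of the model's output. Assuming the model emits its hyperparameter suggestion as a JSON object with keys like those listed in the introduction (learning rate, batch size, dropout, momentum), a response could be consumed like this; the `response` string below is invented for illustration:

```python
import json

# Invented example of a generated response; the real output format may differ.
response = '{"learning_rate": 0.001, "batch_size": 64, "dropout": 0.2, "momentum": 0.9}'

# Parse the suggested hyperparameters and hand them to your training loop.
hyperparams = json.loads(response)
print(hyperparams["learning_rate"], hyperparams["batch_size"])
```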