The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version analyzes neural network architectures and generates hyperparameter configurations, such as learning rate, batch size, dropout, and momentum, for a given task. This approach offers a competitive alternative to traditional optimization methods such as the Optuna framework.
## Model Details
- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B

Load the model and tokenizer with the `transformers` library:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ABrain/DeepSeek-R1-Distill-Qwen-7B-R"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
## Prompt Example

```python
"""
Generate only the values (do not provide any explanation) of the hyperparameters ({prm_names}) of a given model:
{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, with transformation: {entry['transform_code']},
so that the model achieves accuracy = {entry['accuracy']} with number of training epochs = {entry['epoch']}.
Code of that model: {entry['nn_code']}
"""
```

Replace placeholders such as `{entry['name']}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values.
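
For instance, the template can be filled with an ordinary Python f-string. A minimal sketch, where the `entry` record, the `prm_names` string, and every value in them are purely illustrative placeholders, not data from this model's training set:

```python
# Hypothetical example entry -- substitute your own task description.
entry = {
    "metric": "accuracy",
    "task": "image classification",
    "dataset": "CIFAR-10",
    "transform_code": "transforms.ToTensor()",
    "accuracy": 0.92,
    "epoch": 50,
    "nn_code": "class Net(nn.Module): ...",
}
prm_names = "learning rate, batch size, dropout, momentum"

# Fill the prompt template from above with the concrete values.
prompt = (
    f"Generate only the values (do not provide any explanation) of the "
    f"hyperparameters ({prm_names}) of a given model:\n"
    f"{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, "
    f"with transformation: {entry['transform_code']},\n"
    f"so that the model achieves accuracy = {entry['accuracy']} with number of "
    f"training epochs = {entry['epoch']}.\n"
    f"Code of that model: {entry['nn_code']}"
)
print(prompt)
```

The resulting `prompt` string is what you tokenize and pass to the model for generation.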
## Model Sources
Repository: ABrain/DeepSeek-R1-Distill-Qwen-7B-R