# Update README.md

`README.md` (changed):
**Previous version** (only the lines changed by this commit; everything else is identical to the updated version):

- `base_model: deepseek-ai/DeepSeek-` (left truncated in the old file)
- The "Model Details" section opened with this paragraph, removed by the commit: "The fine-tuning of the deepseek-ai/deepseek-coder-1.3b-base model was conducted to explore the feasibility of using large language models (LLMs) for hyperparameter optimization in deep learning. The goal was to assess whether LLMs can effectively predict optimal hyperparameters for various neural network architectures, providing a competitive alternative to traditional optimization methods like Optuna."
- `- Finetuned from model: deepseek-ai/deepseek-` (left truncated)
**Updated version:**

```yaml
---
license: mit
base_model: deepseek-ai/DeepSeek-Coder-1.3b-Base-R
library_name: peft
---
```

# DeepSeek-Coder-1.3b-Base-R

The DeepSeek-Coder-1.3b-Base model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version can analyze neural network architectures and generate optimal hyperparameter configurations (such as learning rate, batch size, dropout, and momentum) for a given task. This approach offers a competitive alternative to traditional optimization methods such as the Optuna framework.

This large language model is used in the [NNGPT](https://github.com/ABrain-One/NN-GPT) project to generate training hyperparameters for neural networks from the [LEMUR NN Dataset](https://github.com/ABrain-One/NN-Dataset).

# How to Use
This repository provides a **fine-tuned version** of [deepseek-ai/deepseek-coder-1.3b-base](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-base) using the [PEFT](https://github.com/huggingface/peft) library with LoRA. The final model is **merged** so it can be loaded in one step via:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ABrain/DeepSeek-Coder-1.3b-Base-R"  # repository listed under Model Sources
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
# Prompt Example

```python
"""
Generate only the values (do not provide any explanation) of the hyperparameters ({prm_names}) of a given model:
{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, with transformation: {entry['transform_code']},
so that the model achieves accuracy = {entry['accuracy']} with number of training epochs = {entry['epoch']}.
Code of that model: {entry['nn_code']}
"""
```
Replace placeholders such as `{prm_names}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values.
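As a minimal sketch, the template can be filled with ordinary f-string formatting; the `entry` values below are illustrative stand-ins, not real records from the LEMUR NN Dataset:

```python
# Illustrative stand-in values; a real `entry` would come from the LEMUR NN Dataset.
prm_names = "learning rate, batch size, dropout, momentum"
entry = {
    "metric": "accuracy",
    "task": "image classification",
    "dataset": "CIFAR-10",
    "transform_code": "transforms.Compose([transforms.ToTensor()])",
    "accuracy": 0.92,
    "epoch": 50,
    "nn_code": "class Net(nn.Module): ...",
}

# Fill the prompt template from above with the concrete values.
prompt = (
    f"Generate only the values (do not provide any explanation) of the "
    f"hyperparameters ({prm_names}) of a given model:\n"
    f"{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, "
    f"with transformation: {entry['transform_code']},\n"
    f"so that the model achieves accuracy = {entry['accuracy']} "
    f"with number of training epochs = {entry['epoch']}.\n"
    f"Code of that model: {entry['nn_code']}"
)

# The filled prompt is then passed to the merged model loaded above, e.g.:
# inputs = tokenizer(prompt, return_tensors="pt")
# outputs = model.generate(**inputs, max_new_tokens=64)
# print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```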
## Model Details

- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/deepseek-coder-1.3b-base
- Model type: Causal Language Model (Transformer-based)
- Language(s) (NLP): Primarily English (or multilingual, if applicable)
- License: MIT

## Model Sources

Repository: ABrain/DeepSeek-Coder-1.3b-Base-R
|