---
license: mit
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
library_name: peft
---
# DeepSeek-R1-Distill-Qwen-7B-R
The DeepSeek-R1-Distill-Qwen-7B model has been fine-tuned **to predict hyperparameters for neural network models**. Leveraging the power of large language models (LLMs), this version analyzes neural network architectures and generates hyperparameter configurations, such as learning rate, batch size, dropout, and momentum, for a given task. This approach offers a competitive alternative to traditional optimization frameworks such as Optuna.
This large language model is used in the <a href='https://github.com/ABrain-One/NN-GPT'>NNGPT</a> project to generate training hyperparameters for neural networks from the <a href='https://github.com/ABrain-One/NN-Dataset'>LEMUR NN Dataset</a>.
## How to Use
This repository provides a **fine-tuned version** of [deepseek-ai/DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) using the [PEFT](https://github.com/huggingface/peft) library with LoRA. The final model is **merged** so it can be loaded in one step via:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "ABrain/HPGPT-DeepSeek-R1-Distill-Qwen-7B-R"

# The LoRA adapter has already been merged into the base weights,
# so no PEFT-specific loading step is required.
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path)
```
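Once loaded, inference follows the standard `transformers` generation API. A minimal sketch (the prompt string below is a stand-in for the full template shown in the next section, and `max_new_tokens=256` is an assumed value, not a project default):

```python
# Build a prompt following the template from the "Prompt Example" section below.
prompt = "Generate only the values (do not provide any explanation) of the hyperparameters ..."

# Tokenize, generate, and decode only the newly generated tokens.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```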
## Prompt Example
```python
"""
Generate only the values (do not provide any explanation) of the hyperparameters ({prm_names}) of a given model:
{entry['metric']} for the task: {entry['task']} on dataset: {entry['dataset']}, with transformation: {entry['transform_code']},
so that the model achieves the HIGHEST accuracy with number of training epochs = {entry['epoch']}.
Code of that model: {entry['nn_code']}
"""
```
Replace placeholders such as `{prm_names}`, `{entry['task']}`, `{entry['dataset']}`, etc., with your actual values, as illustrated in the sketch below.
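As a concrete illustration, the template can be filled with Python f-strings; the `entry` fields and `prm_names` below are invented placeholders, not values from the LEMUR dataset:

```python
# Hypothetical values for illustration only; substitute real dataset entries.
entry = {
    "metric": "accuracy",
    "task": "image-classification",
    "dataset": "CIFAR-10",
    "transform_code": "transforms.Compose([transforms.ToTensor()])",
    "epoch": 10,
    "nn_code": "<model source code here>",
}
prm_names = "learning rate, batch size, dropout, momentum"

prompt = (
    f"Generate only the values (do not provide any explanation) of the "
    f"hyperparameters ({prm_names}) of a given model: {entry['metric']} "
    f"for the task: {entry['task']} on dataset: {entry['dataset']}, "
    f"with transformation: {entry['transform_code']}, "
    f"so that the model achieves the HIGHEST accuracy "
    f"with number of training epochs = {entry['epoch']}. "
    f"Code of that model: {entry['nn_code']}"
)
```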
## Model Details
- Developed by: Roman Kochnev / ABrain
- Finetuned from model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Model type: Causal Language Model (Transformer-based)
- Language(s) (NLP): Primarily English
- License: MIT
## Model Sources
Repository: [ABrain/HPGPT-DeepSeek-R1-Distill-Qwen-7B-R](https://huggingface.co/ABrain/HPGPT-DeepSeek-R1-Distill-Qwen-7B-R)