thesven
/

Phi-nut-Butter-Codebagel-v1

Text Generation

text-generation-inference

Model card Files Files and versions

Phi-nut-Butter-Codebagel-v1 / README.md

thesven's picture

Update README.md

796bc7a verified almost 2 years ago

|

history blame contribute delete

2.37 kB

	---
	license: mit
	datasets:
	- Replete-AI/code_bagel
	---
	# Phi-nut-Butter-Codebagel-v1

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/6324ce4d5d0cf5c62c6e3c5a/ayrvhUhdbawRVfNiqoOP7.png)

	## Model Details

	Model Name: Phi-nut-Butter-Codebagel-v1
	Base Model: [microsoft/Phi-3-mini-128k-instruct](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)
	Fine-tuning Method: Supervised Fine-Tuning (SFT)
	Dataset: [Code Bagel](https://huggingface.co/datasets/Replete-AI/code_bagel)
	Training Data: 75,000 randomly selected rows from Code Bagel dataset
	Training Duration: 23 hours
	Hardware: Nvidia RTX A4500
	Epochs: 3

	## Training Procedure

	This model was fine-tuned to provide better instructions on code.

	The training was conducted using PEFT and SFTTrainer on the Code Bagel dataset.
	Training was completed in 3 epochs over a span of 23 hours on an Nvidia A4500 GPU.

	## Intended Use

	This model is designed to improve instruction-following capabilities, particularly for code-related tasks.

	## Getting Started

	## Instruct Template
	```bash
	<\|system\|>
	{system_message} <\|end\|>
	<\|user\|>
	{Prompt) <\|end\|>
	<\|assistant\|>
	```

	### Transfromers

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

	model_name_or_path = "thesven/Phi-nut-Butter-Codebagel-v1"

	# BitsAndBytesConfig for loading the model in 4-bit precision
	bnb_config = BitsAndBytesConfig(
	load_in_4bit=True,
	bnb_4bit_quant_type="nf4",
	bnb_4bit_compute_dtype="float16",
	)

	tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, use_fast=True)
	model = AutoModelForCausalLM.from_pretrained(
	model_name_or_path,
	device_map="auto",
	trust_remote_code=False,
	revision="main",
	quantization_config=bnb_config
	)
	model.pad_token = model.config.eos_token_id

	prompt_template = '''
	<\|system\|>
	You are an expert developer. Please help me with any coding questions.<\|end\|>
	<\|user\|>
	Create a function to get the total sum from an array of ints.<\|end\|>
	<\|assistant\|>
	'''

	input_ids = tokenizer(prompt_template, return_tensors='pt').input_ids.cuda()
	output = model.generate(inputs=input_ids, temperature=0.1, do_sample=True, top_p=0.95, top_k=40, max_new_tokens=256)

	generated_text = tokenizer.decode(output[0, len(input_ids[0]):], skip_special_tokens=True)
	print(generated_text)
	```