---
library_name: transformers
tags:
- code
license: mit
datasets:
- iamtarun/python_code_instructions_18k_alpaca
pipeline_tag: text-generation
language:
- en
---
# PyCodeGen 350M

This model is a fine-tuned version of Salesforce's [codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono), trained on a Python code instruction [dataset](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca) using the QLoRA method.
## Pretrained model description

[codegen-350M-mono](https://huggingface.co/Salesforce/codegen-350M-mono)

CodeGen-350M-mono belongs to a family of autoregressive models for program synthesis developed by Salesforce. It was first trained on The Pile, an 825.18 GiB English text corpus, then adapted to code generation by training on the BigQuery dataset of GitHub source code, and finally specialized to Python by training on the BigPython dataset.
## Training Data

[python_code_instructions_18k_alpaca](https://huggingface.co/datasets/iamtarun/python_code_instructions_18k_alpaca)

The dataset contains problem descriptions paired with Python code. It is derived from sahil2801/code_instructions_120k, with an added Alpaca-style prompt column.
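A quick way to inspect the data is through the `datasets` library; a minimal sketch (the `train` split and the `prompt` column name follow the dataset card):

```py
from datasets import load_dataset

# Load the instruction dataset from the Hugging Face Hub.
dataset = load_dataset("iamtarun/python_code_instructions_18k_alpaca", split="train")

# Each record pairs a problem description with Python code;
# the "prompt" column holds the Alpaca-style formatted text.
print(dataset[0]["prompt"])
```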
## Intended uses

The model can be used to generate Python code that solves a described task, optionally with given input data.
## Example of usage

```py
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('chincyk/PyCodeGen')
tokenizer = AutoTokenizer.from_pretrained('chincyk/PyCodeGen')

instruction = "Write a python class that represents a calculator, then use it to add two numbers."
task_input = "a = 5, b = 2"

# Alpaca-style prompt matching the format used during fine-tuning.
prompt = f"""
### Instruction:
Use the Task below and the Input given to write the Response, which is a programming code that can solve the Task.

### Task:
{instruction}

### Input:
{task_input}

### Response:
"""

input_ids = tokenizer(prompt, truncation=True, return_tensors="pt")['input_ids']
output = model.generate(input_ids=input_ids, max_length=200)

print(tokenizer.decode(output[0], skip_special_tokens=True))
```
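Because decoder-only models return the prompt together with the completion, you may want to keep only the text that follows the `### Response:` header. A small post-processing sketch continuing from the example above (it assumes the header occurs exactly once in the decoded output):

```py
decoded = tokenizer.decode(output[0], skip_special_tokens=True)

# Keep only the generated code after the Response header
# (assumes the marker appears exactly once in the output).
generated_code = decoded.split("### Response:")[-1].strip()
print(generated_code)
```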
## Training parameters

BitsAndBytes:
- load_in_4bit: True
- bnb_4bit_quant_type: nf4
- bnb_4bit_use_double_quant: True
- bnb_4bit_compute_dtype: torch.bfloat16

LoraConfig:
- r: 32
- lora_alpha: 16
- target_modules: all-linear
- lora_dropout: 0.1
- bias: none
- task_type: CAUSAL_LM

Finetuning:
- num_epochs: 15
- train_batch_size: 4
- eval_batch_size: 8
- gradient_accumulation_steps: 8
- learning_rate: 3e-4
- weight_decay: 0.01
- lr_scheduler_name: cosine
- num_warmup_steps: 190
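
For reference, a minimal sketch of how the quantization and LoRA settings above map onto `BitsAndBytesConfig` and `LoraConfig` objects (the original training script is not published; this only mirrors the listed parameters):

```py
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters applied to all linear layers of the quantized base model.
lora_config = LoraConfig(
    r=32,
    lora_alpha=16,
    target_modules="all-linear",
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)
```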