	---
	base_model: EpistemeAI/Athena-codegemma-2-9b
	language:
	- en
	license: apache-2.0
	tags:
	- text-generation-inference
	- transformers
	- unsloth
	- gemma2
	- trl
	pipeline_tag: text-generation
	---

	# How to use
	This repository contains Athena-codegemma-2-9b-v1, for use with Hugging Face Transformers and Unsloth.

	## Use with transformers
	With transformers >= 4.43.0, you can run inference using the Transformers pipeline abstraction or by using the Auto classes with the `generate()` function.

	Update your transformers installation with `pip install --upgrade transformers`.
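The version requirement and the pipeline route mentioned above could be sketched roughly as follows. This is a minimal sketch, not the author's method: the helper names (`parse_release`, `meets_minimum`, `run_pipeline`), the `bfloat16` dtype, `device_map="auto"`, and the `max_new_tokens` default are all assumptions chosen for illustration. The version comparison is deliberately simplified; `packaging.version` handles pre-releases more robustly.

```python
MIN_TRANSFORMERS = "4.43.0"  # minimum version stated in this README
MODEL_ID = "EpistemeAI/Athena-codegemma-2-9b-v1"


def parse_release(version_string: str) -> tuple:
    """Turn '4.44.2' or '4.45.0.dev0' into a tuple of its leading numeric parts."""
    parts = []
    for piece in version_string.split("."):
        if not piece.isdigit():
            break  # stop at the first non-numeric segment such as 'dev0'
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True when `installed` is at least `minimum` (simplified comparison)."""
    return parse_release(installed) >= parse_release(minimum)


def run_pipeline(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for an Alpaca-formatted prompt via the pipeline API."""
    # Heavy imports stay inside the function so the helpers above work
    # without torch or transformers being installed.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.bfloat16,  # assumed dtype, adjust to your hardware
        device_map="auto",           # requires the `accelerate` package
    )
    # return_full_text=False drops the echoed prompt from the result
    result = generator(prompt, max_new_tokens=max_new_tokens, return_full_text=False)
    return result[0]["generated_text"]
```

To check an existing installation, pass `transformers.__version__` to `meets_minimum` before calling `run_pipeline` with an Alpaca-formatted prompt.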

	## Recommended prompt format

	Prepare the prompt in Alpaca format so that the model generates properly:

	### Basic
	```python
	f"""Below is an instruction that describes a task. \
	Write a response that appropriately completes the request.

	### Instruction:
	{x['instruction']}

	### Input:
	{x['input']}

	### Response:
	"""
	```

	### Example

	```python
	def format_test(x):
	    """Build an Alpaca-style prompt from a record with 'instruction' and 'input' keys."""
	    if x['input']:
	        # Record has extra context: include the Input section
	        formatted_text = f"""Below is an instruction that describes a task. \
	Write a response that appropriately completes the request.

	### Instruction:
	{x['instruction']}

	### Input:
	{x['input']}

	### Response:
	"""
	    else:
	        # No extra context: omit the Input section
	        formatted_text = f"""Below is an instruction that describes a task. \
	Write a response that appropriately completes the request.

	### Instruction:
	{x['instruction']}

	### Response:
	"""
	    return formatted_text

	# `data` is assumed to be the already-loaded code_instructions_122k_alpaca dataset
	Prompt = format_test(data[155])
	print(Prompt)
	```
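	- Hugging Face Transformers method:
	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

	model_id = "EpistemeAI/Athena-codegemma-2-9b-v1"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	# bfloat16 is an assumed dtype; adjust to your hardware
	model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")

	# `Prompt` is the Alpaca-formatted string built in the example above
	inputs = tokenizer([Prompt], return_tensors="pt").to("cuda")

	# Stream tokens to stdout as they are generated
	text_streamer = TextStreamer(tokenizer)
	_ = model.generate(**inputs, streamer=text_streamer, max_new_tokens=512)
	```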


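	- Unsloth method:
	```python
	from unsloth import FastLanguageModel

	# Example settings (assumed values, adjust to your hardware)
	max_seq_length = 2048
	dtype = None          # None lets Unsloth choose a dtype for the GPU
	load_in_4bit = True   # 4-bit quantization to reduce memory use

	model, tokenizer = FastLanguageModel.from_pretrained(
	    model_name = "EpistemeAI/Athena-codegemma-2-9b-v1",
	    max_seq_length = max_seq_length,
	    dtype = dtype,
	    load_in_4bit = load_in_4bit,
	)
	FastLanguageModel.for_inference(model) # Enable native 2x faster inference

	# Alpaca template with positional placeholders: instruction, input, output
	alpaca_prompt = """Below is an instruction that describes a task. \
	Write a response that appropriately completes the request.

	### Instruction:
	{}

	### Input:
	{}

	### Response:
	{}"""

	inputs = tokenizer(
	    [
	        alpaca_prompt.format(
	            "Create a function to calculate the sum of a sequence of integers.", # instruction
	            "", # input
	            "", # output - leave this blank for generation!
	        )
	    ], return_tensors = "pt").to("cuda")

	outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
	print(tokenizer.batch_decode(outputs))
	```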
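The decoded output of `generate()` normally contains the prompt followed by the model's answer, so the answer usually has to be separated out. A minimal sketch of one way to do that is shown here. The function name `extract_response` is hypothetical, and the code assumes the prompt ends with the `### Response:` marker from the Alpaca template and that the end-of-sequence token is decoded as `<eos>`.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str, eos_token: str = "<eos>") -> str:
    """Return the text that follows the first '### Response:' marker."""
    # Split at the first marker; the prompt contains it exactly once.
    _, marker, tail = decoded.partition(RESPONSE_MARKER)
    if not marker:
        # Marker missing: treat the whole string as the response.
        tail = decoded
    # Cut at the end-of-sequence token if the model emitted one.
    tail = tail.split(eos_token, 1)[0]
    return tail.strip()


decoded_example = (
    "<bos>Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nAdd two numbers.\n\n"
    "### Response:\ndef add(a, b):\n    return a + b<eos>"
)
print(extract_response(decoded_example))
```

With the snippets above, this could be applied as `[extract_response(t) for t in tokenizer.batch_decode(outputs)]`.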

	---

	### Inputs and outputs

	* Input: Text string, such as a question, a prompt, or a document to be summarized.
	* Output: Generated English-language text in response to the input, such as an answer to a question or a summary of a document.

	### Citation

	```none
	@article{gemma_2024,
	title={Gemma},
	url={https://www.kaggle.com/m/3301},
	DOI={10.34740/KAGGLE/M/3301},
	publisher={Kaggle},
	author={Gemma Team},
	year={2024}
	}
	```

	# Uploaded model

	- Developed by: EpistemeAI
	- License: apache-2.0
	- Finetuned from model: EpistemeAI/Athena-codegemma-2-9b

	This gemma2 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

	[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)