GenVRadmin/AryaBhatta-GemmaOrca-2-Merged

	---
	license: mit

	---


	This model is part of two model series, AryaBhatta-1 and AryaBhatta-2. It is fine-tuned from HuggingFaceH4/zephyr-7b-gemma-v0.1 or Google/gemma on nine Indian languages (Hindi, Tamil, Punjabi, Bengali, Gujarati, Oriya, Telugu, Kannada, Malayalam) plus English.

	There are two models: one fine-tuned on Google's Gemma and one fine-tuned on Zephyr's Gemma base. The repo for the other model (the Google-based one) is GenVRadmin/AryaBhatta-GemmaOrca-Merged.

	To improve reasoning and maths skills, we first apply supervised fine-tuning (SFT) to Gemma on Microsoft's Orca datasets.

	We use the Hindi Orca maths dataset: GenVRadmin/Aryabhatta-Orca-Maths-Hindi \
	and the original Orca maths dataset: microsoft/orca-math-word-problems-200k

	This pushes the MATHS score from 24.3 in Gemma-7B to 25.5 in Zephyr-Gemma and 31.6 in GemmaOrca.
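To put these numbers in perspective, the short sketch below computes the absolute and relative gains from the three scores quoted above. The dictionary and the `gain_over` helper are illustrative names, not part of any released code.

```python
# MATHS scores as quoted in this model card.
maths_scores = {
    "Gemma-7B": 24.3,
    "Zephyr-Gemma": 25.5,
    "GemmaOrca": 31.6,
}


def gain_over(baseline: str, model: str) -> tuple[float, float]:
    """Return (absolute gain in points, relative gain in percent)."""
    base = maths_scores[baseline]
    new = maths_scores[model]
    absolute = new - base
    relative = 100.0 * absolute / base
    return round(absolute, 1), round(relative, 1)


for name in ("Zephyr-Gemma", "GemmaOrca"):
    points, percent = gain_over("Gemma-7B", name)
    print(f"{name}: +{points} points ({percent}% relative to Gemma-7B)")
```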

	The model is then fine-tuned on GenVR's Samvaad datasets: GenVRadmin/Samvaad-Indic-Positive, GenVRadmin/Samvaad-Tamil-Mixtral, and a subset of GenVRadmin/Samvaad-Mixed-Language-3.

	Finally, it is fine-tuned on several open-source datasets, including:

	Telugu-LLM-Labs/yahma_alpaca_cleaned_telugu_filtered_and_romanized \
	Telugu-LLM-Labs/teknium_GPTeacher_general_instruct_telugu_filtered_and_romanized \
	abhinand/tamil-alpaca \
	Tensoic/airoboros-3.2_kn \
	Tensoic/gpt-teacher_kn \
	Tensoic/Alpaca-Gujarati \
	HydraIndicLM/bengali_alpaca_dolly_67k \
	Open-Orca/OpenOrca \
	pankajmathur/alpaca_orca \
	OdiaGenAI/Odia_Alpaca_instructions_52k \
	OdiaGenAI/gpt-teacher-roleplay-odia-3k \
	GenVRadmin/Samvaad-Punjabi-Mini \
	pankajmathur/WizardLM_Orca

	The model achieves the following scores on benchmarks:

	| Model | AGIEval | GPT4All | TruthfulQA | BigBench | Average ⬇️ |
	|---|---|---|---|---|---|
	| AryaBhatta-GemmaOrca | 35.9 | 72.26 | 53.85 | 40.35 | 50.59 |
	| zephyr-7b-beta | 37.52 | 71.77 | 55.26 | 39.77 | 51.08 |
	| zephyr-7b-gemma-v0.1 | 34.22 | 66.37 | 52.19 | 37.10 | 47.47 |
	| mlabonne/Gemmalpaca-7B | 21.6 | 40.87 | 44.85 | 30.49 | 34.45 |
	| google/gemma-7b-it | 21.33 | 40.84 | 41.70 | 30.25 | 33.53 |
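The Average column appears to be the plain mean of the four benchmark scores. The sketch below recomputes it from the values listed above. The `rows` list and `mean_score` helper are illustrative, and the unweighted mean is an assumption that the reported numbers happen to match.

```python
# Rows copied from the benchmark results above:
# (model, AGIEval, GPT4All, TruthfulQA, BigBench, reported average)
rows = [
    ("AryaBhatta-GemmaOrca", 35.9, 72.26, 53.85, 40.35, 50.59),
    ("zephyr-7b-beta", 37.52, 71.77, 55.26, 39.77, 51.08),
    ("zephyr-7b-gemma-v0.1", 34.22, 66.37, 52.19, 37.10, 47.47),
    ("mlabonne/Gemmalpaca-7B", 21.6, 40.87, 44.85, 30.49, 34.45),
    ("google/gemma-7b-it", 21.33, 40.84, 41.70, 30.25, 33.53),
]


def mean_score(scores: list[float]) -> float:
    """Unweighted mean of the per-benchmark scores."""
    return sum(scores) / len(scores)


for model, *scores, reported in rows:
    computed = mean_score(scores)
    # Allow for rounding to two decimals in the reported column.
    matches = abs(computed - reported) < 0.006
    print(f"{model}: computed {computed:.2f}, reported {reported:.2f}, match={matches}")
```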





	How to use:
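	```
	import os

	import torch
	from peft import AutoPeftModelForCausalLM
	from transformers import AutoTokenizer

	# Hugging Face access token, read here from the HF_TOKEN environment variable
	# (an assumption; the original snippet left hf_token undefined).
	hf_token = os.environ.get("HF_TOKEN")

	model = AutoPeftModelForCausalLM.from_pretrained(
	    "GenVRadmin/AryaBhatta-GemmaOrca",
	    load_in_4bit=False,
	    token=hf_token,
	)
	tokenizer = AutoTokenizer.from_pretrained("GenVRadmin/AryaBhatta-GemmaOrca")

	# Put the model and the inputs on the same device to avoid a device mismatch.
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model.to(device)

	# Alpaca-style prompt template with instruction, input, and response slots.
	input_prompt = """
	### Instruction:
	{}

	### Input:
	{}

	### Response:
	{}"""

	input_text = input_prompt.format(
	    "Answer this question about India.",  # instruction
	    "Who is the Prime Minister of India",  # input
	    "",  # output - leave this blank for generation!
	)

	inputs = tokenizer([input_text], return_tensors="pt").to(device)

	outputs = model.generate(**inputs, max_new_tokens=300, use_cache=True)
	# The decoded string contains the prompt followed by the generated answer.
	response = tokenizer.batch_decode(outputs)[0]
	print(response)
	```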
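The decoded `response` contains the full prompt followed by the generated text. A minimal sketch for keeping only the answer is shown below. It assumes the prompt template above is used unchanged and that the tokenizer's end-of-sequence token is rendered as `<eos>` (the Gemma convention). `extract_response` is a hypothetical helper, not part of `transformers` or `peft`.

```python
RESPONSE_MARKER = "### Response:"


def extract_response(decoded: str, eos_token: str = "<eos>") -> str:
    """Return only the text that follows the last response marker.

    If the marker is missing, the whole decoded string is returned
    (minus the end-of-sequence token and surrounding whitespace).
    """
    _, _, answer = decoded.rpartition(RESPONSE_MARKER)
    return answer.replace(eos_token, "").strip()


# Example with a hand-written decoded string (not real model output).
decoded = (
    "<bos>\n### Instruction:\nAnswer this question.\n\n"
    "### Input:\nSome question\n\n"
    "### Response:\nSome generated answer<eos>"
)
print(extract_response(decoded))
```

Passing `skip_special_tokens=True` to `tokenizer.batch_decode` is an alternative way to drop the special tokens before extracting the answer.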