---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
datasets:
- appvoid/no-prompt-50k
---

# palmer

### a better base model

This is a small improvement over zyte, a tinyllama-based model that is now prompt-free.

### evaluation 🧪

note that this is a zero-shot setting, as opposed to the open llm leaderboard's few-shot evals

| model                  | ARC-C  | OBQA   | HellaSwag | PIQA   | Winogrande | Average |
|------------------------|--------|--------|-----------|--------|------------|---------|
| tinyllama              | 0.3029 | 0.3600 | 0.5935    | 0.7329 | 0.5959     | 0.5170  |
| palmer-002             | 0.3242 | 0.3700 | 0.5956    | 0.7345 | 0.5888     | 0.5226  |
| palmer-002-2401 (this) | 0.3294 | 0.3700 | 0.5950    | 0.7399 | 0.5896     | 0.5247  |
| babbage-002            | 0.3285 | 0.3620 | 0.6380    | 0.7606 | 0.6085     | 0.5395  |
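
these zero-shot scores are likelihood-based multiple-choice accuracies. as a rough illustration of the general idea (not the exact harness used here), the sketch below scores each answer choice by its log-likelihood and picks the best one; the repo id and the example question are assumptions:

```python
# minimal sketch of zero-shot multiple-choice scoring; repo id is assumed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "appvoid/palmer-002-2401"  # assumed repo id, not stated in this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

def choice_logprob(context: str, choice: str) -> float:
    """log-probability of `choice` given `context`, with no few-shot examples.
    (real harnesses handle tokenization boundaries more carefully.)"""
    ctx_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + choice, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # logits[0, i] predicts token i+1; sum log-probs over the choice tokens only
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    return sum(
        logprobs[pos - 1, full_ids[0, pos]].item()
        for pos in range(ctx_len, full_ids.shape[1])
    )

question = "Q: Which gas do plants absorb from the atmosphere?\nA:"
choices = [" carbon dioxide", " oxygen", " nitrogen", " helium"]
print(max(choices, key=lambda c: choice_logprob(question, c)))
```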

### training 🦾

Training took ~1 A100 GPU hour. The model was trained on 50,000 shuffled gpt-4 samples. palmer was fine-tuned with a lower learning rate, ensuring it keeps as much general knowledge as possible.
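
a minimal sketch of this kind of low-learning-rate fine-tune is shown below; the base checkpoint, the dataset's field name, and all hyperparameters are assumptions, not the actual recipe:

```python
# sketch of a low-LR fine-tune on the card's dataset; details are assumptions
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T"  # assumed base
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # llama tokenizers ship without one
model = AutoModelForCausalLM.from_pretrained(base)

# ~50k shuffled samples, as described above; the "text" field is an assumption
data = load_dataset("appvoid/no-prompt-50k", split="train").shuffle(seed=42)
data = data.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=1024),
    batched=True,
    remove_columns=data.column_names,
)

args = TrainingArguments(
    output_dir="palmer-ft",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=1e-5,  # deliberately low, to preserve general knowledge
    bf16=True,
)
Trainer(
    model=model,
    args=args,
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```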

### prompt 📝

```
no prompt 🚀
```
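
since there is no prompt format, plain text completion works out of the box; a minimal usage sketch (the repo id is an assumption):

```python
# prompt-free generation; repo id is assumed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "appvoid/palmer-002-2401"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# no template, no system prompt: just continue the raw text
inputs = tokenizer("The meaning of life is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.9)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```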

<a href="https://ko-fi.com/appvoid" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 48px !important;width: 180px !important; filter: invert(70%);" ></a>