samhog/psychology-alpaca-rm


samhog committed on May 11, 2023

Commit c606b3d · 1 Parent(s): 440df6f

Create README.md

Files changed (1)

README.md +3 -0

README.md ADDED

	@@ -0,0 +1,3 @@

+## Psychology-Alpaca-RM
+- PEFT adapter layers for a reward model based on ``decapoda-research/llama-7b-hf``.
+- Trained on a small subset (110 data points) of ``samhog/cgpt-pairs``, a dataset of 10K prompts, each paired with two answers (one 'good', one 'bad').
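
The README does not say which objective was used to train the reward model. As a minimal sketch, assuming the common pairwise ranking loss for 'good'/'bad' answer pairs, the per-pair loss could look like the following. The function name `pairwise_loss` and the scalar reward inputs are hypothetical and do not come from this repository.

```python
import math


def pairwise_loss(reward_good: float, reward_bad: float) -> float:
    """Pairwise ranking loss: -log(sigmoid(reward_good - reward_bad)).

    Assumption: this is a generic reward-model objective, not necessarily
    the one used to train Psychology-Alpaca-RM.
    """
    margin = reward_good - reward_bad
    # -log(sigmoid(m)) equals log(1 + exp(-m)); branch on the sign of the
    # margin so that exp() is never called on a large positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # The loss is small when the 'good' answer scores higher than the 'bad' one.
    print(pairwise_loss(2.0, -1.0))
    # The loss is large when the ranking is reversed.
    print(pairwise_loss(-1.0, 2.0))
```

Under this assumed objective, minimizing the loss pushes the reward of the 'good' answer above that of the 'bad' answer for each prompt.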