samhog committed on
Commit
c606b3d
·
1 Parent(s): 440df6f

Create README.md

Files changed (1)
  1. README.md +3 -0
README.md ADDED
@@ -0,0 +1,3 @@
+ ## Psychology-Alpaca-RM
+ - PEFT adapter layers for a reward model based on ``decapoda-research/llama-7b-hf``.
+ - Trained on a small subset (110 data points) of ``samhog/cgpt-pairs``, a dataset of 10K prompts, each with two answers (one 'good', one 'bad').
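
The README does not say which training objective was used. As a minimal sketch, assuming the common pairwise ranking loss for reward models, the snippet below scores how well a model separates the 'good' answer from the 'bad' one for each prompt. The function names and the example scores are hypothetical and are not taken from this repository.

```python
import math


def pairwise_reward_loss(reward_good: float, reward_bad: float) -> float:
    """Return -log(sigmoid(reward_good - reward_bad)).

    Assumption: this is the standard pairwise ranking loss for reward
    models; the README does not confirm the objective actually used.
    """
    margin = reward_good - reward_bad
    # -log(sigmoid(m)) equals log(1 + exp(-m)). Branch on the sign of the
    # margin so that exp() is only ever called on a non-positive number.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mean_pairwise_loss(pairs):
    """Average the loss over (reward_good, reward_bad) score pairs."""
    losses = [pairwise_reward_loss(good, bad) for good, bad in pairs]
    return sum(losses) / len(losses)


# Hypothetical scalar scores for three prompts: (good answer, bad answer).
example_pairs = [(1.2, -0.3), (0.4, 0.9), (2.0, 0.1)]
print(mean_pairwise_loss(example_pairs))
```

The loss shrinks as the 'good' answer is scored further above the 'bad' one, and equals log 2 when the two scores are identical.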