metadata
license: mit
language:
  - en
base_model:
  - google/gemma-3-4b-it

con_learn highrl

Continued learning with lr: 8e-5 and r: 64, max epochs: 1.

con learn r16

Continued learning with lr: 2e-5 and r: 16, max epochs: 1.

con learn r64

Continued learning with lr: 2e-5 and r: 64, max epochs: 1.

finetune_firefly

Fine-tuned on firefly with lr: 1e-4 and r: 16, max epochs: 5.
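
For quick comparison, the four variants above can be collected into one table-like structure. This is only a sketch: the `RunConfig` class, its field names, and the `find_run` helper are hypothetical and not part of this repository. It also assumes that `rl` in the card means learning rate and that `r` is an adapter rank (for example a LoRA rank), which the card does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for one variant listed in this card."""
    name: str             # variant name as written in the card
    learning_rate: float  # the card's "rl" value, assumed to be a learning rate
    rank: int             # the card's "r" value, assumed to be an adapter (LoRA) rank
    max_epochs: int       # the card's "max epoch" value


# Values copied from the card; nothing else about the runs is known here.
RUNS = [
    RunConfig("con_learn highrl", 8e-5, 64, 1),
    RunConfig("con learn r16", 2e-5, 16, 1),
    RunConfig("con learn r64", 2e-5, 64, 1),
    RunConfig("finetune_firefly", 1e-4, 16, 5),
]


def find_run(name: str) -> RunConfig:
    """Return the variant with the given name, or raise KeyError."""
    for run in RUNS:
        if run.name == name:
            return run
    raise KeyError(name)


if __name__ == "__main__":
    for run in RUNS:
        print(f"{run.name}: lr={run.learning_rate}, r={run.rank}, "
              f"max_epochs={run.max_epochs}")
```

Keeping the settings in one place makes the differences easy to see: the three continued-learning variants differ only in learning rate and rank, while `finetune_firefly` uses the highest learning rate and the most epochs.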