Commit History

Improve reward function to break refuse-everything local minimum and scale training
bd8220a
unverified

Claude committed on

Include messages in ConversationLog.to_dict() for report conversation examples
d831d96
unverified

Claude committed on

Implement self-improving AI oversight system with nested RL environments
e6b0e2f
unverified

Claude committed on