metadata
license: mit
Model shards contain fine-tuned base model, and probes.pt contains the mlp probes that are meant to attach to layers 10 and 20 in order to screen for harmful model generations.
license: mit
Model shards contain fine-tuned base model, and probes.pt contains the mlp probes that are meant to attach to layers 10 and 20 in order to screen for harmful model generations.