anirvankrishna's Collections

Parameter Space vs. Activation Space Interventions

updated 9 days ago

This collection contains the models for the second assignment of the course CS60216: Safety Fundamentals for Generative AI.


  • anirvankrishna/model_sft_lora

    Updated 18 days ago

  • anirvankrishna/model_sft_lora_fused

    Text Generation • 2B • Updated 9 days ago • 1.1k

  • anirvankrishna/model_sft_dare

    Text Generation • 2B • Updated 18 days ago • 278

  • anirvankrishna/model_harmful_lora

    Updated 15 days ago

  • anirvankrishna/model_sft_resta

    Text Generation • 2B • Updated 15 days ago • 258

  • anirvankrishna/model_delta_safe

    Text Generation • 2B • Updated 15 days ago • 203

  • anirvankrishna/model_sft_resta_dare

    Text Generation • 2B • Updated 15 days ago • 249

  • anirvankrishna/model_harmful_lora_fused

    Text Generation • 2B • Updated 15 days ago • 119