Likhith003/dpo-pairrm-lora-adapter

The model is fantastic! We would like to contribute by updating the model README to include the `base_model` information, which is currently missing from the model card metadata. Thank you for your consideration!

Files changed (1)

README.md +9 -8

@@ -1,16 +1,17 @@
 ---
 license: apache-2.0
 language:
-  - en
+- en
 library_name: transformers
 tags:
-  - llama
-  - dpo
-  - preference-optimization
-  - PEFT
-  - instruction-tuning
+- llama
+- dpo
+- preference-optimization
+- PEFT
+- instruction-tuning
 pipeline_tag: text-generation
+base_model:
+- meta-llama/Llama-3.2-1B-Instruct
 ---
 # DPO Fine-Tuned Adapter - PairRM Dataset
@@ -37,4 +38,4 @@ pipeline_tag: text-generation
 - Size: 500 instructions with `prompt`, `chosen`, and `rejected` columns
 ## 📂 Output
-- Adapter saved and uploaded as `Likhith003/dpo-pairrm-lora-adapter`
+- Adapter saved and uploaded as `Likhith003/dpo-pairrm-lora-adapter`
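
For reviewers who want to confirm that the updated front matter declares a base model, here is a minimal sketch in plain Python. The helper name `read_front_matter` is hypothetical, and the parser is an assumption-laden simplification: it only understands the flat `key: value` and `key:` plus `- item` shapes used in this README, so it is not a substitute for a real YAML parser.

```python
def read_front_matter(readme_text):
    """Parse simple model-card front matter into a dict.

    Hypothetical helper: handles only `key: value` pairs and `key:` followed
    by `- item` entries, which is the shape used in this README.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter block at the top of the file
    meta = {}
    current_key = None  # most recent bare `key:` that is collecting list items
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter of the front matter
            break
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_key is not None:
                meta[current_key].append(stripped[2:].strip())
            continue
        if ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                meta[key] = value
                current_key = None
            else:
                meta[key] = []
                current_key = key
    return meta


# Abbreviated copy of the front matter after this change (tags shortened).
README = """---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- llama
- dpo
pipeline_tag: text-generation
base_model:
- meta-llama/Llama-3.2-1B-Instruct
---
# DPO Fine-Tuned Adapter - PairRM Dataset
"""

meta = read_front_matter(README)
print(meta.get("base_model"))  # ['meta-llama/Llama-3.2-1B-Instruct']
```

Running the same check against the README before this change would print `None`, since the `base_model` key was absent.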