metadata
base_model: Qwen/Qwen3-1.7B
library_name: peft
pipeline_tag: text-generation
tags:
  - lora
  - sft
  - dsaa6000q
  - instruction-backtranslation

DSAA6000Q Backward LoRA Adapter

This repository contains a LoRA adapter produced for DSAA6000Q Assignment 3, following the self-alignment pipeline described in the paper "Self-Alignment with Instruction Backtranslation".

Summary

Backward model p(instruction | response): given a response, it generates an instruction that the response could plausibly answer. Trained on OpenAssistant Guanaco.

Base model

  • Qwen/Qwen3-1.7B

Training data

  • Dataset: timdettmers/openassistant-guanaco (train split)
  • Max train samples: 2000
  • Max eval samples: 200
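
The card does not document how the Guanaco conversations were turned into backward training pairs. The sketch below shows one plausible way to do it, under these assumptions: each row has a `text` field using `### Human:` and `### Assistant:` markers, only the first exchange of each conversation is kept, and the response becomes the prompt while the instruction becomes the target. The helper names `first_turn`, `to_backward_example`, and `load_backward_split` are hypothetical and are not taken from the original notebook.

```python
HUMAN = "### Human:"
ASSISTANT = "### Assistant:"


def first_turn(text):
    """Return (instruction, response) for the first exchange, or None if absent."""
    if HUMAN not in text:
        return None
    after_human = text.split(HUMAN, 1)[1]
    if ASSISTANT not in after_human:
        return None
    instruction, rest = after_human.split(ASSISTANT, 1)
    # Drop any later turns so only the first answer is kept.
    response = rest.split(HUMAN, 1)[0]
    instruction, response = instruction.strip(), response.strip()
    if not instruction or not response:
        return None
    return instruction, response


def to_backward_example(text):
    """Swap roles: the response is the input, the instruction is the target."""
    pair = first_turn(text)
    if pair is None:
        return None
    instruction, response = pair
    return {"prompt": response, "completion": instruction}


def load_backward_split(max_samples=2000):
    """Build up to max_samples backward pairs from the train split.

    Assumption: the cap mirrors the "Max train samples" value above. How the
    evaluation samples were drawn is not stated, so it is not reproduced here.
    """
    from datasets import load_dataset  # imported lazily; requires `datasets`

    dataset = load_dataset("timdettmers/openassistant-guanaco", split="train")
    examples = []
    for row in dataset:
        example = to_backward_example(row["text"])
        if example is not None:
            examples.append(example)
        if len(examples) >= max_samples:
            break
    return examples
```

Calling `load_backward_split(2000)` downloads the dataset and returns a list of `{"prompt", "completion"}` dictionaries that a supervised fine-tuning loop could consume.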

Notes

  • This is a PEFT LoRA adapter, not a fully merged checkpoint.
  • The model was trained with a standalone Kaggle-compatible notebook.
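
Because the adapter is not merged, it has to be loaded on top of the base model. The following is a minimal sketch using `peft` and `transformers`, not the author's own inference code. `ADAPTER_ID` is a hypothetical placeholder for this repository's Hub ID, and the prompt template in `build_backward_prompt` is an assumption, since the template used during training is not documented here.

```python
BASE_MODEL_ID = "Qwen/Qwen3-1.7B"
ADAPTER_ID = "your-username/your-adapter-repo"  # hypothetical placeholder


def build_backward_prompt(response):
    """Format a response as a backward-model prompt (assumed template)."""
    return f"Response:\n{response}\n\nInstruction:\n"


def load_backward_model(adapter_id=ADAPTER_ID, merge=False):
    """Load the base model and attach the LoRA adapter."""
    # Imported lazily; requires `transformers` and `peft`.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForCausalLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    if merge:
        # Optionally fold the LoRA weights into the base weights.
        model = model.merge_and_unload()
    model.eval()
    return tokenizer, model


def generate_instruction(tokenizer, model, response, max_new_tokens=64):
    """Generate a candidate instruction for the given response."""
    device = next(model.parameters()).device
    inputs = tokenizer(build_backward_prompt(response), return_tensors="pt").to(device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

A typical call would be `tokenizer, model = load_backward_model("<this repo's ID>")` followed by `generate_instruction(tokenizer, model, some_response)`. Passing `merge=True` trades the ability to detach the adapter for a plain model without the LoRA wrapper.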