# DSAA6000Q Backward LoRA Adapter
This repository contains a LoRA adapter produced for DSAA6000Q Assignment 3 using the self-alignment pipeline from *Self-Alignment with Instruction Backtranslation* (Li et al., 2023).
## Summary
A backward model p(instruction | response), fine-tuned on the OpenAssistant Guanaco dataset to generate candidate instructions for a given response.
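As a quick illustration of what "backward" means here, the sketch below formats a response so the model can be asked to produce the instruction that elicited it. The prompt template is an assumption for illustration only; the actual template depends on how the training notebook formatted examples.

```python
# Hypothetical backward-prompt builder: the template below is an assumption,
# not the exact format used during training.
def build_backward_prompt(response: str) -> str:
    """Format a response so the backward model p(instruction | response)
    can generate a candidate instruction for it."""
    return (
        "Below is a response. Write the instruction that it answers.\n\n"
        f"### Response:\n{response}\n\n"
        "### Instruction:\n"
    )


print(build_backward_prompt("The capital of France is Paris."))
```

The generated instruction would then be paired with the original response to create new (instruction, response) training examples, as in the instruction-backtranslation pipeline.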
## Base model
`Qwen/Qwen3-1.7B`
## Training data
- Dataset: `timdettmers/openassistant-guanaco` (split: `train`)
- Max train samples: 2000
- Max eval samples: 200
## Notes
- This is a PEFT LoRA adapter, not a fully merged checkpoint.
- The model was trained with a standalone Kaggle-compatible notebook.
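Since this is an unmerged PEFT adapter, it must be attached to the base model at load time. A minimal loading sketch is below; the adapter repo id is a placeholder assumption (replace it with this repository's actual id), and the imports are kept inside the function so the file can be read without `peft` or `transformers` installed.

```python
BASE_MODEL = "Qwen/Qwen3-1.7B"  # base model named in this card
ADAPTER_ID = "your-username/dsaa6000q-backward-lora"  # placeholder: use this repo's id


def load_backward_model(base_model: str = BASE_MODEL, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (weights not merged)."""
    # Local imports: requires `pip install transformers peft` at call time.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(base_model, device_map="auto")
    model = PeftModel.from_pretrained(model, adapter_id)
    return tokenizer, model


if __name__ == "__main__":
    tokenizer, model = load_backward_model()
    # Optionally fold the LoRA weights into the base for standalone inference:
    # model = model.merge_and_unload()
```

Calling `merge_and_unload()` produces a plain `transformers` model, which is useful if downstream tooling does not understand PEFT checkpoints.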