DSAA6000Q HW3

This repository contains my Assignment 3 submission for self-alignment with instruction backtranslation on Qwen3-1.7B. The models uploaded here are LoRA adapters trained on top of the base model, so the files are much smaller than full model checkpoints.

The backward_model adapter is trained for response-to-instruction generation using the OpenAssistant-Guanaco training data. It is used in the augmentation stage to generate new instructions from LIMA responses.

The final_model adapter is trained for instruction-following on the curated high-quality dataset produced after self-curation. For inference, load the base model Qwen/Qwen3-1.7B first and then apply the adapter in this repository.

This repository stores only adapter artifacts and tokenizer-related files needed to reproduce the homework pipeline.

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support