BEncoderRT/Pythia-QLoRA-Instruction-Tuning

BEncoderRT committed on Jan 8

Commit 4bc9f2a · verified · 1 Parent(s): 4a1f78e

Update README.md

Files changed (1)

README.md +8 -0

@@ -12,8 +12,16 @@ tags:
 - Instruction-Tuning
 - peft
 ---
+“Predict the next token”
+not
+“Obey the instruction”
 # QLoRA Instruction Tuning on Pythia-1B
 This repository provides a **Hugging Face–compatible LoRA adapter** trained via **QLoRA (4-bit quantization + LoRA adapters)** on the **EleutherAI Pythia-1B-deduped** base model.
 The project focuses on **producing and publishing a reusable LoRA adapter** using a modern, memory-efficient instruction-tuning pipeline built with Hugging Face Transformers, PEFT, and BitsAndBytes. It is designed for **learning, experimentation, and small-GPU environments (e.g. Colab)**.
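
The README calls the pipeline memory-efficient. The rough arithmetic below sketches where the savings from 4-bit quantization and LoRA could come from. Every number in it is an assumption for illustration, not a value taken from this repository: the hidden size, layer count, LoRA rank, and the choice of a single fused QKV projection per layer as the adapted module are all hypothetical, and "1B" is treated as a round parameter count.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def weight_bytes(n_params: int, bits_per_param: int) -> int:
    """Raw weight storage only; ignores quantization constants and activations."""
    return n_params * bits_per_param // 8


# Assumed values for illustration; the README does not state any of them.
HIDDEN = 2048                 # assumed hidden size
N_LAYERS = 16                 # assumed number of transformer layers
RANK = 8                      # hypothetical LoRA rank
BASE_PARAMS = 1_000_000_000   # "1B" taken as a round number

# Assume one adapted projection per layer mapping HIDDEN -> 3 * HIDDEN (fused QKV).
per_layer = lora_trainable_params(HIDDEN, 3 * HIDDEN, RANK)
adapter_total = per_layer * N_LAYERS

fp16_bytes = weight_bytes(BASE_PARAMS, 16)
int4_bytes = weight_bytes(BASE_PARAMS, 4)

print(f"trainable adapter params: {adapter_total:,}")
print(f"base weights at 16-bit:   {fp16_bytes / 1e9:.1f} GB")
print(f"base weights at 4-bit:    {int4_bytes / 1e9:.1f} GB")
```

Under these assumptions the adapter holds about one million trainable parameters, while the frozen base weights take roughly a quarter of their 16-bit footprint. Real usage will be higher once quantization metadata, activations, and optimizer state are counted.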