The first
```
---

## 🚀 Why QLoRA?

Compared to full fine-tuning:

* ✅ ~10× lower GPU memory usage
* ✅ Faster experimentation
* ✅ No catastrophic forgetting (the base weights stay frozen)
* ✅ Easy adapter reuse and sharing
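The headline memory number can be sanity-checked with back-of-envelope arithmetic. This is a rough sketch, not a measurement: it assumes fp16 mixed-precision Adam for full fine-tuning, a 4-bit frozen base plus an adapter holding ~0.5% of the parameters for QLoRA (an assumed fraction), and it ignores activation memory entirely.

```python
def full_finetune_gb(n_params: float) -> float:
    # fp16 weights (2 B) + fp16 grads (2 B) + fp32 Adam moments (4 B + 4 B)
    # + fp32 master weights (4 B) = 16 bytes per parameter
    return n_params * (2 + 2 + 4 + 4 + 4) / 1e9

def qlora_gb(n_params: float, trainable_frac: float = 0.005) -> float:
    # 4-bit frozen base (0.5 B per parameter); only the small LoRA adapter
    # carries gradients and optimizer state (trainable_frac is an assumption)
    base = n_params * 0.5
    adapter = n_params * trainable_frac * (2 + 2 + 4 + 4)
    return (base + adapter) / 1e9

full = full_finetune_gb(1e9)  # 16.0 GB for a 1B-parameter model
lite = qlora_gb(1e9)          # ~0.56 GB for weights + adapter states
```

Measured savings depend on activations, sequence length, and batch size, which is why observed ratios land nearer the ~10× figure than this weights-only estimate.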
This approach mirrors how many modern instruction-tuned LLMs are trained at scale.
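As a sketch of what that setup typically looks like with `transformers` + `peft` (the model name, rank, and `target_modules` below are illustrative placeholders, not this repository's exact configuration):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization for the frozen base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "your-base-model",  # placeholder for the 1B base model
    quantization_config=bnb_config,
)

# Small trainable LoRA adapter on top of the frozen 4-bit weights
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```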
---

## 📈 Expected Behavior When Using This Adapter

After training, the model should:

* Follow instructions more directly
* Produce more structured and task-aligned responses
* Show clear behavioral differences **with vs. without** LoRA adapters

Adapter ablation (disabling LoRA) should revert behavior close to the base model.
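With `peft`, this ablation is a context manager: `PeftModel.disable_adapter()` temporarily removes the LoRA contribution in place, so adapter-on and adapter-off generations can be compared side by side. A minimal sketch, where the model and adapter paths are placeholders:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("your-base-model")   # placeholder
tokenizer = AutoTokenizer.from_pretrained("your-base-model")
model = PeftModel.from_pretrained(base, "path/to/this-adapter")  # placeholder

prompt = "Summarize the following paragraph in one sentence: ..."
inputs = tokenizer(prompt, return_tensors="pt")

# With the LoRA adapter active (instruction-tuned behavior)
tuned_out = model.generate(**inputs, max_new_tokens=64)

# Ablation: adapter disabled -> behavior close to the base model
with model.disable_adapter():
    base_out = model.generate(**inputs, max_new_tokens=64)

print(tokenizer.decode(tuned_out[0], skip_special_tokens=True))
print(tokenizer.decode(base_out[0], skip_special_tokens=True))
```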
---

## 🔮 Possible Extensions

* Mask the loss on prompt tokens for **response-only instruction tuning**
* Train multiple LoRA adapters for different tasks
* Merge or switch adapters at inference time
* Combine with evaluation datasets
* Compare different LoRA ranks (`r=8`, `r=16`, `r=32`)
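The first extension, for example, only requires masking prompt positions in the labels: entries set to `-100` are ignored by PyTorch's cross-entropy loss (its default `ignore_index`), so gradients come from the response tokens alone. A minimal sketch with made-up token ids:

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_prompt_tokens(input_ids, prompt_len):
    """Copy input_ids into labels, masking the prompt so that only
    the response contributes to the training loss."""
    labels = list(input_ids)
    labels[:prompt_len] = [IGNORE_INDEX] * prompt_len
    return labels

# Example: 3 prompt tokens followed by a 2-token response
labels = mask_prompt_tokens([101, 2054, 2003, 1996, 3437], prompt_len=3)
print(labels)  # [-100, -100, -100, 1996, 3437]
```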
---

## 🛠️ Requirements

* Python 3.9+
* PyTorch
* transformers
* peft
* bitsandbytes
* accelerate
---

## 📜 License & Usage Notes

This repository publishes **only LoRA adapter weights** and configuration files. The base model must be obtained separately under its original license.
This adapter is intended for **research, experimentation, and non-production use** unless further evaluated.
---
This repository provides a **clean, minimal reference implementation** of QLoRA-based instruction tuning on a 1B-scale language model.