---
base_model: gpt2
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:gpt2
- lora
- transformers
license: apache-2.0
datasets:
- Abdurrahmanesc/textgen-synthetic
language:
- en
metrics:
- rouge
- perplexity
- bleu
- bertscore
---
# Model Card for GPT-2 LoRA Adapter
This repository contains a LoRA-fine-tuned version of a base language model trained on a custom dataset focused on improving response coherence, text quality, and task-specific alignment.
The fine-tuning process was optimized for low-resource environments (CPU/TPU-friendly) while maintaining efficient training and strong post-training evaluation.
This project is part of a broader effort to build an open-source AI fine-tuning tool offering full customization, dataset controls, and multi-platform support.
### Model Description
| Property | Details |
| ---------------------- | ------------------------------------------------- |
| **Base Model**         | gpt2                                              |
| **Fine-Tuning Method** | LoRA / QLoRA |
| **Dataset** | Custom curated dataset (JSONL) |
| **Task Type** | Instruction following / text generation |
| **Intended Use** | Experimentation, research, downstream fine-tuning |
## Goals of This Fine-Tuning
- Improve language generation quality
- Reduce perplexity
- Enhance alignment on user-style tasks
- Maintain generalization while improving dataset-specific behavior
- Validate the training pipeline for the upcoming Open-Source Fine-Tuning Suite
### Evaluation Metrics (Before vs After)
```text
=== TRAIN METRICS (BEFORE vs AFTER) ===
ROUGE-L:
Before : 0.2726
After : 0.2726
Change : +0.0000
BLEU:
Before : 19.9785
After : 19.9744
Change : -0.0041
Perplexity:
Before : 23.67
After : 3.02
Change : -20.65 (major improvement)
(Additional metrics are available in the training logs)
```
## Summary
- ROUGE-L → stable
- BLEU → no significant change
- Perplexity → massive improvement, indicating better fluency and internal consistency

Other metrics followed similar minor/no-change trends, indicating:

- Minimal overfitting
- Stable behavior
- Improved confidence in generation
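Perplexity, the metric with the largest change above, is simply the exponential of the mean per-token negative log-likelihood (cross-entropy), so lower values mean the model assigns higher probability to the evaluation text. A minimal illustration, using hypothetical per-token losses:

```python
import math

def perplexity(mean_nll: float) -> float:
    """Perplexity = exp(mean negative log-likelihood per token);
    lower values mean the model is less 'surprised' by the text."""
    return math.exp(mean_nll)

# Hypothetical per-token losses for a short evaluation batch
# (illustrative values only, not taken from this model's logs).
token_nlls = [3.1, 3.3, 2.9, 3.4]
ppl = perplexity(sum(token_nlls) / len(token_nlls))
print(round(ppl, 1))
```

This is why the drop from 23.67 to 3.02 is significant even though surface-overlap metrics like ROUGE-L and BLEU barely moved: the model's token-level probability estimates improved substantially.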
### Visualization
The repository includes:
- Before/after metric graphs
- Automatic metric logs
- Training configuration dumps
These help track performance over time and compare fine-tuning strategies.
### Train Configuration
- LoRA Rank: r = (fill)
- LoRA Alpha: (fill)
- Target Modules: (fill)
- Batch Size: (fill)
- Gradient Accumulation: (fill)
- Max Seq Length: (fill)
- Optimizer: (fill)
- Learning Rate: (fill)
- Eval Strategy: Before/After automated benchmark
### Repository Structure
```
├── adapter_model.bin
├── adapter_config.json
├── training_args.json
├── eval_before.json
├── eval_after.json
├── plots/
│   ├── before_after_graph.png
│   └── (others)
└── README.md
```
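The `eval_before.json` / `eval_after.json` pair supports the before/after comparison shown earlier. A sketch of that comparison is below; the flat JSON layout is an assumption about the file format, while the numbers are the ones reported in the metrics section above.

```python
# Sketch of the before/after comparison backed by eval_before.json and
# eval_after.json. The flat key layout is assumed for illustration; the
# metric values are the ones reported in this card.
import json

before = {"rougeL": 0.2726, "bleu": 19.9785, "perplexity": 23.67}
after = {"rougeL": 0.2726, "bleu": 19.9744, "perplexity": 3.02}

with open("eval_before.json", "w") as f:
    json.dump(before, f)
with open("eval_after.json", "w") as f:
    json.dump(after, f)

# Report the signed change for each metric.
for name in before:
    print(f"{name}: {after[name] - before[name]:+.4f}")
```

The same loop generalizes to any metrics the automated benchmark logs, which is what the `plots/` graphs are built from.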
## Limitations
- Not suitable for safety-critical applications
- The fine-tuning dataset may shape generation style
- Further RLHF or SFT may be required for production-level behavior
### Acknowledgements
Thanks to Hugging Face Transformers, PEFT, and the open-source community for enabling lightweight fine-tuning in low-compute environments.
### Framework versions
- PEFT 0.18.0