---
base_model: gpt2
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:gpt2
- lora
- transformers
license: apache-2.0
datasets:
- Abdurrahmanesc/textgen-synthetic
language:
- en
metrics:
- rouge
- perplexity
- bleu
- bertscore
---
# Model Card for GPT-2 LoRA Adapter
This repository contains a LoRA-fine-tuned version of a base language model trained on a custom dataset focused on improving response coherence, text quality, and task-specific alignment.
The fine-tuning process was optimized for low-resource environments (CPU/TPU-friendly) while maintaining efficient training and strong post-training evaluation.
This project is part of a broader effort to build an open-source AI fine-tuning tool offering full customization, dataset controls, and multi-platform support.
### Model Description
| Property | Details |
| ---------------------- | ------------------------------------------------- |
| **Base Model**         | gpt2                                              |
| **Fine-Tuning Method** | LoRA / QLoRA |
| **Dataset** | Custom curated dataset (JSONL) |
| **Task Type** | Instruction following / text generation |
| **Intended Use** | Experimentation, research, downstream fine-tuning |
## Goals of This Fine-Tuning
- Improve language generation quality
- Reduce perplexity
- Enhance alignment on user-style tasks
- Maintain generalization while improving dataset-specific behavior
- Validate the training pipeline for the upcoming Open-Source Fine-Tuning Suite
### Evaluation Metrics (Before vs After)
```text
=== TRAIN METRICS (BEFORE vs AFTER) ===
ROUGE-L:
Before : 0.2726
After : 0.2726
Change : +0.0000
BLEU:
Before : 19.9785
After : 19.9744
Change : -0.0041
Perplexity:
Before : 23.67
After : 3.02
Change : -20.65 (major improvement)
(Additional metrics are available in the training logs)
```
## Summary
- ROUGE-L → stable
- BLEU → no significant change
- Perplexity → massive improvement, indicating better fluency and internal consistency

Other metrics followed similar minor/no-change trends, indicating:

- Minimal overfitting
- Stable behavior
- Improved confidence in generation
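Perplexity, the metric with the largest change above, is simply the exponential of the mean per-token negative log-likelihood (cross-entropy), so lower values mean the model assigns higher probability to the evaluation text. A minimal illustration, using hypothetical per-token losses:

```python
import math

def perplexity(mean_nll: float) -> float:
    """Perplexity = exp(mean negative log-likelihood per token);
    lower values mean the model is less 'surprised' by the text."""
    return math.exp(mean_nll)

# Hypothetical per-token losses for a short evaluation batch
# (illustrative values only, not taken from this model's logs).
token_nlls = [3.1, 3.3, 2.9, 3.4]
ppl = perplexity(sum(token_nlls) / len(token_nlls))
print(round(ppl, 1))
```

This is why the drop from 23.67 to 3.02 is significant even though surface-overlap metrics like ROUGE-L and BLEU barely moved: the model's token-level probability estimates improved substantially.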
### Visualization
The repository includes:
- Before/after metric graphs
- Automatic metric logs
- Training configuration dumps
These help track performance over time and compare fine-tuning strategies.
### Train Configuration
- LoRA Rank: r = (fill)
- LoRA Alpha: (fill)
- Target Modules: (fill)
- Batch Size: (fill)
- Gradient Accumulation: (fill)
- Max Seq Length: (fill)
- Optimizer: (fill)
- Learning Rate: (fill)
- Eval Strategy: Before/After automated benchmark
### Repository Structure
```
├── adapter_model.bin
├── adapter_config.json
├── training_args.json
├── eval_before.json
├── eval_after.json
├── plots/
│   ├── before_after_graph.png
│   └── (others)
└── README.md
```
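The `eval_before.json` / `eval_after.json` pair supports the before/after comparison shown earlier. A sketch of that comparison is below; the flat JSON layout is an assumption about the file format, while the numbers are the ones reported in the metrics section above.

```python
# Sketch of the before/after comparison backed by eval_before.json and
# eval_after.json. The flat key layout is assumed for illustration; the
# metric values are the ones reported in this card.
import json

before = {"rougeL": 0.2726, "bleu": 19.9785, "perplexity": 23.67}
after = {"rougeL": 0.2726, "bleu": 19.9744, "perplexity": 3.02}

with open("eval_before.json", "w") as f:
    json.dump(before, f)
with open("eval_after.json", "w") as f:
    json.dump(after, f)

# Report the signed change for each metric.
for name in before:
    print(f"{name}: {after[name] - before[name]:+.4f}")
```

The same loop generalizes to any metrics the automated benchmark logs, which is what the `plots/` graphs are built from.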
## Limitations
- Not suitable for safety-critical applications
- The fine-tuning dataset may shape generation style
- Further RLHF or SFT may be required for production-level behavior
### Acknowledgements
Thanks to Hugging Face Transformers, PEFT, and the open-source community for enabling lightweight fine-tuning in low-compute environments.
### Framework versions
- PEFT 0.18.0