lora-output

This model is a fine-tuned version of distilgpt2 on a small demonstration dataset (see "Training and evaluation data" below).

Model description

This model is a parameter-efficient fine-tuned version of distilgpt2, trained using LoRA (Low-Rank Adaptation) on a small demonstration dataset in a Google Colab Free Tier GPU environment. The goal is to provide a lightweight, fast, reproducible, and beginner-friendly template for fine-tuning nano-scale language models.

The base model (distilgpt2) is a distilled version of GPT-2, making it significantly smaller and more efficient while retaining good generative capability. LoRA makes training accessible on limited hardware by training only a small set of additional low-rank parameters.
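
As a rough illustration, attaching such an adapter with PEFT looks like the sketch below; the rank, alpha, dropout, and target modules shown are typical GPT-2 choices and an assumption here, not this model's exact configuration.

```python
# Illustrative LoRA setup for distilgpt2 with peft; r, lora_alpha,
# lora_dropout, and target_modules are assumptions (typical GPT-2 values),
# not the exact configuration used for this model.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("distilgpt2")
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=8,                        # rank of the low-rank update matrices
    lora_alpha=16,              # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused QKV attention projection
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```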

Intended uses & limitations

This model is intended for:

Educational demonstration of nano-LLM fine-tuning

Research on lightweight parameter-efficient training

Small-scale text generation tasks

Custom FAQ or conversational agents

Prototyping ML workflows in Google Colab Free Tier

Not intended for:

High-risk decision-making

Medical, legal, financial, or political applications

Producing factual or authoritative information

Any use that requires accuracy beyond small toy datasets

Training and evaluation data

Hardware

Google Colab Free Tier

NVIDIA T4 GPU (or similar)

12–15GB RAM

Max runtime: <3 hours (safe for free tier limits)

Training Framework

The model was trained using:

Hugging Face Transformers (model / trainer)

Hugging Face Datasets (data loading)

PEFT (LoRA) for parameter-efficient fine-tuning

Accelerate (device handling)
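
To reproduce the environment, you can check the pins listed under "Framework versions" at the end of this card up front; whether these exact pins are required is an assumption.

```python
# Quick environment check against the pins in "Framework versions";
# install with e.g.:
#   pip install transformers==4.57.1 peft==0.17.1 datasets==4.0.0 accelerate
import transformers, peft, datasets

print(transformers.__version__)  # expected: 4.57.1
print(peft.__version__)          # expected: 0.17.1
print(datasets.__version__)      # expected: 4.0.0
```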

Training Objective

Causal Language Modeling (next-token prediction), using the standard GPT-2 loss.
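
In practice this means the labels are the input ids themselves: with `DataCollatorForLanguageModeling(mlm=False)` the Trainer copies inputs into labels, and GPT-2 shifts them by one position internally to compute the next-token loss. A minimal sketch:

```python
# Causal-LM collation: with mlm=False the collator copies input_ids into
# labels (masking padding with -100), and the model shifts them internally
# for next-token prediction.
from transformers import AutoTokenizer, DataCollatorForLanguageModeling

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 defines no pad token
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
```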

Hyperparameters

Epochs: 3

Batch size: 2 (gradient accumulation ×8, effective batch size 16)

Learning rate: 2e-4

Max sequence length: 512 tokens

Precision: fp32 (for Colab stability)

Optimizer: AdamW
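
Expressed as `TrainingArguments`, these settings might look like the sketch below; `output_dir` is an assumption, and the fused-AdamW variant and seed are taken from the auto-generated hyperparameter list at the end of this card.

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# output_dir is an assumption; optim and seed come from the
# "Training hyperparameters" list at the end of this card.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="lora-output",
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # effective batch size: 2 x 8 = 16
    learning_rate=2e-4,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",
    seed=42,
    fp16=False,                      # fp32 for Colab stability
    eval_strategy="no",              # no validation set (see Evaluation)
)
```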

Dataset

A small demonstration dataset was created in JSONL format for testing purposes. Each example used a simple prompt → answer conversational style. This dataset is only illustrative and should be replaced for real applications.

Example format:

```
Q: <question>
A: <answer>
```
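
For illustration, such a record could be loaded with the datasets library as follows; the file name `demo.jsonl` and the `text` field are assumptions about the demo schema, not its actual layout.

```python
# Hypothetical loading sketch: "demo.jsonl" and the "text" field are
# assumptions about the demo dataset, not its actual schema.
from datasets import load_dataset

# One line of demo.jsonl might look like:
# {"text": "Q: Give a friendly greeting.\nA: Hello! How can I help you today?"}
dataset = load_dataset("json", data_files="demo.jsonl", split="train")
print(dataset[0]["text"])
```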

Data Size

Very small (<10 samples in demo)

Not suitable for production

Only for demonstrating the workflow from data → fine-tuned model

Evaluation

No separate validation set was used due to the tiny dataset. Evaluation strategy was set to "no" to reduce compute cost.

This model should not be evaluated as a general-purpose language model — it is a workflow demonstration.

Limitations

Limited training data → high risk of overfitting

Not instruction-tuned or alignment-tuned

Base model (distilgpt2) has known limitations inherited from GPT-2, including outdated knowledge

Demo dataset restricts conversational breadth

Not suitable for factual tasks

Potential Risks

May generate inaccurate or unsafe text, regardless of how it is prompted

May hallucinate or invent answers

Should not be used for impactful real-world decisions

Demo dataset may introduce unintended biases

Always supervise outputs when using in interactive environments.


How to Use

Load with LoRA adapter:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

tokenizer = AutoTokenizer.from_pretrained("your-username/your-model")
base = AutoModelForCausalLM.from_pretrained("distilgpt2")
model = PeftModel.from_pretrained(base, "your-username/your-model")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer)
generator("Q: Give a friendly greeting.\nA:", max_length=120)
```

Or use the merged full model (if uploaded):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline

model = AutoModelForCausalLM.from_pretrained("your-username/your-model-full")
tokenizer = AutoTokenizer.from_pretrained("your-username/your-model-full")

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe("Hello, I am your assistant!", max_length=150)
```

Reproducibility

This model was built following the official Hugging Face training workflows and Colab notebook best practices. More details can be found in the Hugging Face “Finetuning GPT-2” & “PEFT/LoRA” examples:

Transformers notebooks and tutorials

Trainer API documentation

PEFT (LoRA) docs and examples

Citation

If you use this model or training template, please cite the original libraries:

```bibtex
@misc{huggingface2023transformers,
  title     = {Transformers: State-of-the-art Natural Language Processing},
  author    = {The HuggingFace Team},
  year      = {2023},
  publisher = {HuggingFace}
}
```

```bibtex
@misc{hu2021lora,
  title  = {LoRA: Low-Rank Adaptation of Large Language Models},
  author = {Hu, Edward and others},
  year   = {2021}
}
```

Model Creator

This model was prepared and fine-tuned by Abdur Rahman in a Google Colab environment with step-by-step guidance provided by ChatGPT.

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3

Framework versions

  • PEFT 0.17.1
  • Transformers 4.57.1
  • PyTorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.1