	---
	library_name: peft
	license: mit
	base_model: microsoft/Phi-4-mini-instruct
	tags:
	- base_model:adapter:microsoft/Phi-4-mini-instruct
	- lora
	- transformers
	pipeline_tag: text-generation
	model-index:
	- name: phi4_african_history_lora_ds2
	  results: []
	datasets:
	- DannyAI/African-History-QA-Dataset
	language:
	- en
	metrics:
	- bertscore
	---

	# Model Card for phi4_african_history_lora_ds2

	This is a LoRA adapter fine-tuned from microsoft/Phi-4-mini-instruct for African history question answering, using the DannyAI/African-History-QA-Dataset dataset.
	It reaches a validation loss of 1.5099 at the final evaluation step (step 2600).

	## Model Details

	### Model Description

	- Developed by: Daniel Ihenacho
	- Funded by: Daniel Ihenacho
	- Shared by: Daniel Ihenacho
	- Model type: Text Generation
	- Language(s) (NLP): English
	- License: MIT
	- Finetuned from model: microsoft/Phi-4-mini-instruct

	## Uses

	This model is intended for answering questions about African history.

	### Out-of-Scope Use

	The model can respond to prompts outside African history, but such use is out of scope and not recommended.

	## How to Get Started with the Model
	```python
	import torch
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

	model_id = "microsoft/Phi-4-mini-instruct"

	tokeniser = AutoTokenizer.from_pretrained(model_id)

	# Load the base model
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    device_map="auto",
	    torch_dtype=torch.bfloat16,
	    trust_remote_code=False,
	)

	# Load the fine-tuned LoRA adapter on top of the base model
	lora_id = "DannyAI/phi4_african_history_lora_ds2"
	lora_model = PeftModel.from_pretrained(model, lora_id)

	generator = pipeline(
	    "text-generation",
	    model=lora_model,
	    tokenizer=tokeniser,
	)


	def generate_answer(question: str) -> str:
	    """Generate an answer to the given question using the fine-tuned LoRA model."""
	    messages = [
	        {"role": "system", "content": "You are a helpful AI assistant specialised in African history which gives concise answers to questions asked."},
	        {"role": "user", "content": question},
	    ]

	    # Greedy decoding: temperature has no effect when do_sample=False, so it is omitted
	    output = generator(
	        messages,
	        max_new_tokens=2048,
	        do_sample=False,
	        return_full_text=False,
	    )
	    return output[0]["generated_text"].strip()


	question = "What is the significance of African feminist scholarly activism in contemporary resistance movements?"
	print(generate_answer(question))
	```
	```
	# Example output
	African feminist scholarly activism is significant in contemporary resistance movements as it provides a critical framework for understanding and addressing the specific challenges faced by African women in the context of global capitalism, neocolonialism, and patriarchal structures.
	```

	## Training Details

	### Training Data

	The adapter was trained on the DannyAI/African-History-QA-Dataset dataset.

	### Training results

	| Training Loss | Epoch  | Step | Validation Loss |
	|:-------------:|:------:|:----:|:---------------:|
	| 1.6515 | 0.3784 | 100 | 1.6736 |
	| 1.5844 | 0.7569 | 200 | 1.6175 |
	| 1.6068 | 1.1325 | 300 | 1.5855 |
	| 1.6075 | 1.5109 | 400 | 1.5679 |
	| 1.5188 | 1.8893 | 500 | 1.5525 |
	| 1.4248 | 2.2649 | 600 | 1.5423 |
	| 1.5465 | 2.6433 | 700 | 1.5363 |
	| 1.4540 | 3.0189 | 800 | 1.5331 |
	| 1.5759 | 3.3974 | 900 | 1.5275 |
	| 1.4626 | 3.7758 | 1000 | 1.5268 |
	| 1.4861 | 4.1514 | 1100 | 1.5230 |
	| 1.4863 | 4.5298 | 1200 | 1.5232 |
	| 1.4312 | 4.9082 | 1300 | 1.5185 |
	| 1.5311 | 5.2838 | 1400 | 1.5193 |
	| 1.5135 | 5.6623 | 1500 | 1.5179 |
	| 1.4092 | 6.0378 | 1600 | 1.5144 |
	| 1.5621 | 6.4163 | 1700 | 1.5145 |
	| 1.4850 | 6.7947 | 1800 | 1.5147 |
	| 1.4301 | 7.1703 | 1900 | 1.5109 |
	| 1.5346 | 7.5487 | 2000 | 1.5156 |
	| 1.4597 | 7.9272 | 2100 | 1.5124 |
	| 1.4548 | 8.3027 | 2200 | 1.5118 |
	| 1.4485 | 8.6812 | 2300 | 1.5108 |
	| 1.4466 | 9.0568 | 2400 | 1.5116 |
	| 1.4672 | 9.4352 | 2500 | 1.5132 |
	| 1.4881 | 9.8136 | 2600 | 1.5099 |
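The table above can be summarised programmatically. The sketch below is illustrative only: it copies the first row and the last six rows of the table into a list (the variable names are assumptions, not part of the training script) and picks the step with the lowest validation loss.

```python
# (step, validation_loss) pairs copied from the training table:
# the first evaluation plus the last six evaluations.
eval_log = [
    (100, 1.6736),
    (2100, 1.5124),
    (2200, 1.5118),
    (2300, 1.5108),
    (2400, 1.5116),
    (2500, 1.5132),
    (2600, 1.5099),
]

# Step with the lowest validation loss among the listed rows.
best_step, best_loss = min(eval_log, key=lambda row: row[1])

# Drop in validation loss between the first evaluation and the best one.
improvement = round(eval_log[0][1] - best_loss, 4)

print(f"best step: {best_step}, validation loss: {best_loss}, improvement: {improvement}")
```

The validation loss flattens out after roughly step 1900, so the later checkpoints differ only in the third or fourth decimal place.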

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- distributed_type: multi-GPU
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 8
	- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: linear
	- num_epochs: 10
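The `total_train_batch_size` listed above follows from the per-device batch size and gradient accumulation. A minimal sketch of that arithmetic, assuming the usual Hugging Face Trainer convention (per-device batch size × accumulation steps × number of processes). The function name is hypothetical, and `num_processes=1` is an assumption chosen because it reproduces the reported total of 8; the card does not state the number of GPUs.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_processes: int) -> int:
    """Number of examples that contribute to one optimizer step."""
    return per_device * grad_accum_steps * num_processes


# train_batch_size=2 and gradient_accumulation_steps=4 come from the list above;
# num_processes=1 is an assumption, not a value stated in the card.
total = effective_batch_size(per_device=2, grad_accum_steps=4, num_processes=1)
print(total)
```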

	### DeepSpeed Configuration
	```json
	{
	  "fp16": { "enabled": false },
	  "bf16": { "enabled": true },
	  "zero_optimization": {
	    "stage": 2,
	    "offload_optimizer": {
	      "device": "cpu",
	      "pin_memory": true
	    },
	    "overlap_comm": true,
	    "contiguous_gradients": true,
	    "reduce_bucket_size": "auto"
	  },
	  "gradient_accumulation_steps": "auto",
	  "gradient_clipping": "auto",
	  "train_batch_size": "auto",
	  "train_micro_batch_size_per_gpu": "auto"
	}
	```
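To reuse this configuration, it has to be available as a JSON file (or a dict) that the trainer can read. The sketch below writes it to a temporary directory using only the standard library; the file name `ds_config.json` and the helper `write_ds_config` are hypothetical. The `"auto"` entries are left as-is, on the assumption that the Hugging Face Trainer integration fills them in from its own arguments.

```python
import json
import tempfile
from pathlib import Path

# ZeRO stage 2 configuration copied from the card.
DS_CONFIG = {
    "fp16": {"enabled": False},
    "bf16": {"enabled": True},
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu", "pin_memory": True},
        "overlap_comm": True,
        "contiguous_gradients": True,
        "reduce_bucket_size": "auto",
    },
    "gradient_accumulation_steps": "auto",
    "gradient_clipping": "auto",
    "train_batch_size": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}


def write_ds_config(directory: str, name: str = "ds_config.json") -> Path:
    """Write the DeepSpeed config as JSON and return the file path."""
    path = Path(directory) / name
    path.write_text(json.dumps(DS_CONFIG, indent=2))
    return path


config_path = write_ds_config(tempfile.mkdtemp())
print(config_path)

# With transformers and deepspeed installed, the path can then be passed on, e.g.:
#   TrainingArguments(..., deepspeed=str(config_path))
```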

	### LoRA Configuration
	- r: 8
	- lora_alpha: 16
	- target_modules: ["q_proj", "v_proj", "k_proj", "o_proj"]
	- lora_dropout: 0.05 # dataset is small, hence a low dropout value
	- bias: "none"
	- task_type: "CAUSAL_LM"
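The settings above correspond to keyword arguments of `peft.LoraConfig`. The sketch below keeps them in a plain dictionary so that it runs without PEFT installed, and also computes the scaling factor `lora_alpha / r` that standard LoRA applies to the low-rank update. The variable names are illustrative and not taken from the training script.

```python
# LoRA settings copied from the list above.
lora_kwargs = {
    "r": 8,
    "lora_alpha": 16,
    "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj"],
    "lora_dropout": 0.05,
    "bias": "none",
    "task_type": "CAUSAL_LM",
}

# Standard LoRA scales the low-rank update by lora_alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scaling)

# With PEFT installed, the same settings can be passed as:
#   from peft import LoraConfig
#   config = LoraConfig(**lora_kwargs)
```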

	## Evaluation

	### Metrics

	| Model | BERTScore | TinyMMLU | TinyTruthfulQA |
	|------------------|-----------|----------|----------------|
	| Base model | 0.88868 | 0.6837 | 0.49745 |
	| Fine-tuned model | 0.90726 | 0.67788 | 0.43822 |
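The size of each change is easier to see as a difference between the two rows. The snippet below is a small illustrative calculation on the values from the table; the names are assumptions, and the differences come from single reported scores, so they say nothing about statistical significance.

```python
# (base, fine-tuned) scores copied from the metrics table.
scores = {
    "BERTScore": (0.88868, 0.90726),
    "TinyMMLU": (0.6837, 0.67788),
    "TinyTruthfulQA": (0.49745, 0.43822),
}

# Fine-tuned minus base, rounded to the precision of the table.
deltas = {name: round(tuned - base, 5) for name, (base, tuned) in scores.items()}

for name, delta in deltas.items():
    print(f"{name}: {delta:+.5f}")
```

On these numbers, BERTScore on the African history task rises while both general-purpose benchmarks fall, with TinyTruthfulQA showing the larger drop.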


	## Compute Infrastructure

	[Runpod](https://console.runpod.io/).

	### Hardware

	Runpod A40 GPU instance

	### Framework versions

	- PEFT 0.18.1
	- Transformers 4.57.6
	- PyTorch 2.4.1+cu124
	- Datasets 4.5.0
	- Tokenizers 0.22.2

	## Citation

	If you use this model, please cite:
	```
	@Model{
	Ihenacho2026phi4_african_history_lora_ds2,
	author = {Daniel Ihenacho},
	title = {phi4_african_history_lora_ds2},
	year = {2026},
	publisher = {Hugging Face Models},
	url = {https://huggingface.co/DannyAI/phi4_african_history_lora_ds2},
	urldate = {2026-01-27},
	}
	```

	## Model Card Authors

	Daniel Ihenacho

	## Model Card Contact

	- [LinkedIn](https://www.linkedin.com/in/daniel-ihenacho-637467223)
	- [GitHub](https://github.com/daniau23)