Update README.md

This is a LoRA fine-tuned version of **microsoft/Phi-4-mini-instruct** for African history question answering.
This is the model card of a 🤗 transformers model that has been pushed on the Hub. This model card has been automatically generated.

- **Developed by:** Daniel Ihenacho
- **Funded by:** Daniel Ihenacho
- **Shared by:** Daniel Ihenacho
- **Model type:** Text Generation
- **Language(s) (NLP):** English
- **License:** mit
- **Finetuned from model:** microsoft/Phi-4-mini-instruct

### Model Sources [optional]
## Uses

The model is intended for question answering (QA) about African history.
### Out-of-Scope Use

The model can technically be prompted on topics beyond African history, but it should not be relied on outside that domain.
## How to Get Started with the Model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import PeftModel

model_id = "microsoft/Phi-4-mini-instruct"

tokeniser = AutoTokenizer.from_pretrained(model_id)

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=False,
)

# Load the fine-tuned LoRA adapter on top of the base model
lora_id = "DannyAI/phi4_african_history_lora"
lora_model = PeftModel.from_pretrained(model, lora_id)

generator = pipeline(
    "text-generation",
    model=lora_model,
    tokenizer=tokeniser,
)

def generate_answer(question: str) -> str:
    """Generates an answer for the given question using the fine-tuned LoRA model."""
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant specialised in African history which gives concise answers to questions asked."},
        {"role": "user", "content": question},
    ]

    # pipeline() returns a list of dicts; return_full_text=False gives only the assistant's reply
    output = generator(
        messages,
        max_new_tokens=2048,
        do_sample=False,  # greedy decoding; a temperature setting would be ignored here
        return_full_text=False,
    )
    return output[0]["generated_text"].strip()

question = "What is the significance of African feminist scholarly activism in contemporary resistance movements?"
print(generate_answer(question))
```

```
# Example output
African feminist scholarly activism is significant in contemporary resistance movements as it provides a critical framework for understanding and addressing the specific challenges faced by African women in the context of global capitalism, neocolonialism, and patriarchal structures.
```
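As the comment in the snippet notes, the pipeline returns a list of dicts, and with `return_full_text=False` each dict carries only the newly generated reply. A minimal sketch of the extraction step, using a mocked return value so it runs without downloading the model (the answer text is a placeholder, not real model output):

```python
# Mocked pipeline return value: one dict per generated sequence.
# The text below is a placeholder, not actual model output.
mock_output = [{"generated_text": "  Mansa Musa was a 14th-century ruler of the Mali Empire.  "}]

# Extract and clean the reply exactly as generate_answer() does.
answer = mock_output[0]["generated_text"].strip()
print(answer)
```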
## Training Details

### Training Data

| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 100  | 1.643900 | 1.650120 |
| 200  | 1.548300 | 1.577856 |
| 300  | 1.581000 | 1.551598 |
| 400  | 1.578900 | 1.538108 |
| 500  | 1.498800 | 1.528269 |
| 600  | 1.401300 | 1.518312 |
| 700  | 1.520000 | 1.513678 |
| 800  | 1.436400 | 1.506603 |
| 900  | 1.545600 | 1.504393 |
| 1000 | 1.439800 | 1.502365 |
| 1100 | 1.452100 | 1.500665 |
| 1200 | 1.466000 | 1.494793 |
| 1300 | 1.408300 | 1.493954 |
| 1400 | 1.508900 | 1.493219 |
| 1500 | 1.487500 | 1.493616 |
| 1600 | 1.383300 | 1.489923 |
| 1700 | 1.534100 | 1.489187 |
| 1800 | 1.468800 | 1.489143 |
| 1900 | 1.405100 | 1.488410 |
| 2000 | 1.509100 | 1.487043 |
| 2100 | 1.435800 | 1.488957 |
| 2200 | 1.434400 | 1.487890 |
| 2300 | 1.416800 | 1.488166 |
| 2400 | 1.416600 | 1.487361 |
| 2500 | 1.439200 | 1.487180 |
| 2600 | 1.450000 | 1.486632 |
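Validation loss in the log above flattens out near the end of training. A quick way to locate the best checkpoint from such a log (the dict below is copied from the reported table; no training code is assumed):

```python
# Validation loss per logged step, copied from the table above.
val_loss = {
    100: 1.650120, 200: 1.577856, 300: 1.551598, 400: 1.538108,
    500: 1.528269, 600: 1.518312, 700: 1.513678, 800: 1.506603,
    900: 1.504393, 1000: 1.502365, 1100: 1.500665, 1200: 1.494793,
    1300: 1.493954, 1400: 1.493219, 1500: 1.493616, 1600: 1.489923,
    1700: 1.489187, 1800: 1.489143, 1900: 1.488410, 2000: 1.487043,
    2100: 1.488957, 2200: 1.487890, 2300: 1.488166, 2400: 1.487361,
    2500: 1.487180, 2600: 1.486632,
}

# Step with the lowest validation loss, i.e. the best checkpoint.
best_step = min(val_loss, key=val_loss.get)
print(best_step, val_loss[best_step])  # 2600 1.486632
```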
#### Training Hyperparameters

#### Speeds, Sizes, Times [optional]