metrics:
- bertscore
---

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

</details>

# phi4_lora_axolotl

This is a LoRA fine-tune of **microsoft/Phi-4-mini-instruct**, trained for question answering on African history using the **DannyAI/African-History-QA-Dataset** dataset.
It achieves a loss of 1.7479 (perplexity 5.7428) on the validation set.

## Model Details

### Model Description

- **Developed by:** Daniel Ihenacho
- **Funded by:** Daniel Ihenacho
- **Shared by:** Daniel Ihenacho
- **Model type:** Text Generation
- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** microsoft/Phi-4-mini-instruct

## Uses

This model can be used for question answering on African history.

### Out-of-Scope Use

The model can generate text on topics outside African history, but such use is out of scope and not recommended.

## How to Get Started with the Model

```python
import torch
from transformers import (
    AutoTokenizer,
    AutoModelForCausalLM,
    pipeline,
)
from peft import PeftModel

model_id = "microsoft/Phi-4-mini-instruct"

tokeniser = AutoTokenizer.from_pretrained(model_id)

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=False,
)

# Load the fine-tuned LoRA adapter on top of the base model
lora_id = "DannyAI/phi4_lora_axolotl"
lora_model = PeftModel.from_pretrained(model, lora_id)

generator = pipeline(
    "text-generation",
    model=lora_model,
    tokenizer=tokeniser,
)

def generate_answer(question: str) -> str:
    """Generates an answer for the given question using the fine-tuned LoRA model."""
    messages = [
        {"role": "system", "content": "You are a helpful AI assistant specialised in African history which gives concise answers to questions asked."},
        {"role": "user", "content": question},
    ]
    output = generator(
        messages,
        max_new_tokens=2048,
        do_sample=False,  # greedy decoding; a temperature (e.g. 0.1) only applies when do_sample=True
        return_full_text=False,
    )
    return output[0]["generated_text"].strip()

question = "What is the significance of African feminist scholarly activism in contemporary resistance movements?"
print(generate_answer(question))
```

```
# Example output
African feminist scholarly activism is significant in contemporary resistance movements as it provides a critical framework for understanding and addressing the specific challenges faced by African women in the context of global capitalism, neocolonialism, and patriarchal structures.
```
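
For deployment you may prefer to fold the adapter into the base weights so that no PEFT wrapper is needed at inference time. A minimal sketch using PEFT's `merge_and_unload`; the `merged_model` variable and the output directory name are illustrative, not part of this repo:

```python
# Merge the LoRA adapter into the base model weights (continues the snippet above)
merged_model = lora_model.merge_and_unload()

# Save the standalone merged model and its tokenizer (illustrative path)
merged_model.save_pretrained("phi4-african-history-merged")
tokeniser.save_pretrained("phi4-african-history-merged")
```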

## Training Details

### Training results

| Training Loss | Epoch | Step | Validation Loss | Ppl | Memory/max Active (GiB) | Memory/max Allocated (GiB) | Memory/device Reserved (GiB) |
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
| 2.5727 | 50.0 | 650 | 1.7479 | 5.7428 | 14.84 | 14.84 | 31.79 |

Ppl here is the exponential of the validation loss: exp(1.7479) ≈ 5.7428.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 8
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 20
- training_steps: 650
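
The run itself was driven by the axolotl config above, but the settings map roughly onto Hugging Face `TrainingArguments` as sketched below; this is an illustrative translation (the `output_dir` is a placeholder), not the exact configuration used:

```python
from transformers import TrainingArguments

# Rough TrainingArguments equivalent of the hyperparameters listed above
args = TrainingArguments(
    output_dir="phi4_lora_axolotl",   # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=4,    # 2 per device x 4 steps = total batch size 8
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=20,
    max_steps=650,
)
```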

### LoRA configuration

- r: 8
- lora_alpha: 16
- target_modules: ["q_proj", "v_proj", "k_proj", "o_proj"]
- lora_dropout: 0.05 (the dataset is small, hence a low dropout value)
- bias: "none"
- task_type: "CAUSAL_LM"
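
In PEFT terms this corresponds to a `LoraConfig` like the following; a sketch of the equivalent adapter configuration, not code extracted from the training run:

```python
from peft import LoraConfig

# PEFT equivalent of the LoRA settings listed above
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,  # low dropout because the dataset is small
    bias="none",
    task_type="CAUSAL_LM",
)
```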

## Evaluation

#### Metrics

| Model | BERTScore | TinyMMLU | TinyTruthfulQA |
|------|--------------|----------------|----------------|
| Base model | 0.88868 | 0.6837 | 0.49745 |
| Fine-tuned model | 0.88981 | 0.67371 | 0.46626 |
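
The BERTScore column can be reproduced with the `evaluate` library; a minimal sketch, where the `predictions` and `references` lists stand in for model outputs and gold answers from the QA dataset:

```python
import evaluate

# Stand-in data: generated answers vs. gold answers (assumed, for illustration)
predictions = ["The Mali Empire rose to prominence in the 13th century."]
references = ["The Mali Empire emerged as a major power in the 13th century."]

# Load BERTScore and compute per-example precision/recall/F1
bertscore = evaluate.load("bertscore")
results = bertscore.compute(predictions=predictions, references=references, lang="en")

# Report the mean F1 across examples
mean_f1 = sum(results["f1"]) / len(results["f1"])
print(f"BERTScore F1: {mean_f1:.5f}")
```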

## Compute Infrastructure

Training ran on [Runpod](https://console.runpod.io/).

### Hardware

An NVIDIA A40 GPU on a Runpod instance.

### Framework versions

- PEFT 0.18.1
- Transformers 4.57.6
- PyTorch 2.9.1+cu128
- Datasets 4.5.0
- Tokenizers 0.22.2

## Citation

If you use this model, please cite:

```bibtex
@misc{Ihenacho2026phi4_lora_axolotl,
  author    = {Daniel Ihenacho},
  title     = {phi4_lora_axolotl},
  year      = {2026},
  publisher = {Hugging Face Models},
  url       = {https://huggingface.co/DannyAI/phi4_lora_axolotl},
  urldate   = {2026-01-27},
}
```

## Model Card Authors

Daniel Ihenacho

## Model Card Contact

- [LinkedIn](https://www.linkedin.com/in/daniel-ihenacho-637467223)
- [GitHub](https://github.com/daniau23)