
Libraries

How to use CodeMasterAbdul/alloy-phi3-steel-maintenance with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="CodeMasterAbdul/alloy-phi3-steel-maintenance")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("CodeMasterAbdul/alloy-phi3-steel-maintenance")
model = AutoModelForCausalLM.from_pretrained("CodeMasterAbdul/alloy-phi3-steel-maintenance", device_map="auto")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use CodeMasterAbdul/alloy-phi3-steel-maintenance with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "CodeMasterAbdul/alloy-phi3-steel-maintenance"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CodeMasterAbdul/alloy-phi3-steel-maintenance",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
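The same request can be issued from Python. The sketch below uses only the standard library and assumes the server started by `vllm serve` is listening on its default port 8000, as in the curl call; `build_chat_request` and `chat` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request

# Assumed values: they mirror the curl example (vLLM's default port 8000).
BASE_URL = "http://localhost:8000/v1"
MODEL_ID = "CodeMasterAbdul/alloy-phi3-steel-maintenance"


def build_chat_request(user_content, base_url=BASE_URL):
    """Build the POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": user_content}],
    }
    return urllib.request.Request(
        url=f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_content, base_url=BASE_URL, timeout=60.0):
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(user_content, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-style responses carry the reply in choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Calling `chat("What is the capital of France?")` requires the server to be running. Passing `base_url="http://localhost:30000/v1"` should target the SGLang server instead, since it exposes the same route.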

Use Docker

docker model run hf.co/CodeMasterAbdul/alloy-phi3-steel-maintenance

SGLang

How to use CodeMasterAbdul/alloy-phi3-steel-maintenance with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "CodeMasterAbdul/alloy-phi3-steel-maintenance" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "CodeMasterAbdul/alloy-phi3-steel-maintenance",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "CodeMasterAbdul/alloy-phi3-steel-maintenance" \
        --host 0.0.0.0 \
        --port 30000
# Call the server with the same curl request shown in the pip-based steps above (OpenAI-compatible API on port 30000).

Docker Model Runner
How to use CodeMasterAbdul/alloy-phi3-steel-maintenance with Docker Model Runner:
```
docker model run hf.co/CodeMasterAbdul/alloy-phi3-steel-maintenance
```

Alloy-Agent Phi-3 Maintenance

Fine-tuned Phi-3-mini-4k-instruct for predictive maintenance tasks in industrial environments. The model was trained on equipment sensor data, failure diagnostics, and maintenance procedures from steel manufacturing contexts.

Model Overview

This is a QLoRA fine-tuned version of microsoft/Phi-3-mini-4k-instruct (3.8B parameters) specialized for equipment maintenance analysis. Training focused on interpreting sensor readings, diagnosing failure modes, and generating maintenance recommendations.

The model handles tasks like remaining useful life (RUL) estimation, root cause analysis, risk classification, and maintenance procedure generation based on equipment telemetry data.

Training Data

Dataset: 1,973 maintenance records combining NASA CMAPSS turbofan data, the UCI AI4I 2020 predictive maintenance dataset, and synthetic domain scenarios.

Split: 1,776 training / 197 validation
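The card does not say how the split was produced. As a minimal sketch, a seeded random 90/10 split reproduces the reported counts; the helper name `split_records`, the 10% fraction, and the seed are assumptions for illustration.

```python
import random


def split_records(records, val_fraction=0.1, seed=42):
    """Shuffle a copy of the records and split off a validation set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)  # 1,973 * 0.1 -> 197
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder integers stand in for the 1,973 maintenance records.
train, val = split_records(range(1973))
print(len(train), len(val))  # 1776 197
```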

Input format:

Equipment: [type] | ID: [id]
Operating Hours: [hours]
Sensor Readings: Temperature, Vibration, Pressure, etc.
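A small helper can render a record in this layout. This is only a sketch: the function name, the `sensors` mapping, and the equipment ID `AC-001` are hypothetical, and the exact reading names and units the model saw in training are not specified here.

```python
def format_equipment_input(equipment_type, equipment_id, operating_hours, sensors):
    """Render one record in the three-line input layout.

    `sensors` maps a reading name to a (value, unit) pair; these names
    and units are illustrative, not a schema defined by the model card.
    """
    readings = ", ".join(
        f"{name}: {value} {unit}" for name, (value, unit) in sensors.items()
    )
    return (
        f"Equipment: {equipment_type} | ID: {equipment_id}\n"
        f"Operating Hours: {operating_hours}\n"
        f"Sensor Readings: {readings}"
    )


example = format_equipment_input(
    "Air Compressor Unit",
    "AC-001",  # hypothetical equipment ID
    2150,
    {"Temperature": (95, "°C"), "Vibration": (1.2, "mm/s"), "Pressure": (7.8, "bar")},
)
print(example)
```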

Output format:

DIAGNOSIS: [failure analysis]
ROOT CAUSE: [technical cause]
RISK LEVEL: [LOW/MEDIUM/HIGH/CRITICAL]
RUL: [hours] ± [confidence]
RECOMMENDATIONS: [maintenance actions]
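Because the output uses fixed field labels, a response can be split into a dictionary for downstream use. The parser below is a sketch that assumes each label starts its own line; `parse_maintenance_report` is a hypothetical name, and `sample` is a hand-written string for illustration, not actual model output.

```python
import re

FIELDS = ("DIAGNOSIS", "ROOT CAUSE", "RISK LEVEL", "RUL", "RECOMMENDATIONS")
RISK_LEVELS = {"LOW", "MEDIUM", "HIGH", "CRITICAL"}

# A field starts at the beginning of a line with one of the known labels.
_FIELD_RE = re.compile(
    r"^(%s):[ \t]*" % "|".join(re.escape(name) for name in FIELDS),
    re.MULTILINE,
)


def parse_maintenance_report(text):
    """Split a response into a {field: value} dict; missing fields are omitted."""
    matches = list(_FIELD_RE.finditer(text))
    report = {}
    for i, match in enumerate(matches):
        # A field's value runs until the next label or the end of the text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        report[match.group(1)] = text[match.end():end].strip()
    risk = report.get("RISK LEVEL")
    if risk is not None and risk.upper() not in RISK_LEVELS:
        raise ValueError(f"unexpected risk level: {risk!r}")
    return report


# Hand-written sample for illustration only (not real model output).
sample = """DIAGNOSIS: Bearing wear suspected
ROOT CAUSE: Insufficient lubrication
RISK LEVEL: HIGH
RUL: 120 ± 24
RECOMMENDATIONS: Inspect bearings
Re-lubricate within 48 hours"""

report = parse_maintenance_report(sample)
print(report["RISK LEVEL"])  # HIGH
```

Rejecting unknown risk levels early keeps malformed generations from silently reaching a scheduling system.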

Training Configuration

Hardware: Google Colab T4 GPU (15GB VRAM)
Duration: 4.5 hours, 666 steps over 3 epochs
Final eval loss: 0.02508

Method: QLoRA (4-bit quantization + LoRA adapters)

LoRA rank: 16, alpha: 16, dropout: 0.05
Target modules: all attention and MLP projection layers
Trainable params: 29.9M / 3.8B (0.78%)
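The trainable-parameter figure can be checked by hand: a rank-r LoRA adapter on a d_in × d_out projection adds r × (d_in + d_out) weights. The sketch below assumes adapters on seven separate (unfused) projections per layer and an MLP intermediate size of 8192 from the base model's configuration; the card itself only says "all attention and MLP projection layers". Under those assumptions the count matches the reported 29.9M.

```python
# Hidden size, layer count, and rank are listed in this card; the MLP
# intermediate size of 8192 is taken from the base model's config.
HIDDEN, INTERMEDIATE, LAYERS, RANK = 3072, 8192, 32, 16

# ASSUMPTION: seven unfused projections per layer (Llama-style naming).
PROJECTIONS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in, d_out, rank):
    """LoRA adds A (rank x d_in) and B (d_out x rank) per adapted matrix."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out, RANK) for d_in, d_out in PROJECTIONS.values())
total = per_layer * LAYERS

print(f"{total:,}")              # 29,884,416
print(f"{total / 3.82e9:.2%}")   # 0.78%
```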

Hyperparameters:

learning_rate: 2e-4
batch_size: 8 (2 per device, 4 gradient accumulation steps)
optimizer: adamw_8bit
weight_decay: 0.01
warmup_steps: 50
max_grad_norm: 1.0
lr_scheduler: linear
fp16: True
gradient_checkpointing: True
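The step count follows from the dataset size and batch settings. A short check, assuming a single GPU so that the effective batch is the per-device batch times the accumulation steps:

```python
import math

train_examples = 1776      # training split size
per_device_batch = 2
grad_accum_steps = 4
epochs = 3

effective_batch = per_device_batch * grad_accum_steps            # 8 on one GPU
steps_per_epoch = math.ceil(train_examples / effective_batch)    # 222
total_steps = steps_per_epoch * epochs

print(total_steps)  # 666

# Average wall-clock time per optimizer step over the 4.5-hour run.
seconds_per_step = 4.5 * 3600 / total_steps
print(round(seconds_per_step, 1))  # 24.3
```

With warmup_steps set to 50, warmup covers about 7.5% of the run.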

Training used Unsloth for a roughly 2x fine-tuning speedup.

Usage

Basic inference:

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "CodeMasterAbdul/alloy-phi3-steel-maintenance",
    device_map="auto",
    torch_dtype="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("CodeMasterAbdul/alloy-phi3-steel-maintenance")

prompt = """<|system|>You are an industrial maintenance AI assistant specialized in steel plant equipment analysis.<|end|>
<|user|>Equipment: Air Compressor Unit
Temperature: 95°C (baseline: 75°C)
Vibration: 1.2 mm/s (baseline: 0.5 mm/s)
Pressure: 7.8 bar (baseline: 8.5 bar)
Operating Hours: 2,150 hours

Analyze and provide maintenance recommendations.<|end|>
<|assistant|>"""

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=400,
    temperature=0.7,
    do_sample=True,
    top_p=0.9
)
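# Decode only the newly generated tokens. skip_special_tokens=True removes
# chat markers such as <|assistant|>, so the text cannot be split on them.
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response.strip())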

For faster inference with 4-bit quantization:

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="CodeMasterAbdul/alloy-phi3-steel-maintenance",
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

Performance

Training loss fell steadily, and evaluation loss was still decreasing at the final step, which suggests no overfitting:

Step   0: train_loss ~2.50
Step 100: train_loss 0.52, eval_loss 0.0687
Step 200: train_loss 0.18
Step 400: train_loss 0.08
Step 666: train_loss 0.025, eval_loss 0.02508

Training loss reduction: 2.5 → 0.025 (100x lower)
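The ratios behind these figures are simple to recompute from the logged values; the variable names below are illustrative.

```python
# Loss values copied from the training log above.
train_start, train_end = 2.5, 0.025
eval_step_100, eval_end = 0.0687, 0.02508

train_ratio = train_start / train_end      # how many times lower the training loss ended
eval_drop = 1 - eval_end / eval_step_100   # relative drop in eval loss, step 100 to 666

print(f"{train_ratio:.0f}x")   # 100x
print(f"{eval_drop:.1%}")      # 63.5%
```

The final training and evaluation losses (0.025 and 0.02508) are nearly equal, which is consistent with the absence of a generalization gap on the validation split.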

Inference: ~0.7s per response on T4 GPU with 4-bit quantization

Limitations

Trained primarily on steel plant equipment data - performance on other industrial domains may vary
Outputs should be validated by maintenance engineers for critical systems
Model provides estimates, not guarantees - RUL predictions have inherent uncertainty
English language only
Requires structured sensor data inputs for best results

Use Cases

The model is designed for decision support in industrial maintenance:

Early failure detection from sensor anomalies
RUL estimation for maintenance scheduling
Root cause analysis during equipment diagnostics
Generating maintenance work orders and procedures

Not intended for:

Autonomous control of equipment
Safety-critical decisions without human review
Financial/legal advice
Medical equipment diagnostics

Technical Details

Architecture: Phi-3-mini-4k-instruct base

Total parameters: 3.82B
Trainable (LoRA): 29.88M (0.78%)
Quantization: 4-bit NF4
Context window: 4096 tokens (2048 used in training)
Attention heads: 32
Hidden size: 3072
Layers: 32

Requirements:

Minimum: 4GB GPU VRAM (with 4-bit quantization)
Recommended: 8GB+ GPU VRAM for production
Dependencies: transformers, torch, accelerate, bitsandbytes
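A back-of-the-envelope estimate shows why 4 GB is a plausible minimum. The sketch counts weight storage only; quantization constants, activations, and the KV cache add overhead on top, so treat the numbers as lower bounds. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Memory needed to store the weights alone, in gigabytes (1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


TOTAL_PARAMS = 3.82e9  # total parameter count listed above

print(f"{weight_memory_gb(TOTAL_PARAMS, 4):.2f} GB")   # 1.91 GB  (4-bit NF4)
print(f"{weight_memory_gb(TOTAL_PARAMS, 16):.2f} GB")  # 7.64 GB  (fp16/bf16)
```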

Citation

@misc{alloy-agent-phi3-2026,
  title={Alloy-Agent Phi-3: Fine-Tuned Model for Industrial Predictive Maintenance},
  author={Abdul Nazeer},
  year={2026},
  publisher={HuggingFace},
  url={https://huggingface.co/abdul-nazeer/alloy-agent-phi3-maintenance}
}

Base model:

@article{phi3,
  title={Phi-3 Technical Report},
  author={Microsoft},
  year={2024},
  url={https://arxiv.org/abs/2404.14219}
}

License

Apache 2.0. The base model, microsoft/Phi-3-mini-4k-instruct, is released under the MIT license.

Acknowledgments

Built on microsoft/Phi-3-mini-4k-instruct. Training optimized with Unsloth. Datasets sourced from NASA CMAPSS and UCI AI4I 2020.

