rudycaz committed
Commit 851bf10 · verified · 1 Parent(s): 8bf5aa1

Update README.md

Files changed (1):
  1. README.md +64 -31

README.md CHANGED
@@ -1,45 +1,78 @@
  ---
  language:
- - multilingual
- library_name: mlx
- license: mit
- license_link: https://huggingface.co/microsoft/Phi-3.5-mini-instruct/resolve/main/LICENSE
  pipeline_tag: text-generation
- tags:
- - nlp
- - code
- - mlx
- widget:
- - messages:
-   - role: user
-     content: Can you provide ways to eat combinations of bananas and dragonfruits?
- base_model: mlx-community/Phi-3.5-mini-instruct-4bit
  ---

- # rudycaz/phi35-phish-mlx

- This model [rudycaz/phi35-phish-mlx](https://huggingface.co/rudycaz/phi35-phish-mlx) was
- converted to MLX format from [mlx-community/Phi-3.5-mini-instruct-4bit](https://huggingface.co/mlx-community/Phi-3.5-mini-instruct-4bit)
- using mlx-lm version **0.30.7**.

- ## Use with mlx

- ```bash
- pip install mlx-lm
- ```

- ```python
  from mlx_lm import load, generate

- model, tokenizer = load("rudycaz/phi35-phish-mlx")

- prompt = "hello"

- if tokenizer.chat_template is not None:
-     messages = [{"role": "user", "content": prompt}]
-     prompt = tokenizer.apply_chat_template(
-         messages, add_generation_prompt=True, return_dict=False,
-     )

- response = generate(model, tokenizer, prompt=prompt, verbose=True)
- ```
  ---
+ base_model: mlx-community/Phi-3.5-mini-instruct-4bit
+ tags:
+ - mlx
+ - mlx-lm
+ - phi-3.5
+ - peft
+ - lora
+ - phishing
+ - cybersecurity
  language:
+ - en
  pipeline_tag: text-generation
  ---

+ # phi35-phish-mlx
+
+ This repository contains an **Apple MLX**-format phishing-focused model derived from **Phi-3.5 Mini Instruct (4-bit)**. It is intended to help classify suspicious emails and support security-review workflows.
+
+ ## What’s in this repo
+
+ This repo is meant to be used in one of two ways:
+
+ - **Fused model** (base + adapter merged into a single MLX model directory), or
+ - **Adapter-only** (LoRA adapter weights) applied on top of the base model locally.
+
+ If you are unsure which you uploaded, check the repo file list:
+ - A fused model typically includes MLX weights plus tokenizer/config files for direct inference.
+ - An adapter-only repo typically includes adapter weight files/config and requires the base model separately.
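The file-list check described above can be automated with a small heuristic. The sketch below is illustrative only (the `repo_kind` helper and its matching rules are assumptions, not part of this repo); an actual file list can be fetched with `huggingface_hub.list_repo_files`.

```python
def repo_kind(files):
    """Rough guess at whether a repo file list looks fused or adapter-only.

    Heuristic only: fused MLX repos ship full weights (*.safetensors),
    while adapter-only repos ship files with 'adapter' in the name.
    """
    names = [f.lower() for f in files]
    has_adapter = any("adapter" in f for f in names)
    has_full_weights = any(
        f.endswith(".safetensors") and "adapter" not in f for f in names
    )
    if has_full_weights:
        return "fused"
    if has_adapter:
        return "adapter-only"
    return "unknown"

print(repo_kind(["model.safetensors", "config.json", "tokenizer.json"]))  # fused
print(repo_kind(["adapters.safetensors", "adapter_config.json"]))  # adapter-only
```

Treat the result as a hint, not a guarantee; unusual file layouts will need a manual look.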
+
+ ## Base model
+
+ - `mlx-community/Phi-3.5-mini-instruct-4bit`
+
+ ## Dataset
+
+ This model was fine-tuned for phishing detection on a Kaggle phishing email dataset:
+
+ - Kaggle dataset: **“phishing-email-dataset”** (naserabdullahalam)
+   https://www.kaggle.com/datasets/naserabdullahalam/phishing-email-dataset
+
+ > If you trained Phi-3.5 on a different Kaggle dataset, replace the link above with the exact dataset URL you used so the citation is accurate.
+
+ ## Intended behavior
+
+ Given an email, the intended output is a single label:
+
+ - `PHISHING`
+ - `LEGIT`
+
+ Example prompt format:
+
+ ```text
+ You are a security assistant. Classify the following email as PHISHING or LEGIT.
+
+ EMAIL:
+ <paste email here>
+
+ Answer with exactly one word: PHISHING or LEGIT.
+ ```
+
+ ## Use with mlx-lm
+
+ Install the dependencies:
+
+ ```bash
+ pip install -U mlx-lm huggingface_hub
+ ```
+
+ ```python
  from mlx_lm import load, generate

+ # Option A: load this repo directly (if a fused model is uploaded)
+ MODEL_ID = "rudycaz/phi35-phish-mlx"
+
+ model, tokenizer = load(MODEL_ID)
+
+ prompt = """You are a security assistant. Classify the following email as PHISHING or LEGIT.

+ EMAIL:
+ Subject: Verify your account
+ Body: Please click the link below to verify...

+ Answer with exactly one word: PHISHING or LEGIT.
+ """
+
+ print(generate(model, tokenizer, prompt, max_tokens=8))
+ ```
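Since decoding can surround the label with whitespace, punctuation, or extra tokens, downstream consumers may want to normalize the model's raw output. A minimal sketch (the `parse_label` helper is illustrative, not part of this repo):

```python
import re


def parse_label(text):
    """Map raw model output to 'PHISHING', 'LEGIT', or 'UNKNOWN'."""
    match = re.search(r"\b(PHISHING|LEGIT)\b", text.upper())
    return match.group(1) if match else "UNKNOWN"


print(parse_label(" phishing\n"))    # PHISHING
print(parse_label("LEGIT."))         # LEGIT
print(parse_label("I am not sure"))  # UNKNOWN
```

Returning an explicit `UNKNOWN` keeps malformed generations from being silently counted as either class.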