DaJulster committed · verified
Commit 88a1e3e · Parent(s): e90dc4c

Update README.md

Files changed (1):
  1. README.md +84 -82
README.md CHANGED
---
language: en
library_name: pytorch
license: mit
pipeline_tag: text-classification
tags:
- pytorch
- multitask
- ai-detection
---

# SuaveAI Detection Multitask Model V1

This repository contains a custom PyTorch multitask model checkpoint and auxiliary files.

The notebook used to train this model is available at https://www.kaggle.com/code/julienserbanescu/suaveai

## Files

- `multitask_model.pth`: model checkpoint weights
- `label_encoder.pkl`: label encoder used to map predictions to labels
- `tok.txt`: tokenizer/vocabulary artifact used during preprocessing

## Important

This is a **custom PyTorch checkpoint**, not a native Transformers `AutoModel` package. The repository now includes Hugging Face custom-code files, so it can be loaded from the Hub with `trust_remote_code=True`.

## Load from Hugging Face Hub

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "DaJulster/SuaveAI-Dectection-Multitask-Model-V1"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()

text = "This is a sample input"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

binary_logits = outputs.logits_binary
multiclass_logits = outputs.logits_multiclass
```

Binary (human vs. AI) prediction uses `logits_binary`, and AI-model classification uses `logits_multiclass`.
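Turning the two heads' logits into readable predictions is an argmax/softmax step. A minimal sketch with stand-in tensors — the head sizes, the `1 = AI-generated` convention, and the label order are assumptions here; in practice they come from the checkpoint and `label_encoder.pkl`:

```python
import torch

# Stand-ins for outputs.logits_binary and outputs.logits_multiclass.
# Shapes assumed: [batch, 2] and [batch, num_ai_models].
binary_logits = torch.tensor([[0.2, 1.5]])
multiclass_logits = torch.tensor([[0.1, 2.0, -1.0]])

# Argmax picks the predicted class; softmax gives a confidence per head.
is_ai = binary_logits.argmax(dim=-1)  # assumed convention: 1 = AI-generated
binary_conf = torch.softmax(binary_logits, dim=-1)[0, is_ai].item()

model_idx = multiclass_logits.argmax(dim=-1)
# label_encoder.inverse_transform(model_idx.numpy()) would map the index
# back to an AI-model name.
print(is_ai.item(), model_idx.item())  # → 1 1
```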
## Quick start

```python
import torch
import pickle

# 1) Recreate your model class exactly as in training
# from model_def import MultiTaskModel
# model = MultiTaskModel(...)

model = ...  # instantiate your model architecture
state = torch.load("multitask_model.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

with open("label_encoder.pkl", "rb") as f:
    label_encoder = pickle.load(f)

with open("tok.txt", "r", encoding="utf-8") as f:
    tokenizer_artifact = f.read()

# Run your preprocessing + inference pipeline here
```
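As a sanity check for the "recreate your model class exactly" step, here is a self-contained state-dict round trip using a toy stand-in architecture (the real `MultiTaskModel` layers differ). Passing `weights_only=True` restricts `torch.load` unpickling to tensors and plain containers, which is safer for checkpoints downloaded from the Hub:

```python
import torch
import torch.nn as nn

# Toy stand-in; the real MultiTaskModel architecture is an assumption here.
class TinyMultiTask(nn.Module):
    def __init__(self, dim=8, n_models=3):
        super().__init__()
        self.backbone = nn.Linear(dim, dim)
        self.head_binary = nn.Linear(dim, 2)
        self.head_multiclass = nn.Linear(dim, n_models)

    def forward(self, x):
        h = torch.relu(self.backbone(x))
        return self.head_binary(h), self.head_multiclass(h)

model = TinyMultiTask()
torch.save(model.state_dict(), "toy_multitask.pth")

# load_state_dict fails loudly if the architecture does not match the
# checkpoint, which is exactly the mismatch the Limitations section warns about.
restored = TinyMultiTask()
state = torch.load("toy_multitask.pth", map_location="cpu", weights_only=True)
restored.load_state_dict(state)
restored.eval()
```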
## Intended use

- Multitask AI detection inference in your custom pipeline.

## Limitations

- Requires a matching model definition and preprocessing pipeline.
- Not plug-and-play with `transformers.AutoModel.from_pretrained` unless `trust_remote_code=True` is passed.