---
language: en
library_name: pytorch
license: mit
pipeline_tag: text-classification
tags:
  - pytorch
  - multitask
  - ai-detection
---

# SuaveAI Detection Multitask Model V1

This repository contains a custom PyTorch multitask model checkpoint and auxiliary files.

The notebook used to train this model is here: https://www.kaggle.com/code/julienserbanescu/suaveai

## Files

- `multitask_model.pth`: model checkpoint weights
- `label_encoder.pkl`: label encoder used to map predictions to labels
- `tok.txt`: tokenizer/vocabulary artifact used during preprocessing

## Important

This is a **custom PyTorch checkpoint**, not a native Transformers `AutoModel` package.
The repo includes Hugging Face custom-code files, so it can be loaded from the Hub with
`trust_remote_code=True`.

## Load from Hugging Face Hub

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "DaJulster/SuaveAI-Dectection-Multitask-Model-V1"

tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()

text = "This is a sample input"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

binary_logits = outputs.logits_binary
multiclass_logits = outputs.logits_multiclass
```

The binary (human vs. AI) prediction comes from `logits_binary`; the AI-model classification comes from `logits_multiclass`.
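Turning the two heads' logits into predictions is a standard softmax/argmax step. A minimal sketch, assuming the binary head has shape `[batch, 2]` with index 1 meaning "AI-generated" and the multiclass head has one column per AI-model class (the tensor values below are illustrative, not real model outputs):

```python
import torch

# Illustrative logits standing in for outputs.logits_binary / logits_multiclass
binary_logits = torch.tensor([[0.2, 1.5]])
multiclass_logits = torch.tensor([[0.1, 2.3, -0.5, 0.7]])

# Binary head: softmax, then threshold the "AI" probability (index 1 assumed)
binary_probs = torch.softmax(binary_logits, dim=-1)
is_ai = binary_probs[:, 1] > 0.5

# Multiclass head: argmax gives an integer class id; the label encoder
# (see Quick start below) maps it back to an AI-model name
pred_ids = multiclass_logits.argmax(dim=-1)
```

The 0.5 threshold is a default choice; tune it if you need a different precision/recall trade-off.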

## Quick start

```python
import torch
import pickle

# 1) Recreate your model class exactly as in training
# from model_def import MultiTaskModel
# model = MultiTaskModel(...)

model = ...  # instantiate your model architecture
state = torch.load("multitask_model.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

with open("label_encoder.pkl", "rb") as f:
    label_encoder = pickle.load(f)

with open("tok.txt", "r", encoding="utf-8") as f:
    tokenizer_artifact = f.read()

# Run your preprocessing + inference pipeline here
```
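To map multiclass predictions back to names, the pickled label encoder is assumed to be a scikit-learn `LabelEncoder` fitted on the AI-model class names, in which case decoding is `label_encoder.inverse_transform(pred_ids)`. A stdlib-only sketch of the same lookup (class names here are purely illustrative):

```python
import pickle

# With the real artifact this would be:
#   with open("label_encoder.pkl", "rb") as f:
#       label_encoder = pickle.load(f)
#   labels = label_encoder.inverse_transform(pred_ids)

# Equivalent illustration of what inverse_transform does:
classes = ["claude", "gpt", "human", "llama"]  # stand-in for label_encoder.classes_
pred_ids = [2, 0]                              # argmax output, one id per example
labels = [classes[i] for i in pred_ids]
```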

## Intended use

- Multitask AI detection inference in your custom pipeline.

## Limitations

- Requires matching model definition and preprocessing pipeline.
- Not plug-and-play with `transformers.AutoModel.from_pretrained`.