---
language: en
library_name: pytorch
license: mit
pipeline_tag: text-classification
tags:
- pytorch
- multitask
- ai-detection
---

# SuaveAI Detection Multitask Model V1

This repository contains a custom PyTorch multitask model checkpoint and auxiliary files.

The notebook used to train this model is here: https://www.kaggle.com/code/julienserbanescu/suaveai

## Files

- `multitask_model.pth`: model checkpoint weights
- `label_encoder.pkl`: label encoder used to map predictions to labels
- `tok.txt`: tokenizer/vocabulary artifact used during preprocessing

## Important

This is a **custom PyTorch checkpoint**, not a native Transformers `AutoModel` package.
The repo includes Hugging Face custom-code files, so it can be loaded from the Hub with
`trust_remote_code=True`.

## Load from Hugging Face Hub

```python
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "DaJulster/SuaveAI-Dectection-Multitask-Model-V1"

# trust_remote_code=True is needed because the repo ships custom model code
tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
model = AutoModel.from_pretrained(repo_id, trust_remote_code=True)
model.eval()

text = "This is a sample input"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

binary_logits = outputs.logits_binary
multiclass_logits = outputs.logits_multiclass
```

Binary prediction uses `logits_binary`, and AI-model classification uses `logits_multiclass`.

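The two logit outputs still have to be turned into predictions. The sketch below shows one possible way to do that for a single example, written in plain Python so it does not depend on tensor shapes. It rests on assumptions this card does not confirm: that a single binary logit is read through a sigmoid, that a pair of binary logits is read through a softmax with index 1 as the positive class, and that class names are supplied in the same order as the multiclass logits. The function names are illustrative and are not part of the repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_binary(logits, threshold=0.5):
    """Return (is_positive, probability) for one example.

    Assumption: one logit means sigmoid; two logits mean softmax with
    index 1 as the positive class.
    """
    if len(logits) == 1:
        x = logits[0]
        # Branch on the sign so math.exp never receives a large positive value.
        prob = 1.0 / (1.0 + math.exp(-x)) if x >= 0 else math.exp(x) / (1.0 + math.exp(x))
    else:
        prob = softmax(logits)[1]
    return prob >= threshold, prob


def decode_multiclass(logits, class_names):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]
```

Assuming the batch dimension comes first, the inputs could be obtained with `outputs.logits_binary[0].tolist()` and `outputs.logits_multiclass[0].tolist()`. If `label_encoder.pkl` turns out to be a scikit-learn `LabelEncoder`, `list(label_encoder.classes_)` would supply `class_names`.
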
## Quick start

```python
import pickle

import torch

# 1) Recreate your model class exactly as in training
# from model_def import MultiTaskModel
# model = MultiTaskModel(...)

model = ...  # instantiate your model architecture
state = torch.load("multitask_model.pth", map_location="cpu")
model.load_state_dict(state)
model.eval()

# 2) Load the label encoder used to map predictions to labels
with open("label_encoder.pkl", "rb") as f:
    label_encoder = pickle.load(f)

# 3) Read the tokenizer/vocabulary artifact used during preprocessing
with open("tok.txt", "r", encoding="utf-8") as f:
    tokenizer_artifact = f.read()

# Run your preprocessing + inference pipeline here
```

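Because `load_state_dict` only succeeds when the parameter names of the recreated model line up with those in the checkpoint, it can help to compare the two sets of names before loading. The helper below is a minimal sketch of that check; `compare_state_keys` is a hypothetical name, not something shipped in this repository. It strips a leading `module.` prefix, which `torch.nn.DataParallel` adds to parameter names when a wrapped model is saved; whether this checkpoint was saved that way is not stated here.

```python
def compare_state_keys(model_keys, checkpoint_keys, prefix="module."):
    """Report parameter names that differ between a model and a checkpoint."""
    # Drop the wrapper prefix from checkpoint keys that carry it.
    cleaned = {
        key[len(prefix):] if key.startswith(prefix) else key
        for key in checkpoint_keys
    }
    expected = set(model_keys)
    return {
        # Names the model defines but the checkpoint lacks.
        "missing": sorted(expected - cleaned),
        # Names the checkpoint holds but the model does not define.
        "unexpected": sorted(cleaned - expected),
    }
```

With the objects from the quick start, a call such as `compare_state_keys(model.state_dict().keys(), state.keys())` would return two empty lists when the architecture matches the checkpoint.
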
## Intended use

- Multitask AI detection inference in your custom pipeline.

## Limitations

- Requires a model definition and preprocessing pipeline that match the ones used in training.
- Not plug-and-play with `transformers.AutoModel.from_pretrained`: loading from the Hub requires `trust_remote_code=True`.
