---
license: unknown
library_name: transformers.js
base_model:
- phishbot/ScamLLM
pipeline_tag: text-classification
---
# ScamLLM (ONNX)

This is an ONNX version of [phishbot/ScamLLM](https://huggingface.co/phishbot/ScamLLM). It was automatically converted and uploaded using [this Hugging Face Space](https://huggingface.co/spaces/onnx-community/convert-to-onnx).
## Usage with Transformers.js

See the pipeline documentation for `text-classification`: https://huggingface.co/docs/transformers.js/api/pipelines#module_pipelines.TextClassificationPipeline
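A minimal usage sketch with Transformers.js is shown below. The repository id (`onnx-community/ScamLLM-ONNX`) is a placeholder assumption — replace it with this repository's actual id on the Hub — and the exact label strings depend on the model's config:

```javascript
import { pipeline } from '@huggingface/transformers';

// Placeholder repo id: substitute this repository's id on the Hugging Face Hub.
const classifier = await pipeline('text-classification', 'onnx-community/ScamLLM-ONNX');

const output = await classifier('Your sample sentence or prompt...');
console.log(output);
// Per the Overview below: "Label 1" marks a phishing-enabling prompt, "Label 0" a safe one.
```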
---
# Overview
Our model, "ScamLLM", is designed to identify malicious prompts that can be used to generate phishing websites with popular commercial LLMs such as ChatGPT, Bard, and Claude.
It was obtained by fine-tuning a pre-trained RoBERTa model on a dataset encompassing multiple sets of malicious prompts.

Try out "ScamLLM" using the Inference API. The model classifies prompts with "Label 1" to indicate a phishing-enabling prompt, while "Label 0" denotes a prompt that is considered safe and non-malicious.
## Dataset Details

The dataset used to train this model was created from malicious prompts generated with GPT-3.5 Turbo and GPT-4.
Because these prompts represent active vulnerabilities under review, the dataset is currently available only upon request.
## Training Details

The model was trained with `RobertaForSequenceClassification.from_pretrained`, using the RoBERTa-base model and tokenizer, for 10 epochs with a learning rate of 2e-5 and the AdamW optimizer.
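The setup described above can be sketched as follows. This is an illustrative reconstruction, not the authors' training script: the dataset is a placeholder (the real malicious-prompt dataset is available on request), and batching/evaluation are omitted for brevity. Only the model, tokenizer, epoch count, learning rate, and optimizer come from the card:

```python
import torch
from torch.optim import AdamW
from transformers import RobertaForSequenceClassification, RobertaTokenizer

# Model and tokenizer named in the card.
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=2)

# Placeholder data standing in for the request-only malicious-prompt dataset.
texts = ["A benign sample prompt...", "A malicious sample prompt..."]
labels = torch.tensor([0, 1])  # 0 = safe, 1 = phishing-enabling

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=2e-5)  # hyperparameters from the card

model.train()
for epoch in range(10):  # 10 epochs, per the card
    optimizer.zero_grad()
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
```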
|
## Inference

There are multiple ways to test this model; the simplest is the Inference API, and you can also use the `text-classification` pipeline as shown below:
```python
from transformers import pipeline

# top_k=None returns scores for all labels rather than only the top one.
classifier = pipeline(task="text-classification", model="phishbot/ScamLLM", top_k=None)
prompt = ["Your Sample Sentence or Prompt...."]
model_outputs = classifier(prompt)
print(model_outputs[0])  # list of {label, score} dicts for the first prompt
```

If you use our model in your research, please cite our paper **"From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models"** (https://www.computer.org/csdl/proceedings-article/sp/2024/313000a221/1WPcYLpYFHy).

BibTeX below:
```bibtex
@inproceedings{roy2024chatbots,
  title={From Chatbots to Phishbots?: Phishing Scam Generation in Commercial Large Language Models},
  author={Roy, Sayak Saha and Thota, Poojitha and Naragam, Krishna Vamsi and Nilizadeh, Shirin},
  booktitle={2024 IEEE Symposium on Security and Privacy (SP)},
  pages={221--221},
  year={2024},
  organization={IEEE Computer Society}
}
```