phishbot/ScamLLM

+---
+license: unknown
+---
+# Overview
+Our model, "ScamLLM", is designed to identify malicious prompts that can be used to generate phishing websites and emails with popular commercial LLMs such as ChatGPT, Bard, and Claude.
+The model was obtained by fine-tuning a pre-trained RoBERTa model on a dataset encompassing multiple sets of malicious prompts, as detailed in our corresponding arXiv paper.
+<!--- **Paper:** https://arxiv.org/abs/2310.19181 -->
+Try out "ScamLLM" using the Inference API. The model assigns "Label 1" to a prompt it identifies as a phishing attempt and "Label 0" to a prompt it considers safe and non-malicious.
+## Dataset Details
+The dataset used to train this model consists of malicious prompts generated by GPT-4.
+Due to ethical concerns, the dataset is currently available only upon request.
+## Training Details
+The model was initialized with `RobertaForSequenceClassification.from_pretrained`,
+using the model weights and tokenizer of RoBERTa-base.
+We trained the model for 10 epochs with a learning rate of 2e-5 and the AdamW optimizer.
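
The setup described above could be sketched roughly as follows. This is a minimal illustration and not the authors' training script: the names `FineTuneConfig` and `build_model_and_optimizer` are hypothetical, `num_labels=2` is inferred from the two labels the card mentions, and the choice of `RobertaTokenizerFast` is an assumption. Data loading and the training loop are omitted because the card does not describe them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters stated in the model card (container name is hypothetical)."""
    base_model: str = "roberta-base"
    num_labels: int = 2          # assumption: inferred from "Label 0" / "Label 1"
    epochs: int = 10
    learning_rate: float = 2e-5


def build_model_and_optimizer(cfg: FineTuneConfig):
    """Load RoBERTa-base with a classification head and attach AdamW.

    Heavy imports are kept local so the config can be inspected
    without torch or transformers installed.
    """
    import torch
    from transformers import RobertaForSequenceClassification, RobertaTokenizerFast

    tokenizer = RobertaTokenizerFast.from_pretrained(cfg.base_model)
    model = RobertaForSequenceClassification.from_pretrained(
        cfg.base_model, num_labels=cfg.num_labels
    )
    optimizer = torch.optim.AdamW(model.parameters(), lr=cfg.learning_rate)
    return tokenizer, model, optimizer


cfg = FineTuneConfig()
print(cfg)
```

Calling `build_model_and_optimizer(cfg)` downloads the RoBERTa-base weights, so it is left out of the snippet's top level.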
+## Inference
+There are multiple ways to use this model. The simplest is the `text-classification` pipeline:
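+```python
+from transformers import pipeline
+
+# top_k=None returns the score for every label, not only the top one
+classifier = pipeline(task="text-classification", model="phishbot/ScamLLM", top_k=None)
+prompt = ["Your Sample Sentence or Prompt...."]
+model_outputs = classifier(prompt)
+# model_outputs holds one list of {label, score} entries per input prompt
+print(model_outputs[0])
+```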
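
With `top_k=None`, the pipeline returns a list of `{"label", "score"}` entries for each prompt. A small helper along the following lines could turn that list into a yes/no decision. This is only a sketch: the label string `LABEL_1` is an assumption (the card refers to it as "Label 1"), the helper names and the 0.5 threshold are hypothetical, and the example scores are hand-written, not real model output.

```python
PHISHING_LABEL = "LABEL_1"  # assumed label string; the card calls it "Label 1"


def phishing_score(label_scores):
    """Return the score assigned to the phishing label for one prompt."""
    for entry in label_scores:
        if entry["label"] == PHISHING_LABEL:
            return entry["score"]
    raise KeyError(f"{PHISHING_LABEL} not found in pipeline output")


def is_phishing(label_scores, threshold=0.5):
    """Flag a prompt when its phishing score reaches the threshold (hypothetical default)."""
    return phishing_score(label_scores) >= threshold


# Hand-written example in the shape the pipeline returns with top_k=None;
# the scores are illustrative only.
example = [
    {"label": "LABEL_1", "score": 0.97},
    {"label": "LABEL_0", "score": 0.03},
]
print(is_phishing(example))
```

In practice, `example` would be replaced by `model_outputs[0]` from the pipeline call, after checking which label strings the model actually emits.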
+## Results
+The model achieved an accuracy of 96% and an F1-score of 0.96 on the test sets, whose distribution is explained in the paper.
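
For readers unfamiliar with these metrics, the sketch below shows how accuracy and a binary F1-score are computed from true and predicted labels. It assumes that "Label 1" (phishing) is the positive class; the paper may use a different averaging scheme. The function name `accuracy_and_f1` is hypothetical, and the labels are toy values, not the paper's test data.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Compute accuracy and binary F1 for the given positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1


# Toy labels for illustration only (1 = phishing, 0 = safe)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={accuracy:.2f} f1={f1:.2f}")
```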
+<!--## Citation
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section.
+If you find ScamLLM to be useful, please cite it with:
+```
+@misc{roy2023chatbots,
+      title={From Chatbots to PhishBots? -- Preventing Phishing scams created using ChatGPT, Google Bard and Claude},
+      author={Sayak Saha Roy and Poojitha Thota and Krishna Vamsi Naragam and Shirin Nilizadeh},
+      year={2023},
+      eprint={2310.19181},
+      archivePrefix={arXiv},
+      primaryClass={cs.CR}
+}
+```-->