InfinitodeLTD
/

SMCM-OPEN-ARC

Text Classification


JohanBeytell committed on Oct 19, 2025

Commit

c110d02

·

verified ·

1 Parent(s): c1112f1

Update README.md

Files changed (1)

README.md +82 -3


@@ -1,3 +1,82 @@
----
-license: mit
----

+---
+license: mit
+language:
+- en
+metrics:
+- precision
+- recall
+- f1
+- accuracy
+pipeline_tag: text-classification
+tags:
+- classification
+- security
+---
+# Model Card for Infinitode/SMCM-OPEN-ARC
+Repository: https://github.com/Infinitode/OPEN-ARC/
+## Model Description
+OPEN-ARC-SMC is a multinomial Naive Bayes (`MultinomialNB`) model developed as part of Infinitode's OPEN-ARC initiative. It classifies text, particularly emails, as either spam or legitimate (ham).
+**Architecture**:
+- **MultinomialNB**: Default parameters.
+- **Framework**: scikit-learn.
+- **Training Setup**: Trained with default parameters.
+## Uses
+- Determining whether emails or SMS messages are spam or legitimate.
+- Enhancing research and developing defensive measures against spammers.
+## Limitations
+The model can produce false positives (legitimate messages flagged as spam) and false negatives (spam classified as legitimate) because of the nature of the training data and its inherent limitations.
+## Training Data
+- Dataset: Spam Mail Classifier dataset from Kaggle.
+- Source URL: https://www.kaggle.com/datasets/mosapabdelghany/spam-mail-classifier/
+- Content: Messages categorized as either spam or ham (legitimate emails or SMS).
+- Size: 1000 email/SMS messages labeled as spam or ham.
+- Preprocessing: Removed missing values and converted text into vectors.
+## Training Procedure
+- Metrics: accuracy, precision, recall, F1
+- Train/Test Split: 80% training, 20% testing.
+## Evaluation Results
+| Metric | Value |
+| ------ | ----- |
+| Testing Accuracy | 98.48% |
+| Testing Precision (`spam`) | 96.15% |
+| Testing Recall (`spam`) | 93.17% |
+| Testing F1 (`spam`) | 94.64% |
+## How to Use
+```python
+new_emails = [
+    "Congratulations! You've won a free prize. Click the link to claim.", # Likely spam
+    "Hi, just confirming our meeting for tomorrow at 10 AM. Thanks." # Likely not spam
+]
+# Assumes `vectorizer` and `model` are the fitted vectorizer and MultinomialNB from training
+new_emails_vectorized = vectorizer.transform(new_emails)
+# Make predictions
+predictions = model.predict(new_emails_vectorized)
+for email, prediction in zip(new_emails, predictions):
+    print(f"\nEmail: '{email}'")
+    print(f"Prediction: {prediction}")
+```
+## Contact
+For questions or issues, open a GitHub issue or reach out at https://infinitode.netlify.app/forms/contact.
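
The `How to Use` snippet in the README calls `vectorizer.transform` and `model.predict` without showing where those objects come from. As a rough sketch only, the setup the model card describes (text converted to vectors, `MultinomialNB` with default parameters, an 80/20 split, and the four listed metrics) might look like the following. `CountVectorizer` is an assumption, since the README does not name the vectorizer; the toy messages, the `random_state`, and the stratified split are also illustrative choices, not details taken from the repository or the Kaggle dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the Kaggle dataset (not real samples).
spam_texts = [
    "Win a free prize now, click the link",
    "Congratulations, you have won cash",
    "Claim your free gift card today",
    "Urgent: verify your account to win",
    "Free entry in a weekly prize draw",
    "Cheap loans approved, click here now",
    "You were selected for a free cruise",
    "Limited offer, claim your cash reward",
    "Click now to win a free phone",
    "Exclusive deal, free money waiting",
]
ham_texts = [
    "Are we still meeting tomorrow at 10?",
    "Please review the attached report",
    "Thanks for your help with the project",
    "Can you call me when you get home?",
    "Lunch at noon works for me",
    "The meeting notes are in the shared folder",
    "See you at the gym later",
    "I will send the invoice on Monday",
    "Happy birthday, hope you have a great day",
    "Let me know if the schedule changes",
]
texts = spam_texts + ham_texts
labels = ["spam"] * len(spam_texts) + ["ham"] * len(ham_texts)

# 80% train / 20% test, as stated in the model card.
# random_state and stratify are assumptions added for reproducibility.
X_train_text, X_test_text, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

# Assumed vectorizer: bag-of-words counts, fitted on the training split only.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(X_train_text)
X_test = vectorizer.transform(X_test_text)

# MultinomialNB with default parameters, as the model card states.
model = MultinomialNB()
model.fit(X_train, y_train)

# Evaluate with the metrics listed in the card, treating `spam` as the positive class.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", pos_label="spam", zero_division=0
)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

With `vectorizer` and `model` fitted along these lines, the README's prediction loop can run as written. Scores on this toy data say nothing about the results reported in the card.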