Commit 84f00ae (verified) by atdokmeci · 1 Parent(s): 8cb568e

Create README.md

Files changed (1): README.md (added, +38 -0)
# SMS Spam Detection: Combined Model Card

## Models

### 1. Multinomial Naive Bayes
- **Type:** MultinomialNB
- **Library:** scikit-learn
- **Description:** A Naive Bayes classifier for multinomially distributed data, commonly used for text classification tasks.
- **Training Data:** SMS Spam Collection dataset (`train.csv`), preprocessed and vectorized using `CountVectorizer`.
- **Features:** Bag-of-words (unigrams), stopwords removed.
- **Target:** `label` (0: ham, 1: spam)
- **Accuracy:** `{{ accuracy_score(tahmin, y_test) }}`
- **Date Trained:** `{{ datetime.now().strftime("%Y-%m-%d") }}`
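
The setup described in this card (unigram counts from `CountVectorizer` fed into `MultinomialNB`) can be sketched roughly as follows. This is a minimal illustration, not the training script behind this card: the toy messages stand in for `train.csv`, the helper `predict_labels` is hypothetical, and `stop_words="english"` is an assumption, since the card does not say which stopword list was used.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy messages standing in for train.csv (hypothetical data, not the real dataset).
texts = [
    "win a free prize now",
    "free cash claim your prize",
    "urgent claim free reward now",
    "are we meeting for lunch today",
    "see you at home tonight",
    "can you call me after the meeting",
]
labels = [1, 1, 1, 0, 0, 0]  # 0: ham, 1: spam

# Unigram bag-of-words; the English stopword list is an assumption.
vectorizer = CountVectorizer(ngram_range=(1, 1), stop_words="english")
X_train = vectorizer.fit_transform(texts)

# Fit the classifier on the word-count matrix.
model = MultinomialNB()
model.fit(X_train, labels)


def predict_labels(messages):
    """Vectorize raw messages with the fitted vocabulary and predict 0 or 1."""
    return model.predict(vectorizer.transform(messages)).tolist()


predictions = predict_labels(["claim your free prize", "lunch meeting today"])
```

New messages must go through `vectorizer.transform` rather than `fit_transform`, so that they are mapped onto the vocabulary learned during training.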

### 2. Decision Tree Classifier
- **Type:** DecisionTreeClassifier
- **Library:** scikit-learn
- **Description:** A decision tree classifier for binary classification of SMS messages.
- **Training Data:** SMS Spam Collection dataset (`train.csv`), preprocessed and vectorized using `CountVectorizer`.
- **Features:** Bag-of-words (unigrams), stopwords removed.
- **Target:** `label` (0: ham, 1: spam)
- **Accuracy:** `{{ accuracy_score(tahmin3, y_test) }}`
- **Date Trained:** `{{ datetime.now().strftime("%Y-%m-%d") }}`
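
As a rough sketch of the decision tree variant, the example below fits a `DecisionTreeClassifier` on the same kind of count features and round-trips it through joblib, which the Notes section names as the persistence method. The toy data, the file name `decision_tree.joblib`, `random_state=0`, and the choice to store the vectorizer alongside the model are all assumptions made for illustration.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.tree import DecisionTreeClassifier

# Toy messages standing in for train.csv (hypothetical data).
texts = [
    "win a free prize now",
    "free cash claim your prize",
    "urgent claim free reward now",
    "are we meeting for lunch today",
    "see you at home tonight",
    "can you call me after the meeting",
]
labels = [1, 1, 1, 0, 0, 0]  # 0: ham, 1: spam

# Unigram counts; the English stopword list is an assumption.
vectorizer = CountVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(texts)

# random_state fixes tie-breaking between equally good splits.
tree = DecisionTreeClassifier(random_state=0)
tree.fit(X_train, labels)

# Save the vectorizer together with the model so that inference
# reuses the exact vocabulary seen during training.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "decision_tree.joblib")  # hypothetical file name
    joblib.dump({"vectorizer": vectorizer, "model": tree}, path)
    restored = joblib.load(path)

features = restored["vectorizer"].transform(texts)
train_predictions = restored["model"].predict(features).tolist()
```

An unpruned tree usually reproduces its training labels exactly, which is why accuracy should be measured on a held-out test set rather than on the training data.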

## Preprocessing

- Lowercasing all text
- Removing punctuation, digits, and newlines
- Stopwords removed during vectorization
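
The first two cleaning steps could look roughly like the function below. The name `clean_text` is hypothetical, and two details are assumptions the card does not specify: punctuation and digits are deleted rather than replaced with spaces, and newlines become spaces so that adjacent words do not merge. Stopword removal is left to the vectorizer, as the list states.

```python
import re
import string

# Translation table that deletes every ASCII punctuation mark and digit.
_PUNCT_AND_DIGITS = str.maketrans("", "", string.punctuation + string.digits)


def clean_text(text: str) -> str:
    """Lowercase a message and strip punctuation, digits, and newlines."""
    text = text.lower()
    # Turn line breaks into spaces so neighbouring words stay separate.
    text = text.replace("\r", " ").replace("\n", " ")
    text = text.translate(_PUNCT_AND_DIGITS)
    # Collapse the runs of whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


cleaned = clean_text("WINNER!! Call 0800 123\nto claim your FREE prize.")
```

Deleting punctuation outright turns a word like "don't" into "dont"; replacing it with a space instead would split it into two tokens, so the choice affects the resulting vocabulary.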

## Evaluation Metric

- Accuracy on test set
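
Accuracy is the fraction of test messages whose predicted label matches the true label. The plain-Python sketch below spells that out. The function name and the toy labels are made up for illustration, and for label lists like these it computes the same quantity as scikit-learn's `accuracy_score`.

```python
def accuracy(y_true, y_pred):
    """Return the share of predictions that equal the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 0 is ham, 1 is spam.
y_test = [0, 0, 1, 1, 0]
tahmin = [0, 0, 1, 0, 0]  # one spam message is missed

score = accuracy(y_test, tahmin)
```

Because the comparison is symmetric, swapping the two arguments does not change the result.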

## Notes

- Models saved using joblib.
- For further evaluation, consider precision, recall, and F1-score.
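
To illustrate why the extra metrics matter, the sketch below computes precision, recall, and F1 for the spam class from raw counts. The function name `spam_metrics` and the toy labels are hypothetical, and returning 0.0 when a denominator is zero is a convention chosen here, not something this card specifies.

```python
def spam_metrics(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    # Fall back to 0.0 when a ratio is undefined (assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical imbalanced test set: 8 ham messages, 2 spam messages.
y_test = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# A model that labels everything as ham is 80% accurate but finds no spam.
all_ham = [0] * 10
all_ham_scores = spam_metrics(y_test, all_ham)

# A model that catches one spam message and raises one false alarm.
mixed = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
mixed_scores = spam_metrics(y_test, mixed)
```

Both toy predictors reach 80% accuracy, yet their spam recall differs (0.0 versus 0.5), a difference that accuracy alone does not show.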