{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# IndoHoaxDetector Example Notebook\n",
    "\n",
    "This notebook demonstrates how to use the IndoHoaxDetector model to classify Indonesian news articles as hoax or legitimate.\n",
    "\n",
    "## Setup\n",
    "\n",
    "First, install the required library:\n",
    "```bash\n",
    "pip install scikit-learn\n",
    "```\n",
    "\n",
    "Then place the `logreg_model.pkl` file in the same directory as this notebook."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pickle\n",
    "\n",
    "# Load the model\n",
    "with open('logreg_model.pkl', 'rb') as f:\n",
    "    model = pickle.load(f)\n",
    "\n",
    "print(\"Model loaded successfully!\")\n",
    "print(f\"Model type: {type(model)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example Predictions\n",
    "\n",
    "Let's test the model with some example Indonesian news texts."
   ]
  },
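  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Example texts (kept in Indonesian, the language the model expects)\n",
    "test_texts = [\n",
    "    # English: President Joko Widodo announces a COVID-19 vaccination program for all Indonesians.\n",
    "    \"Presiden Joko Widodo mengumumkan program vaksinasi COVID-19 untuk seluruh masyarakat Indonesia.\",\n",
    "    # English: A living dinosaur was found in Lake Toba, North Sumatra.\n",
    "    \"Ditemukan dinosaurus hidup di Danau Toba, Sumatera Utara.\",\n",
    "    # English: Fuel prices will rise starting next month due to rising world oil prices.\n",
    "    \"Harga BBM akan naik mulai bulan depan akibat kenaikan harga minyak dunia.\",\n",
    "    # English: Drinking lemon juice every day can permanently cure diabetes.\n",
    "    \"Minum jus lemon setiap hari bisa menyembuhkan diabetes secara permanen.\"\n",
    "]\n",
    "\n",
    "# Make predictions (this assumes the pickled object accepts raw text,\n",
    "# e.g. a pipeline that includes its own vectorizer)\n",
    "predictions = model.predict(test_texts)\n",
    "probabilities = model.predict_proba(test_texts)\n",
    "\n",
    "# Display results\n",
    "for i, (text, pred, prob) in enumerate(zip(test_texts, predictions, probabilities), start=1):\n",
    "    label = \"Hoax\" if pred == 1 else \"Legitimate\"\n",
    "    confidence = prob[pred]\n",
    "    print(f\"Example {i}:\")\n",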
    "    print(f\"Text: {text[:80]}{'...' if len(text) > 80 else ''}\")\n",
    "    print(f\"Prediction: {label} (Confidence: {confidence:.4f})\")\n",
    "    print(f\"Probabilities: Legitimate={prob[0]:.4f}, Hoax={prob[1]:.4f}\")\n",
    "    print(\"-\" * 50)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Custom Prediction\n",
    "\n",
    "Try your own text below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Enter your own Indonesian news text here\n",
    "custom_text = \"Masukkan berita Indonesia yang ingin Anda periksa di sini.\"\n",
    "\n",
    "# Make prediction\n",
    "custom_pred = model.predict([custom_text])[0]\n",
    "custom_prob = model.predict_proba([custom_text])[0]\n",
    "\n",
    "custom_label = \"Hoax\" if custom_pred == 1 else \"Legitimate\"\n",
    "custom_confidence = custom_prob[custom_pred]\n",
    "\n",
    "print(f\"Custom Text: {custom_text}\")\n",
    "print(f\"Prediction: {custom_label}\")\n",
    "print(f\"Confidence: {custom_confidence:.4f}\")\n",
    "print(f\"Probabilities: Legitimate={custom_prob[0]:.4f}, Hoax={custom_prob[1]:.4f}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Notes\n",
    "\n",
    "- This model is trained on Indonesian news data and may not work well with other languages.\n",
    "- Always use human judgment and verify information from multiple sources.\n",
    "- The model provides probability scores that can help assess confidence in predictions."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}