theonegareth/IndoHoaxDetector

@@ -14,7 +14,7 @@ tags: ["nlp", "text-classification", "indonesian", "hoax-detection", "machine-le
 # IndoHoaxDetector
-A machine learning model for detecting hoax news articles in Indonesian language. This project uses a logistic regression classifier trained on a dataset of Indonesian news to identify potentially misleading or false information.
 ## Features
@@ -65,10 +65,11 @@ print(prediction)  # 0 for legitimate, 1 for hoax
 ## Limitations
-- Trained specifically on Indonesian news
 - May not perform well on other languages or domains
 - Accuracy depends on the quality and representativeness of training data
-- False positives/negatives possible
 ## Contributing

 # IndoHoaxDetector
+A machine learning model for detecting hoax-style news articles in Indonesian language. This project uses a logistic regression classifier trained on linguistic features of Indonesian news to identify articles written in a style typical of hoaxes or fake news, **not to verify factual accuracy**. It analyzes writing patterns, sensationalism, and other stylistic indicators rather than checking the truthfulness of the content.
 ## Features
 ## Limitations
+- **Stylistic Analysis Only**: This model detects hoax-like writing style, not factual accuracy. A legitimate article could be flagged as hoax if written sensationally, and vice versa.
+- Trained specifically on Indonesian news linguistic patterns
 - May not perform well on other languages or domains
 - Accuracy depends on the quality and representativeness of training data
+- False positives/negatives possible due to stylistic variations
 ## Contributing

app.py CHANGED

@@ -60,7 +60,7 @@ demo = gr.Interface(
     ),
     outputs=gr.Markdown(label="Detection Result"),
     title="IndoHoaxDetector",
-    description="Detect hoax news in Indonesian language using machine learning. Enter news text to check if it's likely legitimate or a hoax.",
     examples=[
         ["Presiden mengumumkan program bantuan sosial untuk masyarakat miskin di seluruh Indonesia."],
         ["Ditemukan cara ampuh menghilangkan stres hanya dengan minum air putih 2 liter sehari."],

     ),
     outputs=gr.Markdown(label="Detection Result"),
     title="IndoHoaxDetector",
+    description="**Stylistic Analysis Tool**: Detects if Indonesian news text is written in a hoax-like style using machine learning. This analyzes writing patterns and sensationalism, **not factual accuracy**. Results indicate writing style similarity to known hoaxes, not truth verification.",
     examples=[
         ["Presiden mengumumkan program bantuan sosial untuk masyarakat miskin di seluruh Indonesia."],
         ["Ditemukan cara ampuh menghilangkan stres hanya dengan minum air putih 2 liter sehari."],

modelcard.md CHANGED

@@ -3,7 +3,7 @@
 ## Model Details
 ### Model Description
-IndoHoaxDetector is a binary classification model designed to detect hoax news articles in the Indonesian language. It uses logistic regression trained on a dataset of Indonesian news to classify text as either legitimate or hoax.
 - **Developed by**: Gareth Aurelius Harrison
 - **Model type**: Logistic Regression (scikit-learn)
@@ -18,7 +18,7 @@ IndoHoaxDetector is a binary classification model designed to detect hoax news a
 ## Uses
 ### Direct Use
-This model can be used to analyze Indonesian news articles and determine if they are likely to be hoaxes. It is intended for educational, research, and journalistic purposes to help identify potentially misleading information.
 ### Downstream Use
 - News verification tools
@@ -42,6 +42,7 @@ Users should be aware that this model:
 - Requires human verification for critical applications
 ### Known Limitations
 - **Data Bias**: The model is trained on a limited dataset; performance may vary with different topics or writing styles
 - **Language Specificity**: Only works for Indonesian text
 - **Temporal Limitations**: News patterns change over time; the model may become less accurate with newer data

 ## Model Details
 ### Model Description
+IndoHoaxDetector is a binary classification model designed to detect hoax-style news articles in the Indonesian language. It uses logistic regression trained on linguistic features of Indonesian news to classify text as either legitimate or hoax-like writing. **This model analyzes writing style and patterns, not factual accuracy or truthfulness of the content.**
 - **Developed by**: Gareth Aurelius Harrison
 - **Model type**: Logistic Regression (scikit-learn)
 ## Uses
 ### Direct Use
+This model can be used to analyze Indonesian news articles and determine if they are written in a hoax-like style. It identifies linguistic patterns typical of fake news but does **not verify factual accuracy**. It is intended for educational, research, and journalistic purposes to help identify potentially sensational or misleading writing styles.
 ### Downstream Use
 - News verification tools
 - Requires human verification for critical applications
 ### Known Limitations
+- **Stylistic vs Factual Analysis**: This model detects writing style typical of hoaxes, not factual inaccuracies. Legitimate news written sensationally may be flagged as hoax, and factual hoaxes written professionally may be missed.
 - **Data Bias**: The model is trained on a limited dataset; performance may vary with different topics or writing styles
 - **Language Specificity**: Only works for Indonesian text
 - **Temporal Limitations**: News patterns change over time; the model may become less accurate with newer data
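
The reworded `description` in app.py says results indicate similarity to hoax writing style, not truth verification. One way the `gr.Markdown` output could carry that same caveat is sketched below. This is a minimal sketch, not code from this commit: `format_result` and its `hoax_probability` argument are hypothetical names, and only the 0/1 label convention (0 for legitimate, 1 for hoax) comes from the README.

```python
def format_result(prediction: int, hoax_probability: float) -> str:
    """Render a style-only verdict as Markdown (hypothetical helper).

    prediction: 0 for legitimate, 1 for hoax (label convention from the README).
    hoax_probability: assumed score in [0, 1] for the hoax-like class.
    """
    if prediction not in (0, 1):
        raise ValueError("prediction must be 0 or 1")
    if not 0.0 <= hoax_probability <= 1.0:
        raise ValueError("hoax_probability must be between 0 and 1")

    # Describe the writing style, never the truth of the claims.
    style = "hoax-like" if prediction == 1 else "legitimate"
    return (
        f"**Writing style:** {style}\n\n"
        f"**Similarity to known hoax style:** {hoax_probability:.0%}\n\n"
        "_This result reflects writing style only, not factual accuracy._"
    )


print(format_result(1, 0.87))
```

Keeping the caveat in the result string itself means it stays visible next to every prediction, not only in the interface description.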