Spaces: pravjet/new_mis_info

pravjet committed on Sep 22, 2025

Commit 5c9da83 (verified)

1 Parent(s): 1d7ea27

Upload 3 files

Files changed (3)

README.md +28 -13
app.py +63 -0
requirements.txt +5 -0

README.md CHANGED

@@ -1,13 +1,28 @@
----
-title: New Mis Info
-emoji: 💻
-colorFrom: gray
-colorTo: red
-sdk: gradio
-sdk_version: 5.46.1
-app_file: app.py
-pinned: false
-license: mit
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Misinformation Detection Dashboard (Gradio)
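+This Gradio app scores news articles for possible misinformation using a DistilBERT classifier (`app.py` loads the `distilbert-base-uncased-finetuned-sst-2-english` checkpoint, which is fine-tuned on SST-2 sentiment data).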
+## Features
+- Paste a news article URL
+- Extracts the article text and detects its language automatically
+- Translates Russian articles to English using Facebook's FSMT
+- Predicts a fakeness score (as a percentage) using DistilBERT
+- Displays a trust verdict based on the score
+## Files
+- `app.py`: Main Gradio interface and model logic
+- `requirements.txt`: Python dependencies
+- `README.md`: Project overview
+## Deployment (Hugging Face Spaces)
+1. Create a new Space using the Gradio SDK
+2. Upload `app.py`, `requirements.txt`, and `README.md`
+3. Hugging Face builds and launches the app automatically
+## Requirements
+- Python 3.8+
+- Internet access for model downloads
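
The verdict step listed under Features can be looked at in isolation. The following is a minimal sketch, assuming the 70 and 40 percent cut-offs that `app.py` in this commit uses; the function name `verdict_for` is hypothetical and does not appear in the commit.

```python
def verdict_for(percentage: int) -> str:
    """Map a fakeness percentage (0-100) to a verdict string.

    Assumption: thresholds mirror the if/elif chain in app.py
    (strictly greater than 70, strictly greater than 40).
    """
    if percentage > 70:
        return "We would not trust this text!"
    if percentage > 40:
        return "We are not sure about this text!"
    return "We would trust this text!"


if __name__ == "__main__":
    for value in (10, 40, 41, 70, 71):
        print(value, verdict_for(value))
```

Because both comparisons are strict, a score of exactly 70 falls in the "not sure" band and a score of exactly 40 falls in the "trust" band.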

app.py ADDED

@@ -0,0 +1,63 @@

+import gradio as gr
+from transformers import AutoTokenizer, AutoModelForSequenceClassification, FSMTForConditionalGeneration, FSMTTokenizer
+from newspaper import Article
+from langdetect import detect
+import torch
+# Load models
+classification_model_name = 'distilbert-base-uncased-finetuned-sst-2-english'
+translation_model_name = 'facebook/wmt19-ru-en'
+classification_model = AutoModelForSequenceClassification.from_pretrained(classification_model_name)
+classification_tokenizer = AutoTokenizer.from_pretrained(classification_model_name)
+translation_model = FSMTForConditionalGeneration.from_pretrained(translation_model_name)
+translation_tokenizer = FSMTTokenizer.from_pretrained(translation_model_name)
+def analyze_article(url):
+    try:
+        article = Article(url)
+        article.download()
+        article.parse()
+        text = article.title + '. ' + article.text
+        lang = detect(text)
+    except Exception as e:
+        return f"Error: {e}", "", "", 0, "error"
+    translated_text = ""
+    if lang == 'ru':
+        input_ids = translation_tokenizer.encode(text, return_tensors="pt", max_length=512, truncation=True)
+        outputs = translation_model.generate(input_ids)
+        translated_text = translation_tokenizer.decode(outputs[0], skip_special_tokens=True)
+        text = translated_text
+    tokens = classification_tokenizer(text, truncation=True, return_tensors="pt")
+    with torch.no_grad():
+        outputs = classification_model(**tokens)
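+    # Index 1 is the POSITIVE label of the SST-2 checkpoint loaded above; the app treats its probability as the fakeness score.
+    score = torch.nn.functional.softmax(outputs.logits[0], dim=0)[1].item()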
+    percentage = int(score * 100)
+    if percentage > 70:
+        status = "We would not trust this text!"
+    elif percentage > 40:
+        status = "We are not sure about this text!"
+    else:
+        status = "We would trust this text!"
+    return text[:1000], lang.upper(), (translated_text[:1000] if translated_text else "Not required"), percentage, status
+demo = gr.Interface(
+    fn=analyze_article,
+    inputs=gr.Textbox(label="Enter Article URL"),
+    outputs=[
+        gr.Textbox(label="Extracted or Translated Text"),
+        gr.Textbox(label="Detected Language"),
+        gr.Textbox(label="Translated Text (if Russian)"),
+        gr.Number(label="Fakeness Score (%)"),
+        gr.Textbox(label="Trust Verdict")
+    ],
+    title="🧠 Misinformation Detection Dashboard",
+    description="Paste a news article URL to detect language, translate if needed, and predict fakeness using a fine-tuned DistilBERT model."
+)
+demo.launch()
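
The scoring step in `analyze_article` applies a softmax to the two output logits and reports the class-1 probability as a whole percentage. The following is a minimal sketch of that arithmetic in plain Python, without `torch`, so it can be checked by hand; the function name `fakeness_percentage` is hypothetical and not part of the commit.

```python
import math
from typing import Sequence


def fakeness_percentage(logits: Sequence[float]) -> int:
    """Softmax over the logits, then the class-1 probability as a whole percent.

    Assumption: class index 1 is the one reported, as in app.py.
    """
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    score = exps[1] / sum(exps)
    return int(score * 100)  # int() truncates toward zero, as in app.py


if __name__ == "__main__":
    print(fakeness_percentage([0.0, 0.0]))
    print(fakeness_percentage([2.0, 0.0]))
```

Since `int()` truncates rather than rounds, a probability of 0.709 is reported as 70, which lands in the "not sure" band rather than the "not trust" band.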

requirements.txt ADDED

@@ -0,0 +1,5 @@

+gradio
+transformers
+torch
+newspaper3k
+langdetect
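
The dependency list above carries no version pins, so a rebuild may resolve to different releases. The following is a minimal sketch, using only the standard library, that reports which versions are installed locally so the entries could be pinned; `installed_versions` is a hypothetical helper and not part of this commit.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Return {distribution name: installed version, or None if missing}."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # not installed in this environment
    return found


if __name__ == "__main__":
    # Names taken from requirements.txt in this commit.
    names = ["gradio", "transformers", "torch", "newspaper3k", "langdetect"]
    for name, ver in installed_versions(names).items():
        print(f"{name}=={ver}" if ver else f"{name} (not installed)")
```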