---
title: TruthLens
emoji: 🔎
colorFrom: blue
colorTo: green
sdk: streamlit
app_file: app/app.py
pinned: false
---

# TruthLens: A Fact-Checking Assistant with Linguistic Understanding

TruthLens is an NLP-powered fact-checking application that uses transformer models fine-tuned on fact-verification datasets to assess the truthfulness of textual claims, with particular attention to linguistic nuance and semantic structure.

## Overview

This application leverages pre-trained language models to analyze claims and classify them based on their factuality. Three different models are implemented to explore how different architectures handle the linguistic complexities of factual statements:

- DistilBERT (FEVER dataset): A lightweight model trained on the Fact Extraction and VERification dataset
- RoBERTa (FEVER dataset): A more robust model also trained on the FEVER dataset
- DeBERTa (LIAR dataset): An advanced model trained on the LIAR political fact-checking dataset
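All three checkpoints can be served through the Hugging Face `transformers` pipeline API. The sketch below shows one way to wire the sidebar names to checkpoints; the repository ids are placeholders, since the actual model ids are not listed in this README.

```python
# Display name -> Hugging Face repo id.
# The repo ids below are PLACEHOLDERS, not the Space's real checkpoints.
MODELS = {
    "DistilBERT (FEVER)": "your-username/distilbert-fever",
    "RoBERTa (FEVER)": "your-username/roberta-fever",
    "DeBERTa (LIAR)": "your-username/deberta-liar",
}


def load_checker(display_name: str):
    """Build a text-classification pipeline for the chosen model.

    `transformers` is imported lazily because it is a heavy dependency
    and downloading the checkpoint happens at runtime.
    """
    from transformers import pipeline

    repo_id = MODELS[display_name]
    # top_k=None returns scores for every label, not just the argmax.
    return pipeline("text-classification", model=repo_id, top_k=None)
```

In the app, the result of `load_checker` would typically be cached (e.g. with `st.cache_resource`) so each model is downloaded only once per session.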

## Linguistic Analysis Capabilities

TruthLens goes beyond simple fact verification by examining how different models process complex linguistic phenomena:

- Negation handling: Assessing how models interpret "not," "never," and other negative constructions
- Modal verbs: Analyzing treatment of uncertainty markers like "might," "could," and "should"
- Epistemic modality: Examining expressions of certainty, possibility, and probabilistic statements
- Conditional constructions: Evaluating how if-then relationships and hypotheticals are processed
- Intensifiers and hedges: Testing the impact of modifiers like "literally," "very," and "somewhat"
- Nested propositional structures: Measuring comprehension of claims embedded within other claims
- Comparative and superlative statements: Analyzing how relative and absolute comparisons are interpreted
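A simple way to probe several of these phenomena is to derive minimal pairs from a claim, varying one construction at a time. The string transformations below are illustrative only; they are not the app's actual probe set, and the naive negation rule only handles simple copular sentences.

```python
def make_variants(claim: str) -> dict:
    """Build minimal-pair variants of a claim, each altering one
    linguistic dimension. Naive string edits, for illustration only;
    assumes a non-empty declarative sentence.
    """
    words = claim.split()

    # Negation: insert "not" after the first copula, if one is present.
    negated = claim
    for aux in ("is", "are", "was", "were"):
        if aux in words:
            i = words.index(aux)
            negated = " ".join(words[: i + 1] + ["not"] + words[i + 1 :])
            break

    lowered = claim[0].lower() + claim[1:]
    return {
        "original": claim,
        "negated": negated,                                   # negation handling
        "modal": "It might be the case that " + lowered,      # modal verbs
        "hedged": "Some sources suggest that " + lowered,     # hedges
        "intensified": claim + " This is literally true.",    # intensifiers
    }
```

Feeding each variant to the same model and comparing predictions shows whether, say, inserting "not" actually flips the verdict.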

## Technical Implementation

### Models

The models were fine-tuned on fact-checking datasets using the Hugging Face `transformers` library:

- FEVER Dataset: Contains 185,445 claims labeled SUPPORTS, REFUTES, or NOT ENOUGH INFO
- LIAR Dataset: Contains 12,836 political statements with six fine-grained truthfulness labels

The models are hosted on Hugging Face and loaded directly into the application at runtime.
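Because the two datasets use different label inventories, the app has to map each model's output index back to its dataset's labels. The orderings below reflect the datasets' published label sets; the id-to-label ordering itself is an assumption and should match each checkpoint's `config.json` `id2label` mapping.

```python
# FEVER's three verification labels.
FEVER_LABELS = ["SUPPORTS", "REFUTES", "NOT ENOUGH INFO"]

# LIAR's six fine-grained truthfulness labels.
LIAR_LABELS = [
    "pants-fire", "false", "barely-true",
    "half-true", "mostly-true", "true",
]


def id_to_label(index: int, dataset: str) -> str:
    """Map a predicted class index to a human-readable label.

    `dataset` is "fever" or "liar"; the index ordering is assumed to
    match the checkpoint's id2label config.
    """
    labels = FEVER_LABELS if dataset == "fever" else LIAR_LABELS
    return labels[index]
```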

### Application

The application is built with:

- Streamlit: For the interactive web interface
- PyTorch: For model inference
- Transformers: For loading and utilizing the fine-tuned models
- Hugging Face Spaces: For deployment and hosting
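A stripped-down version of the interface might look like the sketch below. It is not the Space's actual `app/app.py`: `run_model` is a stub standing in for real pipeline inference, and the Streamlit calls are wrapped in `main()` only so the pure helpers can be exercised on their own.

```python
def render_verdict(label: str, score: float) -> str:
    """Format a prediction and its confidence for display."""
    return f"{label} ({score:.1%} confidence)"


def run_model(model_name: str, claim: str):
    """Stub for real pipeline inference; always abstains.

    A real implementation would call the loaded text-classification
    pipeline and return its top label and score.
    """
    return "NOT ENOUGH INFO", 0.50


def main() -> None:
    # Streamlit apps are normally top-level scripts; imported lazily
    # here so the helpers above stay importable without streamlit.
    import streamlit as st

    st.title("TruthLens 🔎")
    model_name = st.sidebar.selectbox(
        "Model",
        ["DistilBERT (FEVER)", "RoBERTa (FEVER)", "DeBERTa (LIAR)"],
    )
    claim = st.text_area("Enter a claim to fact-check")
    if st.button("Check Fact") and claim:
        label, score = run_model(model_name, claim)
        st.success(render_verdict(label, score))
```

Running `streamlit run app/app.py` executes the script top to bottom; in a real app the body of `main()` would sit at module level.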

## Linguistic Phenomena Evaluation

TruthLens specifically examines model behavior across these linguistic constructions:

- Basic Facts vs. Complex Assertions: Comparing performance on simple statements versus compound or complex sentences
- Negation Scope: Assessing whether models understand the scope of negation within sentences
- Modal Semantics: Evaluating if models distinguish between epistemic possibility, permission, and obligation
- Ambiguity Resolution: Testing how models handle lexical and structural ambiguities
- Hedged Claims: Analyzing recognition of uncertainty markers and their effect on truthfulness assessment
- Presuppositions: Examining how models handle implicit assumptions within claims
- Figurative Language: Testing literal versus non-literal interpretation of metaphorical statements
- Subjective vs. Objective Claims: Measuring distinction between verifiable facts and expressions of opinion
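Behavior on these constructions can be quantified by running a model on (original, transformed) claim pairs and measuring how often its label flips. The helper below takes any callable from claim text to label, so it works with a real pipeline or, as in the usage note, a toy stand-in.

```python
def flip_rate(classify, pairs) -> float:
    """Fraction of (original, transformed) claim pairs on which the
    classifier's label changes.

    `classify` is any callable mapping a claim string to a label;
    `pairs` is an iterable of (original, transformed) claim strings.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    flips = sum(1 for original, transformed in pairs
                if classify(original) != classify(transformed))
    return flips / len(pairs)
```

For example, a toy classifier that answers "REFUTES" whenever the word "not" appears (and "SUPPORTS" otherwise) has a flip rate of 0.5 on the pairs `("The sky is blue.", "The sky is not blue.")` and `("Cats are mammals.", "Cats are mammals, allegedly.")`, since only the first transformation changes its label.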

## NLP Coursework Project

This project was developed as part of an NLP coursework assessment, focusing on the application of computational linguistics and transformer-based language models to fact verification. It demonstrates the intersection of natural language processing, computational semantics, and information verification systems.

The research specifically explores how different transformer architectures handle linguistic nuances that humans process naturally but that remain challenging for AI systems, providing insight into both the capabilities and the limitations of current NLP approaches to automated fact-checking.

## Usage

  1. Select a fact-checking model from the sidebar
  2. Enter a claim or select an example
  3. Click "Check Fact" to analyze the claim
  4. Review the prediction and confidence scores
  5. Examine the detailed linguistic analysis breakdown
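The confidence scores shown in step 4 are softmax probabilities computed over the model's raw output logits. A pure-Python sketch of that conversion:

```python
import math


def softmax(logits):
    """Convert raw model logits to a probability distribution.

    Subtracting the max logit before exponentiating keeps exp()
    numerically stable for large logits.
    """
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

The resulting probabilities sum to 1, and the largest logit always yields the largest probability, which becomes the displayed confidence for the predicted label.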

## Future Linguistic Research Directions

- Development of models with enhanced pragmatic understanding
- Integration of discourse analysis for contextual claim verification
- Cross-linguistic adaptation for fact-checking in multiple languages
- Improved recognition of rhetorical devices and their impact on factuality
- Semantic frame analysis for better understanding of claim structures
- Temporal reasoning for evolving truths and time-dependent facts

## Author

Malorie Iovino

## License

This project is available for educational and linguistic research purposes.