Spaces: Marius16 / FakeNews-XAI / evaluation

2.46 MB

1 contributor

History: 17 commits

Marius16

Set FAKE_THRESHOLD=0.75, benchmark results P=0.909 R=0.227 F1=0.364 Acc=0.650 FP=1 FN=34

f734793 13 days ago
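
The figures in this commit message can be cross-checked from the confusion counts. The sketch below is not taken from the repository: `binary_metrics` is a hypothetical helper, and only FP=1 and FN=34 come from the message. TP=10 and TN=55 are inferred here as the only integer counts consistent with the reported P, R and Acc, which would imply 100 evaluated articles.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Precision, recall, F1 and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# FP and FN are quoted from the commit message.
# TP and TN are ASSUMED (inferred from the reported P, R and Acc).
metrics = binary_metrics(tp=10, fp=1, fn=34, tn=55)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these assumed counts the rounded values agree with the reported P=0.909, R=0.227, F1=0.364 and Acc=0.650. The pattern is high precision with low recall: at this threshold only one real article is flagged, but most fake ones are missed.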

figures: Tests + refactor (about 1 month ago)
results: Set FAKE_THRESHOLD=0.75, benchmark results P=0.909 R=0.227 F1=0.364 Acc=0.650 FP=1 FN=34 (13 days ago)
benchmark_articles.json (125 kB): Tests + refactor (about 2 months ago)
benchmark_articles_extended.json (220 kB): RSS verification panel, benchmark split (100/150), remove Ollama/REBEL from frontend, sync Qwen3 labels (21 days ago)
compare_pipelines.py (5.96 kB): Add compare_pipelines.py - batch evaluation A vs B (3 months ago)
run_baseline.py (6.07 kB): threshold sweep + TF-IDF/RF/SVM baselines, event-date KG +64 canonical events (1169->1233 facts), V8 action-before-office check, _check_event_date() in external.py (13 days ago)
run_benchmark.py (12.1 kB): Replace Ollama/llama3 with spacy-llm + Qwen3-1.7B (Pipeline B + Explainer), archive REBEL (F1=0.038), fix TCS n_claims=0 score, add ISOT/RAGuard eval scripts, update start.sh and requirements (30 days ago)
run_evaluation.py (21.7 kB): Tests + refactor (about 1 month ago)
run_extraction_benchmark.py (15.1 kB): Replace Ollama/llama3 with spacy-llm + Qwen3-1.7B (Pipeline B + Explainer), archive REBEL (F1=0.038), fix TCS n_claims=0 score, add ISOT/RAGuard eval scripts, update start.sh and requirements (30 days ago)
run_isot_eval.py (13.6 kB): DATE_TOLERANCE 400->200, cross-entity overlap check with same_role_category filter, Reference KG +22 manual facts (Nixon/Ford/LBJ/JFK/Truman/Blair/Thatcher/Starmer/Macron/Tusk + legislation); F1 0.345->0.393, Recall 0.227->0.273, Sep +0.129->+0.151 (20 days ago)
run_politifact_eval.py (18.2 kB): Tests + refactor (about 2 months ago)
run_raguard_eval.py (16.8 kB): DATE_TOLERANCE 400->200, cross-entity overlap check with same_role_category filter, Reference KG +22 manual facts (Nixon/Ford/LBJ/JFK/Truman/Blair/Thatcher/Starmer/Macron/Tusk + legislation); F1 0.345->0.393, Recall 0.227->0.273, Sep +0.129->+0.151 (20 days ago)
run_threshold_sweep.py (6.68 kB): threshold sweep + TF-IDF/RF/SVM baselines, event-date KG +64 canonical events (1169->1233 facts), V8 action-before-office check, _check_event_date() in external.py (13 days ago)
test_wikidata_inverse.py (6.14 kB): Wikidata inverse, Neo4j cache decoupled, auto-title, cross-article button, batch checkboxes, threshold 0.55 (about 2 months ago)
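
The commit messages above mention a threshold sweep and two decision thresholds (0.55 and 0.75). The following is a minimal sketch of what such a sweep might look like. It is not the contents of run_threshold_sweep.py: `sweep_thresholds`, the example scores and the example labels are all hypothetical, and it assumes an article is flagged as fake when its score is at or above the threshold.

```python
def sweep_thresholds(scores, labels, thresholds):
    """Return (threshold, precision, recall, f1) for each candidate threshold.

    scores: per-article fake scores in [0, 1] (hypothetical values)
    labels: True where the article is actually fake
    """
    rows = []
    for t in thresholds:
        # ASSUMPTION: score >= threshold means "predicted fake".
        preds = [s >= t for s in scores]
        tp = sum(p and y for p, y in zip(preds, labels))
        fp = sum(p and not y for p, y in zip(preds, labels))
        fn = sum((not p) and y for p, y in zip(preds, labels))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append((t, precision, recall, f1))
    return rows


# Made-up data for illustration only; not drawn from the benchmark files.
scores = [0.92, 0.81, 0.77, 0.60, 0.58, 0.40, 0.30, 0.10]
labels = [True, True, False, True, False, True, False, False]

rows = sweep_thresholds(scores, labels, thresholds=[0.55, 0.75])
best = max(rows, key=lambda r: r[3])  # pick the threshold with the highest F1
for t, p, r, f1 in rows:
    print(f"threshold={t:.2f}  P={p:.3f}  R={r:.3f}  F1={f1:.3f}")
```

On this toy data, raising the threshold from 0.55 to 0.75 trades recall for precision, which is the same direction of change the FAKE_THRESHOLD=0.75 commit reports.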