---
license: mit
datasets:
- copenlu/mm-framing
---

RoBERTa topic classifier, used for topic injection into the Longformer Framing Classifier.

Classifies input text into one of 19 discrete topics:

1. Business & Economy
2. Crime & Safety
3. Disaster & Accidents
4. Education
5. Entertainment
6. Environment & Nature
7. Health
8. Immigration
9. Infrastructure & Transport
10. Legal
11. Lifestyle & Culture
12. Media
13. Other/Unknown
14. Politics
15. Science & Technology
16. Social Issues
17. Sports
18. War & Conflict
19. Weather

These topics were derived empirically by consolidating the unstructured gpt_topic field from the mm_framing silver dataset into discrete categories based on similarity.

The model achieved 76.4% validation accuracy on 64,000 examples, which was deemed sufficient for assisting domain-specific reasoning in the downstream model.
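
As a rough illustration of how the classifier's output could feed the downstream model, the sketch below maps a 19-way logit vector to a topic name and prepends that topic to the input text. It rests on two assumptions that this card does not confirm: that class index `i` corresponds to the i-th topic in the list above (check the model's `id2label` config before relying on this), and that injection works by prefixing the text. The `[TOPIC: ...]` format and the function names `topic_from_logits` and `inject_topic` are hypothetical.

```python
import math

# Topic names in the order listed in this card.
# ASSUMPTION: class index i maps to TOPICS[i]; verify against the model's id2label.
TOPICS = [
    "Business & Economy",
    "Crime & Safety",
    "Disaster & Accidents",
    "Education",
    "Entertainment",
    "Environment & Nature",
    "Health",
    "Immigration",
    "Infrastructure & Transport",
    "Legal",
    "Lifestyle & Culture",
    "Media",
    "Other/Unknown",
    "Politics",
    "Science & Technology",
    "Social Issues",
    "Sports",
    "War & Conflict",
    "Weather",
]


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def topic_from_logits(logits):
    """Return (topic_name, probability) for the highest-scoring class."""
    if len(logits) != len(TOPICS):
        raise ValueError(f"expected {len(TOPICS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return TOPICS[best], probs[best]


def inject_topic(text, topic):
    """HYPOTHETICAL injection format: prefix the text with the predicted topic."""
    return f"[TOPIC: {topic}] {text}"


# Example with made-up logits that favour index 6.
example_logits = [0.0] * len(TOPICS)
example_logits[6] = 5.0
topic, confidence = topic_from_logits(example_logits)
framed_input = inject_topic("Hospitals report a rise in flu cases.", topic)
```

In practice the logits would come from the classifier's forward pass. Returning the probability alongside the label lets a caller fall back to "Other/Unknown" when confidence is low, which may be worth considering given the reported 76.4% validation accuracy.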