---
license: mit
datasets:
- copenlu/mm-framing
---
A RoBERTa topic classifier used for topic injection into the Longformer Framing Classifier. It classifies input text into one of 19 discrete topics:
1. Business & Economy
2. Crime & Safety
3. Disaster & Accidents
4. Education
5. Entertainment
6. Environment & Nature
7. Health
8. Immigration
9. Infrastructure & Transport
10. Legal
11. Lifestyle & Culture
12. Media
13. Other/Unknown
14. Politics
15. Science & Technology
16. Social Issues
17. Sports
18. War & Conflict
19. Weather
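
As a rough illustration of how the label set above might be used, the sketch below maps a vector of 19 class scores to a topic name and prepends it to the input text. It rests on two assumptions that this card does not confirm: that class indices follow the order of the list above (0-based), and that topics are injected as a plain-text prefix. `top_topic` and `inject_topic` are hypothetical helper names, not part of the released model.

```python
# Assumption: class index i corresponds to item i+1 in the list above.
TOPICS = [
    "Business & Economy", "Crime & Safety", "Disaster & Accidents",
    "Education", "Entertainment", "Environment & Nature", "Health",
    "Immigration", "Infrastructure & Transport", "Legal",
    "Lifestyle & Culture", "Media", "Other/Unknown", "Politics",
    "Science & Technology", "Social Issues", "Sports",
    "War & Conflict", "Weather",
]


def top_topic(scores):
    """Return (topic, score) for the highest-scoring class."""
    if len(scores) != len(TOPICS):
        raise ValueError(f"expected {len(TOPICS)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=scores.__getitem__)
    return TOPICS[best], scores[best]


def inject_topic(text, topic):
    """Hypothetical injection format: a bracketed topic prefix."""
    return f"[TOPIC: {topic}] {text}"


if __name__ == "__main__":
    # Dummy scores standing in for the classifier's output.
    scores = [0.0] * len(TOPICS)
    scores[13] = 0.9
    topic, _ = top_topic(scores)
    print(inject_topic("Parliament passed the budget bill.", topic))
```

In practice the mapping from index to label should be read from the model's own configuration rather than hard-coded, since the ordering used here is only a guess.
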

These topics were derived empirically by consolidating the unstructured `gpt_topic` field from the copenlu/mm-framing silver dataset into discrete categories based on similarity.

The classifier achieved 76.4% validation accuracy on 64,000 examples, which was deemed sufficient for assisting domain-specific reasoning in the downstream model.