| license: mit | |
| datasets: | |
| - copenlu/mm-framing | |
| A RoBERTa topic classifier used for topic injection into the Longformer Framing Classifier. It classifies input text into one of 19 discrete topics: | |
| 1. Business & Economy | |
| 2. Crime & Safety | |
| 3. Disaster & Accidents | |
| 4. Education | |
| 5. Entertainment | |
| 6. Environment & Nature | |
| 7. Health | |
| 8. Immigration | |
| 9. Infrastructure & Transport | |
| 10. Legal | |
| 11. Lifestyle & Culture | |
| 12. Media | |
| 13. Other/Unknown | |
| 14. Politics | |
| 15. Science & Technology | |
| 16. Social Issues | |
| 17. Sports | |
| 18. War & Conflict | |
| 19. Weather | |
| These topics were derived empirically by consolidating the unstructured gpt_topic field from the mm_framing silver dataset into discrete categories based on similarity. | |
| The classifier achieved 76.4% validation accuracy on 64,000 examples, which was deemed sufficient for assisting domain-specific reasoning in the downstream model. |
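The sketch below shows one way the classifier's output could be turned into a topic name and attached to the input text. It rests on two assumptions that this card does not confirm: that class index i (0-based) corresponds to item i+1 of the topic list above, and that topic injection is done by prepending a tag to the text. The helper names `decode_topic` and `inject_topic` and the tag format are hypothetical, so check the model's own label mapping before relying on the ordering.

```python
from typing import Sequence

# Topic names in the order they are listed in this card.
# ASSUMPTION: class index i (0-based) maps to item i+1 of the list;
# the actual mapping should be read from the model's config.
TOPICS = [
    "Business & Economy",
    "Crime & Safety",
    "Disaster & Accidents",
    "Education",
    "Entertainment",
    "Environment & Nature",
    "Health",
    "Immigration",
    "Infrastructure & Transport",
    "Legal",
    "Lifestyle & Culture",
    "Media",
    "Other/Unknown",
    "Politics",
    "Science & Technology",
    "Social Issues",
    "Sports",
    "War & Conflict",
    "Weather",
]


def decode_topic(logits: Sequence[float]) -> str:
    """Return the topic name with the highest score."""
    if len(logits) != len(TOPICS):
        raise ValueError(f"expected {len(TOPICS)} scores, got {len(logits)}")
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return TOPICS[best_index]


def inject_topic(text: str, topic: str) -> str:
    """Hypothetical injection format: prepend the topic as a plain-text tag."""
    return f"[TOPIC: {topic}] {text}"


# Made-up scores for illustration only; index 6 is "Health" under the assumed ordering.
example_logits = [0.0] * len(TOPICS)
example_logits[6] = 3.2

topic = decode_topic(example_logits)
print(inject_topic("Hospitals report a rise in flu cases.", topic))
```

In practice the scores would come from the classifier's forward pass rather than a hand-written list; the decoding step stays the same.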