---
license: mit
datasets:
- copenlu/mm-framing
---

RoBERTa topic classifier used for topic injection into the Longformer Framing Classifier. It classifies input text into one of 19 discrete topics:

  1. Business & Economy
  2. Crime & Safety
  3. Disaster & Accidents
  4. Education
  5. Entertainment
  6. Environment & Nature
  7. Health
  8. Immigration
  9. Infrastructure & Transport
  10. Legal
  11. Lifestyle & Culture
  12. Media
  13. Other/Unknown
  14. Politics
  15. Science & Technology
  16. Social Issues
  17. Sports
  18. War & Conflict
  19. Weather
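
As a minimal sketch, the 19 topics above can be held in an id-to-label mapping and used to turn a vector of classifier logits into a topic name. The zero-based index order shown here simply follows the list above and is an assumption; the order stored in the actual checkpoint's config is not stated in this card, and `predict_topic` is a hypothetical helper, not part of the released model.

```python
# Topic names copied from the list above.
# ASSUMPTION: index order follows the list; the checkpoint's own id2label may differ.
TOPICS = [
    "Business & Economy",
    "Crime & Safety",
    "Disaster & Accidents",
    "Education",
    "Entertainment",
    "Environment & Nature",
    "Health",
    "Immigration",
    "Infrastructure & Transport",
    "Legal",
    "Lifestyle & Culture",
    "Media",
    "Other/Unknown",
    "Politics",
    "Science & Technology",
    "Social Issues",
    "Sports",
    "War & Conflict",
    "Weather",
]

# Zero-based ids, so "Business & Economy" is 0 and "Weather" is 18.
ID2LABEL = dict(enumerate(TOPICS))
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def predict_topic(logits):
    """Return the topic name with the highest logit (hypothetical helper)."""
    if len(logits) != len(TOPICS):
        raise ValueError(f"expected {len(TOPICS)} logits, got {len(logits)}")
    best = max(range(len(logits)), key=lambda i: logits[i])
    return ID2LABEL[best]
```

For example, passing 19 logits where index 6 is the largest yields "Health" under this assumed ordering.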

  These topics were derived empirically by consolidating the unstructured gpt_topic field from the copenlu/mm-framing silver dataset into discrete categories based on similarity.
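
The card does not say how similarity was measured, so the following is only an illustrative sketch of the general idea: mapping a free-text topic string to the closest category name. It uses character-level similarity from Python's `difflib` as a stand-in, which is an assumption and not necessarily the method used to build the training labels. The `consolidate` function and its `threshold` parameter are hypothetical.

```python
from difflib import SequenceMatcher

# Category names copied from the list above.
CATEGORIES = [
    "Business & Economy", "Crime & Safety", "Disaster & Accidents",
    "Education", "Entertainment", "Environment & Nature", "Health",
    "Immigration", "Infrastructure & Transport", "Legal",
    "Lifestyle & Culture", "Media", "Other/Unknown", "Politics",
    "Science & Technology", "Social Issues", "Sports",
    "War & Conflict", "Weather",
]


def consolidate(raw_topic: str, threshold: float = 0.6) -> str:
    """Map a free-text topic to the most similar category name.

    ASSUMPTION: string similarity is a stand-in for whatever similarity
    criterion was actually used; the threshold value is arbitrary.
    """
    cleaned = raw_topic.strip().lower()
    best_name, best_score = "Other/Unknown", 0.0
    for name in CATEGORIES:
        score = SequenceMatcher(None, cleaned, name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    # Anything without a sufficiently close match falls into the catch-all.
    return best_name if best_score >= threshold else "Other/Unknown"
```

Strings with no close match fall back to "Other/Unknown", which mirrors the catch-all category in the list. In practice an embedding-based or manually curated mapping would likely be more robust than character overlap.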

  The classifier achieved 76.4% validation accuracy on 64,000 examples, which was deemed sufficient for assisting domain-specific reasoning in the downstream model.