Commit 6a01b67 by tkbarb10 · verified · 1 Parent(s): 0140c3c

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -59,11 +59,11 @@ there are several limitations to outline
  - When reviewing records were ambiguous or that the classifier incorrectly predicted, it was clear that the labeling scheme is fuzzy in
  some instances. For instance, many "Opinion" comments can be viewed as "Expressive" "Arguments", leading to ambiguous labeling from models.
  It would be worth exploring a more nuanced labeling scheme, perhaps splitting "Expressive" into 2-3 labels and Opinion into another 1 or 2
- - Due to the nature of the project, the commentary data used for training was subject to the following limitations
+ - Due to the nature of the project, the commentary data used for training is subject to the following limitations
  - Queries were isolated to "politics" or "US politics"
- - With one exception, all comment data is dated from Jan 1, 2026 to Feb 12, 2026
- - We set a ceiling and a floor for number of comments per post. No posts with under 10 comments were used, and for posts with
- several comments, we only pulled the most recent 300
+ - All comment data is dated from Jan 1, 2025 to Feb 12, 2026, with the majority originating in 2026
+ - We set a ceiling and a floor for number of comments per post. No posts with under 10 comments were used, and number of comments scraped
+ were capped at 300
 
  ## Training and evaluation data
 