amannor committed on
Commit eec62c0 · verified · 1 Parent(s): 2f031f7

Fixed markdown and removed misplaced metrics from README

Files changed (1)
  1. README.md +25 -14
README.md CHANGED
@@ -11,18 +11,22 @@ tags:
11
  - sustainable-development-goals
12
  - impact-tech
13
  - text-classification
14
- - BERT for Startup SDG Classification
15
  ---
 
 
16
  This is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.
17
  This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).
18
  This repository is hosted by Alon Mannor to make the original model weights accessible to the public.
19
 
20
- Model Details
21
- Base Model: bert-base-uncased
22
- Task: Text Classification
23
- Labels: 18 (0: No Impact, 1-17: corresponding SDG)
24
- Label Mapping (id2label) The model outputs a logit for each of the 18 classes.
 
 
25
  The mapping from the index (ID) to the label name is as follows:
 
26
  {
27
  "0": "0: No Impact",
28
  "1": "SDG 1: No Poverty",
@@ -42,9 +46,10 @@ The mapping from the index (ID) to the label name is as follows:
42
  "15": "SDG 15: Life on Land",
43
  "16": "SDG 16: Peace and Justice Strong Institutions",
44
  "17": "SDG 17: Partnerships to achieve the Goal"
45
- }
 
46
 
47
- How to Use:
48
 
49
  You can use this model directly with the text-classification pipeline.
50
 
@@ -69,17 +74,23 @@ print(result_2)
69
  # [{'label': '0: No Impact', 'score': 0.95...}]
70
  ````
71
 
72
- Training Data:
73
  The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:
74
- Rainmaking (Compass): A global database of impact-focused startups.
75
- Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.
76
 
77
- Performance
78
  The model was evaluated on a test set of 866 startups from the original paper.
79
- Task: F1-Weighted F1-Macro F1-Micro 18-Label (Full )0.7900.4730.7906-Label (5Ps)0.8360.6020.836
 
 
 
 
 
 
80
  The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.
81
 
82
- Citation:
83
  If you use this model or its underlying research, please cite the original paper:
84
  @inproceedings{bar2022usinglm,
85
  title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
 
11
  - sustainable-development-goals
12
  - impact-tech
13
  - text-classification
 
14
  ---
15
+
16
+ # BERT for Startup SDG Classification
17
  This is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.
18
  This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).
19
  This repository is hosted by Alon Mannor to make the original model weights accessible to the public.
20
 
21
+ ## Model Details
22
+ * Base Model: bert-base-uncased
23
+ * Task: Text Classification
24
+ * Labels: 18 (0: No Impact, 1-17: corresponding SDG)
25
+
26
+ ### Label Mapping (id2label)
27
+ The model outputs a logit for each of the 18 classes.
28
  The mapping from the index (ID) to the label name is as follows:
29
+ ````json
30
  {
31
  "0": "0: No Impact",
32
  "1": "SDG 1: No Poverty",
 
46
  "15": "SDG 15: Life on Land",
47
  "16": "SDG 16: Peace and Justice Strong Institutions",
48
  "17": "SDG 17: Partnerships to achieve the Goal"
49
+ }
50
+ ````
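To illustrate the README's statement that the model outputs one logit per class, here is a minimal sketch of turning 18 logits into a predicted label ID and a score. It assumes the reported score is a softmax over the logits, which the README does not state; the `softmax` and `predict` helpers and the `example_logits` values are hypothetical and are not taken from the model.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def predict(logits):
    """Return (label_id, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]

# Made-up logits for the 18 classes (0: No Impact, 1-17: SDGs).
example_logits = [0.0] * 18
example_logits[15] = 4.0  # pretend the model favours class 15

label_id, score = predict(example_logits)
print(label_id)  # 15, which id2label maps to "SDG 15: Life on Land"
```

The returned ID would then be looked up in the id2label mapping to obtain the label name.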
51
 
52
+ ### How to Use:
53
 
54
  You can use this model directly with the text-classification pipeline.
55
 
 
74
  # [{'label': '0: No Impact', 'score': 0.95...}]
75
  ````
76
 
77
+ ### Training Data:
78
  The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:
79
+ 1. Rainmaking (Compass): A global database of impact-focused startups.
80
+ 2. Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.
81
 
82
+ ### Performance
83
  The model was evaluated on a test set of 866 startups from the original paper.
84
+
85
+ | Task | F1-Weighted |
86
+ | :-------------: | :---------: |
87
+ | 18-Label (Full) | 0.79 |
88
+ | 6-Label (5Ps) | 0.836 |
89
+
90
+
91
  The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.
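The aggregation from 18 labels to 6 can be sketched as a lookup from label ID to group. The README does not list which SDG belongs to which group, so the mapping below is an assumption based on the commonly used UN "5Ps" grouping, and `SDG_TO_5P` and `to_six_label` are hypothetical names used only for illustration.

```python
# Assumed grouping (not confirmed by the README or taken from the paper):
# People 1-5, Prosperity 7-11, Planet 6 and 12-15, Peace 16, Partnerships 17.
SDG_TO_5P = {}
for sdg in (1, 2, 3, 4, 5):
    SDG_TO_5P[sdg] = "People"
for sdg in (6, 12, 13, 14, 15):
    SDG_TO_5P[sdg] = "Planet"
for sdg in (7, 8, 9, 10, 11):
    SDG_TO_5P[sdg] = "Prosperity"
SDG_TO_5P[16] = "Peace"
SDG_TO_5P[17] = "Partnerships"

def to_six_label(label_id):
    """Map an 18-label prediction ID (0-17) to one of the 6 coarse labels."""
    if label_id == 0:
        return "No-Impact"
    return SDG_TO_5P[label_id]

# Example: coarse labels for a few 18-label predictions.
print([to_six_label(i) for i in (0, 3, 13, 16, 17)])
# ['No-Impact', 'People', 'Planet', 'Peace', 'Partnerships']
```

Under this scheme, the 6-label metrics are computed after mapping both the predicted and the reference IDs through the same function.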
92
 
93
+ ### Citation:
94
  If you use this model or its underlying research, please cite the original paper:
95
  @inproceedings{bar2022usinglm,
96
  title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},