Fixed markdown and removed misplaced metrics from README
README.md
CHANGED
|
@@ -11,18 +11,22 @@ tags:
|
|
| 11 |
- sustainable-development-goals
|
| 12 |
- impact-tech
|
| 13 |
- text-classification
|
| 14 |
-
- BERT for Startup SDG Classification
|
| 15 |
---
|
| 16 |
This is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.
|
| 17 |
This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).
|
| 18 |
This repository is hosted by Alon Mannor to make the original model weights accessible to the public.
|
| 19 |
|
| 20 |
-
Model Details
|
| 21 |
-
Base Model: bert-base-uncased
|
| 22 |
-
Task: Text Classification
|
| 23 |
-
Labels: 18 (0: No Impact, 1-17: corresponding SDG)
|
| 24 |
-
|
| 25 |
The mapping from the index (ID) to the label name is as follows:
|
|
|
|
| 26 |
{
|
| 27 |
"0": "0: No Impact",
|
| 28 |
"1": "SDG 1: No Poverty",
|
|
@@ -42,9 +46,10 @@ The mapping from the index (ID) to the label name is as follows:
|
|
| 42 |
"15": "SDG 15: Life on Land",
|
| 43 |
"16": "SDG 16: Peace and Justice Strong Institutions",
|
| 44 |
"17": "SDG 17: Partnerships to achieve the Goal"
|
| 45 |
-
}
|
|
|
|
| 46 |
|
| 47 |
-
How to Use:
|
| 48 |
|
| 49 |
You can use this model directly with the text-classification pipeline.
|
| 50 |
|
|
@@ -69,17 +74,23 @@ print(result_2)
|
|
| 69 |
# [{'label': '0: No Impact', 'score': 0.95...}]
|
| 70 |
````
|
| 71 |
|
| 72 |
-
Training Data:
|
| 73 |
The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:
|
| 74 |
-
Rainmaking (Compass): A global database of impact-focused startups.
|
| 75 |
-
Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.
|
| 76 |
|
| 77 |
-
Performance
|
| 78 |
The model was evaluated on a test set of 866 startups from the original paper.
|
| 79 |
-
|
| 80 |
The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.
|
| 81 |
|
| 82 |
-
Citation:
|
| 83 |
If you use this model or its underlying research, please cite the original paper:
|
| 84 |
@inproceedings{bar2022usinglm,
|
| 85 |
title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
|
|
|
|
| 11 |
- sustainable-development-goals
|
| 12 |
- impact-tech
|
| 13 |
- text-classification
|
|
|
|
| 14 |
---
|
| 15 |
+
|
| 16 |
+
# BERT for Startup SDG Classification
|
| 17 |
This is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.
|
| 18 |
This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).
|
| 19 |
This repository is hosted by Alon Mannor to make the original model weights accessible to the public.
|
| 20 |
|
| 21 |
+
## Model Details
|
| 22 |
+
* Base Model: bert-base-uncased
|
| 23 |
+
* Task: Text Classification
|
| 24 |
+
* Labels: 18 (0: No Impact, 1-17: corresponding SDG)
|
| 25 |
+
|
| 26 |
+
### Label Mapping (id2label)
|
| 27 |
+
The model outputs a logit for each of the 18 classes.
|
| 28 |
The mapping from the index (ID) to the label name is as follows:
|
| 29 |
+
````json
|
| 30 |
{
|
| 31 |
"0": "0: No Impact",
|
| 32 |
"1": "SDG 1: No Poverty",
|
|
|
|
| 46 |
"15": "SDG 15: Life on Land",
|
| 47 |
"16": "SDG 16: Peace and Justice Strong Institutions",
|
| 48 |
"17": "SDG 17: Partnerships to achieve the Goal"
|
| 49 |
+
}
|
| 50 |
+
````
|
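The label-mapping section above says the model emits one logit per class. A minimal sketch of how such logits could be turned into a label, assuming a plain softmax followed by an argmax. The dictionary holds only the id2label entries visible in this diff (the rest are elided here and are not filled in); `ID2LABEL_EXCERPT`, `softmax`, and `top_label` are illustrative names, not part of the model repository, and the logits are made up.

```python
import math

# Only the id2label entries visible in this diff; the others are elided
# in the source and deliberately left out rather than guessed.
ID2LABEL_EXCERPT = {
    0: "0: No Impact",
    1: "SDG 1: No Poverty",
    15: "SDG 15: Life on Land",
    16: "SDG 16: Peace and Justice Strong Institutions",
    17: "SDG 17: Partnerships to achieve the Goal",
}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits, id2label):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != 18:
        raise ValueError("expected one logit per class (18)")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    # Fall back to a generic name for ids missing from the excerpt.
    return id2label.get(best, f"class {best}"), probs[best]


# Hypothetical logits for illustration: class 15 has the largest value.
example_logits = [0.0] * 18
example_logits[15] = 4.0
label, prob = top_label(example_logits, ID2LABEL_EXCERPT)
print(label, round(prob, 3))
```

In practice the `text-classification` pipeline performs this step itself; the sketch only shows what the index-to-label lookup amounts to.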
| 51 |
|
| 52 |
+
### How to Use:
|
| 53 |
|
| 54 |
You can use this model directly with the text-classification pipeline.
|
| 55 |
|
|
|
|
| 74 |
# [{'label': '0: No Impact', 'score': 0.95...}]
|
| 75 |
````
|
| 76 |
|
| 77 |
+
### Training Data:
|
| 78 |
The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:
|
| 79 |
+
1. Rainmaking (Compass): A global database of impact-focused startups.
|
| 80 |
+
2. Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.
|
| 81 |
|
| 82 |
+
### Performance
|
| 83 |
The model was evaluated on a test set of 866 startups from the original paper.
|
| 84 |
+
|
| 85 |
+
| Task | F1-Weighted |
|
| 86 |
+
| :-------------: | :---------: |
|
| 87 |
+
| 18-Label (Full) | 0.79 |
|
| 88 |
+
| 6-Label (5Ps) | 0.836 |
|
| 89 |
+
|
| 90 |
+
|
| 91 |
The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.
|
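The aggregation described above can be sketched roughly as follows: map each 18-label prediction to its group, then score the grouped labels. The SDG-to-group assignment used here is the commonly cited UN "5Ps" grouping and is an assumption; the README does not spell out the paper's exact mapping. `GROUP_OF`, `to_groups`, and `weighted_f1` are hypothetical helper names, and the label lists are toy data, not the 866-startup test set.

```python
from collections import Counter

# ASSUMPTION: the common UN "5Ps" grouping; the README does not confirm
# that the paper used exactly this assignment.
GROUP_OF = {0: "No-Impact"}
GROUP_OF.update({i: "People" for i in (1, 2, 3, 4, 5)})
GROUP_OF.update({i: "Planet" for i in (6, 12, 13, 14, 15)})
GROUP_OF.update({i: "Prosperity" for i in (7, 8, 9, 10, 11)})
GROUP_OF.update({16: "Peace", 17: "Partnerships"})


def to_groups(label_ids):
    """Collapse 18-label ids (0-17) into the six coarse groups."""
    return [GROUP_OF[i] for i in label_ids]


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 over classes present in y_true."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += n * f1
    return total / len(y_true)


# Toy gold labels and predictions (made up for illustration).
true_ids = [1, 3, 13, 7, 16, 0]
pred_ids = [2, 3, 14, 13, 16, 0]

print("18-label F1:", round(weighted_f1(true_ids, pred_ids), 3))
print("6-label F1:", round(weighted_f1(to_groups(true_ids), to_groups(pred_ids)), 3))
```

In this toy data, a prediction that picks the wrong SDG inside the right group (for example SDG 2 instead of SDG 1) counts as an error in the 18-label score but as correct after grouping.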
| 92 |
|
| 93 |
+
### Citation:
|
| 94 |
If you use this model or its underlying research, please cite the original paper:
|
| 95 |
@inproceedings{bar2022usinglm,
|
| 96 |
title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
|