YAML markdown issues fix + update to README
README.md
CHANGED
|
@@ -1,88 +1,45 @@
|
|
| 1 |
-
language:
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
| 31 |
-
|
| 32 |
-
|
| 33 |
-
|
| 34 |
-
|
| 35 |
-
|
| 36 |
-
|
| 37 |
-
|
| 38 |
-
{
|
| 39 |
-
|
| 40 |
-
|
| 41 |
-
|
| 42 |
-
|
| 43 |
-
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
Training Data
|
| 47 |
-
|
| 48 |
-
The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:
|
| 49 |
-
|
| 50 |
-
Rainmaking (Compass): A global database of impact-focused startups.
|
| 51 |
-
|
| 52 |
-
Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.
|
| 53 |
-
|
| 54 |
-
Performance
|
| 55 |
-
|
| 56 |
-
The model was evaluated on a test set of 866 startups from the original paper.
|
| 57 |
-
|
| 58 |
-
Task
|
| 59 |
-
|
| 60 |
-
F1-Weighted
|
| 61 |
-
|
| 62 |
-
F1-Macro
|
| 63 |
-
|
| 64 |
-
F1-Micro
|
| 65 |
-
|
| 66 |
-
18-Label (Full)
|
| 67 |
-
|
| 68 |
-
0.790
|
| 69 |
-
|
| 70 |
-
0.473
|
| 71 |
-
|
| 72 |
-
0.790
|
| 73 |
-
|
| 74 |
-
6-Label (5Ps)
|
| 75 |
-
|
| 76 |
-
0.836
|
| 77 |
-
|
| 78 |
-
0.602
|
| 79 |
-
|
| 80 |
-
0.836
|
| 81 |
-
|
| 82 |
-
The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.
|
| 83 |
-
|
| 84 |
-
Citation
|
| 85 |
-
|
| 86 |
-
If you use this model or its underlying research, please cite the original paper:
|
| 87 |
-
|
| 88 |
-
@inproceedings{bar2022usinglm, title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals}, author={Bar, Kfir}, booktitle={Anonymous Submission to IJCAI-22}, year={2022}, url={[https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf](https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf)} }
|
|
|
|
| 1 |
+
---
language: en
license: mit
pipeline_tag: text-classification
base_model: bert-base-uncased
datasets:
- custom
tags:
- sdg
- sustainable-development-goals
- impact-tech
- text-classification
---

BERT for Startup SDG Classification

This is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.

This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).

This repository is hosted by Alon Mannor to make the original model weights accessible to the public.

Model Details

- Base Model: bert-base-uncased
- Task: Text Classification
- Labels: 18 (0: No Impact, 1-17: corresponding SDG)

Label Mapping (id2label)

The model outputs a logit for each of the 18 classes. The mapping from the index (ID) to the label name is as follows:

{
|
| 2 |
+
"0": "0: No Impact",
|
| 3 |
+
"1": "SDG 1: No Poverty",
|
| 4 |
+
"2": "SDG 2: Zero Hunger",
|
| 5 |
+
"3": "SDG 3: Good Health and Well-being",
|
| 6 |
+
"4": "SDG 4: Quality Education",
|
| 7 |
+
"5": "SDG 5: Gender Equality",
|
| 8 |
+
"6": "SDG 6: Clean Water and Sanitation",
|
| 9 |
+
"7": "SDG 7: Affordable and Clean Energy",
|
| 10 |
+
"8": "SDG 8: Decent Work and Economic Growth",
|
| 11 |
+
"9": "SDG 9: Industry, Innovation and Infrastructure",
|
| 12 |
+
"10": "SDG 10: Reduced Inequality",
|
| 13 |
+
"11": "SDG 11: Sustainable Cities and Communities",
|
| 14 |
+
"12": "SDG 12: Responsible Consumption and Production",
|
| 15 |
+
"13": "SDG 13: Climate Action",
|
| 16 |
+
"14": "SDG 14: Life Below Water",
|
| 17 |
+
"15": "SDG 15: Life on Land",
|
| 18 |
+
"16": "SDG 16: Peace and Justice Strong Institutions",
|
| 19 |
+
"17": "SDG 17: Partnerships to achieve the Goal"
|
| 20 |
+
}
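
As a minimal sketch of how this mapping is applied, the snippet below turns a vector of 18 logits into a label name with a softmax followed by an argmax. The logits are made up for illustration, and `softmax` and `predict_label` are hypothetical helpers written for this example, not part of the model's API.

```python
import math

# Label names copied from the mapping above, indexed by class ID.
ID2LABEL = [
    "0: No Impact",
    "SDG 1: No Poverty",
    "SDG 2: Zero Hunger",
    "SDG 3: Good Health and Well-being",
    "SDG 4: Quality Education",
    "SDG 5: Gender Equality",
    "SDG 6: Clean Water and Sanitation",
    "SDG 7: Affordable and Clean Energy",
    "SDG 8: Decent Work and Economic Growth",
    "SDG 9: Industry, Innovation and Infrastructure",
    "SDG 10: Reduced Inequality",
    "SDG 11: Sustainable Cities and Communities",
    "SDG 12: Responsible Consumption and Production",
    "SDG 13: Climate Action",
    "SDG 14: Life Below Water",
    "SDG 15: Life on Land",
    "SDG 16: Peace and Justice Strong Institutions",
    "SDG 17: Partnerships to achieve the Goal",
]


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Hypothetical helper: return (label, probability) for the top class."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits in which class 7 dominates.
example_logits = [0.0] * 18
example_logits[7] = 5.0
label, prob = predict_label(example_logits)
print(label, round(prob, 3))
```

The text-classification pipeline performs an equivalent lookup internally, so this is only needed when working with raw model outputs.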
|
| 21 |
+
How to Use

You can use this model directly with the text-classification pipeline.

from transformers import pipeline
|
| 22 |
+
|
| 23 |
+
# Load the classifier
|
| 24 |
+
classifier = pipeline("text-classification", model="amannor/bert-base-uncased-sdgclassifier")
|
| 25 |
+
|
| 26 |
+
# Example description
|
| 27 |
+
text = "Our company develops innovative, low-cost solar panels to bring electricity to rural communities."
|
| 28 |
+
|
| 29 |
+
# Get prediction
|
| 30 |
+
result = classifier(text)
|
| 31 |
+
print(result)
|
| 32 |
+
# [{'label': 'SDG 7: Affordable and Clean Energy', 'score': 0.98...}]
|
| 33 |
+
|
| 34 |
+
# Example of a non-impact startup
|
| 35 |
+
text_2 = "We are a B2B platform for optimizing advertising spend on social media."
|
| 36 |
+
result_2 = classifier(text_2)
|
| 37 |
+
print(result_2)
|
| 38 |
+
# [{'label': '0: No Impact', 'score': 0.95...}]
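
When classifying many descriptions at once, it can help to flag predictions whose score is low. The sketch below assumes the classifier is a callable that takes a list of strings and returns one `{"label": ..., "score": ...}` dict per input. `flag_low_confidence`, `stub_classifier`, and the 0.5 threshold are hypothetical choices made for illustration; the stub returns canned, made-up scores so the example runs offline.

```python
def flag_low_confidence(classifier, texts, min_score=0.5):
    """Classify `texts` and mark each prediction as confident or not.

    `classifier` is assumed to return one {"label", "score"} dict per input.
    `min_score` is an arbitrary illustrative threshold, not a tuned value.
    """
    texts = list(texts)
    predictions = classifier(texts)
    return [
        {
            "text": text,
            "label": pred["label"],
            "score": pred["score"],
            "confident": pred["score"] >= min_score,
        }
        for text, pred in zip(texts, predictions)
    ]


def stub_classifier(texts):
    """Offline stand-in for the real pipeline; scores are invented."""
    return [
        {"label": "SDG 7: Affordable and Clean Energy", "score": 0.98}
        if "solar" in text
        else {"label": "0: No Impact", "score": 0.41}
        for text in texts
    ]


rows = flag_low_confidence(
    stub_classifier,
    [
        "Our company develops low-cost solar panels for rural communities.",
        "We are a B2B platform for optimizing advertising spend.",
    ],
)
for row in rows:
    print(row["label"], row["confident"])
```

With the real model, the `classifier` object created earlier can be passed in place of `stub_classifier`.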
|
| 39 |
+
Training Data

The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension), aggregated from two main sources and manually annotated by experts:

- Rainmaking (Compass): A global database of impact-focused startups.
- Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.

Performance

The model was evaluated on a test set of 866 startups from the original paper.

| Task | F1-Weighted | F1-Macro | F1-Micro |
|---|---|---|---|
| 18-Label (Full) | 0.790 | 0.473 | 0.790 |
| 6-Label (5Ps) | 0.836 | 0.602 | 0.836 |

The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.

Citation

If you use this model or its underlying research, please cite the original paper:

@inproceedings{bar2022usinglm,
|
| 40 |
+
title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
|
| 41 |
+
author={Bar, Kfir},
|
| 42 |
+
booktitle={Anonymous Submission to IJCAI-22},
|
| 43 |
+
year={2022},
|
| 44 |
+
url={https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf}
|
| 45 |
+
}
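
Returning to the 6-label task described under Performance, the sketch below shows one way 18-label predictions could be collapsed into the six groups. The SDG-to-group assignment used here is the commonly cited 5Ps grouping and is an assumption: the README does not list the mapping, and the authors may have aggregated differently. `SDG_TO_GROUP`, `to_six_label`, and `accuracy` are hypothetical names, and the label lists are made up.

```python
# Assumed grouping of SDG class IDs into the 5Ps (not confirmed by the README).
FIVE_PS = {
    "People": [1, 2, 3, 4, 5],
    "Planet": [6, 12, 13, 14, 15],
    "Prosperity": [7, 8, 9, 10, 11],
    "Peace": [16],
    "Partnerships": [17],
}

# Invert the grouping into a class-ID lookup; class 0 stays "No-Impact".
SDG_TO_GROUP = {0: "No-Impact"}
for group, class_ids in FIVE_PS.items():
    for class_id in class_ids:
        SDG_TO_GROUP[class_id] = group


def to_six_label(class_ids):
    """Map 18-label class IDs to their 6-label group names."""
    return [SDG_TO_GROUP[class_id] for class_id in class_ids]


def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction matches the truth."""
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Made-up gold labels and predictions as 18-label class IDs.
gold = [7, 13, 0, 3]
pred = [9, 13, 0, 16]

print(accuracy(gold, pred))
print(accuracy(to_six_label(gold), to_six_label(pred)))
```

In this toy example the confusion between SDG 7 and SDG 9 disappears after grouping because both fall under Prosperity, which is consistent with the 6-label scores being higher than the 18-label ones.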