amannor committed on
Commit 2f031f7 · verified · 1 Parent(s): fc60171

Update README.md

  1. README.md +72 -27
README.md CHANGED
@@ -1,24 +1,55 @@
1
- language: enlicense: mitpipeline_tag: text-classificationbase_model: bert-base-uncaseddatasets:customtags:sdgsustainable-development-goalsimpact-techtext-classificationBERT for Startup SDG ClassificationThis is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).This repository is hosted by Alon Mannor to make the original model weights accessible to the public.Model DetailsBase Model: bert-base-uncasedTask: Text ClassificationLabels: 18 (0: No Impact, 1-17: corresponding SDG)Label Mapping (id2label)The model outputs a logit for each of the 18 classes. The mapping from the index (ID) to the label name is as follows:{
2
- "0": "0: No Impact",
3
- "1": "SDG 1: No Poverty",
4
- "2": "SDG 2: Zero Hunger",
5
- "3": "SDG 3: Good Health and Well-being",
6
- "4": "SDG 4: Quality Education",
7
- "5": "SDG 5: Gender Equality",
8
- "6": "SDG 6: Clean Water and Sanitation",
9
- "7": "SDG 7: Affordable and Clean Energy",
10
- "8": "SDG 8: Decent Work and Economic Growth",
11
- "9": "SDG 9: Industry, Innovation and Infrastructure",
12
- "10": "SDG 10: Reduced Inequality",
13
- "11": "SDG 11: Sustainable Cities and Communities",
14
- "12": "SDG 12: Responsible Consumption and Production",
15
- "13": "SDG 13: Climate Action",
16
- "14": "SDG 14: Life Below Water",
17
- "15": "SDG 15: Life on Land",
18
- "16": "SDG 16: Peace and Justice Strong Institutions",
19
- "17": "SDG 17: Partnerships to achieve the Goal"
20
- }
21
- How to UseYou can use this model directly with the text-classification pipeline.from transformers import pipeline
22
 
23
  # Load the classifier
24
  classifier = pipeline("text-classification", model="amannor/bert-base-uncased-sdgclassifier")
@@ -36,10 +67,24 @@ text_2 = "We are a B2B platform for optimizing advertising spend on social media
36
  result_2 = classifier(text_2)
37
  print(result_2)
38
  # [{'label': '0: No Impact', 'score': 0.95...}]
39
- Training DataThe model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:Rainmaking (Compass): A global database of impact-focused startups.Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.PerformanceThe model was evaluated on a test set of 866 startups from the original paper.TaskF1-WeightedF1-MacroF1-Micro18-Label (Full)0.7900.4730.7906-Label (5Ps)0.8360.6020.836The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.CitationIf you use this model or its underlying research, please cite the original paper:@inproceedings{bar2022usinglm,
40
- title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
41
- author={Bar, Kfir},
42
- booktitle={Anonymous Submission to IJCAI-22},
43
- year={2022},
44
  url={https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf}
45
- }
 
1
+ ---
2
+ license: mit
3
+ language:
4
+ - en
5
+ base_model:
6
+ - google-bert/bert-base-uncased
7
+ pipeline_tag: text-classification
8
+ datasets: custom
9
+ tags:
10
+ - sdg
11
+ - sustainable-development-goals
12
+ - impact-tech
13
+ - text-classification
14
+ ---
15
+ BERT for Startup SDG Classification
16
+ This is a bert-base-uncased model fine-tuned to classify startup company descriptions into one of the 17 UN Sustainable Development Goals (SDGs), plus a "no-impact" category.
17
+ This model was trained by Kfir Bar as part of the research paper: "Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals" (2022).
18
+ This repository is hosted by Alon Mannor to make the original model weights accessible to the public.
19
+
20
+ Model Details
21
+ Base Model: bert-base-uncased
22
+ Task: Text Classification
23
+ Labels: 18 (0: No Impact, 1-17: corresponding SDG)
24
+ Label Mapping (id2label)
25
+ The model outputs a logit for each of the 18 classes. The mapping from the index (ID) to the label name is as follows:
26
+ {
27
+ "0": "0: No Impact",
28
+ "1": "SDG 1: No Poverty",
29
+ "2": "SDG 2: Zero Hunger",
30
+ "3": "SDG 3: Good Health and Well-being",
31
+ "4": "SDG 4: Quality Education",
32
+ "5": "SDG 5: Gender Equality",
33
+ "6": "SDG 6: Clean Water and Sanitation",
34
+ "7": "SDG 7: Affordable and Clean Energy",
35
+ "8": "SDG 8: Decent Work and Economic Growth",
36
+ "9": "SDG 9: Industry, Innovation and Infrastructure",
37
+ "10": "SDG 10: Reduced Inequality",
38
+ "11": "SDG 11: Sustainable Cities and Communities",
39
+ "12": "SDG 12: Responsible Consumption and Production",
40
+ "13": "SDG 13: Climate Action",
41
+ "14": "SDG 14: Life Below Water",
42
+ "15": "SDG 15: Life on Land",
43
+ "16": "SDG 16: Peace and Justice Strong Institutions",
44
+ "17": "SDG 17: Partnerships to achieve the Goal"
45
+ }
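The logit-to-label step that the mapping describes can be sketched without loading the model. This is a minimal illustration, not the repository's own code: the helper names `softmax` and `top_label` are hypothetical, and the logit vector is made up rather than real model output. Only the `ID2LABEL` entries come from the README.

```python
import math

# id2label copied from the README mapping (keys converted to ints).
ID2LABEL = {
    0: "0: No Impact",
    1: "SDG 1: No Poverty",
    2: "SDG 2: Zero Hunger",
    3: "SDG 3: Good Health and Well-being",
    4: "SDG 4: Quality Education",
    5: "SDG 5: Gender Equality",
    6: "SDG 6: Clean Water and Sanitation",
    7: "SDG 7: Affordable and Clean Energy",
    8: "SDG 8: Decent Work and Economic Growth",
    9: "SDG 9: Industry, Innovation and Infrastructure",
    10: "SDG 10: Reduced Inequality",
    11: "SDG 11: Sustainable Cities and Communities",
    12: "SDG 12: Responsible Consumption and Production",
    13: "SDG 13: Climate Action",
    14: "SDG 14: Life Below Water",
    15: "SDG 15: Life on Land",
    16: "SDG 16: Peace and Justice Strong Institutions",
    17: "SDG 17: Partnerships to achieve the Goal",
}


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Illustrative logits (made up, NOT real model output): class 13 dominates.
example_logits = [0.0] * 18
example_logits[13] = 5.0
label, prob = top_label(example_logits)
print(label, round(prob, 3))
```

In practice the `text-classification` pipeline performs this conversion internally, so the sketch is mainly useful when working with raw model outputs.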
46
+
47
+ How to Use:
48
+
49
+ You can use this model directly with the text-classification pipeline.
50
+
51
+ ````python
52
+ from transformers import pipeline
53
 
54
  # Load the classifier
55
  classifier = pipeline("text-classification", model="amannor/bert-base-uncased-sdgclassifier")
 
67
  result_2 = classifier(text_2)
68
  print(result_2)
69
  # [{'label': '0: No Impact', 'score': 0.95...}]
70
+ ````
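Downstream code often needs the integer class ID rather than the label string the pipeline returns. The following is a small sketch under stated assumptions: `label_to_id` is a hypothetical helper (not part of `transformers`), and `sample_output` is a hand-written stand-in shaped like the pipeline result shown in the README, not an actual prediction.

```python
import re


def label_to_id(label):
    """Extract the integer class ID from a label string such as
    'SDG 13: Climate Action' or '0: No Impact'."""
    match = re.match(r"^(?:SDG\s+)?(\d+):", label)
    if match is None:
        raise ValueError(f"Unrecognized label format: {label!r}")
    return int(match.group(1))


# Hand-written stand-in for a pipeline result (assumed shape: a list of
# dicts with 'label' and 'score' keys, as in the README example).
sample_output = [{"label": "0: No Impact", "score": 0.95}]
ids = [label_to_id(item["label"]) for item in sample_output]
print(ids)  # [0]
```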
71
+
72
+ Training Data:
73
+ The model was trained on a dataset of 4,247 startup descriptions (from the Gidron et al. 2023 extension) aggregated from two main sources, which were manually annotated by experts:
74
+ Rainmaking (Compass): A global database of impact-focused startups.
75
+ Start-up Nation Central (SNC): A database of Israeli startups, including both impact and non-impact companies.
76
+
77
+ Performance
78
+ The model was evaluated on a test set of 866 startups from the original paper.
79
+
+ | Task | F1-Weighted | F1-Macro | F1-Micro |
+ |---|---|---|---|
+ | 18-Label (Full) | 0.790 | 0.473 | 0.790 |
+ | 6-Label (5Ps) | 0.836 | 0.602 | 0.836 |
+
80
+ The performance for the 6-label task (People, Planet, Prosperity, Peace, Partnerships, No-Impact) was aggregated from the 18-label predictions.
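One way to aggregate 18-label predictions into the 6-label task is a fixed lookup from SDG ID to category. The README does not state which SDGs fall under which of the 5Ps, so the grouping below is an assumption based on the commonly cited UN "5Ps" breakdown; the paper's exact grouping may differ. The names `FIVE_PS`, `SDG_TO_P`, and `to_five_ps` are hypothetical.

```python
# ASSUMPTION: commonly cited UN "5Ps" grouping of the 17 SDGs.
# The README does not specify the grouping used in the paper.
FIVE_PS = {
    "No-Impact": [0],
    "People": [1, 2, 3, 4, 5],
    "Planet": [6, 12, 13, 14, 15],
    "Prosperity": [7, 8, 9, 10, 11],
    "Peace": [16],
    "Partnerships": [17],
}

# Invert the grouping into a direct SDG-ID -> category lookup.
SDG_TO_P = {sdg: p for p, sdgs in FIVE_PS.items() for sdg in sdgs}


def to_five_ps(pred_ids):
    """Map a list of 18-label class IDs to their 6-label categories."""
    return [SDG_TO_P[i] for i in pred_ids]


print(to_five_ps([0, 3, 13, 9, 16, 17]))
# ['No-Impact', 'People', 'Planet', 'Prosperity', 'Peace', 'Partnerships']
```

Applying the same lookup to both the predicted and the gold labels yields the 6-label pairs on which the aggregated F1 scores can be computed.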
81
+
82
+ Citation:
83
+ If you use this model or its underlying research, please cite the original paper:
84
+ @inproceedings{bar2022usinglm,
85
+ title={Using Language Models for Classifying Startups Into the UN’s 17 Sustainable Development Goals},
86
+ author={Bar, Kfir},
87
+ booktitle={Anonymous Submission to IJCAI-22},
88
+ year={2022},
89
  url={https://github.com/Amannor/sdg-codebase/blob/master/articles/IJCAI_2022_SDGs_Methodology.pdf}
90
+ }