vishalp23 committed on
Commit
22357ad
·
verified ·
1 Parent(s): 9029b73
Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +89 -5
  3. distilbert_model +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ distilbert_model filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,5 +1,89 @@
1
- ---
2
- license: other
3
- license_name: gnu-general-public-license
4
- license_link: LICENSE
5
- ---
1
+ # Subject Classifier Built on DistilBERT
2
+
3
+ ## Table of Contents
4
+ - [Model Details](#model-details)
5
+ - [How to Get Started With the Model](#how-to-get-started-with-the-model)
6
+ - [Uses](#uses)
7
+ - [Risks, Limitations and Biases](#risks-limitations-and-biases)
8
+ - [Training](#training)
9
+ - [Evaluation](#evaluation)
10
+ - [Environmental Impact](#environmental-impact)
11
+
12
+ ## Model Details
13
+
14
+ **Model Description:** This is the [uncased DistilBERT model](https://huggingface.co/distilbert-base-uncased) fine-tuned on a custom dataset that is built on the [IITJEE NEET AIIMS Students Questions Data](https://www.kaggle.com/datasets/mrutyunjaybiswal/iitjee-neet-aims-students-questions-data?resource=download) for the subject classification task.
15
+ - **Developed by:** The [Typeform](https://www.typeform.com/) team.
16
+ - **Model Type:** Text Classification
17
+ - **Language(s):** English
18
+ - **License:** GNU General Public License
19
+ - **Parent Model:** See the [distilbert base uncased model](https://huggingface.co/distilbert-base-uncased) for more information about the DistilBERT base model.
20
+
21
+
22
+ ## Uses
23
+ This model classifies English question text into one of six subjects: biology, chemistry, computer, maths, physics, or social sciences.
24
+
25
+
26
+ ## Risks, Limitations and Biases
27
+ **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**
28
+
29
+ Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
30
+
31
+
32
+ ## Training
33
+
34
+ Training was done on an [NVIDIA RTX 3070](https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3070-3070ti/) GPU and an [AMD Ryzen 7 5800](https://www.amd.com/en/products/cpu/amd-ryzen-7-5800) CPU with the following hyperparameters:
35
+
36
+ ```
37
+ $ training.ipynb \
38
+ --model_name_or_path distilbert-base-uncased \
39
+ --do_train \
40
+ --do_eval \
41
+ --max_seq_length 512 \
42
+ --per_device_train_batch_size 4 \
43
+ --learning_rate 1e-05 \
44
+ --num_train_epochs 5
45
+ ```
46
+
47
+ ## Evaluation
48
+
49
+
50
+ #### Evaluation Results
51
+ After fine-tuning on the subject classification dataset, the model achieves the following results:
52
+
53
+ Epochs: 5 | Train Loss: 0.001 | Train Accuracy: 0.989 | Val Loss: 0.006 | Val Accuracy: 0.950
54
+ CPU times: user 18h 19min 13s, sys: 1min 34s, total: 18h 20min 47s
55
+ Wall time: 18h 20min 7s
56
+ - **Epochs =** 5
57
+ - **Evaluation Accuracy =** 0.950
58
+ - **Evaluation Loss =** 0.006
59
+ - **Training Accuracy =** 0.989
60
+ - **Training Loss =** 0.001
61
+
62
+ #### Testing Results
63
+
64
+ | | precision | recall | f1-score | support |
65
+ |-----------------|-----------|--------|----------|---------|
66
+ | biology | 0.98 | 0.99 | 0.99 | 15988 |
67
+ | chemistry | 1.00 | 0.99 | 0.99 | 20678 |
68
+ | computer | 1.00 | 0.99 | 0.99 | 8754 |
69
+ | maths | 1.00 | 1.00 | 1.00 | 26661 |
70
+ | physics | 0.99 | 0.98 | 0.99 | 10306 |
71
+ | social sciences | 0.99 | 1.00 | 0.99 | 25695 |
72
+ | | | | | |
73
+ | accuracy | | | 0.99 | 108082 |
74
+ | macro avg | 0.99 | 0.99 | 0.99 | 108082 |
75
+ | weighted avg | 0.99 | 0.99 | 0.99 | 108082 |
76
+
77
+
78
+ ## Environmental Impact
79
+
80
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). We present the hardware type based on the [associated paper](https://arxiv.org/pdf/2105.09680.pdf).
81
+
82
+
83
+ **Hardware Type:** 1 NVIDIA RTX 3070
84
+
85
+ **Hours used:** 18h 19min 13s
86
+
87
+ **Carbon Emitted:** (Power consumption x Time x Carbon produced based on location of power grid): Unknown
88
+
89
+
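The formula quoted in the Environmental Impact section (power consumption x time x carbon produced per unit of energy on the local grid) can be worked through for the reported run. The sketch below is illustrative only. The 220 W value is the nominal board power of an RTX 3070, used here as an assumed upper bound on average draw, and the grid intensity is a hypothetical placeholder, because the README does not say where training took place. The function and constant names are not from the commit.

```python
def duration_hours(hours: int, minutes: int, seconds: int) -> float:
    """Convert an h/min/s duration into fractional hours."""
    return hours + minutes / 60 + seconds / 3600


def estimate_emissions_kg(power_watts: float, hours: float,
                          grid_kg_per_kwh: float) -> float:
    """Power x time x grid carbon intensity, returned in kg CO2-equivalent."""
    energy_kwh = power_watts / 1000 * hours
    return energy_kwh * grid_kg_per_kwh


# "Hours used" as reported in the README: 18h 19min 13s.
HOURS_USED = duration_hours(18, 19, 13)

# Assumption: nominal RTX 3070 board power; the real average draw was not measured.
ASSUMED_GPU_POWER_W = 220.0

# Assumption: hypothetical grid intensity; replace with the value for the actual location.
ASSUMED_GRID_KG_PER_KWH = 0.4

if __name__ == "__main__":
    estimate = estimate_emissions_kg(
        ASSUMED_GPU_POWER_W, HOURS_USED, ASSUMED_GRID_KG_PER_KWH
    )
    print(f"Hours used: {HOURS_USED:.2f}")
    print(f"Estimated emissions under these assumptions: {estimate:.2f} kg CO2eq")
```

The estimate covers the GPU only. CPU, memory, and cooling overhead would raise it, so the README's "Unknown" remains the accurate entry until the grid location and measured power draw are known.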
distilbert_model ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2e3f621290f280db69baa8def0e624e8b70e0e6abd200ab604a0d9c901c7d5f4
3
+ size 266135036
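The three added lines are a Git LFS pointer: the repository stores only the spec version, a SHA-256 object ID, and the byte size, while the weights themselves live in LFS storage. A minimal sketch of checking a downloaded copy against this pointer follows. The helper names and the local path `distilbert_model` are assumptions for illustration, not part of the commit.

```python
import hashlib
from pathlib import Path

# Pointer contents as added by this commit (diff "+" markers removed).
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2e3f621290f280db69baa8def0e624e8b70e0e6abd200ab604a0d9c901c7d5f4
size 266135036
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash both match the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a file of a few hundred MB is not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)

if __name__ == "__main__":
    local_copy = Path("distilbert_model")  # hypothetical download location
    if local_copy.exists():
        print("matches pointer:", matches_pointer(local_copy, pointer))
    else:
        print(f"expecting {pointer['size']} bytes, {pointer['algo']} {pointer['digest']}")
```

Comparing the size first is a cheap way to catch truncated downloads before hashing the whole file.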