SallySims committed on
Commit cd0d32e · verified · 1 Parent(s): 206d64f

Add detailed model card

Files changed (1)
  1. README.md +140 -0
README.md ADDED
@@ -0,0 +1,140 @@
---
language: en
license: apache-2.0
tags:
- pytorch
- text-classification
- nli
- zero-shot-classification
- dei
- equibert
metrics:
- f1
- accuracy
---

# EquiBERT: DEI Natural Language Inference

**Model ID:** `SallySims/equibert-nli`

A DEI-domain NLI model for textual entailment and zero-shot
classification. It is intended as a drop-in replacement for
`facebook/bart-large-mnli`.

## Labels

| ID | Label | Description |
|----|-------|-------------|
| 0 | `entailment` | Premise logically supports hypothesis |
| 1 | `contradiction` | Premise contradicts hypothesis |
| 2 | `neutral` | Premise neither supports nor contradicts hypothesis |

## Usage: Zero-Shot Classification

```python
from transformers import pipeline

# Load the NLI checkpoint behind the zero-shot classification pipeline.
nli = pipeline("zero-shot-classification", model="SallySims/equibert-nli")

result = nli(
    "We conduct annual pay equity reviews published in our DEI report.",
    candidate_labels=["pay equity", "hiring bias", "inclusion culture"],
)
# Illustrative output: {"labels": ["pay equity", ...], "scores": [0.91, ...]}
```

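## Usage: Direct Entailment

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumes the checkpoint loads via the Auto classes, as the pipeline example implies.
tokenizer = AutoTokenizer.from_pretrained("SallySims/equibert-nli")
model = AutoModelForSequenceClassification.from_pretrained("SallySims/equibert-nli")

premise = "We conduct annual pay equity reviews."
hypothesis = "The organisation has a formal pay equity process."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
label = model.config.id2label[model(**inputs).logits.argmax(-1).item()]
# → "entailment"
```
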
## DEI Policy Verification

Use this model to check whether organisational statements are
consistent with stated DEI policies, for example by identifying
contradictions between policy documents and actual communications.

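
The verification workflow described above can be sketched as a pairwise check of each statement against each policy clause. This is a minimal sketch, not the author's implementation: `find_contradictions`, `predict_label`, and `stub_label` are hypothetical names, and `predict_label` stands in for any wrapper that returns one of this model's three labels for a (premise, hypothesis) pair. A keyword-based stub replaces the model so the example runs without downloading the checkpoint.

```python
from typing import Callable, List, Tuple

# Hypothetical signature: (premise, hypothesis) -> "entailment" | "contradiction" | "neutral"
LabelFn = Callable[[str, str], str]


def find_contradictions(
    policies: List[str],
    statements: List[str],
    predict_label: LabelFn,
) -> List[Tuple[str, str]]:
    """Return every (policy, statement) pair labelled as a contradiction.

    Each policy clause is used as the premise and each statement as the hypothesis.
    """
    flagged = []
    for policy in policies:
        for statement in statements:
            if predict_label(policy, statement) == "contradiction":
                flagged.append((policy, statement))
    return flagged


def stub_label(premise: str, hypothesis: str) -> str:
    """Toy stand-in for the NLI model, for illustration only."""
    if "annual pay equity reviews" in premise and "never review pay" in hypothesis:
        return "contradiction"
    return "neutral"


policies = ["We conduct annual pay equity reviews."]
statements = [
    "We never review pay across demographic groups.",
    "Our offices are closed on public holidays.",
]
flagged = find_contradictions(policies, statements, stub_label)
print(flagged)
```

In practice, `predict_label` would wrap the tokenizer and model calls from the Direct Entailment example. The number of model calls grows with the product of the two list lengths, so long documents may need pre-filtering before comparison.
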
## Model Description

EquiBERT is a multi-task DEI (Diversity, Equity and Inclusion) transformer
built on a dual-encoder backbone that fuses **RoBERTa-base** and
**DeBERTa-v3-base** via a learned weighted sum (α parameter).
The fused representation is fed into task-specific heads covering
17 distinct DEI analysis tasks.

- **Organisation:** [SallySims](https://huggingface.co/SallySims)
- **Framework:** PyTorch + HuggingFace Transformers
- **Backbone:** RoBERTa-base + DeBERTa-v3-base (dual encoder, fused)
- **Language:** English
- **Domain:** Organisational DEI text: HR communications, policies,
  job descriptions, performance reviews, leadership statements, reports

## Architecture

```
Input Text
    │
    ├──▶ RoBERTa-base encoder    ──▶ Linear projection
    │                                       │
    └──▶ DeBERTa-v3-base encoder ──▶ Linear projection
                                            │
                               Weighted fusion (learned α)
                                            │
                                  Layer Norm + Dropout
                                            │
                             Task-specific head (see below)
```

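
To make the diagram concrete, here is a minimal PyTorch sketch of the fusion step. It is an illustration under stated assumptions rather than the released implementation: the class name `WeightedFusion`, the projection size, the dropout rate, and the choice to pass α through a sigmoid so that the blend is `α·r + (1−α)·d` are all assumptions. The card itself only states that the two encoders are combined by a learned weighted sum. Random tensors stand in for the pooled encoder outputs.

```python
import torch
from torch import nn


class WeightedFusion(nn.Module):
    """Hypothetical fusion block: project, blend with a learned alpha, normalise."""

    def __init__(self, roberta_dim=768, deberta_dim=768, fused_dim=768, dropout=0.1):
        super().__init__()
        self.proj_roberta = nn.Linear(roberta_dim, fused_dim)
        self.proj_deberta = nn.Linear(deberta_dim, fused_dim)
        # Unconstrained scalar; the sigmoid keeps the effective alpha in (0, 1).
        self.alpha_logit = nn.Parameter(torch.zeros(1))
        self.norm = nn.LayerNorm(fused_dim)
        self.dropout = nn.Dropout(dropout)

    def forward(self, roberta_vec, deberta_vec):
        alpha = torch.sigmoid(self.alpha_logit)
        fused = alpha * self.proj_roberta(roberta_vec) + (1 - alpha) * self.proj_deberta(deberta_vec)
        return self.dropout(self.norm(fused))


fusion = WeightedFusion()
fusion.eval()  # disable dropout for a deterministic forward pass

roberta_vec = torch.randn(2, 768)  # stand-in for pooled RoBERTa-base output
deberta_vec = torch.randn(2, 768)  # stand-in for pooled DeBERTa-v3-base output
fused = fusion(roberta_vec, deberta_vec)

head = nn.Linear(768, 3)  # e.g. a three-way NLI head
logits = head(fused)
print(logits.shape)
```

Initialising the raw α parameter at zero starts the blend at an even 0.5/0.5 split, which lets training decide how much weight each encoder receives.
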
## Training Data

The model was trained on synthetic DEI organisational text generated
by the EquiBERT synthetic data pipeline, covering 20 DEI categories
across HR, policy, leadership, and workforce analytics domains.
For production use, fine-tune on real labelled DEI data.

## Limitations

- Trained on synthetic data; predictions should be validated
  before use in real HR or policy decisions.
- English-only.
- Not a substitute for qualified DEI practitioners or legal advice.
- May reflect biases present in the training corpus.

## Citation

If you use EquiBERT in your research, please cite:

```bibtex
@misc{equibert2024,
  author    = {SallySims},
  title     = {EquiBERT: A Multi-Task DEI Transformer},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/SallySims}
}
```

## All EquiBERT Models

| Model | Task | Primary Metric |
|-------|------|---------------|
| [equibert-bias-classifier](https://huggingface.co/SallySims/equibert-bias-classifier) | Bias Detection | Macro F1 |
| [equibert-microaggression](https://huggingface.co/SallySims/equibert-microaggression) | Microaggression Detection | Macro F1 |
| [equibert-category-tagger](https://huggingface.co/SallySims/equibert-category-tagger) | DEI Category Tagging | Macro F1 |
| [equibert-event-exclusion](https://huggingface.co/SallySims/equibert-event-exclusion) | Event Exclusion Classification | Macro F1 |
| [equibert-inclusive-language](https://huggingface.co/SallySims/equibert-inclusive-language) | Inclusive Language Scoring | Span F1 |
| [equibert-review-auditor](https://huggingface.co/SallySims/equibert-review-auditor) | Performance Review Auditing | Span F1 |
| [equibert-washing-detector](https://huggingface.co/SallySims/equibert-washing-detector) | DEI Washing Detection | MAE |
| [equibert-framing-scorer](https://huggingface.co/SallySims/equibert-framing-scorer) | Report Framing Scoring | MAE |
| [equibert-awareness-scorer](https://huggingface.co/SallySims/equibert-awareness-scorer) | DEI Awareness Scoring | MAE |
| [equibert-similarity](https://huggingface.co/SallySims/equibert-similarity) | Semantic Similarity | Accuracy |
| [equibert-ner](https://huggingface.co/SallySims/equibert-ner) | DEI Entity Recognition | Span F1 |
| [equibert-relation-extraction](https://huggingface.co/SallySims/equibert-relation-extraction) | Relation Extraction | Macro F1 |
| [equibert-qa](https://huggingface.co/SallySims/equibert-qa) | Extractive QA | Span EM |
| [equibert-search](https://huggingface.co/SallySims/equibert-search) | Semantic Search | MRR@10 |
| [equibert-nli](https://huggingface.co/SallySims/equibert-nli) | NLI / Textual Entailment | Macro F1 |
| [equibert-generator](https://huggingface.co/SallySims/equibert-generator) | DEI Text Generation | ROUGE-L |