---
language: en
license: apache-2.0
tags:
  - pytorch
  - text2text-generation
  - dei
  - text-generation
  - t5
  - equibert
metrics:
  - rouge
  - bleu
---

# EquiBERT: DEI Text Generator

**Model ID:** `SallySims/equibert-generator`

A T5-base model fine-tuned for conditional DEI text generation.
Given a task prefix and an input text, it generates inclusive,
equitable organisational text for seven task types.

## Task Prefixes

| Prefix | Task | Input → Output |
|--------|------|----------------|
| `rewrite inclusive:` | Inclusive rewriting | Biased text → Inclusive version |
| `generate policy:` | Policy generation | Topic → Policy section |
| `generate jd:` | Job description | Role description → Inclusive JD |
| `rewrite framing:` | Framing correction | Victim-blaming text → Structural framing |
| `generate commitment:` | DEI commitment | Goal → Measurable commitment |
| `rewrite review:` | Review debiasing | Biased review → Unbiased version |
| `generate awareness:` | Awareness content | Topic → Awareness statement |
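
Keeping the prefix strings in one place helps avoid typos when switching between tasks. The sketch below is illustrative only: `TASK_PREFIXES`, its short task keys, and `build_prompt` are hypothetical names that are not shipped with the model. Joining prefix and text with a single space follows the prompt shown in the Usage section; collapsing whitespace in the input is an assumption made here for tidiness, not a documented requirement.

```python
# Prefix strings copied from the table above; the short keys are hypothetical.
TASK_PREFIXES = {
    "inclusive": "rewrite inclusive:",
    "policy": "generate policy:",
    "jd": "generate jd:",
    "framing": "rewrite framing:",
    "commitment": "generate commitment:",
    "review": "rewrite review:",
    "awareness": "generate awareness:",
}


def build_prompt(task: str, text: str) -> str:
    """Return '<prefix> <text>' for a known task key, or raise ValueError."""
    try:
        prefix = TASK_PREFIXES[task]
    except KeyError:
        known = ", ".join(sorted(TASK_PREFIXES))
        raise ValueError(f"Unknown task {task!r}; expected one of: {known}") from None
    # Assumption: collapse runs of whitespace so multi-line input becomes one line.
    return f"{prefix} {' '.join(text.split())}"


print(build_prompt("jd", "Senior data engineer,\n  remote-friendly team"))
# generate jd: Senior data engineer, remote-friendly team
```

The returned string can be passed to the tokenizer in place of a hand-written prompt.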

## Usage

```python
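# T5Tokenizer requires the `sentencepiece` package to be installed.
from transformers import T5ForConditionalGeneration, T5Tokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model     = T5ForConditionalGeneration.from_pretrained("SallySims/equibert-generator")
tokenizer = T5Tokenizer.from_pretrained("SallySims/equibert-generator")

# Every prompt starts with one of the task prefixes, followed by the input text.
prompt = "rewrite inclusive: We need a rock star developer who can dominate the roadmap."
# Inputs longer than 256 tokens are truncated.
inputs = tokenizer(prompt, return_tensors="pt", max_length=256, truncation=True)
# Beam search with 4 beams, generating at most 200 new tokens.
output = model.generate(**inputs, max_new_tokens=200, num_beams=4)
result = tokenizer.decode(output[0], skip_special_tokens=True)
print(result)
# Example output (exact wording may vary):
# "We are looking for a skilled developer with strong technical expertise
#  who can contribute meaningfully to our product roadmap."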
```

## Applications

- Automated inclusive job description generation
- DEI report framing improvement
- Performance review debiasing assistance
- Policy language generation
- Leadership communication coaching

## Model Description

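EquiBERT is a multi-task DEI (Diversity, Equity and Inclusion) transformer
built on a dual-encoder backbone that fuses **RoBERTa-base** and
**DeBERTa-v3-base** via a learned weighted sum (α parameter).
The fused representation is fed into task-specific heads covering
17 distinct DEI analysis tasks. The generator documented on this card
is fine-tuned from T5-base rather than from the dual-encoder backbone;
the backbone and architecture details below describe the wider
EquiBERT family.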

**Organisation:** [SallySims](https://huggingface.co/SallySims)
**Framework:** PyTorch + HuggingFace Transformers
**Backbone:** RoBERTa-base + DeBERTa-v3-base (dual encoder, fused)
**Language:** English
**Domain:** Organisational DEI text: HR communications, policies,
job descriptions, performance reviews, leadership statements, reports

## Architecture

```
Input Text
    │
    ├──▶ RoBERTa-base encoder ──▶ Linear projection
    │                                     │
    └──▶ DeBERTa-v3-base encoder ──▶ Linear projection
                                          │
                              Weighted fusion (learned α)
                                          │
                                   Layer Norm + Dropout
                                          │
                              Task-specific head (see below)
```
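
The fusion step in the diagram can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the released implementation: the class name `WeightedFusion`, the sigmoid parameterisation of α, the `α` / `1 - α` mixing form, the dropout rate, and the use of one vector per input are all hypothetical. Only the hidden size of 768 is a known property of both RoBERTa-base and DeBERTa-v3-base.

```python
import torch
import torch.nn as nn


class WeightedFusion(nn.Module):
    """Hypothetical fusion layer: project two encoder outputs, mix with a learned alpha."""

    def __init__(self, dim_a: int = 768, dim_b: int = 768,
                 hidden: int = 768, dropout: float = 0.1):
        super().__init__()
        self.proj_a = nn.Linear(dim_a, hidden)  # projection for the RoBERTa-base output
        self.proj_b = nn.Linear(dim_b, hidden)  # projection for the DeBERTa-v3-base output
        # Assumption: an unconstrained scalar squashed by a sigmoid into (0, 1).
        self.alpha_logit = nn.Parameter(torch.zeros(1))
        self.norm = nn.LayerNorm(hidden)
        self.dropout = nn.Dropout(dropout)

    def forward(self, h_a: torch.Tensor, h_b: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.alpha_logit)
        # Weighted sum of the two projected representations.
        fused = alpha * self.proj_a(h_a) + (1.0 - alpha) * self.proj_b(h_b)
        # Layer norm followed by dropout, in the order shown in the diagram.
        return self.dropout(self.norm(fused))


# Random tensors stand in for the two encoders' outputs (batch of 2, size 768).
fusion = WeightedFusion().eval()
with torch.no_grad():
    out = fusion(torch.randn(2, 768), torch.randn(2, 768))
print(out.shape)  # torch.Size([2, 768])
```

With `alpha_logit` initialised to zero, both encoders start with equal weight (α = 0.5) and training can shift the balance either way.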

## Training Data

Trained on synthetic DEI organisational text generated by the
EquiBERT synthetic data pipeline, covering 20 DEI categories
across HR, policy, leadership, and workforce analytics domains.
For production use, fine-tune on real labelled DEI data.

## Limitations

- Trained on synthetic data: predictions should be validated
  before use in real HR or policy decisions.
- English-only.
- Not a substitute for qualified DEI practitioners or legal advice.
- May reflect biases present in the training corpus.

## Citation

If you use EquiBERT in your research, please cite:

```bibtex
@misc{equibert2024,
  author    = {SallySims},
  title     = {EquiBERT: A Multi-Task DEI Transformer},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/SallySims}
}
```

## All EquiBERT Models

| Model | Task | Primary Metric |
|-------|------|---------------|
| [equibert-bias-classifier](https://huggingface.co/SallySims/equibert-bias-classifier) | Bias Detection | Macro F1 |
| [equibert-microaggression](https://huggingface.co/SallySims/equibert-microaggression) | Microaggression Detection | Macro F1 |
| [equibert-category-tagger](https://huggingface.co/SallySims/equibert-category-tagger) | DEI Category Tagging | Macro F1 |
| [equibert-event-exclusion](https://huggingface.co/SallySims/equibert-event-exclusion) | Event Exclusion Classification | Macro F1 |
| [equibert-inclusive-language](https://huggingface.co/SallySims/equibert-inclusive-language) | Inclusive Language Scoring | Span F1 |
| [equibert-review-auditor](https://huggingface.co/SallySims/equibert-review-auditor) | Performance Review Auditing | Span F1 |
| [equibert-washing-detector](https://huggingface.co/SallySims/equibert-washing-detector) | DEI Washing Detection | MAE |
| [equibert-framing-scorer](https://huggingface.co/SallySims/equibert-framing-scorer) | Report Framing Scoring | MAE |
| [equibert-awareness-scorer](https://huggingface.co/SallySims/equibert-awareness-scorer) | DEI Awareness Scoring | MAE |
| [equibert-similarity](https://huggingface.co/SallySims/equibert-similarity) | Semantic Similarity | Accuracy |
| [equibert-ner](https://huggingface.co/SallySims/equibert-ner) | DEI Entity Recognition | Span F1 |
| [equibert-relation-extraction](https://huggingface.co/SallySims/equibert-relation-extraction) | Relation Extraction | Macro F1 |
| [equibert-qa](https://huggingface.co/SallySims/equibert-qa) | Extractive QA | Span EM |
| [equibert-search](https://huggingface.co/SallySims/equibert-search) | Semantic Search | MRR@10 |
| [equibert-nli](https://huggingface.co/SallySims/equibert-nli) | NLI / Textual Entailment | Macro F1 |
| [equibert-generator](https://huggingface.co/SallySims/equibert-generator) | DEI Text Generation | ROUGE-L |
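
The table lists ROUGE-L as the primary metric for `equibert-generator`. The card does not say how the score was computed, so the following is only a simplified sketch of the idea: a sentence-level ROUGE-L F1 based on the longest common subsequence (LCS) of lowercased, whitespace-split tokens. The function names are hypothetical, and a maintained evaluation library would normally be preferable for reported results, since tokenisation and stemming choices change the score.

```python
def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-L F1 over lowercased, whitespace-split tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    if not ref or not cand:
        return 0.0
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


score = rouge_l_f1("we value every team member", "we value each team member")
print(round(score, 2))  # 0.8
```

In the example, four of the five tokens form a common subsequence, so precision and recall are both 0.8.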