---
language: en
license: apache-2.0
tags:
  - pytorch
  - question-answering
  - dei
  - equibert
metrics:
  - exact_match
  - f1
---

# EquiBERT — DEI Extractive Question Answering

**Model ID:** `SallySims/equibert-qa`

Extractive QA model for DEI policy and report comprehension.
Finds the answer span within a provided context passage.
Drop-in replacement for `deepset/roberta-base-squad2`.

## Usage

```python
from transformers import pipeline

qa = pipeline("question-answering", model="SallySims/equibert-qa")

result = qa(
    question="What is the gender pay gap?",
    context="Our independent audit found a 9% unexplained gap for women "
            "after controlling for role and tenure."
)
# Example output (score is illustrative; start/end are character offsets into context):
# {"answer": "9%", "score": 0.94, "start": 30, "end": 32}
```
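
The `start` and `end` fields are character offsets into `context`, so a returned span can be sanity-checked by slicing. The sketch below uses a hand-written result dictionary in place of a live pipeline call; `span_is_consistent` is a hypothetical helper, not part of Transformers.

```python
def span_is_consistent(result: dict, context: str) -> bool:
    """Return True if the reported offsets slice out the reported answer."""
    return context[result["start"]:result["end"]] == result["answer"]


context = (
    "Our independent audit found a 9% unexplained gap for women "
    "after controlling for role and tenure."
)
# Hand-written stand-in for a pipeline result; the score is illustrative.
result = {"answer": "9%", "score": 0.94, "start": 30, "end": 32}

print(span_is_consistent(result, context))  # True
```

A check like this is a cheap guard when answers are later highlighted in the source document, since wrong offsets would highlight the wrong text.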

## Example Questions

- "What is the pay equity gap?"
- "Who owns the DEI targets?"
- "What percentage of hires are from underrepresented groups?"
- "By when will the gap be closed?"
- "What does the inclusion survey show?"

## Model Description

EquiBERT is a multi-task DEI (Diversity, Equity and Inclusion) transformer
built on a dual-encoder backbone that fuses **RoBERTa-base** and
**DeBERTa-v3-base** via a learned weighted sum (α parameter).
The fused representation is fed into task-specific heads covering
17 distinct DEI analysis tasks.
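
A minimal PyTorch sketch of such a fusion layer follows. It is not the released implementation, and several details are assumptions: α is modelled as a single scalar passed through a sigmoid, both encoders are assumed to yield hidden states already aligned to the same sequence length (the two models use different tokenizers, so a real system needs an alignment step), and the class name `WeightedFusion` and the dropout rate are illustrative.

```python
import torch
from torch import nn


class WeightedFusion(nn.Module):
    """Project two encoder outputs to a shared size and mix them with a learned alpha."""

    def __init__(self, roberta_dim=768, deberta_dim=768, fused_dim=768, dropout=0.1):
        super().__init__()
        self.proj_roberta = nn.Linear(roberta_dim, fused_dim)
        self.proj_deberta = nn.Linear(deberta_dim, fused_dim)
        # Assumption: one scalar logit; sigmoid keeps alpha in (0, 1).
        self.alpha_logit = nn.Parameter(torch.zeros(1))
        self.norm = nn.LayerNorm(fused_dim)
        self.dropout = nn.Dropout(dropout)

    @property
    def alpha(self) -> torch.Tensor:
        return torch.sigmoid(self.alpha_logit)

    def forward(self, h_roberta: torch.Tensor, h_deberta: torch.Tensor) -> torch.Tensor:
        # Inputs: (batch, seq_len, hidden) tensors with matching batch and seq_len.
        mixed = (
            self.alpha * self.proj_roberta(h_roberta)
            + (1 - self.alpha) * self.proj_deberta(h_deberta)
        )
        return self.dropout(self.norm(mixed))


# Random tensors stand in for the two encoders' last hidden states.
fusion = WeightedFusion()
h_r = torch.randn(2, 16, 768)
h_d = torch.randn(2, 16, 768)
print(fusion(h_r, h_d).shape)  # torch.Size([2, 16, 768])
```

With the logit initialised to zero, α starts at 0.5, so both encoders contribute equally until training shifts the balance.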

- **Organisation:** [SallySims](https://huggingface.co/SallySims)
- **Framework:** PyTorch + Hugging Face Transformers
- **Backbone:** RoBERTa-base + DeBERTa-v3-base (dual encoder, fused)
- **Language:** English
- **Domain:** Organisational DEI text (HR communications, policies,
  job descriptions, performance reviews, leadership statements, reports)

## Architecture

```
Input Text
    │
    ├──▶ RoBERTa-base encoder ──▶ Linear projection
    │                                     │
    └──▶ DeBERTa-v3-base encoder ──▶ Linear projection
                                          │
                              Weighted fusion (learned α)
                                          │
                                   Layer Norm + Dropout
                                          │
                              Task-specific head (see below)
```

## Training Data

Trained on synthetic DEI organisational text generated by the
EquiBERT synthetic data pipeline, covering 20 DEI categories
across HR, policy, leadership, and workforce analytics domains.
For production use, fine-tune on real labelled DEI data.

## Limitations

- Trained on synthetic data — predictions should be validated
  before use in real HR or policy decisions.
- English-only.
- Not a substitute for qualified DEI practitioners or legal advice.
- May reflect biases present in the training corpus.
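
One way to act on the first limitation is to hold back low-confidence answers for human review instead of using them directly. The sketch below assumes result dictionaries shaped like the pipeline output in the Usage section; the `triage` helper and the 0.5 threshold are hypothetical and would need tuning on real labelled data.

```python
def triage(results, threshold=0.5):
    """Split QA results into (accepted, needs_review) by confidence score."""
    accepted = [r for r in results if r["score"] >= threshold]
    needs_review = [r for r in results if r["score"] < threshold]
    return accepted, needs_review


# Hand-written example results; the scores are illustrative.
results = [
    {"answer": "9%", "score": 0.94, "start": 30, "end": 32},
    {"answer": "women", "score": 0.21, "start": 53, "end": 58},
]
accepted, needs_review = triage(results)
print([r["answer"] for r in accepted])      # ['9%']
print([r["answer"] for r in needs_review])  # ['women']
```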

## Citation

If you use EquiBERT in your research, please cite:

```bibtex
@misc{equibert2024,
  author    = {SallySims},
  title     = {EquiBERT: A Multi-Task DEI Transformer},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/SallySims}
}
```

## All EquiBERT Models

| Model | Task | Primary Metric |
|-------|------|---------------|
| [equibert-bias-classifier](https://huggingface.co/SallySims/equibert-bias-classifier) | Bias Detection | Macro F1 |
| [equibert-microaggression](https://huggingface.co/SallySims/equibert-microaggression) | Microaggression Detection | Macro F1 |
| [equibert-category-tagger](https://huggingface.co/SallySims/equibert-category-tagger) | DEI Category Tagging | Macro F1 |
| [equibert-event-exclusion](https://huggingface.co/SallySims/equibert-event-exclusion) | Event Exclusion Classification | Macro F1 |
| [equibert-inclusive-language](https://huggingface.co/SallySims/equibert-inclusive-language) | Inclusive Language Scoring | Span F1 |
| [equibert-review-auditor](https://huggingface.co/SallySims/equibert-review-auditor) | Performance Review Auditing | Span F1 |
| [equibert-washing-detector](https://huggingface.co/SallySims/equibert-washing-detector) | DEI Washing Detection | MAE |
| [equibert-framing-scorer](https://huggingface.co/SallySims/equibert-framing-scorer) | Report Framing Scoring | MAE |
| [equibert-awareness-scorer](https://huggingface.co/SallySims/equibert-awareness-scorer) | DEI Awareness Scoring | MAE |
| [equibert-similarity](https://huggingface.co/SallySims/equibert-similarity) | Semantic Similarity | Accuracy |
| [equibert-ner](https://huggingface.co/SallySims/equibert-ner) | DEI Entity Recognition | Span F1 |
| [equibert-relation-extraction](https://huggingface.co/SallySims/equibert-relation-extraction) | Relation Extraction | Macro F1 |
| [equibert-qa](https://huggingface.co/SallySims/equibert-qa) | Extractive QA | Span EM |
| [equibert-search](https://huggingface.co/SallySims/equibert-search) | Semantic Search | MRR@10 |
| [equibert-nli](https://huggingface.co/SallySims/equibert-nli) | NLI / Textual Entailment | Macro F1 |
| [equibert-generator](https://huggingface.co/SallySims/equibert-generator) | DEI Text Generation | ROUGE-L |
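
The Span EM metric listed for `equibert-qa`, together with the `f1` metric in the card metadata, is commonly computed SQuAD-style after light text normalisation. The sketch below shows that convention in plain Python. It is an assumption about how these metrics are defined here, not the author's evaluation script, and the function names are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalised strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token-level precision and recall after normalisation."""
    pred, ref = normalize(prediction).split(), normalize(reference).split()
    if not pred or not ref:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


print(exact_match("9%", "9%"))  # 1.0
print(round(token_f1("a 9% unexplained gap", "9% unexplained gap for women"), 2))  # 0.75
```

Note that this normalisation strips punctuation, so "9%" and "9" count as the same answer; a stricter comparison may be preferable for numeric DEI figures.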