Human Value Detection – DeBERTa Baseline

This model is the baseline 19-way value detector from the paper:

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum
Víctor Yeste, Paolo Rosso (2026)

It is a multi-label classifier over the 19 refined Schwartz basic values, trained on the English, machine-translated portion of the ValueEval'24 / ValuesML corpus.

  • Inputs: a single sentence (news / political text, in English).
  • Outputs: a probability for each of the 19 Schwartz values.
  • Labels: we collapse “attained” and “constrained” into a single binary label per value (value is expressed vs. not expressed).

This is the text-only DeBERTa-base baseline used in the paper. In the experiments, it is also one of the members of the best-performing ensemble (Baseline + LIWC-22 + Previous-2-Sentences).


Intended use

  • Research on human value detection and moral language.
  • Baseline / starting point for work on:
    • Schwartz value theory in NLP
    • Moral/value-aware text analysis in news and political discourse
    • Multi-label classification under class imbalance

The model was not trained or audited for safety-critical or high-stakes decision-making.


Labels

The 19 labels follow the refined Schwartz value continuum:

  1. Self-direction: thought
  2. Self-direction: action
  3. Stimulation
  4. Hedonism
  5. Achievement
  6. Power: dominance
  7. Power: resources
  8. Face
  9. Security: personal
  10. Security: societal
  11. Tradition
  12. Conformity: rules
  13. Conformity: interpersonal
  14. Humility
  15. Benevolence: caring
  16. Benevolence: dependability
  17. Universalism: concern
  18. Universalism: nature
  19. Universalism: tolerance

How to use

1. Quick start: direct use with Transformers

Because this model uses a custom architecture (EnhancedDebertaForSequenceClassification, with extra feature inputs), it must be loaded with AutoModelForSequenceClassification.from_pretrained(..., trust_remote_code=True) rather than through the generic pipeline("text-classification"), which only supports a fixed set of built-in model classes.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "VictorYeste/human-value-detection-deberta-baseline"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    trust_remote_code=True,  # important for custom model code
)

values = [
    "Self-direction: thought",
    "Self-direction: action",
    "Stimulation",
    "Hedonism",
    "Achievement",
    "Power: dominance",
    "Power: resources",
    "Face",
    "Security: personal",
    "Security: societal",
    "Tradition",
    "Conformity: rules",
    "Conformity: interpersonal",
    "Humility",
    "Benevolence: caring",
    "Benevolence: dependability",
    "Universalism: concern",
    "Universalism: nature",
    "Universalism: tolerance",
]

id2label = {i: label for i, label in enumerate(values)}

def predict_values(text, threshold=0.50):
    enc = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**enc)

    logits = outputs.logits.squeeze(0)          # (19,)
    probs = torch.sigmoid(logits)               # tensor of shape (19,)
    probs = probs.cpu().numpy()

    active = probs >= threshold
    active_labels = [id2label[i] for i, is_on in enumerate(active) if is_on]

    return {
        "probs": {id2label[i]: float(p) for i, p in enumerate(probs)},
        "labels": active_labels,
    }

example = "We must do more to protect the environment and future generations."
print(predict_values(example))

This will return something like:

{
    'probs': {
        'Self-direction: thought': 0.004236925393342972,
        'Self-direction: action': 0.007529713679105043,
        'Stimulation': 0.014666699804365635,
        'Hedonism': 0.004158752970397472,
        'Achievement': 0.017073791474103928,
        'Power: dominance': 0.006939167156815529,
        'Power: resources': 0.0076741743832826614,
        'Face': 0.0034943644423037767,
        'Security: personal': 0.00695117749273777,
        'Security: societal': 0.012955584563314915,
        'Tradition': 0.00661467807367444,
        'Conformity: rules': 0.0017643438186496496,
        'Conformity: interpersonal': 0.004064192529767752,
        'Humility': 0.0032048451248556376,
        'Benevolence: caring': 0.011124887503683567,
        'Benevolence: dependability': 0.017767170444130898,
        'Universalism: concern': 0.01814778335392475,
        'Universalism: nature': 0.9813610911369324,
        'Universalism: tolerance': 0.0025894937571138144
    },
    'labels': ['Universalism: nature']
}

Note: this is a multi-label model that returns all labels with scores in [0,1]. You still need to choose a threshold to decide what counts as “present”.

2. Multi-label usage with a custom threshold

For research use, you will often want to:

  • Apply a sigmoid over logits
  • Use a label-wise or global threshold (e.g., 0.3 instead of 0.5)

The predict_values function above already takes a threshold argument, so you can simply do:

predict_values(example, threshold=0.30)

and tune the threshold on your own validation set depending on your precision/recall preferences.
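As a sketch of what that tuning might look like, the snippet below sweeps a grid of global thresholds and keeps the one that maximises macro-F1 on held-out data. The function name and the probability/label arrays are hypothetical; this is not the paper's exact tuning procedure, just a minimal illustration.

```python
import numpy as np
from sklearn.metrics import f1_score

def tune_global_threshold(probs, y_true, candidates=np.arange(0.05, 0.95, 0.05)):
    """Pick the global threshold that maximises macro-F1 on validation data.

    probs:  (n_samples, 19) array of sigmoid probabilities
    y_true: (n_samples, 19) binary ground-truth matrix
    """
    best_t, best_f1 = 0.5, -1.0
    for t in candidates:
        # Binarise predictions at this threshold and score them
        f1 = f1_score(y_true, (probs >= t).astype(int),
                      average="macro", zero_division=0)
        if f1 > best_f1:
            best_t, best_f1 = float(t), f1
    return best_t, best_f1
```

A label-wise variant would run the same sweep per column instead of globally; for rare labels a lower threshold often recovers substantially more recall at modest precision cost.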


Training data

The model was trained on the English, machine-translated portion of the ValueEval’24 / ValuesML dataset:

  • Domain: news articles and political manifestos
  • Unit of analysis: individual sentences
  • Labels: 19 refined Schwartz values
    • Each value has attained and constrained annotations in the original data
    • For this model, these are collapsed into a single binary label per value
  • The presence variable in the paper (z_s) is defined as “any of the 19 labels is positive”, but this model directly predicts the 19 values.

Important: the original dataset is distributed under a restricted Data Usage Agreement. You must obtain the data separately from the ValueEval/ValuesML organisers (e.g. via Zenodo) and respect their license.


Training setup

  • Base model: microsoft/deberta-base
  • Task: 19-way multi-label classification
  • Objective: binary cross-entropy (BCEWithLogits) over the 19 labels
  • Max sequence length: 512 tokens
  • Optimizer: AdamW
  • Effective batch size: 16 (batch 4 × gradient accumulation 4)
  • Learning rate: 2e-5
  • Weight decay: 0.15
  • Epochs: up to 10 with early stopping on validation macro–F1
  • Hardware: single GPU with ≤ 8 GB VRAM

This is the text-only baseline configuration in the paper (no LIWC or context features).
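The multi-label objective above can be sketched as follows. The tensors here are hypothetical placeholders, not the actual training loop: BCEWithLogitsLoss applies a sigmoid to each of the 19 logits and computes an independent binary cross-entropy term per label.

```python
import torch
import torch.nn as nn

NUM_VALUES = 19
loss_fn = nn.BCEWithLogitsLoss()  # sigmoid + binary cross-entropy per label

# Hypothetical batch: raw model logits and multi-hot target vectors
logits = torch.randn(4, NUM_VALUES)                      # (batch, 19) raw scores
targets = torch.randint(0, 2, (4, NUM_VALUES)).float()   # (batch, 19) binary labels

loss = loss_fn(logits, targets)  # scalar, averaged over batch and labels
```

Because each label gets its own sigmoid and loss term, the model can mark any subset of the 19 values as present, unlike a softmax classifier that forces a single choice.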


Performance (paper reference)

On the English ValueEval’24 sentence-level splits used in the paper, this baseline DeBERTa model achieves:

  • Macro–F₁ (19 values, test): ~0.31 (with a tuned global threshold around 0.30)

The paper reports:

  • Comparisons with:
    • Presence-gated hierarchies
    • Feature-augmented DeBERTa models (LIWC-22, prior-sentence context, topics)
    • Instruction-tuned LLM baselines (7–9B)
  • A small soft-voting ensemble of three DeBERTa-based models (including this baseline) obtains the best overall performance (macro–F₁ ≈ 0.33).
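Soft voting of this kind averages the per-label sigmoid probabilities of the member models before thresholding. A minimal sketch (the probability arrays and function name are illustrative assumptions, not the paper's exact pipeline):

```python
import numpy as np

def soft_vote(member_probs, threshold=0.30):
    """Average per-label probabilities across ensemble members, then threshold.

    member_probs: list of (n_samples, 19) probability arrays, one per model
    """
    avg = np.mean(np.stack(member_probs, axis=0), axis=0)  # (n_samples, 19)
    return avg, (avg >= threshold).astype(int)
```

Averaging probabilities (rather than hard 0/1 votes) lets a confident member outweigh two uncertain ones, which tends to help on rare labels.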

For full details, please refer to the paper:

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum Víctor Yeste, Paolo Rosso (2026)


Limitations and bias

  • The model is trained on news and political texts; it may not generalise to:
    • Social media
    • Everyday conversations
    • Other genres or languages
  • Values are annotated at the sentence level; many real-world value cues are only clear in broader context.
  • Rare values (e.g., Humility, Hedonism, Universalism: tolerance) have few positive examples and are harder to predict.
  • No systematic bias or fairness analysis has been conducted; the model should not be used for profiling individuals or making high-stakes decisions.

If you use this model, please:

  • Keep humans in the loop.
  • Treat outputs as noisy indicators, especially for rare labels.


License

The model weights and code in this repository are released under the Apache License 2.0.

See the LICENSE file (or the license field in this model card) for details.

Note: This does not grant you any rights over the underlying training data (ValueEval/ValuesML). Please obtain and use that data under its own license and Data Usage Agreement.


Citation

If you use this model or the associated code in your research, please cite:

@misc{yeste2026humanvaluessinglesentence,
      title={Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum}, 
      author={Víctor Yeste and Paolo Rosso},
      year={2026},
      eprint={2601.14172},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.14172}, 
}

You may also want to cite the ValueEval / ValuesML dataset:

@misc{ValueEval24Zenodo,
  author  = {{The ValuesML Team}},
  title   = {Touch{\'e}24{-}ValueEval},
  year    = {2024},
  month   = {8},
  version = {2024-08-09},
  publisher = {Zenodo},
  doi     = {10.5281/zenodo.13283288},
  url     = {https://doi.org/10.5281/zenodo.13283288}
}