Human Value Detection – DeBERTa + Previous Sentences (2-sentence context)

This model is the DeBERTa + 2 previous sentences value detector from the paper:

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum
Víctor Yeste, Paolo Rosso (2026)

It is a multi-label classifier over the 19 refined Schwartz basic values, trained on the English, machine-translated portion of the ValueEval'24 / ValuesML corpus, with two previous sentences incorporated as features.

  • Base backbone: microsoft/deberta-base
  • Inputs during training/inference in the paper:
    • The current sentence (tokenized with DeBERTa)
    • A vector of binary labels predicted for the previous 2 sentences (prev_label_features, size = 2 × 19)
    • Optionally, the previous sentences and their predicted labels concatenated to the text
  • Outputs: a probability for each of the 19 Schwartz values.
  • Labels: we collapse “attained” and “constrained” into a single binary label per value (value is expressed vs. not expressed).
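The collapse from two polarity annotations to one binary label can be sketched as follows (a minimal illustration; the function and field names are ours, not the dataset's actual column names):

```python
# Hypothetical sketch: a value counts as "expressed" if it is either
# attained or constrained in the original annotation.
def collapse_label(attained: float, constrained: float) -> int:
    """Collapse attained/constrained annotations into one binary label."""
    return int(attained > 0 or constrained > 0)

print(collapse_label(1.0, 0.0))  # attained only     -> 1
print(collapse_label(0.0, 1.0))  # constrained only  -> 1
print(collapse_label(0.0, 0.0))  # value not present -> 0
```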

This checkpoint corresponds to the “Baseline + Previous-Sentences-2” feature-augmented model used in the paper and is one of the members of the best-performing DeBERTa ensemble.

⚠️ Important: In the paper, previous-sentence information is computed auto-regressively over each document (Text-ID). This Hugging Face model only contains the trained weights. To reproduce the full pipeline, please refer to the accompanying code repository.


Intended use

  • Research on human value detection, moral language, and discourse/context effects.
  • Baseline / starting point for work on:
    • Context-aware value detection
    • Sequential modelling of moral content in news and political discourse
    • Multi-label classification with auxiliary context features

The model was not trained or audited for safety-critical or high-stakes decision-making.


Labels

The 19 labels follow the refined Schwartz value continuum:

  1. Self-direction: thought
  2. Self-direction: action
  3. Stimulation
  4. Hedonism
  5. Achievement
  6. Power: dominance
  7. Power: resources
  8. Face
  9. Security: personal
  10. Security: societal
  11. Tradition
  12. Conformity: rules
  13. Conformity: interpersonal
  14. Humility
  15. Benevolence: caring
  16. Benevolence: dependability
  17. Universalism: concern
  18. Universalism: nature
  19. Universalism: tolerance

How to use

Because this model uses a custom architecture (EnhancedDebertaForSequenceClassification) with a previous-sentence label branch, it is loaded via:

  • AutoModelForSequenceClassification.from_pretrained(..., trust_remote_code=True)
  • and (for full fidelity) expects a prev_label_features tensor of shape [batch_size, 2 * num_labels].

1. Minimal single-sentence example (no context, prev labels set to zeros)

If you just want to run the model on one sentence without context, you can pass a zero vector as prev_label_features.
This will run, but it will not fully match the paper’s setup (where prior labels are dynamic predictions from previous sentences in the same document).

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "VictorYeste/human-value-detection-deberta-previous-sentences-2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    trust_remote_code=True,  # important for custom model code
)

values = [
    "Self-direction: thought",
    "Self-direction: action",
    "Stimulation",
    "Hedonism",
    "Achievement",
    "Power: dominance",
    "Power: resources",
    "Face",
    "Security: personal",
    "Security: societal",
    "Tradition",
    "Conformity: rules",
    "Conformity: interpersonal",
    "Humility",
    "Benevolence: caring",
    "Benevolence: dependability",
    "Universalism: concern",
    "Universalism: nature",
    "Universalism: tolerance",
]

id2label = {i: label for i, label in enumerate(values)}

def predict_values(text, prev_labels_vec=None, threshold=0.50):
    enc = tokenizer(text, return_tensors="pt", truncation=True)

    # Previous-sentence label features:
    # vector of length 2 * num_labels (prev-1 and prev-2).
    if prev_labels_vec is None:
        prev_dim = 2 * model.config.num_labels
        prev_labels_vec = [0.0] * prev_dim

    prev_tensor = torch.tensor([prev_labels_vec], dtype=torch.float32)

    with torch.no_grad():
        outputs = model(**enc, prev_label_features=prev_tensor)

    logits = outputs.logits.squeeze(0)   # (19,)
    probs = torch.sigmoid(logits)        # tensor (19,)
    probs = probs.cpu().numpy()

    active = probs >= threshold
    active_labels = [id2label[i] for i, is_on in enumerate(active) if is_on]

    return {
        "probs": {id2label[i]: float(p) for i, p in enumerate(probs)},
        "labels": active_labels,
    }

example = "We must do more to protect the environment and future generations."
print(predict_values(example))

This treats the sentence as if there were no informative previous sentences (all zero labels).

2. Using real previous-sentence labels (document-level use)

In the paper, for each document (Text-ID):

  1. Sentences are processed in order of Sentence-ID.
  2. For sentence t, the model receives:
    • The (possibly augmented) text at position t.
    • A vector containing the predicted binary labels for sentences t−1 and t−2.

This requires an auto-regressive loop over the document. You can implement this yourself by:

  • Keeping a prev_pred_1 and prev_pred_2 vector of length 19 (one per label).
  • For each sentence:
    • Build a prev_labels_vec = prev_pred_1 + prev_pred_2.
    • Call predict_values(..., prev_labels_vec=prev_labels_vec).
    • Update prev_pred_2 = prev_pred_1, prev_pred_1 = new_binary_preds.

For the exact implementation used in the paper (including text concatenation with tagged previous sentences), please see the accompanying GitHub repository.
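The loop above can be sketched generically. Here `predict` is a stand-in for any callable mapping (text, prev_labels_vec) to a list of 19 binary predictions, e.g. a thin wrapper around predict_values from the minimal example that thresholds its probabilities; it is a placeholder, not part of the model's API.

```python
NUM_LABELS = 19

def predict_document(sentences, predict):
    """Run a predictor over a document sentence-by-sentence, feeding each
    sentence the binary predictions for the two preceding sentences."""
    prev_pred_1 = [0.0] * NUM_LABELS  # predictions for sentence t-1
    prev_pred_2 = [0.0] * NUM_LABELS  # predictions for sentence t-2
    all_preds = []
    for text in sentences:
        # Length 2 * 19: labels of t-1 followed by labels of t-2.
        prev_labels_vec = prev_pred_1 + prev_pred_2
        preds = predict(text, prev_labels_vec)
        all_preds.append(preds)
        # Shift the context window by one sentence.
        prev_pred_2, prev_pred_1 = prev_pred_1, preds
    return all_preds
```

Note that this sketch does not reproduce the paper's optional text concatenation with tagged previous sentences; see the repository for that part.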


Training data

The model was trained on the English, machine-translated portion of the ValueEval’24 / ValuesML dataset:

  • Domain: news articles and political manifestos
  • Unit of analysis: individual sentences, grouped into documents by Text-ID
  • Labels: 19 refined Schwartz values
    • Each value has attained and constrained annotations in the original data
    • For this model, these are collapsed into a single binary label per value

Important: the original dataset is distributed under a restricted Data Usage Agreement. You must obtain the data separately from the ValueEval/ValuesML organisers (e.g. via Zenodo) and respect their license.


Training setup

  • Base model: microsoft/deberta-base
  • Task: 19-way multi-label classification
  • Objective: binary cross-entropy (BCEWithLogitsLoss) over the 19 labels
  • Inputs:
    • DeBERTa sentence embedding
    • A 2×19-dimensional vector of previous-sentence labels (prev_label_features) passed through a 16-dimensional MLP branch
  • Max sequence length: 512 tokens
  • Optimizer: AdamW
  • Effective batch size: 16 (batch 4 × gradient accumulation 4, adjusted when previous-sentence features are active)
  • Learning rate: 2e-5
  • Weight decay: 0.15
  • Epochs: up to 10, with early stopping on validation macro–F1
  • Hardware: single GPU with ≤ 8 GB VRAM

This is the “Baseline + Previous-Sentences-2” context-aware configuration described in the paper.
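The previous-label branch can be sketched roughly as below. This is a simplified re-implementation for illustration, not the checkpoint's actual EnhancedDebertaForSequenceClassification code; layer names are ours, and only the dimensions follow the setup above.

```python
import torch
import torch.nn as nn

class PrevLabelHead(nn.Module):
    """Illustrative sketch: the pooled DeBERTa sentence embedding is
    concatenated with a 16-dim projection of the 2x19 previous-label
    vector, then mapped to 19 logits."""
    def __init__(self, hidden_size=768, num_labels=19, prev_dim=16):
        super().__init__()
        self.prev_branch = nn.Sequential(
            nn.Linear(2 * num_labels, prev_dim),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(hidden_size + prev_dim, num_labels)

    def forward(self, pooled_output, prev_label_features):
        prev = self.prev_branch(prev_label_features)
        return self.classifier(torch.cat([pooled_output, prev], dim=-1))

# Training would then apply BCE over the 19 logits, e.g.:
# loss = nn.BCEWithLogitsLoss()(logits, multi_hot_targets)
```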


Performance (paper reference)

On the English ValueEval’24 sentence-level splits, the paper compares:

  • Text-only DeBERTa baseline
  • DeBERTa with LIWC-22 features
  • DeBERTa with previous-sentence features (this model)
  • Additional feature-augmented variants (topics, etc.)
  • Instruction-tuned LLM baselines (7–9B)
  • A small soft-voting ensemble of three DeBERTa-based models (including this one) which achieves the best overall macro–F₁ (≈ 0.33).
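Soft voting in this setting amounts to averaging the per-label sigmoid probabilities of the ensemble members before thresholding. A minimal sketch (the member probabilities below are made up for illustration):

```python
def soft_vote(member_probs, threshold=0.5):
    """Average per-label probabilities across ensemble members, then
    threshold. member_probs: list of equal-length probability lists."""
    n = len(member_probs)
    avg = [sum(p[i] for p in member_probs) / n
           for i in range(len(member_probs[0]))]
    return [int(p >= threshold) for p in avg], avg

labels, avg = soft_vote([[0.9, 0.2], [0.7, 0.4], [0.2, 0.3]])
# avg ≈ [0.6, 0.3] -> labels [1, 0]
```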

For exact macro–F₁ scores and per-label results, please refer to:

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum
Víctor Yeste, Paolo Rosso (2026)


Limitations and bias

  • The model is trained on news and political texts; it may not generalise to:
    • Social media
    • Everyday conversations
    • Other genres or languages
  • Values are annotated at the sentence level; many real-world value cues are only clear in broader context.
  • Rare values (e.g., Humility, Hedonism, Universalism: tolerance) have few positive examples and are harder to predict.
  • The previous-sentence mechanism assumes a coherent document structure (ordered sentences for each Text-ID).
  • No systematic bias or fairness analysis has been conducted; the model should not be used for profiling individuals or making high-stakes decisions.

If you use this model, please:

  • Keep humans in the loop.
  • Treat outputs as noisy indicators, especially for rare labels.

License

The model weights and code in this repository are released under the Apache License 2.0.

This does not grant you any rights over the underlying training data (ValueEval/ValuesML). Please obtain and use that data under its own license and Data Usage Agreement.


Citation

If you use this model or the associated code in your research, please cite:

@misc{yeste2026humanvaluessinglesentence,
      title={Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum}, 
      author={Víctor Yeste and Paolo Rosso},
      year={2026},
      eprint={2601.14172},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.14172}, 
}

You may also want to cite the ValueEval / ValuesML dataset:

@misc{ValueEval24Zenodo,
  author    = {{The ValuesML Team}},
  title     = {Touch{\'e}24{-}ValueEval},
  year      = {2024},
  month     = {8},
  version   = {2024-08-09},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.13283288},
  url       = {https://doi.org/10.5281/zenodo.13283288}
}