Human Value Detection – DeBERTa + Previous Sentences (2-sentence context)

This model is the DeBERTa + 2 previous sentences value detector from the paper:

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum
Víctor Yeste, Paolo Rosso (2026)

It is a multi-label classifier over the 19 refined Schwartz basic values, trained on the English, machine-translated portion of the ValueEval'24 / ValuesML corpus, with two previous sentences incorporated as features.

  • Base backbone: microsoft/deberta-base
  • Inputs during training/inference in the paper:
    • The current sentence (tokenized with DeBERTa)
    • A vector of binary labels predicted for the previous 2 sentences (prev_label_features, size = 2 × 19)
    • Optionally, the previous sentences and their predicted labels concatenated to the text
  • Outputs: a probability for each of the 19 Schwartz values.
  • Labels: we collapse “attained” and “constrained” into a single binary label per value (value is expressed vs. not expressed).
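The collapse from two polarity annotations to one binary label can be sketched as follows (a minimal illustration; the function and field names are ours, not the dataset's actual column names):

```python
# Hypothetical sketch: a value counts as "expressed" if it is either
# attained or constrained in the original annotation.
def collapse_label(attained: float, constrained: float) -> int:
    """Collapse attained/constrained annotations into one binary label."""
    return int(attained > 0 or constrained > 0)

print(collapse_label(1.0, 0.0))  # attained only     -> 1
print(collapse_label(0.0, 1.0))  # constrained only  -> 1
print(collapse_label(0.0, 0.0))  # value not present -> 0
```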

This checkpoint corresponds to the “Baseline + Previous-Sentences-2” feature-augmented model used in the paper and is one of the members of the best-performing DeBERTa ensemble.

⚠️ Important: In the paper, previous-sentence information is computed auto-regressively over each document (Text-ID). This Hugging Face model only contains the trained weights. To reproduce the full pipeline, please refer to the accompanying code repository.


Intended use

  • Research on human value detection, moral language, and discourse/context effects.
  • Baseline / starting point for work on:
    • Context-aware value detection
    • Sequential modelling of moral content in news and political discourse
    • Multi-label classification with auxiliary context features

The model was not trained or audited for safety-critical or high-stakes decision-making.


Labels

The 19 labels follow the refined Schwartz value continuum:

  1. Self-direction: thought
  2. Self-direction: action
  3. Stimulation
  4. Hedonism
  5. Achievement
  6. Power: dominance
  7. Power: resources
  8. Face
  9. Security: personal
  10. Security: societal
  11. Tradition
  12. Conformity: rules
  13. Conformity: interpersonal
  14. Humility
  15. Benevolence: caring
  16. Benevolence: dependability
  17. Universalism: concern
  18. Universalism: nature
  19. Universalism: tolerance

How to use

Because this model uses a custom architecture (EnhancedDebertaForSequenceClassification) with a previous-sentence label branch, it is loaded via:

  • AutoModelForSequenceClassification.from_pretrained(..., trust_remote_code=True)
  • and (for full fidelity) expects a prev_label_features tensor of shape [batch_size, 2 * num_labels].

1. Minimal single-sentence example (no context, prev labels set to zeros)

If you just want to run the model on one sentence without context, you can pass a zero vector as prev_label_features.
This will run, but it will not fully match the paper’s setup (where prior labels are dynamic predictions from previous sentences in the same document).

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "VictorYeste/human-value-detection-deberta-previous-sentences-2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(
    model_id,
    trust_remote_code=True,  # important for custom model code
)

values = [
    "Self-direction: thought",
    "Self-direction: action",
    "Stimulation",
    "Hedonism",
    "Achievement",
    "Power: dominance",
    "Power: resources",
    "Face",
    "Security: personal",
    "Security: societal",
    "Tradition",
    "Conformity: rules",
    "Conformity: interpersonal",
    "Humility",
    "Benevolence: caring",
    "Benevolence: dependability",
    "Universalism: concern",
    "Universalism: nature",
    "Universalism: tolerance",
]

id2label = {i: label for i, label in enumerate(values)}

def predict_values(text, prev_labels_vec=None, threshold=0.50):
    enc = tokenizer(text, return_tensors="pt", truncation=True)

    # Previous-sentence label features:
    # vector of length 2 * num_labels (prev-1 and prev-2).
    if prev_labels_vec is None:
        prev_dim = 2 * model.config.num_labels
        prev_labels_vec = [0.0] * prev_dim

    prev_tensor = torch.tensor([prev_labels_vec], dtype=torch.float32)

    with torch.no_grad():
        outputs = model(**enc, prev_label_features=prev_tensor)

    logits = outputs.logits.squeeze(0)   # (19,)
    probs = torch.sigmoid(logits)        # tensor (19,)
    probs = probs.cpu().numpy()

    active = probs >= threshold
    active_labels = [id2label[i] for i, is_on in enumerate(active) if is_on]

    return {
        "probs": {id2label[i]: float(p) for i, p in enumerate(probs)},
        "labels": active_labels,
    }

example = "We must do more to protect the environment and future generations."
print(predict_values(example))

This treats the sentence as if there were no informative previous sentences (all zero labels).

2. Using real previous-sentence labels (document-level use)

In the paper, for each document (Text-ID):

  1. Sentences are processed in order of Sentence-ID.
  2. For sentence t, the model receives:
    • The (possibly augmented) text at position t.
    • A vector containing the predicted binary labels for sentences t−1 and t−2.

This requires an auto-regressive loop over the document. You can implement this yourself by:

  • Keeping a prev_pred_1 and prev_pred_2 vector of length 19 (one per label).
  • For each sentence:
    • Build a prev_labels_vec = prev_pred_1 + prev_pred_2.
    • Call predict_values(..., prev_labels_vec=prev_labels_vec).
    • Update prev_pred_2 = prev_pred_1, prev_pred_1 = new_binary_preds.

For the exact implementation used in the paper (including text concatenation with tagged previous sentences), please see the accompanying GitHub repository.
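The loop above can be sketched generically. Here `predict` is a stand-in for any callable mapping (text, prev_labels_vec) to a list of 19 binary predictions, e.g. a thin wrapper around predict_values from the minimal example that thresholds its probabilities; it is a placeholder, not part of the model's API.

```python
NUM_LABELS = 19

def predict_document(sentences, predict):
    """Run a predictor over a document sentence-by-sentence, feeding each
    sentence the binary predictions for the two preceding sentences."""
    prev_pred_1 = [0.0] * NUM_LABELS  # predictions for sentence t-1
    prev_pred_2 = [0.0] * NUM_LABELS  # predictions for sentence t-2
    all_preds = []
    for text in sentences:
        # Length 2 * 19: labels of t-1 followed by labels of t-2.
        prev_labels_vec = prev_pred_1 + prev_pred_2
        preds = predict(text, prev_labels_vec)
        all_preds.append(preds)
        # Shift the context window by one sentence.
        prev_pred_2, prev_pred_1 = prev_pred_1, preds
    return all_preds
```

Note that this sketch does not reproduce the paper's optional text concatenation with tagged previous sentences; see the repository for that part.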


Training data

The model was trained on the English, machine-translated portion of the ValueEval’24 / ValuesML dataset:

  • Domain: news articles and political manifestos
  • Unit of analysis: individual sentences, grouped into documents by Text-ID
  • Labels: 19 refined Schwartz values
    • Each value has attained and constrained annotations in the original data
    • For this model, these are collapsed into a single binary label per value

Important: the original dataset is distributed under a restricted Data Usage Agreement. You must obtain the data separately from the ValueEval/ValuesML organisers (e.g. via Zenodo) and respect their license.


Training setup

  • Base model: microsoft/deberta-base
  • Task: 19-way multi-label classification
  • Objective: binary cross-entropy (BCEWithLogitsLoss) over the 19 labels
  • Inputs:
    • DeBERTa sentence embedding
    • A 2×19-dimensional vector of previous-sentence labels (prev_label_features) passed through a 16-dimensional MLP branch
  • Max sequence length: 512 tokens
  • Optimizer: AdamW
  • Effective batch size: 16 (batch 4 × gradient accumulation 4, adjusted when previous-sentence features are active)
  • Learning rate: 2e-5
  • Weight decay: 0.15
  • Epochs: up to 10, with early stopping on validation macro–F1
  • Hardware: single GPU with ≤ 8 GB VRAM

This is the “Baseline + Previous-Sentences-2” context-aware configuration described in the paper.
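The previous-label branch can be sketched roughly as below. This is a simplified re-implementation for illustration, not the checkpoint's actual EnhancedDebertaForSequenceClassification code; layer names are ours, and only the dimensions follow the setup above.

```python
import torch
import torch.nn as nn

class PrevLabelHead(nn.Module):
    """Illustrative sketch: the pooled DeBERTa sentence embedding is
    concatenated with a 16-dim projection of the 2x19 previous-label
    vector, then mapped to 19 logits."""
    def __init__(self, hidden_size=768, num_labels=19, prev_dim=16):
        super().__init__()
        self.prev_branch = nn.Sequential(
            nn.Linear(2 * num_labels, prev_dim),
            nn.ReLU(),
        )
        self.classifier = nn.Linear(hidden_size + prev_dim, num_labels)

    def forward(self, pooled_output, prev_label_features):
        prev = self.prev_branch(prev_label_features)
        return self.classifier(torch.cat([pooled_output, prev], dim=-1))

# Training would then apply BCE over the 19 logits, e.g.:
# loss = nn.BCEWithLogitsLoss()(logits, multi_hot_targets)
```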


Performance (paper reference)

On the English ValueEval’24 sentence-level splits, the paper compares:

  • Text-only DeBERTa baseline
  • DeBERTa with LIWC-22 features
  • DeBERTa with previous-sentence features (this model)
  • Additional feature-augmented variants (topics, etc.)
  • Instruction-tuned LLM baselines (7–9B)
  • A small soft-voting ensemble of three DeBERTa-based models (including this one) which achieves the best overall macro–F₁ (≈ 0.33).
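Soft voting in this setting amounts to averaging the per-label sigmoid probabilities of the ensemble members before thresholding. A minimal sketch (the member probabilities below are made up for illustration):

```python
def soft_vote(member_probs, threshold=0.5):
    """Average per-label probabilities across ensemble members, then
    threshold. member_probs: list of equal-length probability lists."""
    n = len(member_probs)
    avg = [sum(p[i] for p in member_probs) / n
           for i in range(len(member_probs[0]))]
    return [int(p >= threshold) for p in avg], avg

labels, avg = soft_vote([[0.9, 0.2], [0.7, 0.4], [0.2, 0.3]])
# avg ≈ [0.6, 0.3] -> labels [1, 0]
```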

For exact macro–F₁ scores and per-label results, please refer to:

Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum
Víctor Yeste, Paolo Rosso (2026)


Limitations and bias

  • The model is trained on news and political texts; it may not generalise to:
    • Social media
    • Everyday conversations
    • Other genres or languages
  • Values are annotated at the sentence level; many real-world value cues are only clear in broader context.
  • Rare values (e.g., Humility, Hedonism, Universalism: tolerance) have few positive examples and are harder to predict.
  • The previous-sentence mechanism assumes a coherent document structure (ordered sentences for each Text-ID).
  • No systematic bias or fairness analysis has been conducted; the model should not be used for profiling individuals or making high-stakes decisions.

If you use this model, please:

  • Keep humans in the loop.
  • Treat outputs as noisy indicators, especially for rare labels.

License

The model weights and code in this repository are released under the Apache License 2.0.

This does not grant you any rights over the underlying training data (ValueEval/ValuesML). Please obtain and use that data under its own license and Data Usage Agreement.


Citation

If you use this model or the associated code in your research, please cite:

@misc{yeste2026humanvaluessinglesentence,
      title={Human Values in a Single Sentence: Moral Presence, Hierarchies, and Transformer Ensembles on the Schwartz Continuum}, 
      author={Víctor Yeste and Paolo Rosso},
      year={2026},
      eprint={2601.14172},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2601.14172}, 
}

You may also want to cite the ValueEval / ValuesML dataset:

@misc{ValueEval24Zenodo,
  author    = {{The ValuesML Team}},
  title     = {Touch{\'e}24{-}ValueEval},
  year      = {2024},
  month     = {8},
  version   = {2024-08-09},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.13283288},
  url       = {https://doi.org/10.5281/zenodo.13283288}
}