Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models
This model is a lightweight encoder (150M parameters) based on BERT, designed for assessing the privacy sensitivity of textual data. It was introduced in the paper Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models.
The model was distilled from Mistral Large 3 (675B) using a large-scale dataset of privacy-annotated texts across 10 diverse domains. It aims to preserve strong agreement with human judgments of privacy while being efficient enough for large-scale real-world deployment.
Model Details
- Task: Text Classification (Privacy Sensitivity Score 1-5)
- Model Type: BERT-base
- Distilled from: Mistral Large 3
- Repository: https://github.com/gabrielloiseau/privacy-distillation
- Paper: arXiv:2603.29497
How to Get Started with the Model
You can use the model directly with a Hugging Face pipeline:
from transformers import pipeline
classifier = pipeline("text-classification", model="gabrielloiseau/BERT-base-privacy")
result = classifier("Happy First Day of Spring!")
print(result) # [{'label': '1', 'score': 0.98}]
The labels 1 through 5 give the degree of privacy sensitivity, from 1 (least sensitive) to 5 (most sensitive).
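Because the labels form an ordered scale, it can be useful to look at the full score distribution rather than only the top label. The following is a minimal sketch, not the authors' evaluation code. It assumes the label strings are "1" through "5", as in the example output above. The helper names `top_label`, `expected_score`, and `assess` are hypothetical, and taking a probability-weighted mean over the labels is an illustrative choice rather than something described in the paper.

```python
from typing import Any, Dict, List

# Assumption: each prediction looks like {"label": "1", "score": 0.98},
# with labels "1".."5" as in the pipeline example above.
Prediction = Dict[str, Any]


def top_label(predictions: List[Prediction]) -> int:
    """Return the most probable sensitivity level as an integer."""
    best = max(predictions, key=lambda p: p["score"])
    return int(best["label"])


def expected_score(predictions: List[Prediction]) -> float:
    """Return the probability-weighted mean sensitivity level (illustrative)."""
    total = sum(p["score"] for p in predictions)
    return sum(int(p["label"]) * p["score"] for p in predictions) / total


def assess(texts: List[str]) -> List[Dict[str, float]]:
    """Score a batch of texts; downloads the model weights on first use."""
    # Imported lazily so the two helpers above work without transformers.
    from transformers import pipeline

    classifier = pipeline(
        "text-classification",
        model="gabrielloiseau/BERT-base-privacy",
        top_k=None,  # return a score for every label, not only the best one
    )
    results = classifier(texts)  # one list of label/score dicts per input text
    return [
        {"top": top_label(r), "expected": expected_score(r)}
        for r in results
    ]
```

Calling `assess(["Happy First Day of Spring!"])` returns one dictionary per input text. Passing a list of texts also lets the pipeline process them in a single call, which suits the large-scale use the model is designed for.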
Citation
@misc{loiseau2026distilling,
title={Distilling Human-Aligned Privacy Sensitivity Assessment from Large Language Models},
author={Gabriel Loiseau and Damien Sileo and Damien Riquet and Maxime Meyer and Marc Tommasi},
year={2026},
eprint={2603.29497},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2603.29497},
}