	---
	# For reference on model card metadata, see the spec: https://github.com/huggingface/hub-docs/blob/main/modelcard.md?plain=1
	# Doc / guide: https://huggingface.co/docs/hub/model-cards
	license: mit
	language:
	- cs
	---
	# Model Card for small-e-czech-multi-label-supportive-interactions-cs

	This model is fine-tuned for multi-label text classification of Supportive Interactions in Instant Messenger dialogs of Adolescents.

	## Model Description

	The model was fine-tuned on a dataset of Instant Messenger dialogs of Adolescents. The classification is multi-label, and the model outputs a probability for each of the labels {0,1,2,3,4,5}:

	0. None
	1. Informational Support
	2. Emotional Support
	3. Social Companionship
	4. Appraisal
	5. Instrumental Support

	- Developed by: Anonymous
	- Language(s): cs
	- Finetuned from: small-e-czech
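
Because the task is multi-label, each of the six outputs is read as an independent probability rather than as one entry of a softmax distribution, so a single context window can receive several labels or none. The following is a minimal sketch of that decoding step, using made-up logits and a threshold of 0.5. The `LABELS` list and the `decode` helper are illustrative names, not part of the released model.

```python
import math

# Label names in index order, copied from the list in this section.
LABELS = [
    "None",
    "Informational Support",
    "Emotional Support",
    "Social Companionship",
    "Appraisal",
    "Instrumental Support",
]

def sigmoid(x: float) -> float:
    """Map a raw logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def decode(logits, threshold=0.5):
    """Return the names of all labels whose probability reaches the threshold."""
    probs = [sigmoid(z) for z in logits]
    return [name for name, p in zip(LABELS, probs) if p >= threshold]

# Made-up logits for illustration only (not real model output).
example_logits = [-3.0, 1.2, 2.5, -0.4, -2.0, -1.1]
print(decode(example_logits))
```

With these example logits, two labels pass the threshold at once, which a single-label (softmax plus argmax) decoder could not express.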

	## Model Sources

	- Repository: https://github.com/justtherightsize/supportive-interactions-and-risks
	- Paper: Stay tuned!

	## Usage
	Here is how to use this model to classify a context-window of a dialogue:

	```python
	import numpy as np
	import torch
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	# Prepare input texts. The base model (small-e-czech) was fine-tuned on
	# Czech dialogs, so the utterances should be in Czech.
	test_texts = ['Utterance1;Utterance2;Utterance3']

	# Load the model and tokenizer (fall back to CPU when no GPU is available)
	device = "cuda" if torch.cuda.is_available() else "cpu"
	model = AutoModelForSequenceClassification.from_pretrained(
	    'justtherightsize/small-e-czech-multi-label-supportive-interactions-cs',
	    num_labels=6).to(device)

	# Truncate from the left so the most recent utterances are kept
	tokenizer = AutoTokenizer.from_pretrained(
	    'justtherightsize/small-e-czech-multi-label-supportive-interactions-cs',
	    use_fast=False, truncation_side='left')
	assert tokenizer.truncation_side == 'left'

	# Define helper functions
	def predict_one(text: str, tok, mod, threshold=0.5):
	    encoding = tok(text, return_tensors="pt", truncation=True, padding=True,
	                   max_length=256)
	    encoding = {k: v.to(mod.device) for k, v in encoding.items()}
	    # Inference only, so skip gradient tracking
	    with torch.no_grad():
	        outputs = mod(**encoding)
	    logits = outputs.logits
	    # Sigmoid (not softmax): each label is scored independently
	    probs = torch.sigmoid(logits.squeeze().cpu()).numpy()
	    predictions = np.zeros(probs.shape)
	    predictions[np.where(probs >= threshold)] = 1
	    return predictions, probs

	def print_predictions(texts):
	    preds = [predict_one(tt, tokenizer, model) for tt in texts]
	    for c, p in preds:
	        # Format each probability separately; a list does not accept ':.4f'
	        formatted = ', '.join(f'{x:.4f}' for x in p.tolist())
	        print(f'{c}: [{formatted}]')

	# Run the prediction
	print_predictions(test_texts)
	```
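
The placeholder input above suggests that a context window is a run of consecutive utterances joined with `;`, and `truncation_side='left'` means that when a window exceeds `max_length` the oldest tokens are dropped, so the most recent utterances survive. The following is a minimal sketch of building such windows from a dialog. The `;` separator is inferred from the placeholder text, and the `build_windows` helper, the window size of 3, and the sample utterances are assumptions for illustration, not values confirmed by the authors.

```python
from typing import List

def build_windows(utterances: List[str], size: int = 3, sep: str = ";") -> List[str]:
    """Return one context window per utterance.

    Each window holds the current utterance plus up to size - 1
    preceding ones, joined with the separator.
    """
    windows = []
    for i in range(len(utterances)):
        start = max(0, i - size + 1)
        windows.append(sep.join(utterances[start:i + 1]))
    return windows

# Made-up Czech utterances for illustration only.
dialog = ["Ahoj", "Jak se máš?", "Nic moc", "Co se stalo?"]
for window in build_windows(dialog):
    print(window)
```

Each resulting string can then be passed to `print_predictions` in place of the placeholder `test_texts`.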