| | --- |
| | |
| | |
| | license: mit |
| | language: |
| | - cs |
| | --- |
| | # Model Card for small-e-czech-multi-label-supportive-interactions-cs |
| |
|
| | <!-- Provide a quick summary of what the model is/does. --> |
| |
|
| | This model is fine-tuned for multi-label text classification of Supportive Interactions in Instant Messenger dialogs of Adolescents. |
| |
|
| | ## Model Description |
| |
|
| | The model was fine-tuned on a dataset of Instant Messenger dialogs of Adolescents. The classification is multi-label: for each of the labels {0,1,2,3,4,5}, the model outputs a logit that a sigmoid converts to an independent probability: |
| |
|
| | 0. None |
| | 1. Informational Support |
| | 2. Emotional Support |
| | 3. Social Companionship |
| | 4. Appraisal |
| | 5. Instrumental Support |
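
To make the label indices concrete, here is a minimal sketch of turning six raw logits into label names. It assumes the index order listed above, an independent sigmoid per label, and a threshold of 0.5. `LABELS`, `sigmoid`, and `decode_labels` are hypothetical names used for illustration, not part of the model's repository, and the example logits are made up.

```python
import math

# Assumed index-to-name mapping, following the order of the list above.
LABELS = [
    "None",
    "Informational Support",
    "Emotional Support",
    "Social Companionship",
    "Appraisal",
    "Instrumental Support",
]


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_labels(logits, threshold=0.5):
    """Return the names of all labels whose probability meets the threshold."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    return [name for name, logit in zip(LABELS, logits)
            if sigmoid(logit) >= threshold]


# Made-up logits, for illustration only.
example_logits = [-3.0, 2.0, 0.5, -1.0, -2.5, -4.0]
print(decode_labels(example_logits))
# → ['Informational Support', 'Emotional Support']
```

Because each label is thresholded independently, a single input can receive several labels or none at all.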
| |
|
| | - **Developed by:** Anonymous |
| | - **Language(s):** cs |
| | - **Finetuned from:** small-e-czech |
| |
|
| | ## Model Sources |
| |
|
| | <!-- Provide the basic links for the model. --> |
| |
|
| | - **Repository:** https://github.com/justtherightsize/supportive-interactions-and-risks |
| | - **Paper:** Stay tuned! |
| |
|
| | ## Usage |
| | The following example classifies a context window of a dialog: |
| |
|
| | ```python |
| | import numpy as np |
| | import torch |
| | from transformers import AutoTokenizer, AutoModelForSequenceClassification |
| | |
| | # Prepare input texts. This model is based on small-e-czech |
| | # and fine-tuned on Czech dialogs |
| | test_texts = ['Utterance1;Utterance2;Utterance3'] |
| | |
| | # Load the model and tokenizer; use the GPU if available, else the CPU |
| | device = "cuda" if torch.cuda.is_available() else "cpu" |
| | model = AutoModelForSequenceClassification.from_pretrained( |
| |     'justtherightsize/small-e-czech-multi-label-supportive-interactions-cs', |
| |     num_labels=6).to(device) |
| | model.eval() |
| | |
| | # Truncate from the left so the most recent utterances are kept |
| | tokenizer = AutoTokenizer.from_pretrained( |
| |     'justtherightsize/small-e-czech-multi-label-supportive-interactions-cs', |
| |     use_fast=False, truncation_side='left') |
| | assert tokenizer.truncation_side == 'left' |
| | |
| | # Define helper functions |
| | def predict_one(text: str, tok, mod, threshold=0.5): |
| |     encoding = tok(text, return_tensors="pt", truncation=True, padding=True, |
| |                    max_length=256) |
| |     encoding = {k: v.to(mod.device) for k, v in encoding.items()} |
| |     with torch.no_grad():  # inference only, no gradients needed |
| |         outputs = mod(**encoding) |
| |     logits = outputs.logits |
| |     # Multi-label: apply an independent sigmoid to each label's logit |
| |     probs = torch.sigmoid(logits.squeeze().cpu()) |
| |     predictions = np.zeros(probs.shape) |
| |     predictions[np.where(probs >= threshold)] = 1 |
| |     return predictions, probs |
| | |
| | def print_predictions(texts): |
| |     preds = [predict_one(tt, tokenizer, model) for tt in texts] |
| |     for c, p in preds: |
| |         # c: binary prediction per label, p: probability per label |
| |         print(f'{c}: {[round(x, 4) for x in p.tolist()]}') |
| | |
| | # Run the prediction |
| | print_predictions(test_texts) |
| | ``` |
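
The placeholder input above, `'Utterance1;Utterance2;Utterance3'`, suggests that a context window is a run of consecutive utterances joined by semicolons. The sketch below builds such a string from a list of utterances. The separator is inferred from that placeholder, and the window size of 3 and the helper name `build_context_window` are assumptions for illustration; check the repository for the exact preprocessing used in training.

```python
def build_context_window(utterances, window_size=3, sep=";"):
    """Join the last `window_size` non-empty utterances into one input string.

    Assumption: the separator and the window size are inferred from the
    placeholder in the usage example, not confirmed by the model authors.
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    cleaned = [u.strip() for u in utterances if u.strip()]
    return sep.join(cleaned[-window_size:])


# Placeholder utterances, oldest first.
dialog = ["Utterance1", "Utterance2", "Utterance3", "Utterance4"]
print(build_context_window(dialog))
# → Utterance2;Utterance3;Utterance4
```

Keeping the most recent utterances matches the tokenizer setting `truncation_side='left'`, which also drops the oldest tokens first when the input exceeds `max_length`.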