---
license: mit
language:
- cs
---
# Model Card for robeczech-binary-supportive-interactions-cs

<!-- Provide a quick summary of what the model is/does. -->

This model is fine-tuned for binary text classification of Supportive Interactions in Instant Messenger dialogs of Adolescents in Czech.

## Model Description

The model was fine-tuned on a dataset of Czech Instant Messenger dialogs of Adolescents. The classification is binary and the model outputs probabilities for labels {0,1}: Supportive Interactions present or not (a minimal reading of this output is sketched after the list below).

- **Developed by:** Anonymous
- **Language(s):** cs
- **Finetuned from:** ufal/robeczech-base
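
As a purely illustrative sketch of how to read the two output probabilities (the `probs` values below are made up, and label 1 is taken to mark a present Supportive Interaction, as described above; the full pipeline is shown under Usage):

```python
# Illustrative only: `probs` stands in for one softmaxed output row of the model,
# i.e. [P(label 0: not supportive), P(label 1: Supportive Interaction present)].
probs = [0.13, 0.87]
label = int(probs[1] >= 0.5)  # simple 0.5 threshold, mirroring the Usage helpers below
print(label)                  # -> 1: a Supportive Interaction is predicted
```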

## Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** https://github.com/justtherightsize/supportive-interactions-and-risks
- **Paper:** Stay tuned!

## Usage
Here is how to use this model to classify a context window of a dialogue:

```python
import numpy as np
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Prepare input texts. This model is fine-tuned for Czech
test_texts = ['Utterance1;Utterance2;Utterance3']

# Load the model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(
    'justtherightsize/robeczech-binary-supportive-interactions-cs', num_labels=2).to("cuda")

tokenizer = AutoTokenizer.from_pretrained(
    'justtherightsize/robeczech-binary-supportive-interactions-cs',
    use_fast=False, truncation_side='left')
assert tokenizer.truncation_side == 'left'

# Define helper functions
def get_probs(text, tokenizer, model):
    inputs = tokenizer(text, padding=True, truncation=True, max_length=256,
                       return_tensors="pt").to("cuda")
    outputs = model(**inputs)
    return outputs[0].softmax(1)

def preds2class(probs, threshold=0.5):
    pclasses = np.zeros(probs.shape)
    pclasses[np.where(probs >= threshold)] = 1
    return pclasses.argmax(-1)

def print_predictions(texts):
    probabilities = [get_probs(
        texts[i], tokenizer, model).cpu().detach().numpy()[0]
        for i in range(len(texts))]
    predicted_classes = preds2class(np.array(probabilities))
    for c, p in zip(predicted_classes, probabilities):
        print(f'{c}: {p}')

# Run the prediction
print_predictions(test_texts)
```
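
The example input above suggests that a context window is passed as a single string with utterances separated by semicolons. Continuing from the snippet above, here is a minimal sketch of building such a window from a list of messages; the `build_window` helper and the Czech example messages are illustrative and not part of the released code:

```python
# Illustrative helper (not part of the released code): join the last few utterances
# of a dialog into one ';'-separated context window, matching `test_texts` above.
def build_window(utterances, size=3):
    return ';'.join(utterances[-size:])

dialog = [
    'Ahoj, jak se máš?',                      # "Hi, how are you?"
    'Moc dobře ne...',                        # "Not so great..."
    'To mě mrzí, chceš si o tom promluvit?',  # "I'm sorry, do you want to talk about it?"
]
print_predictions([build_window(dialog)])  # reuses the model, tokenizer and helpers above
```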