---
language:
  - de
library_name: transformers
tags:
  - Text Classification
  - Pytorch
  - Discourse Classification
  - Roberta
---

# Roberta for German Discourse Classification

This is an XLM-RoBERTa model fine-tuned on a German discourse dataset of 60 discourses containing over 10k sentences in total.

## Understanding the labels

**Externalization** can be seen as a form of attribution that emphasizes situational factors as the cause of behavior, rather than dispositional factors. For example, if someone is expressing a strong emotion in their discourse, an external attribution might suggest that the emotion is a result of the situation or context they are in, rather than a reflection of their personality or character.

**Elicitation** can be seen as a form of attribution that emphasizes the role of the listener in shaping the discourse of others. By asking questions or providing prompts, the listener can elicit particular types of responses or information from the speaker, which can help to shape the course of the conversation.

**Conflict** can be seen as a form of attribution that emphasizes the role of interpersonal factors in shaping behavior. When people engage in conflict, they often attribute the behavior of others to dispositional factors, such as their personality or character, rather than situational factors.

**Acceptance** can be seen as a form of attribution that emphasizes the role of empathy and understanding in shaping behavior. When people accept the perspectives or experiences of others, they are often making an attribution that emphasizes the situational factors that have shaped those perspectives, rather than dispositional factors.

**Integration** can be seen as a form of attribution that emphasizes the complexity and nuance of human behavior. By combining multiple perspectives or ideas, people can create a more comprehensive understanding of the behavior of others, which may incorporate both dispositional and situational factors.
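The model scores each sentence over six classes: the five labels above plus `None`. After a softmax, these scores form a probability distribution over the labels. A minimal sketch of that post-processing step, using hypothetical logits in place of a real model output:

```python
import torch

# Label order assumed to match the model's output indices (see the usage code below).
LABELS = ['Externalization', 'Elicitation', 'Conflict', 'Acceptance', 'Integration', 'None']

# Hypothetical logits for a single sentence; real values come from the model.
logits = torch.tensor([[2.0, 0.5, -1.0, 0.3, 1.2, -0.5]])

# Softmax turns raw logits into probabilities that sum to 1.
probs = torch.nn.functional.softmax(logits, dim=1)[0]
scores = {label: round(p.item(), 3) for label, p in zip(LABELS, probs)}

# The predicted label is the one with the highest probability.
best = max(scores, key=scores.get)
```

With these example logits, `Externalization` receives the highest probability.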

## How to use the model

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = 'RashidNLP/Roberta-German-Discourse'
OUTPUTS = 6
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Load the fine-tuned model and its tokenizer from the Hub.
bert_model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=OUTPUTS).to(device)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def get_label(sentence):
    vectors = tokenizer(sentence, return_tensors='pt').to(device)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = bert_model(**vectors).logits
    probs = torch.nn.functional.softmax(outputs, dim=1)[0]
    keys = ['Externalization', 'Elicitation', 'Conflict', 'Acceptance', 'Integration', 'None']
    return {key: round(prob.item(), 3) for key, prob in zip(keys, probs)}

get_label("Gehst du zum Oktoberfest?")  # "Are you going to Oktoberfest?"
```