---
license: apache-2.0
language:
- en
---

The model is a port of our CommentBERT model from the paper:

```
@inproceedings{ochodek2022automated,
  title={Automated code review comment classification to improve modern code reviews},
  author={Ochodek, Miroslaw and Staron, Miroslaw and Meding, Wilhelm and S{\"o}der, Ola},
  booktitle={International Conference on Software Quality},
  pages={23--40},
  year={2022},
  organization={Springer}
}
```

The original model was implemented in Keras with two outputs: comment-purpose and subject-purpose. Here, we divided it into two separate models with one output each.

```python
import numpy as np
from scipy.special import softmax
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = 'mochodek/bert4comment-purpose'
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)

# Maps the model's output index to the comment-purpose class name.
id2class = {
    0: 'discussion_participation',
    1: 'discussion_trigger',
    2: 'change_request',
    3: 'acknowledgement',
    4: 'same_as'
}

text = "Please, make constant from that string"
encoded_input = tokenizer(text, return_tensors='pt')

output = model(**encoded_input)

# Convert logits to class probabilities along the class dimension.
scores = softmax(output.logits.detach().numpy(), axis=-1)

# Index of the most probable class, cast to a plain int for the lookup.
print(id2class[int(np.argmax(scores))])
```
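
The usage example above reports only the most probable class. If the full ranking is useful, for instance to inspect borderline comments, the logits can be turned into sorted (label, probability) pairs. The following is a minimal sketch, not part of the original model code: `rank_classes` is a hypothetical helper, and `example_logits` holds made-up values that stand in for `output.logits.detach().numpy()`.

```python
import numpy as np

# Label mapping copied from the usage example above.
id2class = {
    0: 'discussion_participation',
    1: 'discussion_trigger',
    2: 'change_request',
    3: 'acknowledgement',
    4: 'same_as',
}


def rank_classes(logits, id2class):
    """Return (label, probability) pairs, most likely class first.

    Hypothetical helper: expects the logits of a single comment,
    either as a flat sequence or as an array of shape (1, n_classes).
    """
    logits = np.asarray(logits, dtype=float).reshape(-1)
    # Numerically stable softmax over the class dimension.
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    # Class indices sorted by descending probability.
    order = np.argsort(probs)[::-1]
    return [(id2class[int(i)], float(probs[i])) for i in order]


# Made-up logits for illustration only; real values come from the model.
example_logits = [[0.1, -1.2, 3.4, 0.0, -0.5]]
ranking = rank_classes(example_logits, id2class)
print(ranking[0][0])  # → change_request
```

With the real model, pass `output.logits.detach().numpy()` from the example above in place of `example_logits`.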