language:
  - en
  - de
  - es
  - fr
  - it
  - nl
  - pt
license: apache-2.0
tags:
  - sentence-classification
  - text-classification
  - onnx
  - multilingual
datasets:
  - TigreGotico/sentence-types-multilingual

sentence-types

Multilingual sentence-type classifiers (ONNX) trained on TigreGotico/sentence-types-multilingual (9,900 balanced samples per language, 6 classes).

Used by little_questions.

Classes

command, exclamation, polar_question, request, statement, wh_question

Models

| File | Language |
|---|---|
| sentence_type_EN_0.8.0.onnx | English |
| sentence_type_DE_0.8.0.onnx | German |
| sentence_type_ES_0.8.0.onnx | Spanish |
| sentence_type_FR_0.8.0.onnx | French |
| sentence_type_IT_0.8.0.onnx | Italian |
| sentence_type_NL_0.8.0.onnx | Dutch |
| sentence_type_PT_0.8.0.onnx | Portuguese |
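
The file names above appear to follow the pattern `sentence_type_<LANG>_<version>.onnx`. As a minimal sketch, assuming that pattern holds for every listed file, a small helper can pick the file for a language code. The names `model_filename`, `SUPPORTED_LANGUAGES`, and `MODEL_VERSION` are hypothetical and are not part of this repository or of little_questions.

```python
# Hypothetical helper: the naming pattern is inferred from the Models table,
# not taken from any published API.
SUPPORTED_LANGUAGES = ("en", "de", "es", "fr", "it", "nl", "pt")
MODEL_VERSION = "0.8.0"


def model_filename(lang: str, version: str = MODEL_VERSION) -> str:
    """Return the ONNX file name for a two-letter language code."""
    code = lang.strip().lower()
    if code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {lang!r}")
    return f"sentence_type_{code.upper()}_{version}.onnx"


print(model_filename("pt"))  # sentence_type_PT_0.8.0.onnx
```

Rejecting unknown codes up front gives a clearer error than a missing-file failure when the session is created.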

Accuracy

| Language | Accuracy | Macro F1 |
|---|---|---|
| EN | 99.2% | 99.2% |
| NL | 98.8% | 98.8% |
| FR | 97.1% | 97.1% |
| IT | 97.0% | 97.0% |
| PT | 95.4% | 95.4% |
| DE | 85.6% | 84.9% |
| ES | 74.6% | 72.7% |

Inference

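import json

import numpy as np
import onnxruntime as rt

# Load the English model; use another file from the Models table for other languages.
sess = rt.InferenceSession("sentence_type_EN_0.8.0.onnx")
# Class names are stored as JSON in the model's custom metadata.
classes = json.loads(sess.get_modelmeta().custom_metadata_map["classes"])
# The model takes raw strings, so the input is an object-dtype array.
inp = np.array(["Who invented the telephone?"], dtype=object)
label_idx, probs = sess.run(None, {"input": inp})
print(classes[int(label_idx[0])])   # wh_question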
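
The snippet above prints only the top label and leaves `probs` unused. As a rough sketch, assuming each row of `probs` is a sequence of per-class probabilities in the same order as `classes` (this card does not confirm the output layout, so inspect it before relying on this), the scores can be ranked to see how close the runner-up classes are. The helper name `rank_classes` is hypothetical, and the numbers below are made up for illustration; they are not model output.

```python
from typing import List, Sequence, Tuple


def rank_classes(
    classes: Sequence[str], prob_row: Sequence[float], k: int = 3
) -> List[Tuple[str, float]]:
    """Pair each class name with its probability and return the k highest."""
    if len(classes) != len(prob_row):
        raise ValueError("classes and prob_row must have the same length")
    pairs = zip(classes, (float(p) for p in prob_row))
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Assumption: class order copied from the Classes section; in practice,
# read it from the model metadata as in the inference snippet.
classes = ["command", "exclamation", "polar_question",
           "request", "statement", "wh_question"]
# Made-up probabilities for illustration only.
example_row = [0.01, 0.01, 0.10, 0.02, 0.01, 0.85]
print(rank_classes(classes, example_row, k=2))
# [('wh_question', 0.85), ('polar_question', 0.1)]
```

With real output, `probs[0]` would be passed as `prob_row` if the layout assumption holds. A small margin between the top two entries can flag inputs worth a second look, particularly for the languages with lower accuracy in the table above.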