---
language:
- en
- de
- es
- fr
- it
- nl
- pt
license: apache-2.0
tags:
- sentence-classification
- text-classification
- onnx
- multilingual
datasets:
- TigreGotico/sentence-types-multilingual
---

# sentence-types

Multilingual sentence-type classifiers (ONNX) trained on [TigreGotico/sentence-types-multilingual](https://huggingface.co/datasets/TigreGotico/sentence-types-multilingual) (9,900 balanced samples per language, 6 classes).

Used by [little_questions](https://github.com/OpenJarbas/little_questions).

## Classes

`command`, `exclamation`, `polar_question`, `request`, `statement`, `wh_question`

## Models

| File | Language |
|------|----------|
| `sentence_type_EN_0.8.0.onnx` | English |
| `sentence_type_DE_0.8.0.onnx` | German |
| `sentence_type_ES_0.8.0.onnx` | Spanish |
| `sentence_type_FR_0.8.0.onnx` | French |
| `sentence_type_IT_0.8.0.onnx` | Italian |
| `sentence_type_NL_0.8.0.onnx` | Dutch |
| `sentence_type_PT_0.8.0.onnx` | Portuguese |

## Accuracy

| Language | Accuracy | Macro F1 |
|----------|----------|----------|
| EN | 99.2% | 99.2% |
| NL | 98.8% | 98.8% |
| FR | 97.1% | 97.1% |
| IT | 97.0% | 97.0% |
| PT | 95.4% | 95.4% |
| DE | 85.6% | 84.9% |
| ES | 74.6% | 72.7% |

## Inference

```python
import json

import numpy as np
import onnxruntime as rt

sess = rt.InferenceSession("sentence_type_EN_0.8.0.onnx")
classes = json.loads(sess.get_modelmeta().custom_metadata_map["classes"])

inp = np.array(["Who invented the telephone?"], dtype=object)
label_idx, probs = sess.run(None, {"input": inp})
print(classes[int(label_idx[0])])  # wh_question
```
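
The file names in the Models table follow a single pattern, so picking a model from a language code can be sketched as below. This is a minimal sketch, not part of the repository: `model_filename` and `classify` are hypothetical helper names, `model_dir` is an assumed local folder holding the downloaded `.onnx` files, and folding locale tags such as `pt-BR` into their base language is an assumption. `classify` reuses only the calls from the inference example and runs one sentence per call, since batched input is not documented here.

```python
import json
from pathlib import Path

MODEL_VERSION = "0.8.0"
SUPPORTED_LANGS = ("en", "de", "es", "fr", "it", "nl", "pt")


def model_filename(lang: str, version: str = MODEL_VERSION) -> str:
    """Map a language code ('en', 'pt-BR', 'de_DE') to a model file name."""
    # Assumption: regional variants fall back to the base language model.
    code = lang.strip().lower().replace("_", "-").split("-")[0]
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"unsupported language: {lang!r}")
    return f"sentence_type_{code.upper()}_{version}.onnx"


def classify(sentences, lang="en", model_dir="."):
    """Return one class label per sentence (hypothetical helper)."""
    # Imported lazily so model_filename works without onnxruntime installed.
    import numpy as np
    import onnxruntime as rt

    sess = rt.InferenceSession(str(Path(model_dir) / model_filename(lang)))
    classes = json.loads(sess.get_modelmeta().custom_metadata_map["classes"])

    labels = []
    for text in sentences:
        # Same single-item input shape as the inference example.
        inp = np.array([text], dtype=object)
        label_idx, _probs = sess.run(None, {"input": inp})
        labels.append(classes[int(label_idx[0])])
    return labels
```

With the English model in the working directory, `classify(["Who invented the telephone?"], lang="en")` loads `sentence_type_EN_0.8.0.onnx` and returns a list of labels. Loading the session inside `classify` keeps the sketch short; code that labels many sentences would likely create the session once and reuse it.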