---
language:
- en
- de
- es
- fr
- it
- nl
- pt
license: apache-2.0
tags:
- sentence-classification
- text-classification
- onnx
- multilingual
datasets:
- TigreGotico/sentence-types-multilingual
---
# sentence-types
Sentence-type classifiers in ONNX format, one model per language for seven languages, trained on
[TigreGotico/sentence-types-multilingual](https://huggingface.co/datasets/TigreGotico/sentence-types-multilingual)
(9,900 balanced samples per language, 6 classes).
Used by [little_questions](https://github.com/OpenJarbas/little_questions).
## Classes
`command`, `exclamation`, `polar_question`, `request`, `statement`, `wh_question`
## Models
| File | Language |
|------|----------|
| `sentence_type_EN_0.8.0.onnx` | English |
| `sentence_type_DE_0.8.0.onnx` | German |
| `sentence_type_ES_0.8.0.onnx` | Spanish |
| `sentence_type_FR_0.8.0.onnx` | French |
| `sentence_type_IT_0.8.0.onnx` | Italian |
| `sentence_type_NL_0.8.0.onnx` | Dutch |
| `sentence_type_PT_0.8.0.onnx` | Portuguese |
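The file names in the table appear to follow a single pattern, `sentence_type_<LANG>_<version>.onnx`. A minimal sketch of a helper that builds the name from a language code, assuming that pattern holds. `model_filename` and `SUPPORTED_LANGS` are hypothetical names used for illustration; they are not part of this repository or of little_questions.

```python
# Hypothetical helper: the naming pattern is inferred from the table above.
SUPPORTED_LANGS = ("en", "de", "es", "fr", "it", "nl", "pt")


def model_filename(lang: str, version: str = "0.8.0") -> str:
    """Return the ONNX file name for a language code such as "en" or "pt-BR"."""
    # Keep only the primary subtag, so "pt-BR" and "pt_br" both map to "pt".
    code = lang.replace("_", "-").split("-")[0].lower()
    if code not in SUPPORTED_LANGS:
        raise ValueError(f"no model for language {lang!r}")
    return f"sentence_type_{code.upper()}_{version}.onnx"


print(model_filename("en"))     # sentence_type_EN_0.8.0.onnx
print(model_filename("pt-BR"))  # sentence_type_PT_0.8.0.onnx
```

Raising on an unsupported code, rather than falling back to English, keeps a missing model from being silently replaced by the wrong language.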
## Accuracy
| Language | Accuracy | Macro F1 |
|----------|----------|----------|
| EN | 99.2% | 99.2% |
| NL | 98.8% | 98.8% |
| FR | 97.1% | 97.1% |
| IT | 97.0% | 97.0% |
| PT | 95.4% | 95.4% |
| DE | 85.6% | 84.9% |
| ES | 74.6% | 72.7% |
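Macro F1 is the unweighted mean of the per-class F1 scores, so it can sit below accuracy when errors concentrate in a few classes. A minimal sketch of how the two columns are computed, using made-up labels for illustration only; they are not the evaluation data behind the table, and the evaluation script used for this card is not shown here.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1, with F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(labels)


# Toy labels (assumption, for illustration): one "command" is misread as "request".
labels = ["command", "request", "statement"]
y_true = ["command", "command", "request", "request", "statement", "statement"]
y_pred = ["command", "request", "request", "request", "statement", "statement"]

print(round(accuracy(y_true, y_pred), 3))          # 0.833
print(round(macro_f1(y_true, y_pred, labels), 3))  # 0.822
```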
## Inference
```python
import json
import numpy as np
import onnxruntime as rt

# Load one language's model; class names are stored as JSON in the model metadata.
sess = rt.InferenceSession("sentence_type_EN_0.8.0.onnx")
classes = json.loads(sess.get_modelmeta().custom_metadata_map["classes"])

# Input is a 1-D array of raw strings fed to the input named "input".
inp = np.array(["Who invented the telephone?"], dtype=object)
label_idx, probs = sess.run(None, {"input": inp})
print(classes[int(label_idx[0])])  # wh_question
```
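The snippet above ignores the second output, `probs`, and this card does not describe its layout. A minimal sketch of turning one row of it into a label with a confidence cut-off, assuming each row is either a sequence of per-class scores in `classes` order or a dict keyed by class index or name. Both layouts are assumptions, so inspect `sess.get_outputs()` before relying on this. `top_prediction` and the `threshold` parameter are hypothetical, and the `classes` list below is a stand-in for the one read from the model metadata.

```python
import numbers


def top_prediction(classes, row, threshold=0.5):
    """Return (label, score); label is None when the best score is below threshold."""
    if isinstance(row, dict):
        # Assumed dict layout: {class index or class name: probability}.
        key, score = max(row.items(), key=lambda kv: kv[1])
        label = classes[key] if isinstance(key, numbers.Integral) else key
    else:
        # Assumed array layout: one score per class, in the order of `classes`.
        idx = max(range(len(row)), key=lambda i: row[i])
        label, score = classes[idx], row[idx]
    score = float(score)
    return (label if score >= threshold else None), score


# Stand-in for the list loaded from the model's "classes" metadata entry.
classes = ["command", "exclamation", "polar_question",
           "request", "statement", "wh_question"]

print(top_prediction(classes, [0.01, 0.01, 0.02, 0.01, 0.05, 0.90]))
print(top_prediction(classes, {"request": 0.40, "command": 0.35, "statement": 0.25}))
```

Returning `None` for low-confidence rows lets the caller decide how to handle ambiguous sentences instead of forcing one of the six classes.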