---
license: mit
tags:
- pytorch
- nlp
- nlu
- text-classification
- intent-classification
- multilingual
- driver-commands
- fine-tuned
- encoder-only
- decoder-only
language:
- ru
- en
datasets:
- INFINITY1023/MultilingualDriverCommands
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
pretty_name: Multilingual Driver Command Models
---
# Multilingual Driver Command Models
## Model Summary
This repository contains **four fine-tuned models** for multilingual driver command intent classification.
The models were trained to classify short driver phrases in **Russian** and **English** into intent classes for an in-car voice assistant.
The repository is linked to the dataset:
- [`INFINITY1023/MultilingualDriverCommands`](https://huggingface.co/datasets/INFINITY1023/MultilingualDriverCommands)
## Models
| Model | Architecture Type | Description |
|---|---|---|
| `bge-m3` | Encoder-only | Multilingual encoder model |
| `e5-multilingual` | Encoder-only | Semantic multilingual encoder |
| `mmBERT-base` | Encoder-only | Compact multilingual BERT-style baseline |
| `gte-Qwen2-7B-instruct` | Decoder-only | Instruction-tuned decoder model adapted for classification |
## Task
The models solve a **multiclass intent classification** task:
> Given a short driver phrase, predict the corresponding intent class.
Example inputs:
- `Set the temperature to twenty two`
- `Turn on Bluetooth audio`
- `Позвони маме`
- `Включи обогрев сиденья`
- `Построй маршрут до дома`
Possible intent categories include climate control, navigation, media, calls, phone connection, lighting, seat control, cruise control, and other vehicle assistant actions.
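As a rough illustration of the prediction step, the sketch below turns a vector of per-class scores into an intent label. It assumes the classifier head outputs one raw score (logit) per intent; the label names in `ID2LABEL` and the example logits are hypothetical and are not taken from the dataset.

```python
import math

# Hypothetical label subset for illustration only; the real models use 64 intents.
ID2LABEL = ["climate_set_temperature", "media_bluetooth_on", "call_contact"]


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_intent(logits, id2label=ID2LABEL):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits, e.g. for the phrase "Turn on Bluetooth audio".
label, prob = predict_intent([0.2, 3.1, -1.0])
```

The returned probability can serve as a simple confidence score, for example to fall back to a clarification prompt when it is low.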
## Training Dataset
The models were trained on the **Multilingual Driver Commands Dataset**.
Dataset characteristics:
| Property | Value |
|---|---:|
| Dataset size | 153,062 examples |
| Languages | Russian + English |
| Language distribution | 50% RU / 50% EN |
| Final number of intents | 64 |
| Task | Intent classification |
The dataset was synthetically generated, manually validated, balanced across classes, and enriched with rare driving-related scenarios.
## Experimental Results
The following results were obtained on the test set after class balancing and merging semantically overlapping intents into 64 final classes.
| Model | Accuracy | Macro F1 | Macro Precision | Macro Recall |
|---|---:|---:|---:|---:|
| `e5-multilingual-base` | 0.864 | 0.862 | 0.868 | 0.859 |
| `mmBERT-base` | 0.857 | 0.854 | 0.859 | 0.853 |
| `bge-m3` | 0.868 | 0.863 | 0.868 | 0.864 |
| `gte-Qwen2-7B-instruct` | 0.872 | 0.870 | 0.878 | 0.865 |
A separate experiment with stronger intent merging into 45 classes showed that `gte-Qwen2-7B-instruct` reached **0.905 accuracy**, but this reduced the functional granularity of the assistant.
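For readers unfamiliar with macro averaging, the metrics in the table can be understood through the following minimal sketch: each metric is computed per class and then averaged with equal weight per class. This is a plain-Python illustration under that standard definition, not the evaluation script used for the reported numbers; the label names in the usage example are hypothetical.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Compute accuracy and macro-averaged precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precisions) / n,
        "macro_recall": sum(recalls) / n,
        "macro_f1": sum(f1s) / n,
    }


# Hypothetical labels: one "nav" example is misclassified as "call".
scores = macro_metrics(
    ["nav", "nav", "call", "media"],
    ["nav", "call", "call", "media"],
)
```

Because macro F1 is the mean of per-class F1 scores, it is not in general the harmonic mean of macro precision and macro recall.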
## Main Findings
The experiments show that larger models do not always provide a proportional improvement for short command classification.
Although `gte-Qwen2-7B-instruct` is much larger than `bge-m3`, the quality gap between them was small: 0.872 vs. 0.868 accuracy and 0.870 vs. 0.863 macro F1. This suggests that, for this task, quality is limited not only by model size but also by:
- class taxonomy;
- semantic overlap between intents;
- synthetic data noise;
- incomplete or noisy parameter fields;
- dataset structure and balance.
For practical deployment, a smaller encoder-based model such as `bge-m3` may be more efficient, since it provides competitive quality with lower computational cost.
## Repository Structure
Recommended repository structure:
```text
best_models/
├── bge-m3/
│   └── model.pt
├── e5-multilingual/
│   └── model.pt
├── mmBERT-base/
│   └── model.pt
└── qwen2/
    └── model.pt
```
If the checkpoints are saved as PyTorch `state_dict` files, the model architecture code is required to load them correctly.
## Loading PyTorch Checkpoints
Example loading pattern:
```python
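import torch

# Example only: replace MyModel with the corresponding architecture class.
from model import MyModel

model = MyModel(...)  # pass the same constructor arguments used during training
# weights_only=True restricts unpickling to tensors and plain containers.
state_dict = torch.load("best_models/bge-m3/model.pt", map_location="cpu", weights_only=True)
model.load_state_dict(state_dict)
model.eval()  # switch off dropout and other training-only behavior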
```
If a checkpoint was saved as a full PyTorch model object rather than a `state_dict`, it can be loaded as:
```python
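import torch

# Full-model pickles need weights_only=False (the default is True since PyTorch 2.6).
# Only do this for checkpoints you trust; the model class must also be importable.
model = torch.load("best_models/bge-m3/model.pt", map_location="cpu", weights_only=False)
model.eval()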
```
The exact loading method depends on how the checkpoint was saved during training.
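When it is unclear how a checkpoint was saved, a small helper can inspect the object returned by `torch.load` before deciding how to use it. The sketch below is a hypothetical helper, not part of this repository; it relies on duck typing (so it does not import `torch` itself), and the wrapper key names `state_dict` and `model_state_dict` are common conventions rather than something confirmed for these checkpoints.

```python
from collections import OrderedDict
from collections.abc import Mapping


def checkpoint_kind(obj):
    """Classify an object returned by torch.load()."""
    if isinstance(obj, Mapping):
        # Some training scripts wrap the weights in a larger dict.
        # These key names are assumptions, not confirmed for this repo.
        for key in ("state_dict", "model_state_dict"):
            if key in obj and isinstance(obj[key], Mapping):
                return "wrapped_state_dict"
        # A plain state_dict is an ordered mapping of parameter names to tensors.
        return "state_dict"
    if callable(getattr(obj, "load_state_dict", None)):
        # Module-like objects expose load_state_dict(); treat them as full models.
        return "full_model"
    return "unknown"


# Stand-in objects for illustration (real values would be tensors / nn.Module).
plain = OrderedDict([("classifier.weight", [0.1, 0.2])])
wrapped = {"epoch": 3, "model_state_dict": plain}
```

A `"state_dict"` result means the architecture must be instantiated first, while `"full_model"` means the object can be used directly after calling `eval()`.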
## Intended Use
These models are intended for:
- educational experiments;
- research on synthetic NLU datasets;
- multilingual intent classification;
- comparison of encoder-only and decoder-only architectures;
- prototyping voice assistant command recognition.
## Limitations
The models were trained on a synthetic dataset, so their performance on natural user traffic may differ from the test-set results reported above.
Known limitations:
- possible sensitivity to synthetic generation style;
- errors on semantically close intents;
- dependence on data quality and intent taxonomy;
- limited robustness to real-world noise, slang, ASR errors, and incomplete phrases;
- potential confusion between intents with similar surface forms.
For production use, the models should be evaluated on real driver commands and monitored for data drift.
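One simple way to surface errors on semantically close intents during such an evaluation is to count the most frequent (true intent, predicted intent) mistakes. The sketch below is a minimal illustration; the function name and the intent labels in the usage example are hypothetical.

```python
from collections import Counter


def top_confusions(y_true, y_pred, k=5):
    """Return the k most frequent (true, predicted) error pairs with counts."""
    pairs = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return pairs.most_common(k)


# Hypothetical labels: seat heating is confused with cabin heating twice.
confusions = top_confusions(
    ["seat_heat_on", "seat_heat_on", "climate_heat_on", "call_contact"],
    ["climate_heat_on", "climate_heat_on", "climate_heat_on", "call_contact"],
)
```

Pairs that dominate this list are candidates for clearer intent definitions, additional training examples, or merging.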
## Citation
If you use these checkpoints, please cite or reference this repository:
```bibtex
@misc{multilingual-driver-command-models,
title = {Multilingual Driver Command Models},
author = {Nizhankovskiy, Ilya},
year = {2026},
publisher = {Hugging Face},
howpublished = {\url{https://huggingface.co/INFINITY1023/multilingual-driver-command-models}}
}
```