---
license: mit
tags:
- pytorch
- nlp
- nlu
- text-classification
- intent-classification
- multilingual
- driver-commands
- fine-tuned
- encoder-only
- decoder-only
language:
- ru
- en
datasets:
- INFINITY1023/MultilingualDriverCommands
metrics:
- accuracy
- f1
- precision
- recall
pipeline_tag: text-classification
pretty_name: Multilingual Driver Command Models
---

# Multilingual Driver Command Models

## Model Summary

This repository contains **four fine-tuned models** for multilingual driver command intent classification.

The models were trained to classify short driver phrases in **Russian** and **English** into intent classes for an in-car voice assistant.

The repository is linked to the dataset:

- [`INFINITY1023/MultilingualDriverCommands`](https://huggingface.co/datasets/INFINITY1023/MultilingualDriverCommands)

## Models

| Model | Architecture Type | Description |
|---|---|---|
| `bge-m3` | Encoder-only | Multilingual encoder model |
| `e5-multilingual` | Encoder-only | Semantic multilingual encoder |
| `mmBERT-base` | Encoder-only | Compact multilingual BERT-style baseline |
| `gte-Qwen2-7B-instruct` | Decoder-only | Instruction-tuned decoder model adapted for classification |

## Task

The models solve a **multiclass intent classification** task:

> Given a short driver phrase, predict the corresponding intent class.

Example inputs:

- `Set the temperature to twenty two`
- `Turn on Bluetooth audio`
- `Позвони маме` ("Call mom")
- `Включи обогрев сиденья` ("Turn on the seat heating")
- `Построй маршрут до дома` ("Build a route home")

Possible intent categories include climate control, navigation, media, calls, phone connection, lighting, seat control, cruise control, and other vehicle assistant actions.
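
As a rough illustration of the final classification step, the sketch below turns a vector of raw class scores (logits) into an intent label and a confidence value. The label names in `ID2LABEL` are hypothetical placeholders, not the dataset's real 64-class taxonomy, and the logits are made-up numbers rather than output from any of the checkpoints.

```python
import math

# Hypothetical label names for illustration only; the real taxonomy
# is defined by the dataset, not by this snippet.
ID2LABEL = {0: "climate_set_temperature", 1: "navigation_route", 2: "call_contact"}


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_intent(logits, id2label=ID2LABEL):
    """Return the most probable intent label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return id2label[best], probs[best]


# Made-up logits standing in for a classifier head's output.
label, confidence = predict_intent([0.2, 3.1, -1.0])
print(label, round(confidence, 3))
```

In a real pipeline the logits would come from the classification head of the loaded model, and the confidence value could be used to reject low-certainty commands.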

## Training Dataset

The models were trained on the **Multilingual Driver Commands** dataset.

Dataset characteristics:

| Property | Value |
|---|---:|
| Dataset size | 153,062 examples |
| Languages | Russian + English |
| Language distribution | 50% RU / 50% EN |
| Final number of intents | 64 |
| Task | Intent classification |

The dataset was synthetically generated, manually validated, balanced across classes, and enriched with rare driving-related scenarios.

## Experimental Results

The following results were obtained on the test set after class balancing and merging semantically overlapping intents into 64 final classes.

| Model | Accuracy | Macro F1 | Macro Precision | Macro Recall |
|---|---:|---:|---:|---:|
| `e5-multilingual-base` | 0.864 | 0.862 | 0.868 | 0.859 |
| `mmBERT-base` | 0.857 | 0.854 | 0.859 | 0.853 |
| `bge-m3` | 0.868 | 0.863 | 0.868 | 0.864 |
| `gte-Qwen2-7B-instruct` | 0.872 | 0.870 | 0.878 | 0.865 |

In a separate experiment with more aggressive intent merging into 45 classes, `gte-Qwen2-7B-instruct` reached **0.905 accuracy**, at the cost of reduced functional granularity for the assistant.
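
To make the reported metrics and the effect of intent merging concrete, here is a minimal pure-Python sketch. It computes accuracy and macro-averaged precision, recall, and F1, then recomputes them after collapsing two labels into one. The intent names, the `MERGE` map, and the toy predictions are all hypothetical. The model card does not state which tooling produced the table, so this is an illustration of the standard definitions rather than the evaluation script.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and the unweighted per-class means of precision, recall, F1."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    precision, recall, f1 = [], [], []
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        p = tp[label] / p_den if p_den else 0.0
        r = tp[label] / r_den if r_den else 0.0
        precision.append(p)
        recall.append(r)
        f1.append(2 * p * r / (p + r) if p + r else 0.0)

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "macro_precision": sum(precision) / n,
        "macro_recall": sum(recall) / n,
        "macro_f1": sum(f1) / n,
    }


# Hypothetical merge map: two fine-grained intents collapse into one.
MERGE = {"seat_heating_on": "seat_heating", "seat_heating_off": "seat_heating"}


def merge_labels(labels, mapping=MERGE):
    """Map each label to its merged class, leaving unmapped labels unchanged."""
    return [mapping.get(label, label) for label in labels]


# Toy data, not taken from the actual test set.
y_true = ["seat_heating_on", "seat_heating_off", "call_contact", "call_contact"]
y_pred = ["seat_heating_off", "seat_heating_off", "call_contact", "seat_heating_on"]

fine = macro_metrics(y_true, y_pred)
coarse = macro_metrics(merge_labels(y_true), merge_labels(y_pred))
print(fine["accuracy"], coarse["accuracy"])
```

In this toy case one error is a confusion between the two merged intents, so it disappears under the coarser taxonomy. That is the same mechanism by which stronger merging raises accuracy while the assistant loses the ability to tell those commands apart.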

## Main Findings

The experiments show that larger models do not always provide a proportional improvement for short command classification.

Although `gte-Qwen2-7B-instruct` is much larger than `bge-m3`, the quality gap between them was small: 0.872 vs. 0.868 accuracy and 0.870 vs. 0.863 macro F1 on the 64-class test set. This suggests that, for this task, quality is limited not only by model size but also by:

- class taxonomy;
- semantic overlap between intents;
- synthetic data noise;
- incomplete or noisy parameter fields;
- dataset structure and balance.

For practical deployment, a smaller encoder-based model such as `bge-m3` may be the more efficient choice, since it provides competitive quality at lower computational cost.

## Repository Structure

Recommended repository structure:

```text
best_models/
├── bge-m3/
│   └── model.pt
├── e5-multilingual/
│   └── model.pt
├── mmBERT-base/
│   └── model.pt
└── qwen2/
    └── model.pt
```

If the checkpoints are saved as PyTorch `state_dict` files, the model architecture code is required to load them correctly.

## Loading PyTorch Checkpoints

Example loading pattern:

```python
import torch

# Example only: replace MyModel with the corresponding architecture class.
from model import MyModel

model = MyModel(...)  # pass the same constructor arguments used during training
# weights_only=True restricts unpickling to tensors and plain containers.
state_dict = torch.load("best_models/bge-m3/model.pt", map_location="cpu", weights_only=True)
model.load_state_dict(state_dict)
model.eval()
```

If a checkpoint was saved as a full PyTorch model object rather than a `state_dict`, it can be loaded as:

```python
import torch

# Unpickling a full model needs weights_only=False (the default is True since
# PyTorch 2.6) and the original model class must be importable. Pickle can
# execute arbitrary code, so only load files you trust.
model = torch.load("best_models/bge-m3/model.pt", map_location="cpu", weights_only=False)
model.eval()
```

The exact loading method depends on how the checkpoint was saved during training.
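
When it is not known in advance which of the two formats a checkpoint uses, one option is to load it once and branch on the type of the result. The sketch below is an assumption about how that could be done, not the repository's own loader. `checkpoint_kind`, `load_checkpoint`, and the `build_model` factory are hypothetical names introduced here, and the check relies on a `state_dict` being a mapping while a pickled model is not.

```python
from collections.abc import Mapping


def checkpoint_kind(obj):
    """Classify a loaded checkpoint: a state_dict is a mapping, a pickled model is not."""
    return "state_dict" if isinstance(obj, Mapping) else "full_model"


def load_checkpoint(path, build_model=None):
    """Load either checkpoint format onto the CPU and return a model in eval mode.

    `build_model` is a hypothetical zero-argument factory returning the matching
    architecture; it is only needed for state_dict checkpoints.
    """
    import torch  # imported lazily so checkpoint_kind works without PyTorch

    # weights_only=False unpickles arbitrary objects: use it only for trusted files.
    obj = torch.load(path, map_location="cpu", weights_only=False)
    if checkpoint_kind(obj) == "state_dict":
        if build_model is None:
            raise ValueError("a state_dict checkpoint needs a build_model factory")
        model = build_model()
        model.load_state_dict(obj)
    else:
        model = obj
    return model.eval()
```

A call might look like `load_checkpoint("best_models/bge-m3/model.pt", build_model=lambda: MyModel(...))`. Note that some training scripts save a wrapper dictionary (for example with the weights under a nested key). Such a file would be classified as a `state_dict` here and fail in `load_state_dict`, so the saved structure should be inspected first.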

## Intended Use

These models are intended for:

- educational experiments;
- research on synthetic NLU datasets;
- multilingual intent classification;
- comparison of encoder-only and decoder-only architectures;
- prototyping voice assistant command recognition.

## Limitations

The models were trained on a synthetic dataset, so their performance on natural user traffic may differ from the test-set results.

Known limitations:

- possible sensitivity to synthetic generation style;
- errors on semantically close intents;
- dependence on data quality and intent taxonomy;
- limited robustness to real-world noise, slang, ASR errors, and incomplete phrases;
- potential confusion between intents with similar surface forms.

For production use, the models should be evaluated on real driver commands and monitored for data drift.
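
One simple way to watch for drift is to compare the distribution of predicted intents in live traffic against a reference window. The sketch below uses total variation distance for that comparison. The intent names and counts are invented for illustration, and any alert threshold would need to be chosen empirically. This is an assumed monitoring approach, not something the repository provides.

```python
from collections import Counter


def intent_distribution(predictions):
    """Relative frequency of each predicted intent in a non-empty window."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    counts = Counter(predictions)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def total_variation(p, q):
    """Total variation distance: half the L1 distance between two distributions."""
    labels = set(p) | set(q)
    return 0.5 * sum(abs(p.get(label, 0.0) - q.get(label, 0.0)) for label in labels)


# Invented prediction logs: a reference window and a later live window.
reference = ["navigation"] * 50 + ["climate"] * 30 + ["calls"] * 20
live = ["navigation"] * 30 + ["climate"] * 30 + ["calls"] * 10 + ["media"] * 30

drift = total_variation(intent_distribution(reference), intent_distribution(live))
print(round(drift, 3))
```

The distance ranges from 0 (identical distributions) to 1 (no overlap). A shift in predicted intents does not by itself prove that accuracy has dropped, so a rising value is best treated as a prompt to label and evaluate a fresh sample of real commands.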

## Citation

If you use these checkpoints, please cite or reference this repository:

```bibtex
@misc{multilingual-driver-command-models,
  title = {Multilingual Driver Command Models},
  author = {Nizhankovskiy, Ilya},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/INFINITY1023/multilingual-driver-command-models}}
}
```