| .. _intent_slot: |
|
|
| Joint Intent and Slot Classification |
| ==================================== |
|
|
| Joint Intent and Slot classification is an NLU task for classifying the intent of a query and detecting all |
| relevant slots (entities) for that intent. For example, in the query ``What is the weather in Santa Clara tomorrow morning?``, |
| we would like to classify the query as a ``weather intent``, detect ``Santa Clara`` as a ``location slot``, |
| and ``tomorrow morning`` as a ``date_time slot``. Intent and slot names are usually task-specific and |
| defined as labels in the training data. This is a fundamental step that is executed in any |
| task-driven conversational assistant. |
|
|
| Our BERT-based model implementation lets you train a single model on both tasks and run intent classification and slot detection together. |
|
|
| .. note:: |
| |
| We recommend you try the Joint Intent and Slot Classification model in a Jupyter notebook (it can run on `Google's Colab <https://colab.research.google.com/notebooks/intro.ipynb>`_): `NeMo/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb <https://github.com/NVIDIA/NeMo/blob/stable/tutorials/nlp/Joint_Intent_and_Slot_Classification.ipynb>`__. |
|
|
| Connect to an instance with a GPU (**Runtime** -> **Change runtime type** -> select **GPU** for the hardware accelerator). |
|
|
| An example script on how to train the model can be found here: `NeMo/examples/nlp/intent_slot_classification <https://github.com/NVIDIA/NeMo/tree/stable/examples/nlp/intent_slot_classification>`__. |
|
|
|
|
| NeMo Data Format |
| ---------------- |
|
|
| Before training the model, convert the dataset to the required data format, which consists of the following files: |
|
|
| - :code:`dict.intents.csv` - A list of all intent names in the data, one intent name per line. The index of the intent line |
| (starting from ``0``) is used to identify the appropriate intent in the ``train.tsv`` and ``test.tsv`` files. |
|
|
| .. code:: |
| |
| weather |
| alarm |
| meeting |
| ... |
|
|
| - :code:`dict.slots.csv` - A list of all slot names in the data, one slot name per line. The index of the slot line |
| (starting from ``0``) is used to identify the appropriate slots in the queries in the ``train_slots.tsv`` and ``test_slots.tsv`` files. |
| The last line of this dictionary contains the ``O`` slot name, which marks all ``out of scope`` tokens; these are usually the majority of the tokens |
| in the queries. |
|
|
| .. code:: |
| |
| date |
| time |
| city |
| ... |
| O |
|
|
| - :code:`train.tsv/test.tsv` - A list of original queries, one per line, with the intent number |
| separated by a tab (e.g. "what alarms do i have set right now <TAB> 0"). Intent numbers are |
| set according to the intent line in the intent dictionary file (:code:`dict.intents.csv`), |
| starting from ``0``. The first line in these files should contain the header line |
| ``sentence <TAB> label``. |
|
|
| - :code:`train_slots.tsv/test_slots.tsv` - A list that contains one line per query, where each word from the original text queries |
| is replaced by a token number from the slots dictionary file (``dict.slots.csv``), counted starting from ``0``. All the words |
| that do not belong to a relevant slot are replaced by the ``out-of-scope`` token number, which is also a part of the slot dictionary file, |
| usually as the last entry there. For example, a line from these files should look similar to: "54 0 0 54 54 12 12" (the numbers are |
| separated by a space). These files do not contain a header line. |
|
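The file layout described above can be reproduced for a small hand-written dataset with a short script. The following is a minimal sketch and not part of NeMo: the function name ``write_nemo_dataset`` and the sample data are made up for illustration, and it assumes that queries are already split into whitespace-separated words and that the slot files are named ``<prefix>_slots.tsv``.

```python
from pathlib import Path

# Hypothetical toy data, used only to illustrate the file layout.
INTENTS = ["weather", "alarm"]
SLOTS = ["time", "city", "O"]  # "O" (out of scope) is kept as the last entry
SAMPLES = [
    (["what", "is", "the", "weather", "in", "boston"], "weather",
     ["O", "O", "O", "O", "O", "city"]),
    (["set", "an", "alarm", "for", "seven"], "alarm",
     ["O", "O", "O", "O", "time"]),
]


def write_nemo_dataset(samples, intents, slots, out_dir, prefix="train"):
    """Write the two dictionary files plus <prefix>.tsv and <prefix>_slots.tsv.

    samples: iterable of (words, intent_name, slot_name_per_word) tuples.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # The line index (starting from 0) is the id of each intent and slot.
    intent_ids = {name: idx for idx, name in enumerate(intents)}
    slot_ids = {name: idx for idx, name in enumerate(slots)}
    (out / "dict.intents.csv").write_text("\n".join(intents) + "\n", encoding="utf-8")
    (out / "dict.slots.csv").write_text("\n".join(slots) + "\n", encoding="utf-8")

    query_lines = ["sentence\tlabel"]  # only the query file has a header line
    slot_lines = []
    for words, intent, word_slots in samples:
        if len(words) != len(word_slots):
            raise ValueError(f"need one slot name per word: {words!r}")
        query_lines.append(" ".join(words) + "\t" + str(intent_ids[intent]))
        slot_lines.append(" ".join(str(slot_ids[name]) for name in word_slots))

    (out / f"{prefix}.tsv").write_text("\n".join(query_lines) + "\n", encoding="utf-8")
    (out / f"{prefix}_slots.tsv").write_text("\n".join(slot_lines) + "\n", encoding="utf-8")
```

Calling ``write_nemo_dataset(SAMPLES, INTENTS, SLOTS, "data", prefix="train")`` writes the four files into a ``data`` directory; the same call with ``prefix="test"`` writes the test split.
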
|
|
|
| Dataset Conversion |
| ------------------ |
|
|
| To convert a dataset to the format the model expects, use the ``import_datasets`` utility, which implements |
| the conversion for the Assistant dataset. You can download the dataset `here <https://github.com/xliuhw/NLU-Evaluation-Data>`_, or you can |
| write your own converter for the format that you are using for data annotation. |
|
|
| For a dataset that follows your own annotation format, we recommend using one text file for all |
| samples of the same intent, with the name of the file as the name of the intent. Use one line per |
| query, with brackets to define slot names. This is very similar to the Assistant format, and you can |
| adapt this converter utility to your own format with small changes: |
| |
| :: |
| |
| did i set an alarm to [alarm_type : wake up] in the [timeofday : morning] |
| |
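A line in this bracket format can be split into words and per-word slot names with a small parser. The sketch below is an illustration only and is not the code used by ``import_datasets``: the name ``parse_annotated_query`` is hypothetical, and it assumes that brackets are never nested and that words are separated by whitespace.

```python
import re

# Matches "[slot_name : slot value]" as written in the example line above.
SLOT_PATTERN = re.compile(r"\[\s*([^\]:]+?)\s*:\s*([^\]]+?)\s*\]")


def parse_annotated_query(line, outside_label="O"):
    """Return (words, slot_names) with one slot name per word."""
    words, labels = [], []
    position = 0
    for match in SLOT_PATTERN.finditer(line):
        # Words before the bracket do not belong to any slot.
        for word in line[position:match.start()].split():
            words.append(word)
            labels.append(outside_label)
        # Every word inside the bracket gets the bracket's slot name.
        slot_name = match.group(1)
        for word in match.group(2).split():
            words.append(word)
            labels.append(slot_name)
        position = match.end()
    # Words after the last bracket.
    for word in line[position:].split():
        words.append(word)
        labels.append(outside_label)
    return words, labels


words, labels = parse_annotated_query(
    "did i set an alarm to [alarm_type : wake up] in the [timeofday : morning]"
)
```

Here ``words`` holds the eleven words of the plain query, and ``labels`` holds ``alarm_type`` for ``wake`` and ``up``, ``timeofday`` for ``morning``, and ``O`` for every other word.
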
| Run the ``import_datasets`` script: |
| |
| .. code:: |
| |
| python examples/nlp/intent_slot_classification/data/import_datasets.py \ |
| --source_data_dir=`source_data_dir` \ |
| --target_data_dir=`target_data_dir` \ |
| --dataset_name=['assistant'|'snips'|'atis'] |
| |
| - :code:`source_data_dir`: the directory location of your dataset |
| - :code:`target_data_dir`: the directory location where the converted dataset should be saved |
| - :code:`dataset_name`: one of the implemented dataset names |
| |
| After conversion, ``target_data_dir`` should contain the following files: |
| |
| .. code:: |
| |
| . |
| |--target_data_dir |
| |-- dict.intents.csv |
| |-- dict.slots.csv |
| |-- train.tsv |
| |-- train_slots.tsv |
| |-- test.tsv |
| |-- test_slots.tsv |
| |
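The layout above can be sanity-checked before training. The following is a minimal sketch, not a NeMo utility: the function name ``check_converted_dir`` is hypothetical, and it assumes whitespace-separated words and the ``train``/``test`` file prefixes shown in the listing.

```python
from pathlib import Path


def read_lines(path):
    """Return the non-empty lines of a UTF-8 text file."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


def check_converted_dir(data_dir, prefixes=("train", "test")):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    data_dir = Path(data_dir)
    required = ["dict.intents.csv", "dict.slots.csv"]
    for prefix in prefixes:
        required += [f"{prefix}.tsv", f"{prefix}_slots.tsv"]
    missing = [name for name in required if not (data_dir / name).is_file()]
    if missing:
        return [f"missing file: {name}" for name in missing]

    num_intents = len(read_lines(data_dir / "dict.intents.csv"))
    num_slots = len(read_lines(data_dir / "dict.slots.csv"))
    problems = []
    for prefix in prefixes:
        queries = read_lines(data_dir / f"{prefix}.tsv")
        slots = read_lines(data_dir / f"{prefix}_slots.tsv")
        # Only the query file starts with a header line.
        if not queries or queries[0].split("\t") != ["sentence", "label"]:
            problems.append(f"{prefix}.tsv: missing 'sentence<TAB>label' header")
        body = queries[1:]
        if len(body) != len(slots):
            problems.append(f"{prefix}: {len(body)} queries but {len(slots)} slot lines")
        for number, (query, slot_line) in enumerate(zip(body, slots), start=1):
            text, _, intent = query.rpartition("\t")
            slot_ids = slot_line.split()
            if not intent.isdigit() or int(intent) >= num_intents:
                problems.append(f"{prefix}.tsv query {number}: bad intent id {intent!r}")
            if len(text.split()) != len(slot_ids):
                problems.append(f"{prefix} query {number}: word count and slot count differ")
            if any(not s.isdigit() or int(s) >= num_slots for s in slot_ids):
                problems.append(f"{prefix}_slots.tsv line {number}: slot id out of range")
    return problems
```
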
| Model Training |
| -------------- |
| |
| This is a pretrained BERT-based model with two linear classifier heads on top of it: one for classifying the intent of the query and |
| another for classifying the slot of each token in the query. The model is trained with a combined loss function on the intent and slot |
| classification tasks on the given dataset. The model architecture is based on the paper `BERT for Joint Intent Classification and Slot Filling <https://arxiv.org/pdf/1902.10909.pdf>`__ :cite:`nlp-jis-chen2019bert`. |
| |
| For each query, the model classifies it as one of the intents from the intent dictionary, and it classifies each word of the query |
| as one of the slots from the slot dictionary, including the out-of-scope slot for all the remaining words in the query that do not |
| fall into another slot category. The out-of-scope slot (``O``) is part of the slot dictionary that the model is trained on. |
|
|
| An example model configuration file for training can be found at: `NeMo/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml <https://github.com/NVIDIA/NeMo/blob/stable/examples/nlp/intent_slot_classification/conf/intent_slot_classification_config.yaml>`__. |
| In the configuration file, define the parameters of the training and the model; most of the default values work well as they are. |
| |
| The specification can be roughly grouped into three categories: |
| |
| - Parameters that describe the training process: **trainer** |
| - Parameters that describe the model: **model** |
| - Parameters that describe the datasets: **model.train_ds**, **model.validation_ds**, **model.test_ds** |
| |
| More details about parameters in the spec file can be found below: |
| |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **Parameter** | **Data Type** | **Default** | **Description** | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.data_dir** | string | -- | The path of the data converted to the specified format. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.class_balancing** | string | ``null`` | Choose from ``[null, weighted_loss]``. The ``weighted_loss`` enables weighted class balancing of the loss. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.intent_loss_weight**              | float           | ``0.6``                                                                          | The weight of the intent loss relative to the slot loss in the total loss.                                   | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.pad_label** | integer | ``-1`` | A value to pad the inputs. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.ignore_extra_tokens** | boolean | ``false`` | A flag that specifies whether to ignore extra tokens. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.ignore_start_end** | boolean | ``true`` | A flag that specifies whether to not use the first and last token for slot training. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.head.num_output_layers** | integer | ``2`` | The number of fully connected layers of the classifier on top of the BERT model. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **model.head.fc_dropout** | float | ``0.1`` | The dropout ratio of the fully connected layers. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **training_ds.prefix** | string | ``train`` | A prefix for the training file names. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **validation_ds.prefix** | string | ``dev`` | A prefix for the validation file names. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| | **test_ds.prefix** | string | ``test`` | A prefix for the test file names. | |
| +-------------------------------------------+-----------------+----------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------+ |
| |
| For additional config parameters common to all NLP models, refer to the `nlp_model doc <https://github.com/NVIDIA/NeMo/blob/stable/docs/source/nlp/nlp_model.rst#model-nlp>`__. |
| |
| The following is an example of the command for training the model: |
| |
| .. code:: |
| |
| python examples/nlp/intent_slot_classification/intent_slot_classification.py \ |
| model.data_dir=<PATH_TO_DATA_DIR> \ |
| trainer.max_epochs=<NUM_EPOCHS> \ |
| trainer.devices=[<CHANGE_TO_GPU(s)_YOU_WANT_TO_USE>] \ |
| trainer.accelerator='gpu' |
|
|
|
|
| Required Arguments for Training |
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
|
| - :code:`model.data_dir`: the dataset directory |
|
|
|
|
| Optional Arguments |
| ^^^^^^^^^^^^^^^^^^ |
|
|
| Most of the default parameters in the existing configuration file are already set appropriately; however, there are some parameters |
| you may want to experiment with. |
|
|
| - ``trainer.max_epochs``: the number of training epochs (a value between 10 and 100 is reasonable) |
| - ``model.class_balancing``: the value ``weighted_loss`` may help training when the classes are unbalanced |
| - ``model.intent_loss_weight``: a number between 0 and 1 that defines the weight of the intent loss versus the slot loss during training. The default value of 0.6 gives a slight preference to optimizing the intent loss. |
|
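The effect of ``model.intent_loss_weight`` can be illustrated with plain numbers. This is a sketch under an assumption: it treats the total loss as a weighted sum in which the intent loss gets the weight ``w`` and the slot loss gets ``1 - w``. The text above does not spell out the exact aggregation, and the function name ``combined_loss`` is hypothetical.

```python
def combined_loss(intent_loss, slot_loss, intent_loss_weight=0.6):
    """Weighted sum of the two task losses (assumed form, for illustration only)."""
    if not 0.0 <= intent_loss_weight <= 1.0:
        raise ValueError("intent_loss_weight must be between 0 and 1")
    slot_loss_weight = 1.0 - intent_loss_weight
    return intent_loss_weight * intent_loss + slot_loss_weight * slot_loss


# With the default weight, an intent loss of 1.0 and a slot loss of 2.0
# contribute 0.6 * 1.0 and 0.4 * 2.0 to the total.
total = combined_loss(1.0, 2.0)
```

Under this assumption, a weight of ``1.0`` would train on the intent loss alone and a weight of ``0.0`` on the slot loss alone, so values near ``0.5`` keep both tasks in play.
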
|
| Training Procedure |
| ^^^^^^^^^^^^^^^^^^ |
|
|
| At the start of training, NeMo prints a log of the experiment specification, a summary of the training dataset, and the |
| model architecture. |
|
|
| As the model starts training, you should see a progress bar per epoch. After each epoch, NeMo displays accuracy |
| metrics on the validation dataset for every intent and slot separately, as well as the total accuracy. You can expect these numbers |
| to keep improving for up to 50-100 epochs, depending on the size of the training data. Because intents and slots are trained jointly, |
| intent accuracy usually improves first, during the initial 10-20 epochs, and slot accuracy starts improving after that. |
|
|
| At the end of training, NeMo saves the checkpoint that performed best on the validation dataset at the path specified by the |
| experiment spec file. |
|
|
| .. code:: |
|
|
| GPU available: True, used: True |
| TPU available: None, using: 0 TPU cores |
| LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0,1,2] |
| [NeMo W 2021-01-28 14:52:19 exp_manager:299] There was no checkpoint folder at checkpoint_dir :results/checkpoints. Training from scratch. |
| [NeMo I 2021-01-28 14:52:19 exp_manager:186] Experiments will be logged at results |
| ... |
| label precision recall f1 support |
| weather.weather (label_id: 0) 0.00 0.00 0.00 128 |
| weather.temperature (label_id: 1) 0.00 0.00 0.00 0 |
| weather.temperature_yes_no (label_id: 2) 0.00 0.00 0.00 0 |
| weather.rainfall (label_id: 3) 0.00 0.00 0.00 0 |
| weather.rainfall_yes_no (label_id: 4) 0.00 0.00 0.00 0 |
| weather.snow (label_id: 5) 0.00 0.00 0.00 0 |
| weather.snow_yes_no (label_id: 6) 0.00 0.00 0.00 0 |
| weather.humidity (label_id: 7) 0.00 0.00 0.00 0 |
| weather.humidity_yes_no (label_id: 8) 0.00 0.00 0.00 0 |
| weather.windspeed (label_id: 9) 0.00 0.00 0.00 0 |
| weather.sunny (label_id: 10) 0.00 0.00 0.00 0 |
| weather.cloudy (label_id: 11) 0.00 0.00 0.00 0 |
| weather.alert (label_id: 12) 0.00 0.00 0.00 0 |
| context.weather (label_id: 13) 0.00 0.00 0.00 0 |
| context.continue (label_id: 14) 0.00 0.00 0.00 0 |
| context.navigation (label_id: 15) 0.00 0.00 0.00 0 |
| context.rating (label_id: 16) 0.00 0.00 0.00 0 |
| context.distance (label_id: 17) 0.00 0.00 0.00 0 |
| ------------------- |
| micro avg 0.00 0.00 0.00 128 |
| macro avg 0.00 0.00 0.00 128 |
| weighted avg 0.00 0.00 0.00 128 |
|
|
| Model Evaluation and Inference |
| ------------------------------ |
|
|
| There is no separate script for the evaluation and inference of this model in NeMo. However, the example file ``examples/nlp/intent_slot_classification/intent_slot_classification.py`` |
| contains, after the training part, code that evaluates the trained model on a test set and then runs inference on a list of given queries. |
|
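Predicted ids are only useful once they are mapped back to the names in the two dictionary files. The sketch below shows one possible post-processing step and does not use the NeMo API: it assumes the model has already produced one intent id and one slot id per word, the function name ``decode_prediction`` and the example ids are hypothetical, and merging neighbouring words with the same slot into one span is a simplification.

```python
from itertools import groupby


def decode_prediction(words, intent_id, slot_ids, intent_names, slot_names,
                      outside_label="O"):
    """Map ids to names and merge neighbouring words that share a slot."""
    if len(words) != len(slot_ids):
        raise ValueError("need exactly one slot id per word")
    intent = intent_names[intent_id]
    labelled = [(slot_names[slot_id], word) for word, slot_id in zip(words, slot_ids)]
    spans = []
    for label, group in groupby(labelled, key=lambda pair: pair[0]):
        if label != outside_label:
            spans.append((label, " ".join(word for _, word in group)))
    return intent, spans


# Hypothetical dictionaries and predictions for the query from the introduction.
intent_names = ["weather", "alarm"]
slot_names = ["location", "date_time", "O"]
words = "what is the weather in santa clara tomorrow morning".split()
intent, spans = decode_prediction(
    words, 0, [2, 2, 2, 2, 2, 0, 0, 1, 1], intent_names, slot_names
)
```

With these inputs, ``intent`` is ``weather`` and ``spans`` pairs ``location`` with ``santa clara`` and ``date_time`` with ``tomorrow morning``.
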
|
| For deployment in a production environment, refer to `NVIDIA Riva <https://docs.nvidia.com/deeplearning/riva/user-guide/docs/quick-start-guide.html>`__ and the `NVIDIA TLT documentation <https://docs.nvidia.com/metropolis/TLT/tlt-user-guide/text/nlp/index.html>`__. |
|
|
| References |
| ---------- |
|
|
| .. bibliography:: nlp_all.bib |
| :style: plain |
| :labelprefix: NLP-JIS |
| :keyprefix: nlp-jis- |
|
|