============================================
Training a Model & Configuration Explanation
============================================

This tutorial shows how one can use ``EasyTPP`` to train the implemented models.

First, we need to set up a yaml config file containing all the input configuration that guides the training and evaluation process. The overall structure of a config file is shown below:

.. code-block:: yaml

    pipeline_config_id: ..  # name of the config for guiding the pipeline

    data:
        [Dataset ID]:  # name of the dataset, e.g., taxi
            ....

    [EXPERIMENT ID]:   # name of the experiment to run
        base_config:
          ....
        model_config:
           ...


After the config file is set up, we can run the script, specifying the `config directory` and `experiment id`, to start the pipeline. A preset script is provided at `examples/train_nhp.py`.


Step 1: Set up the config file containing data and model configs
==================================================================


To be specific, one needs to define the following entries in the config file:

- **pipeline_config_id**: the registered name of an ``EasyTPP.Config`` object, such as `runner_config` or `hpo_runner_config`. Based on this entry, the corresponding configuration class is loaded to construct the pipeline.

.. code-block:: yaml

    pipeline_config_id: runner_config


- **data**: dataset specifications. Multiple datasets can be listed in the config file, but only one is used per experiment.

    - *[DATASET ID]*: name of the dataset, e.g., taxi.
    - *train_dir, valid_dir, test_dir*: paths of the data files. For the moment only pkl files are accepted (please see `Dataset <./dataset.html>`_ for details).
    - *data_spec*: defines the event type information.

.. code-block:: yaml

    data:
      taxi:
        data_format: pkl
        train_dir: ../data/taxi/train.pkl
        valid_dir: ../data/taxi/dev.pkl
        test_dir: ../data/taxi/test.pkl
        data_spec:
            num_event_types: 7  # num of types excluding pad events.
            pad_token_id: 6    # event type index for pad events
            padding_side: right   # pad at the right end of the sequence
            truncation_side: right   # truncate at the right end of the sequence
            max_len: 100            # max sequence length used as model input
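
The ``data_spec`` entries above control how raw event sequences are padded and truncated before being fed to the model. The following is an illustrative sketch of right-padding and right-truncation; the ``pad_and_truncate`` helper is hypothetical, not EasyTPP's internal code:

```python
# Illustrative sketch of how padding_side / truncation_side / max_len interact.
# NOT EasyTPP's internal code; the names mirror the yaml keys above.

def pad_and_truncate(event_types, pad_token_id=6, max_len=100):
    """Right-truncate a sequence to max_len, then right-pad with pad_token_id."""
    seq = event_types[:max_len]                          # truncation_side: right
    return seq + [pad_token_id] * (max_len - len(seq))   # padding_side: right

padded = pad_and_truncate([0, 3, 5, 2], pad_token_id=6, max_len=8)
# padded == [0, 3, 5, 2, 6, 6, 6, 6]
```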

- **[EXPERIMENT ID]**: name of the experiment to run in the pipeline. It contains the following blocks of configs:

*base_config* contains the pipeline-level specifications.

.. code-block:: yaml

    base_config:
        stage: train       # train, eval and generate
        backend: tensorflow   # tensorflow and torch
        dataset_id: conttime   # name of the dataset
        runner_id: std_tpp     # registered name of the pipeline runner
        model_id: RMTPP       # registered name of the implemented model
        base_dir: './checkpoints/'   # base dir to save the logs and models.



*model_config* contains the model-related specifications.


.. code-block:: yaml

      model_config:
            hidden_size: 32
            time_emb_size: 16
            num_layers: 2
            num_heads: 2
            mc_num_sample_per_step: 20
            sharing_param_layer: False
            loss_integral_num_sample_per_step: 20
            dropout: 0.0
            use_ln: False
            thinning_params:   # thinning algorithm for event sampling
                  num_seq: 10
                  num_sample: 1
                  num_exp: 500 # number of i.i.d. Exp(intensity_bound) draws at one time in thinning algorithm
                  look_ahead_time: 10
                  patience_counter: 5 # the maximum iteration used in adaptive thinning
                  over_sample_rate: 5
                  num_samples_boundary: 5
                  dtime_max: 5
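
The ``thinning_params`` configure the thinning (rejection) sampler used for event generation. The following is a minimal scalar sketch of the underlying idea, using a hypothetical ``thinning_sample`` helper; EasyTPP's actual implementation is vectorized and model-aware:

```python
import random

def thinning_sample(intensity_fn, t_start, intensity_bound,
                    over_sample_rate=5, num_exp=500, dtime_max=5.0,
                    rng=random.Random(0)):
    """Draw the next event time after t_start via the thinning algorithm.

    Illustrative sketch only: candidate gaps are i.i.d. Exp(bound) draws
    (cf. num_exp), accepted with probability intensity(t) / bound, and the
    sampled inter-event time is capped at dtime_max.
    """
    bound = intensity_bound * over_sample_rate  # inflate the bound for safety
    t = t_start
    for _ in range(num_exp):
        t += rng.expovariate(bound)             # Exp(bound) proposal gap
        if t - t_start > dtime_max:             # cap the inter-event time
            return t_start + dtime_max
        if rng.random() * bound <= intensity_fn(t):  # accept / reject
            return t
    return t_start + dtime_max

# Example: a homogeneous Poisson process with constant intensity 2.0
next_t = thinning_sample(lambda t: 2.0, t_start=0.0, intensity_bound=2.0)
```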


*trainer_config* contains the training-related specifications.

.. code-block:: yaml

        trainer_config:   # trainer arguments
            seed: 2019
            gpu: 0
            batch_size: 256
            max_epoch: 10
            shuffle: False
            optimizer: adam
            learning_rate: 1.e-3
            valid_freq: 1
            use_tfb: False
            metrics: ['acc', 'rmse']
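
The ``metrics`` entry selects what the evaluator reports. As a hedged sketch, for TPPs these two metrics typically mean next-event type accuracy and the RMSE of predicted inter-event times; the exact definitions used by EasyTPP's evaluator may differ in detail, and the helper names below are illustrative:

```python
import math

def type_accuracy(pred_types, true_types):
    """Fraction of next-event types predicted correctly ('acc')."""
    hits = sum(p == t for p, t in zip(pred_types, true_types))
    return hits / len(true_types)

def dtime_rmse(pred_dtimes, true_dtimes):
    """Root mean squared error of predicted inter-event times ('rmse')."""
    se = sum((p - t) ** 2 for p, t in zip(pred_dtimes, true_dtimes))
    return math.sqrt(se / len(true_dtimes))

acc = type_accuracy([1, 0, 2, 2], [1, 0, 1, 2])   # 0.75
rmse = dtime_rmse([0.5, 1.0], [0.5, 2.0])         # ~0.707
```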




A complete example of these files can be seen at *examples/example_config*.
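
To make the overall layout concrete, the yaml above maps to a nested dict in which an experiment block is looked up by its id, and its ``dataset_id`` points into the ``data`` section. The following minimal sketch shows this lookup; the ``select_experiment`` helper is illustrative, not EasyTPP's actual loader:

```python
# A trimmed-down dict mirroring the yaml structure described above.
config = {
    "pipeline_config_id": "runner_config",
    "data": {
        "taxi": {"data_format": "pkl", "train_dir": "../data/taxi/train.pkl"},
    },
    "RMTPP_train": {  # [EXPERIMENT ID]
        "base_config": {"stage": "train", "dataset_id": "taxi", "model_id": "RMTPP"},
        "model_config": {"hidden_size": 32},
    },
}

def select_experiment(cfg, experiment_id):
    """Pick one experiment block and resolve the dataset it refers to."""
    exp = cfg[experiment_id]
    dataset = cfg["data"][exp["base_config"]["dataset_id"]]
    return exp, dataset

exp, dataset = select_experiment(config, "RMTPP_train")
# dataset["data_format"] == "pkl"
```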


Step 2: Run the training script
===============================================

To run the training process, we simply need two classes:

1. ``Config``: reads the config directory specified in Step 1 and does some processing to form a complete configuration.
2. ``Runner``: reads the configuration and sets up the whole pipeline for training, evaluation and generation.


The following example is copied from *examples/train_nhp.py*.


.. code-block:: python

    import argparse
    from easy_tpp.config_factory import Config
    from easy_tpp.runner import Runner


    def main():
        parser = argparse.ArgumentParser()

        parser.add_argument('--config_dir', type=str, required=False, default='configs/experiment_config.yaml',
                            help='Dir of configuration yaml to train and evaluate the model.')

        parser.add_argument('--experiment_id', type=str, required=False, default='RMTPP_train',
                            help='Experiment id in the config file.')

        args = parser.parse_args()

        config = Config.build_from_yaml_file(args.config_dir, experiment_id=args.experiment_id)

        model_runner = Runner.build_from_config(config)

        model_runner.run()


    if __name__ == '__main__':
        main()





Check out the output
========================


During training, the log, the best model (selected by performance on the validation set) and the complete configuration file are all saved. The directory of the saved files is specified by the ``base_dir`` entry of the ``base_config`` block.



In the `./checkpoints/` folder, one finds the correct subfolder via its name, which concatenates the run identifiers and the running timestamp. Inside that subfolder, there is a complete configuration file, e.g., ``RMTPP_train_output.yaml``, that records all the information used in the pipeline:

.. code-block:: yaml

    data_config:
      train_dir: ../data/conttime/train.pkl
      valid_dir: ../data/conttime/dev.pkl
      test_dir: ../data/conttime/test.pkl
      specs:
        num_event_types_pad: 6
        num_event_types: 5
        event_pad_index: 5
      data_format: pkl
    base_config:
      stage: train
      backend: tensorflow
      dataset_id: conttime
      runner_id: std_tpp
      model_id: RMTPP
      base_dir: ./checkpoints/
      exp_id: RMTPP_train
      log_folder: ./checkpoints/98888_4299965824_221205-153425
      saved_model_dir: ./checkpoints/98888_4299965824_221205-153425/models/saved_model
      saved_log_dir: ./checkpoints/98888_4299965824_221205-153425/log
      output_config_dir: ./checkpoints/98888_4299965824_221205-153425/RMTPP_train_output.yaml
    model_config:
      hidden_size: 32
      time_emb_size: 16
      num_layers: 2
      num_heads: 2
      mc_num_sample_per_step: 20
      sharing_param_layer: false
      loss_integral_num_sample_per_step: 20
      dropout: 0.0
      use_ln: false
      seed: 2019
      gpu: 0
      thinning_params:
        num_seq: 10
        num_sample: 1
        num_exp: 500
        look_ahead_time: 10
        patience_counter: 5
        over_sample_rate: 5
        num_samples_boundary: 5
        dtime_max: 5
        num_step_gen: 1
      trainer:
        batch_size: 256
        max_epoch: 10
        shuffle: false
        optimizer: adam
        learning_rate: 0.001
        valid_freq: 1
        use_tfb: false
        metrics:
        - acc
        - rmse
        seq_pad_end: true
      is_training: true
      num_event_types_pad: 6
      num_event_types: 5
      event_pad_index: 5
      model_id: RMTPP



If ``use_tfb`` is set to ``true``, TensorBoard can be launched to track the training process; see `Running Tensorboard <../advanced/tensorboard.html>`_ for details.