====================
Quick Start
====================


We use the [Taxi]_ dataset as an example to show how to use ``EasyTPP`` to train a model. More details and results are provided in `Training Pipeline <../user_guide/run_train_pipeline.html>`_.


Download Dataset
===================



The Taxi dataset we use was preprocessed by `HYPRO <https://github.com/iLampard/hypro_tpp>`_. You can download it either as pickle files from Google Drive `here <https://drive.google.com/drive/folders/1vNX2gFuGfhoh-vngoebaQlj2-ZIZMiBo>`_ or as JSON files from `HuggingFace <https://huggingface.co/easytpp>`_.


Note that if the data sources are pickle files, the data config (in `Example Config <https://github.com/ant-research/EasyTemporalPointProcess/blob/main/examples/configs/experiment_config.yaml>`_) should be written as follows:

.. code-block:: yaml

    data:
      taxi:
        data_format: pickle
        train_dir: ./data/taxi/train.pkl
        valid_dir: ./data/taxi/dev.pkl
        test_dir: ./data/taxi/test.pkl
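
Before pointing the config at a pickle file, it can help to look at what the file contains. The following is a minimal sketch using only the Python standard library. ``summarize_pickle`` is a hypothetical helper, not part of ``EasyTPP``, and it assumes nothing about the layout of the Taxi files beyond their being loadable with ``pickle``. Only unpickle files from sources you trust.

```python
import pickle
from pathlib import Path


def summarize_pickle(path):
    """Load a pickle file and describe its top-level structure.

    Hypothetical helper for illustration; not part of EasyTPP.
    Returns a dict mapping each top-level key to (type name, length or None).
    """
    with Path(path).open("rb") as f:
        obj = pickle.load(f)

    # For a dict, report each key with the type and length of its value.
    if isinstance(obj, dict):
        summary = {}
        for key, value in obj.items():
            size = len(value) if hasattr(value, "__len__") else None
            summary[key] = (type(value).__name__, size)
        return summary

    # For anything else, report the type and length of the object itself.
    size = len(obj) if hasattr(obj, "__len__") else None
    return {"<root>": (type(obj).__name__, size)}


if __name__ == "__main__":
    # Path taken from the config above; adjust it to your local layout.
    target = Path("./data/taxi/train.pkl")
    if target.exists():
        for key, (type_name, size) in summarize_pickle(target).items():
            print(f"{key}: {type_name}, length={size}")
    else:
        print(f"{target} not found")
```

If loading fails with a decoding error, the file may have been written by an older Python version; passing ``encoding="latin-1"`` to ``pickle.load`` is a common workaround, though whether it is needed here is not stated in this guide.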

To load the data directly from HuggingFace, write the config this way:

.. code-block:: yaml

    data:
      taxi:
        data_format: json
        train_dir: easytpp/taxi
        valid_dir: easytpp/taxi
        test_dir: easytpp/taxi


Alternatively, the config can point to local JSON files downloaded from HuggingFace:

.. code-block:: yaml

    data:
      taxi:
        data_format: json
        train_dir: ./data/taxi/train.json
        valid_dir: ./data/taxi/dev.json
        test_dir: ./data/taxi/test.json
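
The three variants above differ only in ``data_format`` and in where each split points. As a rough illustration, the sketch below builds the same ``data`` section as a nested Python dict. ``build_data_config`` and the source labels ``"pickle"``, ``"json_local"``, and ``"json_hub"`` are hypothetical names chosen for this example and are not part of ``EasyTPP``; the paths and keys are copied from the YAML snippets above.

```python
import json


def build_data_config(dataset, source):
    """Return the ``data`` section of the config as a nested dict.

    Hypothetical helper for illustration; not part of EasyTPP.
    ``source`` is one of "pickle", "json_local", or "json_hub".
    """
    # Config key -> file stem used in the YAML examples above.
    splits = {"train_dir": "train", "valid_dir": "dev", "test_dir": "test"}

    if source == "pickle":
        entry = {"data_format": "pickle"}
        entry.update({key: f"./data/{dataset}/{stem}.pkl" for key, stem in splits.items()})
    elif source == "json_local":
        entry = {"data_format": "json"}
        entry.update({key: f"./data/{dataset}/{stem}.json" for key, stem in splits.items()})
    elif source == "json_hub":
        entry = {"data_format": "json"}
        # Every split points at the same HuggingFace repository id.
        entry.update({key: f"easytpp/{dataset}" for key in splits})
    else:
        raise ValueError(f"unknown source: {source!r}")

    return {"data": {dataset: entry}}


if __name__ == "__main__":
    for source in ("pickle", "json_hub", "json_local"):
        print(source)
        print(json.dumps(build_data_config("taxi", source), indent=2))
```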




Set Up the Configuration File
==============================

We provide a preset config file in `Example Config <https://github.com/ant-research/EasyTemporalPointProcess/blob/main/examples/configs/experiment_config.yaml>`_. The details of the configuration can be found in `Training Pipeline <../user_guide/run_train_pipeline.html>`_.




Train the Model
=========================

At this stage we need a script that runs the training pipeline. A preset script, `train_nhp.py <https://github.com/ant-research/EasyTemporalPointProcess/blob/main/examples/train_nhp.py>`_, is provided and can simply be copied.

Taking the pickle data source as an example, once the data, config, and run script are in place, the directory structure is as follows:

.. code-block:: bash

    data
     |______taxi
             |____ train.pkl
             |____ dev.pkl
             |____ test.pkl

    configs
     |______experiment_config.yaml

    train_nhp.py
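
A quick way to confirm that this layout is in place before training is to check for each file. The following is a minimal sketch using only the standard library; ``EXPECTED_FILES`` and ``missing_files`` are hypothetical names for this example, and the paths simply mirror the tree above.

```python
from pathlib import Path

# Relative paths taken from the directory structure shown above.
EXPECTED_FILES = (
    "data/taxi/train.pkl",
    "data/taxi/dev.pkl",
    "data/taxi/test.pkl",
    "configs/experiment_config.yaml",
    "train_nhp.py",
)


def missing_files(root="."):
    """Return the expected files that do not exist under ``root``."""
    root = Path(root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:")
        for rel in missing:
            print(f"  {rel}")
    else:
        print("Layout looks complete.")
```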



Then run the following command:


.. code-block:: bash

    python train_nhp.py



Reference
----------

.. [Taxi]

.. code-block:: bibtex

    @misc{whong-14-taxi,
      title = {F{OIL}ing {NYC}’s Taxi Trip Data},
      author = {Whong, Chris},
      year = {2014},
      url = {https://chriswhong.com/open-data/foil_nyc_taxi/}
    }