.. currentmodule:: socceraction.data

*************
Loading data
*************

Socceraction provides API clients for various popular event stream data
sources. These clients fetch event streams and their corresponding
metadata as Pandas DataFrames using a unified data model.
Alternatively, you can use `kloppy <https://kloppy.pysport.org/>`__ to
load data.

Loading data with socceraction
==============================

All API clients implemented in socceraction inherit from the
:class:`~base.EventDataLoader` interface. This interface provides the
following methods to retrieve data as Pandas DataFrames with a unified data
model (i.e., :class:`~pandera.Schema`). The schema defines the minimal set of
columns and their types that are returned by each method. Implementations of
the :class:`~base.EventDataLoader` interface may add extra columns.

.. list-table::
   :widths: 40 20 40
   :header-rows: 1

   * - Method
     - Output schema
     - Description
   * - :meth:`competitions() <base.EventDataLoader.competitions>`
     - :class:`~schema.CompetitionSchema`
     - All available competitions and seasons
   * - :meth:`games(competition_id, season_id) <base.EventDataLoader.games>`
     - :class:`~schema.GameSchema`
     - All available games in a season
   * - :meth:`teams(game_id) <base.EventDataLoader.teams>`
     - :class:`~schema.TeamSchema`
     - Both teams that participated in a game
   * - :meth:`players(game_id) <base.EventDataLoader.players>`
     - :class:`~schema.PlayerSchema`
     - All players that participated in a game
   * - :meth:`events(game_id) <base.EventDataLoader.events>`
     - :class:`~schema.EventSchema`
     - The event stream of a game

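
The "minimal set of columns" rule means that a returned DataFrame must contain
every column named in the schema but may contain more. The sketch below
illustrates that rule with plain Python sets. The helper ``missing_columns``
and the column names in ``REQUIRED_COMPETITION_COLUMNS`` are assumptions made
for illustration and are not part of socceraction; the authoritative column
list is the schema class itself::

    # Assumed minimal columns for the competitions table (illustrative only).
    REQUIRED_COMPETITION_COLUMNS = {
        "competition_id",
        "season_id",
        "competition_name",
        "season_name",
    }


    def missing_columns(columns, required):
        """Return the required column names absent from ``columns``, sorted."""
        return sorted(set(required) - set(columns))


    # A provider may return extra columns; only missing ones are a problem.
    returned = [
        "competition_id",
        "season_id",
        "competition_name",
        "season_name",
        "country_name",  # extra, provider-specific column
    ]
    print(missing_columns(returned, REQUIRED_COMPETITION_COLUMNS))  # []

With a real DataFrame ``df``, ``df.columns`` can be passed as the first
argument.
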
Currently, the following data providers are supported:

.. toctree::
   :maxdepth: 1

   statsbomb
   wyscout
   opta

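
Because every loader exposes the same methods, code written against the
interface does not depend on the provider. The following sketch chains
``games`` and ``events`` to collect the event stream of every game in a
season. ``load_season_events`` and ``StubLoader`` are hypothetical names. The
stub returns plain dicts of columns in place of DataFrames so that the example
runs without socceraction installed, and the ``game_id`` column is assumed to
be present in the output of ``games``::

    def load_season_events(loader, competition_id, season_id):
        """Fetch the event stream of every game in one season.

        ``loader`` is any object offering ``games(competition_id, season_id)``
        and ``events(game_id)``. Returns a dict mapping game ID to events.
        """
        games = loader.games(competition_id, season_id)
        # Column access by name works for a DataFrame and for the stub's dict.
        return {game_id: loader.events(game_id) for game_id in games["game_id"]}


    class StubLoader:
        """Hypothetical in-memory stand-in for a real data loader."""

        def games(self, competition_id, season_id):
            return {
                "game_id": [101, 102],
                "competition_id": [competition_id] * 2,
                "season_id": [season_id] * 2,
            }

        def events(self, game_id):
            # Two made-up events per game, for illustration only.
            return {"game_id": [game_id, game_id], "type_name": ["pass", "shot"]}


    events_by_game = load_season_events(StubLoader(), competition_id=1, season_id=2)
    print(sorted(events_by_game))  # [101, 102]

Replacing ``StubLoader()`` with an instance of one of the supported loaders
should leave ``load_season_events`` unchanged, since it relies only on the
shared interface.
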
Loading data with kloppy
=========================

Like socceraction, `kloppy <https://kloppy.pysport.org/>`__ implements
a unified data model for soccer data. The main differences between kloppy and
socceraction are: (1) kloppy supports more data sources (including tracking
data), (2) kloppy uses a more flexible object-based data model in contrast to
socceraction's dataframe-based model, and (3) kloppy covers a more complete
set of events while socceraction focuses on on-the-ball events. Thus, we
recommend using kloppy if you want to load data from a source that is not
supported by socceraction or when your analysis is not limited to on-the-ball
events.

The following code snippet shows how to load data from StatsBomb using
kloppy::

    from kloppy import statsbomb

    # Load the event data of one match from the StatsBomb open data
    dataset = statsbomb.load_open_data(match_id=8657)

Instructions for loading data from other sources can be found in the
`kloppy documentation <https://kloppy.pysport.org/>`__.

You can then convert the data to the SPADL format using the
:func:`~socceraction.spadl.kloppy.convert_to_actions` function::

    from socceraction.spadl.kloppy import convert_to_actions

    # Convert the kloppy event dataset to SPADL actions for this game
    spadl_actions = convert_to_actions(dataset, game_id=8657)

.. note::

   Currently, the data model of kloppy is only complete for StatsBomb data.
   If you use kloppy to load data from other sources and convert it to the
   SPADL format, you may lose some information.