Spaces:

betterdataai
/

IRG

Running

File size: 1,437 Bytes

---
title: IRG
emoji: "🖥️"
colorFrom: "blue"
colorTo: "green"
sdk: static
sdk_version: "0.0.1"
pinned: false
---

# Incremental Relational Generator (IRG)

## Pre-requisites

1. `Python>=3.10`.
2. Dependencies `pip install -r requirements.txt`.

## Important Notes

The project is now in commercial usage, so the full code is not disclosed. This repository aims to provide
the code skeleton and reproduces the main logic for IRG. It is still executable, but some functions are filled with 
placeholders that do not do the same thing as what is described in the paper, and some core parameters are not exposed.

Methods that are filled by placeholders are annotated with `@placeholder`. 

The evaluation framework is also not exposed for the same reason.

## Quick Start

To train and generate a synthetic database, run the following command:

```shell
python main.py -c CONFIG_FILE -i DATA_DIR -o OUT_DIR
```

where `CONFIG_FILE` is the configuration yaml file, with a sample in `config/sample.yaml`, `DATA_DIR` is the directory
where real data is stored, and each table name mentioned in the config should have the file `TABLE_NAME.csv` found
in this directory, and `OUT_DIR` be the directory where trained results can be found.

The generated synthetic data can be found in `OUT_DIR/generated` csv files.

## Reproduction

Commands and scripts to reproduce the results shown in our paper, the commands are shown in the `Makefile`.