seeklhy committed (verified)
Commit 323b79f · Parent(s): a291407

Update README.md

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -16,7 +16,7 @@ tags:
 ## Introduction
 We present an automatic and scalable text-to-SQL data synthesis framework, illustrated below:
 <p align="center">
-<img src="assets/framework.png" alt="Description" style="width: 100%; max-width: 600px;"/>
+<img src="framework.png" alt="Description" style="width: 100%; max-width: 600px;"/>
 </p>
 
 Based on this framework, we introduce the first million-scale text-to-SQL dataset, **SynSQL-2.5M**, containing over **2.5 million diverse and high-quality data samples**, spanning more than **16,000 databases from various domains**.
@@ -50,7 +50,7 @@ For more statistics and quality evaluations, refer to our paper. As of March 202
 ## Performance Evaluation
 We evaluate OmniSQL on a wide range of datasets, including standard benchmarks (Spider and BIRD), challenging domain-specific benchmarks (Spider2.0-SQLite, ScienceBenchmark, EHRSQL), and three robustness benchmarks (Spider-DK, Spider-Syn, Spider-Realistic). The evaluation results are shown below:
 <p align="center">
-<img src="assets/main_results.png" alt="Description" style="width: 100%; max-width: 800px;"/>
+<img src="main_results.png" alt="Description" style="width: 100%; max-width: 800px;"/>
 </p>
 
 "Gre" refers to greedy decoding, and "Maj" indicates majority voting at 8. Spider (dev), Spider-Syn, and Spider-Realistic are evaluated using the test-suite accuracy (TS) metric, while the remaining datasets are evaluated using the execution accuracy (EX) metric.
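The "Maj" (majority voting at 8) setup mentioned in the evaluation note could be sketched as below. This is a minimal illustration only: grouping candidates by their execution result and the `execute_sql` helper are assumptions for the sketch, not OmniSQL's actual implementation.

```python
# Hedged sketch of majority voting over 8 sampled SQL candidates (Maj@8):
# execute each candidate, group candidates by execution result, and return
# one candidate from the largest group. Assumes a SQLite database file.
import sqlite3
from collections import Counter


def execute_sql(db_path, sql):
    """Run a candidate query; return its result set, or None on error."""
    try:
        conn = sqlite3.connect(db_path)
        rows = conn.execute(sql).fetchall()
        conn.close()
        # Sort rows so result order does not affect the comparison.
        return tuple(sorted(map(tuple, rows)))
    except sqlite3.Error:
        return None


def majority_vote(db_path, candidates):
    """Pick the candidate whose execution result occurs most often."""
    results = {sql: execute_sql(db_path, sql) for sql in candidates}
    valid = [r for r in results.values() if r is not None]
    if not valid:
        return candidates[0]  # all candidates failed; fall back to the first
    winner_result, _ = Counter(valid).most_common(1)[0]
    for sql in candidates:  # first candidate matching the winning result
        if results[sql] == winner_result:
            return sql
```

Under this reading, greedy decoding ("Gre") corresponds to scoring a single sample, while Maj@8 scores the voted winner among 8 samples.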