Update README.md
README.md CHANGED
@@ -16,7 +16,7 @@ tags:
 ## Introduction
 We present an automatic and scalable text-to-SQL data synthesis framework, illustrated below:
 <p align="center">
-<img src="
+<img src="framework.png" alt="Description" style="width: 100%; max-width: 600px;"/>
 </p>

 Based on this framework, we introduce the first million-scale text-to-SQL dataset, **SynSQL-2.5M**, containing over **2.5 million diverse and high-quality data samples**, spanning more than **16,000 databases from various domains**.

@@ -50,7 +50,7 @@ For more statistics and quality evaluations, refer to our paper. As of March 202
 ## Performance Evaluation
 We evaluate OmniSQL on a wide range of datasets, including standard benchmarks (Spider and BIRD), challenging domain-specific benchmarks (Spider2.0-SQLite, ScienceBenchmark, EHRSQL), and three robustness benchmarks (Spider-DK, Spider-Syn, Spider-Realistic). The evaluation results are shown below:
 <p align="center">
-<img src="
+<img src="main_results.png" alt="Description" style="width: 100%; max-width: 800px;"/>
 </p>

 "Gre" refers to greedy decoding, and "Maj" indicates majority voting at 8. Spider (dev), Spider-Syn, and Spider-Realistic are evaluated using the test-suite accuracy (TS) metric, while the remaining datasets are evaluated using the execution accuracy (EX) metric.
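The "Maj at 8" setting mentioned in the evaluation text can be sketched roughly as self-consistency voting: sample 8 candidate SQL queries, execute each, and return the query whose execution result is most common. This is a minimal illustrative sketch, not the OmniSQL evaluation code; the names `majority_vote` and `execute` are hypothetical.

```python
from collections import Counter

def majority_vote(candidates, execute):
    """Pick the candidate SQL whose execution result is most common.

    candidates: list of sampled SQL strings (e.g. 8 generations).
    execute: callable mapping a SQL string to a hashable result,
             raising on execution failure. Both names are
             illustrative assumptions, not a real OmniSQL API.
    """
    results = {}
    for sql in candidates:
        try:
            results[sql] = execute(sql)
        except Exception:
            results[sql] = None  # failed queries never win the vote

    counts = Counter(r for r in results.values() if r is not None)
    if not counts:
        return candidates[0]  # all candidates failed: fall back to the first

    best_result, _ = counts.most_common(1)[0]
    # Return the first candidate that produced the majority result.
    for sql in candidates:
        if results[sql] == best_result:
            return sql
```

Greedy decoding ("Gre"), by contrast, produces a single candidate, so no vote is needed.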
|