SchemaSage-SQL Model Card

This is a pre-release model card for the SchemaSage-SQL project. A 10-step QLoRA smoke adapter has been trained and uploaded to validate the cloud training and Hugging Face upload path. A release-quality adapter has not been trained yet.

Model Details

Project name: SchemaSage-SQL
Model status: cloud smoke adapter trained; release model not trained yet
Planned base model: configurable through configs/model.yaml
Current default base model: Qwen/Qwen3-4B-Instruct-2507
Planned method: supervised fine-tuning with LoRA or QLoRA
Release format: adapter-first, optional merged model later
Smoke adapter: rishhh/schemasage-sql-qwen3-4b-smoke

Intended Use

SchemaSage-SQL is intended for:

Research and prototyping around Text-to-SQL.
Portfolio demonstration of LLM data, training, evaluation, and deployment engineering.
Generating draft read-only analytical SQL from a schema and natural-language question.

Generated SQL should be reviewed before execution.

Out-of-Scope Use

SchemaSage-SQL is not intended for:

Autonomous execution against production databases.
Destructive database operations.
Legal, financial, medical, or compliance-critical decision making.
Credential extraction, filesystem access, or network exfiltration.
SQL generation without a provided schema.

Datasets

The current data pipeline uses two public Hugging Face datasets:

gretelai/synthetic_text_to_sql
b-mc2/sql-create-context

Current processed split summary:

Split	Examples
Train	151,723
Validation	16,858
Test	5,248

Processing removed:

Empty or incomplete examples.
Exact duplicates.
Destructive SQL examples by default.
Sample INSERT rows from schema context where source datasets included data values.
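The filtering steps above could be sketched roughly as follows. This is an illustration, not the project's actual pipeline: the field names (`question`, `schema`, `sql`), the keyword list, and both function names are assumptions made for the example.

```python
import re

# Hypothetical keyword list; the project's real destructive-SQL filter may differ.
DESTRUCTIVE = re.compile(
    r"^\s*(DROP|DELETE|TRUNCATE|ALTER|UPDATE|INSERT)\b", re.IGNORECASE
)
# Naive pattern: it would mis-handle semicolons inside quoted values.
INSERT_STMT = re.compile(r"INSERT\s+INTO\b[^;]*;?", re.IGNORECASE)


def strip_insert_rows(schema: str) -> str:
    """Remove INSERT statements from schema context, keeping the DDL."""
    return INSERT_STMT.sub("", schema).strip()


def clean_examples(examples):
    """Yield cleaned examples, skipping empty, destructive, and duplicate ones."""
    seen = set()
    for ex in examples:
        question = (ex.get("question") or "").strip()
        schema = strip_insert_rows(ex.get("schema") or "")
        sql = (ex.get("sql") or "").strip()
        if not (question and schema and sql):
            continue  # empty or incomplete example
        if DESTRUCTIVE.match(sql):
            continue  # destructive target SQL
        key = (question, schema, sql)
        if key in seen:
            continue  # exact duplicate
        seen.add(key)
        yield {"question": question, "schema": schema, "sql": sql}
```

A production version would more likely parse the schema than rely on a regular expression, since quoted values can contain semicolons.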

The validation split is a deterministic train holdout because the selected sources do not provide a native validation split.
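The card does not say how the holdout is chosen. The reported counts are consistent with roughly a 10% holdout (16,858 of 168,581 train-plus-validation examples). One common way to make such a split deterministic is to hash a stable example key, as in the sketch below. The function name, the key, and the 0.1 fraction are assumptions, and a seeded shuffle would be an equally valid approach.

```python
import hashlib


def assign_split(example_id: str, holdout_fraction: float = 0.1) -> str:
    """Map a stable example key to 'train' or 'validation' without RNG state."""
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64), scaled to [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "validation" if bucket < holdout_fraction else "train"
```

Because the assignment depends only on the key, re-running data preparation puts the same example in the same split regardless of ordering. A hash-based split hits the target fraction only approximately.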

Training Procedure

Training is implemented in src/training/train_sft.py but has not been run for a final adapter. A tiny local trainer smoke test was completed with sshleifer/tiny-gpt2; that artifact is not a release model and should not be used for model-quality claims.

A cloud QLoRA smoke run completed on Hugging Face Jobs:

Field	Value
Job ID	`6a0c94bd2dc5b1243da4ffee`
Hardware	`a10g-large`
Base model	`Qwen/Qwen3-4B-Instruct-2507`
Train rows	112
Eval rows	16
Steps	10
Final train loss	1.908
Final eval loss	1.188
Adapter repo	`rishhh/schemasage-sql-qwen3-4b-smoke`

Smoke adapter commit:

https://huggingface.co/rishhh/schemasage-sql-qwen3-4b-smoke/commit/6a8241d2f6cc3f0a917ba527f2f32b0cc4bf9933

This smoke adapter validates the training and upload path. It is not a final model-quality release.

Planned command:

python -m src.training.train_sft --config configs/train_qlora.yaml

Smoke/dry-run validation command:

python -m src.training.train_sft --config configs/train_qlora.yaml --smoke-test --dry-run

Apple Silicon local development should use:

python -m src.training.train_sft --config configs/train_local_smoke.yaml --smoke-test --dry-run

Trainer-path validation command that has been run locally:

python -m src.training.train_sft --config configs/train_local_smoke.yaml --smoke-test

Evaluation

The full local reference-answer evaluation is a sanity check for the evaluator and processed data, not trained-model performance.

Metric	Value
Exact match	1.0000
Normalized exact match	1.0000
SQL parse validity	0.9998
Schema adherence rate	0.8994
Hallucinated table rate	0.0494
Hallucinated column rate	0.0878
Unsafe query rate	0.0071
Execution accuracy	1.0000
Execution comparable examples	4,007

These metrics validate the evaluator and processed reference data. They should not be presented as model quality.
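To illustrate how the first two rows can differ, here is a minimal sketch of one possible normalization. The exact rules used by the project's evaluator are not stated in this card, so the function names and the choice of rules (lowercasing, whitespace collapsing, dropping a trailing semicolon) are assumptions.

```python
import re


def normalize_sql(sql: str) -> str:
    """Lowercase, collapse whitespace, and drop a trailing semicolon."""
    sql = sql.strip().rstrip(";").strip()
    sql = re.sub(r"\s+", " ", sql)
    # Note: lowercasing also changes string literals, a known weakness.
    return sql.lower()


def exact_match(pred: str, gold: str) -> bool:
    return pred.strip() == gold.strip()


def normalized_exact_match(pred: str, gold: str) -> bool:
    return normalize_sql(pred) == normalize_sql(gold)
```

When references are scored against themselves, both variants return 1.0 by construction, which is why those rows only confirm that the evaluator runs.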

The uploaded smoke adapter was also evaluated on 64 held-out examples from the gretelai/synthetic_text_to_sql test split. These metrics describe the 10-step smoke adapter only:

Metric	Value
Exact match	0.2188
Normalized exact match	0.2344
SQL parse validity	1.0000
Schema adherence rate	0.9688
Hallucinated table rate	0.0156
Hallucinated column rate	0.0312
Unsafe query rate	0.0000
Execution accuracy	0.8409
Execution comparable examples	44
Mean generated SQL length	11.80
Mean latency (seconds)	23.57
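Execution accuracy is reported over "comparable" examples only (44 of 64 here, so 0.8409 is consistent with 37 of 44 matching). A rough sketch of that idea with Python's built-in `sqlite3` module follows. The function names and the multiset comparison are assumptions, not the project's evaluator.

```python
import sqlite3
from collections import Counter


def run_query(schema_sql: str, query: str):
    """Run query on a fresh in-memory SQLite DB; return a row multiset or None."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_sql)
        return Counter(conn.execute(query).fetchall())
    except (sqlite3.Error, sqlite3.Warning):
        return None  # not runnable on SQLite
    finally:
        conn.close()


def execution_accuracy(examples):
    """examples: iterable of (schema_sql, gold_sql, predicted_sql) tuples."""
    comparable = correct = 0
    for schema_sql, gold, pred in examples:
        gold_rows = run_query(schema_sql, gold)
        if gold_rows is None:
            continue  # gold does not run on SQLite: skip as not comparable
        comparable += 1
        if run_query(schema_sql, pred) == gold_rows:
            correct += 1
    accuracy = correct / comparable if comparable else 0.0
    return accuracy, comparable
```

Comparing row multisets ignores ordering, which is lenient for queries with ORDER BY. With empty tables, many different queries return identical results, so the result depends heavily on what data the schema context loads.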

Evaluation artifacts:

evaluation/smoke_64/eval_results.json
evaluation/smoke_64/eval_report.md
evaluation/smoke_64/predictions.jsonl
evaluation/smoke_64/metrics_overview.svg
evaluation/smoke_64/risk_rates.svg

Safety Policy

The project defaults to read-only analytical SQL.

The safety layer blocks or refuses:

DROP
DELETE
TRUNCATE
ALTER
UPDATE
INSERT
MERGE
REPLACE
CREATE DATABASE
CREATE USER
GRANT
REVOKE
EXEC and EXECUTE
CALL
COPY, LOAD, and UNLOAD
Multiple SQL statements
Prompt-injection-like requests to ignore safety or schema rules

Safety checks are implemented in src/inference/safety.py using lexical checks plus sqlglot parsing.
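The lexical half of such a check might look like the sketch below. It is not the contents of src/inference/safety.py: the function name, the regular expressions, and the return shape are assumptions. The sqlglot parsing step and the prompt-injection check are deliberately omitted.

```python
import re

# Keywords from the list above; multi-word entries allow any whitespace between words.
BLOCKED = [
    "DROP", "DELETE", "TRUNCATE", "ALTER", "UPDATE", "INSERT", "MERGE", "REPLACE",
    r"CREATE\s+DATABASE", r"CREATE\s+USER", "GRANT", "REVOKE", "EXEC", "EXECUTE",
    "CALL", "COPY", "LOAD", "UNLOAD",
]
BLOCKED_RE = re.compile(r"\b(" + "|".join(BLOCKED) + r")\b", re.IGNORECASE)
STRING_LITERAL_RE = re.compile(r"'(?:[^']|'')*'")


def lexical_safety_check(sql: str):
    """Return (is_safe, reason). Purely lexical; no SQL parsing is attempted."""
    # Blank out single-quoted literals so keywords or semicolons inside strings are ignored.
    stripped = STRING_LITERAL_RE.sub("''", sql)
    body = stripped.strip().rstrip(";")
    if ";" in body:
        return False, "multiple statements"
    match = BLOCKED_RE.search(body)
    if match:
        return False, f"blocked keyword: {match.group(1).upper()}"
    return True, "ok"
```

A purely lexical check is conservative. For example, it would reject a harmless call to the REPLACE() string function, and it does not handle SQL comments. That is one reason to pair it with a parser that can tell statement types apart.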

Limitations

No release-quality adapter has been trained yet.
The uploaded Qwen3 4B smoke adapter exists only to validate the cloud trainer and Hub upload path.
A tiny local smoke adapter exists only to validate the local trainer path.
Smoke-adapter predictions have been evaluated on a small held-out subset (64 examples); no release model exists yet, so no release-model predictions have been evaluated.
The smoke adapter is undertrained; exact match is low despite good parse validity and schema adherence.
The model can continue generating prompt-like text after the canonical response. Inference currently relies on canonical-response parsing to cut this off; stricter stop criteria should be added.
QLoRA training is expected to require CUDA-capable GPU hardware.
Apple Silicon is suitable for development, tests, data prep, dry-runs, and the Gradio app, but not the default 4-bit QLoRA path.
SQL equivalence is difficult; exact match is not enough to judge model quality.
SQLite execution evaluation skips dialect-incompatible examples.
Schema adherence checks can still miss complex aliasing or dialect-specific constructs.
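For the prompt-continuation limitation above, one simple post-processing mitigation is to truncate the generated text at the first stop string. The sketch below is a hypothetical helper. The stop strings in the usage example are placeholders, since this card does not describe the project's prompt template.

```python
def truncate_at_stop(text: str, stop_strings) -> str:
    """Cut generated text at the earliest occurrence of any stop string."""
    cut = len(text)
    for stop in stop_strings:
        if not stop:
            continue  # an empty stop string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()


# Placeholder markers; replace with whatever delimits turns in the real prompt.
generated = "SELECT 1;\n### Question: next one"
print(truncate_at_stop(generated, ["\n### Schema", "\n### Question"]))
```

Truncating after generation does not reduce latency, because the extra tokens are still produced. Stopping during decoding would address both problems.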

Example Usage

python -m src.inference.generate_sql \
  --schema-file data/samples/schema.sql \
  --question "Which product categories had the highest revenue in Q4 2025?" \
  --dry-run

Model-backed inference requires a configured model or adapter path.

Hardware Notes

The repository was developed and validated locally on a MacBook Pro (M5 Pro) using Python 3.11. Full QLoRA training should be run on CUDA GPU hardware or in a managed GPU environment.

How to Improve

Run a longer QLoRA training job on the full normalized train split or a carefully filtered high-quality subset.
Add Spider-style held-out execution evaluation.
Add refusal and unanswerable-question training examples.
Add batched inference and stop strings to reduce latency and trailing prompt continuation.
Publish final metrics only after evaluating generated predictions, not references.

License

Project code is MIT licensed. Dataset and base-model licenses must be reviewed before publishing trained artifacts.
