---
title: SQL Error Classifier Training
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: mit
hardware: t4-small
---
# SQL Error Classifier — CodeBERT Training Space
Train `microsoft/codebert-base` as a **cross-encoder** for multi-label SQL error classification.
## Setup
1. **Hardware:** Settings → Hardware → **GPU t4-small** (recommended)
2. **Secrets:** Settings → Secrets → add `HF_TOKEN` (Hugging Face write token) to push models to your account
3. **Data:** Include `data/sql_errors_dev.parquet` in this Space repo, or upload a Parquet file at runtime
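
As a rough illustration of step 2, Space secrets are exposed to the running container as environment variables, so a training script could check for `HF_TOKEN` before attempting a push. The helper name `resolve_hub_token` and its `push_to_hub` flag are hypothetical and not taken from this Space's code; this is a minimal sketch.

```python
import os


def resolve_hub_token(push_to_hub, env=None):
    """Return the token to use for pushing, or None when pushing is disabled.

    Hypothetical helper: `env` defaults to os.environ and is injectable for testing.
    """
    env = os.environ if env is None else env
    if not push_to_hub:
        return None
    token = env.get("HF_TOKEN", "").strip()
    if not token:
        # Fail before training starts rather than after the final epoch.
        raise RuntimeError("Push to Hub is enabled but the HF_TOKEN secret is not set.")
    return token
```

Checking up front avoids finishing a full training run only to have the upload fail.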
## Usage
1. Choose the bundled dataset or upload your own Parquet file
2. Set the number of epochs, the batch size, and the maximum number of samples
3. Click **Start Training**
4. Optionally enable **Push to Hub** with model id `your-username/sql-codebert-classifier`
## Dataset columns
Required (aliases supported):
| Column | Aliases |
|--------|---------|
| `question` | — |
| `schema` | — |
| `student_sql` | `query` |
| `correct_sql` | `correct_query` |
| `error_labels` | `label_name` |
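
The alias handling in the table above could be sketched as follows. The function name `build_rename_map` is hypothetical, and the Space's own loader may resolve aliases differently (for example, in a different precedence order); only the column names and aliases come from the table.

```python
# Canonical column -> accepted aliases, copied from the table above.
COLUMN_ALIASES = {
    "question": [],
    "schema": [],
    "student_sql": ["query"],
    "correct_sql": ["correct_query"],
    "error_labels": ["label_name"],
}


def build_rename_map(columns):
    """Map alias names found in `columns` to their canonical names.

    Raises ValueError listing every required column that is missing.
    """
    present = set(columns)
    rename, missing = {}, []
    for canonical, aliases in COLUMN_ALIASES.items():
        if canonical in present:
            continue  # already uses the canonical name
        alias = next((a for a in aliases if a in present), None)
        if alias is None:
            missing.append(canonical)
        else:
            rename[alias] = canonical
    if missing:
        raise ValueError(f"Missing required columns: {', '.join(missing)}")
    return rename
```

With pandas, the result can be applied as `df = df.rename(columns=build_rename_map(df.columns))`.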
## Labels (9-class multi-label)
`JOIN_ERROR`, `AGGREGATION_ERROR`, `FILTER_ERROR`, `WINDOW_FUNCTION_ERROR`,
`SUBQUERY_ERROR`, `NULL_HANDLING_ERROR`, `PERFORMANCE_ERROR`, `LOGICAL_ERROR`, `SYNTAX_ERROR`
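
Because the task is multi-label, each example's targets are typically a multi-hot vector with one slot per label. The sketch below assumes the index order matches the listing order above and uses a hypothetical helper name, `encode_labels`; the Space's actual label order and encoding may differ.

```python
# Assumed index order: the order in which the labels are listed above.
LABELS = [
    "JOIN_ERROR", "AGGREGATION_ERROR", "FILTER_ERROR", "WINDOW_FUNCTION_ERROR",
    "SUBQUERY_ERROR", "NULL_HANDLING_ERROR", "PERFORMANCE_ERROR", "LOGICAL_ERROR",
    "SYNTAX_ERROR",
]
LABEL_TO_INDEX = {name: i for i, name in enumerate(LABELS)}


def encode_labels(error_labels):
    """Turn a list of label names into a 9-element multi-hot float vector."""
    vector = [0.0] * len(LABELS)
    for name in error_labels:
        if name not in LABEL_TO_INDEX:
            raise ValueError(f"Unknown label: {name!r}")
        vector[LABEL_TO_INDEX[name]] = 1.0
    return vector
```

A query with both a join and a syntax problem would then have `1.0` in the first and last positions and `0.0` elsewhere.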