---
title: SQL Error Classifier Training
emoji: 🧠
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
license: mit
hardware: t4-small
---

# SQL Error Classifier: CodeBERT Training Space

Train `microsoft/codebert-base` as a **cross-encoder** for multi-label SQL error classification.
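
A cross-encoder reads its inputs together in one sequence instead of embedding them separately. The sketch below shows one way a dataset row could be turned into the two text segments of such an input. The function name `build_pair`, the field prefixes, and the choice to put the task context in the first segment and both queries in the second are all assumptions for illustration; the Space's actual input layout may differ.

```python
from typing import Mapping, Tuple


def build_pair(row: Mapping[str, str]) -> Tuple[str, str]:
    """Return (segment_a, segment_b) for a two-segment encoder input.

    Assumed layout (not confirmed by this README):
      segment_a = natural-language question plus schema
      segment_b = student query plus reference query
    """
    segment_a = f"Question: {row['question']}\nSchema: {row['schema']}"
    segment_b = (
        f"Student SQL: {row['student_sql']}\n"
        f"Correct SQL: {row['correct_sql']}"
    )
    return segment_a, segment_b


# Hypothetical example row using the canonical column names.
example = {
    "question": "How many orders did each customer place?",
    "schema": "orders(id, customer_id, total)",
    "student_sql": "SELECT customer_id, COUNT(*) FROM orders",
    "correct_sql": "SELECT customer_id, COUNT(*) FROM orders GROUP BY customer_id",
}
text_a, text_b = build_pair(example)
```

With the `transformers` library, a pair like this would typically be encoded by calling the tokenizer on both segments, for example `tokenizer(text_a, text_b, truncation=True, max_length=512)`, so that the model attends across both segments at once.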

## Setup

1. **Hardware:** Settings → Hardware → **GPU t4-small** (recommended)
2. **Secrets:** Settings → Secrets → add `HF_TOKEN` (a Hugging Face write token) so trained models can be pushed to your account
3. **Data:** Include `data/sql_errors_dev.parquet` in this Space repo, or upload a parquet file at runtime
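
Space secrets are exposed to the running container as environment variables, so the `HF_TOKEN` secret from step 2 can be read at runtime. The following is a minimal sketch of that lookup; the helper names `resolve_hub_token` and `can_push` are hypothetical and not taken from this Space's code.

```python
import os
from typing import Mapping, Optional


def resolve_hub_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the HF_TOKEN value, or None when it is unset or blank."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def can_push(push_to_hub: bool, token: Optional[str]) -> bool:
    """Pushing is only possible when requested and a token is available."""
    return push_to_hub and token is not None


token = resolve_hub_token()
if not can_push(True, token):
    print("HF_TOKEN is not set; training can run, but pushing would be skipped.")
```

Checking the token before training starts avoids finishing a long run only to fail at the upload step.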

## Usage

1. Choose the bundled dataset or upload your own parquet file
2. Set epochs, batch size, and max samples
3. Click **Start Training**
4. Optionally enable **Push to Hub** with a model id such as `your-username/sql-codebert-classifier`
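
The options in the steps above can be pictured as one small settings object that is validated before training begins. This is only an illustrative sketch: the class name `TrainSettings`, its field names, and the default values are assumptions, not the Space's actual configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainSettings:
    epochs: int = 3                     # hypothetical default
    batch_size: int = 16                # hypothetical default
    max_samples: Optional[int] = None   # None means "use every row"
    push_to_hub: bool = False
    hub_model_id: str = ""

    def validate(self) -> None:
        """Raise ValueError for values that cannot describe a training run."""
        if self.epochs < 1:
            raise ValueError("epochs must be at least 1")
        if self.batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        if self.max_samples is not None and self.max_samples < 1:
            raise ValueError("max_samples must be at least 1 when set")
        if self.push_to_hub:
            # Model ids take the form "<namespace>/<name>".
            namespace, sep, name = self.hub_model_id.partition("/")
            if not (sep and namespace and name) or "/" in name:
                raise ValueError("hub_model_id must look like 'user/model'")


settings = TrainSettings(
    epochs=2,
    batch_size=8,
    max_samples=1000,
    push_to_hub=True,
    hub_model_id="your-username/sql-codebert-classifier",
)
settings.validate()
```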

## Dataset columns

Required (aliases supported):

| Column | Aliases |
|--------|---------|
| `question` | (none) |
| `schema` | (none) |
| `student_sql` | `query` |
| `correct_sql` | `correct_query` |
| `error_labels` | `label_name` |
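
The alias table above amounts to a rename step applied to an uploaded file's columns. Here is a minimal sketch of that step, assuming the canonical name wins when both it and its alias are present; the function name `resolve_columns` is hypothetical.

```python
from typing import Dict, Iterable

REQUIRED = ("question", "schema", "student_sql", "correct_sql", "error_labels")

# alias -> canonical name, as listed in the table above
ALIASES = {
    "query": "student_sql",
    "correct_query": "correct_sql",
    "label_name": "error_labels",
}


def resolve_columns(columns: Iterable[str]) -> Dict[str, str]:
    """Map the columns found in a file to canonical names.

    Returns {found_name: canonical_name} and raises ValueError if any
    required column is missing under both its name and its alias.
    """
    present = set(columns)
    rename: Dict[str, str] = {}
    missing = []
    for canonical in REQUIRED:
        # Try the canonical name first, then any alias that points to it.
        candidates = [canonical] + [a for a, c in ALIASES.items() if c == canonical]
        found = next((name for name in candidates if name in present), None)
        if found is None:
            missing.append(canonical)
        else:
            rename[found] = canonical
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    return rename


rename_map = resolve_columns(
    ["question", "schema", "query", "correct_query", "label_name"]
)
```

If the parquet file is loaded with pandas, the resulting map could be applied with `df.rename(columns=rename_map)`.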

## Labels (9-class multi-label)

`JOIN_ERROR`, `AGGREGATION_ERROR`, `FILTER_ERROR`, `WINDOW_FUNCTION_ERROR`,
`SUBQUERY_ERROR`, `NULL_HANDLING_ERROR`, `PERFORMANCE_ERROR`, `LOGICAL_ERROR`, `SYNTAX_ERROR`
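
In a multi-label setup, each example carries a 9-element target vector with a 1 for every error present, and each output logit is passed through its own sigmoid rather than a shared softmax. The sketch below illustrates both directions. The index order simply follows the list above, and the 0.5 threshold is an assumed default; neither is confirmed as what the training code uses.

```python
import math
from typing import Iterable, List, Sequence

LABELS = [
    "JOIN_ERROR", "AGGREGATION_ERROR", "FILTER_ERROR", "WINDOW_FUNCTION_ERROR",
    "SUBQUERY_ERROR", "NULL_HANDLING_ERROR", "PERFORMANCE_ERROR",
    "LOGICAL_ERROR", "SYNTAX_ERROR",
]
LABEL_TO_INDEX = {name: i for i, name in enumerate(LABELS)}


def encode(names: Iterable[str]) -> List[float]:
    """Turn label names into a multi-hot target vector."""
    target = [0.0] * len(LABELS)
    for name in names:
        target[LABEL_TO_INDEX[name]] = 1.0  # KeyError on an unknown label
    return target


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def decode(logits: Sequence[float], threshold: float = 0.5) -> List[str]:
    """Return every label whose independent probability meets the threshold."""
    return [LABELS[i] for i, z in enumerate(logits) if sigmoid(z) >= threshold]


target = encode(["JOIN_ERROR", "SYNTAX_ERROR"])
predicted = decode([2.0, -3.0, 0.0, -1.0, -1.0, -1.0, -1.0, -1.0, 5.0])
```

Because each label is thresholded independently, a single query can be flagged with several error types or with none.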