COLE Architecture
System Overview
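COLE runs as a single Docker container on HuggingFace Spaces with three services: an nginx reverse proxy on port 7860, a FastAPI backend on port 8000, and a Next.js frontend on port 8001. nginx forwards /api/* to FastAPI and all other paths to Next.js.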
graph LR
User([User]) -->|:7860| Nginx
subgraph Docker Container
Nginx -->|/api/*| FastAPI[FastAPI :8000]
Nginx -->|/*| NextJS[Next.js :8001]
FastAPI -->|reads| HF[(HuggingFace\ngraalul/COLE)]
FastAPI -->|writes| Results[(results/*.json)]
end
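The routing rule in the diagram can be pictured as a simple prefix match. The sketch below is illustrative only: the real rule lives in the nginx configuration, and the helper name `pick_upstream`, the loopback addresses, and the choice to match on the `/api/` prefix are assumptions rather than details taken from the repository.

```python
# Ports come from the diagram above; the addresses and helper are hypothetical.
FASTAPI_UPSTREAM = "http://127.0.0.1:8000"
NEXTJS_UPSTREAM = "http://127.0.0.1:8001"


def pick_upstream(path: str) -> str:
    """Return the upstream that would serve `path` under prefix routing."""
    # Assumption: only paths under "/api/" are sent to the backend.
    if path.startswith("/api/"):
        return FASTAPI_UPSTREAM
    # Everything else falls through to the frontend.
    return NEXTJS_UPSTREAM
```

Whether the `/api` prefix is stripped before the request reaches FastAPI depends on the proxy configuration, which is not shown here.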
Backend (FastAPI)
API Endpoints
graph TD
subgraph API
POST[POST /submit] -->|ZIP upload| Validate[Validate format]
Validate -->|OK| Evaluate[Evaluate predictions]
Evaluate -->|Save| JSON[results/uuid.json]
GET_LB[GET /leaderboard] -->|Read all| JSON
GET_H[GET /health] -->|200| OK[status: healthy]
end
| Endpoint | Method | Description |
|---|---|---|
| /submit | POST | Upload predictions ZIP, evaluate, save results |
| /leaderboard | GET | Return all submissions with metrics |
| /health | GET | Health check |
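As the diagram indicates, the leaderboard is built by reading every stored results file. A minimal sketch of that read path follows, using only the standard library. The function name `load_leaderboard` and the decision to skip corrupt files are assumptions, not the actual behaviour of `submission_api.py`.

```python
import json
from pathlib import Path


def load_leaderboard(results_dir: str) -> list[dict]:
    """Collect every readable *.json file in `results_dir` into one list."""
    entries = []
    for path in sorted(Path(results_dir).glob("*.json")):
        try:
            entries.append(json.loads(path.read_text(encoding="utf-8")))
        except (OSError, json.JSONDecodeError):
            # Assumption: a corrupt file is skipped instead of failing the request.
            continue
    return entries
```

Reading all files on each request is simple and fits a small number of submissions; a cache would only matter if the results directory grew large.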
Security
- Rate limiting: 5 submissions/minute per IP (slowapi)
- ZIP validation: Max 50MB compressed, 200MB decompressed
- Input validation: Email (max 320 chars, must contain @), display name (max 200 chars)
- CORS: Open origins (proxied through nginx)
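The ZIP and input checks listed above could look roughly like the sketch below. It is not the code in `validation_tools.py`: the function names are hypothetical, and treating 1MB as 1024 * 1024 bytes is an assumption. Rate limiting is omitted because it is handled by slowapi.

```python
import io
import zipfile

MAX_COMPRESSED_BYTES = 50 * 1024 * 1024     # 50MB limit (MB = 1024 * 1024 is an assumption)
MAX_DECOMPRESSED_BYTES = 200 * 1024 * 1024  # 200MB limit
MAX_EMAIL_CHARS = 320
MAX_DISPLAY_NAME_CHARS = 200


def check_zip(data: bytes) -> None:
    """Raise ValueError if the upload is not a ZIP or breaks a size limit."""
    if len(data) > MAX_COMPRESSED_BYTES:
        raise ValueError("archive exceeds compressed size limit")
    try:
        archive = zipfile.ZipFile(io.BytesIO(data))
    except zipfile.BadZipFile as exc:
        raise ValueError("not a valid ZIP archive") from exc
    with archive:
        # Sum the sizes declared in the ZIP headers before extracting anything.
        declared = sum(info.file_size for info in archive.infolist())
    if declared > MAX_DECOMPRESSED_BYTES:
        raise ValueError("archive exceeds decompressed size limit")


def check_contact(email: str, display_name: str) -> None:
    """Raise ValueError if the email or display name breaks the stated rules."""
    if len(email) > MAX_EMAIL_CHARS or "@" not in email:
        raise ValueError("invalid email")
    if len(display_name) > MAX_DISPLAY_NAME_CHARS:
        raise ValueError("display name too long")
```

Declared sizes in ZIP headers can be forged, so a stricter variant would also count bytes while reading each member and stop once the limit is reached.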
Key Modules
graph TD
API[submission_api.py] --> VT[validation_tools.py]
API --> ST[submit_tools.py]
API --> EV[evaluation.py]
VT --> TN[task_names.py]
EV --> TF[task_factory.py]
TF --> T[task.py]
T --> MF[metric_factory.py]
T --> DS[dataset.py]
MF --> MW[metrics_wrapper.py]
MF --> FQ[fquad_metric.py]
DS --> HF[(HuggingFace)]
Frontend (Next.js)
Pages
graph LR
subgraph Pages
Home[/ Home]
Guide[/guide]
FAQ[/FAQ]
Contact[/contact]
Papers[/papers]
Benchmarks[/benchmarks]
Leaderboard[/leaderboard]
Results[/results/id]
end
subgraph Features
i18n[EN/FR i18n]
Responsive[Mobile responsive]
Pagination[Leaderboard pagination]
Submit[ZIP submission modal]
end
| Page | Description |
|---|---|
| / | What is COLE, links to paper and GLUE/SuperGLUE |
| /guide | How to train, test, and format submissions |
| /FAQ | 6 questions with code formatting support |
| /benchmarks | 23 tasks organized by 9 NLU categories |
| /leaderboard | Sortable table, 25/page, loading skeleton, error states |
| /papers | Embedded arxiv PDF viewer |
| /results/[id] | Per-submission detailed results |
| /contact | Email contact |
i18n
Full English and French translations live in frontend/src/app/en/translation.json and fr/translation.json. The language switcher in the header persists the selection to localStorage.
Evaluation Pipeline
Task Flow
graph TD
Submit[User submits ZIP] --> Unzip[Extract predictions.json]
Unzip --> Validate[Validate task names & format]
Validate --> Factory[task_factory creates Task objects]
Factory --> Compute[Task.compute per task]
Compute --> Dataset[Load ground truths from HF]
Compute --> Metric[metric_factory selects metric]
Metric --> Score[Compute score]
Score --> Save[Save results JSON]
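The flow above can be sketched end to end in a few lines. This is a simplified stand-in for `evaluation.py`, `task_factory.py`, and `task.py`, not a copy of them. The layout of `predictions.json` (a mapping from task name to a list of predictions at the archive root), the in-memory `ground_truths` dict standing in for the HuggingFace dataset, and the `run_submission` name are all assumptions.

```python
import io
import json
import uuid
import zipfile
from pathlib import Path
from typing import Callable

Metric = Callable[[list, list], float]


def accuracy(predictions: list, references: list) -> float:
    """Fraction of positions where prediction equals reference."""
    return sum(p == r for p, r in zip(predictions, references)) / len(references)


def run_submission(
    zip_bytes: bytes,
    ground_truths: dict[str, list],
    metrics: dict[str, Metric],
    results_dir: str,
) -> Path:
    # 1. Extract predictions.json (assumed to sit at the archive root).
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        predictions = json.loads(archive.read("predictions.json"))

    # 2. Validate task names against the known tasks.
    unknown = set(predictions) - set(ground_truths)
    if unknown:
        raise ValueError(f"unknown tasks: {sorted(unknown)}")

    # 3. Score each task with the metric selected for it.
    scores = {}
    for task, preds in predictions.items():
        refs = ground_truths[task]
        if not refs or len(preds) != len(refs):
            raise ValueError(f"{task}: expected {len(refs)} predictions")
        scores[task] = metrics[task](preds, refs)

    # 4. Save the results under a fresh UUID.
    out = Path(results_dir) / f"{uuid.uuid4()}.json"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps({"scores": scores}), encoding="utf-8")
    return out
```

Passing the metric table in as an argument mirrors the role of `metric_factory.py`: the pipeline itself stays independent of which metric a given task uses.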
Tasks (30 total)
Grouped by capability:
| Category | Tasks |
|---|---|
| Sentiment | allocine, mms |
| NLI | fracas, gqnli, lingnli, mnli-nineeleven-fr-mt, rte3-french, sickfr, xnli, daccord |
| QA | fquad, french_boolq, piaf |
| Paraphrase | paws_x, qfrblimp |
| Grammar | multiblimp, qfrcola |
| Similarity | sts22 |
| WSD | wsd |
| Quebec French | qfrcore, qfrcort |
| Coreference | wino_x_lm, wino_x_mt |
| Other | frcoe, timeline, lqle, qccp, qccy, qccr, piqafr, piqaqfr |
Metrics
| Metric | Implementation | Used by |
|---|---|---|
| Accuracy | HuggingFace evaluate | Most classification tasks |
| Pearson | HuggingFace evaluate | sickfr, sts22 |
| FQuAD | Custom (F1 + Exact Match) | fquad, piaf |
| ExactMatch | Custom string comparison | wsd |
| F1 | HuggingFace evaluate | Classification variants |
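To make the custom metrics concrete, here is a standard-library sketch of exact match, token-level F1 in the SQuAD style, and the Pearson correlation. It is an illustration of the formulas, not the code in `fquad_metric.py` or `metrics_wrapper.py`. The normalization used here (lowercasing and whitespace tokenization) is an assumption; the real implementation may also strip punctuation or articles.

```python
import math
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the strings match after trimming and lowercasing, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall between two answers."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Two empty answers agree; one empty answer scores zero.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

For example, `token_f1("la tour eiffel", "tour eiffel")` has precision 2/3 and recall 1, giving an F1 of 0.8.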
CI/CD Pipeline
graph TD
Push[git push to main] --> F[Formatting\nblack --check]
Push --> L[Linting\npylint src/ tests/]
Push --> T[Tests\npytest]
Push --> FB[Frontend Build\nnpm ci + lint + build]
Push --> HF[HF Sync\nDeploy to Space]
F -->|Python 3.12| Pass
L -->|Python 3.10-3.12| Pass
T -->|Python 3.12\nHF_TOKEN required| Pass
FB -->|Node 20| Pass
HF -->|Orphan branch\nLFS for .jsonl/.pdf| Space[davebulaval/cole]
Deployment
The HF Space deployment uses an orphan-branch strategy so that large .jsonl files in the git history of main are not pushed to the Space:
- Checkout main with LFS
- Create fresh orphan branch
- Track .jsonl and .pdf with Git LFS
- Remove CI/test files not needed in production
- Force push to davebulaval/cole Space
The Space builds the Docker image and runs the container with nginx on port 7860.
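The deployment steps above map onto a short sequence of git commands. The sketch below only builds that sequence as a dry run and does not execute anything. It is not the actual CI job: the function name, the `hf-deploy` branch name, the commit message, and the `prune_paths` argument are hypothetical, and the exact files removed in production are not specified here.

```python
def orphan_deploy_commands(
    space_remote: str,
    branch: str = "hf-deploy",          # hypothetical branch name
    prune_paths: tuple[str, ...] = (),  # CI/test paths to drop; caller supplies them
) -> list[list[str]]:
    """Return the git commands for an orphan-branch deploy, in order."""
    commands = [
        ["git", "checkout", "main"],
        ["git", "lfs", "pull"],                        # checkout main with LFS content
        ["git", "checkout", "--orphan", branch],       # fresh branch with no history
        ["git", "lfs", "track", "*.jsonl", "*.pdf"],   # route large files through LFS
    ]
    if prune_paths:
        # Remove files from both the index and the working tree.
        commands.append(
            ["git", "rm", "-r", "-f", "--quiet", "--ignore-unmatch", "--", *prune_paths]
        )
    commands += [
        ["git", "add", "-A"],
        ["git", "commit", "-m", "Deploy to Space"],
        # Overwrite the Space's main branch with the single orphan commit.
        ["git", "push", "--force", space_remote, f"{branch}:main"],
    ]
    return commands
```

Each entry can be passed to `subprocess.run(cmd, check=True)` once the remote URL and the paths to prune are known. Because the orphan branch carries a single commit, the push never transfers the older history that contained the large files.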