SpadaLab
French legal datasets, curated for AI builders.
SpadaLab produces vector-ready French legal datasets for builders of legal AI assistants, RAG pipelines, and decision-support tools — with a focus on the realities of micro-entrepreneurs and regulated professions.
Why SpadaLab
French micro-entrepreneurs and small business owners face a wall of legal complexity (URSSAF, RGPD, accessibility, food safety, trade regulations…) — most of it scattered across Legifrance and EU CELLAR in formats that don't fit modern AI pipelines.
We curate, clean, and chunk this content into production-ready datasets that any team can drop into their vector store of choice (Qdrant, Weaviate, pgvector, etc.) — no embedding lock-in.
What we ship
Four commercial packs (launching May 2026):
| Pack | Coverage | Format |
|---|---|---|
| Micro-Entrepreneur Complet | 7 collections: CGI, LPF, Code de commerce, Code de la consommation, Code de la sécurité sociale, CNIL, RGPD | JSON / Parquet |
| Artisanat Réglementé | Code de l'artisanat + related LODA texts | JSON / Parquet |
| HACCP & Hygiène Alimentaire | EU Reg. 178/2002, 852/2004, 853/2004 + related LODA texts | JSON / Parquet |
| Accessibilité PMR | LODA accessibility texts (ERP, bâtiments) | JSON / Parquet |
Format: pre-chunked text + metadata + datasheets in the style of "Datasheets for Datasets" (Gebru et al.). No embeddings are included by default: you bring your own embedding model, which avoids lock-in to a single provider such as OpenAI.
Sample datasets (CC BY-NC 4.0) and gated full datasets (custom commercial license) will be published here.
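To illustrate the "bring your own model" workflow, here is a minimal sketch of turning pre-chunked records into vector-store points. Everything about the record layout is an assumption for illustration: the field names `id`, `text`, and `metadata`, the JSON Lines packaging, and the sample content are hypothetical, and the published schema may differ. `toy_embed` is a deterministic stand-in, not a real embedding model; swap in the model of your choice.

```python
import hashlib
import json
from typing import Callable, Iterable

# Hypothetical sample: two pre-chunked records in JSON Lines form.
# The real datasets may use different field names and packaging.
SAMPLE_JSONL = """\
{"id": "cgi-0001", "text": "Sample chunk one.", "metadata": {"collection": "CGI"}}
{"id": "rgpd-0001", "text": "Sample chunk two.", "metadata": {"collection": "RGPD"}}
"""


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for a real embedding model: deterministic but not semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def to_points(
    lines: Iterable[str], embed: Callable[[str], list[float]]
) -> list[dict]:
    """Parse chunk records and attach a vector computed by `embed`."""
    points = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        points.append(
            {
                "id": record["id"],
                "vector": embed(record["text"]),
                # Keep the text and metadata together as the stored payload.
                "payload": {"text": record["text"], **record["metadata"]},
            }
        )
    return points


points = to_points(SAMPLE_JSONL.splitlines(), toy_embed)
print(len(points), len(points[0]["vector"]))  # prints: 2 8
```

The resulting `id` / `vector` / `payload` dictionaries follow a shape most vector stores can ingest after a thin adapter; the exact upsert call depends on the store you use.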
How we work
- Source-of-truth only: Legifrance PISTE Production API + EU CELLAR (not scraped websites)
- Reproducible pipelines: every dataset ships with a manifest, SHA-256 hashes, and ingestion scripts
- Versioned: semantic versioning per dataset, with a transparent changelog
- Local-first AI: built using locally run models (Ollama) where appropriate
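As a rough illustration of how a manifest with SHA-256 hashes can be used on the consumer side, the sketch below checks downloaded files against their recorded digests. The manifest layout (a `manifest.json` with a `version` string and a `files` mapping of file name to hex digest) and the file name `chunks.json` are assumptions made for this example, not the actual SpadaLab format.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in blocks so large datasets need not fit in memory."""
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            hasher.update(block)
    return hasher.hexdigest()


def verify(dataset_dir: Path, manifest_name: str = "manifest.json") -> dict[str, bool]:
    """Return, per file listed in the manifest, whether its digest matches."""
    manifest_path = dataset_dir / manifest_name
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return {
        name: sha256_of(dataset_dir / name) == expected
        for name, expected in manifest["files"].items()
    }


# Demo on a throwaway directory: build a hypothetical manifest, then tamper.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    data_file = root / "chunks.json"
    data_file.write_bytes(b'[{"id": "demo-0001"}]')

    manifest = {
        "version": "1.0.0",  # semantic version of the dataset release
        "files": {"chunks.json": sha256_of(data_file)},
    }
    (root / "manifest.json").write_text(json.dumps(manifest), encoding="utf-8")

    ok_before = verify(root)
    data_file.write_bytes(b"tampered")  # simulate corruption
    ok_after = verify(root)

print(ok_before, ok_after)
# prints: {'chunks.json': True} {'chunks.json': False}
```

Running a check like this before ingestion makes it easy to confirm that the files match the release named in the changelog.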
Stay in touch
- Email: contact@spadalab.fr
- Website: coming soon
- Dataset marketplace: also available on Datarade (coming soon)
SpadaLab is a micro-enterprise based in France (SIREN 103 696 993).