SpadaLab

French legal datasets, curated for AI builders.

SpadaLab produces vector-ready French legal datasets for builders of legal AI assistants, RAG pipelines, and decision-support tools — with a focus on the realities of micro-entrepreneurs and regulated professions.


Why SpadaLab

French micro-entrepreneurs and small business owners face a wall of legal complexity (URSSAF, RGPD, accessibility, food safety, trade regulations…). Most of it is scattered across Légifrance and EU CELLAR in formats that don't fit modern AI pipelines.

We curate, clean, and chunk this content into production-ready datasets that any team can load into the vector store of their choice (Qdrant, Weaviate, pgvector, etc.), with no embedding lock-in.

What we ship

Four commercial packs (launching May 2026):

| Pack | Coverage | Format |
|---|---|---|
| Micro-Entrepreneur Complet | 7 collections: CGI, LPF, Code de commerce, Conso, Sécurité sociale, CNIL, RGPD | JSON / Parquet |
| Artisanat Réglementé | Code de l'artisanat + related LODA texts | JSON / Parquet |
| HACCP & Hygiène Alimentaire | EU Reg. 178/2002, 852/2004, 853/2004 + LODA | JSON / Parquet |
| Accessibilité PMR | LODA accessibility texts (ERP, buildings) | JSON / Parquet |

Format: pre-chunked text, metadata, and datasheets in the style of "Datasheets for Datasets" (Gebru et al.). No embeddings are included by default: you bring your own model, which avoids lock-in to a single provider such as OpenAI.
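As a rough illustration of the "bring your own model" workflow, the sketch below turns pre-chunked records into vector-store-ready points. The record layout (`id`, `text`, `metadata`), the sample content, and the `placeholder_embed` function are all assumptions made for this example, not SpadaLab's published schema. In practice you would swap in your own embedding model and your store's client.

```python
import hashlib
import json
from typing import Callable

# Hypothetical sample: the field names are assumed, not the real schema.
SAMPLE_JSON = """
[
  {"id": "chunk-0001", "text": "Premier extrait d'exemple.", "metadata": {"source": "example"}},
  {"id": "chunk-0002", "text": "Second extrait d'exemple.", "metadata": {"source": "example"}}
]
"""


def placeholder_embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in for a real embedding model: deterministic, not semantic."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def build_points(raw_json: str, embed: Callable[[str], list[float]]) -> list[dict]:
    """Pair each chunk with a vector and a flat payload for a vector store."""
    records = json.loads(raw_json)
    return [
        {
            "id": record["id"],
            "vector": embed(record["text"]),
            "payload": {"text": record["text"], **record["metadata"]},
        }
        for record in records
    ]


points = build_points(SAMPLE_JSON, placeholder_embed)
```

Because the embedding step is just a function argument, the same records can be re-embedded with a different model without touching the dataset files.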

Sample datasets (CC BY-NC 4.0) and gated full datasets (custom commercial license) will be published here.

How we work

  • Source of truth only: Légifrance PISTE production API and EU CELLAR (not scraped websites)
  • Reproducible pipelines: every dataset ships with a manifest, SHA-256 hashes, and ingestion scripts
  • Versioned: semantic versioning per dataset, with a transparent changelog
  • Local-first AI: built using on-premise models (Ollama) where appropriate
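The SHA-256 hashes mentioned above allow a download to be checked before ingestion. The following is a minimal sketch, assuming a hypothetical `manifest.json` with a `files` object mapping relative file names to hex digests. The actual manifest layout may differ.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path, block_size: int = 1 << 20) -> str:
    """Hash a file in blocks so large Parquet files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_manifest(manifest_path: Path) -> dict[str, bool]:
    """Return {file name: True if present and its hash matches the manifest}.

    Assumed (hypothetical) manifest layout:
    {"version": "1.0.0", "files": {"chunks.json": "<sha256 hex>"}}
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    base = manifest_path.parent
    return {
        name: (base / name).is_file() and sha256_of(base / name) == expected.lower()
        for name, expected in manifest["files"].items()
    }
```

A caller might run `verify_manifest(Path("pack/manifest.json"))` and refuse to ingest if any value in the result is `False`.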

Stay in touch

  • Email: contact@spadalab.fr
  • Website: coming soon
  • Datasets marketplace: Datarade listing (coming soon)

SpadaLab is a micro-enterprise registered in France (SIREN 103 696 993).
