Contributing
Repo layout
src/guichetoi/         Library (importable as `guichetoi.…`)
  inference.py         LayoutLMv3 classifier + extractor pipeline
  recommendation.py    Rule engine that issues the completeness verdict for a demande (request)
  cms.py               Pre-fills the CMS IMMO 9 BANBOU xlsx from a Verdict
  api/main.py          FastAPI service wrapping all of the above
scripts/               Training pipeline + batch utilities (CLIs)
apps/                  UI applications (Streamlit demo)
tools/                 One-off dev / debug scripts
tests/                 Pytest suite
assets/                Templates, logos, non-data static files
data/                  label_mappings.json (other data dirs are gitignored)
docs/                  Internal markdown docs
.github/               CI workflow + PR/issue templates
Local setup
python -m venv .venv
source .venv/bin/activate # or: .venv\Scripts\activate on Windows
pip install -e ".[dev,ui]"
pip install -r requirements.txt # exact pins (optional; for reproducibility)
External requirement: Tesseract OCR with the French language pack
must be on PATH for inference to work.
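A quick preflight check can confirm this requirement before running inference. The sketch below is illustrative only: the helper names are hypothetical and not part of guichetoi, and it assumes the French pack is registered under Tesseract's language code fra and that `tesseract --list-langs` prints one language code per line.

```python
import shutil
import subprocess


def has_french(list_langs_output: str) -> bool:
    """Return True if 'fra' appears as its own line in `tesseract --list-langs` output."""
    langs = {line.strip() for line in list_langs_output.splitlines()}
    return "fra" in langs


def tesseract_ready() -> bool:
    """Hypothetical helper: is tesseract on PATH, and does it report the French pack?"""
    exe = shutil.which("tesseract")
    if exe is None:
        return False
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Inspect both streams, since the list may not always be written to stdout.
    return has_french(result.stdout + "\n" + result.stderr)
```

Calling `tesseract_ready()` right after activating the virtualenv gives a yes/no answer without loading any model.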
Branch strategy
GitHub Flow:
- main is always deployable. Protected: requires PR + green CI to merge.
- One topic branch per work unit. Naming:
  - feature/<short-slug>: new capability
  - fix/<short-slug>: bug fix
  - chore/<short-slug>: refactor, infra, deps
  - docs/<short-slug>: docs only
- Squash-merge into main. Delete the branch after merge.
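The naming scheme above can be checked mechanically, for example in a local hook. This is a minimal sketch: the four prefixes come from the list above, but the slug pattern (lowercase letters and digits separated by single hyphens) is an assumption, since the guide does not define what a short slug may contain. The function name is hypothetical.

```python
import re

# Prefixes are from the naming rules; the slug pattern is an assumed convention.
TOPIC_BRANCH_RE = re.compile(r"(feature|fix|chore|docs)/[a-z0-9]+(?:-[a-z0-9]+)*")


def is_topic_branch(name: str) -> bool:
    """Return True if `name` looks like <prefix>/<short-slug> with a known prefix."""
    return TOPIC_BRANCH_RE.fullmatch(name) is not None
```

For instance, `is_topic_branch("feature/my-thing")` is true, while `is_topic_branch("hotfix/urgent")` is false because hotfix is not one of the listed prefixes.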
Workflow
git checkout main && git pull
git checkout -b feature/my-thing
# … edits …
pytest -q # local sanity check
ruff check src/ tests/ # lint
mypy --config-file mypy.ini src/guichetoi/cms.py src/guichetoi/recommendation.py
git push -u origin feature/my-thing
gh pr create # or open via github.com
CI runs lint + tests on every PR. Both must be green to merge.
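The three local checks from the workflow can be chained so that the first failure stops the run, roughly mirroring what CI enforces. The sketch below copies the commands verbatim from the workflow above. The wrapper itself, including the names `CHECKS` and `first_failure`, is hypothetical and not a script that ships with the repo.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Commands copied from the workflow above, in the same order.
CHECKS: List[List[str]] = [
    ["pytest", "-q"],
    ["ruff", "check", "src/", "tests/"],
    ["mypy", "--config-file", "mypy.ini",
     "src/guichetoi/cms.py", "src/guichetoi/recommendation.py"],
]


def _run(cmd: Sequence[str]) -> int:
    """Run one command in the current directory and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


def first_failure(
    checks: Sequence[Sequence[str]] = CHECKS,
    run: Callable[[Sequence[str]], int] = _run,
) -> Optional[Sequence[str]]:
    """Run checks in order; return the first failing command, or None if all pass."""
    for cmd in checks:
        if run(cmd) != 0:
            return cmd
    return None
```

Run from the repo root with the virtualenv active, `first_failure()` returns None when everything passes. The `run` parameter exists only so the loop can be exercised without invoking the real tools.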
What never goes in git
- Customer documents (DataSet1/, DataSet2/, DataRef/)
- Real extracted PII (assets/sample_verdicts.json)
- Model weights (models/)
- Label Studio raw exports (project-*-at-*.json)
.gitignore enforces these — when in doubt, check git status before
committing and never use git add -f to override an ignore rule.
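As a second line of defence, a list of staged paths can be screened against the same categories. This is a sketch under assumptions: it treats the listed directories as living at the repo root and matches the Label Studio export pattern against the file name in any directory. The names `FORBIDDEN_DIRS`, `FORBIDDEN_GLOBS` and `forbidden_paths` are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List

# Taken from the list above; directories are assumed to sit at the repo root.
FORBIDDEN_DIRS = ("DataSet1/", "DataSet2/", "DataRef/", "models/")
FORBIDDEN_GLOBS = ("assets/sample_verdicts.json", "project-*-at-*.json")


def forbidden_paths(paths: Iterable[str]) -> List[str]:
    """Return the subset of `paths` that falls under a never-commit category."""
    flagged = []
    for path in paths:
        norm = path.replace("\\", "/")      # tolerate Windows separators
        name = norm.rsplit("/", 1)[-1]      # bare file name
        if norm.startswith(FORBIDDEN_DIRS):
            flagged.append(path)
        elif any(fnmatchcase(norm, g) or fnmatchcase(name, g) for g in FORBIDDEN_GLOBS):
            flagged.append(path)
    return flagged
```

The input could come from `git diff --cached --name-only`, which prints one staged path per line. An empty result means nothing staged matches the categories above.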