# Contributing

## Repo layout

```
src/guichetoi/         Library (importable as `guichetoi.…`)
  inference.py         LayoutLMv3 classifier + extractor pipeline
  recommendation.py    Rule engine that issues a verdict on application (demande) completeness
  cms.py               Pre-fill the CMS IMMO 9 BANBOU xlsx from a Verdict
  api/main.py          FastAPI service wrapping all of the above
scripts/               Training pipeline + batch utilities (CLIs)
apps/                  UI applications (Streamlit demo)
tools/                 One-off dev / debug scripts
tests/                 Pytest suite
assets/                Templates, logos, non-data static files
data/                  label_mappings.json (other data dirs are gitignored)
docs/                  Internal markdown docs
.github/               CI workflow + PR/issue templates
```

## Local setup

```bash
python -m venv .venv
source .venv/bin/activate          # or: .venv\Scripts\activate on Windows
pip install -e ".[dev,ui]"
pip install -r requirements.txt    # exact pins (optional; for reproducibility)
```

External requirement: **Tesseract OCR with the French language pack** must be on `PATH` for inference to work.
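
The Tesseract requirement can be checked before running inference. The following is a minimal sketch, not part of the repo: the helper names `parse_langs` and `tesseract_has_french` are hypothetical, and it assumes the French pack is registered under Tesseract's standard `fra` code.

```python
import shutil
import subprocess


def parse_langs(output: str) -> set[str]:
    """Extract language codes from `tesseract --list-langs` output."""
    langs = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        langs.add(line)
    return langs


def tesseract_has_french() -> bool:
    """Return True if tesseract is on PATH and reports the `fra` pack."""
    exe = shutil.which("tesseract")
    if exe is None:
        return False
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Some Tesseract versions print the list to stderr, so read both streams.
    return "fra" in parse_langs(result.stdout + "\n" + result.stderr)
```

Calling `tesseract_has_french()` from a Python shell inside the virtualenv returns `False` when the binary or the language pack is missing.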

## Branch strategy

GitHub Flow:

- `main` is always deployable. Protected: requires PR + green CI to merge.
- One topic branch per work unit. Naming:
  - `feature/<short-slug>`: new capability
  - `fix/<short-slug>`: bug fix
  - `chore/<short-slug>`: refactor, infra, deps
  - `docs/<short-slug>`: docs only
- Squash-merge into `main`. Delete the branch after merge.
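
The naming convention above can be expressed as a small validator. This is an illustrative sketch only: the four prefixes come from the list, but the slug rule (lowercase letters and digits separated by single hyphens) and the name `is_valid_branch` are assumptions, since the guide does not define `<short-slug>` precisely.

```python
import re

# Prefixes come from the naming list; the slug pattern is an assumed
# reading of "<short-slug>": lowercase alphanumerics joined by hyphens.
BRANCH_RE = re.compile(r"(feature|fix|chore|docs)/[a-z0-9]+(-[a-z0-9]+)*")


def is_valid_branch(name: str) -> bool:
    """Return True if `name` follows the `<type>/<short-slug>` convention."""
    return BRANCH_RE.fullmatch(name) is not None
```

For example, `is_valid_branch("feature/my-thing")` is true, while `main` and `feat/my-thing` are rejected.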

## Workflow

```bash
git checkout main && git pull
git checkout -b feature/my-thing
# … edits …
pytest -q                  # local sanity check
ruff check src/ tests/     # lint
mypy --config-file mypy.ini src/guichetoi/cms.py src/guichetoi/recommendation.py
git push -u origin feature/my-thing
gh pr create               # or open via github.com
```

CI runs lint + tests on every PR. Both must be green to merge.
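
The three local checks from the workflow can be chained in one helper so that none is skipped before pushing. This is a hedged sketch, not a script shipped with the repo: `run_checks` is a hypothetical name, and the commands are copied from the block above rather than from the CI configuration, which may differ.

```python
import subprocess

# Commands copied from the local workflow; the CI job may run others.
CHECKS = [
    ["pytest", "-q"],
    ["ruff", "check", "src/", "tests/"],
    ["mypy", "--config-file", "mypy.ini",
     "src/guichetoi/cms.py", "src/guichetoi/recommendation.py"],
]


def run_checks(checks=CHECKS, run=subprocess.call) -> list[str]:
    """Run every check in order and return the names of those that failed."""
    failed = []
    for cmd in checks:
        # A non-zero exit code means the tool reported a problem.
        if run(cmd) != 0:
            failed.append(cmd[0])
    return failed
```

The `run` parameter is injectable so the helper can be exercised without the tools installed. All checks run even after a failure, so a single pass reports every problem.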

## What never goes in git

- Customer documents (`DataSet1/`, `DataSet2/`, `DataRef/`)
- Real extracted PII (`assets/sample_verdicts.json`)
- Model weights (`models/`)
- Label Studio raw exports (`project-*-at-*.json`)

`.gitignore` enforces these. When in doubt, check `git status` before committing, and never use `git add -f` to override an ignore rule.
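
As a second line of defence, the list above can be turned into a check on staged paths. The sketch below is an assumption-laden illustration, not an existing hook: the names `is_forbidden` and `forbidden_paths` are hypothetical, and it assumes the listed directories sit at the repo root.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Directories and patterns taken from the list above (assumed to be
# relative to the repository root).
FORBIDDEN_DIRS = {"DataSet1", "DataSet2", "DataRef", "models"}
FORBIDDEN_FILES = {"assets/sample_verdicts.json"}
FORBIDDEN_GLOBS = ["project-*-at-*.json"]


def is_forbidden(path: str) -> bool:
    """Return True if a repo-relative path should never be committed."""
    p = PurePosixPath(path)
    if p.parts and p.parts[0] in FORBIDDEN_DIRS:
        return True
    if str(p) in FORBIDDEN_FILES:
        return True
    # Label Studio exports are matched on the file name alone.
    return any(fnmatchcase(p.name, pattern) for pattern in FORBIDDEN_GLOBS)


def forbidden_paths(paths: list[str]) -> list[str]:
    """Filter a list of staged paths down to the forbidden ones."""
    return [p for p in paths if is_forbidden(p)]
```

The input list could come from `git diff --cached --name-only`, which prints one staged path per line.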