
Contributing

Repo layout

src/guichetoi/      Library (importable as `guichetoi.…`)
  inference.py      LayoutLMv3 classifier + extractor pipeline
  recommendation.py Rule engine that issues a verdict on the completeness of a request (demande)
  cms.py            Pre-fill CMS IMMO 9 BANBOU xlsx from a Verdict
  api/main.py       FastAPI service wrapping all of the above

scripts/            Training pipeline + batch utilities (CLIs)
apps/               UI applications (Streamlit demo)
tools/              One-off dev / debug scripts
tests/              Pytest suite
assets/             Templates, logos, non-data static files
data/               label_mappings.json (other data dirs are gitignored)
docs/               Internal markdown docs
.github/            CI workflow + PR/issue templates

Local setup

python -m venv .venv
source .venv/bin/activate   # or: .venv\Scripts\activate on Windows
pip install -e ".[dev,ui]"
pip install -r requirements.txt   # exact pins (optional; for reproducibility)

External requirement: Tesseract OCR with the French language pack must be on PATH for inference to work.
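A quick way to confirm this requirement before running inference is to ask Tesseract which language packs it can see. The sketch below is illustrative and not part of the package: the helper names `parse_tesseract_langs` and `tesseract_has_french` are hypothetical, and it assumes the French pack is installed under its standard code `fra`.

```python
import shutil
import subprocess


def parse_tesseract_langs(output: str) -> set[str]:
    """Parse the text printed by `tesseract --list-langs` into a set of codes."""
    langs = set()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the "List of available languages ..." header.
        if not line or line.lower().startswith("list of"):
            continue
        langs.add(line)
    return langs


def tesseract_has_french() -> bool:
    """Return True if tesseract is on PATH and reports the 'fra' pack."""
    exe = shutil.which("tesseract")
    if exe is None:
        return False  # binary not found on PATH
    result = subprocess.run(
        [exe, "--list-langs"], capture_output=True, text=True, check=False
    )
    # Some builds print the list to stderr, so inspect both streams.
    combined = result.stdout + "\n" + result.stderr
    return "fra" in parse_tesseract_langs(combined)
```

Calling `tesseract_has_french()` from a Python shell inside the virtual environment returns `False` when either the binary or the French pack is missing.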

Branch strategy

GitHub Flow:

  • main is always deployable. Protected: requires PR + green CI to merge.
  • One topic branch per work unit. Naming:
    • feature/<short-slug> — new capability
    • fix/<short-slug> — bug fix
    • chore/<short-slug> — refactor, infra, deps
    • docs/<short-slug> — docs only
  • Squash-merge into main. Delete the branch after merge.
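The naming convention above can be checked mechanically, for example in a local hook. This is a minimal sketch, not an existing project tool: the four prefixes come from the list above, while the slug rule (lowercase letters and digits separated by single hyphens) is an assumption, since the document only says "short-slug".

```python
import re

# Prefixes come from the branch naming list; the slug pattern is an
# assumed convention, not a documented project rule.
BRANCH_RE = re.compile(r"(feature|fix|chore|docs)/[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_branch(name: str) -> bool:
    """Return True if `name` has the form <prefix>/<short-slug>."""
    return BRANCH_RE.fullmatch(name) is not None
```

For instance, `is_valid_branch("feature/my-thing")` is accepted, while `main` or `feat/my-thing` are rejected.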

Workflow

git checkout main && git pull
git checkout -b feature/my-thing
# … edits …
pytest -q                # local sanity check
ruff check src/ tests/   # lint
mypy --config-file mypy.ini src/guichetoi/cms.py src/guichetoi/recommendation.py
git push -u origin feature/my-thing
gh pr create             # or open via github.com

CI runs lint + tests on every PR. Both must be green to merge.
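To run the three local checks from the workflow above in one go, a small wrapper can help. The sketch below is hypothetical (`CHECKS` and `run_checks` are not part of the repository). It reuses the exact commands listed above and runs all of them even if one fails, so a single pass reports every failing check.

```python
import subprocess

# Commands copied from the workflow above.
CHECKS = [
    ["pytest", "-q"],
    ["ruff", "check", "src/", "tests/"],
    ["mypy", "--config-file", "mypy.ini",
     "src/guichetoi/cms.py", "src/guichetoi/recommendation.py"],
]


def run_checks(checks=CHECKS, runner=None) -> list[str]:
    """Run every command and return the tool names that exited non-zero.

    `runner` takes a command list and returns its exit code; it is
    injectable so the function can be exercised without the tools installed.
    """
    if runner is None:
        def runner(cmd):
            return subprocess.run(cmd, check=False).returncode
    failed = []
    for cmd in checks:
        if runner(list(cmd)) != 0:
            failed.append(cmd[0])
    return failed
```

An empty list from `run_checks()` means all three checks passed locally. CI remains the source of truth for merging.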

What never goes in git

  • Customer documents (DataSet1/, DataSet2/, DataRef/)
  • Real extracted PII (assets/sample_verdicts.json)
  • Model weights (models/)
  • Label Studio raw exports (project-*-at-*.json)

.gitignore enforces these. When in doubt, run git status before committing, and never use git add -f to override an ignore rule.
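As an extra safety net on top of .gitignore, staged paths can be screened against the list above before committing. This is a minimal sketch under stated assumptions: the helper names are hypothetical, the listed directories are assumed to sit at the repository root, and Label Studio exports are matched by file name in any directory.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Taken from the "What never goes in git" list; root-level placement
# of the directories is an assumption.
FORBIDDEN_DIRS = ("DataSet1", "DataSet2", "DataRef", "models")
FORBIDDEN_FILES = ("assets/sample_verdicts.json",)
FORBIDDEN_NAME_GLOBS = ("project-*-at-*.json",)


def is_forbidden(path: str) -> bool:
    """Return True if a repo-relative POSIX path must not be committed."""
    p = PurePosixPath(path)
    if p.parts and p.parts[0] in FORBIDDEN_DIRS:
        return True
    if str(p) in FORBIDDEN_FILES:
        return True
    # Case-sensitive match on the file name only.
    return any(fnmatchcase(p.name, g) for g in FORBIDDEN_NAME_GLOBS)


def forbidden_paths(paths):
    """Filter an iterable of paths down to the forbidden ones, in order."""
    return [p for p in paths if is_forbidden(p)]
```

The input could be the lines printed by `git diff --cached --name-only`. Any path returned by `forbidden_paths` should be unstaged before committing.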