# Contributing to BharatGraph
Read this before opening a pull request.
---
## Before You Start
Check the open issues before writing code. For any change that adds a new
file, modifies the graph schema, or adds a dependency, open an issue first
to discuss the approach. Small bug fixes can go straight to a pull request.
---
## Branch Strategy
One long-lived branch: `main`. All work happens on short-lived branches
that merge directly into `main`. There is no develop branch.
Naming:
```
feature/phase-N-short-name    New phase functionality
fix/issue-N-short-name        Bug fix referencing an issue number
docs/short-name               Documentation only
test/short-name               Tests only
```
Always branch from the latest `main`:
```bash
git checkout main
git pull origin main
git checkout -b feature/phase-5-risk-scoring
```
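The naming patterns above can be checked mechanically before pushing. The following is a minimal sketch, not a tool that exists in the repository: `is_valid_branch_name` is a hypothetical helper, and it assumes that `N` is a number and that `short-name` is lowercase and hyphen-separated.

```python
import re

# Hypothetical pattern, not part of the repository. Assumes N is numeric and
# short-name is lowercase kebab-case.
SHORT_NAME = r"[a-z0-9]+(?:-[a-z0-9]+)*"
BRANCH_PATTERN = re.compile(
    r"(?:"
    rf"feature/phase-\d+-{SHORT_NAME}"
    rf"|fix/issue-\d+-{SHORT_NAME}"
    rf"|docs/{SHORT_NAME}"
    rf"|test/{SHORT_NAME}"
    r")"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if the name matches one of the documented patterns."""
    return BRANCH_PATTERN.fullmatch(name) is not None
```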
---
## Commit Messages
Format:
```
type(scope): imperative description under 72 characters

Optional body. Wrap at 72 characters. Explain why, not what.

Closes #12
```
Types: `feat`, `fix`, `docs`, `test`, `chore`, `refactor`.
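A subject line can be validated against this format with a short regular expression. This is a sketch under two assumptions that the guide does not state outright: the scope is treated as optional (the merge-conflict example later in this guide uses `chore: ...` without one), and the 72-character limit is applied to the whole subject line. `is_valid_subject` is a hypothetical name.

```python
import re

ALLOWED_TYPES = ("feat", "fix", "docs", "test", "chore", "refactor")

# Assumption: the limit covers the entire subject line, type and scope included.
MAX_SUBJECT_LENGTH = 72

# Assumption: the "(scope)" part is optional.
SUBJECT_PATTERN = re.compile(
    r"(?:" + "|".join(ALLOWED_TYPES) + r")"
    r"(?:\([^()\s]+\))?"
    r": \S.*"
)


def is_valid_subject(subject: str) -> bool:
    """Return True if the subject line follows the documented format."""
    if len(subject) >= MAX_SUBJECT_LENGTH:
        return False
    return SUBJECT_PATTERN.fullmatch(subject) is not None
```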
---
## Python Code Standards
**Style**
Python 3.10 minimum. PEP 8. Maximum line length 88 characters.
**Comments**
No inline comments that describe what the code does. Write self-explanatory
code. Use comments only to explain non-obvious decisions or constraints.
Do not commit commented-out code blocks.
**Logging**
No `print` statements in any module except `__main__` blocks. Use `loguru`
for all diagnostic output.
**Type hints**
All function signatures have type hints. Return types are always annotated.
**Docstrings**
Every class and every public method has at least a one-line docstring.
Multi-line docstrings use the Google format.
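As an illustration of the type hint and docstring rules, here is a small function with a fully annotated signature and a Google-format docstring. The function itself is a made-up example and does not come from the codebase.

```python
def normalise_name(raw_name: str, max_length: int = 100) -> str:
    """Collapse whitespace in a name and truncate it.

    Args:
        raw_name: Name as it appears in the source record.
        max_length: Maximum number of characters to keep.

    Returns:
        The cleaned name, at most ``max_length`` characters long.

    Raises:
        ValueError: If ``max_length`` is not positive.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return " ".join(raw_name.split())[:max_length]
```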
**Environment variables**
All credentials and configuration are read via `os.getenv()`, with values loaded into the environment by `python-dotenv`.
No hardcoded values for any API key, password, or URL that changes between
environments.
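One way to apply this rule is a small accessor that fails loudly when a value is missing. The sketch below uses only the standard library and assumes that `load_dotenv()` from `python-dotenv` has already run at application start-up. `require_env` is a hypothetical helper name, not an existing project function.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise if it is unset."""
    value = os.getenv(name)
    # An empty string is treated the same as a missing variable.
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```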
**Encoding**
All file reads and writes specify `encoding="utf-8"`. All Python files are
ASCII-safe in their source text. No Unicode currency symbols or special
characters embedded in string literals.
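The sketch below shows one way to satisfy both encoding rules: every file handle is opened with an explicit `encoding="utf-8"`, and a currency symbol is written as a Unicode escape so the source file stays ASCII-only. The function names and the `RUPEE_SIGN` constant are illustrative assumptions.

```python
import json
from pathlib import Path

# U+20B9 (Indian rupee sign) written as an escape to keep the source ASCII-only.
RUPEE_SIGN = "\u20b9"


def write_records(path: Path, records: list[dict[str, str]]) -> None:
    """Write records to a JSON file as UTF-8 text."""
    with path.open("w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)


def read_records(path: Path) -> list[dict[str, str]]:
    """Read records from a UTF-8 encoded JSON file."""
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)
```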
**Imports**
Standard library imports first, then third-party, then local. One import
per line. No wildcard imports.
---
## Adding a New Scraper
Place the file in `scrapers/`. Inherit from `BaseScraper`. The class must
implement a method named `fetch_and_save` that saves to `data/samples/`
and returns the list of records. Add the scraper to `processing/pipeline.py`
under the appropriate method name. Document the source in the README data
sources table.
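The `fetch_and_save` contract described above could look roughly like the sketch below. The `BaseScraper` here is only a stand-in so that the example runs on its own; the real base class in `scrapers/` may expose a different interface. `ExampleRegistryScraper`, its `fetch` method, the output file name, and the sample record are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any


class BaseScraper:
    """Stand-in for the project's base class; the real interface may differ."""


class ExampleRegistryScraper(BaseScraper):
    """Hypothetical scraper illustrating the fetch_and_save contract."""

    def __init__(self, output_dir: Path = Path("data/samples")) -> None:
        """Store the directory that sample files are written to."""
        self.output_dir = output_dir

    def fetch(self) -> list[dict[str, Any]]:
        """Return records from the source; hard-coded here for illustration."""
        return [{"id": "demo-1", "name": "Example Entity"}]

    def fetch_and_save(self) -> list[dict[str, Any]]:
        """Fetch records, save them as JSON, and return them."""
        records = self.fetch()
        self.output_dir.mkdir(parents=True, exist_ok=True)
        target = self.output_dir / "example_registry.json"
        with target.open("w", encoding="utf-8") as handle:
            json.dump(records, handle, ensure_ascii=False, indent=2)
        return records
```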
---
## Adding a New Dependency
Add to `requirements.txt` with a minimum version constraint using `>=`.
Never pin to an exact version with `==` unless there is a specific
incompatibility reason documented in a comment. Verify the package is
available under a licence compatible with MIT.
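The pinning rule can be checked with a few lines of Python. This is a rough sketch, not an existing project script: `find_undocumented_pins` is a hypothetical helper, and it assumes the reason for a pin is recorded as a trailing comment on the same line of `requirements.txt`.

```python
def find_undocumented_pins(lines: list[str]) -> list[str]:
    """Return requirement lines pinned with == that carry no trailing comment."""
    flagged: list[str] = []
    for line in lines:
        requirement, _, comment = line.partition("#")
        # Environment markers after ";" may contain "==" without being a pin.
        specifier = requirement.split(";", 1)[0]
        if "==" in specifier and not comment.strip():
            flagged.append(line.strip())
    return flagged
```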
---
## Pull Request Process
1. Push your branch:
   ```bash
   git push origin feature/phase-5-risk-scoring
   ```
2. Open a pull request on GitHub targeting `main`.
3. Title format: `type(scope): description`
4. Description must include:
   - What the change does
   - How to test it locally
   - Issues it closes: `Closes #N`
5. Do not merge your own pull request without a review unless you are the
   sole contributor.
6. Resolve merge conflicts locally, never in the GitHub web editor:
   ```bash
   git checkout main && git pull origin main
   git checkout your-branch && git merge main
   # resolve conflicts in your editor
   git add . && git commit -m "chore: resolve merge conflicts"
   git push origin your-branch
   ```
---
## What Not to Contribute
- Scrapers targeting login-protected pages, private APIs, or sources not
listed in the approved data sources section of the README.
- Any code that stores personally identifiable information beyond what
appears in official public government filings.
- Language in any output that accuses a named individual of a crime without
a court judgment as the source.
- Dependencies under GPL, AGPL, or other copyleft licences.
- Paid API calls. Every external service used must have a usable free tier.
---
## Questions
Open a GitHub issue with the label `question`.