Contributing to BharatGraph
Read this before opening a pull request.
Before You Start
Check the open issues before writing code. For any change that adds a new file, modifies the graph schema, or adds a dependency, open an issue first to discuss the approach. Small bug fixes can go straight to a pull request.
Branch Strategy
One long-lived branch: main. All work happens on short-lived branches
that merge directly into main. There is no develop branch.
Naming:
feature/phase-N-short-name    New phase functionality
fix/issue-N-short-name        Bug fix referencing an issue number
docs/short-name               Documentation only
test/short-name               Tests only
Always branch from the latest main:
git checkout main
git pull origin main
git checkout -b feature/phase-5-risk-scoring
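The naming rules above can be checked mechanically. The following is a minimal sketch, not part of the project: the function name is hypothetical, and it assumes that short names are lowercase and hyphen-separated, which the guide does not state explicitly.

```python
import re

# Assumption: short names are lowercase letters and digits joined by hyphens.
BRANCH_PATTERN = re.compile(
    r"^(feature/phase-\d+-[a-z0-9]+(?:-[a-z0-9]+)*"
    r"|fix/issue-\d+-[a-z0-9]+(?:-[a-z0-9]+)*"
    r"|(?:docs|test)/[a-z0-9]+(?:-[a-z0-9]+)*)$"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if the branch name follows one of the four naming schemes."""
    return BRANCH_PATTERN.fullmatch(name) is not None
```

For example, `is_valid_branch_name("feature/phase-5-risk-scoring")` is true, while `is_valid_branch_name("feature/risk-scoring")` is false because the phase number is missing.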
Commit Messages
Format:
type(scope): imperative description under 72 characters

Optional body. Wrap at 72 characters. Explain why, not what.

Closes #12
Types: feat, fix, docs, test, chore, refactor.
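A subject line can be checked against this format with a short script. This is a rough sketch under two assumptions: the 72-character limit applies to the whole subject line, and the scope is optional (the merge-conflict example later in this guide uses "chore: ..." without one). The function name is hypothetical.

```python
import re

COMMIT_TYPES = ("feat", "fix", "docs", "test", "chore", "refactor")

# Assumption: "(scope)" may be omitted, as in "chore: resolve merge conflicts".
SUBJECT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()\s]+)\))?: (?P<description>\S.*)$"
)


def check_subject(subject: str) -> list[str]:
    """Return a list of problems found in a commit subject line."""
    problems: list[str] = []
    if len(subject) >= 72:
        problems.append("subject must be under 72 characters")
    match = SUBJECT_PATTERN.match(subject)
    if match is None:
        problems.append("subject must look like type(scope): description")
    elif match.group("type") not in COMMIT_TYPES:
        problems.append(f"unknown type: {match.group('type')}")
    return problems
```

An empty list means the subject passed. The check cannot tell whether the description is in the imperative mood, so that part still needs a human reviewer.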
Python Code Standards
Style
Python 3.10 minimum. PEP 8. Maximum line length 88 characters.
Comments
No inline comments that describe what the code does. Write self-explanatory code. Use comments only to explain non-obvious decisions or constraints. Do not commit commented-out code blocks.
Logging
No print statements in any module except __main__ blocks. Use loguru
for all diagnostic output.
Type hints
All function signatures have type hints. Return types are always annotated.
Docstrings
Every class and every public method has a one-line docstring. Multi-line docstrings use the Google format.
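As an illustration of the type-hint, docstring, and comment rules together, here is a hypothetical helper. The function and its behaviour are invented for this example and do not come from the codebase. It shows a fully annotated signature, a Google-format docstring, and a single comment that explains a decision, not what the code does.

```python
def normalise_amount(raw: str, unit: str = "crore") -> float:
    """Convert a textual amount in crore or lakh to a number of rupees.

    Args:
        raw: Amount as written in the source, for example "1,250.5".
        unit: Either "crore" or "lakh".

    Returns:
        The amount in rupees.

    Raises:
        ValueError: If the unit is not recognised.
    """
    multipliers = {"crore": 10_000_000, "lakh": 100_000}
    if unit not in multipliers:
        raise ValueError(f"unknown unit: {unit}")
    # Commas are stripped, not parsed, so 1,25,000 and 125,000 are both accepted.
    return float(raw.replace(",", "")) * multipliers[unit]
```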
Environment variables
All credentials and configuration via os.getenv() loaded by python-dotenv.
No hardcoded values for any API key, password, or URL that changes between
environments.
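One way to follow this rule is a small accessor that fails loudly when a variable is missing. This is a sketch only: the helper name is hypothetical, and it assumes that python-dotenv's load_dotenv() has already been called once at start-up so that values from the .env file are present in the process environment.

```python
import os


def require_env(name: str) -> str:
    """Return the value of a required environment variable."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"missing required environment variable: {name}")
    return value
```

A module would then call, for instance, `require_env("GRAPH_DB_URI")`, where the variable name is made up for illustration, so that no URL or credential appears in source.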
Encoding
All file reads and writes specify encoding="utf-8". All Python files are
ASCII-safe in their source text. No Unicode currency symbols or special
characters embedded in string literals.
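A minimal sketch of both encoding rules, using hypothetical helper names: every open() call names its encoding, and a non-ASCII character is written as an escape sequence so the source file itself stays ASCII.

```python
import json
from pathlib import Path

# Escape sequence for the rupee sign keeps this source file ASCII-only.
RUPEE = "\u20b9"


def write_records(path: Path, records: list[dict[str, str]]) -> None:
    """Write records to a JSON file using UTF-8."""
    with path.open("w", encoding="utf-8") as handle:
        json.dump(records, handle, ensure_ascii=False, indent=2)


def read_records(path: Path) -> list[dict[str, str]]:
    """Read records from a UTF-8 JSON file."""
    with path.open("r", encoding="utf-8") as handle:
        return json.load(handle)
```

Passing encoding="utf-8" explicitly matters because the default encoding of open() depends on the platform and locale.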
Imports
Standard library imports first, then third-party, then local. One import per line. No wildcard imports.
Adding a New Scraper
Place the file in scrapers/. Inherit from BaseScraper. The class must
implement a method named fetch_and_save that saves to data/samples/
and returns the list of records. Add the scraper to processing/pipeline.py
under the appropriate method name. Document the source in the README data
sources table.
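The shape of a scraper might look like the sketch below. The BaseScraper defined here is only a stand-in so the example runs on its own; the real base class in scrapers/ may require other methods or constructor arguments. ExampleScraper, its fetch helper, the output file name, and the record fields are all hypothetical.

```python
import json
from pathlib import Path
from typing import Any


class BaseScraper:
    """Stand-in for the project's BaseScraper; the real interface may differ."""


class ExampleScraper(BaseScraper):
    """Hypothetical scraper that saves a fixed set of records."""

    def __init__(self, output_dir: Path = Path("data/samples")) -> None:
        self.output_dir = output_dir

    def fetch(self) -> list[dict[str, Any]]:
        """Return records from the source; hard-coded here for illustration."""
        return [{"id": "1", "name": "Example Entity"}]

    def fetch_and_save(self) -> list[dict[str, Any]]:
        """Fetch records, save them as JSON, and return them."""
        records = self.fetch()
        self.output_dir.mkdir(parents=True, exist_ok=True)
        target = self.output_dir / "example_scraper.json"
        with target.open("w", encoding="utf-8") as handle:
            json.dump(records, handle, ensure_ascii=False, indent=2)
        return records
```

Taking the output directory as a constructor argument is a design choice of this sketch: it keeps data/samples/ as the default while letting tests write to a temporary directory.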
Adding a New Dependency
Add to requirements.txt with a minimum version constraint using >=.
Never pin to an exact version with == unless there is a specific
incompatibility reason documented in a comment. Verify the package is
available under a licence compatible with MIT.
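The pinning rule can be spot-checked with a few lines of Python. This is a rough sketch with a hypothetical function name: it treats a comment on the same line or on the line directly above as the documented reason, and it would also flag an environment marker that contains ==, so its output needs a manual look.

```python
def find_unexplained_pins(lines: list[str]) -> list[str]:
    """Return requirement lines pinned with == that have no nearby comment."""
    flagged: list[str] = []
    previous = ""
    for raw in lines:
        line = raw.strip()
        if "==" in line and not line.startswith("#"):
            has_reason = "#" in line or previous.startswith("#")
            if not has_reason:
                flagged.append(line)
        previous = line
    return flagged
```

Given the lines `loguru>=0.7`, `# pinned: 2.x breaks the parser`, `foo==1.4.2`, `bar==2.0` (foo and bar are placeholder package names), only `bar==2.0` is returned.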
Pull Request Process
Push your branch:
git push origin feature/phase-5-risk-scoring
Open a pull request on GitHub targeting main.
Title format:
type(scope): description
Description must include:
- What the change does
- How to test it locally
- Issues it closes: Closes #N
Do not merge your own pull request without a review unless you are the sole contributor.
Resolve merge conflicts locally, never in the GitHub web editor:
git checkout main && git pull origin main
git checkout your-branch && git merge main
# resolve conflicts in your editor
git add . && git commit -m "chore: resolve merge conflicts"
git push origin your-branch
What Not to Contribute
- Scrapers targeting login-protected pages, private APIs, or sources not listed in the approved data sources section of the README.
- Any code that stores personally identifiable information beyond what appears in official public government filings.
- Language in any output that accuses a named individual of a crime without a court judgment as the source.
- Dependencies under GPL, AGPL, or other copyleft licences.
- Paid API calls. Every external service used must have a usable free tier.
Questions
Open a GitHub issue with the label question.