BioRLHF / CONTRIBUTING.md

Phase 4: V1-aware calibration verifier, eval tools, cleanup

2145d80 about 1 month ago

5.39 kB

	# Contributing to BioRLHF

	Thank you for your interest in contributing to BioRLHF! This document provides guidelines and instructions for contributing.

	## Table of Contents

	- [Code of Conduct](#code-of-conduct)
	- [Getting Started](#getting-started)
	- [Development Setup](#development-setup)
	- [Making Changes](#making-changes)
	- [Testing](#testing)
	- [Submitting Changes](#submitting-changes)
	- [Style Guidelines](#style-guidelines)

	## Code of Conduct

	Please be respectful and constructive in all interactions. We welcome contributors of all backgrounds and experience levels.

	## Getting Started

	1. Fork the repository on GitHub
	2. Clone your fork locally:
	```bash
	git clone https://github.com/YOUR_USERNAME/BioRLHF.git
	cd BioRLHF
	```
	3. Add upstream remote:
	```bash
	git remote add upstream https://github.com/jang1563/BioRLHF.git
	```

	## Development Setup

	### Prerequisites

	- Python 3.9 or higher
	- CUDA-compatible GPU (recommended for training)
	- Git

	### Installation

	1. Create a virtual environment:
	```bash
	python -m venv venv
	source venv/bin/activate # On Windows: venv\Scripts\activate
	```

	2. Install the package in development mode with all dependencies:
	```bash
	pip install -e ".[dev]"
	```

	3. Install pre-commit hooks:
	```bash
	pre-commit install
	```

	### Verify Installation

	```bash
	# Run tests
	pytest

	# Check code formatting
	black --check src/ tests/
	ruff check src/ tests/
	```

	## Making Changes

	### Branch Naming

	Create a descriptive branch for your changes:

	- `feature/description` - New features
	- `fix/description` - Bug fixes
	- `docs/description` - Documentation updates
	- `refactor/description` - Code refactoring

	Example:
	```bash
	git checkout -b feature/add-new-evaluation-metric
	```

	### Commit Messages

	Write clear, concise commit messages:

	- Use the present tense ("Add feature" not "Added feature")
	- Use the imperative mood ("Move cursor to..." not "Moves cursor to...")
	- Limit the first line to 72 characters
	- Reference issues when applicable

	Example:
	```
	Add calibration accuracy metric to evaluation module

	- Implement uncertainty detection in model responses
	- Add tests for calibration scoring
	- Update documentation with new metric

	Closes #42
	```

	## Testing

	### Running Tests

	```bash
	# Run all tests
	pytest

	# Run with coverage
	pytest --cov=biorlhf --cov-report=html

	# Run specific test file
	pytest tests/test_dataset.py

	# Run tests matching a pattern
	pytest -k "test_evaluation"
	```

	### Writing Tests

	- Place tests in the `tests/` directory
	- Mirror the source structure (e.g., `src/biorlhf/data/dataset.py` → `tests/test_dataset.py`)
	- Use descriptive test names
	- Include docstrings explaining what the test verifies

	Example:
	```python
	def test_load_dataset_returns_expected_format():
	"""Verify that load_dataset returns a HuggingFace Dataset object."""
	dataset = load_dataset("kmp_sft_final.json")
	assert isinstance(dataset, Dataset)
	assert "text" in dataset.column_names
	```

	## Submitting Changes

	### Before Submitting

	1. Sync with upstream:
	```bash
	git fetch upstream
	git rebase upstream/main
	```

	2. Run all checks:
	```bash
	# Format code
	black src/ tests/

	# Check linting
	ruff check src/ tests/

	# Run tests
	pytest
	```

	3. Update documentation if needed

	### Pull Request Process

	1. Push your branch to your fork:
	```bash
	git push origin feature/your-feature
	```

	2. Open a Pull Request on GitHub

	3. Fill in the PR template with:
	- Description of changes
	- Related issue numbers
	- Testing performed
	- Screenshots (if UI changes)

	4. Wait for review and address feedback

	### Review Checklist

	- [ ] Code follows style guidelines
	- [ ] Tests pass locally
	- [ ] New code has appropriate test coverage
	- [ ] Documentation is updated
	- [ ] Commit messages are clear

	## Style Guidelines

	### Python Code Style

	We use [Black](https://black.readthedocs.io/) for code formatting and [Ruff](https://docs.astral.sh/ruff/) for linting.

	Key conventions:
	- Line length: 88 characters (Black default)
	- Use type hints where practical
	- Write docstrings for public functions and classes
	- Use meaningful variable names

	### Docstring Format

	Use Google-style docstrings:

	```python
	def evaluate_model(model_path: str, test_data: str) -> dict:
	"""Evaluate a trained model on test data.

	Args:
	model_path: Path to the trained model directory.
	test_data: Path to the test dataset JSON file.

	Returns:
	Dictionary containing evaluation metrics including
	factual_accuracy, reasoning_accuracy, and calibration_score.

	Raises:
	FileNotFoundError: If model_path or test_data doesn't exist.

	Example:
	>>> results = evaluate_model("./model", "test.json")
	>>> print(results["factual_accuracy"])
	0.90
	"""
	```

	### Import Order

	Organize imports in this order:
	1. Standard library
	2. Third-party packages
	3. Local imports

	Example:
	```python
	import json
	from pathlib import Path

	import torch
	from transformers import AutoModelForCausalLM

	from biorlhf.data import load_dataset
	from biorlhf.utils import setup_quantization
	```

	## Questions?

	If you have questions about contributing, feel free to:
	- Open an issue for discussion
	- Reach out to the maintainers

	Thank you for contributing to BioRLHF!