Development Guidelines

This document contains critical information about working with this codebase. Follow these guidelines precisely.

Environment Setup

To ensure everyone uses the same environment, follow these steps:

Initial Setup: Run uv sync to create/update your environment from the lockfile
After Pulling Changes: If uv.lock has changed, run uv sync again
Adding Dependencies: Use uv add <package> which updates both pyproject.toml and uv.lock
Removing Dependencies: Use uv remove <package>

The uv.lock file ensures all developers and CI/CD systems use exactly the same package versions.

Core Development Rules

Package Management
- ONLY use uv, NEVER pip
- Environment setup: uv sync (creates consistent environment from uv.lock)
- Installation: uv add package
- Running tools: uv run tool
- Upgrading: uv add --dev package --upgrade-package package
- FORBIDDEN: uv pip install, @latest syntax
Code Quality
- Type hints required for all code
- Public APIs must have docstrings
- Functions must be focused and small
- Follow existing patterns exactly
- Line length: 88 chars maximum
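
As a rough illustration of the rules above, the sketch below shows a small, fully typed public function with a docstring, with every line under 88 characters. The name `normalize_task_names` and its behavior are hypothetical and do not come from this codebase.

```python
def normalize_task_names(raw: str) -> list[str]:
    """Split a comma-separated task string into clean task names.

    Hypothetical helper, for illustration only.

    Args:
        raw: Task names separated by commas, e.g. "mmmu, mme".

    Returns:
        Task names with surrounding whitespace removed and empty entries dropped.
    """
    return [name.strip() for name in raw.split(",") if name.strip()]
```

The function does one thing, so the signature and docstring are enough to describe it.
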
Testing Requirements
- Framework: uv run pytest
- Async testing: use anyio, not asyncio
- Coverage: test edge cases and errors
- New features require tests
- Bug fixes require regression tests
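
A minimal sketch of what edge-case and error coverage might look like, assuming a hypothetical `parse_limit` helper that is not part of this codebase. The tests are plain functions that pytest would collect. The error-path test uses try/except only to keep the sketch free of third-party imports; with pytest installed, `pytest.raises(ValueError)` is the usual form. Async tests are not shown.

```python
from typing import Optional


def parse_limit(raw: Optional[str]) -> Optional[int]:
    """Parse a hypothetical limit value; None or a blank string means no limit."""
    if raw is None or raw.strip() == "":
        return None
    value = int(raw)
    if value < 0:
        raise ValueError(f"limit must be non-negative, got {value}")
    return value


def test_parse_limit_returns_int() -> None:
    assert parse_limit("8") == 8


def test_parse_limit_edge_cases() -> None:
    # Edge cases: missing value, blank string, and zero.
    assert parse_limit(None) is None
    assert parse_limit("   ") is None
    assert parse_limit("0") == 0


def test_parse_limit_rejects_negative() -> None:
    # Error path: a negative limit must raise instead of being accepted.
    try:
        parse_limit("-1")
    except ValueError as error:
        assert "non-negative" in str(error)
    else:
        raise AssertionError("expected ValueError for a negative limit")
```
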
Code Style
- PEP 8 naming (snake_case for functions/variables)
- Class names in PascalCase
- Constants in UPPER_SNAKE_CASE
- Document with docstrings
- Use f-strings for formatting
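
The naming and formatting rules above can be sketched in a few lines. `PERCENT_DIGITS`, `EvalSummary`, and `format_summary` are hypothetical names chosen for this example only.

```python
from dataclasses import dataclass

PERCENT_DIGITS = 1  # constant: UPPER_SNAKE_CASE


@dataclass(frozen=True)
class EvalSummary:  # class: PascalCase
    """Hypothetical result record used only for this example."""

    task_name: str
    accuracy: float


def format_summary(summary: EvalSummary) -> str:  # function: snake_case
    """Return a one-line, human-readable summary built with an f-string."""
    return f"{summary.task_name}: {summary.accuracy:.{PERCENT_DIGITS}%} accuracy"
```

For example, `format_summary(EvalSummary("mme", 0.75))` returns `"mme: 75.0% accuracy"`.
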

For commits fixing bugs or adding features based on user reports, add:
```
git commit --trailer "Reported-by:<name>"
```
Where <name> is the name of the user.

For commits related to a GitHub issue, add:

git commit --trailer "Github-Issue:#<number>"

NEVER mention co-authored-by or similar attribution. In particular, never mention the tool used to create the commit message or PR.

Development Philosophy

Simplicity: Write simple, straightforward code
Readability: Make code easy to understand
Performance: Consider performance without sacrificing readability
Maintainability: Write code that's easy to update
Testability: Ensure code is testable
Reusability: Create reusable components and functions
Less Code = Less Debt: Minimize code footprint

Coding Best Practices

Early Returns: Use to avoid nested conditions
Descriptive Names: Use clear variable/function names (prefix handlers with "handle")
Constants Over Functions: Use constants where possible
DRY Code: Don't repeat yourself
Functional Style: Prefer functional, immutable approaches when not verbose
Minimal Changes: Only modify code related to the task at hand
Function Ordering: Define composing functions before their components
TODO Comments: Mark issues in existing code with "TODO:" prefix
Simplicity: Prioritize simplicity and readability over clever solutions
Build Iteratively: Start with minimal functionality and verify it works before adding complexity
Run Tests: Test your code frequently with realistic inputs and validate outputs
Build Test Environments: Create testing environments for components that are difficult to validate directly
Functional Code: Use functional and stateless approaches where they improve clarity
Clean Logic: Keep core logic clean and push implementation details to the edges
File Organization: Balance file organization with simplicity; use an appropriate number of files for the project scale
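
A few of the practices above (early returns, the "handle" prefix, and defining the composing function before its components) might look like this in code. Both functions are hypothetical and exist only for this sketch.

```python
from typing import Optional


def handle_request(payload: Optional[dict[str, str]]) -> str:
    """Hypothetical handler: validate first, return early, then do the work."""
    if payload is None:
        return "error: missing payload"
    task = payload.get("task", "").strip()
    if not task:
        return "error: missing task"
    return build_response(task)


def build_response(task: str) -> str:
    """Component of handle_request, defined after its caller."""
    return f"ok: {task}"
```

Each guard clause exits as soon as the input is known to be invalid, so the main path stays unindented and easy to scan.
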

Core Components

__main__.py: Main entry point
api: API for the project
tasks: Tasks for the project
models: Models for the project
loggers: Loggers for the project
utils: Utility functions for the project
tests: Tests for the project
configs: Configs for the project
data: Data for the project

Launch Command:

python -m lmms_eval --model qwen2_5_vl --model_args pretrained=Qwen/Qwen2.5-VL-3B-Instruct,max_pixels=12845056,attn_implementation=sdpa --tasks mmmu,mme,mmlu_flan_n_shot_generative --batch_size 128 --limit 8 --device cuda:0
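
When the launch command has to be assembled from a script, a sketch along these lines may help. It only builds and prints the argument list and does not start an evaluation. The helper name `build_launch_command` and the `MODEL_ARGS` and `TASKS` constants are assumptions made for this example; the values are copied from the command above.

```python
import shlex

MODEL_ARGS = {
    "pretrained": "Qwen/Qwen2.5-VL-3B-Instruct",
    "max_pixels": "12845056",
    "attn_implementation": "sdpa",
}
TASKS = ["mmmu", "mme", "mmlu_flan_n_shot_generative"]


def build_launch_command(
    model_args: dict[str, str], tasks: list[str], limit: int = 8
) -> list[str]:
    """Assemble the launch command shown above as an argv list."""
    # --model_args expects comma-separated key=value pairs.
    joined_args = ",".join(f"{key}={value}" for key, value in model_args.items())
    command = ["python", "-m", "lmms_eval", "--model", "qwen2_5_vl"]
    command += ["--model_args", joined_args]
    command += ["--tasks", ",".join(tasks)]
    command += ["--batch_size", "128", "--limit", str(limit), "--device", "cuda:0"]
    return command


# Print the command in a copy-pasteable shell form.
print(shlex.join(build_launch_command(MODEL_ARGS, TASKS)))
```

The list returned by `build_launch_command` can be passed to `subprocess.run` unchanged, which avoids shell quoting issues.
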

Pull Requests

Create a detailed message describing what changed. Focus on a high-level description of the problem it tries to solve and how it is solved. Don't go into the specifics of the code unless doing so adds clarity.
NEVER mention co-authored-by or similar attribution. In particular, never mention the tool used to create the commit message or PR.

Python Tools

Code Formatting

Ruff
- Format: uv run ruff format .
- Check: uv run ruff check .
- Fix: uv run ruff check . --fix
- Critical issues:
  - Line length (88 chars)
  - Import sorting (I001)
  - Unused imports
- Line wrapping:
  - Strings: use parentheses
  - Function calls: multi-line with proper indent
  - Imports: split into multiple lines
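
The three wrapping patterns above could look roughly like this. `build_report` and its parameters are hypothetical, and the import is split only to show the shape.

```python
# Imports: one name per line inside parentheses.
from typing import (
    Iterable,
    Optional,
)


def build_report(
    task_names: Iterable[str],
    limit: Optional[int] = None,
    device: str = "cuda:0",
) -> str:
    """Hypothetical helper that shows the three wrapping patterns."""
    tasks = ", ".join(sorted(task_names))
    limit_text = "all samples" if limit is None else f"{limit} samples"
    # Long string: adjacent literals inside parentheses are joined by Python.
    return (
        f"Evaluating tasks [{tasks}] on device {device} "
        f"using {limit_text} per task and the default batch size."
    )


# Long call: one argument per line, closed with a trailing comma.
report = build_report(
    task_names=["mmmu", "mme"],
    limit=8,
    device="cuda:0",
)
print(report)
```
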
Type Checking
- Tool: uv run pyright
- Requirements:
  - Explicit None checks for Optional
  - Type narrowing for strings
  - Version warnings can be ignored if checks pass
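
A minimal sketch of the two narrowing requirements above, using hypothetical helpers. The explicit `is None` check and the `isinstance` check are what let a type checker treat the value as `str` on the following lines.

```python
from typing import Optional, Union


def resolve_device(device: Optional[str]) -> str:
    """Hypothetical helper: explicit None check narrows Optional[str] to str."""
    if device is None:
        return "cpu"
    # Past the check, device is known to be a str, so str methods are safe.
    return device.lower()


def parse_batch_size(value: Union[str, int]) -> int:
    """Hypothetical helper: isinstance narrows the str branch before str methods."""
    if isinstance(value, str):
        return int(value.strip())
    # Only int remains here, so it can be returned directly.
    return value
```
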
Pre-commit
- Config: .pre-commit-config.yaml
- Runs: on git commit
- Tools: Prettier (YAML/JSON), Ruff (Python)
- Ruff updates:
  - Check PyPI versions
  - Update config rev
  - Commit config first

Error Resolution

CI Failures
- Fix order:
  1. Formatting
  2. Type errors
  3. Linting
- Type errors:
  - Get full line context
  - Check Optional types
  - Add type narrowing
  - Verify function signatures
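
One possible way to follow the fix order above is a small script that runs the checks in sequence and reports the first stage that fails. This is only a sketch: `CHECKS`, `run_command`, and `first_failing_check` are hypothetical names, and the commands are the ones listed in this document.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Commands taken from this document, in the fix order listed above.
CHECKS: list[tuple[str, list[str]]] = [
    ("formatting", ["uv", "run", "ruff", "format", "."]),
    ("type errors", ["uv", "run", "pyright"]),
    ("linting", ["uv", "run", "ruff", "check", "."]),
]


def run_command(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def first_failing_check(
    runner: Callable[[Sequence[str]], int] = run_command,
) -> Optional[str]:
    """Run the checks in order and return the first stage that fails."""
    for stage, command in CHECKS:
        if runner(command) != 0:
            return stage
    return None
```

Calling `first_failing_check()` with no arguments runs the real commands, so it assumes uv, Ruff, and Pyright are installed. The `runner` parameter exists so the ordering can be exercised without them.
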
Common Issues
- Line length:
  - Break strings with parentheses
  - Multi-line function calls
  - Split imports
- Types:
  - Add None checks
  - Narrow string types
  - Match existing patterns
Best Practices
- Check git status before commits
- Run formatters before type checks
- Keep changes minimal
- Follow existing patterns
- Document public APIs
- Test thoroughly