| # Practical Guide: Writing Clear and Reusable Python Code | |
| **Objective**: Enable future maintainers to understand your code efficiently after several months. | |
| ## Two Common Challenges | |
| - **Overly large files**: Linear code in extensive files leads to cognitive overload. Never write 1000-line functions. | |
| - **Excessive abstraction layers**: Multiple nested classes make it difficult to locate actual business logic. | |
| **Goal**: Find a balance between these extremes. | |
| ## Target Outcomes | |
| - Immediately comprehensible structure with well-named functions | |
| - Code that reads naturally and fails explicitly when something is wrong | |
| > **Core Principle**: Code readability is our primary concern. If understanding existing code takes more effort than rewriting it, the implementation needs improvement. | |
| --- | |
| ## Core Development Principles | |
| ### Fail Fast, Fail Loud | |
| **NEVER fall back to degraded solutions that silently produce incorrect output.** | |
| When something goes wrong, stop with a clear error rather than continue with incorrect results. Fallbacks create inconsistent behavior, silent data degradation, and false confidence. | |
| While a feature is still in progress, never leave an unfinished stub that silently does nothing; make it raise `NotImplementedError`. Silent, forgotten, unfinished implementations are a source of bugs. | |
| ```python | |
| # BAD - Silent degradation | |
| def calculate_average(data): | |
|     if not data: | |
|         return 0  # Silently masks the issue | |
|     return sum(data) / len(data) | |
| # GOOD - Explicit failure | |
| def calculate_average(data): | |
|     if not data: | |
|         raise ValueError("Cannot calculate average: data list is empty") | |
|     return sum(data) / len(data) | |
| ``` | |
| **Key**: Failed execution with clear error > Successful execution with wrong data | |
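The rule about unfinished stubs can be made concrete with a short sketch. The function name `export_to_parquet` and its parameters are hypothetical, chosen only for illustration; the point is that a placeholder raises `NotImplementedError` instead of returning quietly.

```python
def export_to_parquet(rows: list[dict], output_path: str) -> None:
    """Write rows to a Parquet file (hypothetical, not implemented yet)."""
    # Fail loudly: a stub that silently returns None would look like a successful export.
    raise NotImplementedError(
        "export_to_parquet is not implemented yet: "
        f"cannot write {len(rows)} rows to {output_path}"
    )
```

Any caller that reaches this stub stops immediately with a message naming the missing feature, so the gap cannot be forgotten.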
| ### Clear Error Messages | |
| Error messages should be **Specific + Actionable + Contextual**: what, where, why, how to fix. | |
| ```python | |
| # BAD: Vague | |
| raise ValueError("Invalid data") | |
| # GOOD: Specific, actionable, contextual | |
| raise ValueError(f"User age must be between 0 and 150, got {age}") | |
| raise FileNotFoundError(f"Model not found at {model_path}. Run: python download_model.py") | |
| ``` | |
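As a fuller illustration of "what, where, why, how to fix", here is a minimal sketch of a validator whose messages name the setting, show the offending value, and state the accepted range. The function `parse_port` and the setting name passed to it are hypothetical.

```python
def parse_port(raw_value: str, setting_name: str) -> int:
    """Parse a TCP port number read from a configuration string."""
    try:
        port = int(raw_value)
    except ValueError as exc:
        # What went wrong, where (setting name), and how to fix it.
        raise ValueError(
            f"{setting_name} must be an integer port number, got {raw_value!r}. "
            "Set it to a value between 1 and 65535."
        ) from exc
    if not 1 <= port <= 65535:
        raise ValueError(
            f"{setting_name} must be between 1 and 65535, got {port}. "
            "Pick a port inside that range."
        )
    return port
```

Calling `parse_port("abc", "server.port")` fails with a message that tells the reader which setting to edit and what a valid value looks like.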
| ### Input and Output Validation | |
| Validate inputs before processing and outputs before saving. | |
| ```python | |
| import cv2 | |
| import numpy as np | |
| def resize_image(image: np.ndarray, target_size: tuple[int, int]) -> np.ndarray: | |
|     if image is None or len(image.shape) not in (2, 3): | |
|         raise ValueError(f"Expected 2D/3D image, got {image.shape if image is not None else None}") | |
|     if target_size[0] <= 0 or target_size[1] <= 0: | |
|         raise ValueError(f"Target size must be positive, got {target_size}") | |
|     return cv2.resize(image, target_size)  # cv2 expects (width, height) | |
| def generate_report(data: list[dict]) -> str: | |
|     report = _create_report(data)  # _create_report: project-specific helper, not shown | |
|     # Compute the length safely so an empty or None report still raises a clear ValueError | |
|     report_length = len(report) if report else 0 | |
|     if report_length < 10: | |
|         raise ValueError(f"Report suspiciously short ({report_length} chars)") | |
|     return report | |
| ``` | |
| --- | |
| ## 1. Maintain Simplicity | |
| Implement the **most straightforward solution** that meets requirements. | |
| Avoid adding complex layers for single use cases. | |
| ## 2. Avoid Unnecessary Structures | |
| ### Redundant Try/Except / Single-Method Classes / Premature Abstractions | |
| ```python | |
| # BAD: Simply re-raising | |
| try: | |
|     result = process_image(img_path) | |
| except Exception as e: | |
|     raise e  # Adds no value | |
| # BAD: Hiding errors | |
| try: | |
|     predictions = model.predict(data) | |
| except: | |
|     predictions = []  # Lost error info | |
| # BAD: Unnecessary class | |
| class ImageProcessor: | |
|     def process(self, image): | |
|         return cv2.resize(image, (224, 224)) | |
| # GOOD: Simple and direct | |
| def process_image(img_path): | |
|     image = cv2.imread(img_path) | |
|     if image is None:  # cv2.imread returns None on failure instead of raising | |
|         raise OSError(f"Could not read image: {img_path}") | |
|     return image | |
| def load_and_predict(model_path, data): | |
|     if not os.path.exists(model_path): | |
|         raise FileNotFoundError(f"Model not found: {model_path}") | |
|     return load_model(model_path).predict(data)  # Let exceptions propagate | |
| def resize_image(image, size=(224, 224)): | |
|     return cv2.resize(image, size) | |
| ``` | |
| --- | |
| ## 3. When to Create Functions | |
| **Pipelines**: Don't create 1000-line functions. One function = One purpose. | |
| **Reusable code**: Code written once → keep as is. Twice → consider refactoring. Three+ times → definitely refactor. | |
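The "three or more times" rule can be sketched as follows. The temperature readings and the function name `fahrenheit_to_celsius` are made up for illustration.

```python
# Before: the same conversion is written out at three call sites.
morning_c = (68.0 - 32) * 5 / 9
noon_c = (86.0 - 32) * 5 / 9
evening_c = (59.0 - 32) * 5 / 9


# After: one named function, reused everywhere.
def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


readings_c = [fahrenheit_to_celsius(t) for t in (68.0, 86.0, 59.0)]
```

A fix to the formula now happens in one place, and the name documents the intent at every call site.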
| ## 4. Function Design Guidelines | |
| - **Single Responsibility**: One clear purpose per function | |
| - **Descriptive names**: `remove_outliers` not `do_stuff` | |
| - **Size**: Under 20 lines (split larger functions) | |
| ```python | |
| # BAD: Too many things | |
| def process(df): | |
|     # cleaning, normalization, encoding, split, training... | |
|     return model | |
| # GOOD: One task per function | |
| def clean_columns(df): ... | |
| def normalize(df): ... | |
| def train_model(train, test): ... | |
| ``` | |
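To show how single-purpose functions compose into a readable pipeline, here is a small runnable sketch using only the standard library. All function names and the plausibility bounds are illustrative assumptions, not part of any existing codebase.

```python
from statistics import mean
from typing import Optional


def drop_missing(values: list[Optional[float]]) -> list[float]:
    """Remove None entries."""
    return [v for v in values if v is not None]


def remove_outliers(values: list[float], lower: float, upper: float) -> list[float]:
    """Keep only values inside the inclusive range [lower, upper]."""
    return [v for v in values if lower <= v <= upper]


def average(values: list[float]) -> float:
    """Return the arithmetic mean, failing loudly on empty input."""
    if not values:
        raise ValueError("Cannot average: no values left after cleaning")
    return mean(values)


def summarize_temperatures(raw: list[Optional[float]]) -> float:
    """Pipeline: each step is one short, separately testable function."""
    cleaned = drop_missing(raw)
    plausible = remove_outliers(cleaned, lower=-50.0, upper=60.0)
    return average(plausible)
```

The top-level function reads like a table of contents, and each step can be tested or replaced on its own.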
| --- | |
| ## 5. Effective Documentation | |
| **Philosophy: Documentation lives in code, not in separate markdown files.** | |
| ### Documentation Hierarchy | |
| 1. **Docstrings** (mandatory for public functions/classes) | |
| 2. **Inline comments** (explain "why", not "what") | |
| 3. **README.md** (project overview, setup, quick start) | |
| 4. **Auto-generated docs** (from docstrings) | |
| **Do NOT create proliferating documentation files** - they get out of sync and duplicate docstrings. | |
| Temporary reports go in the `./.temp/` directory, which must be listed in `.gitignore`. | |
| ### Docstring Format | |
| Use **Google Style Docstrings**. | |
| ```python | |
| import re | |
| def normalize_text(text: str, stopwords: list[str] | None = None) -> str: | |
|     """Convert text to lowercase and remove punctuation. | |
|     Args: | |
|         text: Input text to normalize. | |
|         stopwords: Optional list of words to remove. | |
|     Returns: | |
|         Normalized text in lowercase with punctuation removed. | |
|     Raises: | |
|         ValueError: If text is empty or whitespace only. | |
|     """ | |
|     if not text or not text.strip(): | |
|         raise ValueError("Text cannot be empty") | |
|     text = text.lower().strip() | |
|     text = re.sub(r'[^\w\s]', '', text) | |
|     if stopwords: | |
|         text = " ".join(w for w in text.split() if w not in stopwords) | |
|     return text | |
| ``` | |
| ### Inline Comments | |
| ```python | |
| # BAD: Repeats code | |
| timestamp = frame.timestamp # Get the timestamp | |
| # GOOD: Explains why | |
| if frame.timestamp <= 0:  # Skip pre-initialization frames | |
|     continue | |
| ``` | |
| --- | |
| ## 6. Type Annotations | |
| **Always annotate parameters and return types** for public functions. | |
| ```python | |
| # BAD: Unclear | |
| def process_data(data, config, threshold): | |
|     return result | |
| # GOOD: Clear types | |
| def process_data(data: np.ndarray, config: dict, threshold: float) -> list[dict]: | |
|     return result | |
| ``` | |
| ### Pydantic Models (Company Standard) | |
| **MANDATORY: Use Pydantic for complex data structures and interfaces between business bricks.** | |
| Benefits: Runtime validation, clear errors, JSON serialization, self-documenting. | |
| ```python | |
| from pydantic import BaseModel, Field, field_validator | |
| class Detection(BaseModel): | |
|     """Object detection result.""" | |
|     bbox: tuple[int, int, int, int] = Field(..., description="(x1, y1, x2, y2)") | |
|     confidence: float = Field(..., ge=0.0, le=1.0) | |
|     class_id: int = Field(..., ge=0) | |
|     class_name: str | |
|     @field_validator('bbox') | |
|     @classmethod | |
|     def validate_bbox(cls, v: tuple[int, int, int, int]) -> tuple[int, int, int, int]: | |
|         x1, y1, x2, y2 = v | |
|         if x2 <= x1 or y2 <= y1: | |
|             raise ValueError(f"Invalid bbox: expected x2 > x1 and y2 > y1, got {v}") | |
|         return v | |
| ``` | |
| **CRITICAL: Use the same Pydantic models for shared concepts across all projects** (Camera, GPS, Detection, etc.) to ensure consistent validation and easy integration. | |
| --- | |
| ## 7. Prioritize Readability | |
| Clear code takes precedence over premature optimization. Use descriptive names (`average_daily_temperature` not `adt`). | |
| --- | |
| ## 8. Remove Dead Code | |
| **If code is no longer used, delete it.** Dead code pollutes the codebase, creates confusion, and becomes outdated. Git history preserves everything. | |
| **Exceptions**: Code kept deliberately for library coherence or public API compatibility. | |
| Delete: commented-out code, unused imports/functions/classes/variables. Trust git history. | |
| --- | |
| ## 9. No Hardcoding of Frequently Changing Values | |
| **Never hardcode values that change frequently or vary between environments.** | |
| ### Configuration Format | |
| **Prefer**: JSON (company standard), .env (secrets), YAML/TOML (if needed). **Avoid**: Python files. | |
| ```json | |
| {"database": {"host": "localhost", "password": "${DB_PASSWORD}"}} | |
| ``` | |
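Plain JSON has no built-in variable substitution, so a placeholder such as `${DB_PASSWORD}` has to be resolved by the loading code. The following is a minimal sketch, assuming the placeholder names an environment variable; `load_config` and `resolve_placeholders` are hypothetical helper names, not part of any library.

```python
import json
import os
import re

# Matches ${NAME}, where NAME is a valid environment variable identifier.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_placeholders(value, env):
    """Recursively replace ${NAME} in strings with env[NAME]; fail if NAME is missing."""
    if isinstance(value, dict):
        return {key: resolve_placeholders(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [resolve_placeholders(item, env) for item in value]
    if isinstance(value, str):
        def _substitute(match):
            name = match.group(1)
            if name not in env:
                raise ValueError(
                    f"Config references ${{{name}}} but environment variable "
                    f"{name} is not set. Define it in your .env file or shell."
                )
            return env[name]
        return _PLACEHOLDER.sub(_substitute, value)
    return value


def load_config(raw_json: str, env=None) -> dict:
    """Parse a JSON config string and resolve its ${NAME} placeholders."""
    env = os.environ if env is None else env
    return resolve_placeholders(json.loads(raw_json), env)
```

A missing variable stops the program at startup with an actionable message, rather than letting a literal `${DB_PASSWORD}` string reach the database driver.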
| ### What to Configure | |
| Credentials, URLs, file paths, thresholds, environment-specific values. | |
| **Security**: Never commit secrets. Use .gitignore for `.env`, `secrets/`, `*.key`. | |
| **Acceptable hardcoding**: Constants (`PI = 3.14159`), enums, default parameters. | |
| --- | |
| ## 10. No "Hacks to Make It Compile" | |
| **Write code that is correct by design, not code that tricks the type checker or linter.** | |
| Hacks = code modifications made solely to silence errors without addressing the root cause. | |
| ### Common Hacks to Avoid | |
| ```python | |
| # BAD: Silencing errors | |
| result = process_data(records)  # type: ignore | |
| try: | |
|     result = risky_operation() | |
| except: | |
|     pass | |
| from module import *  # noqa | |
| return User(id=-1, name="")  # Dummy values | |
| # GOOD: Fix the root cause | |
| def process_data(records: list[dict]) -> list[dict]: | |
|     return records | |
| result = process_data(records)  # Properly typed | |
| try: | |
|     result = risky_operation() | |
| except ValueError as e: | |
|     raise ValueError(f"risky_operation failed: {e}") from e | |
| from module import SpecificClass, specific_function | |
| def get_user(user_id: int) -> User | None: | |
|     if user_id < 0: | |
|         return None  # Or raise ValueError | |
|     return User(id=user_id, name="John") | |
| ``` | |
| ### When Suppression is Acceptable | |
| Only when you have a legitimate reason AND document why: | |
| ```python | |
| arr[mask] = values # type: ignore[index] # numpy advanced indexing | |
| result = external_lib.process(data) # type: ignore[no-untyped-call] # no type stubs | |
| ``` | |
| **Approach**: Understand error → Fix root cause → If suppression needed, document WHY → Review regularly. | |
| **Remember**: Hacks hide bugs and accumulate technical debt. Fix the issue, don't silence the messenger. | |
| --- | |
| ## File Organization | |
| **ALL temporary files** → `./.temp/` directory. | |
| ```bash | |
| # Good: src/, tests/, .temp/, README.md | |
| # Bad: test_something.py, output_analysis.csv, debug_report.txt in root | |
| ``` | |
| AI-generated code modification reports and temporary tests also belong in `.temp/`. | |
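One way to make the `.temp/` convention hard to bypass is a tiny helper that every script uses to build temporary paths. This is only a sketch; `temp_path` and `TEMP_DIR` are hypothetical names.

```python
from pathlib import Path

# Assumption: scripts run from the project root, so ".temp" resolves to ./.temp/
TEMP_DIR = Path(".temp")


def temp_path(filename: str, base_dir: Path = TEMP_DIR) -> Path:
    """Return a path inside the temp directory, creating the directory if needed."""
    base_dir.mkdir(parents=True, exist_ok=True)
    return base_dir / filename
```

For example, `temp_path("debug_report.txt")` yields `.temp/debug_report.txt` instead of a stray file in the project root.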
| --- | |
| ## Testing | |
| Test error paths, not just success cases. Basic verification saves debugging time. | |
| ```python | |
| import numpy as np | |
| import pytest | |
| def test_resize_image(): | |
|     assert resize_image(np.zeros((100, 100, 3)), (50, 50)).shape == (50, 50, 3) | |
|     with pytest.raises(ValueError): | |
|         resize_image(None, (50, 50)) | |
| ``` | |
| --- | |
| ## Version Control | |
| **Commit strategy**: Create commits before and after major changes (core algorithms, multi-file refactoring, new features). | |
| **Messages**: Use conventional commits (`feat:`, `fix:`, `refactor:`). Be descriptive. | |
| ```bash | |
| # BAD: "fix stuff", "wip" | |
| # GOOD: "fix: resolve race condition in data loader" | |
| ``` | |
| **Always commit**: Production code, tests, docs, non-sensitive config. | |
| **Never commit**: Secrets (use `.env`, add to `.gitignore`). | |
| --- | |
| ## Using AI Tools Responsibly | |
| **You are responsible for every line of code, whether written by you or AI.** | |
| AI shall not modify **core business logic data structures** without review and explicit approval. | |
| Pydantic models forming interfaces between business bricks require human judgment. | |
| --- | |
| ## Pre-Commit Checklist | |
| **Code Quality**: Clear, descriptive names | Type hints on public functions | Pydantic for complex structures | Reusable without copying | |
| **Error Handling**: No fallback logic | Specific error messages | Input/output validation | |
| **AI Code**: All AI code reviewed and understood | Core interfaces human-designed | No black boxes | |
| **Organization**: No temp files in root (use `.temp/`) | Docstrings on public functions | README updated | |
| **Version Control**: Conventional commit format | No secrets | Logical grouping | |
| --- | |
| ## Summary | |
| 1. **Fail fast, fail loud** - Never silently degrade | |
| 2. **Clear errors** - Specific, actionable, contextual | |
| 3. **Validate I/O** - Catch problems early | |
| 4. **Straightforward code** - Simplest solution | |
| 5. **Small functions** - One purpose, <20 lines | |
| 6. **Type annotations** - Mandatory for public functions | |
| 7. **Pydantic for interfaces** - Complex structures, shared schemas | |
| 8. **Docs in code** - Docstrings, not proliferating files | |
| 9. **Organized projects** - `.temp/` for temporary files | |
| 10. **Remove dead code** - Git preserves history | |
| 11. **No hardcoding** - Use config files (JSON standard) | |
| 12. **No hacks** - Fix root cause, don't silence errors | |
| 13. **AI responsibly** - Review everything, humans decide | |
| **When in doubt**: Raise clear error > guess behavior. Ask for clarification rather than corrupt data. | |
| **Success metric**: Future maintainers understand, trust, and reuse your code without assistance. | |