Buckets:
| # Why magi_depyf | |
| `magi_depyf` is the compilation observability layer for `MagiCompiler`. | |
| It answers three practical questions: | |
| 1. **What exactly happened during compilation?** | |
| 2. **Did the outcome match expectations?** (for example: graph split shape, cache reuse, retry behavior) | |
| 3. **If something failed, where should I look first?** | |
| --- | |
| ## 1. Positioning: Built-in observability for MagiCompiler | |
| In a real MagiCompiler pipeline, compilation spans multiple stages: Dynamo capture, Magi backend graph transforms/splitting, Inductor codegen, and AOT/JIT reuse. | |
| Plain logs are often not enough to reliably answer: | |
| - Did the failure happen at full-graph level or in a specific subgraph? | |
| - Did a pass change graph behavior unexpectedly? | |
| - Why did this run miss cache? | |
| `magi_depyf` writes these signals as structured events and artifacts on disk, so you can replay, compare, and debug deterministically. | |
| --- | |
| ## 2. Primary scenario: `magi_compile` | |
| ### 2.1 Recommended usage model | |
| For `magi_compile` users, `magi_depyf` is most useful as a built-in observability layer: | |
| - Streams events to `timeline_events/timeline.jsonl` during compilation | |
| - Writes event artifacts under `timeline_events/files/` | |
| - Exports a structured `compiled_functions/` artifact tree | |
| In most cases, you do not need to manually wire custom hooks. Run through the MagiCompiler path and inspect the output directory. | |
| ### 2.2 What you get automatically in the Magi compile path | |
| For the `magi_compile` scenario, you do **not** need to call `explain_compilation` manually. | |
| As long as the model runs through MagiCompiler, `magi_depyf` outputs are written automatically. | |
| For example, one real run produced: | |
| - `magi_depyf_torch_compile_debug/magi_depyf/run_20260322_192147/model_1_TwoLayerTransformer_rank_0/timeline_events` | |
| In general, the output pattern is: | |
| - `<cache_root_dir>/magi_depyf/run_*/model_*_rank_*/timeline_events/` | |
| Key artifacts under this directory: | |
| - `timeline.jsonl`: streaming event log written during compilation | |
| - `files/0000_*`, `files/0001_*`, ... event folders with attached artifacts | |
| - `files/*/attributes.json`: structured metadata for each event folder | |
| Related compiled artifact view is also emitted under: | |
| - `<cache_root_dir>/magi_depyf/run_*/model_*_rank_*/compiled_functions/` | |
| See the full runnable example: | |
| - `magi_compiler/magi_depyf/example/magi_compile_transformer_example.py` | |
| Typical questions this path answers quickly: | |
| - Did fullgraph/subgraph events happen in the expected order? | |
| - Which pass changed graph structure? | |
| - Did `RestartAnalysis` happen, and was cache load skipped as expected? | |
| - Which event folder contains the exact graph/code snapshot for debugging? | |
| --- | |
| ## 3. Secondary scenario: plain `torch.compile` (manual entry) | |
| `magi_depyf` also works with plain `torch.compile`, but this is a secondary workflow and requires manual context wrapping: | |
| ```python | |
| import torch | |
| from magi_compiler.magi_depyf.inspect import explain_compilation | |
| @torch.compile | |
| def toy_example(a, b): | |
| x = a / (torch.abs(a) + 1) | |
| if b.sum() < 0: | |
| b = -b | |
| return x * b | |
| with explain_compilation("./magi_depyf_torch_compile_debug"): | |
| for _ in range(10): | |
| toy_example(torch.randn(10), torch.randn(10)) | |
| ``` | |
| See the concrete example here: | |
| - `magi_compiler/magi_depyf/example/torch_compile_toy_example.py` | |
| This demo shows the minimal way to use `explain_compilation` in a pure `torch.compile` setup. | |
| --- | |
| ## 4. Event model (concise) | |
| Each event contains: | |
| - `name`: event name (with `fullgraph_` or `subgraph_N_` prefix) | |
| - `attributes`: structured metadata (for example `lifecycle_name`, `runtime_shape`, `graph_index`, `reason`) | |
| - `files`: attached text artifacts (graph code, inductor output, etc.) | |
| ### 4.1 Lifecycle naming pattern | |
| Lifecycle events are normalized into the following pattern: | |
| - `*_before_<lifecycle_name>` | |
| - `*_after_<lifecycle_name>` | |
| - `*_failed_<lifecycle_name>` | |
| - `*_skip_<lifecycle_name>` | |
| Where `*` is `fullgraph` or `subgraph_<N>`. | |
| ### 4.2 Representative examples | |
| - `fullgraph_before_graph_split` | |
| - `fullgraph_after_graph_split` | |
| - `fullgraph_before_compiler_manager_compile` | |
| - `fullgraph_after_compiler_manager_load` | |
| - `subgraph_2_before_postcleanuppass` | |
| - `subgraph_2_after_postcleanuppass` | |
| --- | |
| ## 5. Output layout (current implementation) | |
| ```text | |
| debug_output/ | |
| timeline_events/ | |
| timeline.jsonl | |
| files/ | |
| 0000_fullgraph_.../ | |
| attributes.json | |
| *.py / *.txt | |
| 0001_subgraph_.../ | |
| submod_<N>/ | |
| attributes.json | |
| *.py / *.txt | |
| compiled_functions/ | |
| <function_name>/ | |
| overview.md | |
| decompiled_code.py | |
| entry_0/ | |
| guards.txt | |
| compiled_fns/ | |
| piecewise_subgraphs/ | |
| ``` | |
| --- | |
| ## 6. Recommended triage order | |
| When debugging a compile issue, use this order: | |
| 1. `timeline.jsonl`: verify event ordering first | |
| 2. `*_failed_*` / cache-related lifecycle events (`compiler_manager_load`, `compiler_manager_cache_store`) | |
| 3. Graph files attached to relevant events: verify before/after pass graph state | |
| 4. `compiled_functions/*/overview.md`: verify final compiled artifact structure | |
| The key is not just absolute timing. The key is: | |
| - what happened, | |
| - whether it matched expectations, | |
| - and where to drill down when it failed. | |
| --- | |
| ## 7. Compatibility and entry-point summary | |
| - **Primary entry**: MagiCompiler main path (`magi_compile`) | |
| - **Optional entry**: plain `torch.compile` + manual `explain_compilation` | |
| One-line summary: | |
| `magi_depyf` is first and foremost MagiCompiler’s internal observability infrastructure, while still being usable as a manual troubleshooting tool for plain `torch.compile` when needed. | |
Xet Storage Details
- Size:
- 5.66 kB
- Xet hash:
- bef37d652254fcd97d54a061765cd1817deebd64c2746eba7ca0479364eec682
·
Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.