# Why magi_depyf
`magi_depyf` is the compilation observability layer for `MagiCompiler`.
It answers three practical questions:
1. **What exactly happened during compilation?**
2. **Did the outcome match expectations?** (for example: graph split shape, cache reuse, retry behavior)
3. **If something failed, where should I look first?**
---
## 1. Positioning: Built-in observability for MagiCompiler
In a real MagiCompiler pipeline, compilation spans multiple stages: Dynamo capture, Magi backend graph transforms/splitting, Inductor codegen, and AOT/JIT reuse.
Plain logs are often not enough to reliably answer:
- Did the failure happen at full-graph level or in a specific subgraph?
- Did a pass change graph behavior unexpectedly?
- Why did this run miss the cache?
`magi_depyf` writes these signals as structured events and artifacts on disk, so you can replay, compare, and debug deterministically.
---
## 2. Primary scenario: `magi_compile`
### 2.1 Recommended usage model
For `magi_compile` users, `magi_depyf` works best as a built-in observability layer that:
- Streams events to `timeline_events/timeline.jsonl` during compilation
- Writes event artifacts under `timeline_events/files/`
- Exports a structured `compiled_functions/` artifact tree
In most cases, you do not need to wire custom hooks manually: run the model through the MagiCompiler path and inspect the output directory.
### 2.2 What you get automatically in the Magi compile path
For the `magi_compile` scenario, you do **not** need to call `explain_compilation` manually.
As long as the model runs through MagiCompiler, `magi_depyf` outputs are written automatically.
For example, one real run produced:
- `magi_depyf_torch_compile_debug/magi_depyf/run_20260322_192147/model_1_TwoLayerTransformer_rank_0/timeline_events`
In general, the output pattern is:
- `<cache_root_dir>/magi_depyf/run_*/model_*_rank_*/timeline_events/`
Key artifacts under this directory:
- `timeline.jsonl`: streaming event log written during compilation
- `files/0000_*`, `files/0001_*`, ...: numbered event folders with attached artifacts
- `files/*/attributes.json`: structured metadata for each event folder
A related view of the compiled artifacts is also emitted under:
- `<cache_root_dir>/magi_depyf/run_*/model_*_rank_*/compiled_functions/`
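To locate these directories programmatically, the output pattern above can be turned into a glob. The following is a minimal sketch, not part of `magi_depyf`: the helper names `find_run_dirs` and `describe_run` are hypothetical, and the only assumption is that the on-disk layout matches the pattern shown above.

```python
from pathlib import Path


def find_run_dirs(cache_root_dir):
    """Return per-model, per-rank output directories, sorted by path.

    Hypothetical helper: it only globs the documented pattern
    <cache_root_dir>/magi_depyf/run_*/model_*_rank_*.
    """
    pattern = "magi_depyf/run_*/model_*_rank_*"
    return sorted(p for p in Path(cache_root_dir).glob(pattern) if p.is_dir())


def describe_run(run_dir):
    """Report which of the two documented subtrees exist for one run."""
    run_dir = Path(run_dir)
    return {
        "timeline": (run_dir / "timeline_events" / "timeline.jsonl").is_file(),
        "compiled_functions": (run_dir / "compiled_functions").is_dir(),
    }
```

Calling `find_run_dirs("magi_depyf_torch_compile_debug")` after a run should list one directory per model and rank; `describe_run` then shows whether the timeline and the compiled-function tree were both written.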
See the full runnable example:
- `magi_compiler/magi_depyf/example/magi_compile_transformer_example.py`
Typical questions this path answers quickly:
- Did fullgraph/subgraph events happen in the expected order?
- Which pass changed graph structure?
- Did `RestartAnalysis` happen, and was cache load skipped as expected?
- Which event folder contains the exact graph/code snapshot for debugging?
---
## 3. Secondary scenario: plain `torch.compile` (manual entry)
`magi_depyf` also works with plain `torch.compile`, but this is a secondary workflow and requires manual context wrapping:
```python
import torch
from magi_compiler.magi_depyf.inspect import explain_compilation


@torch.compile
def toy_example(a, b):
    x = a / (torch.abs(a) + 1)
    # Data-dependent branch: forces Dynamo to break the graph here.
    if b.sum() < 0:
        b = -b
    return x * b


# Everything compiled inside this context is dumped to the given directory.
with explain_compilation("./magi_depyf_torch_compile_debug"):
    for _ in range(10):
        toy_example(torch.randn(10), torch.randn(10))
```
See the concrete example here:
- `magi_compiler/magi_depyf/example/torch_compile_toy_example.py`
This demo shows the minimal way to use `explain_compilation` in a pure `torch.compile` setup.
---
## 4. Event model (concise)
Each event contains:
- `name`: event name (with `fullgraph_` or `subgraph_N_` prefix)
- `attributes`: structured metadata (for example `lifecycle_name`, `runtime_shape`, `graph_index`, `reason`)
- `files`: attached text artifacts (graph code, inductor output, etc.)
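As an illustration of how these fields might be consumed, the sketch below reads a `timeline.jsonl` file line by line. It assumes each line is one JSON object with the top-level keys `name`, `attributes`, and `files` listed above; the exact on-disk schema is not specified here, so treat the key access as an assumption. The helper names `load_events` and `summarize` are hypothetical.

```python
import json
from pathlib import Path


def load_events(timeline_path):
    """Parse a JSON Lines file into a list of dicts, skipping blank lines."""
    events = []
    with Path(timeline_path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                events.append(json.loads(line))
    return events


def summarize(event):
    """Return (name, lifecycle_name, number of attached files).

    Assumes the keys `name`, `attributes`, and `files`; missing keys
    fall back to None or an empty container instead of raising.
    """
    attrs = event.get("attributes", {})
    return (
        event.get("name"),
        attrs.get("lifecycle_name"),
        len(event.get("files", [])),
    )
```

Printing `summarize(e)` for every event in order gives a compact view of the timeline that is easy to diff between two runs.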
### 4.1 Lifecycle naming pattern
Lifecycle events are normalized into the following pattern:
- `*_before_<lifecycle_name>`
- `*_after_<lifecycle_name>`
- `*_failed_<lifecycle_name>`
- `*_skip_<lifecycle_name>`
Where `*` is `fullgraph` or `subgraph_<N>`.
### 4.2 Representative examples
- `fullgraph_before_graph_split`
- `fullgraph_after_graph_split`
- `fullgraph_before_compiler_manager_compile`
- `fullgraph_after_compiler_manager_load`
- `subgraph_2_before_postcleanuppass`
- `subgraph_2_after_postcleanuppass`
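Because the names follow a fixed pattern, they can be split back into scope, phase, and lifecycle name with a regular expression. This is a sketch based only on the pattern described in 4.1; `parse_event_name` and `ParsedEvent` are hypothetical names, and event names that do not follow the pattern simply yield `None`.

```python
import re
from typing import NamedTuple, Optional

# <scope>_<phase>_<lifecycle_name>, where scope is fullgraph or subgraph_<N>.
EVENT_NAME = re.compile(
    r"^(?:(?P<full>fullgraph)|subgraph_(?P<index>\d+))"
    r"_(?P<phase>before|after|failed|skip)"
    r"_(?P<lifecycle>.+)$"
)


class ParsedEvent(NamedTuple):
    scope: str                  # "fullgraph" or "subgraph"
    graph_index: Optional[int]  # N for subgraph_<N>, None for fullgraph
    phase: str                  # before / after / failed / skip
    lifecycle_name: str


def parse_event_name(name):
    """Split an event name into its parts, or return None if it does not match."""
    m = EVENT_NAME.match(name)
    if m is None:
        return None
    index = m.group("index")
    return ParsedEvent(
        scope="fullgraph" if m.group("full") else "subgraph",
        graph_index=int(index) if index is not None else None,
        phase=m.group("phase"),
        lifecycle_name=m.group("lifecycle"),
    )
```

For example, `parse_event_name("subgraph_2_before_postcleanuppass")` yields scope `subgraph`, index `2`, phase `before`, and lifecycle name `postcleanuppass`.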
---
## 5. Output layout (current implementation)
```text
debug_output/
  timeline_events/
    timeline.jsonl
    files/
      0000_fullgraph_.../
        attributes.json
        *.py / *.txt
      0001_subgraph_.../
        submod_<N>/
          attributes.json
          *.py / *.txt
  compiled_functions/
    <function_name>/
      overview.md
      decompiled_code.py
      entry_0/
        guards.txt
        compiled_fns/
        piecewise_subgraphs/
```
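To inspect the per-event metadata without opening each folder by hand, one option is to walk `timeline_events/files/` and load every `attributes.json`. The sketch below searches recursively, so it does not depend on whether a given `attributes.json` sits directly in an event folder or in a nested `submod_<N>/` folder. `collect_attributes` is a hypothetical helper, and the content of each `attributes.json` is assumed to be valid JSON.

```python
import json
from pathlib import Path


def collect_attributes(files_dir):
    """Map each folder that holds an attributes.json to its parsed content.

    Keys are folder paths relative to files_dir (POSIX style), in sorted
    order, which follows the numeric event prefix (0000_, 0001_, ...).
    """
    files_dir = Path(files_dir)
    result = {}
    for attr_path in sorted(files_dir.rglob("attributes.json")):
        rel = attr_path.parent.relative_to(files_dir).as_posix()
        result[rel] = json.loads(attr_path.read_text(encoding="utf-8"))
    return result
```

Since the folder names carry a zero-padded sequence prefix, iterating over the returned mapping visits events in the order they were recorded.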
---
## 6. Recommended triage order
When debugging a compile issue, use this order:
1. `timeline.jsonl`: verify event ordering first
2. `*_failed_*` and cache-related lifecycle events (`compiler_manager_load`, `compiler_manager_cache_store`): check for failures and unexpected cache behavior
3. Graph files attached to relevant events: verify before/after pass graph state
4. `compiled_functions/*/overview.md`: verify final compiled artifact structure
The goal is not just to measure absolute timing, but to establish:
- what happened,
- whether it matched expectations,
- and where to drill down when it failed.
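Step 2 of this order can be automated with a simple filter over event names. The sketch below takes a list of names in timeline order and picks out failures and cache-related events; it relies only on the naming pattern from section 4.1 and the two lifecycle names mentioned in step 2. `triage` is a hypothetical helper, not an API of `magi_depyf`.

```python
# Lifecycle names called out in step 2 of the triage order.
CACHE_LIFECYCLES = ("compiler_manager_load", "compiler_manager_cache_store")


def triage(event_names):
    """Split event names (in timeline order) into failures and cache events.

    An event can appear in both lists, for example a failed cache load.
    """
    failed = [n for n in event_names if "_failed_" in n]
    cache = [n for n in event_names if n.endswith(CACHE_LIFECYCLES)]
    return {"failed": failed, "cache": cache}
```

An empty `failed` list together with the expected `before`/`after`/`skip` cache events is a quick signal that the run behaved as intended; anything else points to the event folders worth opening first.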
---
## 7. Compatibility and entry-point summary
- **Primary entry**: MagiCompiler main path (`magi_compile`)
- **Optional entry**: plain `torch.compile` + manual `explain_compilation`
One-line summary:
`magi_depyf` is first and foremost MagiCompiler’s internal observability infrastructure, while still being usable as a manual troubleshooting tool for plain `torch.compile` when needed.
