# ARBS Code Audit: Dead Imports, Dead Code, and Triton Kernel Analysis
**Reviewed:** 2026-05-20T00:00:00Z
**Depth:** standard
**Files Reviewed:** 10
## Summary
The ARBS codebase has **3 BLOCKER bugs** that will cause runtime crashes, **8 unused class/function definitions** (dead code), **7 dead Triton kernels in components.py** that should be moved to `arbitor/kernel/`, and **21+ unused imports** across files. Two functions (`_graph_gather_add`, `_moe_dense_combine`) are called in dead code paths but never defined; they do not crash today, but would if those paths were ever activated.
---
## BLOCKER Issues
### CR-01: `_TernaryLinearFn.forward` references undefined `x_2d` (NameError at runtime)
**File:** `arbitor/kernel/ternary_scale.py:206-208`
**Issue:** The TileLang `_TernaryLinearFn.forward()` method references `x_2d` on lines 206-208, but `x_2d` is never defined in the method's scope. This will cause a `NameError` at runtime if the TileLang code path is taken in `TernaryScaleTensor.forward` (line 1069). The Triton variant `_TritonTernaryLinearFn` (line 878) correctly defines `x_2d = x.reshape(-1, k_in).contiguous()` before use, so this was likely an omission when the TileLang function was written.
```python
# Line 206: NameError: name 'x_2d' is not defined
M = x_2d.shape[0]
output = torch.empty(M, N, device=x.device, dtype=torch.float32)
fwd_kernel(x_2d.half(), T_packed, E, output)
```
**Fix:** Add `x_2d = x.reshape(-1, K).contiguous()` before line 206:
```python
with torch.no_grad():
    N, K = shape
    x_2d = x.reshape(-1, K).contiguous()  # missing definition
    M = x_2d.shape[0]
    output = torch.empty(M, N, device=x.device, dtype=torch.float32)
    fwd_kernel(x_2d.half(), T_packed, E, output)
```
---
### CR-02: `_check_tilelang_finite` called but never defined (NameError at runtime)
**File:** `arbitor/kernel/ternary_scale.py:1072`
**Issue:** `_check_tilelang_finite()` is called in `TernaryScaleTensor.forward()` but is never defined anywhere in the codebase. This will cause a `NameError` at runtime as soon as the TileLang path is active and execution reaches the check, whether or not the kernel output is finite (the call is gated by `_HAS_TILELANG` being True).
**Fix:** Either define the function (if the check is intentional) or remove the call:
```python
# Replace line 1072 with a direct finiteness check, or remove the call
if not torch.isfinite(y).all():
    raise FloatingPointError("TileLang ternary kernel produced non-finite activations")
```
---
### CR-03: `self.modality_gate` used but never assigned (AttributeError at runtime)
**File:** `arbitor/main.py:129-130`
**Issue:** `ARBModel.forward()` references `self.modality_gate` but it is never assigned in `ARBModel.__init__()`. While `ModalityGate` is imported at line 19, it is never instantiated and stored as `self.modality_gate`. This will cause an `AttributeError` on any forward pass where `self.modality_gate is not None` is evaluated.
The code at lines 129-132:
```python
if self.modality_gate is not None:
    gate_weights, active_count, hops = self.modality_gate(active_mods)
else:
    gate_weights, active_count, hops = {}, len(active_mods), 1
```
**Fix:** Add `self.modality_gate = ModalityGate()` in `ARBModel.__init__()` (or assign `self.modality_gate = None` if the gate should be optional):
```python
# In ARBModel.__init__, after line 78:
self.modality_gate = ModalityGate()
```
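The failure mode and the optional-gate variant of the fix can be illustrated without PyTorch. The sketch below uses plain Python stand-ins: `BrokenModel`, `FixedModel`, and the `gate` constructor argument are hypothetical names that do not come from `arbitor/main.py`, and the real `ModalityGate` interface may differ.

```python
class BrokenModel:
    """Stand-in for a model whose __init__ never assigns the gate."""

    def forward(self, active_mods):
        # The attribute lookup itself fails, so AttributeError is raised
        # before the `is not None` comparison is ever evaluated.
        if self.modality_gate is not None:
            return self.modality_gate(active_mods)
        return {}, len(active_mods), 1


class FixedModel:
    """Stand-in that always assigns the attribute, defaulting to None."""

    def __init__(self, gate=None):
        # `gate` is a hypothetical optional callable standing in for ModalityGate().
        self.modality_gate = gate

    def forward(self, active_mods):
        if self.modality_gate is not None:
            return self.modality_gate(active_mods)
        # Fallback mirrors the else-branch quoted above.
        return {}, len(active_mods), 1
```

With the attribute always assigned, the existing `is not None` guard becomes safe, and passing no gate selects the fallback branch.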
---
## WARNING: Undefined Functions in Dead Code
### WR-01: `_graph_gather_add` called but never defined
**File:** `arbitor/components.py:739`
**Issue:** `TernaryGraph.forward()` calls `_graph_gather_add(vq_output, node_features, vq_indices)` but this function is never defined anywhere in the codebase. `TernaryGraph` is dead code (never imported or used), so this does not crash currently, but it blocks any future use of `TernaryGraph`.
**Fix:** Define `_graph_gather_add` or remove the dead class.
---
### WR-02: `_moe_dense_combine` called but never defined
**File:** `arbitor/components.py:941`
**Issue:** `SharedProjectionMoE.forward()` calls `_moe_dense_combine(torch.stack(...), topk_idx, topk_weights)` but this function is never defined. `SharedProjectionMoE` is dead code, but the missing function is a latent bug.
**Fix:** Define `_moe_dense_combine` or remove the dead class.
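Latent missing definitions such as WR-01 and WR-02 can be surfaced statically, before the dead path is ever executed. The following is a minimal sketch using only the standard-library `ast` module; `find_undefined_calls` and `sample` are hypothetical names, and the check is deliberately scope-insensitive and single-file, so it is a rough filter and not a replacement for a real linter.

```python
import ast
import builtins


def find_undefined_calls(source: str) -> set[str]:
    """Return bare function names that are called but never bound in `source`."""
    tree = ast.parse(source)
    bound = set(dir(builtins))  # builtins such as len() count as defined
    called = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                bound.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)  # plain assignments, loop variables, etc.
        elif isinstance(node, ast.arg):
            bound.add(node.arg)  # function parameters
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            called.add(node.func.id)  # only bare-name calls, not obj.method()
    return called - bound


# Illustrative input shaped like the call reported in WR-01.
sample = '''
def forward(vq_output, node_features, vq_indices):
    return _graph_gather_add(vq_output, node_features, vq_indices)
'''
```

Calling `find_undefined_calls(sample)` reports `_graph_gather_add`, because the name is called but never defined, imported, or assigned in the parsed source.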
---
## WARNING: Unused Class/Function Definitions (Dead Code)
### WR-03: `TernaryLSTMCell` class is defined but never used
**File:** `arbitor/components.py:189-207`
**Issue:** `TernaryLSTMCell` is defined and re-exported from `__init__.py` (line 23) but is never instantiated anywhere in the codebase. The model uses `MoEGraph` with attention (MLA) instead of LSTM-based processing.
---
### WR-04: `TernaryGraph` class is defined but never used
**File:** `arbitor/components.py:665-802`
**Issue:** `TernaryGraph` is defined in `components.py` but never imported or instantiated. It was replaced by `MoEGraph` (line 1342). The only reference is in a comment (line 1348).
**Also:** `TernaryGraph` references the undefined function `_graph_gather_add` (see WR-01), so it cannot function even if someone tried to use it.
---
### WR-05: `SharedProjectionMoE` class is defined but never used
**File:** `arbitor/components.py:806-999`
**Issue:** `SharedProjectionMoE` is defined in `components.py` but never imported or instantiated. It was replaced by `MoEGraph._run_expert()` (line 1429). The only reference is in a comment (line 1348).
**Also:** References the undefined function `_moe_dense_combine` (see WR-02).
---
### WR-06: 7 dead Triton kernel functions in `components.py`
**File:** `arbitor/components.py:266-386`
**Issue:** These Triton kernel functions are defined inside the `if _HAS_TRITON:` block but are only referenced by their forward/backward wrapper functions which are themselves part of dead code (`TernaryGraph` and `SharedProjectionMoE`):
| Line | Function | Used By |
|------|----------|---------|
| 268 | `_triton_graph_aggregate_fwd_kernel` | dead (TernaryGraph) |
| 292 | `_triton_graph_aggregate_bwd_kernel` | dead (TernaryGraph) |
| 316 | `_triton_graph_gather_add_fwd_kernel` | dead (TernaryGraph) |
| 329 | `_triton_graph_gather_add_bwd_kernel` | dead (TernaryGraph) |
| 342 | `_triton_moe_dense_combine_fwd_kernel` | dead (SharedProjectionMoE) |
| 359 | `_triton_moe_dense_combine_bwd_expert_kernel` | dead (SharedProjectionMoE) |
| 374 | `_triton_moe_dense_combine_bwd_weight_kernel` | dead (SharedProjectionMoE) |
The live Triton kernels (`_triton_video_denoise_fwd_kernel` line 389, `_triton_video_denoise_bwd_kernel` line 402) are still in `components.py` and should also be moved to `arbitor/kernel/`.
---
### WR-07: `_triton_flash_vq_quantize_kernel` is a dead Triton kernel
**File:** `arbitor/kernel/flash_vq.py:370-402`
**Issue:** This Triton kernel is defined but never called. The `_TritonFlashVQFn.forward()` method uses PyTorch's `embed[indices]` for the gather operation (line 468) instead of this kernel.
---
### WR-08: `TILE_SIZE = 384` is an unused constant
**File:** `arbitor/kernel/ternary_scale.py:949`
**Issue:** `TILE_SIZE` is defined as a module-level constant but never referenced anywhere in the codebase.
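Findings like WR-03 through WR-08 can be pre-screened with a single-file reference count. The sketch below is an assumption-laden illustration built on the standard-library `ast` module: `find_unreferenced_definitions` and `sample` are hypothetical names, and the function only sees references inside the one file it parses, so every hit still needs a cross-file search (imports, `__init__.py` re-exports) before the code is declared dead.

```python
import ast


def _module_level_statements(body):
    """Yield module-level statements, descending into `if` blocks."""
    for stmt in body:
        yield stmt
        if isinstance(stmt, ast.If):  # e.g. definitions under `if _HAS_TRITON:`
            yield from _module_level_statements(stmt.body)
            yield from _module_level_statements(stmt.orelse)


def find_unreferenced_definitions(source: str) -> list[str]:
    """Return module-level names that are defined but never read in `source`."""
    tree = ast.parse(source)
    defined = set()
    for stmt in _module_level_statements(tree.body):
        if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            defined.add(stmt.name)
        elif isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    defined.add(target.id)
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return sorted(defined - used)


# Illustrative module: one dead kernel, one live kernel, one unused constant.
sample = '''
TILE_SIZE = 384
_HAS_TRITON = True

if _HAS_TRITON:
    def _dead_kernel():
        pass

    def _live_kernel():
        pass

def step():
    return _live_kernel()
'''
```

On this sample the function flags `TILE_SIZE` and `_dead_kernel`, and also `step`, which shows why the result is only a candidate list: a public entry point looks unreferenced when its callers live in other files.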
---
## WARNING: Unused Imports
### WR-09: Unused imports in `arbitor/main.py` (line 10)
| Symbol | Used In File? |
|--------|--------------|
| `EMBEDDING_DIM` | No; not referenced in body |
| `FFN_HIDDEN` | No; not referenced in body |
| `CODEBOOK_DIM` | No; not referenced in body |
| `ATTENTION_STRIDE` | No; not referenced in body |
| `MG_N_EXPERTS` | No; MoEGraph uses default, not passed |
| `MG_CORE_RANK` | No; MoEGraph uses default |
| `MG_SHARED_INTER` | No; MoEGraph uses default |
| `MG_ACT_ITERS` | No; MoEGraph uses default |
---
### WR-10: Unused imports in `arbitor/components.py` (line 21)
| Symbol | Used In Live Code? | Note |
|--------|-------------------|------|
| `FFN_HIDDEN` | No | Not referenced in file body |
| `CTX` | No | Not referenced in file body |
| `THRESHOLD` | No | Only used in dead `TernaryGraph`. Live `MoEGraph` hardcodes `threshold=0.05` |
| `KG_EMA_ALPHA` | No | Only used in dead `TernaryGraph`. Live `MoEGraph` hardcodes `0.99` |
| `KG_REQUANT_EVERY` | No | Only used in dead `TernaryGraph`. Live `MoEGraph` hardcodes `50` |
| `KG_TERNARY_THRESHOLD` | No | Only used in dead `TernaryGraph`. Live `MoEGraph` hardcodes `0.3` |
---
### WR-11: Unused imports in `arbitor/profiling.py` (line 17)
| Symbol | Used In File? |
|--------|--------------|
| `VOCAB` | No; not referenced in body |
| `math` (line 11) | No; not referenced in body |
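
The unused-import tables in WR-09 through WR-11 can be reproduced mechanically. Established linters already report this class of issue (for example, rule F401 in Pyflakes/flake8 and Ruff); the sketch below only shows the underlying idea with the standard-library `ast` module. `find_unused_imports` and `sample` are hypothetical names, and the check ignores `__all__`, star imports, and names used only inside strings.

```python
import ast


def find_unused_imports(source: str) -> list[str]:
    """Return imported names that are never read anywhere in `source`."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # `import a.b` binds `a`; `import a.b as c` binds `c`.
                imported.add((alias.asname or alias.name).split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                if alias.name != "*":  # star imports cannot be tracked here
                    imported.add(alias.asname or alias.name)
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return sorted(imported - used)


# Illustrative module shaped like the profiling.py finding in WR-11.
sample = '''
import math
from .config import VOCAB, CTX

def window():
    return CTX * 2
'''
```

For this sample the function reports `VOCAB` and `math`, while `CTX` is kept because it is read inside `window()`.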
---
## INFO: Triton Kernel Code in `components.py` Should Be Moved to `arbitor/kernel/`
### IN-01: Live Triton kernels reside in `components.py` instead of `arbitor/kernel/`
**File:** `arbitor/components.py:389-445`
**Issue:** The codebase convention places Triton kernels in `arbitor/kernel/` (e.g., `ternary_scale.py`, `flash_vq.py`, `ternary_audit.py`). Two live Triton kernels, plus the `_TritonVideoDenoiseFn` and `_video_denoise_step` definitions that accompany them, remain in `components.py`:
- `_triton_video_denoise_fwd_kernel` (line 389)
- `_triton_video_denoise_bwd_kernel` (line 402)
- `_TritonVideoDenoiseFn` (line 415)
- `_video_denoise_step` (line 448)
These should be extracted into `arbitor/kernel/video_denoise.py` and imported from there, following the pattern established by `ternary_scale.py` and `flash_vq.py`.
---
## INFO: Additional Observations
### IN-02: Hardcoded MoEGraph config values bypass config constants
**File:** `arbitor/components.py:1381-1383`
**Issue:** `MoEGraph` uses hardcoded values (`50`, `0.3`, `0.99`) instead of the imported config constants (`KG_REQUANT_EVERY`, `KG_TERNARY_THRESHOLD`, `KG_EMA_ALPHA`). The values happen to match the config, but any future config changes will silently be ignored.
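One way to close this gap is to make the config constants the default values of the constructor, so the literals appear in exactly one place. The sketch below is illustrative only: `GraphSettings` and its parameter names are hypothetical and are not the actual `MoEGraph` signature, and the constants are redefined locally with the values quoted above instead of being imported from `arbitor/config.py`.

```python
# Values mirror the ones quoted in IN-02; in the real code these would be
# imported from arbitor/config.py instead of being redefined here.
KG_REQUANT_EVERY = 50
KG_TERNARY_THRESHOLD = 0.3
KG_EMA_ALPHA = 0.99


class GraphSettings:
    """Hypothetical holder for the three settings that MoEGraph hardcodes."""

    def __init__(
        self,
        requant_every=KG_REQUANT_EVERY,
        ternary_threshold=KG_TERNARY_THRESHOLD,
        ema_alpha=KG_EMA_ALPHA,
    ):
        # Defaults come from the config constants, so a config edit changes
        # behaviour here without touching this class.
        self.requant_every = requant_every
        self.ternary_threshold = ternary_threshold
        self.ema_alpha = ema_alpha
```

Callers can still override a single value, for example `GraphSettings(ema_alpha=0.9)`. Note that Python binds default arguments when the function is defined, so the constants must hold their final values before this class is imported.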
---
### IN-03: `AUDIO_VOCAB` not used meaningfully in `config.py`
**File:** `arbitor/config.py:2`
**Issue:** `AUDIO_VOCAB=288` is imported and used in `TalkerHead` and `TinyNeuralCodec`, but the `SPECIAL_VOCAB` map (line 65) defines tokens up to 287. `AUDIO_VOCAB` = `VOCAB` = 288, meaning the audio head has the same vocabulary as the text head. This may be intentional for the current prototype but is worth flagging given `AUDIO_VOCAB` vs `VOCAB` are separate constants.
---
_Reviewed: 2026-05-20T00:00:00Z_
_Reviewer: gsd-code-reviewer (deep analysis)_