============================================================
  D=8 NaN Isolation
============================================================

[1] Loading model...
  [Auto-detect] Qwen3-Omni MoE thinker (30.5B total, ~3.3B active)
[FireEcho] Loading /run/media/echo/Echo/ECHO/training/Prototype Fireecho/model/Qwen3-Omni-30B-A3B-Instruct...
  [FireEcho] AutoConfig failed ('Qwen3OmniMoeTalkerCodePredictorConfig' object has no attribute 'use_sliding_window'), loading config.json directly
  Qwen3-Omni: will stream-load from 15 shards
  [Qwen3 Streaming] Loaded shard index: 28010 keys across 15 shards
  [Qwen3 Streaming] Building engine skeleton...
  [Qwen3 Streaming] Global params on GPU: 1.2 GB
    Layer 4/48: 393 weights, VRAM 2.8 GB, CPU 1.4 GB
    Layer 8/48: 393 weights, VRAM 4.3 GB, CPU 1.6 GB
    Layer 12/48: 393 weights, VRAM 5.8 GB, CPU 1.7 GB
    Layer 16/48: 393 weights, VRAM 7.4 GB, CPU 1.9 GB
    Layer 20/48: 393 weights, VRAM 8.9 GB, CPU 2.0 GB
    Layer 24/48: 393 weights, VRAM 10.4 GB, CPU 2.2 GB
    Layer 28/48: 393 weights, VRAM 11.9 GB, CPU 2.3 GB
    Layer 32/48: 393 weights, VRAM 13.5 GB, CPU 2.5 GB
    Layer 36/48: 393 weights, VRAM 15.0 GB, CPU 2.6 GB
    Layer 40/48: 393 weights, VRAM 16.5 GB, CPU 2.8 GB
    Layer 44/48: 393 weights, VRAM 18.0 GB, CPU 2.9 GB
    Layer 48/48: 393 weights, VRAM 19.6 GB, CPU 3.1 GB
  [Qwen3 Streaming] Final VRAM: 19.6 GB (FP4 quantized)
  [Qwen3 Streaming] Done: 1571.8M params, 18867 weights loaded
  Total params:     1.57B
  Frozen params:    1.54B (base model, FP4)
  Trainable params: 30.2M (Hebbian only)
  [Packed MoE] 48 layers packed (6144 experts → contiguous)
  [Flat KV] Enabled: 4096 tokens, 403 MB
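The shard-streaming load above reads the 15-shard index once, then moves weights to the GPU layer-by-layer instead of materializing the full state dict in host RAM. A minimal sketch of that grouping step, with hypothetical names (`shard_index`, the `model.layers.<i>.` key convention) standing in for the actual FireEcho loader internals:

```python
# Sketch: group checkpoint keys by layer so weights can be streamed to the
# device one layer at a time. The toy index below stands in for the real
# 28,010-key, 15-shard index; key naming convention is an assumption.
from collections import defaultdict

def group_keys_by_layer(shard_index):
    """shard_index: {param_name: shard_file}. Returns {layer: [(name, shard)]}."""
    per_layer = defaultdict(list)
    for name, shard in shard_index.items():
        if name.startswith("model.layers."):
            layer = int(name.split(".")[2])   # e.g. "model.layers.12.mlp..."
        else:
            layer = -1                        # globals: embeddings, norms, lm_head
        per_layer[layer].append((name, shard))
    return per_layer

# Toy stand-in for the real shard index JSON.
index = {
    "model.embed_tokens.weight": "shard-00001",
    "model.layers.0.self_attn.q_proj.weight": "shard-00001",
    "model.layers.0.mlp.gate.weight": "shard-00002",
    "model.layers.1.self_attn.q_proj.weight": "shard-00002",
}
grouped = group_keys_by_layer(index)
print(sorted(grouped))   # [-1, 0, 1]
print(len(grouped[0]))   # 2
```

Grouping first means each layer's weights can be read, quantized (FP4 in the run above), and freed before the next layer starts, which is what keeps the CPU-side footprint near 3 GB while VRAM grows to 19.6 GB.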

[2] Warmup...
  VRAM baseline: 19.96 GB

[3] Baseline (no eagle)...
  [baseline] OK — top=13048 ('Hi')

[4] D=2 eagle head...
  [EAGLE] Loaded legacy D=2 checkpoint. 0 new layer params initialized randomly.
  [EAGLE-3] Draft head: D=2, 104.9M params, 210 MB, capture layers [8, 24, 47] + Hebbian memory
  VRAM: 20.17 GB (+0.21)
  [D=2] OK — top=13048 ('Hi')
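The "legacy D=2 checkpoint ... N new layer params initialized randomly" messages throughout this run reflect a partial load: tensors present in the old checkpoint are copied, and any layers the deeper head adds keep their random init. A sketch of that pattern (dict-valued stand-ins for real tensors; `load_legacy` is a hypothetical name, mirroring torch's `load_state_dict(strict=False)` behavior):

```python
# Sketch: load a shallower legacy checkpoint into a deeper draft head.
# Matching keys are copied; keys only present in the new head stay at
# their random initialization and are counted, as in the log output.

def load_legacy(model_state, ckpt_state):
    """Copy overlapping keys in place; return names the checkpoint missed."""
    missing = []
    for name in model_state:
        if name in ckpt_state:
            model_state[name] = ckpt_state[name]   # reuse trained weight
        else:
            missing.append(name)                   # stays randomly initialized
    return missing

head = {f"layers.{i}.w": f"rand{i}" for i in range(8)}      # D=8 head
ckpt = {f"layers.{i}.w": f"trained{i}" for i in range(2)}   # legacy D=2 ckpt
new = load_legacy(head, ckpt)
print(len(new))             # 6 new layer params initialized randomly
print(head["layers.0.w"])   # trained0
```

This explains why sections [6], [8], and [9] report 54 and 18 newly initialized params: the deeper the head relative to D=2, the more tensors the old checkpoint cannot cover.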

[5] D=8 eagle head (random init, no checkpoint)...
  [FE-XT] Draft head: D=8, 356.5M params, 713 MB, capture layers [8, 24, 47] + Hebbian memory
  VRAM: 20.67 GB (+0.72)
  [D=8 random] OK — top=13048 ('Hi')

[6] D=8 eagle head (with checkpoint)...
  [EAGLE] Loaded legacy D=2 checkpoint. 54 new layer params initialized randomly.
  [FE-XT] Draft head: D=8, 356.5M params, 713 MB, capture layers [8, 24, 47] + Hebbian memory
  VRAM: 20.67 GB (+0.72)
  [D=8 with ckpt] OK — top=13048 ('Hi')

[7] D=8 eagle head (allocated, NOT registered on engine)...
  VRAM: 20.67 GB (+0.72)
  [D=8 unregistered] OK — top=13048 ('Hi')

[8] D=4 eagle head (checkpoint)...
  [EAGLE] Loaded legacy D=2 checkpoint. 18 new layer params initialized randomly.
  [FE-XT] Draft head: D=4, 188.8M params, 378 MB, capture layers [8, 24, 47] + Hebbian memory
  VRAM: 20.34 GB (+0.38)
  [D=4] OK — top=13048 ('Hi')

[9] D=8 eagle head, but _eagle_enabled=False...
  [EAGLE] Loaded legacy D=2 checkpoint. 54 new layer params initialized randomly.
  [FE-XT] Draft head: D=8, 356.5M params, 713 MB, capture layers [8, 24, 47] + Hebbian memory
  VRAM: 20.67 GB (+0.72)
  [D=8 flag OFF] OK — top=13048 ('Hi')

============================================================
  RESULTS
============================================================
  D=8 random:       OK
  D=8 with ckpt:    OK
  D=8 unregistered: OK
  D=4:              OK
  D=8 flag OFF:     OK
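The overall shape of this isolation run is: build each configuration, do one forward pass, and flag the config only if its output goes non-finite. A minimal sketch of that harness loop, with hypothetical `configs`/`check` names and toy logits (the real run above checks the decoded top token instead):

```python
# Sketch: NaN-isolation matrix. Each config passes if its logits are
# finite; any NaN/Inf pins the failure to that config's delta.
import math

def check(logits):
    """A config passes if its logits are NaN/Inf-free."""
    return all(math.isfinite(x) for x in logits)

configs = {                       # toy logits standing in for real outputs
    "D=8 random":    [0.1, 2.3, -1.0],
    "D=8 with ckpt": [0.2, 1.9, -0.5],
    "D=8 flag OFF":  [0.0, 1.1,  0.7],
}
results = {name: ("OK" if check(l) else "NaN") for name, l in configs.items()}
print(all(v == "OK" for v in results.values()))  # True
```

Since every variant here (random init, checkpoint, unregistered, D=4, flag off) reports OK with the same top token, the NaN being isolated does not reproduce in any of these single-forward configurations.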