Commit 094c5fd (verified) by AbstractPhil · 1 parent: ad9e476

Create cifar10_proto_diffuser_output.txt

Files changed (1)
  1. cifar10_proto_diffuser_output.txt (added, +149 -0)
@@ -0,0 +1,149 @@
+ ======================================================================
+ FLOW MATCHING + CONSTELLATION RELAY REGULATOR
+ Dataset: CIFAR-10
+ Base channels: 64
+ Relay: True
+ Flow matching: ODE (conditional)
+ Sampler: Euler, 50 steps
+ Device: cuda
+ ======================================================================
+ Train: 50,000 images
+ Total params: 6,746,403
+ Relay params: 76,384 (1.1%)
+ Relay modules: 2
+
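The header names an Euler sampler with 50 steps over a flow-matching ODE. The training script itself is not part of this commit, so the following is only a minimal sketch of what fixed-step Euler integration of a velocity field looks like. The function names `euler_sample` and `toy_velocity`, the time direction (t=0 to t=1), and the toy velocity field are all assumptions made for illustration; the real model would supply a learned, class-conditioned network in place of `toy_velocity`.

```python
def euler_sample(velocity, x0, num_steps=50):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with fixed-step Euler.

    `velocity` is any callable (x, t) -> list of floats. In the real run it
    would be the trained network; here it is a stand-in.
    """
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)
        # Euler update: x_{k+1} = x_k + dt * v(x_k, t_k)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


def toy_velocity(target):
    """Hypothetical velocity field that pulls x toward a fixed target.

    This is the conditional velocity of a straight-line path ending at
    `target`; it is used only so the sketch can be checked end to end.
    """
    def v(x, t):
        return [(ti - xi) / (1.0 - t) for xi, ti in zip(x, target)]
    return v


start = [0.5, -1.25, 2.0]    # stands in for a noise sample
target = [1.0, 0.0, -1.0]    # stands in for a data point
result = euler_sample(toy_velocity(target), start, num_steps=50)
```

With this toy field the 50 Euler steps land on `target` up to floating-point error, which makes the integrator easy to sanity-check before swapping in a learned model.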
+ ======================================================================
+ TRAINING - 50 epochs
+ ======================================================================
+ E 1/50: 100%|██████████| 390/390 [00:11<00:00, 34.46b/s, loss=0.3728, lr=3.0e-04]
+ E 1: loss=0.3695 lr=3.0e-04 (11s) ★
+ → Saved samples/epoch_001.png
+ E 2/50: 100%|██████████| 390/390 [00:10<00:00, 36.22b/s, loss=0.2382, lr=3.0e-04]
+ E 2: loss=0.2379 lr=3.0e-04 (11s) ★
+ E 3/50: 100%|██████████| 390/390 [00:10<00:00, 36.74b/s, loss=0.2233, lr=3.0e-04]
+ E 3: loss=0.2230 lr=3.0e-04 (11s) ★
+ E 4/50: 100%|██████████| 390/390 [00:10<00:00, 36.82b/s, loss=0.2147, lr=3.0e-04]
+ E 4: loss=0.2145 lr=3.0e-04 (11s) ★
+ E 5/50: 100%|██████████| 390/390 [00:10<00:00, 37.00b/s, loss=0.2094, lr=2.9e-04]
+ E 5: loss=0.2093 lr=2.9e-04 (11s) ★
+ → Saved samples/epoch_005.png
+ E 6/50: 100%|██████████| 390/390 [00:10<00:00, 36.99b/s, loss=0.2050, lr=2.9e-04]
+ E 6: loss=0.2049 lr=2.9e-04 (11s) ★
+ E 7/50: 100%|██████████| 390/390 [00:10<00:00, 36.87b/s, loss=0.2010, lr=2.9e-04]
+ E 7: loss=0.2009 lr=2.9e-04 (11s) ★
+ E 8/50: 100%|██████████| 390/390 [00:10<00:00, 36.73b/s, loss=0.1984, lr=2.8e-04]
+ E 8: loss=0.1983 lr=2.8e-04 (11s) ★
+ E 9/50: 100%|██████████| 390/390 [00:10<00:00, 36.68b/s, loss=0.1966, lr=2.8e-04]
+ E 9: loss=0.1967 lr=2.8e-04 (11s) ★
+ E 10/50: 100%|██████████| 390/390 [00:10<00:00, 36.75b/s, loss=0.1950, lr=2.7e-04]
+ E 10: loss=0.1951 lr=2.7e-04 (11s) ★
+ → Saved samples/epoch_010.png
+ Relay diagnostics:
+ mid_block1.relay: drift=0.0382 rad (2.2°) gate=0.0519
+ mid_block2.relay: drift=0.0548 rad (3.1°) gate=0.0548
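Each relay diagnostics line reports drift twice, in radians and in degrees. The sketch below parses the two epoch-10 lines and checks that the two figures agree up to rounding. The regular expression and the helper names `parse_relay_line` and `degrees_consistent` are illustrative assumptions, not part of the original script; what "drift" and "gate" measure inside the relay module is not stated in the log and is not inferred here.

```python
import math
import re

# Pattern for lines such as:
#   mid_block1.relay: drift=0.0382 rad (2.2°) gate=0.0519
RELAY_RE = re.compile(
    r"(?P<name>[\w.]+): drift=(?P<rad>[\d.]+) rad "
    r"\((?P<deg>[\d.]+)°\) gate=(?P<gate>[\d.]+)"
)


def parse_relay_line(line):
    """Return a dict of the logged values, or None if the line does not match."""
    m = RELAY_RE.search(line)
    if m is None:
        return None
    return {
        "name": m.group("name"),
        "drift_rad": float(m.group("rad")),
        "drift_deg": float(m.group("deg")),
        "gate": float(m.group("gate")),
    }


def degrees_consistent(entry, tol=0.06):
    """Check the logged degrees against the logged radians, allowing for rounding."""
    return abs(math.degrees(entry["drift_rad"]) - entry["drift_deg"]) <= tol


epoch_10 = [
    "mid_block1.relay: drift=0.0382 rad (2.2°) gate=0.0519",
    "mid_block2.relay: drift=0.0548 rad (3.1°) gate=0.0548",
]
parsed = [parse_relay_line(line) for line in epoch_10]
all_ok = all(degrees_consistent(p) for p in parsed)
```

The tolerance of 0.06 degrees covers the one-decimal rounding of the degree value plus the four-decimal rounding of the radian value.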
+ E 11/50: 100%|██████████| 390/390 [00:10<00:00, 36.42b/s, loss=0.1947, lr=2.7e-04]
+ E 11: loss=0.1946 lr=2.7e-04 (11s) ★
+ E 12/50: 100%|██████████| 390/390 [00:10<00:00, 36.88b/s, loss=0.1923, lr=2.6e-04]
+ E 12: loss=0.1923 lr=2.6e-04 (11s) ★
+ E 13/50: 100%|██████████| 390/390 [00:10<00:00, 36.86b/s, loss=0.1910, lr=2.5e-04]
+ E 13: loss=0.1909 lr=2.5e-04 (11s) ★
+ E 14/50: 100%|██████████| 390/390 [00:10<00:00, 36.50b/s, loss=0.1907, lr=2.5e-04]
+ E 14: loss=0.1907 lr=2.5e-04 (11s) ★
+ E 15/50: 100%|██████████| 390/390 [00:10<00:00, 36.65b/s, loss=0.1901, lr=2.4e-04]
+ E 15: loss=0.1901 lr=2.4e-04 (11s) ★
+ → Saved samples/epoch_015.png
+ E 16/50: 100%|██████████| 390/390 [00:10<00:00, 36.56b/s, loss=0.1894, lr=2.3e-04]
+ E 16: loss=0.1893 lr=2.3e-04 (11s) ★
+ E 17/50: 100%|██████████| 390/390 [00:10<00:00, 36.26b/s, loss=0.1881, lr=2.2e-04]
+ E 17: loss=0.1880 lr=2.2e-04 (11s) ★
+ E 18/50: 100%|██████████| 390/390 [00:10<00:00, 36.85b/s, loss=0.1883, lr=2.1e-04]
+ E 18: loss=0.1883 lr=2.1e-04 (11s)
+ E 19/50: 100%|██████████| 390/390 [00:10<00:00, 36.58b/s, loss=0.1875, lr=2.1e-04]
+ E 19: loss=0.1874 lr=2.1e-04 (11s) ★
+ E 20/50: 100%|██████████| 390/390 [00:10<00:00, 36.81b/s, loss=0.1869, lr=2.0e-04]
+ E 20: loss=0.1870 lr=2.0e-04 (11s) ★
+ → Saved samples/epoch_020.png
+ Relay diagnostics:
+ mid_block1.relay: drift=0.0703 rad (4.0°) gate=0.0561
+ mid_block2.relay: drift=0.0938 rad (5.4°) gate=0.0618
+ E 21/50: 100%|██████████| 390/390 [00:10<00:00, 36.78b/s, loss=0.1853, lr=1.9e-04]
+ E 21: loss=0.1853 lr=1.9e-04 (11s) ★
+ E 22/50: 100%|██████████| 390/390 [00:10<00:00, 36.75b/s, loss=0.1864, lr=1.8e-04]
+ E 22: loss=0.1864 lr=1.8e-04 (11s)
+ E 23/50: 100%|██████████| 390/390 [00:10<00:00, 36.73b/s, loss=0.1851, lr=1.7e-04]
+ E 23: loss=0.1851 lr=1.7e-04 (11s) ★
+ E 24/50: 100%|██████████| 390/390 [00:10<00:00, 36.72b/s, loss=0.1849, lr=1.6e-04]
+ E 24: loss=0.1849 lr=1.6e-04 (11s) ★
+ E 25/50: 100%|██████████| 390/390 [00:10<00:00, 36.78b/s, loss=0.1850, lr=1.5e-04]
+ E 25: loss=0.1849 lr=1.5e-04 (11s) ★
+ → Saved samples/epoch_025.png
+ E 26/50: 100%|██████████| 390/390 [00:10<00:00, 36.78b/s, loss=0.1851, lr=1.4e-04]
+ E 26: loss=0.1848 lr=1.4e-04 (11s) ★
+ E 27/50: 100%|██████████| 390/390 [00:10<00:00, 36.85b/s, loss=0.1835, lr=1.3e-04]
+ E 27: loss=0.1833 lr=1.3e-04 (11s) ★
+ E 28/50: 100%|██████████| 390/390 [00:10<00:00, 36.86b/s, loss=0.1840, lr=1.2e-04]
+ E 28: loss=0.1839 lr=1.2e-04 (11s)
+ E 29/50: 100%|██████████| 390/390 [00:10<00:00, 36.88b/s, loss=0.1837, lr=1.1e-04]
+ E 29: loss=0.1837 lr=1.1e-04 (11s)
+ E 30/50: 100%|██████████| 390/390 [00:10<00:00, 36.47b/s, loss=0.1823, lr=1.0e-04]
+ E 30: loss=0.1822 lr=1.0e-04 (11s) ★
+ → Saved samples/epoch_030.png
+ Relay diagnostics:
+ mid_block1.relay: drift=0.0918 rad (5.3°) gate=0.0586
+ mid_block2.relay: drift=0.1132 rad (6.5°) gate=0.0649
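The logged learning rate falls from 3.0e-04 toward 1.0e-06 along a curve that looks like cosine annealing. The log does not name the scheduler, so the sketch below is a guess: it assumes a per-step cosine schedule with a base rate of 3.0e-4, a floor of 1.0e-6, and 390 steps per epoch over 50 epochs, and that each epoch summary reports the rate after that epoch's last step. The names `cosine_lr` and `lr_after_epoch` are hypothetical.

```python
import math

STEPS_PER_EPOCH = 390   # from the progress bars
EPOCHS = 50             # from the training header
BASE_LR = 3.0e-4        # assumed from the first logged value
MIN_LR = 1.0e-6         # assumed from the last logged value


def cosine_lr(step, total_steps, base_lr=BASE_LR, min_lr=MIN_LR):
    """Cosine annealing from base_lr (step 0) down to min_lr (final step)."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def lr_after_epoch(epoch):
    """Learning rate after the last optimizer step of the given 1-based epoch."""
    total = STEPS_PER_EPOCH * EPOCHS
    return cosine_lr(epoch * STEPS_PER_EPOCH, total)


# Format the same way the log does (one decimal, scientific notation).
predicted = {e: f"{lr_after_epoch(e):.1e}" for e in (1, 10, 20, 25, 30)}
```

Under these assumptions the sketch gives 3.0e-04, 2.7e-04, 2.0e-04, 1.5e-04 and 1.0e-04 for epochs 1, 10, 20, 25 and 30, which matches the epoch summary lines in the log. A per-step schedule would also explain why a progress bar occasionally shows a slightly higher rate than the summary line that follows it.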
+ E 31/50: 100%|██████████| 390/390 [00:10<00:00, 36.72b/s, loss=0.1823, lr=9.6e-05]
+ E 31: loss=0.1823 lr=9.5e-05 (11s)
+ E 32/50: 100%|██████████| 390/390 [00:10<00:00, 36.64b/s, loss=0.1823, lr=8.7e-05]
+ E 32: loss=0.1823 lr=8.7e-05 (11s)
+ E 33/50: 100%|██████████| 390/390 [00:10<00:00, 36.85b/s, loss=0.1816, lr=7.9e-05]
+ E 33: loss=0.1816 lr=7.8e-05 (11s) ★
+ E 34/50: 100%|██████████| 390/390 [00:10<00:00, 36.73b/s, loss=0.1809, lr=7.1e-05]
+ E 34: loss=0.1809 lr=7.0e-05 (11s) ★
+ E 35/50: 100%|██████████| 390/390 [00:10<00:00, 36.80b/s, loss=0.1810, lr=6.3e-05]
+ E 35: loss=0.1810 lr=6.3e-05 (11s)
+ → Saved samples/epoch_035.png
+ E 36/50: 100%|██████████| 390/390 [00:10<00:00, 36.77b/s, loss=0.1819, lr=5.5e-05]
+ E 36: loss=0.1819 lr=5.5e-05 (11s)
+ E 37/50: 100%|██████████| 390/390 [00:10<00:00, 36.79b/s, loss=0.1812, lr=4.8e-05]
+ E 37: loss=0.1813 lr=4.8e-05 (11s)
+ E 38/50: 100%|██████████| 390/390 [00:10<00:00, 36.86b/s, loss=0.1808, lr=4.2e-05]
+ E 38: loss=0.1808 lr=4.2e-05 (11s) ★
+ E 39/50: 100%|██████████| 390/390 [00:10<00:00, 36.73b/s, loss=0.1815, lr=3.5e-05]
+ E 39: loss=0.1814 lr=3.5e-05 (11s)
+ E 40/50: 100%|██████████| 390/390 [00:10<00:00, 36.71b/s, loss=0.1800, lr=3.0e-05]
+ E 40: loss=0.1800 lr=3.0e-05 (11s) ★
+ → Saved samples/epoch_040.png
+ Relay diagnostics:
+ mid_block1.relay: drift=0.0964 rad (5.5°) gate=0.0593
+ mid_block2.relay: drift=0.1163 rad (6.7°) gate=0.0657
+ E 41/50: 100%|██████████| 390/390 [00:10<00:00, 36.53b/s, loss=0.1803, lr=2.4e-05]
+ E 41: loss=0.1803 lr=2.4e-05 (11s)
+ E 42/50: 100%|██████████| 390/390 [00:10<00:00, 36.30b/s, loss=0.1801, lr=2.0e-05]
+ E 42: loss=0.1801 lr=1.9e-05 (11s)
+ E 43/50: 100%|██████████| 390/390 [00:10<00:00, 36.80b/s, loss=0.1800, lr=1.5e-05]
+ E 43: loss=0.1799 lr=1.5e-05 (11s) ★
+ E 44/50: 100%|██████████| 390/390 [00:10<00:00, 36.84b/s, loss=0.1801, lr=1.2e-05]
+ E 44: loss=0.1799 lr=1.1e-05 (11s)
+ E 45/50: 100%|██████████| 390/390 [00:10<00:00, 36.48b/s, loss=0.1800, lr=8.4e-06]
+ E 45: loss=0.1799 lr=8.3e-06 (11s) ★
+ → Saved samples/epoch_045.png
+ E 46/50: 100%|██████████| 390/390 [00:10<00:00, 36.62b/s, loss=0.1803, lr=5.8e-06]
+ E 46: loss=0.1805 lr=5.7e-06 (11s)
+ E 47/50: 100%|██████████| 390/390 [00:10<00:00, 36.73b/s, loss=0.1803, lr=3.7e-06]
+ E 47: loss=0.1803 lr=3.6e-06 (11s)
+ E 48/50: 100%|██████████| 390/390 [00:10<00:00, 36.84b/s, loss=0.1791, lr=2.2e-06]
+ E 48: loss=0.1793 lr=2.2e-06 (11s) ★
+ E 49/50: 100%|██████████| 390/390 [00:10<00:00, 36.45b/s, loss=0.1796, lr=1.3e-06]
+ E 49: loss=0.1796 lr=1.3e-06 (11s)
+ E 50/50: 100%|██████████| 390/390 [00:10<00:00, 36.74b/s, loss=0.1797, lr=1.0e-06]
+ E 50: loss=0.1797 lr=1.0e-06 (11s)
+ → Saved samples/epoch_050.png
+ Relay diagnostics:
+ mid_block1.relay: drift=0.0968 rad (5.5°) gate=0.0594
+ mid_block2.relay: drift=0.1164 rad (6.7°) gate=0.0658
+
+ ======================================================================
+ DONE - Best loss: 0.1793
+ Params: 6,746,403 (relay: 76,384)
+ Samples in: samples/
+ ======================================================================
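Two figures in the log can be cross-checked with simple arithmetic: the 390 batches per epoch against the 50,000 training images, and the 1.1% relay share against the two parameter counts. The batch size is not printed anywhere, so the sketch below searches for integer batch sizes that would give exactly 390 batches. The helper names and the `drop_last` flag (borrowed from the usual data-loader convention of discarding a final partial batch) are assumptions for illustration.

```python
def batches_per_epoch(num_images, batch_size, drop_last):
    """Number of batches one pass over the data yields."""
    full, remainder = divmod(num_images, batch_size)
    if drop_last or remainder == 0:
        return full
    return full + 1


def matching_batch_configs(num_images, observed_batches, max_batch=4096):
    """All (batch_size, drop_last) pairs that reproduce the observed batch count."""
    return [
        (b, drop_last)
        for b in range(1, max_batch + 1)
        for drop_last in (True, False)
        if batches_per_epoch(num_images, b, drop_last) == observed_batches
    ]


configs = matching_batch_configs(50_000, 390)

# Relay share of the total parameter count, as a percentage.
relay_share = 100.0 * 76_384 / 6_746_403
```

Under this search the only match is a batch size of 128 with the final partial batch dropped, and the relay share comes out at about 1.13%, consistent with the 1.1% printed in the header.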