| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>GPT-OSS vs MegaBlocks MoE Comparison</title> | |
| <script> | |
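| // Apply theme immediately to prevent flicker | |
| (function() { | |
| const configTheme = 'dark'; | |
| let theme; | |
| if (configTheme === 'auto') { | |
| // Follow the OS-level color scheme preference | |
| theme = window.matchMedia('(prefers-color-scheme: dark)').matches ? 'dark' : 'light'; | |
| } else { | |
| // Prefer a previously saved choice; storage access can throw in sandboxed iframes | |
| try { | |
| theme = localStorage.getItem('uvnote-theme') || configTheme; | |
| } catch (e) { | |
| theme = configTheme; | |
| } | |
| } | |
| document.documentElement.setAttribute('data-theme', theme); | |
| })(); | |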
| </script> | |
| <style> | |
| :root[data-theme="light"] { | |
| --bg-primary: #ffffff; | |
| --bg-secondary: #f6f8fa; | |
| --bg-tertiary: #f8f9fa; | |
| --bg-code: #f8f9fa; | |
| --bg-error: #fdf2f2; | |
| --bg-artifact: #e6f3ff; | |
| --bg-artifact-hover: #d0e7ff; | |
| --text-primary: #333; | |
| --text-secondary: #656d76; | |
| --text-error: #c53030; | |
| --text-link: #0969da; | |
| --border-primary: #e1e5e9; | |
| --border-error: #e53e3e; | |
| --border-cell-failed: #d73a49; | |
| --shadow: rgba(0, 0, 0, 0.1); | |
| } | |
| :root[data-theme="dark"] { | |
| --bg-primary: #0a0a0a; | |
| --bg-secondary: #121212; | |
| --bg-tertiary: #181818; | |
| --bg-code: #0d0d0d; | |
| --bg-error: #1a0f0f; | |
| --bg-artifact: #151515; | |
| --bg-artifact-hover: #1a1a1a; | |
| --text-primary: #e0e0e0; | |
| --text-secondary: #888888; | |
| --text-error: #ff6b6b; | |
| --text-link: #64b5f6; | |
| --border-primary: #2a2a2a; | |
| --border-error: #ff6b6b; | |
| --border-cell-failed: #ff6b6b; | |
| --shadow: rgba(255, 255, 255, 0.05); | |
| } | |
| html { | |
| overscroll-behavior: none; | |
| } | |
| body { | |
| font-family: 'Cascadia Mono', 'Cascadia Code', 'JetBrains Mono', 'SF Mono', Monaco, 'Consolas', monospace; | |
| line-height: 1.4; | |
| max-width: 1000px; | |
| margin: 0 auto; | |
| padding: 15px; | |
| color: var(--text-primary); | |
| background: var(--bg-primary); | |
| transition: background-color 0.2s ease, color 0.2s ease; | |
| overscroll-behavior: none; | |
| } | |
| /* Two panel layout removed */ | |
| .controls { | |
| position: fixed; | |
| top: 20px; | |
| right: 20px; | |
| display: flex; | |
| gap: 0.5rem; | |
| z-index: 1000; | |
| } | |
| .menu-button { | |
| position: relative; | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| padding: 8px 12px; | |
| border-radius: 4px; | |
| color: var(--text-secondary); | |
| cursor: pointer; | |
| font-family: inherit; | |
| font-size: 0.9rem; | |
| user-select: none; | |
| } | |
| .menu-button:hover { | |
| color: var(--text-primary); | |
| background: var(--bg-tertiary); | |
| } | |
| .menu-dropdown { | |
| position: absolute; | |
| top: 100%; | |
| right: 0; | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 4px; | |
| box-shadow: 0 4px 12px var(--shadow); | |
| min-width: 160px; | |
| opacity: 0; | |
| visibility: hidden; | |
| transform: translateY(-8px); | |
| transition: all 0.2s ease; | |
| z-index: 1001; | |
| margin-top: 4px; | |
| } | |
| .menu-button.active .menu-dropdown { | |
| opacity: 1; | |
| visibility: visible; | |
| transform: translateY(0); | |
| } | |
| .menu-item { | |
| display: block; | |
| padding: 8px 12px; | |
| color: var(--text-secondary); | |
| text-decoration: none; | |
| font-size: 0.85rem; | |
| border-bottom: 1px solid var(--border-primary); | |
| cursor: pointer; | |
| } | |
| .menu-item:last-child { | |
| border-bottom: none; | |
| } | |
| .menu-item:hover { | |
| background: var(--bg-tertiary); | |
| color: var(--text-primary); | |
| } | |
| .menu-checkbox { | |
| display: inline-block; | |
| width: 16px; | |
| font-family: monospace; | |
| color: var(--text-link); | |
| } | |
| .system-info { | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 4px; | |
| padding: 8px 12px; | |
| margin-bottom: 16px; | |
| font-size: 0.85em; | |
| color: var(--text-secondary); | |
| } | |
| .system-info-header { | |
| font-weight: 600; | |
| color: var(--text-primary); | |
| margin-bottom: 2px; | |
| } | |
| .system-info-content { | |
| font-family: monospace; | |
| } | |
| .theme-toggle, .reset-toggle { | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 2px; | |
| padding: 0.4rem 0.6rem; | |
| cursor: pointer; | |
| font-family: inherit; | |
| font-size: 0.8rem; | |
| color: var(--text-secondary); | |
| user-select: none; | |
| transition: all 0.2s ease; | |
| text-transform: lowercase; | |
| letter-spacing: 0; | |
| } | |
| .theme-toggle:hover, .reset-toggle:hover { | |
| background: var(--bg-tertiary); | |
| border-color: var(--text-secondary); | |
| color: var(--text-primary); | |
| } | |
| .minimap { | |
| position: fixed; | |
| bottom: 20px; | |
| right: 20px; | |
| width: 220px; | |
| max-height: 400px; | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 2px; | |
| padding: 0.5rem; | |
| font-size: 0.7rem; | |
| overflow-y: auto; | |
| z-index: 100; | |
| opacity: 0.9; | |
| transition: opacity 0.2s ease; | |
| } | |
| .file-explorer { | |
| position: fixed; | |
| bottom: 20px; /* default; JS will stack */ | |
| right: 20px; | |
| left: auto; | |
| top: auto; | |
| width: 220px; | |
| max-height: 400px; | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 2px; | |
| padding: 0.5rem; | |
| font-size: 0.7rem; | |
| overflow-y: auto; | |
| z-index: 100; | |
| opacity: 0.9; | |
| transition: opacity 0.2s ease; | |
| } | |
| /* Drawing overlay */ | |
| .draw-overlay { | |
| position: fixed; | |
| top: 0; | |
| left: 0; | |
| width: 100vw; | |
| height: 100vh; | |
| z-index: 80; /* under widgets (100) and controls (1000) */ | |
| display: block; | |
| pointer-events: none; /* enabled only when a tool is active */ | |
| } | |
| /* Tools widget */ | |
| .tools-widget { | |
| position: fixed; | |
| bottom: 20px; /* default; JS will stack */ | |
| right: 20px; | |
| left: auto; | |
| top: auto; | |
| width: 220px; | |
| background: var(--bg-secondary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 2px; | |
| padding: 0.5rem; | |
| font-size: 0.7rem; | |
| z-index: 100; | |
| opacity: 0.95; | |
| } | |
| .tools-title { | |
| font-weight: bold; | |
| color: var(--text-secondary); | |
| margin-bottom: 0.5rem; | |
| padding-bottom: 0.25rem; | |
| border-bottom: 1px solid var(--border-primary); | |
| cursor: grab; | |
| user-select: none; | |
| } | |
| .tools-row { display: flex; gap: 0.4rem; flex-wrap: wrap; } | |
| .tool-button { | |
| background: var(--bg-tertiary); | |
| border: 1px solid var(--border-primary); | |
| border-radius: 2px; | |
| padding: 0.25rem 0.4rem; | |
| cursor: pointer; | |
| color: var(--text-secondary); | |
| font-family: inherit; | |
| font-size: 0.75rem; | |
| user-select: none; | |
| } | |
| .tool-button:hover { color: var(--text-primary); } | |
| .tool-button.active { color: var(--text-primary); border-color: var(--text-secondary); background: var(--bg-secondary); } | |
| .minimap:hover, .file-explorer:hover { | |
| opacity: 1; | |
| } | |
| .minimap-title { | |
| font-weight: bold; | |
| color: var(--text-secondary); | |
| margin-bottom: 0.5rem; | |
| padding-bottom: 0.25rem; | |
| border-bottom: 1px solid var(--border-primary); | |
| cursor: grab; /* drag handle */ | |
| user-select: none; | |
| } | |
| .minimap-item { | |
| display: block; | |
| color: var(--text-secondary); | |
| text-decoration: none; | |
| padding: 0.15rem 0; | |
| border-left: 2px solid transparent; | |
| padding-left: 0.5rem; | |
| transition: all 0.2s ease; | |
| cursor: pointer; | |
| } | |
| .minimap-item:hover { | |
| color: var(--text-primary); | |
| border-left-color: var(--text-secondary); | |
| } | |
| .minimap-item.active { | |
| color: var(--text-primary); | |
| border-left-color: var(--text-link); | |
| } | |
| .minimap-heading { | |
| font-weight: normal; | |
| } | |
| .minimap-heading.h1 { padding-left: 0.5rem; } | |
| .minimap-heading.h2 { padding-left: 1rem; } | |
| .minimap-heading.h3 { padding-left: 1.5rem; } | |
| .minimap-heading.h4 { padding-left: 2rem; } | |
| .minimap-heading.h5 { padding-left: 2.5rem; } | |
| .minimap-heading.h6 { padding-left: 3rem; } | |
| .minimap-cell { | |
| color: var(--text-link); | |
| opacity: 0.8; | |
| font-style: italic; | |
| } | |
| .minimap-cell:hover { | |
| opacity: 1; | |
| } | |
| .file-explorer-title { | |
| font-weight: bold; | |
| color: var(--text-secondary); | |
| margin-bottom: 0.5rem; | |
| padding-bottom: 0.25rem; | |
| border-bottom: 1px solid var(--border-primary); | |
| cursor: grab; /* drag handle */ | |
| user-select: none; | |
| } | |
| .file-explorer-section { | |
| margin-bottom: 0.75rem; | |
| } | |
| .file-explorer-section-title { | |
| font-weight: bold; | |
| color: var(--text-secondary); | |
| font-size: 0.65rem; | |
| margin-bottom: 0.25rem; | |
| text-transform: uppercase; | |
| letter-spacing: 0.5px; | |
| } | |
| .file-explorer-item { | |
| display: block; | |
| color: var(--text-secondary); | |
| text-decoration: none; | |
| padding: 0.1rem 0; | |
| margin-left: 0.5rem; | |
| transition: color 0.2s ease; | |
| cursor: pointer; | |
| font-family: monospace; | |
| } | |
| .file-explorer-item:hover { | |
| color: var(--text-primary); | |
| } | |
| .file-explorer-item.script { | |
| color: var(--text-link); | |
| } | |
| .file-explorer-item.artifact { | |
| color: var(--text-secondary); | |
| opacity: 0.8; | |
| } | |
| /* Hide widgets on smaller screens */ | |
| @media (max-width: 768px) { | |
| .minimap, .file-explorer, .tools-widget { | |
| display: none; | |
| } | |
| } | |
| .cell { | |
| margin: 1rem 0; | |
| border: 1px solid var(--border-primary); | |
| border-radius: 2px; | |
| overflow: hidden; | |
| background: var(--bg-secondary); | |
| } | |
| .cell-header { | |
| background: var(--bg-secondary); | |
| padding: 0.5rem 1rem; | |
| border-bottom: 1px solid var(--border-primary); | |
| font-family: inherit; | |
| font-size: 0.85rem; | |
| color: var(--text-secondary); | |
| cursor: pointer; | |
| user-select: none; | |
| transition: background-color 0.2s ease; | |
| } | |
| .cell-header:hover { | |
| background: var(--bg-tertiary); | |
| } | |
| .collapse-indicators { | |
| color: var(--text-secondary); | |
| font-size: 0.8rem; | |
| opacity: 0.7; | |
| } | |
| .collapse-indicators span:hover { | |
| color: var(--text-primary); | |
| opacity: 1; | |
| } | |
| .cell-code { | |
| display: block; | |
| background: var(--bg-code); | |
| } | |
| .cell-code.collapsed { | |
| display: none; | |
| } | |
| .cell-code pre { | |
| margin: 0; | |
| padding: 0.75rem; | |
| background: var(--bg-code); | |
| overflow-x: auto; | |
| color: var(--text-primary); | |
| } | |
| .cell-output { | |
| padding: 0.75rem; | |
| background: var(--bg-primary); | |
| } | |
| .cell-output.collapsed { | |
| display: none; | |
| } | |
| .cell-stdout { | |
| background: var(--bg-tertiary); | |
| padding: 0.75rem; | |
| border-radius: 1px; | |
| margin: 0.25rem 0; | |
| font-family: inherit; | |
| font-size: 0.9rem; | |
| white-space: pre-wrap; | |
| color: var(--text-primary); | |
| } | |
| .cell-stderr { | |
| background: var(--bg-error); | |
| border-left: 2px solid var(--border-error); | |
| padding: 1rem; | |
| margin: 0.5rem 0; | |
| font-family: inherit; | |
| font-size: 0.9rem; | |
| color: var(--text-error); | |
| white-space: pre-wrap; | |
| } | |
| .uv-install-logs { | |
| margin: 0.5rem 0; | |
| } | |
| .uv-logs-header { | |
| cursor: pointer; | |
| padding: 0.75rem; | |
| border-left: 3px solid var(--border-primary); | |
| font-family: inherit; | |
| font-size: 0.85rem; | |
| color: var(--text-secondary); | |
| user-select: none; | |
| } | |
| .uv-logs-content { | |
| background: var(--bg-secondary); | |
| padding: 1rem; | |
| border-left: 3px solid var(--border-primary); | |
| white-space: pre-wrap; | |
| font-family: monospace; | |
| font-size: 0.85rem; | |
| color: var(--text-secondary); | |
| overflow-x: auto; | |
| } | |
| .cell-artifacts { | |
| margin: 1rem 0; | |
| } | |
| .cell-artifacts h4 { | |
| margin: 0 0 0.5rem 0; | |
| color: var(--text-secondary); | |
| font-size: 0.9rem; | |
| } | |
| .artifact { | |
| display: inline-block; | |
| background: var(--bg-artifact); | |
| padding: 0.25rem 0.5rem; | |
| border-radius: 1px; | |
| margin: 0.25rem 0.5rem 0.25rem 0; | |
| font-family: inherit; | |
| font-size: 0.8rem; | |
| color: var(--text-link); | |
| text-decoration: none; | |
| transition: background-color 0.2s ease; | |
| border: 1px solid var(--border-primary); | |
| } | |
| .artifact:hover { | |
| background: var(--bg-artifact-hover); | |
| } | |
| .artifact-preview { | |
| margin-top: 1rem; | |
| } | |
| .artifact-preview img { | |
| max-width: 100%; | |
| height: auto; | |
| border: 1px solid var(--border-primary); | |
| border-radius: 1px; | |
| } | |
| .artifact-preview svg { | |
| max-width: 100%; | |
| height: auto; | |
| border: 1px solid var(--border-primary); | |
| border-radius: 1px; | |
| display: block; | |
| } | |
| /* Style SVG text elements */ | |
| .artifact-preview svg g { | |
| fill: var(--text-primary); | |
| } | |
| /* Auto-theme SVG elements */ | |
| .artifact-preview svg { | |
| background: transparent; | |
| } | |
| .cell-failed { | |
| border-color: var(--border-cell-failed); | |
| } | |
| .cell-failed .cell-header { | |
| background: var(--bg-error); | |
| color: var(--text-error); | |
| } | |
| .cell-commented { | |
| opacity: 0.6; | |
| border-style: dashed; | |
| } | |
| .cell-commented .cell-header { | |
| background: var(--bg-secondary); | |
| color: var(--text-secondary); | |
| font-style: italic; | |
| } | |
| .run-btn { | |
| background: var(--bg-tertiary); | |
| border: 1px solid var(--border-primary); | |
| padding: 2px 6px; | |
| border-radius: 2px; | |
| color: var(--text-secondary); | |
| cursor: pointer; | |
| font-size: 0.75em; | |
| font-family: inherit; | |
| margin-left: 4px; | |
| } | |
| .run-btn:hover { | |
| color: var(--text-primary); | |
| background: var(--bg-primary); | |
| } | |
| .run-btn:disabled { | |
| opacity: 0.6; | |
| cursor: not-allowed; | |
| } | |
| .copy-btn { | |
| background: var(--bg-tertiary); | |
| border: 1px solid var(--border-primary); | |
| padding: 2px 6px; | |
| border-radius: 2px; | |
| color: var(--text-secondary); | |
| cursor: pointer; | |
| font-size: 0.75em; | |
| font-family: inherit; | |
| margin-left: 4px; | |
| } | |
| .copy-btn:hover { | |
| color: var(--text-primary); | |
| background: var(--bg-primary); | |
| } | |
| .copy-btn:disabled { | |
| opacity: 0.6; | |
| cursor: not-allowed; | |
| } | |
| .raw-btn { | |
| background: var(--bg-tertiary); | |
| border: 1px solid var(--border-primary); | |
| padding: 2px 6px; | |
| border-radius: 2px; | |
| color: var(--text-secondary); | |
| cursor: pointer; | |
| font-size: 0.75em; | |
| font-family: inherit; | |
| margin-left: 4px; | |
| text-decoration: none; | |
| display: inline-block; | |
| } | |
| .raw-btn:hover { | |
| color: var(--text-primary); | |
| background: var(--bg-primary); | |
| text-decoration: none; | |
| } | |
| .output-stale { | |
| opacity: 0.5; | |
| position: relative; | |
| } | |
| .output-stale::after { | |
| content: '⏳ updating...'; | |
| position: absolute; | |
| top: 8px; | |
| right: 8px; | |
| background: var(--bg-secondary); | |
| padding: 4px 8px; | |
| border-radius: 2px; | |
| font-size: 0.75em; | |
| color: var(--text-secondary); | |
| border: 1px solid var(--border-primary); | |
| } | |
| h1, h2, h3, h4, h5, h6 { | |
| margin-top: 1.5rem; | |
| margin-bottom: 0.75rem; | |
| color: var(--text-primary); | |
| } | |
| h1 { | |
| margin-top: 0; | |
| margin-bottom: 1rem; | |
| } | |
| p { | |
| margin: 0.75rem 0; | |
| color: var(--text-primary); | |
| } | |
| a { | |
| color: var(--text-link); | |
| } | |
| img { | |
| max-width: 100%; | |
| height: auto; | |
| border-radius: 1px; | |
| box-shadow: none; | |
| } | |
| pre, code { | |
| font-family: 'Cascadia Mono', 'Cascadia Code', 'JetBrains Mono', 'SF Mono', Monaco, 'Consolas', monospace; | |
| } | |
| /* Line numbers */ | |
| .highlight-with-lines { | |
| display: flex; | |
| } | |
| .line-numbers { | |
| background: var(--bg-tertiary); | |
| padding: 0.75rem 0.5rem; | |
| font-family: 'Cascadia Mono', 'Cascadia Code', 'JetBrains Mono', 'SF Mono', Monaco, 'Consolas', monospace; | |
| font-size: 0.9rem; | |
| color: var(--text-secondary); | |
| user-select: none; | |
| text-align: right; | |
| border-right: 1px solid var(--border-primary); | |
| } | |
| .line-numbers .line-number { | |
| display: block; | |
| line-height: 1.5; | |
| } | |
| .highlight-with-lines .highlight { | |
| flex: 1; | |
| } | |
| .highlight-with-lines .highlight pre { | |
| padding-left: 0.75rem; | |
| } | |
| /* Expanded code styling (default and collapsed states are defined with .cell-code above) */ | |
| .cell-code.expanded { | |
| display: block; | |
| } | |
| pre { line-height: 125%; } | |
| td.linenos .normal { color: inherit; background-color: transparent; padding-left: 5px; padding-right: 5px; } | |
| span.linenos { color: inherit; background-color: transparent; padding-left: 5px; padding-right: 5px; } | |
| td.linenos .special { color: #000000; background-color: #ffffc0; padding-left: 5px; padding-right: 5px; } | |
| span.linenos.special { color: #000000; background-color: #ffffc0; padding-left: 5px; padding-right: 5px; } | |
| [data-theme="light"] .highlight .hll { background-color: #ffffcc } | |
| [data-theme="light"] .highlight { background: #f8f8f8; } | |
| [data-theme="light"] .highlight .c { color: #3D7B7B; font-style: italic } /* Comment */ | |
| [data-theme="light"] .highlight .err { border: 1px solid #F00 } /* Error */ | |
| [data-theme="light"] .highlight .k { color: #008000; font-weight: bold } /* Keyword */ | |
| [data-theme="light"] .highlight .o { color: #666 } /* Operator */ | |
| [data-theme="light"] .highlight .ch { color: #3D7B7B; font-style: italic } /* Comment.Hashbang */ | |
| [data-theme="light"] .highlight .cm { color: #3D7B7B; font-style: italic } /* Comment.Multiline */ | |
| [data-theme="light"] .highlight .cp { color: #9C6500 } /* Comment.Preproc */ | |
| [data-theme="light"] .highlight .cpf { color: #3D7B7B; font-style: italic } /* Comment.PreprocFile */ | |
| [data-theme="light"] .highlight .c1 { color: #3D7B7B; font-style: italic } /* Comment.Single */ | |
| [data-theme="light"] .highlight .cs { color: #3D7B7B; font-style: italic } /* Comment.Special */ | |
| [data-theme="light"] .highlight .gd { color: #A00000 } /* Generic.Deleted */ | |
| [data-theme="light"] .highlight .ge { font-style: italic } /* Generic.Emph */ | |
| [data-theme="light"] .highlight .ges { font-weight: bold; font-style: italic } /* Generic.EmphStrong */ | |
| [data-theme="light"] .highlight .gr { color: #E40000 } /* Generic.Error */ | |
| [data-theme="light"] .highlight .gh { color: #000080; font-weight: bold } /* Generic.Heading */ | |
| [data-theme="light"] .highlight .gi { color: #008400 } /* Generic.Inserted */ | |
| [data-theme="light"] .highlight .go { color: #717171 } /* Generic.Output */ | |
| [data-theme="light"] .highlight .gp { color: #000080; font-weight: bold } /* Generic.Prompt */ | |
| [data-theme="light"] .highlight .gs { font-weight: bold } /* Generic.Strong */ | |
| [data-theme="light"] .highlight .gu { color: #800080; font-weight: bold } /* Generic.Subheading */ | |
| [data-theme="light"] .highlight .gt { color: #04D } /* Generic.Traceback */ | |
| [data-theme="light"] .highlight .kc { color: #008000; font-weight: bold } /* Keyword.Constant */ | |
| [data-theme="light"] .highlight .kd { color: #008000; font-weight: bold } /* Keyword.Declaration */ | |
| [data-theme="light"] .highlight .kn { color: #008000; font-weight: bold } /* Keyword.Namespace */ | |
| [data-theme="light"] .highlight .kp { color: #008000 } /* Keyword.Pseudo */ | |
| [data-theme="light"] .highlight .kr { color: #008000; font-weight: bold } /* Keyword.Reserved */ | |
| [data-theme="light"] .highlight .kt { color: #B00040 } /* Keyword.Type */ | |
| [data-theme="light"] .highlight .m { color: #666 } /* Literal.Number */ | |
| [data-theme="light"] .highlight .s { color: #BA2121 } /* Literal.String */ | |
| [data-theme="light"] .highlight .na { color: #687822 } /* Name.Attribute */ | |
| [data-theme="light"] .highlight .nb { color: #008000 } /* Name.Builtin */ | |
| [data-theme="light"] .highlight .nc { color: #00F; font-weight: bold } /* Name.Class */ | |
| [data-theme="light"] .highlight .no { color: #800 } /* Name.Constant */ | |
| [data-theme="light"] .highlight .nd { color: #A2F } /* Name.Decorator */ | |
| [data-theme="light"] .highlight .ni { color: #717171; font-weight: bold } /* Name.Entity */ | |
| [data-theme="light"] .highlight .ne { color: #CB3F38; font-weight: bold } /* Name.Exception */ | |
| [data-theme="light"] .highlight .nf { color: #00F } /* Name.Function */ | |
| [data-theme="light"] .highlight .nl { color: #767600 } /* Name.Label */ | |
| [data-theme="light"] .highlight .nn { color: #00F; font-weight: bold } /* Name.Namespace */ | |
| [data-theme="light"] .highlight .nt { color: #008000; font-weight: bold } /* Name.Tag */ | |
| [data-theme="light"] .highlight .nv { color: #19177C } /* Name.Variable */ | |
| [data-theme="light"] .highlight .ow { color: #A2F; font-weight: bold } /* Operator.Word */ | |
| [data-theme="light"] .highlight .w { color: #BBB } /* Text.Whitespace */ | |
| [data-theme="light"] .highlight .mb { color: #666 } /* Literal.Number.Bin */ | |
| [data-theme="light"] .highlight .mf { color: #666 } /* Literal.Number.Float */ | |
| [data-theme="light"] .highlight .mh { color: #666 } /* Literal.Number.Hex */ | |
| [data-theme="light"] .highlight .mi { color: #666 } /* Literal.Number.Integer */ | |
| [data-theme="light"] .highlight .mo { color: #666 } /* Literal.Number.Oct */ | |
| [data-theme="light"] .highlight .sa { color: #BA2121 } /* Literal.String.Affix */ | |
| [data-theme="light"] .highlight .sb { color: #BA2121 } /* Literal.String.Backtick */ | |
| [data-theme="light"] .highlight .sc { color: #BA2121 } /* Literal.String.Char */ | |
| [data-theme="light"] .highlight .dl { color: #BA2121 } /* Literal.String.Delimiter */ | |
| [data-theme="light"] .highlight .sd { color: #BA2121; font-style: italic } /* Literal.String.Doc */ | |
| [data-theme="light"] .highlight .s2 { color: #BA2121 } /* Literal.String.Double */ | |
| [data-theme="light"] .highlight .se { color: #AA5D1F; font-weight: bold } /* Literal.String.Escape */ | |
| [data-theme="light"] .highlight .sh { color: #BA2121 } /* Literal.String.Heredoc */ | |
| [data-theme="light"] .highlight .si { color: #A45A77; font-weight: bold } /* Literal.String.Interpol */ | |
| [data-theme="light"] .highlight .sx { color: #008000 } /* Literal.String.Other */ | |
| [data-theme="light"] .highlight .sr { color: #A45A77 } /* Literal.String.Regex */ | |
| [data-theme="light"] .highlight .s1 { color: #BA2121 } /* Literal.String.Single */ | |
| [data-theme="light"] .highlight .ss { color: #19177C } /* Literal.String.Symbol */ | |
| [data-theme="light"] .highlight .bp { color: #008000 } /* Name.Builtin.Pseudo */ | |
| [data-theme="light"] .highlight .fm { color: #00F } /* Name.Function.Magic */ | |
| [data-theme="light"] .highlight .vc { color: #19177C } /* Name.Variable.Class */ | |
| [data-theme="light"] .highlight .vg { color: #19177C } /* Name.Variable.Global */ | |
| [data-theme="light"] .highlight .vi { color: #19177C } /* Name.Variable.Instance */ | |
| [data-theme="light"] .highlight .vm { color: #19177C } /* Name.Variable.Magic */ | |
| [data-theme="light"] .highlight .il { color: #666 } /* Literal.Number.Integer.Long */ | |
| [data-theme="dark"] .highlight .hll { background-color: #49483e } | |
| [data-theme="dark"] .highlight { background: #272822; color: #F8F8F2 } | |
| [data-theme="dark"] .highlight .c { color: #959077 } /* Comment */ | |
| [data-theme="dark"] .highlight .err { color: #ED007E; background-color: #1E0010 } /* Error */ | |
| [data-theme="dark"] .highlight .esc { color: #F8F8F2 } /* Escape */ | |
| [data-theme="dark"] .highlight .g { color: #F8F8F2 } /* Generic */ | |
| [data-theme="dark"] .highlight .k { color: #66D9EF } /* Keyword */ | |
| [data-theme="dark"] .highlight .l { color: #AE81FF } /* Literal */ | |
| [data-theme="dark"] .highlight .n { color: #F8F8F2 } /* Name */ | |
| [data-theme="dark"] .highlight .o { color: #FF4689 } /* Operator */ | |
| [data-theme="dark"] .highlight .x { color: #F8F8F2 } /* Other */ | |
| [data-theme="dark"] .highlight .p { color: #F8F8F2 } /* Punctuation */ | |
| [data-theme="dark"] .highlight .ch { color: #959077 } /* Comment.Hashbang */ | |
| [data-theme="dark"] .highlight .cm { color: #959077 } /* Comment.Multiline */ | |
| [data-theme="dark"] .highlight .cp { color: #959077 } /* Comment.Preproc */ | |
| [data-theme="dark"] .highlight .cpf { color: #959077 } /* Comment.PreprocFile */ | |
| [data-theme="dark"] .highlight .c1 { color: #959077 } /* Comment.Single */ | |
| [data-theme="dark"] .highlight .cs { color: #959077 } /* Comment.Special */ | |
| [data-theme="dark"] .highlight .gd { color: #FF4689 } /* Generic.Deleted */ | |
| [data-theme="dark"] .highlight .ge { color: #F8F8F2; font-style: italic } /* Generic.Emph */ | |
| [data-theme="dark"] .highlight .ges { color: #F8F8F2; font-weight: bold; font-style: italic } /* Generic.EmphStrong */ | |
| [data-theme="dark"] .highlight .gr { color: #F8F8F2 } /* Generic.Error */ | |
| [data-theme="dark"] .highlight .gh { color: #F8F8F2 } /* Generic.Heading */ | |
| [data-theme="dark"] .highlight .gi { color: #A6E22E } /* Generic.Inserted */ | |
| [data-theme="dark"] .highlight .go { color: #66D9EF } /* Generic.Output */ | |
| [data-theme="dark"] .highlight .gp { color: #FF4689; font-weight: bold } /* Generic.Prompt */ | |
| [data-theme="dark"] .highlight .gs { color: #F8F8F2; font-weight: bold } /* Generic.Strong */ | |
| [data-theme="dark"] .highlight .gu { color: #959077 } /* Generic.Subheading */ | |
| [data-theme="dark"] .highlight .gt { color: #F8F8F2 } /* Generic.Traceback */ | |
| [data-theme="dark"] .highlight .kc { color: #66D9EF } /* Keyword.Constant */ | |
| [data-theme="dark"] .highlight .kd { color: #66D9EF } /* Keyword.Declaration */ | |
| [data-theme="dark"] .highlight .kn { color: #FF4689 } /* Keyword.Namespace */ | |
| [data-theme="dark"] .highlight .kp { color: #66D9EF } /* Keyword.Pseudo */ | |
| [data-theme="dark"] .highlight .kr { color: #66D9EF } /* Keyword.Reserved */ | |
| [data-theme="dark"] .highlight .kt { color: #66D9EF } /* Keyword.Type */ | |
| [data-theme="dark"] .highlight .ld { color: #E6DB74 } /* Literal.Date */ | |
| [data-theme="dark"] .highlight .m { color: #AE81FF } /* Literal.Number */ | |
| [data-theme="dark"] .highlight .s { color: #E6DB74 } /* Literal.String */ | |
| [data-theme="dark"] .highlight .na { color: #A6E22E } /* Name.Attribute */ | |
| [data-theme="dark"] .highlight .nb { color: #F8F8F2 } /* Name.Builtin */ | |
| [data-theme="dark"] .highlight .nc { color: #A6E22E } /* Name.Class */ | |
| [data-theme="dark"] .highlight .no { color: #66D9EF } /* Name.Constant */ | |
| [data-theme="dark"] .highlight .nd { color: #A6E22E } /* Name.Decorator */ | |
| [data-theme="dark"] .highlight .ni { color: #F8F8F2 } /* Name.Entity */ | |
| [data-theme="dark"] .highlight .ne { color: #A6E22E } /* Name.Exception */ | |
| [data-theme="dark"] .highlight .nf { color: #A6E22E } /* Name.Function */ | |
| [data-theme="dark"] .highlight .nl { color: #F8F8F2 } /* Name.Label */ | |
| [data-theme="dark"] .highlight .nn { color: #F8F8F2 } /* Name.Namespace */ | |
| [data-theme="dark"] .highlight .nx { color: #A6E22E } /* Name.Other */ | |
| [data-theme="dark"] .highlight .py { color: #F8F8F2 } /* Name.Property */ | |
| [data-theme="dark"] .highlight .nt { color: #FF4689 } /* Name.Tag */ | |
| [data-theme="dark"] .highlight .nv { color: #F8F8F2 } /* Name.Variable */ | |
| [data-theme="dark"] .highlight .ow { color: #FF4689 } /* Operator.Word */ | |
| [data-theme="dark"] .highlight .pm { color: #F8F8F2 } /* Punctuation.Marker */ | |
| [data-theme="dark"] .highlight .w { color: #F8F8F2 } /* Text.Whitespace */ | |
| [data-theme="dark"] .highlight .mb { color: #AE81FF } /* Literal.Number.Bin */ | |
| [data-theme="dark"] .highlight .mf { color: #AE81FF } /* Literal.Number.Float */ | |
| [data-theme="dark"] .highlight .mh { color: #AE81FF } /* Literal.Number.Hex */ | |
| [data-theme="dark"] .highlight .mi { color: #AE81FF } /* Literal.Number.Integer */ | |
| [data-theme="dark"] .highlight .mo { color: #AE81FF } /* Literal.Number.Oct */ | |
| [data-theme="dark"] .highlight .sa { color: #E6DB74 } /* Literal.String.Affix */ | |
| [data-theme="dark"] .highlight .sb { color: #E6DB74 } /* Literal.String.Backtick */ | |
| [data-theme="dark"] .highlight .sc { color: #E6DB74 } /* Literal.String.Char */ | |
| [data-theme="dark"] .highlight .dl { color: #E6DB74 } /* Literal.String.Delimiter */ | |
| [data-theme="dark"] .highlight .sd { color: #E6DB74 } /* Literal.String.Doc */ | |
| [data-theme="dark"] .highlight .s2 { color: #E6DB74 } /* Literal.String.Double */ | |
| [data-theme="dark"] .highlight .se { color: #AE81FF } /* Literal.String.Escape */ | |
| [data-theme="dark"] .highlight .sh { color: #E6DB74 } /* Literal.String.Heredoc */ | |
| [data-theme="dark"] .highlight .si { color: #E6DB74 } /* Literal.String.Interpol */ | |
| [data-theme="dark"] .highlight .sx { color: #E6DB74 } /* Literal.String.Other */ | |
| [data-theme="dark"] .highlight .sr { color: #E6DB74 } /* Literal.String.Regex */ | |
| [data-theme="dark"] .highlight .s1 { color: #E6DB74 } /* Literal.String.Single */ | |
| [data-theme="dark"] .highlight .ss { color: #E6DB74 } /* Literal.String.Symbol */ | |
| [data-theme="dark"] .highlight .bp { color: #F8F8F2 } /* Name.Builtin.Pseudo */ | |
| [data-theme="dark"] .highlight .fm { color: #A6E22E } /* Name.Function.Magic */ | |
| [data-theme="dark"] .highlight .vc { color: #F8F8F2 } /* Name.Variable.Class */ | |
| [data-theme="dark"] .highlight .vg { color: #F8F8F2 } /* Name.Variable.Global */ | |
| [data-theme="dark"] .highlight .vi { color: #F8F8F2 } /* Name.Variable.Instance */ | |
| [data-theme="dark"] .highlight .vm { color: #F8F8F2 } /* Name.Variable.Magic */ | |
| [data-theme="dark"] .highlight .il { color: #AE81FF } /* Literal.Number.Integer.Long */ | |
| /* Custom CSS from frontmatter */ | |
| .cell-stderr { max-height: 200px; overflow: auto; } | |
| .minimap { display: none; } | |
| .file-explorer { display: none; } | |
| .cell-code { max-height: 400px; overflow: auto; } | |
| /* Cursor for tools */ | |
| body[data-tool="arrow"] .main-content { | |
| cursor: url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="%23e53935" stroke-width="2"><path d="M12 19l7-7 3 3-7 7-3-3z"/><path d="M18 13l-1.5-7.5L2 2l3.5 14.5L13 18l5-5z"/><path d="M2 2l7.586 7.586"/><circle cx="11" cy="11" r="2"/></svg>') 12 12, crosshair; | |
| } | |
| body[data-tool="pen"] .main-content { | |
| cursor: url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="%23e53935" stroke-width="2"><path d="M12 19l7-7 3 3-7 7-3-3z"/><path d="M18 13l-1.5-7.5L2 2l3.5 14.5L13 18l5-5z"/><circle cx="4" cy="20" r="2" fill="%23e53935"/></svg>') 4 20, pointer; | |
| } | |
| body[data-tool="eraser"] .main-content { | |
| cursor: url('data:image/svg+xml;utf8,<svg xmlns="http://www.w3.org/2000/svg" width="24" height="24" viewBox="0 0 24 24" fill="none" stroke="%23e53935" stroke-width="2"><path d="M20 20H7l-7-7 7-7h13v14z"/><path d="M13 13l7-7"/><path d="M13 13L9 9"/></svg>') 12 12, auto; | |
| } | |
| /* Color picker styles */ | |
| .tools-section-title { | |
| font-weight: bold; | |
| color: var(--text-secondary); | |
| font-size: 0.65rem; | |
| margin: 0.75rem 0 0.5rem 0; | |
| text-transform: uppercase; | |
| letter-spacing: 0.5px; | |
| } | |
| .color-row { | |
| display: grid; | |
| grid-template-columns: repeat(6, 1fr); | |
| gap: 0.25rem; | |
| margin-bottom: 0.5rem; | |
| } | |
| .color-swatch { | |
| width: 18px; | |
| height: 18px; | |
| border: 2px solid var(--border-primary); | |
| border-radius: 3px; | |
| cursor: pointer; | |
| transition: all 0.2s ease; | |
| position: relative; | |
| } | |
| .color-swatch:hover { | |
| transform: scale(1.1); | |
| border-color: var(--text-secondary); | |
| } | |
| .color-swatch.selected { | |
| border-color: var(--text-primary); | |
| box-shadow: 0 0 0 2px var(--text-link); | |
| } | |
| .color-swatch.selected::after { | |
| content: '✓'; | |
| position: absolute; | |
| top: 50%; | |
| left: 50%; | |
| transform: translate(-50%, -50%); | |
| color: white; | |
| font-size: 10px; | |
| font-weight: bold; | |
| text-shadow: 1px 1px 1px black; | |
| } | |
| .color-input { | |
| width: 24px; | |
| height: 24px; | |
| border: 2px solid var(--border-primary); | |
| border-radius: 3px; | |
| cursor: pointer; | |
| background: none; | |
| padding: 0; | |
| grid-column: span 2; | |
| justify-self: center; | |
| } | |
| .color-input:hover { | |
| border-color: var(--text-secondary); | |
| } | |
| /* Thickness slider styles */ | |
| .thickness-row { | |
| display: flex; | |
| align-items: center; | |
| gap: 0.5rem; | |
| margin-top: 0.75rem; | |
| } | |
| .thickness-slider { | |
| flex: 1; | |
| -webkit-appearance: none; | |
| appearance: none; | |
| height: 4px; | |
| background: var(--border-primary); | |
| border-radius: 2px; | |
| outline: none; | |
| opacity: 0.7; | |
| transition: opacity 0.2s; | |
| } | |
| .thickness-slider:hover { | |
| opacity: 1; | |
| } | |
| .thickness-slider::-webkit-slider-thumb { | |
| -webkit-appearance: none; | |
| appearance: none; | |
| width: 12px; | |
| height: 12px; | |
| background: var(--text-link); | |
| border-radius: 50%; | |
| cursor: pointer; | |
| } | |
| .thickness-slider::-moz-range-thumb { | |
| width: 12px; | |
| height: 12px; | |
| background: var(--text-link); | |
| border-radius: 50%; | |
| cursor: pointer; | |
| border: none; | |
| } | |
| .thickness-value { | |
| font-size: 0.7rem; | |
| color: var(--text-secondary); | |
| min-width: 20px; | |
| text-align: right; | |
| } | |
| .highlight { | |
| background: none ; | |
| } | |
| /* Loading animations */ | |
| .loading-spinner { | |
| display: inline-block; | |
| width: 16px; | |
| height: 16px; | |
| border: 2px solid var(--border-primary); | |
| border-radius: 50%; | |
| border-top-color: var(--text-link); | |
| animation: spin 1s linear infinite; | |
| margin-right: 8px; | |
| vertical-align: middle; | |
| } | |
| @keyframes spin { | |
| to { transform: rotate(360deg); } | |
| } | |
| .loading-skeleton { | |
| display: inline-block; | |
| background: var(--bg-tertiary); | |
| background: linear-gradient( | |
| 90deg, | |
| var(--bg-tertiary) 25%, | |
| var(--bg-secondary) 50%, | |
| var(--bg-tertiary) 75% | |
| ); | |
| background-size: 200% 100%; | |
| animation: loading-shimmer 2s ease-in-out infinite; | |
| border-radius: 2px; | |
| height: 1em; | |
| width: 80px; | |
| vertical-align: middle; | |
| } | |
| @keyframes loading-shimmer { | |
| 0% { background-position: -200% 0; } | |
| 100% { background-position: 200% 0; } | |
| } | |
| /* Loading state for cell output */ | |
| .cell-output:has(.loading-spinner) { | |
| opacity: 0.7; | |
| background: var(--bg-secondary); | |
| border-left: 3px solid var(--text-link); | |
| } | |
| </style> | |
| <script> | |
| // --- Drag utilities --- | |
| function clamp(val, min, max) { return Math.max(min, Math.min(max, val)); } | |
| function restorePosition(el, storageKey) { | |
| try { | |
| const raw = localStorage.getItem(storageKey); | |
| if (!raw) return; | |
| const pos = JSON.parse(raw); | |
| if (typeof pos.left === 'number' && typeof pos.top === 'number') { | |
| el.style.left = pos.left + 'px'; | |
| el.style.top = pos.top + 'px'; | |
| el.style.right = 'auto'; | |
| el.style.bottom = 'auto'; | |
| } | |
| } catch (_) {} | |
| } | |
| function savePosition(el, storageKey) { | |
| try { | |
| const left = parseFloat(el.style.left || 'NaN'); | |
| const top = parseFloat(el.style.top || 'NaN'); | |
| if (!Number.isNaN(left) && !Number.isNaN(top)) { | |
| localStorage.setItem(storageKey, JSON.stringify({ left, top })); | |
| } | |
| } catch (_) {} | |
| } | |
| function makeDraggable(el, storageKey, handleEl) { | |
| let dragging = false; | |
| let startX = 0, startY = 0; // cursor | |
| let origLeft = 0, origTop = 0; // element | |
| const onMove = (e) => { | |
| if (!dragging) return; | |
| const clientX = e.touches ? e.touches[0].clientX : e.clientX; | |
| const clientY = e.touches ? e.touches[0].clientY : e.clientY; | |
| const dx = clientX - startX; | |
| const dy = clientY - startY; | |
| const w = el.offsetWidth; | |
| const h = el.offsetHeight; | |
| const maxX = window.innerWidth - w; | |
| const maxY = window.innerHeight - h; | |
| const newLeft = clamp(origLeft + dx, 0, maxX); | |
| const newTop = clamp(origTop + dy, 0, maxY); | |
| el.style.left = newLeft + 'px'; | |
| el.style.top = newTop + 'px'; | |
| el.style.right = 'auto'; | |
| el.style.bottom = 'auto'; | |
| }; | |
| const endDrag = () => { | |
| if (!dragging) return; | |
| dragging = false; | |
| document.removeEventListener('mousemove', onMove); | |
| document.removeEventListener('mouseup', endDrag); | |
| document.removeEventListener('touchmove', onMove); | |
| document.removeEventListener('touchend', endDrag); | |
| handleEl && (handleEl.style.cursor = 'grab'); | |
| savePosition(el, storageKey); | |
| // ensure no-overlap constraint after a drag | |
| try { layoutWidgetsStackedBottomRight(); } catch (_) {} | |
| }; | |
| const startDrag = (e) => { | |
| // Start from element's current on-screen rect | |
| const elRect = el.getBoundingClientRect(); | |
| el.style.left = elRect.left + 'px'; | |
| el.style.top = elRect.top + 'px'; | |
| el.style.right = 'auto'; | |
| el.style.bottom = 'auto'; | |
| dragging = true; | |
| startX = e.touches ? e.touches[0].clientX : e.clientX; | |
| startY = e.touches ? e.touches[0].clientY : e.clientY; | |
| origLeft = elRect.left; | |
| origTop = elRect.top; | |
| document.addEventListener('mousemove', onMove); | |
| document.addEventListener('mouseup', endDrag); | |
| document.addEventListener('touchmove', onMove, { passive: false }); | |
| document.addEventListener('touchend', endDrag); | |
| handleEl && (handleEl.style.cursor = 'grabbing'); | |
| e.preventDefault(); | |
| }; | |
| (handleEl || el).addEventListener('mousedown', startDrag); | |
| (handleEl || el).addEventListener('touchstart', startDrag, { passive: false }); | |
| // Apply any saved position on init | |
| restorePosition(el, storageKey); | |
| } | |
| function toggleCell(cellId) { | |
| const codeElement = document.getElementById('code-' + cellId); | |
| const outputElement = document.getElementById('output-' + cellId); | |
| if (codeElement) { | |
| codeElement.classList.toggle('collapsed'); | |
| } | |
| if (outputElement) { | |
| outputElement.classList.toggle('collapsed'); | |
| } | |
| updateIndicators(cellId); | |
| } | |
| function toggleCode(cellId) { | |
| const codeElement = document.getElementById('code-' + cellId); | |
| if (codeElement) { | |
| codeElement.classList.toggle('collapsed'); | |
| updateIndicators(cellId); | |
| } | |
| } | |
| function toggleOutput(cellId) { | |
| const outputElement = document.getElementById('output-' + cellId); | |
| if (outputElement) { | |
| outputElement.classList.toggle('collapsed'); | |
| updateIndicators(cellId); | |
| } | |
| } | |
| function toggleUvLogs(headerElement) { | |
| const contentElement = headerElement.nextElementSibling; | |
| if (contentElement) { | |
| const isCollapsed = contentElement.style.display === 'none'; | |
| contentElement.style.display = isCollapsed ? 'block' : 'none'; | |
| headerElement.textContent = isCollapsed ? '▼ UV Install Logs' : '▶ UV Install Logs'; | |
| // Update the header indicator if it exists | |
| const uvLogsDiv = headerElement.parentElement; | |
| if (uvLogsDiv && uvLogsDiv.id && uvLogsDiv.id.startsWith('uv-logs-')) { | |
| const cellId = uvLogsDiv.id.replace('uv-logs-', ''); | |
| const indicatorElement = document.getElementById('uv-indicator-' + cellId); | |
| if (indicatorElement) { | |
| indicatorElement.textContent = isCollapsed ? '▼ uv-logs' : '▶ uv-logs'; | |
| } | |
| } | |
| } | |
| } | |
| function toggleUvLogsFromHeader(cellId) { | |
| const uvLogsElement = document.getElementById('uv-logs-' + cellId); | |
| const indicatorElement = document.getElementById('uv-indicator-' + cellId); | |
| if (uvLogsElement) { | |
| const headerElement = uvLogsElement.querySelector('.uv-logs-header'); | |
| const contentElement = uvLogsElement.querySelector('.uv-logs-content'); | |
| if (contentElement && headerElement) { | |
| const isCollapsed = contentElement.style.display === 'none'; | |
| contentElement.style.display = isCollapsed ? 'block' : 'none'; | |
| headerElement.textContent = isCollapsed ? '▼ UV Install Logs' : '▶ UV Install Logs'; | |
| if (indicatorElement) { | |
| indicatorElement.textContent = isCollapsed ? '▼ uv-logs' : '▶ uv-logs'; | |
| } | |
| } | |
| } | |
| } | |
| function updateIndicators(cellId) { | |
| const codeElement = document.getElementById('code-' + cellId); | |
| const outputElement = document.getElementById('output-' + cellId); | |
| const indicators = document.querySelector(`[onclick*="${cellId}"]`)?.closest('.cell-header')?.querySelector('.collapse-indicators'); | |
| if (indicators) { | |
| const codeCollapsed = codeElement && codeElement.classList.contains('collapsed'); | |
| const outputCollapsed = outputElement && outputElement.classList.contains('collapsed'); | |
| const codeIcon = codeCollapsed ? '▶' : '▼'; | |
| const outputIcon = outputCollapsed ? '▶' : '▼'; | |
| const codeSpan = indicators.querySelector('[onclick*="toggleCode"]'); | |
| const outputSpan = indicators.querySelector('[onclick*="toggleOutput"]'); | |
| if (codeSpan) codeSpan.textContent = `${codeIcon} code`; | |
| if (outputSpan) outputSpan.textContent = `${outputIcon} output`; | |
| } | |
| } | |
| function toggleTheme() { | |
| const html = document.documentElement; | |
| const currentTheme = html.getAttribute('data-theme'); | |
| const newTheme = currentTheme === 'dark' ? 'light' : 'dark'; | |
| html.setAttribute('data-theme', newTheme); | |
| localStorage.setItem('uvnote-theme', newTheme); | |
| updateThemeIcon(); | |
| } | |
| // Two panel code removed | |
| function updateThemeIcon() { | |
| const theme = document.documentElement.getAttribute('data-theme'); | |
| const toggle = document.querySelector('.theme-toggle'); | |
| if (toggle) { | |
| toggle.textContent = theme === 'dark' ? 'light' : 'dark'; | |
| } | |
| } | |
| function resetLayout() { | |
| try { | |
| // Clear all uvnote-* keys | |
| const allKeys = Object.keys(localStorage); | |
| const uvnoteKeys = allKeys.filter(key => key.startsWith('uvnote-')); | |
| uvnoteKeys.forEach(k => localStorage.removeItem(k)); | |
| } catch (_) {} | |
| // Reload to reinitialize UI with defaults | |
| location.reload(); | |
| } | |
| function toggleMenu() { | |
| const menuButton = document.querySelector('.menu-button'); | |
| if (menuButton) { | |
| menuButton.classList.toggle('active'); | |
| } | |
| } | |
| function toggleWidget(widgetName) { | |
| let widget; | |
| let checkbox; | |
| // Close the menu first | |
| const menuButton = document.querySelector('.menu-button'); | |
| if (menuButton) { | |
| menuButton.classList.remove('active'); | |
| } | |
| switch(widgetName) { | |
| case 'tools': | |
| widget = document.querySelector('.tools-widget'); | |
| checkbox = document.getElementById('checkbox-tools'); | |
| break; | |
| case 'file-explorer': | |
| widget = document.querySelector('.file-explorer'); | |
| checkbox = document.getElementById('checkbox-file-explorer'); | |
| break; | |
| case 'minimap': | |
| widget = document.querySelector('.minimap'); | |
| checkbox = document.getElementById('checkbox-minimap'); | |
| break; | |
| default: | |
| return; | |
| } | |
| if (widget && checkbox) { | |
| const isVisible = getComputedStyle(widget).display !== 'none'; | |
| widget.style.display = isVisible ? 'none' : 'block'; | |
| checkbox.textContent = isVisible ? '☐' : '☑'; | |
| // Save state to localStorage | |
| try { | |
| localStorage.setItem(`uvnote-widget-${widgetName}`, isVisible ? 'hidden' : 'visible'); | |
| } catch (_) {} | |
| // Re-layout widgets after visibility change | |
| try { | |
| layoutWidgetsStackedBottomRight(); | |
| } catch (_) {} | |
| } | |
| } | |
| function initializeWidgetVisibility() { | |
| const widgets = [ | |
| { name: 'tools', selector: '.tools-widget' }, | |
| { name: 'file-explorer', selector: '.file-explorer' }, | |
| { name: 'minimap', selector: '.minimap' } | |
| ]; | |
| widgets.forEach(({ name, selector }) => { | |
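| // Storage access can throw (e.g. when blocked); default to hidden in that case | |
| let savedState = 'hidden'; | |
| try { savedState = localStorage.getItem(`uvnote-widget-${name}`) || 'hidden'; } catch (_) {} | |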
| const widget = document.querySelector(selector); | |
| const checkbox = document.getElementById(`checkbox-${name}`); | |
| if (widget && checkbox) { | |
| const isVisible = savedState === 'visible'; | |
| widget.style.display = isVisible ? 'block' : 'none'; | |
| checkbox.textContent = isVisible ? '☑' : '☐'; | |
| } | |
| }); | |
| } | |
| // Close menu when clicking outside | |
| document.addEventListener('click', function(event) { | |
| const menuButton = document.querySelector('.menu-button'); | |
| // Don't close if clicking on a menu item (let the item handler close it) | |
| if (menuButton && !menuButton.contains(event.target)) { | |
| menuButton.classList.remove('active'); | |
| } | |
| }); | |
| // Layout: stack widgets bottom-right and equalize widths | |
| function hasCustomWidgetPositions() { | |
| try { | |
| return Boolean( | |
| localStorage.getItem('uvnote-minimap-pos') || | |
| localStorage.getItem('uvnote-file-explorer-pos') || | |
| localStorage.getItem('uvnote-tools-pos') | |
| ); | |
| } catch (_) { return false; } | |
| } | |
| function rectsOverlap(r1, r2) { | |
| return !(r1.right <= r2.left || r2.right <= r1.left || r1.bottom <= r2.top || r2.bottom <= r1.top); | |
| } | |
| function widgetsOverlap(widgets) { | |
| for (let i = 0; i < widgets.length; i++) { | |
| const a = widgets[i]; | |
| const ra = a.getBoundingClientRect(); | |
| for (let j = i + 1; j < widgets.length; j++) { | |
| const b = widgets[j]; | |
| const rb = b.getBoundingClientRect(); | |
| if (rectsOverlap(ra, rb)) return true; | |
| } | |
| } | |
| return false; | |
| } | |
| function applyStackLayout(widgets, order) { | |
| if (!widgets.length) return; | |
| // Fixed equal width | |
| const fixedWidth = 220; | |
| widgets.forEach(el => { el.style.width = fixedWidth + 'px'; }); | |
| // Fit heights if needed to avoid overflow | |
| const gap = 12; | |
| const available = Math.max(0, window.innerHeight - 40 - gap * (order.length - 1)); | |
| const eachMax = Math.floor(available / order.length); | |
| order.forEach(el => { | |
| el.style.maxHeight = eachMax + 'px'; | |
| el.style.overflowY = 'auto'; | |
| }); | |
| // Stack bottom-up in the requested order | |
| let bottomOffset = 20; // base gutter | |
| order.forEach(el => { | |
| el.style.left = 'auto'; | |
| el.style.top = 'auto'; | |
| el.style.right = '20px'; | |
| el.style.bottom = bottomOffset + 'px'; | |
| bottomOffset += el.offsetHeight + gap; | |
| }); | |
| } | |
| function layoutWidgetsStackedBottomRight() { | |
| const minimap = document.querySelector('.minimap'); | |
| const fileExplorer = document.querySelector('.file-explorer'); | |
| const tools = document.querySelector('.tools-widget'); | |
| const widgets = [minimap, fileExplorer, tools].filter(el => el && getComputedStyle(el).display !== 'none'); | |
| if (!widgets.length) return; | |
| const order = [minimap, fileExplorer, tools].filter(Boolean).filter(el => getComputedStyle(el).display !== 'none'); | |
| // If user placed custom positions and there is no overlap, respect them. | |
| if (hasCustomWidgetPositions() && !widgetsOverlap(widgets)) return; | |
| applyStackLayout(widgets, order); | |
| } | |
| // Panel icon removed | |
| let _minimapScrollContainer = null; | |
| let _minimapScrollHandler = null; | |
| function initMinimap() { | |
| // Generate minimap content | |
| const minimap = createMinimap(); | |
| document.body.appendChild(minimap); | |
| // Make draggable (use title as handle) | |
| const mTitle = minimap.querySelector('.minimap-title'); | |
| makeDraggable(minimap, 'uvnote-minimap-pos', mTitle); | |
| // Attach scroll listener to window (two-panel removed) | |
| _minimapScrollContainer = window; | |
| if (_minimapScrollContainer) { | |
| _minimapScrollHandler = () => updateMinimapActive(); | |
| if (_minimapScrollContainer === window) { | |
| window.addEventListener('scroll', _minimapScrollHandler); | |
| } else { | |
| _minimapScrollContainer.addEventListener('scroll', _minimapScrollHandler); | |
| } | |
| } | |
| updateMinimapActive(); | |
| } | |
| function teardownMinimap() { | |
| const minimap = document.querySelector('.minimap'); | |
| if (minimap && minimap.parentNode) minimap.parentNode.removeChild(minimap); | |
| if (_minimapScrollContainer && _minimapScrollHandler) { | |
| if (_minimapScrollContainer === window) { | |
| window.removeEventListener('scroll', _minimapScrollHandler); | |
| } else { | |
| _minimapScrollContainer.removeEventListener('scroll', _minimapScrollHandler); | |
| } | |
| } | |
| _minimapScrollContainer = null; | |
| _minimapScrollHandler = null; | |
| } | |
| function initFileExplorer() { | |
| // Generate file explorer content | |
| const fileExplorer = createFileExplorer(); | |
| document.body.appendChild(fileExplorer); | |
| } | |
| function createMinimap() { | |
| const minimap = document.createElement('div'); | |
| minimap.className = 'minimap'; | |
| const title = document.createElement('div'); | |
| title.className = 'minimap-title'; | |
| title.textContent = 'navigation'; | |
| minimap.appendChild(title); | |
| // Find all headings and cells | |
| const root = document.querySelector('.main-content') || document; | |
| const headings = root.querySelectorAll('h1, h2, h3, h4, h5, h6'); | |
| const cells = root.querySelectorAll('.cell'); | |
| // Combine and sort by position | |
| const items = []; | |
| headings.forEach(heading => { | |
| const id = heading.id || generateId(heading.textContent); | |
| if (!heading.id) heading.id = id; | |
| items.push({ | |
| element: heading, | |
| type: 'heading', | |
| level: parseInt(heading.tagName.charAt(1), 10), | |
| text: heading.textContent.trim(), | |
| id: id, | |
| position: heading.getBoundingClientRect().top + window.scrollY | |
| }); | |
| }); | |
| cells.forEach(cell => { | |
| const header = cell.querySelector('.cell-header'); | |
| if (header) { | |
| const id = cell.id || `cell-${Date.now()}-${Math.random().toString(36).slice(2, 11)}`; | |
| if (!cell.id) cell.id = id; | |
| items.push({ | |
| element: cell, | |
| type: 'cell', | |
| text: header.textContent.trim(), | |
| id: id, | |
| position: cell.getBoundingClientRect().top + window.scrollY | |
| }); | |
| } | |
| }); | |
| // Sort by position | |
| items.sort((a, b) => a.position - b.position); | |
| // Create minimap items | |
| items.forEach(item => { | |
| const link = document.createElement('a'); | |
| link.className = `minimap-item ${item.type === 'heading' ? 'minimap-heading' : 'minimap-cell'}`; | |
| if (item.type === 'heading') { | |
| link.classList.add(`h${item.level}`); | |
| } | |
| link.textContent = item.text.length > 25 ? item.text.substring(0, 22) + '...' : item.text; | |
| link.href = `#${item.id}`; | |
| link.onclick = function(e) { | |
| e.preventDefault(); | |
| item.element.scrollIntoView({ behavior: 'smooth', block: 'start' }); | |
| }; | |
| minimap.appendChild(link); | |
| }); | |
| return minimap; | |
| } | |
| function generateId(text) { | |
| return text.toLowerCase() | |
| .replace(/[^a-z0-9]+/g, '-') | |
| .replace(/^-+|-+$/g, '') | |
| .substring(0, 20); | |
| } | |
| function updateMinimapActive() { | |
| const minimapItems = document.querySelectorAll('.minimap-item'); | |
| const container = _minimapScrollContainer || window; | |
| const containerRect = container === window ? null : container.getBoundingClientRect(); | |
| const scrollPos = (container === window ? window.scrollY : container.scrollTop) + 100; // Offset for better detection | |
| let activeItem = null; | |
| minimapItems.forEach(item => { | |
| const targetId = item.getAttribute('href').substring(1); | |
| const target = document.getElementById(targetId); | |
| if (target) { | |
| const rectTop = target.getBoundingClientRect().top; | |
| const targetPos = (container === window) | |
| ? rectTop + window.scrollY | |
| : rectTop - containerRect.top + container.scrollTop; | |
| if (targetPos <= scrollPos) { | |
| activeItem = item; | |
| } | |
| } | |
| item.classList.remove('active'); | |
| }); | |
| if (activeItem) { | |
| activeItem.classList.add('active'); | |
| } | |
| } | |
| function createFileExplorer() { | |
| const fileExplorer = document.createElement('div'); | |
| fileExplorer.className = 'file-explorer'; | |
| const title = document.createElement('div'); | |
| title.className = 'file-explorer-title'; | |
| title.textContent = 'files'; | |
| fileExplorer.appendChild(title); | |
| // Make draggable (use title as handle) | |
| makeDraggable(fileExplorer, 'uvnote-file-explorer-pos', title); | |
| // Scripts section | |
| const scriptsSection = document.createElement('div'); | |
| scriptsSection.className = 'file-explorer-section'; | |
| const scriptsTitle = document.createElement('div'); | |
| scriptsTitle.className = 'file-explorer-section-title'; | |
| scriptsTitle.textContent = 'scripts'; | |
| scriptsSection.appendChild(scriptsTitle); | |
| // Find all cells and list their script files (single panel) | |
| const root = document.querySelector('.main-content') || document; | |
| const cells = root.querySelectorAll('.cell'); | |
| cells.forEach(cell => { | |
| const header = cell.querySelector('.cell-header'); | |
| if (header) { | |
| const cellText = header.textContent.trim(); | |
| const cellMatch = cellText.match(/Cell: ([a-zA-Z_][a-zA-Z0-9_]*)/); | |
| if (cellMatch) { | |
| const cellId = cellMatch[1]; | |
| const scriptItem = document.createElement('div'); | |
| scriptItem.className = 'file-explorer-item script'; | |
| scriptItem.textContent = `${cellId}.py`; | |
| scriptItem.onclick = function() { | |
| cell.scrollIntoView({ behavior: 'smooth', block: 'start' }); | |
| }; | |
| scriptsSection.appendChild(scriptItem); | |
| } | |
| } | |
| }); | |
| fileExplorer.appendChild(scriptsSection); | |
| // Artifacts section | |
| const artifactsSection = document.createElement('div'); | |
| artifactsSection.className = 'file-explorer-section'; | |
| const artifactsTitle = document.createElement('div'); | |
| artifactsTitle.className = 'file-explorer-section-title'; | |
| artifactsTitle.textContent = 'artifacts'; | |
| artifactsSection.appendChild(artifactsTitle); | |
| // Find all artifact links (single panel) | |
| const artifactsRoot = document.querySelector('.main-content') || document; | |
| const artifacts = artifactsRoot.querySelectorAll('.artifact'); | |
| if (artifacts.length === 0) { | |
| const noArtifacts = document.createElement('div'); | |
| noArtifacts.className = 'file-explorer-item artifact'; | |
| noArtifacts.textContent = '(none)'; | |
| noArtifacts.style.opacity = '0.5'; | |
| artifactsSection.appendChild(noArtifacts); | |
| } else { | |
| artifacts.forEach(artifact => { | |
| const artifactItem = document.createElement('div'); | |
| artifactItem.className = 'file-explorer-item artifact'; | |
| artifactItem.textContent = artifact.textContent; | |
| artifactItem.onclick = function() { | |
| artifact.click(); | |
| }; | |
| artifactsSection.appendChild(artifactItem); | |
| }); | |
| } | |
| fileExplorer.appendChild(artifactsSection); | |
| return fileExplorer; | |
| } | |
| // Tools widget | |
| let _cursorX = 0; | |
| let _cursorY = 0; | |
| let _cursorVisible = false; | |
| function setActiveTool(tool) { | |
| if (!tool || tool === 'none') { | |
| document.body.dataset.tool = 'none'; | |
| localStorage.setItem('uvnote-active-tool', 'none'); | |
| setOverlayActive(false); | |
| _cursorVisible = false; | |
| // Remove active class from all tool buttons when deactivating | |
| const toolButtons = document.querySelectorAll('.tools-widget .tool-button'); | |
| toolButtons.forEach(btn => btn.classList.remove('active')); | |
| return; | |
| } | |
| document.body.dataset.tool = tool; | |
| localStorage.setItem('uvnote-active-tool', tool); | |
| setOverlayActive(true); | |
| _cursorVisible = true; | |
| } | |
| // Make setActiveTool globally accessible for ESC key handler | |
| window.setActiveTool = setActiveTool; | |
| function getArrowColor() { | |
| const saved = localStorage.getItem('uvnote-arrow-color'); | |
| if (saved) return saved; | |
| return '#e53935'; // Default red color | |
| } | |
| function setStoredArrowColor(color) { | |
| try { localStorage.setItem('uvnote-arrow-color', color); } catch (_) {} | |
| } | |
| function getLineThickness() { | |
| const saved = parseInt(localStorage.getItem('uvnote-line-thickness'), 10); | |
| // Fall back to the default when nothing valid is stored | |
| return Number.isNaN(saved) ? 4 : saved; | |
| } | |
| function setStoredLineThickness(thickness) { | |
| try { localStorage.setItem('uvnote-line-thickness', thickness); } catch (_) {} | |
| } | |
| function getFadeoutTime() { | |
| const saved = parseInt(localStorage.getItem('uvnote-fadeout-time'), 10); | |
| // Fall back to the default (5 seconds) when nothing valid is stored | |
| return Number.isNaN(saved) ? 5 : saved; | |
| } | |
| function setStoredFadeoutTime(seconds) { | |
| try { localStorage.setItem('uvnote-fadeout-time', seconds); } catch (_) {} | |
| } | |
| function createToolsWidget() { | |
| const tools = document.createElement('div'); | |
| tools.className = 'tools-widget'; | |
| const title = document.createElement('div'); | |
| title.className = 'tools-title'; | |
| title.textContent = 'tools'; | |
| tools.appendChild(title); | |
| const row = document.createElement('div'); | |
| row.className = 'tools-row'; | |
| tools.appendChild(row); | |
| // Arrow tool | |
| const arrowBtn = document.createElement('div'); | |
| arrowBtn.className = 'tool-button'; | |
| arrowBtn.textContent = 'arrow'; | |
| arrowBtn.onclick = function() { | |
| const isActive = arrowBtn.classList.contains('active'); | |
| if (isActive) { | |
| arrowBtn.classList.remove('active'); | |
| setActiveTool('none'); | |
| } else { | |
| tools.querySelectorAll('.tool-button').forEach(b => b.classList.remove('active')); | |
| arrowBtn.classList.add('active'); | |
| setActiveTool('arrow'); | |
| } | |
| }; | |
| row.appendChild(arrowBtn); | |
| // Pen tool | |
| const penBtn = document.createElement('div'); | |
| penBtn.className = 'tool-button'; | |
| penBtn.textContent = 'pen'; | |
| penBtn.onclick = function() { | |
| const isActive = penBtn.classList.contains('active'); | |
| if (isActive) { | |
| penBtn.classList.remove('active'); | |
| setActiveTool('none'); | |
| } else { | |
| tools.querySelectorAll('.tool-button').forEach(b => b.classList.remove('active')); | |
| penBtn.classList.add('active'); | |
| setActiveTool('pen'); | |
| } | |
| }; | |
| row.appendChild(penBtn); | |
| // Eraser tool | |
| const eraseBtn = document.createElement('div'); | |
| eraseBtn.className = 'tool-button'; | |
| eraseBtn.textContent = 'eraser'; | |
| eraseBtn.onclick = function() { | |
| const isActive = eraseBtn.classList.contains('active'); | |
| if (isActive) { | |
| eraseBtn.classList.remove('active'); | |
| setActiveTool('none'); | |
| } else { | |
| tools.querySelectorAll('.tool-button').forEach(b => b.classList.remove('active')); | |
| eraseBtn.classList.add('active'); | |
| setActiveTool('eraser'); | |
| } | |
| }; | |
| row.appendChild(eraseBtn); | |
| // Spotlight tool | |
| const spotlightBtn = document.createElement('div'); | |
| spotlightBtn.className = 'tool-button'; | |
| spotlightBtn.textContent = 'spotlight'; | |
| spotlightBtn.onclick = function() { | |
| const isActive = spotlightBtn.classList.contains('active'); | |
| if (isActive) { | |
| spotlightBtn.classList.remove('active'); | |
| setActiveTool('none'); | |
| } else { | |
| tools.querySelectorAll('.tool-button').forEach(b => b.classList.remove('active')); | |
| spotlightBtn.classList.add('active'); | |
| setActiveTool('spotlight'); | |
| } | |
| }; | |
| row.appendChild(spotlightBtn); | |
| // Clear all | |
| const clearBtn = document.createElement('div'); | |
| clearBtn.className = 'tool-button'; | |
| clearBtn.textContent = 'clear'; | |
| clearBtn.onclick = function() { | |
| _shapes = []; | |
| saveShapes(); | |
| renderOverlay(); | |
| }; | |
| row.appendChild(clearBtn); | |
| // Restore active state from storage | |
| const saved = localStorage.getItem('uvnote-active-tool') || 'none'; | |
| if (saved === 'arrow') { | |
| arrowBtn.classList.add('active'); | |
| setActiveTool('arrow'); | |
| } else if (saved === 'pen') { | |
| penBtn.classList.add('active'); | |
| setActiveTool('pen'); | |
| } else if (saved === 'eraser') { | |
| eraseBtn.classList.add('active'); | |
| setActiveTool('eraser'); | |
| } else if (saved === 'spotlight') { | |
| spotlightBtn.classList.add('active'); | |
| setActiveTool('spotlight'); | |
| } | |
| // Color selector | |
| const colorTitle = document.createElement('div'); | |
| colorTitle.className = 'tools-section-title'; | |
| colorTitle.textContent = 'color'; | |
| tools.appendChild(colorTitle); | |
| const colorRow = document.createElement('div'); | |
| colorRow.className = 'tools-row color-row'; | |
| tools.appendChild(colorRow); | |
| const swatchColors = [ | |
| // Primary colors | |
| '#e53935', '#fb8c00', '#fdd835', '#43a047', '#1e88e5', '#8e24aa', | |
| // Additional useful colors | |
| '#ff5722', '#795548', '#607d8b', '#9c27b0', | |
| // Grayscale | |
| '#000000', '#424242', '#9e9e9e', '#ffffff' | |
| ]; | |
| const swatches = []; | |
| swatchColors.forEach(c => { | |
| const s = document.createElement('div'); | |
| s.className = 'color-swatch'; | |
| s.style.backgroundColor = c; | |
| s.title = c; | |
| s.onclick = () => { | |
| setStoredArrowColor(c); | |
| refreshColorUI(c); | |
| if (_cursorVisible) renderOverlay(); | |
| }; | |
| colorRow.appendChild(s); | |
| swatches.push(s); | |
| }); | |
| const colorInput = document.createElement('input'); | |
| colorInput.type = 'color'; | |
| colorInput.className = 'color-input'; | |
| colorInput.oninput = () => { | |
| setStoredArrowColor(colorInput.value); | |
| refreshColorUI(colorInput.value); | |
| if (_cursorVisible) renderOverlay(); | |
| }; | |
| colorRow.appendChild(colorInput); | |
| function refreshColorUI(selected) { | |
| const selectedHex = selected.startsWith('#') ? selected.toLowerCase() : rgbToHex(selected); | |
| swatches.forEach((s, i) => { | |
| const swatchHex = swatchColors[i].toLowerCase(); | |
| if (swatchHex === selectedHex) { | |
| s.classList.add('selected'); | |
| } else { | |
| s.classList.remove('selected'); | |
| } | |
| }); | |
| try { | |
| colorInput.value = selectedHex; | |
| } catch (_) {} | |
| } | |
| function rgbToHex(rgb) { | |
| const m = rgb.match(/rgba?\((\d+),\s*(\d+),\s*(\d+)/i); | |
| if (!m) return '#000000'; | |
| const r = parseInt(m[1], 10).toString(16).padStart(2, '0'); | |
| const g = parseInt(m[2], 10).toString(16).padStart(2, '0'); | |
| const b = parseInt(m[3], 10).toString(16).padStart(2, '0'); | |
| return `#${r}${g}${b}`; | |
| } | |
| // Restore color selection | |
| refreshColorUI(getArrowColor()); | |
| // Thickness slider | |
| const thicknessTitle = document.createElement('div'); | |
| thicknessTitle.className = 'tools-section-title'; | |
| thicknessTitle.textContent = 'thickness'; | |
| tools.appendChild(thicknessTitle); | |
| const thicknessRow = document.createElement('div'); | |
| thicknessRow.className = 'thickness-row'; | |
| tools.appendChild(thicknessRow); | |
| const thicknessSlider = document.createElement('input'); | |
| thicknessSlider.type = 'range'; | |
| thicknessSlider.className = 'thickness-slider'; | |
| thicknessSlider.min = '1'; | |
| thicknessSlider.max = '10'; | |
| thicknessSlider.value = getLineThickness(); | |
| const thicknessValue = document.createElement('span'); | |
| thicknessValue.className = 'thickness-value'; | |
| thicknessValue.textContent = thicknessSlider.value + 'px'; | |
| thicknessSlider.oninput = function() { | |
| const value = parseInt(thicknessSlider.value, 10); | |
| setStoredLineThickness(value); | |
| thicknessValue.textContent = value + 'px'; | |
| if (_cursorVisible) renderOverlay(); | |
| }; | |
| thicknessRow.appendChild(thicknessSlider); | |
| thicknessRow.appendChild(thicknessValue); | |
| // Fadeout time slider | |
| const fadeoutTitle = document.createElement('div'); | |
| fadeoutTitle.className = 'tools-section-title'; | |
| fadeoutTitle.textContent = 'fadeout time'; | |
| tools.appendChild(fadeoutTitle); | |
| const fadeoutRow = document.createElement('div'); | |
| fadeoutRow.className = 'thickness-row'; | |
| tools.appendChild(fadeoutRow); | |
| const fadeoutSlider = document.createElement('input'); | |
| fadeoutSlider.type = 'range'; | |
| fadeoutSlider.className = 'thickness-slider'; | |
| fadeoutSlider.min = '0'; | |
| fadeoutSlider.max = '30'; | |
| fadeoutSlider.value = getFadeoutTime(); | |
| const fadeoutValue = document.createElement('span'); | |
| fadeoutValue.className = 'thickness-value'; | |
| fadeoutValue.textContent = fadeoutSlider.value === '0' ? 'never' : fadeoutSlider.value + 's'; | |
| fadeoutSlider.oninput = function() { | |
| const value = parseInt(fadeoutSlider.value, 10); | |
| setStoredFadeoutTime(value); | |
| fadeoutValue.textContent = value === 0 ? 'never' : value + 's'; | |
| }; | |
| fadeoutRow.appendChild(fadeoutSlider); | |
| fadeoutRow.appendChild(fadeoutValue); | |
| // Draggable behavior | |
| makeDraggable(tools, 'uvnote-tools-pos', title); | |
| return tools; | |
| } | |
| function initTools() { | |
| const widget = createToolsWidget(); | |
| document.body.appendChild(widget); | |
| } | |
| function teardownTools() { | |
| const w = document.querySelector('.tools-widget'); | |
| if (w && w.parentNode) w.parentNode.removeChild(w); | |
| } | |
| // --- Canvas overlay for tools --- | |
| let _overlay = null; | |
| let _overlayCtx = null; | |
| let _overlayContainer = null; // window | |
| let _overlayMode = 'single'; | |
| let _overlayResizeHandler = null; | |
| let _overlayScrollHandler = null; | |
| let _drawing = null; // in-progress shape: arrow {x1,y1,x2,y2}, pen {points}, or spotlight {x,y,radius} | |
| let _shapes = []; // committed shapes for current mode | |
| let _fadeTimer = null; // timer for fade animation | |
| function getOverlayStorageKey() { return 'uvnote-shapes'; } | |
| function loadShapes() { | |
| try { | |
| const raw = localStorage.getItem(getOverlayStorageKey()); | |
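| const parsed = raw ? JSON.parse(raw) : []; | |
| _shapes = Array.isArray(parsed) ? parsed : []; // guard against stored values such as "null" | |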
| } catch (_) { _shapes = []; } | |
| } | |
| function saveShapes() { | |
| try { localStorage.setItem(getOverlayStorageKey(), JSON.stringify(_shapes)); } catch (_) {} | |
| } | |
| function updateShapesFade() { | |
| const now = Date.now(); | |
| const fadeoutSeconds = getFadeoutTime(); | |
| // If fadeout is disabled (0 seconds), don't fade anything | |
| if (fadeoutSeconds === 0) return; | |
| const fadeStartTime = Math.max(0, (fadeoutSeconds - 2) * 1000); // Start fading 2s before end | |
| const fadeEndTime = fadeoutSeconds * 1000; // Fully gone after specified time | |
| let needsUpdate = false; | |
| for (let i = _shapes.length - 1; i >= 0; i--) { | |
| const shape = _shapes[i]; | |
| if (!shape.createdAt) continue; // Skip old shapes without timestamps | |
| const age = now - shape.createdAt; | |
| if (age >= fadeEndTime) { | |
| // Remove completely faded shapes | |
| _shapes.splice(i, 1); | |
| needsUpdate = true; | |
| } else if (age >= fadeStartTime) { | |
| // Update opacity for fading shapes | |
| const fadeProgress = (age - fadeStartTime) / (fadeEndTime - fadeStartTime); | |
| const newOpacity = 1 - fadeProgress; | |
| if (Math.abs(shape.opacity - newOpacity) > 0.01) { | |
| shape.opacity = newOpacity; | |
| needsUpdate = true; | |
| } | |
| } | |
| } | |
| if (needsUpdate) { | |
| saveShapes(); | |
| renderOverlay(); | |
| } | |
| } | |
| function getContentContainer() { return window; } | |
| function updateOverlayModeAndContainer() { | |
| _overlayContainer = window; | |
| _overlayMode = 'single'; | |
| } | |
| function updateOverlayBounds() { | |
| if (!_overlay) return; | |
| if (_overlayContainer === window) { | |
| _overlay.style.position = 'fixed'; | |
| _overlay.style.left = '0px'; | |
| _overlay.style.top = '0px'; | |
| _overlay.width = window.innerWidth; | |
| _overlay.height = window.innerHeight; | |
| } else { | |
| const rect = _overlayContainer.getBoundingClientRect(); | |
| _overlay.style.position = 'fixed'; | |
| _overlay.style.left = rect.left + 'px'; | |
| _overlay.style.top = rect.top + 'px'; | |
| _overlay.width = Math.max(0, Math.floor(rect.width)); | |
| _overlay.height = Math.max(0, Math.floor(rect.height)); | |
| } | |
| renderOverlay(); | |
| } | |
| function containerScrollLeft() { | |
| return (_overlayContainer === window) ? (window.scrollX || 0) : (_overlayContainer.scrollLeft || 0); | |
| } | |
| function containerScrollTop() { | |
| return (_overlayContainer === window) ? (window.scrollY || 0) : (_overlayContainer.scrollTop || 0); | |
| } | |
| function toCanvasCoords(clientX, clientY) { | |
| const rect = _overlay.getBoundingClientRect(); | |
| return { x: clientX - rect.left, y: clientY - rect.top }; | |
| } | |
| function onPointerDown(e) { | |
| const tool = document.body.dataset.tool; | |
| if (tool === 'arrow') { | |
| startDrawArrow(e); | |
| } else if (tool === 'pen') { | |
| startDrawPen(e); | |
| } else if (tool === 'eraser') { | |
| eraseAt(e); | |
| } else if (tool === 'spotlight') { | |
| startDrawSpotlight(e); | |
| } | |
| } | |
| function onPointerMove(e) { | |
| // Update cursor position | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| _cursorX = pt.x; | |
| _cursorY = pt.y; | |
| if (!_drawing) { | |
| // Just update cursor position and re-render | |
| if (_cursorVisible) { | |
| renderOverlay(); | |
| } | |
| return; | |
| } | |
| if (_drawing.type === 'pen') { | |
| moveDrawPen(e); | |
| } else if (_drawing.type === 'spotlight') { | |
| moveDrawSpotlight(e); | |
| } else { | |
| moveDrawArrow(e); | |
| } | |
| } | |
| function onPointerEnter(e) { | |
| const tool = document.body.dataset.tool; | |
| _cursorVisible = !!tool && tool !== 'none'; // an unset tool counts as 'none', matching the ESC handler | |
| if (_cursorVisible) { | |
| renderOverlay(); | |
| } | |
| } | |
| function onPointerLeave(e) { | |
| _cursorVisible = false; | |
| renderOverlay(); | |
| } | |
| function onPointerUp(e) { | |
| if (!_drawing) return; | |
| if (_drawing.type === 'pen') { | |
| endDrawPen(); | |
| } else if (_drawing.type === 'spotlight') { | |
| endDrawSpotlight(); | |
| } else { | |
| endDrawArrow(); | |
| } | |
| } | |
| function startDrawArrow(e) { | |
| if (document.body.dataset.tool !== 'arrow') return; | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| _drawing = { | |
| x1: pt.x + containerScrollLeft(), | |
| y1: pt.y + containerScrollTop(), | |
| x2: pt.x + containerScrollLeft(), | |
| y2: pt.y + containerScrollTop(), | |
| color: getArrowColor(), | |
| width: getLineThickness() | |
| }; | |
| renderOverlay(); | |
| e.preventDefault(); | |
| } | |
| function moveDrawArrow(e) { | |
| if (!_drawing) return; | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| _drawing.x2 = pt.x + containerScrollLeft(); | |
| _drawing.y2 = pt.y + containerScrollTop(); | |
| renderOverlay(); | |
| e.preventDefault(); | |
| } | |
| function endDrawArrow() { | |
| if (!_drawing) return; | |
| _shapes.push({ | |
| type: 'arrow', | |
| ..._drawing, | |
| createdAt: Date.now(), | |
| opacity: 1.0 | |
| }); | |
| _drawing = null; | |
| saveShapes(); | |
| renderOverlay(); | |
| } | |
| function startDrawPen(e) { | |
| if (document.body.dataset.tool !== 'pen') return; | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| _drawing = { | |
| type: 'pen', | |
| points: [{ | |
| x: pt.x + containerScrollLeft(), | |
| y: pt.y + containerScrollTop() | |
| }], | |
| color: getArrowColor(), | |
| width: getLineThickness() | |
| }; | |
| renderOverlay(); | |
| e.preventDefault(); | |
| } | |
| function moveDrawPen(e) { | |
| if (!_drawing || _drawing.type !== 'pen') return; | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| _drawing.points.push({ | |
| x: pt.x + containerScrollLeft(), | |
| y: pt.y + containerScrollTop() | |
| }); | |
| renderOverlay(); | |
| e.preventDefault(); | |
| } | |
| function endDrawPen() { | |
| if (!_drawing || _drawing.type !== 'pen') return; | |
| if (_drawing.points.length > 1) { | |
| _shapes.push({ | |
| ..._drawing, | |
| createdAt: Date.now(), | |
| opacity: 1.0 | |
| }); | |
| } | |
| _drawing = null; | |
| saveShapes(); | |
| renderOverlay(); | |
| } | |
| function startDrawSpotlight(e) { | |
| if (document.body.dataset.tool !== 'spotlight') return; | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| _drawing = { | |
| type: 'spotlight', | |
| x: pt.x + containerScrollLeft(), | |
| y: pt.y + containerScrollTop(), | |
| radius: getLineThickness() * 20, // Use thickness to control spotlight size (bigger default) | |
| color: getArrowColor() | |
| }; | |
| renderOverlay(); | |
| e.preventDefault(); | |
| } | |
| function moveDrawSpotlight(e) { | |
| if (!_drawing || _drawing.type !== 'spotlight') return; | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| const dx = pt.x + containerScrollLeft() - _drawing.x; | |
| const dy = pt.y + containerScrollTop() - _drawing.y; | |
| _drawing.radius = Math.max(20, Math.sqrt(dx * dx + dy * dy)); // Minimum radius of 20 | |
| renderOverlay(); | |
| e.preventDefault(); | |
| } | |
| function endDrawSpotlight() { | |
| if (!_drawing || _drawing.type !== 'spotlight') return; | |
| _shapes.push({ | |
| ..._drawing, | |
| createdAt: Date.now(), | |
| opacity: 1.0 | |
| }); | |
| _drawing = null; | |
| saveShapes(); | |
| renderOverlay(); | |
| } | |
| function distPointToSegment(px, py, x1, y1, x2, y2) { | |
| const dx = x2 - x1, dy = y2 - y1; | |
| if (dx === 0 && dy === 0) return Math.hypot(px - x1, py - y1); | |
| const t = Math.max(0, Math.min(1, ((px - x1) * dx + (py - y1) * dy) / (dx*dx + dy*dy))); | |
| const cx = x1 + t * dx, cy = y1 + t * dy; | |
| return Math.hypot(px - cx, py - cy); | |
| } | |
| function eraseAt(e) { | |
| const pt = toCanvasCoords(e.touches ? e.touches[0].clientX : e.clientX, e.touches ? e.touches[0].clientY : e.clientY); | |
| const x = pt.x + containerScrollLeft(); | |
| const y = pt.y + containerScrollTop(); | |
| const threshold = 10; // pixels | |
| for (let i = _shapes.length - 1; i >= 0; i--) { | |
| const s = _shapes[i]; | |
| if (s.type === 'arrow') { | |
| const d = distPointToSegment(x, y, s.x1, s.y1, s.x2, s.y2); | |
| if (d <= threshold) { | |
| _shapes.splice(i, 1); | |
| saveShapes(); | |
| renderOverlay(); | |
| break; | |
| } | |
| } else if (s.type === 'pen' && s.points) { | |
| // Check if click is near any line segment in the pen stroke | |
| let minDist = Infinity; | |
| for (let j = 1; j < s.points.length; j++) { | |
| const d = distPointToSegment(x, y, s.points[j-1].x, s.points[j-1].y, s.points[j].x, s.points[j].y); | |
| minDist = Math.min(minDist, d); | |
| } | |
| if (minDist <= threshold) { | |
| _shapes.splice(i, 1); | |
| saveShapes(); | |
| renderOverlay(); | |
| break; | |
| } | |
| } | |
| } | |
| e.preventDefault(); | |
| } | |
| function drawArrow(ctx, x1, y1, x2, y2, color, width, opacity = 1.0) { | |
| // Set opacity | |
| const oldAlpha = ctx.globalAlpha; | |
| ctx.globalAlpha = opacity; | |
| ctx.strokeStyle = color; | |
| ctx.fillStyle = color; | |
| ctx.lineWidth = width; | |
| ctx.lineCap = 'round'; | |
| ctx.lineJoin = 'round'; | |
| // Check if points are too close (initial state) | |
| const dx = x2 - x1; | |
| const dy = y2 - y1; | |
| const distance = Math.sqrt(dx * dx + dy * dy); | |
| if (distance < 5) { | |
| // Draw just a small arrowhead pointing down-right when first clicked | |
| const defaultAngle = Math.PI / 4; // 45 degrees (down-right) | |
| const headLength = Math.min(15 + width * 1.5, 25); | |
| const headAngle = Math.PI / 6; | |
| // Calculate arrowhead points | |
| const hx1 = x1 + headLength * Math.cos(defaultAngle - headAngle); | |
| const hy1 = y1 + headLength * Math.sin(defaultAngle - headAngle); | |
| const hx2 = x1 + headLength * Math.cos(defaultAngle + headAngle); | |
| const hy2 = y1 + headLength * Math.sin(defaultAngle + headAngle); | |
| // Draw arrowhead only | |
| ctx.beginPath(); | |
| ctx.moveTo(x1, y1); | |
| ctx.lineTo(hx1, hy1); | |
| ctx.lineTo(hx2, hy2); | |
| ctx.closePath(); | |
| ctx.fill(); | |
| } else { | |
| // Normal arrow drawing - head at x1,y1, tail at x2,y2 | |
| const angle = Math.atan2(y1 - y2, x1 - x2); | |
| const headLength = Math.min(15 + width * 1.5, 25); | |
| const headAngle = Math.PI / 6; | |
| // Calculate where the line should end (before the arrowhead) | |
| const lineEndX = x1 - headLength * 0.8 * Math.cos(angle); | |
| const lineEndY = y1 - headLength * 0.8 * Math.sin(angle); | |
| // Draw the line from tail to near the head | |
| ctx.beginPath(); | |
| ctx.moveTo(x2, y2); | |
| ctx.lineTo(lineEndX, lineEndY); | |
| ctx.stroke(); | |
| // Calculate arrowhead points | |
| const hx1 = x1 - headLength * Math.cos(angle - headAngle); | |
| const hy1 = y1 - headLength * Math.sin(angle - headAngle); | |
| const hx2 = x1 - headLength * Math.cos(angle + headAngle); | |
| const hy2 = y1 - headLength * Math.sin(angle + headAngle); | |
| // Draw arrowhead | |
| ctx.beginPath(); | |
| ctx.moveTo(x1, y1); | |
| ctx.lineTo(hx1, hy1); | |
| ctx.lineTo(hx2, hy2); | |
| ctx.closePath(); | |
| ctx.fill(); | |
| } | |
| // Restore opacity | |
| ctx.globalAlpha = oldAlpha; | |
| } | |
| function drawPen(ctx, points, color, width, offX, offY, opacity = 1.0) { | |
| if (!points || points.length < 2) return; | |
| // Set opacity | |
| const oldAlpha = ctx.globalAlpha; | |
| ctx.globalAlpha = opacity; | |
| ctx.strokeStyle = color; | |
| ctx.lineWidth = width; | |
| ctx.lineCap = 'round'; | |
| ctx.lineJoin = 'round'; | |
| ctx.beginPath(); | |
| ctx.moveTo(points[0].x - offX, points[0].y - offY); | |
| for (let i = 1; i < points.length; i++) { | |
| ctx.lineTo(points[i].x - offX, points[i].y - offY); | |
| } | |
| ctx.stroke(); | |
| // Restore opacity | |
| ctx.globalAlpha = oldAlpha; | |
| } | |
| function drawAllSpotlights(ctx, spotlights, offX, offY) { | |
| if (!spotlights || spotlights.length === 0) return; | |
| ctx.save(); | |
| // Calculate the overall opacity based on all spotlights | |
| const maxOpacity = Math.max(...spotlights.map(s => (s.opacity !== undefined ? s.opacity : 1.0))); | |
| // Fill entire canvas with dark overlay | |
| ctx.fillStyle = `rgba(0, 0, 0, ${0.7 * maxOpacity})`; | |
| ctx.fillRect(0, 0, ctx.canvas.width, ctx.canvas.height); | |
| // Cut out completely transparent holes for all spotlights | |
| ctx.globalCompositeOperation = 'destination-out'; | |
| ctx.fillStyle = 'rgba(0, 0, 0, 1)'; // Solid black to ensure complete removal | |
| for (const spotlight of spotlights) { | |
| ctx.beginPath(); | |
| ctx.arc(spotlight.x - offX, spotlight.y - offY, spotlight.radius, 0, 2 * Math.PI); | |
| ctx.fill(); | |
| } | |
| ctx.restore(); | |
| } | |
| function renderOverlay() { | |
| if (!_overlay || !_overlayCtx) return; | |
| _overlayCtx.clearRect(0, 0, _overlay.width, _overlay.height); | |
| const offX = containerScrollLeft(); | |
| const offY = containerScrollTop(); | |
| // Draw non-spotlight shapes first | |
| for (const s of _shapes) { | |
| const opacity = s.opacity !== undefined ? s.opacity : 1.0; | |
| if (s.type === 'arrow') { | |
| drawArrow(_overlayCtx, s.x1 - offX, s.y1 - offY, s.x2 - offX, s.y2 - offY, s.color || '#f00', s.width || 2, opacity); | |
| } else if (s.type === 'pen') { | |
| drawPen(_overlayCtx, s.points, s.color || '#f00', s.width || 2, offX, offY, opacity); | |
| } | |
| } | |
| // Draw current drawing (non-spotlight) | |
| if (_drawing) { | |
| if (_drawing.type === 'pen') { | |
| drawPen(_overlayCtx, _drawing.points, _drawing.color, _drawing.width, offX, offY); | |
| } else if (_drawing.type !== 'spotlight') { | |
| drawArrow(_overlayCtx, _drawing.x1 - offX, _drawing.y1 - offY, _drawing.x2 - offX, _drawing.y2 - offY, _drawing.color, _drawing.width); | |
| } | |
| } | |
| // Collect all spotlights (existing + current drawing + cursor preview) | |
| const spotlights = []; | |
| // Add existing spotlight shapes | |
| for (const s of _shapes) { | |
| if (s.type === 'spotlight') { | |
| spotlights.push({ | |
| x: s.x, | |
| y: s.y, | |
| radius: s.radius, | |
| opacity: s.opacity !== undefined ? s.opacity : 1.0 | |
| }); | |
| } | |
| } | |
| // Add current spotlight being drawn | |
| if (_drawing && _drawing.type === 'spotlight') { | |
| spotlights.push({ | |
| x: _drawing.x, | |
| y: _drawing.y, | |
| radius: _drawing.radius, | |
| opacity: 1.0 | |
| }); | |
| } | |
| // Add cursor preview spotlight if tool is active | |
| if (_cursorVisible && !_drawing) { | |
| const tool = document.body.dataset.tool; | |
| if (tool === 'spotlight') { | |
| const thickness = getLineThickness(); | |
| const radius = thickness * 20; | |
| const cursorWorldX = _cursorX + containerScrollLeft(); | |
| const cursorWorldY = _cursorY + containerScrollTop(); | |
| spotlights.push({ | |
| x: cursorWorldX, | |
| y: cursorWorldY, | |
| radius: radius, | |
| opacity: 0.8 | |
| }); | |
| } | |
| } | |
| // Draw all spotlights as a single overlay with multiple holes | |
| drawAllSpotlights(_overlayCtx, spotlights, offX, offY); | |
| // Draw cursor indicators for non-spotlight tools | |
| if (_cursorVisible && !_drawing) { | |
| const tool = document.body.dataset.tool; | |
| const color = getArrowColor(); | |
| const thickness = getLineThickness(); | |
| if (tool !== 'spotlight') { | |
| _overlayCtx.save(); | |
| _overlayCtx.fillStyle = color; | |
| _overlayCtx.globalAlpha = 0.7; | |
| if (tool === 'eraser') { | |
| // Draw eraser indicator | |
| _overlayCtx.strokeStyle = color; | |
| _overlayCtx.lineWidth = 2; | |
| _overlayCtx.beginPath(); | |
| _overlayCtx.arc(_cursorX, _cursorY, 10, 0, 2 * Math.PI); | |
| _overlayCtx.stroke(); | |
| } else { | |
| // Draw dot for pen/arrow | |
| _overlayCtx.beginPath(); | |
| _overlayCtx.arc(_cursorX, _cursorY, thickness / 2, 0, 2 * Math.PI); | |
| _overlayCtx.fill(); | |
| } | |
| _overlayCtx.restore(); | |
| } | |
| } | |
| } | |
| function setOverlayActive(active) { | |
| if (!_overlay) initOverlay(); | |
| _overlay.style.pointerEvents = active ? 'auto' : 'none'; | |
| _overlay.style.cursor = active ? 'none' : 'auto'; | |
| // Re-render to ensure visibility aligns with content | |
| renderOverlay(); | |
| } | |
| function initOverlay() { | |
| if (_overlay) return; | |
| updateOverlayModeAndContainer(); | |
| _overlay = document.createElement('canvas'); | |
| _overlay.className = 'draw-overlay'; | |
| _overlayCtx = _overlay.getContext('2d'); | |
| document.body.appendChild(_overlay); | |
| updateOverlayBounds(); | |
| loadShapes(); | |
| renderOverlay(); | |
| // Events | |
| _overlay.addEventListener('mousedown', onPointerDown); | |
| _overlay.addEventListener('mousemove', onPointerMove); | |
| _overlay.addEventListener('mouseenter', onPointerEnter); | |
| _overlay.addEventListener('mouseleave', onPointerLeave); | |
| document.addEventListener('mouseup', onPointerUp); | |
| _overlay.addEventListener('touchstart', onPointerDown, { passive: false }); | |
| _overlay.addEventListener('touchmove', onPointerMove, { passive: false }); | |
| document.addEventListener('touchend', onPointerUp); | |
| _overlayResizeHandler = () => updateOverlayBounds(); | |
| window.addEventListener('resize', _overlayResizeHandler); | |
| _overlayScrollHandler = () => renderOverlay(); | |
| window.addEventListener('scroll', _overlayScrollHandler); | |
| // Start fade animation timer | |
| _fadeTimer = setInterval(updateShapesFade, 100); // Update every 100ms for smooth fade | |
| } | |
| function rebindOverlayContainer() { | |
| if (!_overlay) return; | |
| // Remove old scroll handler | |
| if (_overlayScrollHandler) { window.removeEventListener('scroll', _overlayScrollHandler); } | |
| updateOverlayModeAndContainer(); | |
| updateOverlayBounds(); | |
| loadShapes(); | |
| renderOverlay(); | |
| _overlayScrollHandler = () => renderOverlay(); | |
| window.addEventListener('scroll', _overlayScrollHandler); | |
| } | |
| function teardownOverlay() { | |
| if (!_overlay) return; | |
| _overlay.removeEventListener('mousedown', onPointerDown); | |
| _overlay.removeEventListener('mousemove', onPointerMove); | |
| _overlay.removeEventListener('mouseenter', onPointerEnter); | |
| _overlay.removeEventListener('mouseleave', onPointerLeave); | |
| document.removeEventListener('mouseup', onPointerUp); | |
| _overlay.removeEventListener('touchstart', onPointerDown); | |
| _overlay.removeEventListener('touchmove', onPointerMove); | |
| document.removeEventListener('touchend', onPointerUp); | |
| if (_overlayResizeHandler) window.removeEventListener('resize', _overlayResizeHandler); | |
| if (_overlayScrollHandler) { | |
| if (_overlayContainer === window) { | |
| window.removeEventListener('scroll', _overlayScrollHandler); | |
| } else if (_overlayContainer) { | |
| _overlayContainer.removeEventListener('scroll', _overlayScrollHandler); | |
| } | |
| } | |
| if (_fadeTimer) { | |
| clearInterval(_fadeTimer); | |
| _fadeTimer = null; | |
| } | |
| if (_overlay.parentNode) _overlay.parentNode.removeChild(_overlay); | |
| _overlay = null; _overlayCtx = null; _overlayContainer = null; _overlayResizeHandler = null; _overlayScrollHandler = null; _drawing = null; | |
| } | |
| function teardownFileExplorer() { | |
| const fe = document.querySelector('.file-explorer'); | |
| if (fe && fe.parentNode) fe.parentNode.removeChild(fe); | |
| } | |
| function escapeHtml(text) { | |
| const div = document.createElement('div'); | |
| div.textContent = text; | |
| return div.innerHTML; | |
| } | |
| function runCell(cellId){ | |
| const btn=document.querySelector('.run-btn[onclick*="'+cellId+'"]'); | |
| const output=document.getElementById('output-'+cellId); | |
| if(btn){btn.textContent='⏳ running...';btn.disabled=true;} | |
| if(output){output.classList.add('output-stale');} | |
| fetch('/run/'+cellId,{method:'POST'}).then(r=>{if(!r.ok) throw new Error('Run request failed: HTTP '+r.status);return r.json();}).then(data=>{ | |
| if(output){ | |
| output.classList.remove('output-stale'); | |
| let html=''; | |
| if(data.stdout) html+='<div class="cell-stdout">'+escapeHtml(data.stdout)+'</div>'; | |
| console.log('UV Logs:', data); | |
| if(data.stderr) { | |
| // Split UV logs from regular stderr | |
| const lines = data.stderr.split('\n'); | |
| let uvLogs = []; | |
| let regularLogs = []; | |
| let inUvSection = true; | |
| for (const line of lines) { | |
| if (inUvSection) { | |
| uvLogs.push(line); | |
| if (line.startsWith('Installed ')) { | |
| inUvSection = false; | |
| } | |
| } else { | |
| regularLogs.push(line); | |
| } | |
| } | |
| // If we never found "Installed", treat it all as regular stderr | |
| if (inUvSection) { | |
| html+='<div class="cell-stderr">'+escapeHtml(data.stderr)+'</div>'; | |
| } else { | |
| const uvLogsStr = uvLogs.join('\n'); | |
| const regularLogsStr = regularLogs.join('\n').trim(); | |
| if (uvLogsStr) { | |
| html+='<div class="uv-install-logs">'; | |
| html+='<div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div>'; | |
| html+='<div class="uv-logs-content" style="display: none;">'+escapeHtml(uvLogsStr)+'</div>'; | |
| html+='</div>'; | |
| } | |
| if (regularLogsStr) { | |
| html+='<div class="cell-stderr">'+escapeHtml(regularLogsStr)+'</div>'; | |
| } | |
| } | |
| } | |
| output.innerHTML=html; | |
| } | |
| if(btn){btn.textContent='▶ run';btn.disabled=false;} | |
| }).catch(e=>{ | |
| console.error('Run failed:',e); | |
| if(output){output.classList.remove('output-stale');} | |
| if(btn){btn.textContent='▶ run';btn.disabled=false;} | |
| }); | |
| } | |
| function copyCell(cellId){ | |
| console.log('copyCell called with cellId:', cellId); | |
| // Find the element holding the cell source. Highlighted cells are rendered as a table | |
| // (td.linenos + td.code) that may contain no code element, so fall back to the pre inside td.code. | |
| const codeDiv = document.getElementById('code-'+cellId); | |
| let codeElement = null; | |
| if (codeDiv) { | |
| codeElement = codeDiv.querySelector('code') || codeDiv.querySelector('td.code pre') || codeDiv.querySelector('pre'); | |
| } | |
| const btn = document.querySelector('.copy-btn[onclick*="'+cellId+'"]'); | |
| console.log('Found codeElement:', codeElement); | |
| console.log('Found btn:', btn); | |
| console.log('Code div structure:', document.getElementById('code-'+cellId)); | |
| if (!codeElement) { | |
| console.error('Code element not found for cell:', cellId); | |
| // Log the actual structure for debugging | |
| const codeDiv = document.getElementById('code-'+cellId); | |
| if (codeDiv) { | |
| console.log('Code div HTML:', codeDiv.innerHTML); | |
| } | |
| return; | |
| } | |
| if (!btn) { | |
| console.error('Copy button not found for cell:', cellId); | |
| return; | |
| } | |
| const codeText = codeElement.textContent; | |
| console.log('Code text to copy:', codeText ? codeText.substring(0, 50) + '...' : 'empty'); | |
| if (navigator.clipboard && navigator.clipboard.writeText) { | |
| navigator.clipboard.writeText(codeText).then(function() { | |
| console.log('Clipboard copy successful'); | |
| btn.textContent = '✓ Copied!'; | |
| btn.classList.add('copied'); | |
| setTimeout(function() { | |
| btn.textContent = 'Copy'; | |
| btn.classList.remove('copied'); | |
| }, 2000); | |
| }).catch(function(err) { | |
| console.warn('Clipboard copy failed:', err); | |
| fallbackCopy(); | |
| }); | |
| } else { | |
| console.log('Using fallback copy method'); | |
| fallbackCopy(); | |
| } | |
| function fallbackCopy() { | |
| const textarea = document.createElement('textarea'); | |
| textarea.value = codeText; | |
| textarea.style.position = 'absolute'; | |
| textarea.style.left = '-9999px'; | |
| document.body.appendChild(textarea); | |
| textarea.select(); | |
| try { | |
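| const success = document.execCommand('copy'); | |
| console.log('Fallback copy success:', success); | |
| // execCommand signals failure by returning false rather than throwing | |
| if (!success) throw new Error('execCommand returned false'); | |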
| btn.textContent = '✓ Copied!'; | |
| btn.classList.add('copied'); | |
| setTimeout(function() { | |
| btn.textContent = 'Copy'; | |
| btn.classList.remove('copied'); | |
| }, 2000); | |
| } catch (err) { | |
| console.error('Fallback copy failed:', err); | |
| btn.textContent = 'Copy failed'; | |
| setTimeout(function() { | |
| btn.textContent = 'Copy'; | |
| }, 2000); | |
| } | |
| document.body.removeChild(textarea); | |
| } | |
| } | |
| // Live reload functionality (robust SSE handling) | |
| (function(){ | |
| if (!('EventSource' in window)) { | |
| console.warn('SSE not supported in this browser'); | |
| return; | |
| } | |
| let source = new EventSource('/events'); | |
| let isOpen = false; | |
| source.onopen = function(){ isOpen = true; console.log('SSE connected'); }; | |
| source.onmessage = function(e){ | |
| const msg=(e.data||'').trim(); if(!msg) return; | |
| console.log('SSE message:', msg); | |
| if (msg==='reload' || msg==='incremental') { location.reload(); } | |
| // Ignore 'loading' to avoid premature reload loops | |
| }; | |
| source.onerror = function(e){ | |
| // Let EventSource auto-reconnect instead of forcing a reload | |
| if (isOpen) console.warn('SSE error after open, retrying...', e); | |
| }; | |
| window.addEventListener('beforeunload', function(){ try{source.close();}catch(_){} }); | |
| })(); | |
| document.addEventListener('DOMContentLoaded', function() { | |
| updateThemeIcon(); | |
| initMinimap(); | |
| initFileExplorer(); | |
| initTools(); | |
| initOverlay(); | |
| initializeWidgetVisibility(); | |
| layoutWidgetsStackedBottomRight(); | |
| window.addEventListener('resize', layoutWidgetsStackedBottomRight); | |
| // Add ESC key handler to exit tools | |
| document.addEventListener('keydown', function(e) { | |
| if (e.key === 'Escape' || e.keyCode === 27) { | |
| const currentTool = document.body.dataset.tool; | |
| if (currentTool && currentTool !== 'none') { | |
| // Deactivate the current tool | |
| window.setActiveTool('none'); | |
| } | |
| } | |
| }); | |
| }); | |
| </script> | |
| </head> | |
| <body> | |
| <div class="controls"> | |
| <div class="theme-toggle" onclick="toggleTheme()">light</div> | |
| <div class="reset-toggle" onclick="resetLayout()">reset</div> | |
| <div class="menu-button" onclick="toggleMenu()"> | |
| menu ▼ | |
| <div class="menu-dropdown"> | |
| <div class="menu-item" onclick="toggleWidget('tools')"> | |
| <span class="menu-checkbox" id="checkbox-tools">☐</span> Tools | |
| </div> | |
| <div class="menu-item" onclick="toggleWidget('file-explorer')"> | |
| <span class="menu-checkbox" id="checkbox-file-explorer">☐</span> File Explorer | |
| </div> | |
| <div class="menu-item" onclick="toggleWidget('minimap')"> | |
| <span class="menu-checkbox" id="checkbox-minimap">☐</span> Table of Contents | |
| </div> | |
| </div> | |
| </div> | |
| </div> | |
| <div class="system-info"> | |
| <div class="system-info-header">Generated on:</div> | |
| <div class="system-info-content"> | |
| Linux x86_64 | Linux-5.15.0-1084-aws-x86_64-with-glibc2.31 | |
| </div> | |
| </div> | |
| <div class="main-content"> | |
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('nvidia_dump')" style="cursor: pointer;">▶ code</span> | |
| <span onclick="toggleOutput('nvidia_dump')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-nvidia_dump" onclick="toggleUvLogsFromHeader('nvidia_dump')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: nvidia_dump | deps: torch | 33.33s | |
| | <button class="run-btn" onclick="runCell('nvidia_dump')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('nvidia_dump')">Copy</button> | |
| <a href="cells/nvidia_dump.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-nvidia_dump" class="cell-code collapsed"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre></pre></div></td><td class="code"><div><pre><span></span><span class="sd">"""Utility to dump NVIDIA GPU information."""</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">subprocess</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">nvidia_dump</span><span class="p">():</span> | |
| <span class="w">    </span><span class="sd">"""Dump NVIDIA GPU information."""</span> | |
|     <span class="k">try</span><span class="p">:</span> | |
|         <span class="n">result</span> <span class="o">=</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">run</span><span class="p">([</span><span class="s1">'nvidia-smi'</span><span class="p">],</span> <span class="n">capture_output</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">text</span><span class="o">=</span><span class="kc">True</span><span class="p">,</span> <span class="n">check</span><span class="o">=</span><span class="kc">True</span><span class="p">)</span> | |
|         <span class="nb">print</span><span class="p">(</span><span class="s2">"NVIDIA GPU Information:"</span><span class="p">)</span> | |
|         <span class="nb">print</span><span class="p">(</span><span class="n">result</span><span class="o">.</span><span class="n">stdout</span><span class="p">)</span> | |
|     <span class="k">except</span> <span class="ne">FileNotFoundError</span><span class="p">:</span> | |
|         <span class="nb">print</span><span class="p">(</span><span class="s2">"nvidia-smi not found. Are you running on a machine with NVIDIA GPUs?"</span><span class="p">)</span> | |
|     <span class="k">except</span> <span class="n">subprocess</span><span class="o">.</span><span class="n">CalledProcessError</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span> | |
|         <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Error running nvidia-smi: </span><span class="si">{</span><span class="n">e</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="n">nvidia_dump</span><span class="p">()</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-nvidia_dump" class="cell-output"> | |
| <div class="cell-stdout">NVIDIA GPU Information: | |
| Mon Sep 15 16:41:01 2025 | |
| +-----------------------------------------------------------------------------------------+ | |
| | NVIDIA-SMI 560.35.05              Driver Version: 560.35.05      CUDA Version: 12.6     | | |
| |-----------------------------------------+------------------------+----------------------+ | |
| | GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC | | |
| | Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. | | |
| |                                         |                        |               MIG M. | | |
| |=========================================+========================+======================| | |
| |   0  NVIDIA L4                      Off |   00000000:38:00.0 Off |                    0 | | |
| | N/A   46C    P0             28W /   72W |       1MiB /  23034MiB |      0%      Default | | |
| |                                         |                        |                  N/A | | |
| +-----------------------------------------+------------------------+----------------------+ | |
| |   1  NVIDIA L4                      Off |   00000000:3A:00.0 Off |                    0 | | |
| | N/A   46C    P0             28W /   72W |       1MiB /  23034MiB |      2%      Default | | |
| |                                         |                        |                  N/A | | |
| +-----------------------------------------+------------------------+----------------------+ | |
| |   2  NVIDIA L4                      Off |   00000000:3C:00.0 Off |                    0 | | |
| | N/A   49C    P0             31W /   72W |       1MiB /  23034MiB |      2%      Default | | |
| |                                         |                        |                  N/A | | |
| +-----------------------------------------+------------------------+----------------------+ | |
| |   3  NVIDIA L4                      Off |   00000000:3E:00.0 Off |                    0 | | |
| | N/A   48C    P0             29W /   72W |       1MiB /  23034MiB |      2%      Default | | |
| |                                         |                        |                  N/A | | |
| +-----------------------------------------+------------------------+----------------------+ | |
| +-----------------------------------------------------------------------------------------+ | |
| | Processes:                                                                              | | |
| |  GPU   GI   CI        PID   Type   Process name                              GPU Memory | | |
| |        ID   ID                                                               Usage      | | |
| |=========================================================================================| | |
| |  No running processes found                                                             | | |
| +-----------------------------------------------------------------------------------------+ | |
| </div> | |
| <div class="uv-install-logs" id="uv-logs-nvidia_dump"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading networkx (1.9MiB) | |
| Downloading setuptools (1.1MiB) | |
| Downloading nvidia-cufile-cu12 (1.1MiB) | |
| Downloading nvidia-cuda-cupti-cu12 (9.8MiB) | |
| Downloading nvidia-cusolver-cu12 (255.1MiB) | |
| Downloading sympy (6.0MiB) | |
| Downloading nvidia-cublas-cu12 (566.8MiB) | |
| Downloading nvidia-cusparse-cu12 (274.9MiB) | |
| Downloading nvidia-cuda-nvrtc-cu12 (84.0MiB) | |
| Downloading nvidia-nvjitlink-cu12 (37.4MiB) | |
| Downloading nvidia-curand-cu12 (60.7MiB) | |
| Downloading torch (846.8MiB) | |
| Downloading nvidia-cufft-cu12 (184.2MiB) | |
| Downloading nvidia-cusparselt-cu12 (273.9MiB) | |
| Downloading nvidia-cudnn-cu12 (674.0MiB) | |
| Downloading nvidia-nccl-cu12 (307.4MiB) | |
| Downloading triton (148.4MiB) | |
| Downloading nvidia-cufile-cu12 | |
| Downloading setuptools | |
| Downloading networkx | |
| Downloading nvidia-cuda-cupti-cu12 | |
| Downloading nvidia-nvjitlink-cu12 | |
| Downloading nvidia-curand-cu12 | |
| Downloading sympy | |
| Downloading nvidia-cuda-nvrtc-cu12 | |
| Downloading triton | |
| Downloading nvidia-cufft-cu12 | |
| Downloading nvidia-cusolver-cu12 | |
| Downloading nvidia-cusparselt-cu12 | |
| Downloading nvidia-cusparse-cu12 | |
| Downloading nvidia-nccl-cu12 | |
| Downloading nvidia-cublas-cu12 | |
| Downloading nvidia-cudnn-cu12 | |
| Downloading torch | |
| Installed 25 packages in 220ms | |
| </div> | |
| </div> | |
| </div> | |
| </div> | |
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('utils')" style="cursor: pointer;">▶ code</span> | |
| <span onclick="toggleOutput('utils')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-utils" onclick="toggleUvLogsFromHeader('utils')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: utils | deps: torch, numpy | 31.77s | |
| | <button class="run-btn" onclick="runCell('utils')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('utils')">Copy</button> | |
| <a href="cells/utils.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-utils" class="cell-code collapsed"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre></pre></div></td><td class="code"><div><pre><span></span><span class="sd">"""Simple utilities for running the models."""</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">torch</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">to_dtype</span><span class="p">(</span><span class="n">dtype_str</span><span class="p">:</span> <span class="nb">str</span><span class="p">):</span> | |
| <span class="w">    </span><span class="sd">"""Convert string to torch dtype."""</span> | |
|     <span class="k">if</span> <span class="n">dtype_str</span> <span class="o">==</span> <span class="s2">"float16"</span><span class="p">:</span> | |
|         <span class="k">return</span> <span class="n">torch</span><span class="o">.</span><span class="n">float16</span> | |
|     <span class="k">if</span> <span class="n">dtype_str</span> <span class="o">==</span> <span class="s2">"bfloat16"</span><span class="p">:</span> | |
|         <span class="k">return</span> <span class="n">torch</span><span class="o">.</span><span class="n">bfloat16</span> | |
|     <span class="k">return</span> <span class="n">torch</span><span class="o">.</span><span class="n">float32</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">tensor_stats</span><span class="p">(</span><span class="n">t</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">)</span> <span class="o">-></span> <span class="nb">str</span><span class="p">:</span> | |
| <span class="w">    </span><span class="sd">"""Generate stats string for a tensor."""</span> | |
|     <span class="k">return</span> <span class="p">(</span><span class="sa">f</span><span class="s2">"shape=</span><span class="si">{</span><span class="nb">tuple</span><span class="p">(</span><span class="n">t</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span><span class="si">}</span><span class="s2">, "</span> | |
|             <span class="sa">f</span><span class="s2">"dtype=</span><span class="si">{</span><span class="n">t</span><span class="o">.</span><span class="n">dtype</span><span class="si">}</span><span class="s2">, "</span> | |
|             <span class="sa">f</span><span class="s2">"device=</span><span class="si">{</span><span class="n">t</span><span class="o">.</span><span class="n">device</span><span class="si">}</span><span class="s2">, "</span> | |
|             <span class="sa">f</span><span class="s2">"mean=</span><span class="si">{</span><span class="n">t</span><span class="o">.</span><span class="n">mean</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">, "</span> | |
|             <span class="sa">f</span><span class="s2">"std=</span><span class="si">{</span><span class="n">t</span><span class="o">.</span><span class="n">std</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">set_seed</span><span class="p">(</span><span class="n">seed</span><span class="p">:</span> <span class="nb">int</span><span class="p">):</span> | |
| <span class="w">    </span><span class="sd">"""Set seeds for reproducibility."""</span> | |
|     <span class="n">torch</span><span class="o">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
|     <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">():</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">manual_seed_all</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">backends</span><span class="o">.</span><span class="n">cudnn</span><span class="o">.</span><span class="n">deterministic</span> <span class="o">=</span> <span class="kc">True</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">backends</span><span class="o">.</span><span class="n">cudnn</span><span class="o">.</span><span class="n">benchmark</span> <span class="o">=</span> <span class="kc">False</span> | |
| <span class="sd">"""Reusable benchmarking utilities for performance testing."""</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">time</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">numpy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">np</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">contextlib</span><span class="w"> </span><span class="kn">import</span> <span class="n">contextmanager</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">typing</span><span class="w"> </span><span class="kn">import</span> <span class="n">Callable</span><span class="p">,</span> <span class="n">Dict</span><span class="p">,</span> <span class="n">Tuple</span><span class="p">,</span> <span class="n">Any</span><span class="p">,</span> <span class="n">Optional</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">torch</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">json</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">precise_timing</span><span class="p">(</span><span class="n">func</span><span class="p">:</span> <span class="n">Callable</span><span class="p">[[],</span> <span class="n">Any</span><span class="p">],</span> <span class="n">warmup</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">5</span><span class="p">,</span> <span class="n">iters</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">20</span><span class="p">,</span> | |
|                    <span class="n">input_generator</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Callable</span><span class="p">[[</span><span class="nb">int</span><span class="p">],</span> <span class="n">Any</span><span class="p">]]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">)</span> <span class="o">-></span> <span class="n">Tuple</span><span class="p">[</span><span class="n">Any</span><span class="p">,</span> <span class="nb">float</span><span class="p">]:</span> | |
| <span class="w">    </span><span class="sd">"""High precision timing function with warmup and optional input generation per iteration."""</span> | |
|     <span class="c1"># Warmup</span> | |
|     <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">warmup</span><span class="p">):</span> | |
|         <span class="k">if</span> <span class="n">input_generator</span><span class="p">:</span> | |
|             <span class="n">inputs</span> <span class="o">=</span> <span class="n">input_generator</span><span class="p">(</span><span class="n">i</span><span class="p">)</span> | |
|             <span class="n">func</span><span class="p">(</span><span class="n">inputs</span><span class="p">)</span> | |
|         <span class="k">else</span><span class="p">:</span> | |
|             <span class="n">func</span><span class="p">()</span> | |
|     <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">():</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">synchronize</span><span class="p">()</span> | |
|     <span class="n">start</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()</span> | |
|     <span class="n">result</span> <span class="o">=</span> <span class="kc">None</span> | |
|     <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">iters</span><span class="p">):</span> | |
|         <span class="k">if</span> <span class="n">input_generator</span><span class="p">:</span> | |
|             <span class="n">inputs</span> <span class="o">=</span> <span class="n">input_generator</span><span class="p">(</span><span class="n">i</span> <span class="o">+</span> <span class="n">warmup</span><span class="p">)</span>  <span class="c1"># Continue seed sequence after warmup</span> | |
|             <span class="n">result</span> <span class="o">=</span> <span class="n">func</span><span class="p">(</span><span class="n">inputs</span><span class="p">)</span> | |
|         <span class="k">else</span><span class="p">:</span> | |
|             <span class="n">result</span> <span class="o">=</span> <span class="n">func</span><span class="p">()</span> | |
|     <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">():</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">synchronize</span><span class="p">()</span> | |
|     <span class="n">end</span> <span class="o">=</span> <span class="n">time</span><span class="o">.</span><span class="n">perf_counter</span><span class="p">()</span> | |
|     <span class="n">avg_time</span> <span class="o">=</span> <span class="p">(</span><span class="n">end</span> <span class="o">-</span> <span class="n">start</span><span class="p">)</span> <span class="o">/</span> <span class="n">iters</span> | |
|     <span class="k">return</span> <span class="n">result</span><span class="p">,</span> <span class="n">avg_time</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">memory_usage</span><span class="p">()</span> <span class="o">-></span> <span class="n">Dict</span><span class="p">[</span><span class="nb">str</span><span class="p">,</span> <span class="nb">float</span><span class="p">]:</span> | |
| <span class="w">    </span><span class="sd">"""Get current memory usage in GB."""</span> | |
|     <span class="k">if</span> <span class="ow">not</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">():</span> | |
|         <span class="k">return</span> <span class="p">{</span><span class="s2">"allocated"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="s2">"cached"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="s2">"max_allocated"</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">}</span> | |
|     <span class="k">return</span> <span class="p">{</span> | |
|         <span class="s2">"allocated"</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">memory_allocated</span><span class="p">()</span> <span class="o">/</span> <span class="mi">1024</span><span class="o">**</span><span class="mi">3</span><span class="p">,</span> | |
|         <span class="s2">"cached"</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">memory_reserved</span><span class="p">()</span> <span class="o">/</span> <span class="mi">1024</span><span class="o">**</span><span class="mi">3</span><span class="p">,</span> | |
|         <span class="s2">"max_allocated"</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">max_memory_allocated</span><span class="p">()</span> <span class="o">/</span> <span class="mi">1024</span><span class="o">**</span><span class="mi">3</span> | |
|     <span class="p">}</span> | |
| <span class="nd">@contextmanager</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">bench_context</span><span class="p">(</span><span class="n">warmup</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">10</span><span class="p">,</span> <span class="n">iters</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">50</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> | |
|                   <span class="n">tokens</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">int</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">save_json</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="nb">str</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> | |
|                   <span class="n">input_shape</span><span class="p">:</span> <span class="n">Optional</span><span class="p">[</span><span class="n">Tuple</span><span class="p">]</span> <span class="o">=</span> <span class="kc">None</span><span class="p">,</span> <span class="n">input_seed_base</span><span class="p">:</span> <span class="nb">int</span> <span class="o">=</span> <span class="mi">42</span><span class="p">):</span> | |
| <span class="w">    </span><span class="sd">"""Context manager for benchmarking with comprehensive metrics and optional input generation."""</span> | |
|     <span class="k">def</span><span class="w"> </span><span class="nf">run_benchmark</span><span class="p">(</span><span class="n">model_func</span><span class="p">,</span> <span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span> | |
|         <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">empty_cache</span><span class="p">()</span> <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">()</span> <span class="k">else</span> <span class="kc">None</span> | |
|         <span class="n">mem_before</span> <span class="o">=</span> <span class="n">memory_usage</span><span class="p">()</span> | |
|         <span class="c1"># Create input generator if input_shape is provided</span> | |
|         <span class="n">input_generator</span> <span class="o">=</span> <span class="kc">None</span> | |
|         <span class="k">if</span> <span class="n">input_shape</span> <span class="ow">is</span> <span class="ow">not</span> <span class="kc">None</span><span class="p">:</span> | |
|             <span class="k">def</span><span class="w"> </span><span class="nf">create_input</span><span class="p">(</span><span class="n">iteration</span><span class="p">:</span> <span class="nb">int</span><span class="p">):</span> | |
|                 <span class="c1"># Use deterministic but different seed for each iteration</span> | |
|                 <span class="n">iteration_seed</span> <span class="o">=</span> <span class="n">input_seed_base</span> <span class="o">+</span> <span class="n">iteration</span> <span class="o">*</span> <span class="mi">123</span>  <span class="c1"># Spread out seeds</span> | |
|                 <span class="n">torch</span><span class="o">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="n">iteration_seed</span><span class="p">)</span> | |
|                 <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">():</span> | |
|                     <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="n">iteration_seed</span><span class="p">)</span> | |
|                 <span class="k">return</span> <span class="n">torch</span><span class="o">.</span><span class="n">randn</span><span class="p">(</span><span class="o">*</span><span class="n">input_shape</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">)</span> <span class="o">*</span> <span class="mf">0.1</span> | |
|             <span class="n">input_generator</span> <span class="o">=</span> <span class="n">create_input</span> | |
|         <span class="k">if</span> <span class="n">input_generator</span><span class="p">:</span> | |
|             <span class="n">result</span><span class="p">,</span> <span class="n">avg_time</span> <span class="o">=</span> <span class="n">precise_timing</span><span class="p">(</span><span class="k">lambda</span> <span class="n">x</span><span class="p">:</span> <span class="n">model_func</span><span class="p">(</span><span class="n">x</span><span class="p">),</span> <span class="n">warmup</span><span class="p">,</span> <span class="n">iters</span><span class="p">,</span> <span class="n">input_generator</span><span class="p">)</span> | |
|         <span class="k">else</span><span class="p">:</span> | |
|             <span class="n">result</span><span class="p">,</span> <span class="n">avg_time</span> <span class="o">=</span> <span class="n">precise_timing</span><span class="p">(</span><span class="k">lambda</span><span class="p">:</span> <span class="n">model_func</span><span class="p">(</span><span class="o">*</span><span class="n">args</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">),</span> <span class="n">warmup</span><span class="p">,</span> <span class="n">iters</span><span class="p">)</span> | |
|         <span class="n">mem_after</span> <span class="o">=</span> <span class="n">memory_usage</span><span class="p">()</span> | |
|         <span class="c1"># Calculate metrics</span> | |
|         <span class="n">metrics</span> <span class="o">=</span> <span class="p">{</span> | |
|             <span class="s2">"avg_time_ms"</span><span class="p">:</span> <span class="n">avg_time</span> <span class="o">*</span> <span class="mi">1000</span><span class="p">,</span> | |
|             <span class="s2">"throughput_tokens_per_sec"</span><span class="p">:</span> <span class="n">tokens</span> <span class="o">/</span> <span class="n">avg_time</span> <span class="k">if</span> <span class="n">tokens</span> <span class="k">else</span> <span class="kc">None</span><span class="p">,</span> | |
|             <span class="s2">"memory_allocated_gb"</span><span class="p">:</span> <span class="n">mem_after</span><span class="p">[</span><span class="s2">"allocated"</span><span class="p">],</span> | |
|             <span class="s2">"memory_cached_gb"</span><span class="p">:</span> <span class="n">mem_after</span><span class="p">[</span><span class="s2">"cached"</span><span class="p">],</span> | |
|             <span class="s2">"memory_increase_gb"</span><span class="p">:</span> <span class="n">mem_after</span><span class="p">[</span><span class="s2">"allocated"</span><span class="p">]</span> <span class="o">-</span> <span class="n">mem_before</span><span class="p">[</span><span class="s2">"allocated"</span><span class="p">],</span> | |
|             <span class="s2">"device"</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">device</span><span class="p">)</span> <span class="k">if</span> <span class="n">device</span> <span class="k">else</span> <span class="s2">"cpu"</span><span class="p">,</span> | |
|             <span class="s2">"dtype"</span><span class="p">:</span> <span class="nb">str</span><span class="p">(</span><span class="n">dtype</span><span class="p">)</span> <span class="k">if</span> <span class="n">dtype</span> <span class="k">else</span> <span class="s2">"float32"</span><span class="p">,</span> | |
|             <span class="s2">"tokens"</span><span class="p">:</span> <span class="n">tokens</span><span class="p">,</span> | |
|             <span class="s2">"warmup_iters"</span><span class="p">:</span> <span class="n">warmup</span><span class="p">,</span> | |
|             <span class="s2">"timing_iters"</span><span class="p">:</span> <span class="n">iters</span> | |
|         <span class="p">}</span> | |
|         <span class="c1"># Print results</span> | |
|         <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Average time: </span><span class="si">{</span><span class="n">metrics</span><span class="p">[</span><span class="s1">'avg_time_ms'</span><span class="p">]</span><span class="si">:</span><span class="s2">.3f</span><span class="si">}</span><span class="s2"> ms"</span><span class="p">)</span> | |
|         <span class="k">if</span> <span class="n">tokens</span><span class="p">:</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Throughput: </span><span class="si">{</span><span class="n">metrics</span><span class="p">[</span><span class="s1">'throughput_tokens_per_sec'</span><span class="p">]</span><span class="si">:</span><span class="s2">.0f</span><span class="si">}</span><span class="s2"> tokens/sec"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Memory allocated: </span><span class="si">{</span><span class="n">metrics</span><span class="p">[</span><span class="s1">'memory_allocated_gb'</span><span class="p">]</span><span class="si">:</span><span class="s2">.3f</span><span class="si">}</span><span class="s2"> GB"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Memory increase: </span><span class="si">{</span><span class="n">metrics</span><span class="p">[</span><span class="s1">'memory_increase_gb'</span><span class="p">]</span><span class="si">:</span><span class="s2">.3f</span><span class="si">}</span><span class="s2"> GB"</span><span class="p">)</span> | |
| <span class="c1"># Save to JSON if requested</span> | |
| <span class="k">if</span> <span class="n">save_json</span><span class="p">:</span> | |
| <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">save_json</span><span class="p">,</span> <span class="s1">'w'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> | |
| <span class="n">json</span><span class="o">.</span><span class="n">dump</span><span class="p">(</span><span class="n">metrics</span><span class="p">,</span> <span class="n">f</span><span class="p">,</span> <span class="n">indent</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span> | |
| <span class="k">return</span> <span class="n">result</span> | |
| <span class="k">yield</span> <span class="n">run_benchmark</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-utils" class="cell-output"> | |
| <div class="uv-install-logs" id="uv-logs-utils"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading setuptools (1.1MiB) | |
| Downloading nvidia-cuda-nvrtc-cu12 (84.0MiB) | |
| Downloading numpy (15.9MiB) | |
| Downloading nvidia-cudnn-cu12 (674.0MiB) | |
| Downloading nvidia-cusparselt-cu12 (273.9MiB) | |
| Downloading nvidia-nccl-cu12 (307.4MiB) | |
| Downloading nvidia-cublas-cu12 (566.8MiB) | |
| Downloading networkx (1.9MiB) | |
| Downloading nvidia-nvjitlink-cu12 (37.4MiB) | |
| Downloading nvidia-cufft-cu12 (184.2MiB) | |
| Downloading nvidia-cusparse-cu12 (274.9MiB) | |
| Downloading sympy (6.0MiB) | |
| Downloading nvidia-cuda-cupti-cu12 (9.8MiB) | |
| Downloading nvidia-cusolver-cu12 (255.1MiB) | |
| Downloading nvidia-cufile-cu12 (1.1MiB) | |
| Downloading nvidia-curand-cu12 (60.7MiB) | |
| Downloading torch (846.8MiB) | |
| Downloading triton (148.4MiB) | |
| Downloading nvidia-cufile-cu12 | |
| Downloading setuptools | |
| Downloading networkx | |
| Downloading nvidia-cuda-cupti-cu12 | |
| Downloading numpy | |
| Downloading sympy | |
| Downloading nvidia-nvjitlink-cu12 | |
| Downloading nvidia-curand-cu12 | |
| Downloading nvidia-cuda-nvrtc-cu12 | |
| Downloading triton | |
| Downloading nvidia-cufft-cu12 | |
| Downloading nvidia-cusolver-cu12 | |
| Downloading nvidia-cusparselt-cu12 | |
| Downloading nvidia-cusparse-cu12 | |
| Downloading nvidia-nccl-cu12 | |
| Downloading nvidia-cublas-cu12 | |
| Downloading nvidia-cudnn-cu12 | |
| Downloading torch | |
| Installed 26 packages in 292ms | |
| </div> | |
| </div> | |
| </div> | |
| </div> | |
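<p>The metrics dictionary built by <code>run_benchmark</code> can be illustrated with a minimal, CPU-only sketch. The helper name <code>summarize_timing</code> is hypothetical and stands in for the notebook's <code>precise_timing</code>, whose body is not shown in this cell; a GPU version would also need to synchronize the device before reading the clock.</p>

```python
import time


def summarize_timing(fn, warmup=2, iters=5, tokens=None):
    """Hypothetical helper: warm up, time `iters` calls, return metrics.

    Mirrors the keys used by run_benchmark (avg_time_ms,
    throughput_tokens_per_sec). CPU-only: on CUDA the device would have
    to be synchronized before each clock read.
    """
    # Warmup calls are executed but not timed.
    for _ in range(warmup):
        fn()

    start = time.perf_counter()
    for _ in range(iters):
        fn()
    avg_time = (time.perf_counter() - start) / iters  # seconds per call

    return {
        "avg_time_ms": avg_time * 1000,
        # Throughput is only defined when a token count is supplied.
        "throughput_tokens_per_sec": (
            tokens / avg_time if tokens and avg_time > 0 else None
        ),
        "tokens": tokens,
        "warmup_iters": warmup,
        "timing_iters": iters,
    }


if __name__ == "__main__":
    metrics = summarize_timing(lambda: sum(range(10_000)), tokens=4096)
    print(f"Average time: {metrics['avg_time_ms']:.3f} ms")
```

<p>Averaging over several timed iterations after a warmup keeps one-off costs such as lazy initialization out of the reported figure.</p>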
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('config')" style="cursor: pointer;">▶ code</span> | |
| <span onclick="toggleOutput('config')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-config" onclick="toggleUvLogsFromHeader('config')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: config | deps: torch, numpy | 37.88s | |
| | <button class="run-btn" onclick="runCell('config')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('config')">Copy</button> | |
| <a href="cells/config.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-config" class="cell-code collapsed"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span> | |
| <span class="normal"> 2</span> | |
| <span class="normal"> 3</span> | |
| <span class="normal"> 4</span> | |
| <span class="normal"> 5</span> | |
| <span class="normal"> 6</span> | |
| <span class="normal"> 7</span> | |
| <span class="normal"> 8</span> | |
| <span class="normal"> 9</span> | |
| <span class="normal">10</span> | |
| <span class="normal">11</span> | |
| <span class="normal">12</span> | |
| <span class="normal">13</span> | |
| <span class="normal">14</span> | |
| <span class="normal">15</span> | |
| <span class="normal">16</span> | |
| <span class="normal">17</span> | |
| <span class="normal">18</span> | |
| <span class="normal">19</span> | |
| <span class="normal">20</span> | |
| <span class="normal">21</span> | |
| <span class="normal">22</span> | |
| <span class="normal">23</span> | |
| <span class="normal">24</span> | |
| <span class="normal">25</span> | |
| <span class="normal">26</span> | |
| <span class="normal">27</span> | |
| <span class="normal">28</span></pre></div></td><td class="code"><div><pre><span></span><span class="sd">"""Configuration for MoE benchmarks."""</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">torch</span> | |
| <span class="c1"># Model configuration</span> | |
| <span class="n">NUM_EXPERTS</span> <span class="o">=</span> <span class="mi">128</span> | |
| <span class="n">HIDDEN_SIZE</span> <span class="o">=</span> <span class="mi">1152</span> | |
| <span class="n">TOP_K</span> <span class="o">=</span> <span class="mi">4</span> | |
| <span class="c1"># Benchmark configuration </span> | |
| <span class="n">BATCH_SIZE</span> <span class="o">=</span> <span class="mi">8</span> | |
| <span class="n">SEQ_LEN</span> <span class="o">=</span> <span class="mi">512</span> | |
| <span class="n">DTYPE</span> <span class="o">=</span> <span class="s2">"bfloat16"</span> | |
| <span class="n">DEVICE</span> <span class="o">=</span> <span class="s2">"cuda"</span> <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">()</span> <span class="k">else</span> <span class="s2">"cpu"</span> | |
| <span class="c1"># Seeds for reproducibility</span> | |
| <span class="n">WEIGHT_SEED</span> <span class="o">=</span> <span class="mi">999</span> | |
| <span class="n">EXPERT_SEED</span> <span class="o">=</span> <span class="mi">777</span> | |
| <span class="n">INPUT_SEED</span> <span class="o">=</span> <span class="mi">123</span> | |
| <span class="n">GENERAL_SEED</span> <span class="o">=</span> <span class="mi">42</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Configuration:"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Experts: </span><span class="si">{</span><span class="n">NUM_EXPERTS</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Hidden size: </span><span class="si">{</span><span class="n">HIDDEN_SIZE</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Top-k: </span><span class="si">{</span><span class="n">TOP_K</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Batch size: </span><span class="si">{</span><span class="n">BATCH_SIZE</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Sequence length: </span><span class="si">{</span><span class="n">SEQ_LEN</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Device: </span><span class="si">{</span><span class="n">DEVICE</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Dtype: </span><span class="si">{</span><span class="n">DTYPE</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-config" class="cell-output"> | |
| <div class="cell-stdout">Configuration: | |
| Experts: 128 | |
| Hidden size: 1152 | |
| Top-k: 4 | |
| Batch size: 8 | |
| Sequence length: 512 | |
| Device: cuda | |
| Dtype: bfloat16 | |
| </div> | |
| <div class="uv-install-logs" id="uv-logs-config"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading nvidia-curand-cu12 (60.7MiB) | |
| Downloading numpy (15.9MiB) | |
| Downloading networkx (1.9MiB) | |
| Downloading nvidia-nvjitlink-cu12 (37.4MiB) | |
| Downloading nvidia-cuda-cupti-cu12 (9.8MiB) | |
| Downloading nvidia-cufile-cu12 (1.1MiB) | |
| Downloading setuptools (1.1MiB) | |
| Downloading sympy (6.0MiB) | |
| Downloading nvidia-cudnn-cu12 (674.0MiB) | |
| Downloading nvidia-cusparselt-cu12 (273.9MiB) | |
| Downloading nvidia-cusolver-cu12 (255.1MiB) | |
| Downloading nvidia-cublas-cu12 (566.8MiB) | |
| Downloading nvidia-cusparse-cu12 (274.9MiB) | |
| Downloading nvidia-nccl-cu12 (307.4MiB) | |
| Downloading nvidia-cuda-nvrtc-cu12 (84.0MiB) | |
| Downloading torch (846.8MiB) | |
| Downloading nvidia-cufft-cu12 (184.2MiB) | |
| Downloading triton (148.4MiB) | |
| Downloading nvidia-cufile-cu12 | |
| Downloading setuptools | |
| Downloading networkx | |
| Downloading nvidia-cuda-cupti-cu12 | |
| Downloading numpy | |
| Downloading nvidia-nvjitlink-cu12 | |
| Downloading sympy | |
| Downloading nvidia-curand-cu12 | |
| Downloading nvidia-cuda-nvrtc-cu12 | |
| Downloading triton | |
| Downloading nvidia-cufft-cu12 | |
| Downloading nvidia-cusolver-cu12 | |
| Downloading nvidia-cusparse-cu12 | |
| Downloading nvidia-cusparselt-cu12 | |
| Downloading nvidia-nccl-cu12 | |
| Downloading nvidia-cublas-cu12 | |
| Downloading nvidia-cudnn-cu12 | |
| Downloading torch | |
| Installed 26 packages in 207ms | |
| </div> | |
| </div> | |
| </div> | |
| </div> | |
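<p>As a rough worked example of what this configuration implies for routing, the sketch below derives the token count per forward pass and the average load per expert. The function <code>routing_load</code> is hypothetical, and the per-expert average assumes perfectly uniform routing, which a real router will not achieve.</p>

```python
# Values copied from the config cell.
NUM_EXPERTS = 128
TOP_K = 4
BATCH_SIZE = 8
SEQ_LEN = 512


def routing_load(batch_size, seq_len, top_k, num_experts):
    """Hypothetical helper: token and expert-assignment counts per pass."""
    tokens = batch_size * seq_len      # tokens entering the MoE layer
    assignments = tokens * top_k       # each token is sent to top_k experts
    return {
        "tokens": tokens,
        "assignments": assignments,
        # Idealized figure: assumes the router spreads tokens evenly.
        "avg_tokens_per_expert": assignments / num_experts,
    }


load = routing_load(BATCH_SIZE, SEQ_LEN, TOP_K, NUM_EXPERTS)
print(load)
```

<p>With these settings each pass handles 4096 tokens and 16384 token-to-expert assignments, or 128 tokens per expert on average under the uniform-routing assumption.</p>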
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('save_data')" style="cursor: pointer;">▼ code</span> | |
| <span onclick="toggleOutput('save_data')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-save_data" onclick="toggleUvLogsFromHeader('save_data')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: save_data | deps: torch, numpy | 44.56s | |
| | <button class="run-btn" onclick="runCell('save_data')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('save_data')">Copy</button> | |
| <a href="cells/save_data.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-save_data" class="cell-code"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span> | |
| <span class="normal"> 2</span> | |
| <span class="normal"> 3</span> | |
| <span class="normal"> 4</span> | |
| <span class="normal"> 5</span> | |
| <span class="normal"> 6</span> | |
| <span class="normal"> 7</span> | |
| <span class="normal"> 8</span> | |
| <span class="normal"> 9</span> | |
| <span class="normal">10</span> | |
| <span class="normal">11</span> | |
| <span class="normal">12</span> | |
| <span class="normal">13</span> | |
| <span class="normal">14</span> | |
| <span class="normal">15</span> | |
| <span class="normal">16</span> | |
| <span class="normal">17</span> | |
| <span class="normal">18</span> | |
| <span class="normal">19</span> | |
| <span class="normal">20</span> | |
| <span class="normal">21</span> | |
| <span class="normal">22</span> | |
| <span class="normal">23</span> | |
| <span class="normal">24</span> | |
| <span class="normal">25</span> | |
| <span class="normal">26</span> | |
| <span class="normal">27</span> | |
| <span class="normal">28</span> | |
| <span class="normal">29</span> | |
| <span class="normal">30</span> | |
| <span class="normal">31</span> | |
| <span class="normal">32</span> | |
| <span class="normal">33</span> | |
| <span class="normal">34</span> | |
| <span class="normal">35</span> | |
| <span class="normal">36</span> | |
| <span class="normal">37</span> | |
| <span class="normal">38</span> | |
| <span class="normal">39</span> | |
| <span class="normal">40</span> | |
| <span class="normal">41</span> | |
| <span class="normal">42</span> | |
| <span class="normal">43</span> | |
| <span class="normal">44</span> | |
| <span class="normal">45</span> | |
| <span class="normal">46</span> | |
| <span class="normal">47</span> | |
| <span class="normal">48</span> | |
| <span class="normal">49</span> | |
| <span class="normal">50</span> | |
| <span class="normal">51</span> | |
| <span class="normal">52</span> | |
| <span class="normal">53</span> | |
| <span class="normal">54</span> | |
| <span class="normal">55</span> | |
| <span class="normal">56</span> | |
| <span class="normal">57</span> | |
| <span class="normal">58</span> | |
| <span class="normal">59</span> | |
| <span class="normal">60</span></pre></div></td><td class="code"><div><pre><span></span><span class="sd">"""Generate and save shared weights for consistent comparison."""</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">torch</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">numpy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">np</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">pathlib</span><span class="w"> </span><span class="kn">import</span> <span class="n">Path</span> | |
| <span class="c1"># Model configuration</span> | |
| <span class="n">NUM_EXPERTS</span> <span class="o">=</span> <span class="mi">128</span> | |
| <span class="n">HIDDEN_SIZE</span> <span class="o">=</span> <span class="mi">1152</span> | |
| <span class="n">INTERMEDIATE_SIZE</span> <span class="o">=</span> <span class="mi">3072</span> | |
| <span class="n">TOP_K</span> <span class="o">=</span> <span class="mi">4</span> | |
| <span class="c1"># Input configuration</span> | |
| <span class="n">BATCH_SIZE</span> <span class="o">=</span> <span class="mi">1</span> | |
| <span class="n">SEQ_LEN</span> <span class="o">=</span> <span class="mi">100</span> | |
| <span class="n">DTYPE</span> <span class="o">=</span> <span class="s2">"float32"</span> | |
| <span class="n">DEVICE</span> <span class="o">=</span> <span class="s2">"cuda"</span> <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">()</span> <span class="k">else</span> <span class="s2">"cpu"</span> | |
| <span class="c1"># Seeds for reproducibility</span> | |
| <span class="n">WEIGHT_SEED</span> <span class="o">=</span> <span class="mi">999</span> | |
| <span class="n">EXPERT_SEED</span> <span class="o">=</span> <span class="mi">777</span> | |
| <span class="n">INPUT_SEED</span> <span class="o">=</span> <span class="mi">123</span> | |
| <span class="n">GENERAL_SEED</span> <span class="o">=</span> <span class="mi">42</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">set_seed</span><span class="p">(</span><span class="n">seed</span><span class="p">:</span> <span class="nb">int</span><span class="p">):</span> | |
| <span class="w"> </span><span class="sd">"""Set seeds for reproducibility."""</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
| <span class="n">np</span><span class="o">.</span><span class="n">random</span><span class="o">.</span><span class="n">seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
| <span class="k">if</span> <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">is_available</span><span class="p">():</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">manual_seed</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">cuda</span><span class="o">.</span><span class="n">manual_seed_all</span><span class="p">(</span><span class="n">seed</span><span class="p">)</span> | |
| <span class="c1"># Generate shared weights for all implementations</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Generating shared weights..."</span><span class="p">)</span> | |
| <span class="c1"># Router weights</span> | |
| <span class="n">set_seed</span><span class="p">(</span><span class="n">WEIGHT_SEED</span><span class="p">)</span> | |
| <span class="n">router_weight</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">init</span><span class="o">.</span><span class="n">kaiming_uniform_</span><span class="p">(</span><span class="n">router_weight</span><span class="p">)</span> | |
| <span class="n">router_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">NUM_EXPERTS</span><span class="p">)</span> | |
| <span class="c1"># Expert weights: the combined gate/up projection maps HIDDEN_SIZE to 2 * HIDDEN_SIZE (INTERMEDIATE_SIZE is not used here)</span> | |
| <span class="n">set_seed</span><span class="p">(</span><span class="n">EXPERT_SEED</span><span class="p">)</span> | |
| <span class="n">gate_up_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span><span class="o">.</span><span class="n">normal_</span><span class="p">(</span><span class="n">mean</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">std</span><span class="o">=</span><span class="mf">0.02</span><span class="p">)</span> | |
| <span class="n">gate_up_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span> | |
| <span class="n">down_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">empty</span><span class="p">(</span><span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span><span class="o">.</span><span class="n">normal_</span><span class="p">(</span><span class="n">mean</span><span class="o">=</span><span class="mf">0.0</span><span class="p">,</span> <span class="n">std</span><span class="o">=</span><span class="mf">0.02</span><span class="p">)</span> | |
| <span class="n">down_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros</span><span class="p">(</span><span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span> | |
| <span class="c1"># Save weights</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">router_weight</span><span class="p">,</span> <span class="s1">'router_weight.pt'</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">router_bias</span><span class="p">,</span> <span class="s1">'router_bias.pt'</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">gate_up_proj</span><span class="p">,</span> <span class="s1">'gate_up_proj.pt'</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">gate_up_proj_bias</span><span class="p">,</span> <span class="s1">'gate_up_proj_bias.pt'</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">down_proj</span><span class="p">,</span> <span class="s1">'down_proj.pt'</span><span class="p">)</span> | |
| <span class="n">torch</span><span class="o">.</span><span class="n">save</span><span class="p">(</span><span class="n">down_proj_bias</span><span class="p">,</span> <span class="s1">'down_proj_bias.pt'</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Saved weights:"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Router: </span><span class="si">{</span><span class="nb">tuple</span><span class="p">(</span><span class="n">router_weight</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Gate/Up proj: </span><span class="si">{</span><span class="nb">tuple</span><span class="p">(</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Down proj: </span><span class="si">{</span><span class="nb">tuple</span><span class="p">(</span><span class="n">down_proj</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Hidden size: </span><span class="si">{</span><span class="n">HIDDEN_SIZE</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-save_data" class="cell-output"> | |
| <div class="cell-stdout">Generating shared weights... | |
| Saved weights: | |
| Router: (128, 1152) | |
| Gate/Up proj: (128, 1152, 2304) | |
| Down proj: (128, 1152, 1152) | |
| Hidden size: 1152 | |
| </div> | |
| <div class="uv-install-logs" id="uv-logs-save_data"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading nvidia-curand-cu12 (60.7MiB) | |
| Downloading nvidia-cudnn-cu12 (674.0MiB) | |
| Downloading sympy (6.0MiB) | |
| Downloading nvidia-cufft-cu12 (184.2MiB) | |
| Downloading nvidia-cusparselt-cu12 (273.9MiB) | |
| Downloading nvidia-cublas-cu12 (566.8MiB) | |
| Downloading nvidia-cuda-nvrtc-cu12 (84.0MiB) | |
| Downloading numpy (15.9MiB) | |
| Downloading nvidia-cusolver-cu12 (255.1MiB) | |
| Downloading nvidia-nvjitlink-cu12 (37.4MiB) | |
| Downloading networkx (1.9MiB) | |
| Downloading nvidia-cuda-cupti-cu12 (9.8MiB) | |
| Downloading nvidia-nccl-cu12 (307.4MiB) | |
| Downloading nvidia-cusparse-cu12 (274.9MiB) | |
| Downloading triton (148.4MiB) | |
| Downloading setuptools (1.1MiB) | |
| Downloading nvidia-cufile-cu12 (1.1MiB) | |
| Downloading torch (846.8MiB) | |
| Downloading nvidia-cufile-cu12 | |
| Downloading setuptools | |
| Downloading networkx | |
| Downloading nvidia-cuda-cupti-cu12 | |
| Downloading numpy | |
| Downloading sympy | |
| Downloading nvidia-nvjitlink-cu12 | |
| Downloading nvidia-curand-cu12 | |
| Downloading nvidia-cuda-nvrtc-cu12 | |
| Downloading triton | |
| Downloading nvidia-cufft-cu12 | |
| Downloading nvidia-cusolver-cu12 | |
| Downloading nvidia-cusparselt-cu12 | |
| Downloading nvidia-cusparse-cu12 | |
| Downloading nvidia-nccl-cu12 | |
| Downloading nvidia-cublas-cu12 | |
| Downloading nvidia-cudnn-cu12 | |
| Downloading torch | |
| Installed 26 packages in 240ms | |
| </div> | |
| </div> | |
| <div class="cell-artifacts"> | |
| <h4>Artifacts:</h4> | |
| <a href="artifacts/save_data/down_proj.pt" class="artifact" target="_blank">down_proj.pt</a> | |
| <a href="artifacts/save_data/down_proj_bias.pt" class="artifact" target="_blank">down_proj_bias.pt</a> | |
| <a href="artifacts/save_data/gate_up_proj.pt" class="artifact" target="_blank">gate_up_proj.pt</a> | |
| <a href="artifacts/save_data/gate_up_proj_bias.pt" class="artifact" target="_blank">gate_up_proj_bias.pt</a> | |
| <a href="artifacts/save_data/router_weight.pt" class="artifact" target="_blank">router_weight.pt</a> | |
| <a href="artifacts/save_data/router_bias.pt" class="artifact" target="_blank">router_bias.pt</a> | |
| </div> | |
| </div> | |
| </div> | |
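<p>The shapes printed above give a quick estimate of how large the saved tensors are. The sketch below is plain arithmetic on those shapes and assumes 4 bytes per element, which is PyTorch's default <code>float32</code> dtype for <code>torch.empty</code> and <code>torch.zeros</code>. The dictionary name <code>SHAPES</code> is illustrative.</p>

```python
from math import prod

BYTES_PER_ELEMENT = 4  # assumption: tensors use the default float32 dtype

# Shapes taken from the save_data cell (weights and their biases).
SHAPES = {
    "router_weight": (128, 1152),
    "router_bias": (128,),
    "gate_up_proj": (128, 1152, 2304),
    "gate_up_proj_bias": (128, 2304),
    "down_proj": (128, 1152, 1152),
    "down_proj_bias": (128, 1152),
}

# Number of elements per tensor is the product of its dimensions.
counts = {name: prod(shape) for name, shape in SHAPES.items()}
total_bytes = sum(counts.values()) * BYTES_PER_ELEMENT

for name, n in counts.items():
    print(f"{name:>18}: {n:>12,} elements")
print(f"Total: {total_bytes / 2**30:.2f} GiB")
```

<p>The two expert projections dominate: <code>gate_up_proj</code> alone holds about 340 million elements, so under this assumption the saved weights total a little under 2 GiB.</p>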
| <h2>GPT-OSS Implementation</h2> | |
| <p>This section benchmarks the GPT-OSS MoE implementation in non-training (inference) mode.</p> | |
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('gptoss_run')" style="cursor: pointer;">▶ code</span> | |
| <span onclick="toggleOutput('gptoss_run')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-gptoss_run" onclick="toggleUvLogsFromHeader('gptoss_run')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: gptoss_run | deps: torch, numpy | 43.29s | |
| | <button class="run-btn" onclick="runCell('gptoss_run')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('gptoss_run')">Copy</button> | |
| <a href="cells/gptoss_run.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-gptoss_run" class="cell-code collapsed"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span> | |
| <span class="normal"> 2</span> | |
| <span class="normal"> 3</span> | |
| <span class="normal"> 4</span> | |
| <span class="normal"> 5</span> | |
| <span class="normal"> 6</span> | |
| <span class="normal"> 7</span> | |
| <span class="normal"> 8</span> | |
| <span class="normal"> 9</span> | |
| <span class="normal"> 10</span> | |
| <span class="normal"> 11</span> | |
| <span class="normal"> 12</span> | |
| <span class="normal"> 13</span> | |
| <span class="normal"> 14</span> | |
| <span class="normal"> 15</span> | |
| <span class="normal"> 16</span> | |
| <span class="normal"> 17</span> | |
| <span class="normal"> 18</span> | |
| <span class="normal"> 19</span> | |
| <span class="normal"> 20</span> | |
| <span class="normal"> 21</span> | |
| <span class="normal"> 22</span> | |
| <span class="normal"> 23</span> | |
| <span class="normal"> 24</span> | |
| <span class="normal"> 25</span> | |
| <span class="normal"> 26</span> | |
| <span class="normal"> 27</span> | |
| <span class="normal"> 28</span> | |
| <span class="normal"> 29</span> | |
| <span class="normal"> 30</span> | |
| <span class="normal"> 31</span> | |
| <span class="normal"> 32</span> | |
| <span class="normal"> 33</span> | |
| <span class="normal"> 34</span> | |
| <span class="normal"> 35</span> | |
| <span class="normal"> 36</span> | |
| <span class="normal"> 37</span> | |
| <span class="normal"> 38</span> | |
| <span class="normal"> 39</span> | |
| <span class="normal"> 40</span> | |
| <span class="normal"> 41</span> | |
| <span class="normal"> 42</span> | |
| <span class="normal"> 43</span> | |
| <span class="normal"> 44</span> | |
| <span class="normal"> 45</span> | |
| <span class="normal"> 46</span> | |
| <span class="normal"> 47</span> | |
| <span class="normal"> 48</span> | |
| <span class="normal"> 49</span> | |
| <span class="normal"> 50</span> | |
| <span class="normal"> 51</span> | |
| <span class="normal"> 52</span> | |
| <span class="normal"> 53</span> | |
| <span class="normal"> 54</span> | |
| <span class="normal"> 55</span> | |
| <span class="normal"> 56</span> | |
| <span class="normal"> 57</span> | |
| <span class="normal"> 58</span> | |
| <span class="normal"> 59</span> | |
| <span class="normal"> 60</span> | |
| <span class="normal"> 61</span> | |
| <span class="normal"> 62</span> | |
| <span class="normal"> 63</span> | |
| <span class="normal"> 64</span> | |
| <span class="normal"> 65</span> | |
| <span class="normal"> 66</span> | |
| <span class="normal"> 67</span> | |
| <span class="normal"> 68</span> | |
| <span class="normal"> 69</span> | |
| <span class="normal"> 70</span> | |
| <span class="normal"> 71</span> | |
| <span class="normal"> 72</span> | |
| <span class="normal"> 73</span> | |
| <span class="normal"> 74</span> | |
| <span class="normal"> 75</span> | |
| <span class="normal"> 76</span> | |
| <span class="normal"> 77</span> | |
| <span class="normal"> 78</span> | |
| <span class="normal"> 79</span> | |
| <span class="normal"> 80</span> | |
| <span class="normal"> 81</span> | |
| <span class="normal"> 82</span> | |
| <span class="normal"> 83</span> | |
| <span class="normal"> 84</span> | |
| <span class="normal"> 85</span> | |
| <span class="normal"> 86</span> | |
| <span class="normal"> 87</span> | |
| <span class="normal"> 88</span> | |
| <span class="normal"> 89</span> | |
| <span class="normal"> 90</span> | |
| <span class="normal"> 91</span> | |
| <span class="normal"> 92</span> | |
| <span class="normal"> 93</span> | |
| <span class="normal"> 94</span> | |
| <span class="normal"> 95</span> | |
| <span class="normal"> 96</span> | |
| <span class="normal"> 97</span> | |
| <span class="normal"> 98</span> | |
| <span class="normal"> 99</span> | |
| <span class="normal">100</span> | |
| <span class="normal">101</span> | |
| <span class="normal">102</span> | |
| <span class="normal">103</span> | |
| <span class="normal">104</span> | |
| <span class="normal">105</span> | |
| <span class="normal">106</span> | |
| <span class="normal">107</span> | |
| <span class="normal">108</span> | |
| <span class="normal">109</span> | |
| <span class="normal">110</span> | |
| <span class="normal">111</span> | |
| <span class="normal">112</span> | |
| <span class="normal">113</span> | |
| <span class="normal">114</span> | |
| <span class="normal">115</span> | |
| <span class="normal">116</span> | |
| <span class="normal">117</span> | |
| <span class="normal">118</span> | |
| <span class="normal">119</span> | |
| <span class="normal">120</span> | |
| <span class="normal">121</span> | |
| <span class="normal">122</span> | |
| <span class="normal">123</span> | |
| <span class="normal">124</span> | |
| <span class="normal">125</span> | |
| <span class="normal">126</span> | |
| <span class="normal">127</span> | |
| <span class="normal">128</span> | |
| <span class="normal">129</span> | |
| <span class="normal">130</span> | |
| <span class="normal">131</span> | |
| <span class="normal">132</span> | |
| <span class="normal">133</span> | |
| <span class="normal">134</span> | |
| <span class="normal">135</span> | |
| <span class="normal">136</span> | |
| <span class="normal">137</span> | |
| <span class="normal">138</span> | |
| <span class="normal">139</span> | |
| <span class="normal">140</span> | |
| <span class="normal">141</span> | |
| <span class="normal">142</span></pre></div></td><td class="code"><div><pre><span></span><span class="kn">import</span><span class="w"> </span><span class="nn">torch</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">torch</span><span class="w"> </span><span class="kn">import</span> <span class="n">nn</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">torch.nn</span><span class="w"> </span><span class="kn">import</span> <span class="n">functional</span> <span class="k">as</span> <span class="n">F</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">utils</span><span class="w"> </span><span class="kn">import</span> <span class="n">to_dtype</span><span class="p">,</span> <span class="n">tensor_stats</span><span class="p">,</span> <span class="n">set_seed</span><span class="p">,</span> <span class="n">bench_context</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">config</span><span class="w"> </span><span class="kn">import</span> <span class="p">(</span> | |
| <span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">,</span> <span class="n">TOP_K</span><span class="p">,</span> | |
| <span class="n">BATCH_SIZE</span><span class="p">,</span> <span class="n">SEQ_LEN</span><span class="p">,</span> <span class="n">DTYPE</span><span class="p">,</span> <span class="n">DEVICE</span><span class="p">,</span> | |
| <span class="n">WEIGHT_SEED</span><span class="p">,</span> <span class="n">EXPERT_SEED</span><span class="p">,</span> <span class="n">INPUT_SEED</span><span class="p">,</span> <span class="n">GENERAL_SEED</span> | |
| <span class="p">)</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">pathlib</span><span class="w"> </span><span class="kn">import</span> <span class="n">Path</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">os</span> | |
| <span class="c1"># Discover the upstream artifact directory from env</span> | |
| <span class="n">data_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'UVNOTE_INPUT_SAVE_DATA'</span><span class="p">,</span> <span class="s1">'.'</span><span class="p">)</span> | |
| <span class="c1"># Report the artifact directory and list the files it contains</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Loading weights from: </span><span class="si">{</span><span class="n">data_dir</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Files in directory: </span><span class="si">{</span><span class="nb">list</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span><span class="o">.</span><span class="n">glob</span><span class="p">(</span><span class="s1">'*'</span><span class="p">))</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="n">router_weight</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'router_weight.pt'</span><span class="p">)</span> | |
| <span class="n">router_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'router_bias.pt'</span><span class="p">)</span> | |
| <span class="n">gate_up_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'gate_up_proj.pt'</span><span class="p">)</span> | |
| <span class="n">gate_up_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'gate_up_proj_bias.pt'</span><span class="p">)</span> | |
| <span class="n">down_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'down_proj.pt'</span><span class="p">)</span> | |
| <span class="n">down_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'down_proj_bias.pt'</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Loaded shared weights from artifacts"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Router weight sum: </span><span class="si">{</span><span class="n">router_weight</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Gate/up sum: </span><span class="si">{</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Down sum: </span><span class="si">{</span><span class="n">down_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="k">class</span><span class="w"> </span><span class="nc">GptOssRouter</span><span class="p">(</span><span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">):</span> | |
| <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">router_weight</span><span class="p">,</span> <span class="n">router_bias</span><span class="p">):</span> | |
| <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">top_k</span> <span class="o">=</span> <span class="n">TOP_K</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">num_experts</span> <span class="o">=</span> <span class="n">NUM_EXPERTS</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">hidden_dim</span> <span class="o">=</span> <span class="n">HIDDEN_SIZE</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">weight</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">router_weight</span><span class="o">.</span><span class="n">clone</span><span class="p">())</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">bias</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">router_bias</span><span class="o">.</span><span class="n">clone</span><span class="p">())</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">hidden_states</span><span class="p">):</span> | |
| <span class="n">hidden_states</span> <span class="o">=</span> <span class="n">hidden_states</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">hidden_dim</span><span class="p">)</span> | |
| <span class="n">router_logits</span> <span class="o">=</span> <span class="n">F</span><span class="o">.</span><span class="n">linear</span><span class="p">(</span><span class="n">hidden_states</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">weight</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">bias</span><span class="p">)</span> | |
| <span class="n">router_top_value</span><span class="p">,</span> <span class="n">router_indices</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">topk</span><span class="p">(</span><span class="n">router_logits</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">top_k</span><span class="p">,</span> <span class="n">dim</span><span class="o">=-</span><span class="mi">1</span><span class="p">)</span> | |
| <span class="n">router_top_value</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">functional</span><span class="o">.</span><span class="n">softmax</span><span class="p">(</span><span class="n">router_top_value</span><span class="p">,</span> <span class="n">dim</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">router_top_value</span><span class="o">.</span><span class="n">dtype</span><span class="p">)</span> <span class="c1"># softmax over the top-k logits only</span> | |
| <span class="n">router_scores</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">router_logits</span><span class="p">)</span><span class="o">.</span><span class="n">scatter_</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">router_indices</span><span class="p">,</span> <span class="n">router_top_value</span><span class="p">)</span> <span class="c1"># dense [tokens, num_experts] scores, zero for unselected experts</span> | |
| <span class="k">return</span> <span class="n">router_scores</span><span class="p">,</span> <span class="n">router_indices</span> | |
| <span class="k">class</span><span class="w"> </span><span class="nc">GptOssExperts</span><span class="p">(</span><span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">):</span> | |
| <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">gate_up_proj</span><span class="p">,</span> <span class="n">gate_up_proj_bias</span><span class="p">,</span> <span class="n">down_proj</span><span class="p">,</span> <span class="n">down_proj_bias</span><span class="p">):</span> | |
| <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">num_experts</span> <span class="o">=</span> <span class="n">NUM_EXPERTS</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">hidden_size</span> <span class="o">=</span> <span class="n">HIDDEN_SIZE</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">expert_dim</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">hidden_size</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">gate_up_proj</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">clone</span><span class="p">())</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">gate_up_proj_bias</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">gate_up_proj_bias</span><span class="o">.</span><span class="n">clone</span><span class="p">())</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">down_proj</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">down_proj</span><span class="o">.</span><span class="n">clone</span><span class="p">())</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">down_proj_bias</span> <span class="o">=</span> <span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">down_proj_bias</span><span class="o">.</span><span class="n">clone</span><span class="p">())</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">alpha</span> <span class="o">=</span> <span class="mf">1.702</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">limit</span> <span class="o">=</span> <span class="mf">7.0</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">hidden_states</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">,</span> <span class="n">router_indices</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="n">routing_weights</span><span class="o">=</span><span class="kc">None</span><span class="p">)</span> <span class="o">-></span> <span class="n">torch</span><span class="o">.</span><span class="n">Tensor</span><span class="p">:</span> | |
| <span class="n">batch_size</span> <span class="o">=</span> <span class="n">hidden_states</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> | |
| <span class="n">hidden_states</span> <span class="o">=</span> <span class="n">hidden_states</span><span class="o">.</span><span class="n">reshape</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">hidden_size</span><span class="p">)</span> | |
| <span class="n">num_experts</span> <span class="o">=</span> <span class="n">routing_weights</span><span class="o">.</span><span class="n">shape</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> | |
| <span class="k">if</span> <span class="n">hidden_states</span><span class="o">.</span><span class="n">device</span><span class="o">.</span><span class="n">type</span> <span class="o">==</span> <span class="s2">"cpu"</span> <span class="ow">or</span> <span class="bp">self</span><span class="o">.</span><span class="n">training</span><span class="p">:</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">zeros_like</span><span class="p">(</span><span class="n">hidden_states</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">hidden_states</span><span class="o">.</span><span class="n">dtype</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">hidden_states</span><span class="o">.</span><span class="n">device</span><span class="p">)</span> | |
| <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span> | |
| <span class="n">expert_mask</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">functional</span><span class="o">.</span><span class="n">one_hot</span><span class="p">(</span><span class="n">router_indices</span><span class="p">,</span> <span class="n">num_classes</span><span class="o">=</span><span class="n">num_experts</span><span class="p">)</span> | |
| <span class="n">expert_mask</span> <span class="o">=</span> <span class="n">expert_mask</span><span class="o">.</span><span class="n">permute</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="c1"># [tokens, top_k, num_experts] -> [num_experts, top_k, tokens]</span> | |
| <span class="n">expert_hit</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">greater</span><span class="p">(</span><span class="n">expert_mask</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">dim</span><span class="o">=</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="o">-</span><span class="mi">2</span><span class="p">)),</span> <span class="mi">0</span><span class="p">)</span><span class="o">.</span><span class="n">nonzero</span><span class="p">()</span> | |
| <span class="k">for</span> <span class="n">expert_idx</span> <span class="ow">in</span> <span class="n">expert_hit</span><span class="p">[:]:</span> | |
| <span class="n">expert_idx</span> <span class="o">=</span> <span class="n">expert_idx</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> | |
| <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span> | |
| <span class="n">_</span><span class="p">,</span> <span class="n">token_idx</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">where</span><span class="p">(</span><span class="n">expert_mask</span><span class="p">[</span><span class="n">expert_idx</span><span class="p">])</span> | |
| <span class="n">current_state</span> <span class="o">=</span> <span class="n">hidden_states</span><span class="p">[</span><span class="n">token_idx</span><span class="p">]</span> | |
| <span class="n">gate_up</span> <span class="o">=</span> <span class="n">current_state</span> <span class="o">@</span> <span class="bp">self</span><span class="o">.</span><span class="n">gate_up_proj</span><span class="p">[</span><span class="n">expert_idx</span><span class="p">]</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">gate_up_proj_bias</span><span class="p">[</span><span class="n">expert_idx</span><span class="p">]</span> | |
| <span class="n">gate</span><span class="p">,</span> <span class="n">up</span> <span class="o">=</span> <span class="n">gate_up</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="p">::</span><span class="mi">2</span><span class="p">],</span> <span class="n">gate_up</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="mi">1</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span> <span class="c1"># gate and up are interleaved: even columns are gate, odd columns are up</span> | |
| <span class="n">gate</span> <span class="o">=</span> <span class="n">gate</span><span class="o">.</span><span class="n">clamp</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">)</span> | |
| <span class="n">up</span> <span class="o">=</span> <span class="n">up</span><span class="o">.</span><span class="n">clamp</span><span class="p">(</span><span class="nb">min</span><span class="o">=-</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">)</span> | |
| <span class="n">glu</span> <span class="o">=</span> <span class="n">gate</span> <span class="o">*</span> <span class="n">torch</span><span class="o">.</span><span class="n">sigmoid</span><span class="p">(</span><span class="n">gate</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">alpha</span><span class="p">)</span> <span class="c1"># x * sigmoid(alpha * x); with alpha = 1.702 this is the sigmoid approximation of GELU</span> | |
| <span class="n">gated_output</span> <span class="o">=</span> <span class="p">(</span><span class="n">up</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="n">glu</span> | |
| <span class="n">out</span> <span class="o">=</span> <span class="n">gated_output</span> <span class="o">@</span> <span class="bp">self</span><span class="o">.</span><span class="n">down_proj</span><span class="p">[</span><span class="n">expert_idx</span><span class="p">]</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">down_proj_bias</span><span class="p">[</span><span class="n">expert_idx</span><span class="p">]</span> | |
| <span class="n">weighted_output</span> <span class="o">=</span> <span class="n">out</span> <span class="o">*</span> <span class="n">routing_weights</span><span class="p">[</span><span class="n">token_idx</span><span class="p">,</span> <span class="n">expert_idx</span><span class="p">,</span> <span class="kc">None</span><span class="p">]</span> | |
| <span class="n">next_states</span><span class="o">.</span><span class="n">index_add_</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">token_idx</span><span class="p">,</span> <span class="n">weighted_output</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">hidden_states</span><span class="o">.</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">next_states</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">batch_size</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">hidden_size</span><span class="p">)</span> | |
| <span class="k">else</span><span class="p">:</span> | |
| <span class="n">hidden_states</span> <span class="o">=</span> <span class="n">hidden_states</span><span class="o">.</span><span class="n">repeat</span><span class="p">(</span><span class="n">num_experts</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span> <span class="c1"># dense path: replicate all tokens once per expert</span> | |
| <span class="n">hidden_states</span> <span class="o">=</span> <span class="n">hidden_states</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">num_experts</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">hidden_size</span><span class="p">)</span> | |
| <span class="n">gate_up</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">bmm</span><span class="p">(</span><span class="n">hidden_states</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">gate_up_proj</span><span class="p">)</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">gate_up_proj_bias</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="p">:]</span> | |
| <span class="n">gate</span><span class="p">,</span> <span class="n">up</span> <span class="o">=</span> <span class="n">gate_up</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="p">::</span><span class="mi">2</span><span class="p">],</span> <span class="n">gate_up</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="mi">1</span><span class="p">::</span><span class="mi">2</span><span class="p">]</span> | |
| <span class="n">gate</span> <span class="o">=</span> <span class="n">gate</span><span class="o">.</span><span class="n">clamp</span><span class="p">(</span><span class="nb">min</span><span class="o">=</span><span class="kc">None</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">)</span> | |
| <span class="n">up</span> <span class="o">=</span> <span class="n">up</span><span class="o">.</span><span class="n">clamp</span><span class="p">(</span><span class="nb">min</span><span class="o">=-</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">,</span> <span class="nb">max</span><span class="o">=</span><span class="bp">self</span><span class="o">.</span><span class="n">limit</span><span class="p">)</span> | |
| <span class="n">glu</span> <span class="o">=</span> <span class="n">gate</span> <span class="o">*</span> <span class="n">torch</span><span class="o">.</span><span class="n">sigmoid</span><span class="p">(</span><span class="n">gate</span> <span class="o">*</span> <span class="bp">self</span><span class="o">.</span><span class="n">alpha</span><span class="p">)</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">bmm</span><span class="p">(((</span><span class="n">up</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="n">glu</span><span class="p">),</span> <span class="bp">self</span><span class="o">.</span><span class="n">down_proj</span><span class="p">)</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">next_states</span> <span class="o">+</span> <span class="bp">self</span><span class="o">.</span><span class="n">down_proj_bias</span><span class="p">[</span><span class="o">...</span><span class="p">,</span> <span class="kc">None</span><span class="p">,</span> <span class="p">:]</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">next_states</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">num_experts</span><span class="p">,</span> <span class="n">batch_size</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="bp">self</span><span class="o">.</span><span class="n">hidden_size</span><span class="p">)</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">next_states</span> <span class="o">*</span> <span class="n">routing_weights</span><span class="o">.</span><span class="n">transpose</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span><span class="o">.</span><span class="n">view</span><span class="p">(</span><span class="n">num_experts</span><span class="p">,</span> <span class="n">batch_size</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">)[</span><span class="o">...</span><span class="p">,</span> <span class="kc">None</span><span class="p">]</span> | |
| <span class="n">next_states</span> <span class="o">=</span> <span class="n">next_states</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">dim</span><span class="o">=</span><span class="mi">0</span><span class="p">)</span> | |
| <span class="k">return</span> <span class="n">next_states</span> | |
| <span class="k">class</span><span class="w"> </span><span class="nc">GptOssMoEMLP</span><span class="p">(</span><span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">):</span> | |
| <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">router_weight</span><span class="p">,</span> <span class="n">router_bias</span><span class="p">,</span> <span class="n">gate_up_proj</span><span class="p">,</span> <span class="n">gate_up_proj_bias</span><span class="p">,</span> <span class="n">down_proj</span><span class="p">,</span> <span class="n">down_proj_bias</span><span class="p">):</span> | |
| <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">router</span> <span class="o">=</span> <span class="n">GptOssRouter</span><span class="p">(</span><span class="n">router_weight</span><span class="p">,</span> <span class="n">router_bias</span><span class="p">)</span> | |
| <span class="bp">self</span><span class="o">.</span><span class="n">experts</span> <span class="o">=</span> <span class="n">GptOssExperts</span><span class="p">(</span><span class="n">gate_up_proj</span><span class="p">,</span> <span class="n">gate_up_proj_bias</span><span class="p">,</span> <span class="n">down_proj</span><span class="p">,</span> <span class="n">down_proj_bias</span><span class="p">)</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">hidden_states</span><span class="p">):</span> | |
| <span class="n">router_scores</span><span class="p">,</span> <span class="n">router_indices</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">router</span><span class="p">(</span><span class="n">hidden_states</span><span class="p">)</span> | |
| <span class="n">routed_out</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">experts</span><span class="p">(</span><span class="n">hidden_states</span><span class="p">,</span> <span class="n">router_indices</span><span class="o">=</span><span class="n">router_indices</span><span class="p">,</span> <span class="n">routing_weights</span><span class="o">=</span><span class="n">router_scores</span><span class="p">)</span> | |
| <span class="k">return</span> <span class="n">routed_out</span><span class="p">,</span> <span class="n">router_scores</span> | |
| <span class="c1"># Run the model</span> | |
| <span class="n">set_seed</span><span class="p">(</span><span class="n">GENERAL_SEED</span><span class="p">)</span> | |
| <span class="n">device</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">DEVICE</span><span class="p">)</span> | |
| <span class="n">dtype</span> <span class="o">=</span> <span class="n">to_dtype</span><span class="p">(</span><span class="n">DTYPE</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">=== GPT-OSS Implementation ==="</span><span class="p">)</span> | |
| <span class="c1"># Initialize model with loaded weights</span> | |
| <span class="n">model</span> <span class="o">=</span> <span class="n">GptOssMoEMLP</span><span class="p">(</span> | |
| <span class="n">router_weight</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">),</span> | |
| <span class="n">router_bias</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">),</span> | |
| <span class="n">gate_up_proj</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">),</span> | |
| <span class="n">gate_up_proj_bias</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">),</span> | |
| <span class="n">down_proj</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">),</span> | |
| <span class="n">down_proj_bias</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">)</span> | |
| <span class="p">)</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Router weight sum: </span><span class="si">{</span><span class="n">model</span><span class="o">.</span><span class="n">router</span><span class="o">.</span><span class="n">weight</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Gate/up proj sum: </span><span class="si">{</span><span class="n">model</span><span class="o">.</span><span class="n">experts</span><span class="o">.</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Down proj sum: </span><span class="si">{</span><span class="n">model</span><span class="o">.</span><span class="n">experts</span><span class="o">.</span><span class="n">down_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="c1"># Benchmark the model using different input tensors on each iteration</span> | |
| <span class="n">tokens</span> <span class="o">=</span> <span class="n">BATCH_SIZE</span> <span class="o">*</span> <span class="n">SEQ_LEN</span> | |
| <span class="n">input_shape</span> <span class="o">=</span> <span class="p">(</span><span class="n">BATCH_SIZE</span><span class="p">,</span> <span class="n">SEQ_LEN</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span> | |
| <span class="k">with</span> <span class="n">bench_context</span><span class="p">(</span><span class="n">warmup</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">iters</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">,</span> <span class="n">tokens</span><span class="o">=</span><span class="n">tokens</span><span class="p">,</span> | |
| <span class="n">save_json</span><span class="o">=</span><span class="s2">"gptoss_results.json"</span><span class="p">,</span> <span class="n">input_shape</span><span class="o">=</span><span class="n">input_shape</span><span class="p">,</span> <span class="n">input_seed_base</span><span class="o">=</span><span class="n">INPUT_SEED</span><span class="p">)</span> <span class="k">as</span> <span class="n">bench</span><span class="p">:</span> | |
| <span class="n">output</span><span class="p">,</span> <span class="n">stats</span> <span class="o">=</span> <span class="n">bench</span><span class="p">(</span><span class="n">model</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="se">\n</span><span class="s2">Output sum: </span><span class="si">{</span><span class="n">output</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-gptoss_run" class="cell-output"> | |
| <div class="cell-stdout">Configuration: | |
| Experts: 128 | |
| Hidden size: 1152 | |
| Top-k: 4 | |
| Batch size: 8 | |
| Sequence length: 512 | |
| Device: cuda | |
| Dtype: bfloat16 | |
| Loading weights from: /home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e | |
| Files in directory: [PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/down_proj.pt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/down_proj_bias.pt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/stderr.txt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/gate_up_proj.pt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/gate_up_proj_bias.pt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/result.json'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/stdout.txt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/router_weight.pt'), PosixPath('/home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e/router_bias.pt')] | |
| Loaded shared weights from artifacts | |
| Router weight sum: 12.588732 | |
| Gate/up sum: 1026.601807 | |
| Down sum: 206.729263 | |
| === GPT-OSS Implementation === | |
| Router weight sum: 12.562500 | |
| Gate/up proj sum: 1024.000000 | |
| Down proj sum: 207.000000 | |
| Average time: 62.308 ms | |
| Throughput: 65737 tokens/sec | |
| Memory allocated: 1.330 GB | |
| Memory increase: 0.380 GB | |
| Output sum: -4.968750 | |
| </div> | |
| <div class="uv-install-logs" id="uv-logs-gptoss_run"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading setuptools (1.1MiB) | |
| Downloading sympy (6.0MiB) | |
| Downloading nvidia-cusparse-cu12 (274.9MiB) | |
| Downloading nvidia-cuda-cupti-cu12 (9.8MiB) | |
| Downloading nvidia-cufft-cu12 (184.2MiB) | |
| Downloading nvidia-curand-cu12 (60.7MiB) | |
| Downloading numpy (15.9MiB) | |
| Downloading networkx (1.9MiB) | |
| Downloading nvidia-cudnn-cu12 (674.0MiB) | |
| Downloading nvidia-nccl-cu12 (307.4MiB) | |
| Downloading nvidia-cusolver-cu12 (255.1MiB) | |
| Downloading nvidia-cuda-nvrtc-cu12 (84.0MiB) | |
| Downloading nvidia-cusparselt-cu12 (273.9MiB) | |
| Downloading nvidia-cufile-cu12 (1.1MiB) | |
| Downloading nvidia-nvjitlink-cu12 (37.4MiB) | |
| Downloading nvidia-cublas-cu12 (566.8MiB) | |
| Downloading triton (148.4MiB) | |
| Downloading torch (846.8MiB) | |
| Downloading nvidia-cufile-cu12 | |
| Downloading setuptools | |
| Downloading networkx | |
| Downloading nvidia-cuda-cupti-cu12 | |
| Downloading numpy | |
| Downloading nvidia-nvjitlink-cu12 | |
| Downloading sympy | |
| Downloading nvidia-curand-cu12 | |
| Downloading nvidia-cuda-nvrtc-cu12 | |
| Downloading triton | |
| Downloading nvidia-cufft-cu12 | |
| Downloading nvidia-cusolver-cu12 | |
| Downloading nvidia-cusparse-cu12 | |
| Downloading nvidia-cusparselt-cu12 | |
| Downloading nvidia-nccl-cu12 | |
| Downloading nvidia-cublas-cu12 | |
| Downloading nvidia-cudnn-cu12 | |
| Downloading torch | |
| Installed 26 packages in 235ms | |
| </div> | |
| </div> | |
| <div class="cell-artifacts"> | |
| <h4>Artifacts:</h4> | |
| <a href="artifacts/gptoss_run/gptoss_results.json" class="artifact" target="_blank">gptoss_results.json</a> | |
| </div> | |
| </div> | |
| </div> | |
| <h2>MegaBlocks Implementation</h2> | |
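| <p>This section benchmarks the MegaBlocks MoE implementation. It loads the same saved router and expert weights as the GPT-OSS run above and imports the same configuration and benchmark helper, so the two runs are comparable.</p> | |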
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('megablocks_run')" style="cursor: pointer;">▶ code</span> | |
| <span onclick="toggleOutput('megablocks_run')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-megablocks_run" onclick="toggleUvLogsFromHeader('megablocks_run')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: megablocks_run | deps: torch, numpy, kernels | 49.81s | |
| | <button class="run-btn" onclick="runCell('megablocks_run')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('megablocks_run')">Copy</button> | |
| <a href="cells/megablocks_run.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-megablocks_run" class="cell-code collapsed"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="code"><div><pre><span></span><span class="kn">import</span><span class="w"> </span><span class="nn">torch</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">torch</span><span class="w"> </span><span class="kn">import</span> <span class="n">nn</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">torch.nn</span><span class="w"> </span><span class="kn">import</span> <span class="n">functional</span> <span class="k">as</span> <span class="n">F</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">kernels</span><span class="w"> </span><span class="kn">import</span> <span class="n">get_kernel</span><span class="p">,</span> <span class="n">get_local_kernel</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">utils</span><span class="w"> </span><span class="kn">import</span> <span class="n">to_dtype</span><span class="p">,</span> <span class="n">tensor_stats</span><span class="p">,</span> <span class="n">set_seed</span><span class="p">,</span> <span class="n">bench_context</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">config</span><span class="w"> </span><span class="kn">import</span> <span class="p">(</span> | |
| <span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">,</span> <span class="n">TOP_K</span><span class="p">,</span> | |
| <span class="n">BATCH_SIZE</span><span class="p">,</span> <span class="n">SEQ_LEN</span><span class="p">,</span> <span class="n">DTYPE</span><span class="p">,</span> <span class="n">DEVICE</span><span class="p">,</span> | |
| <span class="n">WEIGHT_SEED</span><span class="p">,</span> <span class="n">EXPERT_SEED</span><span class="p">,</span> <span class="n">INPUT_SEED</span><span class="p">,</span> <span class="n">GENERAL_SEED</span> | |
| <span class="p">)</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">pathlib</span><span class="w"> </span><span class="kn">import</span> <span class="n">Path</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">collections</span><span class="w"> </span><span class="kn">import</span> <span class="n">namedtuple</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">os</span> | |
| <span class="c1"># Discover the upstream artifact directory from env</span> | |
| <span class="n">data_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'UVNOTE_INPUT_SAVE_DATA'</span><span class="p">,</span> <span class="s1">'.'</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Loading weights from: </span><span class="si">{</span><span class="n">data_dir</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="n">router_weight</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'router_weight.pt'</span><span class="p">)</span> | |
| <span class="n">router_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'router_bias.pt'</span><span class="p">)</span> | |
| <span class="n">gate_up_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'gate_up_proj.pt'</span><span class="p">)</span> | |
| <span class="n">gate_up_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'gate_up_proj_bias.pt'</span><span class="p">)</span> | |
| <span class="n">down_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'down_proj.pt'</span><span class="p">)</span> | |
| <span class="n">down_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">Path</span><span class="p">(</span><span class="n">data_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'down_proj_bias.pt'</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Loaded shared weights from artifacts"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Router weight sum: </span><span class="si">{</span><span class="n">router_weight</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Gate/up sum: </span><span class="si">{</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Down sum: </span><span class="si">{</span><span class="n">down_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">build_megablocks_model</span><span class="p">(</span><span class="n">device</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="p">:</span> <span class="n">torch</span><span class="o">.</span><span class="n">dtype</span><span class="p">):</span> | |
| <span class="c1"># Download optimized kernels from the Hugging Face hub</span> | |
| <span class="n">megablocks</span> <span class="o">=</span> <span class="n">get_kernel</span><span class="p">(</span><span class="s2">"kernels-community/megablocks"</span><span class="p">)</span> | |
| <span class="c1"># megablocks = get_local_kernel(</span> | |
| <span class="c1"># Path("/home/ubuntu/Projects/megablocks-moe/build"), "megablocks")</span> | |
| <span class="n">model</span> <span class="o">=</span> <span class="n">megablocks</span><span class="o">.</span><span class="n">layers</span><span class="o">.</span><span class="n">MegaBlocksMoeMLP</span><span class="p">()</span> | |
| <span class="c1"># Use the namedtuple class itself (never instantiated) as a plain attribute container for expert weights</span> | |
| <span class="n">model</span><span class="o">.</span><span class="n">experts</span> <span class="o">=</span> <span class="n">namedtuple</span><span class="p">(</span> | |
| <span class="s2">"Experts"</span><span class="p">,</span> <span class="p">[</span><span class="s2">"gate_up_proj"</span><span class="p">,</span> <span class="s2">"gate_up_proj_bias"</span><span class="p">,</span> <span class="s2">"down_proj"</span><span class="p">,</span> <span class="s2">"down_proj_bias"</span><span class="p">,</span> <span class="s2">"hidden_size"</span><span class="p">]</span> | |
| <span class="p">)</span> | |
| <span class="c1"># Use loaded router weights for consistency</span> | |
| <span class="n">model</span><span class="o">.</span><span class="n">router</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Linear</span><span class="p">(</span><span class="n">HIDDEN_SIZE</span><span class="p">,</span> <span class="n">NUM_EXPERTS</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">)</span> | |
| <span class="k">with</span> <span class="n">torch</span><span class="o">.</span><span class="n">no_grad</span><span class="p">():</span> | |
| <span class="n">model</span><span class="o">.</span><span class="n">router</span><span class="o">.</span><span class="n">weight</span><span class="o">.</span><span class="n">copy_</span><span class="p">(</span><span class="n">router_weight</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="n">model</span><span class="o">.</span><span class="n">router</span><span class="o">.</span><span class="n">bias</span><span class="o">.</span><span class="n">copy_</span><span class="p">(</span><span class="n">router_bias</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="c1"># Attach the loaded expert weights and kernel settings (alpha, capacity_factor) to the experts container</span> | |
| <span class="n">e</span> <span class="o">=</span> <span class="n">model</span><span class="o">.</span><span class="n">experts</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">alpha</span> <span class="o">=</span> <span class="mf">1.702</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">capacity_factor</span> <span class="o">=</span> <span class="mi">4</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">gate_up_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">clone</span><span class="p">()</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">gate_up_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">gate_up_proj_bias</span><span class="o">.</span><span class="n">clone</span><span class="p">()</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">down_proj</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">down_proj</span><span class="o">.</span><span class="n">clone</span><span class="p">()</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">down_proj_bias</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">nn</span><span class="o">.</span><span class="n">Parameter</span><span class="p">(</span><span class="n">down_proj_bias</span><span class="o">.</span><span class="n">clone</span><span class="p">()</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">))</span> | |
| <span class="n">e</span><span class="o">.</span><span class="n">hidden_size</span> <span class="o">=</span> <span class="n">HIDDEN_SIZE</span> | |
| <span class="c1"># Log weight statistics for comparison</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"[MegaBlocks] Router weight sum: </span><span class="si">{</span><span class="n">model</span><span class="o">.</span><span class="n">router</span><span class="o">.</span><span class="n">weight</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"[MegaBlocks] Gate/up projection shape: </span><span class="si">{</span><span class="nb">tuple</span><span class="p">(</span><span class="n">e</span><span class="o">.</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span><span class="si">}</span><span class="s2">, sum: </span><span class="si">{</span><span class="n">e</span><span class="o">.</span><span class="n">gate_up_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"[MegaBlocks] Down projection shape: </span><span class="si">{</span><span class="nb">tuple</span><span class="p">(</span><span class="n">e</span><span class="o">.</span><span class="n">down_proj</span><span class="o">.</span><span class="n">shape</span><span class="p">)</span><span class="si">}</span><span class="s2">, sum: </span><span class="si">{</span><span class="n">e</span><span class="o">.</span><span class="n">down_proj</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="k">return</span> <span class="n">model</span> | |
| <span class="c1"># Create a wrapper to match the interface of other implementations</span> | |
| <span class="k">class</span><span class="w"> </span><span class="nc">MegaBlocksMoEWrapper</span><span class="p">(</span><span class="n">nn</span><span class="o">.</span><span class="n">Module</span><span class="p">):</span> | |
|     <span class="k">def</span><span class="w"> </span><span class="fm">__init__</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">megablocks_model</span><span class="p">):</span> | |
|         <span class="nb">super</span><span class="p">()</span><span class="o">.</span><span class="fm">__init__</span><span class="p">()</span> | |
|         <span class="bp">self</span><span class="o">.</span><span class="n">model</span> <span class="o">=</span> <span class="n">megablocks_model</span> | |
|     <span class="k">def</span><span class="w"> </span><span class="nf">forward</span><span class="p">(</span><span class="bp">self</span><span class="p">,</span> <span class="n">hidden_states</span><span class="p">):</span> | |
|         <span class="c1"># MegaBlocks expects input in the format (batch, seq_len, hidden_dim)</span> | |
|         <span class="n">output</span><span class="p">,</span> <span class="n">dummy_routing_weights</span> <span class="o">=</span> <span class="bp">self</span><span class="o">.</span><span class="n">model</span><span class="p">(</span><span class="n">hidden_states</span><span class="p">)</span> | |
|         <span class="c1"># Return output and dummy routing weights for consistency with other implementations</span> | |
|         <span class="c1"># dummy_routing_weights = torch.zeros(</span> | |
|         <span class="c1">#     hidden_states.shape[0] * hidden_states.shape[1],</span> | |
|         <span class="c1">#     NUM_EXPERTS,</span> | |
|         <span class="c1">#     device=hidden_states.device,</span> | |
|         <span class="c1">#     dtype=hidden_states.dtype</span> | |
|         <span class="c1"># )</span> | |
|         <span class="k">return</span> <span class="n">output</span><span class="p">,</span> <span class="n">dummy_routing_weights</span> | |
| <span class="c1"># Run the model</span> | |
| <span class="n">set_seed</span><span class="p">(</span><span class="n">GENERAL_SEED</span><span class="p">)</span> | |
| <span class="n">device</span> <span class="o">=</span> <span class="n">torch</span><span class="o">.</span><span class="n">device</span><span class="p">(</span><span class="n">DEVICE</span><span class="p">)</span> | |
| <span class="n">dtype</span> <span class="o">=</span> <span class="n">to_dtype</span><span class="p">(</span><span class="n">DTYPE</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">=== MegaBlocks Implementation ==="</span><span class="p">)</span> | |
| <span class="c1"># Build MegaBlocks model with loaded weights</span> | |
| <span class="n">megablocks_model</span> <span class="o">=</span> <span class="n">build_megablocks_model</span><span class="p">(</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="p">)</span> | |
| <span class="n">model</span> <span class="o">=</span> <span class="n">MegaBlocksMoEWrapper</span><span class="p">(</span><span class="n">megablocks_model</span><span class="p">)</span><span class="o">.</span><span class="n">to</span><span class="p">(</span><span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">)</span> | |
| <span class="c1"># Benchmark the model using different input tensors on each iteration</span> | |
| <span class="n">tokens</span> <span class="o">=</span> <span class="n">BATCH_SIZE</span> <span class="o">*</span> <span class="n">SEQ_LEN</span> | |
| <span class="n">input_shape</span> <span class="o">=</span> <span class="p">(</span><span class="n">BATCH_SIZE</span><span class="p">,</span> <span class="n">SEQ_LEN</span><span class="p">,</span> <span class="n">HIDDEN_SIZE</span><span class="p">)</span> | |
| <span class="k">with</span> <span class="n">bench_context</span><span class="p">(</span><span class="n">warmup</span><span class="o">=</span><span class="mi">10</span><span class="p">,</span> <span class="n">iters</span><span class="o">=</span><span class="mi">50</span><span class="p">,</span> <span class="n">device</span><span class="o">=</span><span class="n">device</span><span class="p">,</span> <span class="n">dtype</span><span class="o">=</span><span class="n">dtype</span><span class="p">,</span> <span class="n">tokens</span><span class="o">=</span><span class="n">tokens</span><span class="p">,</span> | |
|                    <span class="n">save_json</span><span class="o">=</span><span class="s2">"megablocks_results.json"</span><span class="p">,</span> <span class="n">input_shape</span><span class="o">=</span><span class="n">input_shape</span><span class="p">,</span> <span class="n">input_seed_base</span><span class="o">=</span><span class="n">INPUT_SEED</span><span class="p">)</span> <span class="k">as</span> <span class="n">bench</span><span class="p">:</span> | |
|     <span class="n">output</span><span class="p">,</span> <span class="n">stats</span> <span class="o">=</span> <span class="n">bench</span><span class="p">(</span><span class="n">model</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="se">\n</span><span class="s2">Output sum: </span><span class="si">{</span><span class="n">output</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">sum</span><span class="p">()</span><span class="o">.</span><span class="n">item</span><span class="p">()</span><span class="si">:</span><span class="s2">.6f</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-megablocks_run" class="cell-output"> | |
| <div class="cell-stdout">Configuration: | |
| Experts: 128 | |
| Hidden size: 1152 | |
| Top-k: 4 | |
| Batch size: 8 | |
| Sequence length: 512 | |
| Device: cuda | |
| Dtype: bfloat16 | |
| Loading weights from: /home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/f8d80463181591394d703c9cd286c7929b4a261ab3157d791f92a5933e5a011e | |
| Loaded shared weights from artifacts | |
| Router weight sum: 12.588732 | |
| Gate/up sum: 1026.601807 | |
| Down sum: 206.729263 | |
| === MegaBlocks Implementation === | |
| [MegaBlocks] Router weight sum: 12.562500 | |
| [MegaBlocks] Gate/up projection shape: (128, 1152, 2304), sum: 1024.000000 | |
| [MegaBlocks] Down projection shape: (128, 1152, 1152), sum: 207.000000 | |
| Average time: 26.933 ms | |
| Throughput: 152084 tokens/sec | |
| Memory allocated: 2.243 GB | |
| Memory increase: 1.292 GB | |
| Output sum: -4.968750 | |
| </div> | |
| <div class="uv-install-logs" id="uv-logs-megablocks_run"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading setuptools (1.1MiB) | |
| Downloading nvidia-curand-cu12 (60.7MiB) | |
| Downloading numpy (15.9MiB) | |
| Downloading nvidia-cusparselt-cu12 (273.9MiB) | |
| Downloading nvidia-nccl-cu12 (307.4MiB) | |
| Downloading nvidia-cufile-cu12 (1.1MiB) | |
| Downloading nvidia-cublas-cu12 (566.8MiB) | |
| Downloading nvidia-nvjitlink-cu12 (37.4MiB) | |
| Downloading nvidia-cuda-cupti-cu12 (9.8MiB) | |
| Downloading hf-xet (3.0MiB) | |
| Downloading sympy (6.0MiB) | |
| Downloading nvidia-cusolver-cu12 (255.1MiB) | |
| Downloading nvidia-cufft-cu12 (184.2MiB) | |
| Downloading nvidia-cusparse-cu12 (274.9MiB) | |
| Downloading networkx (1.9MiB) | |
| Downloading nvidia-cuda-nvrtc-cu12 (84.0MiB) | |
| Downloading nvidia-cudnn-cu12 (674.0MiB) | |
| Downloading torch (846.8MiB) | |
| Downloading triton (148.4MiB) | |
| Downloading nvidia-cufile-cu12 | |
| Downloading hf-xet | |
| Downloading setuptools | |
| Downloading networkx | |
| Downloading nvidia-cuda-cupti-cu12 | |
| Downloading numpy | |
| Downloading nvidia-nvjitlink-cu12 | |
| Downloading sympy | |
| Downloading nvidia-curand-cu12 | |
| Downloading nvidia-cuda-nvrtc-cu12 | |
| Downloading triton | |
| Downloading nvidia-cufft-cu12 | |
| Downloading nvidia-cusolver-cu12 | |
| Downloading nvidia-cusparse-cu12 | |
| Downloading nvidia-cusparselt-cu12 | |
| Downloading nvidia-nccl-cu12 | |
| Downloading nvidia-cublas-cu12 | |
| Downloading nvidia-cudnn-cu12 | |
| Downloading torch | |
| Installed 37 packages in 280ms | |
| </div> | |
| </div> | |
| <div class="cell-stderr">Fetching 66 files: 0%| | 0/66 [00:00<?, ?it/s] | |
| Fetching 66 files: 2%|▏ | 1/66 [00:00<00:18, 3.49it/s] | |
| Fetching 66 files: 26%|██▌ | 17/66 [00:01<00:03, 15.86it/s] | |
| Fetching 66 files: 100%|██████████| 66/66 [00:01<00:00, 57.94it/s]</div> | |
| <div class="cell-artifacts"> | |
| <h4>Artifacts:</h4> | |
| <a href="artifacts/megablocks_run/megablocks_results.json" class="artifact" target="_blank">megablocks_results.json</a> | |
| </div> | |
| </div> | |
| </div> | |
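<p>The sums in the output above differ between the shared-weight load (12.588732, 1026.601807, 206.729263) and the <code>[MegaBlocks]</code> lines (12.562500, 1024.000000, 207.000000). That gap is consistent with the second set being reported in bfloat16, which keeps only 8 significant bits. The sketch below is an illustration in plain Python and is not part of the benchmark: <code>round_to_bfloat16</code> is a hypothetical helper, and it only checks that rounding the logged values to bfloat16 gives the <code>[MegaBlocks]</code> figures. It does not reproduce how the tensors were actually cast or summed.</p>

```python
import struct


def round_to_bfloat16(x: float) -> float:
    """Round a finite Python float to the nearest bfloat16 value (ties to even)."""
    # View the value as the 32 raw bits of an IEEE-754 float32.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 is the top 16 bits of float32. Adding this bias before
    # truncating implements round-to-nearest, ties-to-even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", rounded & 0xFFFFFFFF))[0]


# Sums printed when the shared weights were loaded (values copied from the log).
logged_sums = {
    "router": 12.588732,
    "gate_up": 1026.601807,
    "down": 206.729263,
}

for name, value in logged_sums.items():
    print(f"{name}: {value:.6f} -> {round_to_bfloat16(value):.6f}")
```

<p>Run as written, this prints 12.562500, 1024.000000, and 207.000000 for the three sums, so the mismatch in the log looks like a precision effect rather than a sign that different weights were loaded.</p>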
| <h2>Performance Comparison</h2> | |
| <p>This section loads the JSON results saved by the GPT-OSS and MegaBlocks runs and plots the two implementations side by side, comparing latency, throughput, and memory usage.</p> | |
| <div class="cell"> | |
| <div class="cell-header"> | |
| <span class="collapse-indicators"> | |
| <span onclick="toggleCode('visualization')" style="cursor: pointer;">▶ code</span> | |
| <span onclick="toggleOutput('visualization')" style="cursor: pointer;">▼ output</span> | |
| <span id="uv-indicator-visualization" onclick="toggleUvLogsFromHeader('visualization')" style="cursor: pointer;">▶ uv-logs</span> | |
| </span> | | |
| Cell: visualization | deps: matplotlib | 3.33s | |
| | <button class="run-btn" onclick="runCell('visualization')">▶ run</button> | |
| <button class="copy-btn" onclick="copyCell('visualization')">Copy</button> | |
| <a href="cells/visualization.py" target="_blank" class="raw-btn">Raw</a> | |
| </div> | |
| <div id="code-visualization" class="cell-code collapsed"> | |
| <div class="highlight"><table class="highlighttable"><tr><td class="linenos"><div class="linenodiv"><pre><span class="normal"> 1</span> | |
| <span class="normal"> 2</span> | |
| <span class="normal"> 3</span> | |
| <span class="normal"> 4</span> | |
| <span class="normal"> 5</span> | |
| <span class="normal"> 6</span> | |
| <span class="normal"> 7</span> | |
| <span class="normal"> 8</span> | |
| <span class="normal"> 9</span> | |
| <span class="normal"> 10</span> | |
| <span class="normal"> 11</span> | |
| <span class="normal"> 12</span> | |
| <span class="normal"> 13</span> | |
| <span class="normal"> 14</span> | |
| <span class="normal"> 15</span> | |
| <span class="normal"> 16</span> | |
| <span class="normal"> 17</span> | |
| <span class="normal"> 18</span> | |
| <span class="normal"> 19</span> | |
| <span class="normal"> 20</span> | |
| <span class="normal"> 21</span> | |
| <span class="normal"> 22</span> | |
| <span class="normal"> 23</span> | |
| <span class="normal"> 24</span> | |
| <span class="normal"> 25</span> | |
| <span class="normal"> 26</span> | |
| <span class="normal"> 27</span> | |
| <span class="normal"> 28</span> | |
| <span class="normal"> 29</span> | |
| <span class="normal"> 30</span> | |
| <span class="normal"> 31</span> | |
| <span class="normal"> 32</span> | |
| <span class="normal"> 33</span> | |
| <span class="normal"> 34</span> | |
| <span class="normal"> 35</span> | |
| <span class="normal"> 36</span> | |
| <span class="normal"> 37</span> | |
| <span class="normal"> 38</span> | |
| <span class="normal"> 39</span> | |
| <span class="normal"> 40</span> | |
| <span class="normal"> 41</span> | |
| <span class="normal"> 42</span> | |
| <span class="normal"> 43</span> | |
| <span class="normal"> 44</span> | |
| <span class="normal"> 45</span> | |
| <span class="normal"> 46</span> | |
| <span class="normal"> 47</span> | |
| <span class="normal"> 48</span> | |
| <span class="normal"> 49</span> | |
| <span class="normal"> 50</span> | |
| <span class="normal"> 51</span> | |
| <span class="normal"> 52</span> | |
| <span class="normal"> 53</span> | |
| <span class="normal"> 54</span> | |
| <span class="normal"> 55</span> | |
| <span class="normal"> 56</span> | |
| <span class="normal"> 57</span> | |
| <span class="normal"> 58</span> | |
| <span class="normal"> 59</span> | |
| <span class="normal"> 60</span> | |
| <span class="normal"> 61</span> | |
| <span class="normal"> 62</span> | |
| <span class="normal"> 63</span> | |
| <span class="normal"> 64</span> | |
| <span class="normal"> 65</span> | |
| <span class="normal"> 66</span> | |
| <span class="normal"> 67</span> | |
| <span class="normal"> 68</span> | |
| <span class="normal"> 69</span> | |
| <span class="normal"> 70</span> | |
| <span class="normal"> 71</span> | |
| <span class="normal"> 72</span> | |
| <span class="normal"> 73</span> | |
| <span class="normal"> 74</span> | |
| <span class="normal"> 75</span> | |
| <span class="normal"> 76</span> | |
| <span class="normal"> 77</span> | |
| <span class="normal"> 78</span> | |
| <span class="normal"> 79</span> | |
| <span class="normal"> 80</span> | |
| <span class="normal"> 81</span> | |
| <span class="normal"> 82</span> | |
| <span class="normal"> 83</span> | |
| <span class="normal"> 84</span> | |
| <span class="normal"> 85</span> | |
| <span class="normal"> 86</span> | |
| <span class="normal"> 87</span> | |
| <span class="normal"> 88</span> | |
| <span class="normal"> 89</span> | |
| <span class="normal"> 90</span> | |
| <span class="normal"> 91</span> | |
| <span class="normal"> 92</span> | |
| <span class="normal"> 93</span> | |
| <span class="normal"> 94</span> | |
| <span class="normal"> 95</span> | |
| <span class="normal"> 96</span> | |
| <span class="normal"> 97</span> | |
| <span class="normal"> 98</span> | |
| <span class="normal"> 99</span> | |
| <span class="normal">100</span> | |
| <span class="normal">101</span> | |
| <span class="normal">102</span> | |
| <span class="normal">103</span> | |
| <span class="normal">104</span> | |
| <span class="normal">105</span> | |
| <span class="normal">106</span> | |
| <span class="normal">107</span> | |
| <span class="normal">108</span> | |
| <span class="normal">109</span> | |
| <span class="normal">110</span> | |
| <span class="normal">111</span> | |
| <span class="normal">112</span> | |
| <span class="normal">113</span> | |
| <span class="normal">114</span> | |
| <span class="normal">115</span> | |
| <span class="normal">116</span> | |
| <span class="normal">117</span> | |
| <span class="normal">118</span> | |
| <span class="normal">119</span> | |
| <span class="normal">120</span> | |
| <span class="normal">121</span> | |
| <span class="normal">122</span> | |
| <span class="normal">123</span> | |
| <span class="normal">124</span> | |
| <span class="normal">125</span> | |
| <span class="normal">126</span> | |
| <span class="normal">127</span> | |
| <span class="normal">128</span> | |
| <span class="normal">129</span> | |
| <span class="normal">130</span> | |
| <span class="normal">131</span> | |
| <span class="normal">132</span> | |
| <span class="normal">133</span> | |
| <span class="normal">134</span> | |
| <span class="normal">135</span> | |
| <span class="normal">136</span> | |
| <span class="normal">137</span> | |
| <span class="normal">138</span> | |
| <span class="normal">139</span> | |
| <span class="normal">140</span> | |
| <span class="normal">141</span> | |
| <span class="normal">142</span> | |
| <span class="normal">143</span> | |
| <span class="normal">144</span> | |
| <span class="normal">145</span> | |
| <span class="normal">146</span> | |
| <span class="normal">147</span> | |
| <span class="normal">148</span> | |
| <span class="normal">149</span> | |
| <span class="normal">150</span> | |
| <span class="normal">151</span> | |
| <span class="normal">152</span> | |
| <span class="normal">153</span> | |
| <span class="normal">154</span> | |
| <span class="normal">155</span> | |
| <span class="normal">156</span> | |
| <span class="normal">157</span> | |
| <span class="normal">158</span> | |
| <span class="normal">159</span> | |
| <span class="normal">160</span> | |
| <span class="normal">161</span> | |
| <span class="normal">162</span></pre></div></td><td class="code"><div><pre><span></span><span class="kn">import</span><span class="w"> </span><span class="nn">json</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">matplotlib.pyplot</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">plt</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">numpy</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="nn">np</span> | |
| <span class="kn">from</span><span class="w"> </span><span class="nn">pathlib</span><span class="w"> </span><span class="kn">import</span> <span class="n">Path</span> | |
| <span class="kn">import</span><span class="w"> </span><span class="nn">os</span> | |
| <span class="c1"># Get result directories from environment variables</span> | |
| <span class="n">gptoss_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'UVNOTE_INPUT_GPTOSS_RUN'</span><span class="p">,</span> <span class="s1">'.'</span><span class="p">)</span> | |
| <span class="n">megablocks_dir</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="n">environ</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s1">'UVNOTE_INPUT_MEGABLOCKS_RUN'</span><span class="p">,</span> <span class="s1">'.'</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Loading benchmark results from:"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" GPT-OSS dir: </span><span class="si">{</span><span class="n">gptoss_dir</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" MegaBlocks dir: </span><span class="si">{</span><span class="n">megablocks_dir</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="c1"># Load benchmark results</span> | |
| <span class="n">gptoss_file</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">gptoss_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'gptoss_results.json'</span> | |
| <span class="n">megablocks_file</span> <span class="o">=</span> <span class="n">Path</span><span class="p">(</span><span class="n">megablocks_dir</span><span class="p">)</span> <span class="o">/</span> <span class="s1">'megablocks_results.json'</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Loading results from:"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" GPT-OSS: </span><span class="si">{</span><span class="n">gptoss_file</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" MegaBlocks: </span><span class="si">{</span><span class="n">megablocks_file</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="k">if</span> <span class="ow">not</span> <span class="n">gptoss_file</span><span class="o">.</span><span class="n">exists</span><span class="p">():</span> | |
|     <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Warning: </span><span class="si">{</span><span class="n">gptoss_file</span><span class="si">}</span><span class="s2"> not found"</span><span class="p">)</span> | |
| <span class="k">if</span> <span class="ow">not</span> <span class="n">megablocks_file</span><span class="o">.</span><span class="n">exists</span><span class="p">():</span> | |
|     <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"Warning: </span><span class="si">{</span><span class="n">megablocks_file</span><span class="si">}</span><span class="s2"> not found"</span><span class="p">)</span> | |
| <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">gptoss_file</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> | |
|     <span class="n">gptoss_results</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> | |
| <span class="k">with</span> <span class="nb">open</span><span class="p">(</span><span class="n">megablocks_file</span><span class="p">,</span> <span class="s1">'r'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span> | |
|     <span class="n">megablocks_results</span> <span class="o">=</span> <span class="n">json</span><span class="o">.</span><span class="n">load</span><span class="p">(</span><span class="n">f</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"GPT-OSS results keys: </span><span class="si">{</span><span class="nb">list</span><span class="p">(</span><span class="n">gptoss_results</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"MegaBlocks results keys: </span><span class="si">{</span><span class="nb">list</span><span class="p">(</span><span class="n">megablocks_results</span><span class="o">.</span><span class="n">keys</span><span class="p">())</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="c1"># Helper function to extract metrics from either old or new JSON format</span> | |
| <span class="k">def</span><span class="w"> </span><span class="nf">get_metric</span><span class="p">(</span><span class="n">results</span><span class="p">,</span> <span class="n">metric_name</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="mi">0</span><span class="p">):</span> | |
| <span class="w">    </span><span class="sd">"""Extract metric from results, handling both old and new JSON formats."""</span> | |
|     <span class="c1"># New format (with stats dict)</span> | |
|     <span class="k">if</span> <span class="s1">'stats'</span> <span class="ow">in</span> <span class="n">results</span><span class="p">:</span> | |
|         <span class="k">return</span> <span class="n">results</span><span class="p">[</span><span class="s1">'stats'</span><span class="p">]</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="n">metric_name</span><span class="p">,</span> <span class="n">default</span><span class="p">)</span> | |
|     <span class="c1"># Old format (direct keys)</span> | |
|     <span class="k">elif</span> <span class="n">metric_name</span> <span class="ow">in</span> <span class="n">results</span><span class="p">:</span> | |
|         <span class="k">return</span> <span class="n">results</span><span class="p">[</span><span class="n">metric_name</span><span class="p">]</span> | |
|     <span class="k">else</span><span class="p">:</span> | |
|         <span class="k">return</span> <span class="n">default</span> | |
| <span class="c1"># Create comparison plots</span> | |
| <span class="n">fig</span><span class="p">,</span> <span class="p">((</span><span class="n">ax1</span><span class="p">,</span> <span class="n">ax2</span><span class="p">),</span> <span class="p">(</span><span class="n">ax3</span><span class="p">,</span> <span class="n">ax4</span><span class="p">))</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">subplots</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">figsize</span><span class="o">=</span><span class="p">(</span><span class="mi">15</span><span class="p">,</span> <span class="mi">10</span><span class="p">))</span> | |
| <span class="c1"># Performance comparison</span> | |
| <span class="n">implementations</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'GPT-OSS'</span><span class="p">,</span> <span class="s1">'MegaBlocks'</span><span class="p">]</span> | |
| <span class="c1"># Extract timing metrics (handle both avg_ms and avg_time_ms)</span> | |
| <span class="n">gpt_time</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">gptoss_results</span><span class="p">,</span> <span class="s1">'avg_ms'</span><span class="p">,</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">gptoss_results</span><span class="p">,</span> <span class="s1">'avg_time_ms'</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> | |
| <span class="n">mega_time</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">megablocks_results</span><span class="p">,</span> <span class="s1">'avg_ms'</span><span class="p">,</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">megablocks_results</span><span class="p">,</span> <span class="s1">'avg_time_ms'</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> | |
| <span class="n">times</span> <span class="o">=</span> <span class="p">[</span><span class="n">gpt_time</span><span class="p">,</span> <span class="n">mega_time</span><span class="p">]</span> | |
| <span class="c1"># Extract throughput metrics</span> | |
| <span class="n">gpt_throughput</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">gptoss_results</span><span class="p">,</span> <span class="s1">'tokens_per_s'</span><span class="p">,</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">gptoss_results</span><span class="p">,</span> <span class="s1">'throughput_tokens_per_sec'</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> | |
| <span class="n">mega_throughput</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">megablocks_results</span><span class="p">,</span> <span class="s1">'tokens_per_s'</span><span class="p">,</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">megablocks_results</span><span class="p">,</span> <span class="s1">'throughput_tokens_per_sec'</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> | |
| <span class="n">throughputs</span> <span class="o">=</span> <span class="p">[</span><span class="n">gpt_throughput</span><span class="p">,</span> <span class="n">mega_throughput</span><span class="p">]</span> | |
| <span class="c1"># Extract memory metrics</span> | |
| <span class="n">gpt_memory</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">gptoss_results</span><span class="p">,</span> <span class="s1">'memory_allocated_gb'</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> | |
| <span class="n">mega_memory</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">megablocks_results</span><span class="p">,</span> <span class="s1">'memory_allocated_gb'</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> | |
| <span class="n">memory_usage</span> <span class="o">=</span> <span class="p">[</span><span class="n">gpt_memory</span><span class="p">,</span> <span class="n">mega_memory</span><span class="p">]</span> | |
| <span class="n">gpt_mem_inc</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">gptoss_results</span><span class="p">,</span> <span class="s1">'memory_increase_gb'</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> | |
| <span class="n">mega_mem_inc</span> <span class="o">=</span> <span class="n">get_metric</span><span class="p">(</span><span class="n">megablocks_results</span><span class="p">,</span> <span class="s1">'memory_increase_gb'</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> | |
| <span class="n">memory_increase</span> <span class="o">=</span> <span class="p">[</span><span class="n">gpt_mem_inc</span><span class="p">,</span> <span class="n">mega_mem_inc</span><span class="p">]</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"Extracted metrics:"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Times (ms): </span><span class="si">{</span><span class="n">times</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Throughputs: </span><span class="si">{</span><span class="n">throughputs</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Memory usage (GB): </span><span class="si">{</span><span class="n">memory_usage</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">" Memory increase (GB): </span><span class="si">{</span><span class="n">memory_increase</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="n">colors</span> <span class="o">=</span> <span class="p">[</span><span class="s1">'#2E8B57'</span><span class="p">,</span> <span class="s1">'#4169E1'</span><span class="p">]</span> | |
| <span class="c1"># Latency comparison</span> | |
| <span class="n">bars1</span> <span class="o">=</span> <span class="n">ax1</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span><span class="n">implementations</span><span class="p">,</span> <span class="n">times</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">colors</span><span class="p">)</span> | |
| <span class="n">ax1</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s1">'Average Time (ms)'</span><span class="p">)</span> | |
| <span class="n">ax1</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="s1">'Latency Comparison'</span><span class="p">)</span> | |
| <span class="n">ax1</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span> | |
| <span class="c1"># Add values on bars</span> | |
| <span class="k">for</span> <span class="n">bar</span><span class="p">,</span> <span class="n">time</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">bars1</span><span class="p">,</span> <span class="n">times</span><span class="p">):</span> | |
| <span class="n">height</span> <span class="o">=</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_height</span><span class="p">()</span> | |
| <span class="n">ax1</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">bar</span><span class="o">.</span><span class="n">get_x</span><span class="p">()</span> <span class="o">+</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_width</span><span class="p">()</span><span class="o">/</span><span class="mf">2.</span><span class="p">,</span> <span class="n">height</span> <span class="o">+</span> <span class="n">height</span><span class="o">*</span><span class="mf">0.01</span><span class="p">,</span> | |
| <span class="sa">f</span><span class="s1">'</span><span class="si">{</span><span class="n">time</span><span class="si">:</span><span class="s1">.2f</span><span class="si">}</span><span class="s1">ms'</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s1">'center'</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="s1">'bottom'</span><span class="p">)</span> | |
| <span class="c1"># Throughput comparison </span> | |
| <span class="n">bars2</span> <span class="o">=</span> <span class="n">ax2</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span><span class="n">implementations</span><span class="p">,</span> <span class="n">throughputs</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">colors</span><span class="p">)</span> | |
| <span class="n">ax2</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s1">'Tokens per Second'</span><span class="p">)</span> | |
| <span class="n">ax2</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="s1">'Throughput Comparison'</span><span class="p">)</span> | |
| <span class="n">ax2</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span> | |
| <span class="c1"># Add values on bars</span> | |
| <span class="k">for</span> <span class="n">bar</span><span class="p">,</span> <span class="n">throughput</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">bars2</span><span class="p">,</span> <span class="n">throughputs</span><span class="p">):</span> | |
| <span class="n">height</span> <span class="o">=</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_height</span><span class="p">()</span> | |
| <span class="n">ax2</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">bar</span><span class="o">.</span><span class="n">get_x</span><span class="p">()</span> <span class="o">+</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_width</span><span class="p">()</span><span class="o">/</span><span class="mf">2.</span><span class="p">,</span> <span class="n">height</span> <span class="o">+</span> <span class="n">height</span><span class="o">*</span><span class="mf">0.01</span><span class="p">,</span> | |
| <span class="sa">f</span><span class="s1">'</span><span class="si">{</span><span class="n">throughput</span><span class="si">:</span><span class="s1">.0f</span><span class="si">}</span><span class="s1">'</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s1">'center'</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="s1">'bottom'</span><span class="p">)</span> | |
| <span class="c1"># Memory usage comparison</span> | |
| <span class="n">bars3</span> <span class="o">=</span> <span class="n">ax3</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span><span class="n">implementations</span><span class="p">,</span> <span class="n">memory_usage</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">colors</span><span class="p">)</span> | |
| <span class="n">ax3</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s1">'Memory Allocated (GB)'</span><span class="p">)</span> | |
| <span class="n">ax3</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="s1">'Memory Usage Comparison'</span><span class="p">)</span> | |
| <span class="n">ax3</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span> | |
| <span class="c1"># Add values on bars</span> | |
| <span class="k">for</span> <span class="n">bar</span><span class="p">,</span> <span class="n">mem</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">bars3</span><span class="p">,</span> <span class="n">memory_usage</span><span class="p">):</span> | |
| <span class="n">height</span> <span class="o">=</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_height</span><span class="p">()</span> | |
| <span class="n">ax3</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">bar</span><span class="o">.</span><span class="n">get_x</span><span class="p">()</span> <span class="o">+</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_width</span><span class="p">()</span><span class="o">/</span><span class="mf">2.</span><span class="p">,</span> <span class="n">height</span> <span class="o">+</span> <span class="n">height</span><span class="o">*</span><span class="mf">0.01</span><span class="p">,</span> | |
| <span class="sa">f</span><span class="s1">'</span><span class="si">{</span><span class="n">mem</span><span class="si">:</span><span class="s1">.2f</span><span class="si">}</span><span class="s1">GB'</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s1">'center'</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="s1">'bottom'</span><span class="p">)</span> | |
| <span class="c1"># Memory increase comparison</span> | |
| <span class="n">bars4</span> <span class="o">=</span> <span class="n">ax4</span><span class="o">.</span><span class="n">bar</span><span class="p">(</span><span class="n">implementations</span><span class="p">,</span> <span class="n">memory_increase</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="n">colors</span><span class="p">)</span> | |
| <span class="n">ax4</span><span class="o">.</span><span class="n">set_ylabel</span><span class="p">(</span><span class="s1">'Memory Increase (GB)'</span><span class="p">)</span> | |
| <span class="n">ax4</span><span class="o">.</span><span class="n">set_title</span><span class="p">(</span><span class="s1">'Memory Increase Comparison'</span><span class="p">)</span> | |
| <span class="n">ax4</span><span class="o">.</span><span class="n">grid</span><span class="p">(</span><span class="kc">True</span><span class="p">,</span> <span class="n">alpha</span><span class="o">=</span><span class="mf">0.3</span><span class="p">)</span> | |
| <span class="c1"># Add values on bars</span> | |
| <span class="k">for</span> <span class="n">bar</span><span class="p">,</span> <span class="n">mem_inc</span> <span class="ow">in</span> <span class="nb">zip</span><span class="p">(</span><span class="n">bars4</span><span class="p">,</span> <span class="n">memory_increase</span><span class="p">):</span> | |
| <span class="n">height</span> <span class="o">=</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_height</span><span class="p">()</span> | |
| <span class="n">ax4</span><span class="o">.</span><span class="n">text</span><span class="p">(</span><span class="n">bar</span><span class="o">.</span><span class="n">get_x</span><span class="p">()</span> <span class="o">+</span> <span class="n">bar</span><span class="o">.</span><span class="n">get_width</span><span class="p">()</span><span class="o">/</span><span class="mf">2.</span><span class="p">,</span> <span class="n">height</span> <span class="o">+</span> <span class="n">height</span><span class="o">*</span><span class="mf">0.01</span><span class="p">,</span> | |
| <span class="sa">f</span><span class="s1">'</span><span class="si">{</span><span class="n">mem_inc</span><span class="si">:</span><span class="s1">.3f</span><span class="si">}</span><span class="s1">GB'</span><span class="p">,</span> <span class="n">ha</span><span class="o">=</span><span class="s1">'center'</span><span class="p">,</span> <span class="n">va</span><span class="o">=</span><span class="s1">'bottom'</span><span class="p">)</span> | |
| <span class="n">plt</span><span class="o">.</span><span class="n">tight_layout</span><span class="p">()</span> | |
| <span class="n">plt</span><span class="o">.</span><span class="n">savefig</span><span class="p">(</span><span class="s1">'small_moe_comparison.png'</span><span class="p">,</span> <span class="n">dpi</span><span class="o">=</span><span class="mi">150</span><span class="p">,</span> <span class="n">bbox_inches</span><span class="o">=</span><span class="s1">'tight'</span><span class="p">)</span> | |
| <span class="n">plt</span><span class="o">.</span><span class="n">show</span><span class="p">()</span> | |
| <span class="c1"># Print summary table</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">"</span> <span class="o">+</span> <span class="s2">"="</span><span class="o">*</span><span class="mi">60</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"PERFORMANCE COMPARISON SUMMARY"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"="</span><span class="o">*</span><span class="mi">60</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="s1">'Metric'</span><span class="si">:</span><span class="s2"><25</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="s1">'GPT-OSS'</span><span class="si">:</span><span class="s2"><15</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="s1">'MegaBlocks'</span><span class="si">:</span><span class="s2"><15</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="s1">'Winner'</span><span class="si">:</span><span class="s2"><10</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"-"</span> <span class="o">*</span> <span class="mi">60</span><span class="p">)</span> | |
| <span class="c1"># Determine winners</span> | |
| <span class="n">latency_winner</span> <span class="o">=</span> <span class="s2">"GPT-OSS"</span> <span class="k">if</span> <span class="n">times</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><</span> <span class="n">times</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">else</span> <span class="s2">"MegaBlocks"</span> | |
| <span class="n">throughput_winner</span> <span class="o">=</span> <span class="s2">"GPT-OSS"</span> <span class="k">if</span> <span class="n">throughputs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">></span> <span class="n">throughputs</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">else</span> <span class="s2">"MegaBlocks"</span> | |
| <span class="n">memory_winner</span> <span class="o">=</span> <span class="s2">"GPT-OSS"</span> <span class="k">if</span> <span class="n">memory_usage</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><</span> <span class="n">memory_usage</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">else</span> <span class="s2">"MegaBlocks"</span> | |
| <span class="n">mem_inc_winner</span> <span class="o">=</span> <span class="s2">"GPT-OSS"</span> <span class="k">if</span> <span class="n">memory_increase</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><</span> <span class="n">memory_increase</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">else</span> <span class="s2">"MegaBlocks"</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="s1">'Latency (ms)'</span><span class="si">:</span><span class="s2"><25</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">times</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.2f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">times</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.2f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">latency_winner</span><span class="si">:</span><span class="s2"><10</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="s1">'Throughput (tok/s)'</span><span class="si">:</span><span class="s2"><25</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">throughputs</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.0f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">throughputs</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.0f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">throughput_winner</span><span class="si">:</span><span class="s2"><10</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="s1">'Memory Usage (GB)'</span><span class="si">:</span><span class="s2"><25</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">memory_usage</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.3f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">memory_usage</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.3f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">memory_winner</span><span class="si">:</span><span class="s2"><10</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="s1">'Memory Increase (GB)'</span><span class="si">:</span><span class="s2"><25</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">memory_increase</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.3f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">memory_increase</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span><span class="si">:</span><span class="s2"><15.3f</span><span class="si">}</span><span class="s2"> </span><span class="si">{</span><span class="n">mem_inc_winner</span><span class="si">:</span><span class="s2"><10</span><span class="si">}</span><span class="s2">"</span><span class="p">)</span> | |
| <span class="c1"># Speed ratio</span> | |
| <span class="n">speed_ratio</span> <span class="o">=</span> <span class="n">times</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="o">/</span> <span class="n">times</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="k">if</span> <span class="n">times</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><</span> <span class="n">times</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> <span class="k">else</span> <span class="n">times</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">/</span> <span class="n">times</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span> | |
| <span class="n">faster_impl</span> <span class="o">=</span> <span class="n">latency_winner</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="se">\n</span><span class="si">{</span><span class="n">faster_impl</span><span class="si">}</span><span class="s2"> is </span><span class="si">{</span><span class="n">speed_ratio</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">x faster"</span><span class="p">)</span> | |
| <span class="c1"># Throughput ratio </span> | |
| <span class="n">throughput_ratio</span> <span class="o">=</span> <span class="nb">max</span><span class="p">(</span><span class="n">throughputs</span><span class="p">)</span> <span class="o">/</span> <span class="nb">min</span><span class="p">(</span><span class="n">throughputs</span><span class="p">)</span> | |
| <span class="n">higher_throughput</span> <span class="o">=</span> <span class="n">throughput_winner</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="sa">f</span><span class="s2">"</span><span class="si">{</span><span class="n">higher_throughput</span><span class="si">}</span><span class="s2"> has </span><span class="si">{</span><span class="n">throughput_ratio</span><span class="si">:</span><span class="s2">.2f</span><span class="si">}</span><span class="s2">x higher throughput"</span><span class="p">)</span> | |
| <span class="nb">print</span><span class="p">(</span><span class="s2">"="</span><span class="o">*</span><span class="mi">60</span><span class="p">)</span> | |
| </pre></div></td></tr></table></div> | |
| </div> | |
| <div id="output-visualization" class="cell-output"> | |
| <div class="cell-stdout">Loading benchmark results from: | |
| GPT-OSS dir: /home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/fc17d5998a27217e1676a638ddeceb18cab662c6e9b30c9a62218784604c9a26 | |
| MegaBlocks dir: /home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/6e3545a8e3c2ca65ca800a7e1c1824fded11e28258efcd83355514bb0646e166 | |
| Loading results from: | |
| GPT-OSS: /home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/fc17d5998a27217e1676a638ddeceb18cab662c6e9b30c9a62218784604c9a26/gptoss_results.json | |
| MegaBlocks: /home/ubuntu/Projects/uvnote-megablocks-bench/.uvnote/cache/6e3545a8e3c2ca65ca800a7e1c1824fded11e28258efcd83355514bb0646e166/megablocks_results.json | |
| GPT-OSS results keys: ['avg_time_ms', 'throughput_tokens_per_sec', 'memory_allocated_gb', 'memory_cached_gb', 'memory_increase_gb', 'device', 'dtype', 'tokens', 'warmup_iters', 'timing_iters'] | |
| MegaBlocks results keys: ['avg_time_ms', 'throughput_tokens_per_sec', 'memory_allocated_gb', 'memory_cached_gb', 'memory_increase_gb', 'device', 'dtype', 'tokens', 'warmup_iters', 'timing_iters'] | |
| Extracted metrics: | |
| Times (ms): [62.308485079556704, 26.93254135781899] | |
| Throughputs: [65737.43519474348, 152083.67994618745] | |
| Memory usage (GB): [1.329831600189209, 2.2425241470336914] | |
| Memory increase (GB): [0.3795137405395508, 1.2922062873840332] | |
| ============================================================ | |
| PERFORMANCE COMPARISON SUMMARY | |
| ============================================================ | |
| Metric GPT-OSS MegaBlocks Winner | |
| ------------------------------------------------------------ | |
| Latency (ms) 62.31 26.93 MegaBlocks | |
| Throughput (tok/s) 65737 152084 MegaBlocks | |
| Memory Usage (GB) 1.330 2.243 GPT-OSS | |
| Memory Increase (GB) 0.380 1.292 GPT-OSS | |
| MegaBlocks is 2.31x faster | |
| MegaBlocks has 2.31x higher throughput | |
| ============================================================ | |
| </div> | |
| <div class="uv-install-logs" id="uv-logs-visualization"> | |
| <div class="uv-logs-header" onclick="toggleUvLogs(this)">▶ UV Install Logs</div> | |
| <div class="uv-logs-content" style="display: none;"> | |
| Downloading pillow (6.3MiB) | |
| Downloading numpy (15.9MiB) | |
| Downloading fonttools (4.7MiB) | |
| Downloading matplotlib (8.3MiB) | |
| Downloading kiwisolver (1.4MiB) | |
| Downloading kiwisolver | |
| Downloading pillow | |
| Downloading fonttools | |
| Downloading matplotlib | |
| Downloading numpy | |
| Installed 11 packages in 29ms | |
| </div> | |
| </div> | |
| <div class="cell-artifacts"> | |
| <h4>Artifacts:</h4> | |
| <a href="artifacts/visualization/small_moe_comparison.png" class="artifact" target="_blank">small_moe_comparison.png</a> | |
| <div class="artifact-preview"> | |
| <img src="artifacts/visualization/small_moe_comparison.png" alt="small_moe_comparison.png"> | |
| </div> | |
| </div> | |
| </div> | |
| </div> | |
| <h2>Conclusion</h2> | |
| <p>This benchmark compares the GPT-OSS (non-training mode) and MegaBlocks MoE implementations on the same hardware, with identical weights and inputs. It measures four metrics:</p> | |
| <ol> | |
| <li><strong>Latency</strong>: Average forward pass time</li> | |
| <li><strong>Throughput</strong>: Tokens processed per second</li> | |
| <li><strong>Memory Usage</strong>: GPU memory consumption</li> | |
| <li><strong>Memory Efficiency</strong>: Memory increase during execution</li> | |
| </ol> | |
| <p>Both implementations use:</p> | |
| <ul> | |
| <li>64 experts with top-2 routing</li> | |
| <li>768 hidden dimensions</li> | |
| <li>Batch size of 8, sequence length of 512</li> | |
| <li>bfloat16 precision</li> | |
| <li>Identical pre-generated weights for fair comparison</li> | |
| </ul> | |
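| <p>As a sanity check on the numbers reported in this run, throughput should equal the number of tokens in one forward pass divided by the average latency. The sketch below assumes the token count is batch size times sequence length (8 x 512 = 4096) and copies the latencies from the "Extracted metrics" output; <code>tokens_per_second</code> is a hypothetical helper, not part of the benchmark code.</p> | |

```python
# Values copied from the "Extracted metrics" output of this run.
BATCH_SIZE = 8
SEQ_LEN = 512
avg_time_ms = {"GPT-OSS": 62.308485079556704, "MegaBlocks": 26.93254135781899}


def tokens_per_second(num_tokens, time_ms):
    """Convert an average forward-pass time in milliseconds to tokens/s."""
    return num_tokens / (time_ms / 1000.0)


# Assumption: one forward pass processes batch_size * seq_len tokens.
num_tokens = BATCH_SIZE * SEQ_LEN

derived = {name: tokens_per_second(num_tokens, ms) for name, ms in avg_time_ms.items()}

for name, tps in derived.items():
    print(f"{name}: {tps:.0f} tokens/s")
```

| <p>The derived values round to 65737 and 152084 tokens/s, matching the summary table, which suggests the reported throughput follows directly from the measured latency.</p> | |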
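| <p>For this configuration, MegaBlocks was 2.31x faster (26.93 ms vs 62.31 ms per forward pass, or 152,084 vs 65,737 tokens/s), while GPT-OSS used less GPU memory (1.330 GB vs 2.243 GB allocated, and a 0.380 GB vs 1.292 GB increase during execution). MegaBlocks is the better fit when latency or throughput is the priority; GPT-OSS is the better fit when GPU memory is the tighter constraint.</p> | |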
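| <p>One caveat about the summary script: it divides one measurement by another (for example <code>max(throughputs) / min(throughputs)</code>), and the metrics are read with <code>get_metric(..., 0)</code>. Assuming that third argument is a fallback for a missing key, a missing metric would raise <code>ZeroDivisionError</code>. A minimal sketch of a guarded comparison follows; <code>safe_ratio</code> and <code>describe_speedup</code> are hypothetical helpers, not part of the benchmark code.</p> | |

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def describe_speedup(names, times_ms):
    """Describe which of the implementations has the lowest latency."""
    slow, fast = max(times_ms), min(times_ms)
    ratio = safe_ratio(slow, fast)
    if ratio is None:
        # A zero latency most likely means the metric was missing.
        return "Speed ratio unavailable (a latency value is missing or zero)"
    winner = names[times_ms.index(fast)]
    return f"{winner} is {ratio:.2f}x faster"


names = ["GPT-OSS", "MegaBlocks"]

# Latencies copied from the "Extracted metrics" output of this run.
print(describe_speedup(names, [62.308485079556704, 26.93254135781899]))

# Simulated missing metric: the fallback value 0 no longer crashes the summary.
print(describe_speedup(names, [0, 26.93254135781899]))
```

| <p>With the measured latencies this prints the same "MegaBlocks is 2.31x faster" line as the summary, and it reports the ratio as unavailable instead of failing when a value is zero.</p> | |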
| </div> | |
| </body> | |
| </html> |