Spaces: build-small-hackathon/Scrypt (Running on Zero)

Scrypt / space

Commit History

space: ssr_mode=False on launch — gradio 6 SSR's Node proxy doesn't forward the raw /pty websocket

95ab054

IMJONEZZ committed 12 days ago

space: transformers 5 apply_chat_template returns BatchEncoding — use return_dict + **enc into generate (fixes AttributeError on .shape)

aac926a

IMJONEZZ committed 12 days ago
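
The pattern described in aac926a can be sketched roughly as follows. This is an illustration, not the Space's actual code: `generate_reply` and its parameters are hypothetical names, and it assumes `model` and `tokenizer` are a Hugging Face causal LM and its tokenizer, already loaded elsewhere.

```python
def generate_reply(model, tokenizer, messages, max_new_tokens=64):
    """Hypothetical helper: chat messages in, decoded reply out."""
    # return_dict=True asks for a mapping (input_ids, attention_mask, ...)
    # rather than relying on the return value being a bare tensor.
    enc = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    )
    enc = enc.to(model.device)

    # Unpack the mapping so generate() receives each field as a keyword.
    output_ids = model.generate(**enc, max_new_tokens=max_new_tokens)

    # Take .shape from the input_ids tensor, not from the encoding object,
    # and drop the prompt tokens before decoding.
    prompt_len = enc["input_ids"].shape[-1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True)
```

Reading the prompt length from `enc["input_ids"]` is the part that avoids calling `.shape` on the encoding object itself.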

space: load Nemotron the normal way — transformers-native (no trust_remote_code), NO mamba_ssm/causal_conv1d. Those custom Triton CUDA kernels were the segfault (THCPModule_initExtension); native falls back to pure-torch Mamba on ZeroGPU.

0c2e095

IMJONEZZ committed 12 days ago

space: adopt the org's proven NPCverse structure — gradio 6 Server + @app.api + app.launch() (installs ZeroGPU hooks), transformers 5 (compatible with gradio 6; trust_remote_code uses our repo's modeling). Replaces the custom engine.launch+route-surgery that broke the hooks and segfaulted.

9203831

IMJONEZZ committed 12 days ago

space: drive @spaces.GPU through Gradio's API (gr.api + gradio_client), not run_in_threadpool — the threadpool call inits CUDA off-thread and segfaults. Matches how the org's NPCverse/the-deal spaces invoke GPU work.

13015f6

IMJONEZZ committed 12 days ago

space: finetuned Warden on ZeroGPU the documented way — bf16 + .to('cuda') module-level + @spaces.GPU(xlarge), no bitsandbytes/device_map (the actual fix). Direct run_in_threadpool call verified by the probe.

321303b

IMJONEZZ committed 12 days ago

space: ZeroGPU diagnostic — measure CPU RAM/disk/VRAM + confirm a @spaces.GPU call works, before loading the model the documented .to('cuda') way

10c83ac

IMJONEZZ committed 12 days ago
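
A diagnostic of the kind 10c83ac describes might look something like the sketch below. The function name `resource_snapshot` and the returned keys are hypothetical; the commit does not show how the Space measures these values. The VRAM part assumes `torch` is installed and is skipped otherwise.

```python
import os
import shutil


def resource_snapshot(path="."):
    """Return disk, CPU RAM, and (if available) VRAM figures in bytes."""
    snapshot = {}

    # Disk: totals for the filesystem that contains `path`.
    disk = shutil.disk_usage(path)
    snapshot["disk_total"] = disk.total
    snapshot["disk_free"] = disk.free

    # CPU RAM: physical memory via sysconf (POSIX only; None elsewhere).
    try:
        snapshot["ram_total"] = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        snapshot["ram_total"] = None

    # VRAM: only when torch is importable and a CUDA device is usable.
    try:
        import torch

        if torch.cuda.is_available():
            free, total = torch.cuda.mem_get_info()
            snapshot["vram_free"] = free
            snapshot["vram_total"] = total
    except (ImportError, RuntimeError):
        pass

    return snapshot
```

On ZeroGPU the VRAM query would presumably need to run inside a `@spaces.GPU`-decorated function, since that is where a GPU is attached; that placement is an assumption here.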

space: move /static mount ahead of gradio catch-all (styling regression fix)

ee38482

IMJONEZZ committed 12 days ago
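
Why mount order matters in ee38482 can be shown with a toy first-match router. This is a simplified model, assuming the web framework checks routes in registration order and stops at the first match; it is not the framework's real implementation, and `resolve` and the route names are hypothetical.

```python
def resolve(routes, path):
    """Return the name of the first route whose prefix matches `path`."""
    for prefix, name in routes:
        if path.startswith(prefix):
            return name
    return None


# Catch-all registered first: it shadows the /static mount.
broken = [("/", "gradio"), ("/static", "static_files")]

# Specific mount registered first: /static is reachable again.
fixed = [("/static", "static_files"), ("/", "gradio")]

print(resolve(broken, "/static/style.css"))  # gradio
print(resolve(fixed, "/static/style.css"))   # static_files
```

With the catch-all first, stylesheet requests never reach the static handler, which matches the styling regression the commit message mentions.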

space: add /api/probe to verify live Warden generation end-to-end

6152ad5

IMJONEZZ committed 12 days ago

space: revert to Gradio SDK + CPU llama-cpp-python (keeps the prize; ZeroGPU was the problem, not the SDK)

e577af2

IMJONEZZ committed 12 days ago

space: load model lazily inside the GPU worker — module-level device_map=cuda + bnb poisoned the ZeroGPU fork's CUDA context

c1a8f99

IMJONEZZ committed 12 days ago
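
The lazy-loading idea in c1a8f99 can be sketched generically as below. `make_lazy_loader` and `load_fn` are hypothetical names; the real loader would build the model, which is omitted here. Note that the later commit 321303b moves to module-level loading instead, so this only illustrates the approach tried at this point in the history.

```python
import functools


def make_lazy_loader(load_fn):
    """Wrap `load_fn` so it runs at most once, on first use."""

    @functools.lru_cache(maxsize=1)
    def get_model():
        # Nothing is loaded at import time; the first caller pays the cost
        # and later callers in the same process get the cached object.
        return load_fn()

    return get_model
```

The GPU worker would call `get_model()` at the start of each request, so no CUDA work happens in the parent process before the worker starts.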

space: route GPU calls through Gradio (gr.api + gradio_client) so the ZeroGPU per-request CUDA hooks fire

4468bdc

IMJONEZZ committed 12 days ago

space: duration=120 for cold start + /api/status fast-path (causal_conv1d) probe

3af751e

IMJONEZZ committed 12 days ago

space: blocking GPU generate instead of threaded streamer (hung across ZeroGPU fork); 503 on failure so the game falls back cleanly

255e227

IMJONEZZ committed 12 days ago
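
The "503 on failure, game falls back" behaviour from 255e227 can be sketched as two small functions. All names here (`warden_endpoint`, `warden_line`, the fallback text) are hypothetical, and `generate` stands in for whatever blocking generation call the Space uses.

```python
def warden_endpoint(generate, prompt):
    """Return (status_code, body) for one blocking generation request."""
    try:
        text = generate(prompt)  # blocks until the full reply is ready
    except Exception as exc:
        # Report "service unavailable" so the caller can use its fallback.
        return 503, {"error": str(exc)}
    return 200, {"text": text}


def warden_line(generate, prompt, fallback="The Warden is silent."):
    """Game-side helper: use the model's reply, or a canned line on 503."""
    status, body = warden_endpoint(generate, prompt)
    if status == 200:
        return body["text"]
    return fallback
```

Returning an explicit status instead of hanging lets the game decide immediately whether to show a generated line or a canned one.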

play: reserve a bottom row + taller frame so the board prompt isn't clipped

a0de8fb

IMJONEZZ committed 12 days ago

play: autosize the terminal so the full board always fits (cards were clipping)

6f42620

IMJONEZZ committed 12 days ago

space: gradio 5.49 — transformers<5 needs hub<1.0, which gradio 6 forbids

1330ecb

IMJONEZZ committed 12 days ago

space: serve via Blocks.launch (ZeroGPU handshake) + pin transformers<5 for the bnb4 checkpoint format

8051e61

IMJONEZZ committed 12 days ago

space: load the released nf4 Warden from the hub (1GB Space LFS cap rules out in-repo weights)

caef9bc

IMJONEZZ committed 12 days ago

space: load the Warden shipped in the repo (no boot download)

a6fd68b

IMJONEZZ committed 12 days ago

space: surface mamba install diagnostics in /api/status; bnb4 prequant script

b5186d6

IMJONEZZ committed 12 days ago

space: WebGL renderer + customGlyphs — card art was warping in the browser

34b513d

IMJONEZZ committed 13 days ago

space: bootstrap mamba-ssm/causal-conv1d at runtime for Nemotron-H

52d29cc

IMJONEZZ committed 13 days ago

space: disable gradio SSR on mount — the Node shell was stealing port 7860

40ab456

IMJONEZZ committed 13 days ago

space: /api/status — expose Warden load state for ops

d49d2f3

IMJONEZZ committed 13 days ago

space: ZeroGPU port — Gradio SDK runtime, on-Space Warden inference

d94c85e

IMJONEZZ committed 13 days ago

SCRYPT: initial commit — game, sandbox, Warden, Space web layer

9fca766

IMJONEZZ committed 13 days ago
