MelodyDeterminism Patch (GPU + determinism + benchmark)

What's included

  • NumPy/CuPy backend with automatic selection and a deterministic RNG (Philox/PCG64).
  • Deterministic reductions (TreeFixed, KahanFixed), a robust softmax, and canonical sampling.
  • Tolerance metadata (max_abs_err / max_rel_err).
  • Overhead benchmark across batch size n and vocabulary size v on CPU/GPU.
  • Edge-case tests (extreme logits, masks, dtypes, invariants).
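
As a rough illustration of what a fixed-order reduction means, the sketch below shows compensated (Kahan) summation and a pairwise tree sum whose shape depends only on the input length. It is pure Python for readability; `kahan_sum` and `tree_sum_fixed` are hypothetical names and are not the `sum_kahan_fixed` / `reduce_tree_fixed` implementations shipped in core/.

```python
def kahan_sum(values):
    """Compensated (Kahan) summation in a fixed left-to-right order."""
    total = 0.0
    comp = 0.0  # running compensation for low-order bits lost so far
    for v in values:
        y = v - comp            # apply the correction from the previous step
        t = total + y           # low-order bits of y may be lost here
        comp = (t - total) - y  # recover what was lost
        total = t
    return total


def tree_sum_fixed(values):
    """Pairwise sum; the tree shape depends only on len(values)."""
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        # Add neighbours (0,1), (2,3), ... in a fixed order.
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # odd element is carried up unchanged
        vals = nxt
    return vals[0]
```

Because neither function lets the order of additions depend on thread scheduling or chunking, the same input should give the same bits on every run; a parallel GPU version would need to preserve that fixed order to keep the property.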

Quick setup

  1. Copy core/ and tests/ into your Space/repo.
  2. Merge requirements.txt (add CuPy if you use a GPU in the Space).
  3. In app.py, import the functions and use them in your Gradio UI.
  4. Space hardware: select a GPU (e.g. T4/A10) on Hugging Face.
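
For step 2, it may help to see how an automatic NumPy/CuPy fallback can work when CuPy is optional. The helper below is a minimal sketch under that assumption; `pick_backend` is a hypothetical name, and the real logic in core/backend.py may differ (for example, it may also probe for a usable CUDA device, since a successful import alone may not guarantee one).

```python
import importlib


def pick_backend(candidates=("cupy", "numpy")):
    """Return (name, module) for the first candidate that can be imported."""
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue  # not installed or failed to load; try the next one
    raise ImportError(f"none of {candidates!r} could be imported")
```

A caller would then write `name, xp = pick_backend()` and use `xp` wherever it would otherwise use `numpy`.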

Gradio (snippet)

import gradio as gr
from core.backend import set_seed, backend_name
from core.bench import bench_suite
from core.softmax import softmax_canonical
from core.sampling import sample_canonical
from core.metrics import tol_stats
from core.deterministic import reduce_tree_fixed, sum_kahan_fixed

def run_suite(seed, n, v, dtype):
    from core.backend import xp  # array module exposed by the backend
    seed = int(seed)  # gr.Number may deliver a float
    set_seed(seed)
    # Synthetic input: an (n, v) matrix of standard-normal logits
    x = xp.random.standard_normal((int(n), int(v))).astype(getattr(xp, dtype))
    # A single example row for the tolerance metadata
    p = softmax_canonical(x[0])
    idx = sample_canonical(p, seed=seed, token_idx=0)
    stats = {"backend": backend_name(), "token0": int(idx)}
    return stats

with gr.Blocks(theme=gr.themes.Soft()) as demo:
    with gr.Tab("Deterministic"):
        seed = gr.Number(42, precision=0, label="Seed")
        n = gr.Slider(1, 64, 8, step=1, label="Batch n")
        v = gr.Dropdown([1024, 8192, 32768], value=8192, label="Vocab v")
        dtype = gr.Radio(["float32", "float64"], value="float32", label="dtype")
        run = gr.Button("Run suite")
        out = gr.JSON(label="Output + metadata")
        run.click(run_suite, [seed, n, v, dtype], [out])
    with gr.Tab("Benchmark"):
        runb = gr.Button("Benchmark")
        table = gr.Dataframe(headers=["n","v","t_std_ms","t_can_ms","overhead_pct"], label="Latencies (ms)")
        def _bench():
            return bench_suite()
        runb.click(_bench, outputs=[table])
# demo.queue(concurrency_count=2, max_size=8).launch()  # Gradio 3.x; on Gradio 4+ use default_concurrency_limit=2
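
The snippet imports `tol_stats` without calling it, and its signature is not shown here. To make the max_abs_err / max_rel_err metadata concrete, here is a hypothetical stand-in (`tol_stats_sketch`, not the function in core/metrics.py) that compares a reference vector with a candidate. The `eps` floor on the denominator is an assumption, added to avoid division by zero.

```python
def tol_stats_sketch(reference, candidate, eps=1e-12):
    """Return max absolute and max relative error between two sequences."""
    if len(reference) != len(candidate):
        raise ValueError("reference and candidate must have the same length")
    max_abs = 0.0
    max_rel = 0.0
    for r, c in zip(reference, candidate):
        abs_err = abs(r - c)
        max_abs = max(max_abs, abs_err)
        # Relative error is measured against the reference magnitude.
        max_rel = max(max_rel, abs_err / max(abs(r), eps))
    return {"max_abs_err": max_abs, "max_rel_err": max_rel}
```

A dictionary of this shape could be merged into `stats` so that the JSON panel shows the tolerances next to the sampled token.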

Determinism notes

  • RNG: Philox (GPU) / PCG64 (CPU), with the uniform draw u mapped to a token index via searchsorted(CDF, side='left').
  • BLAS: for the tests we force OMP/MKL threads = 1 to reduce run-to-run variability.
  • Dtypes: prefer float32; softmax and reductions accumulate in float64.
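
The dtype note can be illustrated with a numerically stable softmax that subtracts the maximum logit and normalises with a high-precision sum. This is a pure-Python sketch (Python floats are already float64), and `softmax_stable` is a hypothetical name rather than the `softmax_canonical` function in core/. It assumes no NaN or +inf logits; -inf is treated as a mask.

```python
import math


def softmax_stable(logits):
    """Softmax with max-subtraction and an accurately rounded normaliser."""
    m = max(logits)
    if m == float("-inf"):
        raise ValueError("all logits are masked (-inf)")
    # x - m <= 0, so exp never overflows; exp(-inf) is 0.0 for masked entries.
    exps = [math.exp(x - m) for x in logits]
    # math.fsum tracks partial sums, so the total does not depend on
    # the order of the terms on typical IEEE-754 builds.
    z = math.fsum(exps)
    return [e / z for e in exps]
```

With float32 inputs in NumPy or CuPy, the analogous step would be to cast to float64 before the exponentiation and the sum, then cast the probabilities back if needed.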

Tie-break policy

searchsorted(..., side='left') returns the first index whose CDF value is ≥ u, so when CDF values tie the lowest token id wins.
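
The effect of side='left' can be seen with the standard library alone, since `bisect.bisect_left` follows the same leftmost-insertion rule. The function below is an illustrative sketch (`sample_index` is a hypothetical name, not `sample_canonical`), and the final clamp is an assumption added to guard against a CDF whose last value rounds slightly below u.

```python
from bisect import bisect_left
from itertools import accumulate


def sample_index(probs, u):
    """Map a uniform draw u in [0, 1) to a token id via the inverse CDF."""
    cdf = list(accumulate(probs))  # running sum in fixed left-to-right order
    idx = bisect_left(cdf, u)      # first index i with cdf[i] >= u
    return min(idx, len(cdf) - 1)  # clamp in case rounding leaves cdf[-1] < u


# probs -> cdf [0.25, 0.5, 0.5, 1.0]; token 2 has zero probability,
# so cdf[1] == cdf[2] and a draw of exactly 0.5 must pick one of them.
probs = [0.25, 0.25, 0.0, 0.5]
tie_choice = sample_index(probs, 0.5)
```

Here `tie_choice` is the lowest id among the tied CDF entries. One caveat of this sketch: if the first token has zero probability and u is exactly 0.0, index 0 is still returned.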

Running the tests

pytest -q
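
One way to get the single-threaded BLAS setting mentioned in the notes is to set the environment variables at the top of a conftest.py, before NumPy is imported. This is a sketch of that idea, and whether the repository's tests already do it this way is an assumption.

```python
import os

# Assumed conftest.py fragment. It should run before NumPy (or any other
# BLAS-backed library) is imported, because thread pools are typically
# sized when the library loads.
for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = "1"
```

The same effect can be obtained from the shell by exporting both variables before invoking pytest.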