metadata
language:
  - en
license: mit
tags:
  - cuda
  - gpu
  - number-theory
  - computational-mathematics
  - continued-fractions
  - zaremba
  - ramsey
  - kronecker-coefficients
  - class-numbers
  - hausdorff-dimension
  - ramanujan-machine
  - erdos-straus
  - prime-convergents
  - flint-hills
  - spectral-methods
  - bigcompute
library_name: other
pipeline_tag: other
datasets:
  - cahlen/zaremba-density
  - cahlen/zaremba-conjecture-data
  - cahlen/class-numbers-real-quadratic
  - cahlen/kronecker-coefficients
  - cahlen/hausdorff-dimension-spectrum
  - cahlen/continued-fraction-spectra
  - cahlen/ramanujan-machine-results

bigcompute.science CUDA Kernels

51 custom CUDA kernels for GPU-accelerated computational mathematics research. These kernels power the experiments at bigcompute.science.

All kernels are standalone: compile with nvcc, run from the command line. No PyTorch dependency.

Hardware

Developed and tested on:

  • 8x NVIDIA B200 (183 GB VRAM each, sm_100)
  • NVIDIA RTX 5090 (32 GB VRAM, sm_120)

Most kernels will run on any CUDA GPU (sm_50+). Compile with your target architecture:

nvcc -O3 -arch=sm_XX -o kernel kernel.cu -lm

Kernels by Experiment

Zaremba's Conjecture (26 kernels)

Density enumeration (zaremba-density/) - complete CF tree enumeration with bitset marking:

  • zaremba_density_gpu.cu - production kernel, 65+ runs to 10^12
  • zaremba_density_v2.cu - alternative implementation
  • zaremba_density_gpu_worksteal_v2.cu - work-stealing variant for load balancing
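
The tree enumeration described above can be illustrated with a small CPU reference. This is a minimal sketch, not the CUDA kernel: it assumes every finite digit string over the allowed partial quotients is enumerated (including strings ending in 1), uses the standard recurrence q_next = a * q + q_prev for continued fraction denominators, and stands in a `bytearray` for the bitset. The names `mark_denominators` and `density` are hypothetical.

```python
def mark_denominators(digits, limit):
    """Mark every denominator q <= limit reachable with partial quotients in `digits`."""
    seen = bytearray(limit + 1)   # one byte per integer; a real bitset packs 8 per byte
    seen[1] = 1                   # the empty expansion has denominator 1
    stack = [(0, 1)]              # (q_{k-1}, q_k) pairs still to expand
    while stack:
        q_prev, q = stack.pop()
        for a in digits:
            q_next = a * q + q_prev      # denominator recurrence for convergents
            if q_next <= limit:          # prune: denominators only grow deeper in the tree
                seen[q_next] = 1
                stack.append((q, q_next))
    return seen


def density(digits, limit):
    """Fraction of integers in 1..limit that occur as a denominator."""
    seen = mark_denominators(digits, limit)
    return sum(seen[1:]) / limit
```

For example, `density(range(1, 6), 200)` reports the fraction of denominators up to 200 covered by digits 1 to 5. An explicit stack is used instead of recursion so that deep branches (long runs of the digit 1) do not hit Python's recursion limit.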

Transfer operator (zaremba-transfer-operator/) - Chebyshev collocation spectral method:

  • transfer_operator.cu - spectral gap computation for Ruelle operator

Effective bound (zaremba-effective-bound/) - Bourgain-Kontorovich proof framework:

  • spectral_gaps_fast.cu - bulk spectral gap verification
  • spectral_gaps_primes.cu - prime-indexed gaps
  • certify_rho_cuda.cu - arb ball arithmetic certification
  • compute_Q0.cu / Q0_frolenkov_kan.cu - effective constant extraction
  • count_representations.cu - CF representation counting
  • dolgopyat_exact.cu / dolgopyat_profile.cu - Dolgopyat estimate profiling
  • exponential_sum.cu - exponential sum bounds
  • extract_eigenfunction.cu - transfer operator eigenfunction extraction
  • flat_spectral_gap.cu - uniform spectral gap verification
  • matrix_enum.cu / matrix_enum_multipass.cu - SL(2,Z) matrix enumeration
  • minor_arc_primes.cu / minor_arc_profile.cu - minor arc estimates
  • verify_all_gaps_fp64.cu / verify_gaps_interval.cu / verify_gaps_v2.cu - gap verification suite
  • compute_c1_rigorous.cu - rigorous constant computation

Cayley diameters (zaremba-cayley-diameter/) - BFS on Cayley graphs of SL(2,Z/pZ):

  • cayley_diameter.cu / cayley_gpu.cu - full BFS diameter computation
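
As a rough illustration of a BFS diameter computation on a Cayley graph of SL(2,Z/pZ), here is a small CPU sketch. The generator set is an assumption: it uses the two elementary matrices T = [[1,1],[0,1]] and L = [[1,0],[1,1]], which may differ from the generators the kernels use. Matrices are stored as 4-tuples (a, b, c, d), and the function names are hypothetical.

```python
from collections import deque

# Assumed generators: the elementary matrices T and L, stored as (a, b, c, d).
ELEMENTARY_GENERATORS = ((1, 1, 0, 1), (1, 0, 1, 1))


def mat_mul(x, y, p):
    """Multiply two 2x2 matrices modulo p."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % p, (a * f + b * h) % p,
            (c * e + d * g) % p, (c * f + d * h) % p)


def cayley_diameter(p, generators=ELEMENTARY_GENERATORS):
    """Return (group order reached, largest BFS distance from the identity)."""
    identity = (1, 0, 0, 1)
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        m = queue.popleft()
        for g in generators:
            nxt = mat_mul(m, g, p)
            if nxt not in dist:          # first visit gives the shortest word length
                dist[nxt] = dist[m] + 1
                queue.append(nxt)
    return len(dist), max(dist.values())
```

Because a Cayley graph is vertex-transitive, the largest distance from the identity equals the diameter, so a single BFS suffices. The first return value should equal p(p^2 - 1), the order of SL(2,Z/pZ), whenever the generators generate the whole group.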

Transitivity (zaremba-transitivity/) - algebraic verification:

  • check_transitivity.cu - Dickson classification check

Ramsey R(5,5) (9 kernels)

ramsey-r55/ - search for 2-colorings of complete graphs with no monochromatic K5:

  • ramsey_gpu.cu - base simulated annealing kernel
  • ramsey_incremental.cu / ramsey_incremental_v2.cu - incremental K5 counter
  • ramsey_extend.cu / ramsey_extend_all.cu - exhaustive extension checking (4.4T extensions of K42 to K43)
  • ramsey_fullcount.cu - complete clique enumeration
  • ramsey_search.cu / ramsey_global.cu / ramsey_verified.cu - search variants
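
The quantity these searches drive to zero is the number of monochromatic K5 subgraphs. The brute-force sketch below shows that count on the CPU, plus the restricted count through a single edge, which is the only part that can change when that edge is recolored. It assumes the coloring is a symmetric matrix of 0/1 entries; the function names are hypothetical and the kernels' actual data layout is not shown here.

```python
from itertools import combinations


def constant_coloring(n, c=0):
    """Coloring of K_n with every edge given color c (diagonal is ignored)."""
    return [[c] * n for _ in range(n)]


def is_monochromatic(color, clique):
    """True if all edges among the given vertices share one color."""
    return len({color[u][v] for u, v in combinations(clique, 2)}) == 1


def monochromatic_k5_count(color):
    """Count monochromatic 5-cliques by checking every 5-subset of vertices."""
    n = len(color)
    return sum(is_monochromatic(color, c) for c in combinations(range(n), 5))


def k5_count_through_edge(color, u, v):
    """Count monochromatic 5-cliques containing the edge (u, v)."""
    others = [w for w in range(len(color)) if w not in (u, v)]
    return sum(is_monochromatic(color, (u, v) + rest)
               for rest in combinations(others, 3))
```

An incremental counter can recompute only `k5_count_through_edge` before and after a recoloring, which touches C(n-2, 3) cliques instead of all C(n, 5).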

Class Numbers (4 kernels)

class-numbers/ - class numbers of real quadratic fields via BSGS:

  • class_numbers_v2.cu - production kernel (10^9 to 10^12 range)
  • class_number_rqf.cu - real quadratic field specialization
  • class_number_fast.cu - optimized inner loop
  • sieve_gpu.cu - GPU prime sieve

Kronecker Coefficients (3 kernels)

kronecker-coefficients/ - character tables and Kronecker triple computation:

  • kronecker_gpu.cu - full character table (S20: 3.7s, S30: 7.4 min, S40: 9.5 hr)
  • kronecker_fast.cu - optimized triple-sum
  • kronecker_compute.cu - targeted triple computation
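
For readers unfamiliar with the triple-sum, a Kronecker coefficient can be computed from a character table as g(λ, μ, ν) = (1/n!) Σ_C |C| χ^λ(C) χ^μ(C) χ^ν(C), summing over conjugacy classes C. The sketch below applies this to S3 with its character table hardcoded, purely as an illustration; the kernels generate much larger tables, and the names used here are hypothetical.

```python
from fractions import Fraction

# Conjugacy classes of S3: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]

# Irreducible characters of S3, indexed by partition of 3.
CHARACTERS = {
    (3,): [1, 1, 1],         # trivial
    (2, 1): [2, 0, -1],      # standard
    (1, 1, 1): [1, -1, 1],   # sign
}


def kronecker(lam, mu, nu):
    """Kronecker coefficient g(lam, mu, nu) for S3 via the class-weighted triple sum."""
    order = sum(CLASS_SIZES)  # |S3| = 6
    total = sum(size * CHARACTERS[lam][c] * CHARACTERS[mu][c] * CHARACTERS[nu][c]
                for c, size in enumerate(CLASS_SIZES))
    g = Fraction(total, order)
    if g.denominator != 1:
        raise ValueError("character table is inconsistent: non-integer coefficient")
    return int(g)
```

The integrality check is a cheap sanity test on the table: every Kronecker coefficient is a non-negative integer, so a fractional result signals a wrong character value or class size.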

Ramanujan Machine (2 kernels)

ramanujan-machine/ - automated discovery of continued fraction formulas:

  • ramanujan_gpu.cu - v1 kernel (equal-degree polynomials, exhausted)
  • ramanujan_v2.cu - v2 kernel (asymmetric-degree, where new discoveries live)
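
The basic operation in this kind of search is evaluating a polynomial continued fraction a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)) and comparing the value to a target constant. The sketch below does this bottom-up in double precision. The function name `evaluate_pcf` and the truncation depth are illustrative choices, and the example formula for e is a known one from the Ramanujan Machine project, not a result of these kernels.

```python
import math


def evaluate_pcf(a, b, depth):
    """Evaluate a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)) truncated at `depth` levels."""
    tail = float(a(depth))
    for n in range(depth, 0, -1):
        tail = a(n - 1) + b(n) / tail   # fold one level of the fraction
    return tail


def matches(value, target, tol=1e-12):
    """Crude numerical match; a real search would confirm hits at higher precision."""
    return abs(value - target) < tol


# Known example: e = 3 - 1/(4 - 2/(5 - 3/(6 - ...))), i.e. a(n) = n + 3, b(n) = -n.
e_candidate = evaluate_pcf(lambda n: n + 3, lambda n: -n, depth=40)
```

Here a(n) and b(n) have equal degree; an asymmetric-degree search would simply pass polynomials of different degrees for `a` and `b`.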

Prime Convergents (2 kernels)

prime-convergents/ - prime statistics of CF convergents:

  • prime_convergents.cu - v1 (uint64, depth ~38)
  • prime_convergents_v2.cu - v2 (uint128, depth ~75, 128-bit Miller-Rabin)
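
The two ingredients named above, convergent recurrences and Miller-Rabin testing, can be sketched on the CPU as follows. This is an illustration under stated assumptions rather than the kernels' code: Python integers are arbitrary precision, so the uint64/uint128 overflow concerns do not arise, and the fixed witness set (the first twelve primes) is a standard choice that is deterministic for n below roughly 3.3 * 10^24. The function names are hypothetical.

```python
WITNESSES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def convergents(partial_quotients):
    """Return the convergents (p_k, q_k) of [a0; a1, a2, ...]."""
    p_prev, p = 1, partial_quotients[0]
    q_prev, q = 0, 1
    result = [(p, q)]
    for a in partial_quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        result.append((p, q))
    return result


def is_prime(n):
    """Miller-Rabin with fixed witnesses; deterministic for n < ~3.3e24."""
    if n < 2:
        return False
    for w in WITNESSES:
        if n % w == 0:
            return n == w
    d, r = n - 1, 0
    while d % 2 == 0:        # write n - 1 = d * 2^r with d odd
        d //= 2
        r += 1
    for a in WITNESSES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False     # a is a witness that n is composite
    return True
```

For example, `convergents([3, 7, 15, 1])` yields the familiar approximations to pi ending in 355/113, and `is_prime` can then be applied to each numerator and denominator to gather statistics.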

Erdos-Straus Conjecture (1 kernel)

erdos-straus/ - solution counting for 4/p = 1/x + 1/y + 1/z:

  • erdos_straus.cu - per-prime f(p) enumeration, tested to 10^9
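
A small CPU reference for the solution count may help make f(p) concrete. The sketch below assumes f(n) counts unordered solutions with x <= y <= z, which may differ from the convention the kernel uses. It relies only on the elementary bounds n/4 < x <= 3n/4 and, for the remainder r = 4/n - 1/x, 1/r < y <= 2/r. The function name is hypothetical.

```python
def erdos_straus_solutions(n):
    """All (x, y, z) with x <= y <= z and 4/n = 1/x + 1/y + 1/z."""
    solutions = []
    for x in range(n // 4 + 1, (3 * n) // 4 + 1):
        num, den = 4 * x - n, n * x          # remainder 4/n - 1/x = num/den > 0
        y_min = max(x, den // num + 1)       # need 1/y < num/den strictly
        y_max = (2 * den) // num             # need 1/y >= half the remainder
        for y in range(y_min, y_max + 1):
            z_num, z_den = den * y, num * y - den
            if z_num % z_den == 0:           # z must be a positive integer
                solutions.append((x, y, z_num // z_den))
    return solutions
```

With this convention, `len(erdos_straus_solutions(5))` is 2, from 4/5 = 1/2 + 1/4 + 1/20 = 1/2 + 1/5 + 1/10. All arithmetic is exact integer arithmetic, so no rounding can hide or invent a solution.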

Spectral Computations (4 kernels)

hausdorff-spectrum/ - Hausdorff dimension via transfer operator + Chebyshev collocation:

  • hausdorff_spectrum.cu - all 2^20 - 1 subsets of {1,...,20}
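
To make the method concrete for a single digit set, here is a minimal CPU sketch. It assumes the standard transfer operator (L_s f)(x) = Σ_{a in A} (a + x)^(-2s) f(1/(a + x)) on [0, 1], discretizes it by collocation at Chebyshev nodes, finds the leading eigenvalue by power iteration, and bisects on s until that eigenvalue equals 1. The node count, iteration counts, and function names are illustrative assumptions, not the kernel's parameters.

```python
import math


def chebyshev_nodes(n):
    """Chebyshev points of the first kind, mapped from [-1, 1] to [0, 1]."""
    return [0.5 * (1.0 + math.cos(math.pi * (2 * k + 1) / (2 * n))) for k in range(n)]


def lagrange_basis(nodes, j, y):
    """Value at y of the Lagrange polynomial that is 1 at nodes[j], 0 at other nodes."""
    value = 1.0
    for k, xk in enumerate(nodes):
        if k != j:
            value *= (y - xk) / (nodes[j] - xk)
    return value


def leading_eigenvalue(s, digits, n=16, iterations=100):
    """Leading eigenvalue of the n x n collocation matrix of L_s."""
    nodes = chebyshev_nodes(n)
    matrix = [[0.0] * n for _ in range(n)]
    for i, x in enumerate(nodes):
        for a in digits:
            y = 1.0 / (a + x)                 # inverse branch of the Gauss map
            weight = y ** (2.0 * s)
            for j in range(n):
                matrix[i][j] += weight * lagrange_basis(nodes, j, y)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):               # power iteration
        w = [sum(row[j] * v[j] for j in range(n)) for row in matrix]
        lam = max(abs(c) for c in w)
        v = [c / lam for c in w]
    return lam


def hausdorff_dimension(digits, steps=40):
    """Bisect for the s in (0, 1) at which the leading eigenvalue equals 1."""
    lo, hi = 0.0, 1.0
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if leading_eigenvalue(mid, digits) > 1.0:
            lo = mid                           # eigenvalue decreases as s grows
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For the digit set {1, 2}, `hausdorff_dimension([1, 2])` should land near the known value 0.5312805. The sketch expects at least two digits, since a single digit gives a one-point set of dimension 0.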

lyapunov-spectrum/ - Lyapunov exponents of CF digit sets:

  • lyapunov_spectrum.cu - full spectrum computation

minkowski-spectrum/ - Minkowski question-mark function:

  • minkowski_spectrum.cu - singularity spectrum

flint-hills/ - Flint Hills series partial sums:

  • flint_hills.cu - high-precision partial sum to 10B terms
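
For orientation, the Flint Hills series is the sum over n >= 1 of 1/(n^3 sin^2 n). The sketch below computes a partial sum in ordinary double precision with `math.fsum`, which is enough to see the series' characteristic jumps but is not the high-precision approach the kernel takes. The function name is hypothetical.

```python
import math


def flint_hills_partial_sum(n_terms):
    """Partial sum of 1 / (n^3 * sin(n)^2) for n = 1..n_terms, in double precision."""
    return math.fsum(1.0 / (n ** 3 * math.sin(n) ** 2)
                     for n in range(1, n_terms + 1))
```

The partial sums jump whenever n is very close to a multiple of pi. The largest early jump is at n = 355, because 355/113 is an unusually good rational approximation of pi, which makes sin(355) tiny. This sensitivity to near-multiples of pi is why long runs need extended precision.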

Results

All computation results are open; the corresponding datasets are listed in the metadata above.

License

MIT

Citation

@misc{humphreys2026bigcompute,
  author = {Humphreys, Cahlen},
  title = {bigcompute.science: GPU-Accelerated Computational Mathematics},
  year = {2026},
  url = {https://bigcompute.science}
}

Human-AI collaborative research (Cahlen Humphreys + Claude). All code and data open for verification.