metadata

language:
  - en
license: mit
tags:
  - cuda
  - gpu
  - number-theory
  - computational-mathematics
  - continued-fractions
  - zaremba
  - ramsey
  - kronecker-coefficients
  - class-numbers
  - hausdorff-dimension
  - ramanujan-machine
  - erdos-straus
  - prime-convergents
  - flint-hills
  - spectral-methods
  - bigcompute
library_name: other
pipeline_tag: other
datasets:
  - cahlen/zaremba-density
  - cahlen/zaremba-conjecture-data
  - cahlen/class-numbers-real-quadratic
  - cahlen/kronecker-coefficients
  - cahlen/hausdorff-dimension-spectrum
  - cahlen/continued-fraction-spectra
  - cahlen/ramanujan-machine-results

bigcompute.science CUDA Kernels

51 custom CUDA kernels for GPU-accelerated computational mathematics research. These kernels power the experiments at bigcompute.science.

All kernels are standalone — compile with nvcc, run from the command line. No PyTorch dependency.

Hardware

Developed and tested on:

8x NVIDIA B200 (183 GB VRAM each, sm_100)
NVIDIA RTX 5090 (32 GB VRAM, sm_120)

Most kernels will run on any CUDA GPU (sm_50+). Compile with your target architecture:

nvcc -O3 -arch=sm_XX -o kernel kernel.cu -lm

Kernels by Experiment

Zaremba's Conjecture (26 kernels)

Density enumeration (zaremba-density/) — complete CF tree enumeration with bitset marking:

zaremba_density_gpu.cu — production kernel, 65+ runs to 10^12
zaremba_density_v2.cu — alternative implementation
zaremba_density_gpu_worksteal_v2.cu — work-stealing variant for load balancing
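
The tree enumeration with bitset marking described above can be illustrated on the CPU. The following is a minimal host-side C++ sketch, not the code of any kernel in this directory: the names `mark_denominators` and `count_marked` are hypothetical, and it assumes the tree nodes are pairs of consecutive continued-fraction denominators (q_{k-1}, q_k) with children given by q_{k+1} = a*q_k + q_{k-1} for each allowed digit a.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Marks every d <= limit that is the denominator of some continued fraction
// [0; a_1, ..., a_k] with all digits a_i in {1, ..., max_digit}.
// Hypothetical CPU reference for small limits (no overflow checks).
std::vector<bool> mark_denominators(std::uint64_t limit, unsigned max_digit) {
    std::vector<bool> seen(limit + 1, false);  // bitset indexed by denominator
    if (limit == 0) return seen;
    // Each node is (q_{k-1}, q_k); the root (0, 1) is the empty expansion.
    std::vector<std::pair<std::uint64_t, std::uint64_t>> stack;
    stack.push_back({0, 1});
    while (!stack.empty()) {
        const std::uint64_t prev = stack.back().first;
        const std::uint64_t cur = stack.back().second;
        stack.pop_back();
        seen[cur] = true;
        for (unsigned a = 1; a <= max_digit; ++a) {
            const std::uint64_t next = a * cur + prev;  // q_{k+1} = a*q_k + q_{k-1}
            if (next > limit) break;  // larger digits only give larger denominators
            stack.push_back({cur, next});
        }
    }
    return seen;
}

// Number of marked denominators in 1..limit (the numerator of the density).
std::uint64_t count_marked(const std::vector<bool>& seen) {
    std::uint64_t n = 0;
    for (std::size_t d = 1; d < seen.size(); ++d) n += seen[d] ? 1 : 0;
    return n;
}
```

Calling `count_marked(mark_denominators(N, 5))` and dividing by `N` gives the fraction of denominators up to `N` that are covered with digits at most 5. A GPU version would presumably split the tree into independent subtrees and mark a shared bitset with atomic operations, which this sketch does not attempt.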

Transfer operator (zaremba-transfer-operator/) — Chebyshev collocation spectral method:

transfer_operator.cu — spectral gap computation for Ruelle operator
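
To make the Chebyshev collocation idea concrete, here is a minimal host-side C++ sketch for the simplest transfer operator attached to a digit set A, namely (L_s f)(x) = sum over a in A of (a + x)^(-2s) f(1/(a + x)) on [0, 1]. It is an illustration under stated assumptions, not the method of `transfer_operator.cu`: the function names, the node count of 24, the use of plain power iteration, and the bisection on s are all choices made for this example.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Leading eigenvalue of the n x n collocation matrix of L_s for the given digits.
double leading_eigenvalue(double s, const std::vector<int>& digits, int n = 24) {
    const double pi = std::acos(-1.0);
    std::vector<double> x(n), w(n);
    for (int k = 0; k < n; ++k) {
        const double theta = (2 * k + 1) * pi / (2.0 * n);
        x[k] = 0.5 * (1.0 + std::cos(theta));               // Chebyshev node mapped to [0, 1]
        w[k] = ((k % 2) ? -1.0 : 1.0) * std::sin(theta);    // barycentric weight
    }
    // M[j][k] = sum_a (a + x_j)^(-2s) * l_k(1 / (a + x_j)), l_k = Lagrange basis.
    std::vector<std::vector<double> > M(n, std::vector<double>(n, 0.0));
    for (int j = 0; j < n; ++j) {
        for (int a : digits) {
            const double y = 1.0 / (a + x[j]);
            const double weight = std::pow(a + x[j], -2.0 * s);
            int hit = -1;          // index of a node that y coincides with, if any
            double denom = 0.0;
            for (int k = 0; k < n; ++k) {
                if (y == x[k]) { hit = k; break; }
                denom += w[k] / (y - x[k]);
            }
            for (int k = 0; k < n; ++k) {
                const double basis = (hit >= 0) ? (k == hit ? 1.0 : 0.0)
                                                : (w[k] / (y - x[k])) / denom;
                M[j][k] += weight * basis;
            }
        }
    }
    // Power iteration with max-norm scaling; the norm converges to the eigenvalue.
    std::vector<double> v(n, 1.0), u(n);
    double lambda = 0.0;
    for (int it = 0; it < 500; ++it) {
        double norm = 0.0;
        for (int j = 0; j < n; ++j) {
            u[j] = 0.0;
            for (int k = 0; k < n; ++k) u[j] += M[j][k] * v[k];
            norm = std::max(norm, std::fabs(u[j]));
        }
        for (int j = 0; j < n; ++j) v[j] = u[j] / norm;
        lambda = norm;
    }
    return lambda;
}

// Bisection for the s in [0, 1] at which the leading eigenvalue equals 1.
double hausdorff_dimension(const std::vector<int>& digits) {
    double lo = 0.0, hi = 1.0;
    for (int it = 0; it < 60; ++it) {
        const double mid = 0.5 * (lo + hi);
        if (leading_eigenvalue(mid, digits) > 1.0) lo = mid; else hi = mid;
    }
    return 0.5 * (lo + hi);
}
```

For the digit set {1, 2} this should reproduce the known Hausdorff dimension of about 0.5312805. A spectral gap computation would additionally need the second eigenvalue, which power iteration alone does not provide.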

Effective bound (zaremba-effective-bound/) — Bourgain-Kontorovich proof framework:

spectral_gaps_fast.cu — bulk spectral gap verification
spectral_gaps_primes.cu — prime-indexed gaps
certify_rho_cuda.cu — arb ball arithmetic certification
compute_Q0.cu / Q0_frolenkov_kan.cu — effective constant extraction
count_representations.cu — CF representation counting
dolgopyat_exact.cu / dolgopyat_profile.cu — Dolgopyat estimate profiling
exponential_sum.cu — exponential sum bounds
extract_eigenfunction.cu — transfer operator eigenfunction extraction
flat_spectral_gap.cu — uniform spectral gap verification
matrix_enum.cu / matrix_enum_multipass.cu — SL(2,Z) matrix enumeration
minor_arc_primes.cu / minor_arc_profile.cu — minor arc estimates
verify_all_gaps_fp64.cu / verify_gaps_interval.cu / verify_gaps_v2.cu — gap verification suite
compute_c1_rigorous.cu — rigorous constant computation

Cayley diameters (zaremba-cayley-diameter/) — BFS on Cayley graphs of SL(2,Z/pZ):

cayley_diameter.cu / cayley_gpu.cu — full BFS diameter computation
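
As a small illustration of BFS on a Cayley graph of SL(2,Z/pZ), the host-side C++ sketch below explores the group from the identity and reports how many elements were reached and the largest BFS depth. The generating set used here (the two elementary matrices [[1,1],[0,1]] and [[1,0],[1,1]] with their inverses) is an assumption for the example, as are the names `CayleyResult` and `cayley_bfs`; the kernels may use different generators. The dense distance table has p^4 entries, so this is only meant for small p.

```cpp
#include <algorithm>
#include <cstdint>
#include <queue>
#include <vector>

struct CayleyResult {
    std::uint64_t reached;  // number of group elements visited
    int diameter;           // largest BFS distance from the identity
};

// BFS over SL(2, Z/pZ); a matrix (a b; c d) is encoded as ((a*p + b)*p + c)*p + d.
CayleyResult cayley_bfs(std::uint64_t p) {
    std::vector<int> dist(p * p * p * p, -1);
    auto encode = [p](std::uint64_t a, std::uint64_t b, std::uint64_t c, std::uint64_t d) {
        return ((a * p + b) * p + c) * p + d;
    };
    std::queue<std::uint64_t> frontier;
    const std::uint64_t identity = encode(1 % p, 0, 0, 1 % p);
    dist[identity] = 0;
    frontier.push(identity);
    CayleyResult result = {0, 0};
    while (!frontier.empty()) {
        const std::uint64_t code = frontier.front();
        frontier.pop();
        ++result.reached;
        result.diameter = std::max(result.diameter, dist[code]);
        const std::uint64_t d = code % p, c = (code / p) % p;
        const std::uint64_t b = (code / (p * p)) % p, a = code / (p * p * p);
        // Right multiplication by each generator and its inverse.
        const std::uint64_t next[4] = {
            encode(a, (a + b) % p, c, (c + d) % p),          // * [[1,1],[0,1]]
            encode(a, (b + p - a) % p, c, (d + p - c) % p),  // * [[1,-1],[0,1]]
            encode((a + b) % p, b, (c + d) % p, d),          // * [[1,0],[1,1]]
            encode((a + p - b) % p, b, (c + p - d) % p, d)   // * [[1,0],[-1,1]]
        };
        for (std::uint64_t n : next) {
            if (dist[n] < 0) {
                dist[n] = dist[code] + 1;
                frontier.push(n);
            }
        }
    }
    return result;
}
```

Because the graph is vertex-transitive, the eccentricity of the identity equals the diameter. For a prime p the `reached` count should equal the group order p(p^2 - 1), which is a convenient sanity check.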

Transitivity (zaremba-transitivity/) — algebraic verification:

check_transitivity.cu — Dickson classification check

Ramsey R(5,5) (9 kernels)

ramsey-r55/ — search for 2-colorings of complete graphs with no monochromatic K5:

ramsey_gpu.cu — base simulated annealing kernel
ramsey_incremental.cu / ramsey_incremental_v2.cu — incremental K5 counter
ramsey_extend.cu / ramsey_extend_all.cu — exhaustive extension checking (4.4T extensions of K42 to K43)
ramsey_fullcount.cu — complete clique enumeration
ramsey_search.cu / ramsey_global.cu / ramsey_verified.cu — search variants
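
The quantity all of these searches try to drive to zero is the number of monochromatic K5 in a 2-coloring. The host-side C++ sketch below counts it with neighbour bitmasks for graphs on at most 64 vertices. It is a reference illustration only: `count_k5`, `count_mono_k5`, and `circulant_graph` are hypothetical names, and the kernels' incremental counters are not reproduced here.

```cpp
#include <cstdint>
#include <vector>

// Number of set bits (portable replacement for a popcount intrinsic).
static int popcount64(std::uint64_t x) {
    int c = 0;
    while (x) { x &= x - 1; ++c; }
    return c;
}

// Counts 5-cliques; adj[v] is the neighbour bitmask of vertex v (n <= 64).
std::uint64_t count_k5(const std::vector<std::uint64_t>& adj) {
    const int n = static_cast<int>(adj.size());
    std::uint64_t total = 0;
    for (int a = 0; a < n - 4; ++a) {
        for (int b = a + 1; b < n - 3; ++b) {
            if (!((adj[a] >> b) & 1)) continue;
            const std::uint64_t ab = adj[a] & adj[b];      // common neighbours of a, b
            for (int c = b + 1; c < n - 2; ++c) {
                if (!((ab >> c) & 1)) continue;
                const std::uint64_t abc = ab & adj[c];
                for (int d = c + 1; d < n - 1; ++d) {
                    if (!((abc >> d) & 1)) continue;
                    // Every common neighbour with index above d completes a K5.
                    total += popcount64((abc & adj[d]) >> (d + 1));
                }
            }
        }
    }
    return total;
}

// Monochromatic K5 count: red edges are given, blue is the complement.
std::uint64_t count_mono_k5(const std::vector<std::uint64_t>& red) {
    const int n = static_cast<int>(red.size());
    const std::uint64_t all = (n == 64) ? ~0ULL : ((1ULL << n) - 1);
    std::vector<std::uint64_t> blue(n);
    for (int v = 0; v < n; ++v) blue[v] = ~red[v] & all & ~(1ULL << v);
    return count_k5(red) + count_k5(blue);
}

// Circulant graph on n vertices: i ~ j when (j - i) mod n lies in diffs.
// diffs is expected to be symmetric (d in diffs implies n - d in diffs).
std::vector<std::uint64_t> circulant_graph(int n, const std::vector<int>& diffs) {
    std::vector<std::uint64_t> adj(n, 0);
    for (int i = 0; i < n; ++i)
        for (int d : diffs) adj[i] |= 1ULL << ((i + d) % n);
    return adj;
}
```

As a check, the Paley coloring on 17 vertices (differences equal to the quadratic residues mod 17) has no monochromatic K4 and therefore no monochromatic K5, while an all-red K7 contains C(7,5) = 21 of them.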

Class Numbers (4 kernels)

class-numbers/ — class numbers of real quadratic fields via BSGS:

class_numbers_v2.cu — production kernel (10^9 to 10^12 range)
class_number_rqf.cu — real quadratic field specialization
class_number_fast.cu — optimized inner loop
sieve_gpu.cu — GPU prime sieve

Kronecker Coefficients (3 kernels)

kronecker-coefficients/ — character tables and Kronecker triple computation:

kronecker_gpu.cu — full character table (S20: 3.7s, S30: 7.4 min, S40: 9.5 hr)
kronecker_fast.cu — optimized triple-sum
kronecker_compute.cu — targeted triple computation
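
Given a character table, a Kronecker coefficient is a weighted triple sum over conjugacy classes: g(lambda, mu, nu) = (1/n!) * sum over classes C of |C| * chi^lambda(C) * chi^mu(C) * chi^nu(C). The host-side C++ sketch below evaluates this for S3, whose character table is small enough to hard-code. The names are hypothetical and the example says nothing about how the kernels build or store the much larger tables for S20 to S40.

```cpp
// Conjugacy classes of S_3: identity, transpositions, 3-cycles.
static const long kClassSize[3] = {1, 3, 2};

// Irreducible characters of S_3, one row per partition of 3.
static const long kChi[3][3] = {
    {1,  1,  1},   // index 0: partition (3), trivial character
    {2,  0, -1},   // index 1: partition (2,1), standard character
    {1, -1,  1},   // index 2: partition (1,1,1), sign character
};

// Kronecker coefficient g(lam, mu, nu) for S_3 via the class-weighted triple sum.
long kronecker_s3(int lam, int mu, int nu) {
    long sum = 0;
    for (int c = 0; c < 3; ++c)
        sum += kClassSize[c] * kChi[lam][c] * kChi[mu][c] * kChi[nu][c];
    return sum / 6;  // |S_3| = 6; the division is exact for valid characters
}
```

For example, `kronecker_s3(1, 1, 1)` evaluates (8 + 0 - 2) / 6 = 1, so the standard representation appears once in its own tensor square.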

Ramanujan Machine (2 kernels)

ramanujan-machine/ — automated discovery of continued fraction formulas:

ramanujan_gpu.cu — v1 kernel (equal-degree polynomials, exhausted)
ramanujan_v2.cu — v2 kernel (asymmetric-degree, where new discoveries live)

Prime Convergents (2 kernels)

prime-convergents/ — prime statistics of CF convergents:

prime_convergents.cu — v1 (uint64, depth ~38)
prime_convergents_v2.cu — v2 (uint128, depth ~75, 128-bit Miller-Rabin)
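
A minimal host-side C++ sketch of the two ingredients, the convergent recurrence and a Miller-Rabin primality test, is given below for 64-bit values. It is not the kernels' code: the names are hypothetical, the modular multiplication relies on the `unsigned __int128` extension of GCC and Clang, and the 128-bit variant mentioned above would need a wider multiplication that is not shown.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

typedef std::uint64_t u64;
typedef unsigned __int128 u128;  // GCC/Clang extension, used for overflow-free products

static u64 mul_mod(u64 a, u64 b, u64 m) { return (u128)a * b % m; }

static u64 pow_mod(u64 base, u64 exp, u64 m) {
    u64 result = 1 % m;
    base %= m;
    while (exp) {
        if (exp & 1) result = mul_mod(result, base, m);
        base = mul_mod(base, base, m);
        exp >>= 1;
    }
    return result;
}

// Deterministic Miller-Rabin for 64-bit n using the first twelve primes as bases.
bool is_prime_u64(u64 n) {
    if (n < 2) return false;
    static const u64 bases[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37};
    for (u64 p : bases)
        if (n % p == 0) return n == p;
    u64 d = n - 1;
    int r = 0;
    while ((d & 1) == 0) { d >>= 1; ++r; }  // n - 1 = d * 2^r with d odd
    for (u64 a : bases) {
        u64 x = pow_mod(a, d, n);
        if (x == 1 || x == n - 1) continue;
        bool composite = true;
        for (int i = 1; i < r; ++i) {
            x = mul_mod(x, x, n);
            if (x == n - 1) { composite = false; break; }
        }
        if (composite) return false;
    }
    return true;
}

// Convergents p_k/q_k of [a_0; a_1, a_2, ...] via p_k = a_k*p_{k-1} + p_{k-2}
// (same recurrence for q_k). No overflow check, so depth is limited by 64 bits.
std::vector<std::pair<u64, u64> > convergents(const std::vector<u64>& a) {
    std::vector<std::pair<u64, u64> > out;
    u64 p_prev = 0, p = 1, q_prev = 1, q = 0;
    for (u64 digit : a) {
        const u64 p_next = digit * p + p_prev;
        const u64 q_next = digit * q + q_prev;
        p_prev = p; p = p_next;
        q_prev = q; q = q_next;
        out.push_back(std::make_pair(p, q));
    }
    return out;
}
```

For the partial quotients 3, 7, 15, 1 of pi the fourth convergent is 355/113; its denominator 113 is prime and its numerator 355 = 5 * 71 is not.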

Erdos-Straus Conjecture (1 kernel)

erdos-straus/ — solution counting for 4/p = 1/x + 1/y + 1/z:

erdos_straus.cu — per-prime f(p) enumeration, tested to 10^9
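
To make the counting problem concrete, the host-side C++ sketch below counts solutions of 4/n = 1/x + 1/y + 1/z by brute force. It assumes solutions are counted as unordered triples with x <= y <= z; whether the kernel's f(p) uses this convention or counts ordered triples is not stated here, and the name `count_solutions` is hypothetical. The loops are cubic-time in spirit and unguarded against overflow, so this is only suitable for small n.

```cpp
#include <cstdint>

// Number of positive integer triples x <= y <= z with 4/n = 1/x + 1/y + 1/z.
std::uint64_t count_solutions(std::uint64_t n) {
    std::uint64_t count = 0;
    // 1/x < 4/n <= 3/x  gives  n/4 < x <= 3n/4.
    for (std::uint64_t x = n / 4 + 1; 4 * x <= 3 * n; ++x) {
        const std::uint64_t num = 4 * x - n;  // 4/n - 1/x = num / den
        const std::uint64_t den = n * x;
        // 1/y < num/den <= 2/y  gives  den/num < y <= 2*den/num, and y >= x.
        std::uint64_t y = den / num + 1;
        if (y < x) y = x;
        for (; num * y <= 2 * den; ++y) {
            const std::uint64_t rem = num * y - den;  // 1/z = rem / (den * y)
            // z = den*y/rem is >= y automatically because num*y <= 2*den.
            if ((den * y) % rem == 0) ++count;
        }
    }
    return count;
}
```

For n = 3 this finds the three decompositions (1, 4, 12), (1, 6, 6), and (2, 2, 3). The Erdos-Straus conjecture asserts that the count is positive for every n >= 2.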

Spectral Computations (4 kernels)

hausdorff-spectrum/ — Hausdorff dimension via transfer operator + Chebyshev collocation:

hausdorff_spectrum.cu — all 2^20 - 1 subsets of {1,...,20}

lyapunov-spectrum/ — Lyapunov exponents of CF digit sets:

lyapunov_spectrum.cu — full spectrum computation

minkowski-spectrum/ — Minkowski question-mark function:

minkowski_spectrum.cu — singularity spectrum

flint-hills/ — Flint Hills series partial sums:

flint_hills.cu — high-precision partial sum to 10B terms
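
The Flint Hills series is the sum over n >= 1 of 1/(n^3 sin^2 n), whose partial sums jump whenever n is very close to a multiple of pi. The host-side C++ sketch below computes a partial sum in double precision with Kahan compensated summation. It is an illustration only: the name `flint_hills_partial` is hypothetical, and double precision with the standard library `sin` is an assumption that is adequate for small n but not for the 10B-term range, where accurate argument reduction needs extended precision.

```cpp
#include <cmath>
#include <cstdint>

// Partial sum of 1 / (n^3 * sin(n)^2) for n = 1..terms, with Kahan summation.
double flint_hills_partial(std::uint64_t terms) {
    double sum = 0.0;
    double comp = 0.0;  // running compensation for lost low-order bits
    for (std::uint64_t n = 1; n <= terms; ++n) {
        const double x = static_cast<double>(n);
        const double s = std::sin(x);
        const double term = 1.0 / (x * x * x * s * s);
        const double y = term - comp;
        const double t = sum + y;
        comp = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```

The effect of near-multiples of pi is visible already at n = 355, since 355/113 approximates pi to about 2.7e-7: that single term contributes roughly 24.6, far more than all preceding terms combined.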

Results

All computation results are open:

Website: bigcompute.science
Datasets: huggingface.co/cahlen
Source code: github.com/cahlen/idontknow
MCP server: mcp.bigcompute.science

License

MIT

Citation

@misc{humphreys2026bigcompute,
  author = {Humphreys, Cahlen},
  title = {bigcompute.science: GPU-Accelerated Computational Mathematics},
  year = {2026},
  url = {https://bigcompute.science}
}

Human-AI collaborative research (Cahlen Humphreys + Claude). All code and data open for verification.