---
language:
  - en
license: mit
tags:
  - cuda
  - gpu
  - number-theory
  - computational-mathematics
  - continued-fractions
  - zaremba
  - ramsey
  - kronecker-coefficients
  - class-numbers
  - hausdorff-dimension
  - ramanujan-machine
  - erdos-straus
  - prime-convergents
  - flint-hills
  - spectral-methods
  - bigcompute
library_name: other
pipeline_tag: other
datasets:
  - cahlen/zaremba-density
  - cahlen/zaremba-conjecture-data
  - cahlen/class-numbers-real-quadratic
  - cahlen/kronecker-coefficients
  - cahlen/hausdorff-dimension-spectrum
  - cahlen/continued-fraction-spectra
  - cahlen/ramanujan-machine-results
---

# bigcompute.science CUDA Kernels

51 custom CUDA kernels for GPU-accelerated computational mathematics research. These kernels power the experiments at [bigcompute.science](https://bigcompute.science).

All kernels are standalone: compile with `nvcc` and run from the command line. There is no PyTorch dependency.

## Hardware

Developed and tested on:
- **8x NVIDIA B200** (183 GB VRAM each, sm_100)
- **NVIDIA RTX 5090** (32 GB VRAM, sm_120)

Most kernels will run on any CUDA GPU (sm_50+). Compile with your target architecture:
```bash
nvcc -O3 -arch=sm_XX -o kernel kernel.cu -lm
```
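
If you are unsure which `sm_XX` value applies to your GPU, one option is to ask the driver for its compute capability. The sketch below assumes `nvidia-smi` is on `PATH` and that the installed driver supports the `compute_cap` query field; the helper names `arch_flag` and `detect_arch_flags` are hypothetical and are not part of this repository.

```python
import subprocess


def arch_flag(compute_cap: str) -> str:
    """Turn a compute capability such as '12.0' into an nvcc flag such as '-arch=sm_120'."""
    major, minor = compute_cap.strip().split(".")
    return f"-arch=sm_{int(major)}{int(minor)}"


def detect_arch_flags() -> list:
    """Return one flag per distinct GPU model visible to the driver.

    Assumption: nvidia-smi exists and understands --query-gpu=compute_cap.
    """
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=compute_cap", "--format=csv,noheader"],
        check=True, capture_output=True, text=True,
    ).stdout
    return sorted({arch_flag(line) for line in out.splitlines() if line.strip()})
```

For example, `arch_flag("8.6")` yields `-arch=sm_86`, which can be pasted into the `nvcc` command above.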

## Kernels by Experiment

### Zaremba's Conjecture (26 kernels)

**Density enumeration** (`zaremba-density/`) — complete CF tree enumeration with bitset marking:
- `zaremba_density_gpu.cu` — production kernel, 65+ runs to 10^12
- `zaremba_density_v2.cu` — alternative implementation
- `zaremba_density_gpu_worksteal_v2.cu` — work-stealing variant for load balancing
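
The idea of tree enumeration with bitset marking can be illustrated on the CPU. The sketch below is a minimal Python reference, not the CUDA kernel: it walks the tree of continuants `q_next = a * q + q_prev` for partial quotients `a` in `1..max_digit`, sets one bit per denominator reached, and reports the fraction of denominators up to `bound` that were hit. The function names and the depth-first traversal order are assumptions made for illustration.

```python
def mark_denominators(bound: int, max_digit: int) -> bytearray:
    """Bitset of all continuant denominators q <= bound with digits in 1..max_digit."""
    bits = bytearray(bound // 8 + 1)

    def mark(q: int) -> None:
        bits[q >> 3] |= 1 << (q & 7)

    mark(1)                      # the empty continued fraction has denominator 1
    stack = [(0, 1)]             # tree nodes are pairs (q_prev, q)
    while stack:
        q_prev, q = stack.pop()
        for a in range(1, max_digit + 1):
            q_next = a * q + q_prev
            if q_next > bound:   # larger digits only give larger denominators
                break
            mark(q_next)
            stack.append((q, q_next))
    return bits


def is_marked(bits: bytearray, q: int) -> bool:
    return bool((bits[q >> 3] >> (q & 7)) & 1)


def density(bits: bytearray, bound: int) -> float:
    """Fraction of denominators in 1..bound that were marked."""
    return sum(is_marked(bits, q) for q in range(1, bound + 1)) / bound
```

With `max_digit=1` the only denominators reached are Fibonacci numbers, which makes a convenient sanity check before trying larger digit sets.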

**Transfer operator** (`zaremba-transfer-operator/`) — Chebyshev collocation spectral method:
- `transfer_operator.cu` — spectral gap computation for Ruelle operator

**Effective bound** (`zaremba-effective-bound/`) — Bourgain-Kontorovich proof framework:
- `spectral_gaps_fast.cu` — bulk spectral gap verification
- `spectral_gaps_primes.cu` — prime-indexed gaps
- `certify_rho_cuda.cu` — arb ball arithmetic certification
- `compute_Q0.cu` / `Q0_frolenkov_kan.cu` — effective constant extraction
- `count_representations.cu` — CF representation counting
- `dolgopyat_exact.cu` / `dolgopyat_profile.cu` — Dolgopyat estimate profiling
- `exponential_sum.cu` — exponential sum bounds
- `extract_eigenfunction.cu` — transfer operator eigenfunction extraction
- `flat_spectral_gap.cu` — uniform spectral gap verification
- `matrix_enum.cu` / `matrix_enum_multipass.cu` — SL(2,Z) matrix enumeration
- `minor_arc_primes.cu` / `minor_arc_profile.cu` — minor arc estimates
- `verify_all_gaps_fp64.cu` / `verify_gaps_interval.cu` / `verify_gaps_v2.cu` — gap verification suite
- `compute_c1_rigorous.cu` — rigorous constant computation

**Cayley diameters** (`zaremba-cayley-diameter/`) — BFS on Cayley graphs of SL(2,Z/pZ):
- `cayley_diameter.cu` / `cayley_gpu.cu` — full BFS diameter computation
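
A breadth-first search over a Cayley graph can be sketched in a few lines for small primes. The Python reference below is only an illustration: it assumes the generating set consists of the two elementary matrices and their inverses, which may differ from the generators used by the kernels, and the function name `cayley_diameter` is hypothetical. Because a Cayley graph is vertex-transitive, the largest distance from the identity equals the diameter.

```python
from collections import deque


def cayley_diameter(p: int):
    """BFS from the identity in SL(2, Z/pZ); returns (diameter, elements reached)."""
    # Assumed generators: [[1,1],[0,1]], [[1,0],[1,1]] and their inverses mod p.
    gens = [(1, 1, 0, 1), (1, p - 1, 0, 1), (1, 0, 1, 1), (1, 0, p - 1, 1)]

    def mul(m, g):
        a, b, c, d = m
        e, f, h, k = g
        return ((a * e + b * h) % p, (a * f + b * k) % p,
                (c * e + d * h) % p, (c * f + d * k) % p)

    identity = (1, 0, 0, 1)
    dist = {identity: 0}
    queue = deque([identity])
    farthest = 0
    while queue:
        m = queue.popleft()
        for g in gens:
            nxt = mul(m, g)
            if nxt not in dist:
                dist[nxt] = dist[m] + 1
                farthest = max(farthest, dist[nxt])
                queue.append(nxt)
    return farthest, len(dist)
```

The second return value should equal the group order p(p^2 - 1), which confirms that the search covered the whole group.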

**Transitivity** (`zaremba-transitivity/`) — algebraic verification:
- `check_transitivity.cu` — Dickson classification check

### Ramsey R(5,5) (9 kernels)

`ramsey-r55/` — search for 2-colorings of complete graphs with no monochromatic K5:
- `ramsey_gpu.cu` — base simulated annealing kernel
- `ramsey_incremental.cu` / `ramsey_incremental_v2.cu` — incremental K5 counter
- `ramsey_extend.cu` / `ramsey_extend_all.cu` — exhaustive extension checking (4.4T extensions of K42 to K43)
- `ramsey_fullcount.cu` — complete clique enumeration
- `ramsey_search.cu` / `ramsey_global.cu` / `ramsey_verified.cu` — search variants
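
The quantity these searches drive to zero is the number of monochromatic K5 subgraphs. The Python sketch below is a brute-force reference, not the kernel code: `count_mono_cliques` counts every monochromatic clique, and `cliques_through_edge` counts only those containing one edge, which is all that can change when that edge is recolored. Whether the incremental kernels use exactly this decomposition is an assumption; the function names are hypothetical.

```python
from itertools import combinations


def count_mono_cliques(color, n, k=5):
    """Count k-subsets of vertices whose edges all share one color.

    color[i][j] is 0 or 1 for i != j and is assumed symmetric.
    """
    total = 0
    for verts in combinations(range(n), k):
        first = color[verts[0]][verts[1]]
        if all(color[u][v] == first for u, v in combinations(verts, 2)):
            total += 1
    return total


def cliques_through_edge(color, n, u, v, k=5):
    """Count monochromatic k-cliques that contain the edge (u, v)."""
    c = color[u][v]
    others = [w for w in range(n) if w != u and w != v]
    total = 0
    for rest in combinations(others, k - 2):
        verts = (u, v) + rest
        if all(color[a][b] == c for a, b in combinations(verts, 2)):
            total += 1
    return total
```

After recoloring edge (u, v), the new total is the old total minus the through-edge count before the flip plus the through-edge count after it, so a local search step never needs a full recount.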

### Class Numbers (4 kernels)

`class-numbers/` — class numbers of real quadratic fields via BSGS:
- `class_numbers_v2.cu` — production kernel (10^9 to 10^12 range)
- `class_number_rqf.cu` — real quadratic field specialization
- `class_number_fast.cu` — optimized inner loop
- `sieve_gpu.cu` — GPU prime sieve

### Kronecker Coefficients (3 kernels)

`kronecker-coefficients/` — character tables and Kronecker triple computation:
- `kronecker_gpu.cu` — full character table (S20: 3.7s, S30: 7.4 min, S40: 9.5 hr)
- `kronecker_fast.cu` — optimized triple-sum
- `kronecker_compute.cu` — targeted triple computation
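
Given a character table, a Kronecker coefficient is the class-size-weighted triple sum g(λ, μ, ν) = (1/n!) Σ_C |C| χ^λ(C) χ^μ(C) χ^ν(C). The Python sketch below works this out for S3, whose character table is small enough to write by hand; it illustrates the formula only and says nothing about how the kernels build or store tables for large n. The names `CLASS_SIZES`, `CHARACTERS`, and `kronecker` are hypothetical.

```python
# Conjugacy classes of S_3 in order: identity, transpositions, 3-cycles.
CLASS_SIZES = [1, 3, 2]

# Irreducible characters of S_3, indexed by partition.
CHARACTERS = {
    (3,): [1, 1, 1],        # trivial
    (2, 1): [2, 0, -1],     # standard
    (1, 1, 1): [1, -1, 1],  # sign
}


def kronecker(lam, mu, nu):
    """Kronecker coefficient g(lam, mu, nu) from the triple character sum."""
    order = sum(CLASS_SIZES)  # |S_3| = 6
    total = sum(
        size * CHARACTERS[lam][i] * CHARACTERS[mu][i] * CHARACTERS[nu][i]
        for i, size in enumerate(CLASS_SIZES)
    )
    if total % order != 0:
        raise ValueError("triple sum is not divisible by the group order")
    return total // order
```

For instance, `kronecker((2, 1), (2, 1), (2, 1))` evaluates (8 + 0 - 2) / 6 = 1.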

### Ramanujan Machine (2 kernels)

`ramanujan-machine/` — automated discovery of continued fraction formulas:
- `ramanujan_gpu.cu` — v1 kernel (equal-degree polynomials, exhausted)
- `ramanujan_v2.cu` — v2 kernel (asymmetric-degree, where new discoveries live)
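
Each candidate in such a search is a polynomial continued fraction a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)) that is evaluated numerically and compared against known constants. The Python sketch below evaluates one candidate bottom-up in double precision; it is an illustration with a hypothetical function name, and the classical example used here (a(n) = 2n + 1, b(n) = n^2, which converges to 4/π) is a textbook formula, not a result from this project.

```python
import math


def eval_pcf(a, b, depth):
    """Evaluate a(0) + b(1)/(a(1) + b(2)/(a(2) + ...)) truncated at level `depth`."""
    value = float(a(depth))
    for n in range(depth, 0, -1):
        value = a(n - 1) + b(n) / value
    return value


# Classical continued fraction for 4/pi, derived from the arctangent expansion.
approx = eval_pcf(lambda n: 2 * n + 1, lambda n: n * n, depth=40)
error = abs(approx - 4.0 / math.pi)
```

A real search would need more than double precision to separate genuine identities from near misses; floats are used here only to keep the sketch short.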

### Prime Convergents (2 kernels)

`prime-convergents/` — prime statistics of CF convergents:
- `prime_convergents.cu` — v1 (uint64, depth ~38)
- `prime_convergents_v2.cu` — v2 (uint128, depth ~75, 128-bit Miller-Rabin)
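
The two ingredients here are the convergent recurrence and a primality test. The Python sketch below is a CPU reference with hypothetical function names: `convergents` applies p_k = a_k p_{k-1} + p_{k-2} (and likewise for q_k), and `is_prime` is Miller-Rabin with the first twelve primes as bases, which is deterministic for inputs below 2^64. For 128-bit inputs this base set should be treated as a probabilistic test; the bases actually used by the kernels are not stated here.

```python
def convergents(quotients):
    """Yield (p_k, q_k) for the continued fraction [a0; a1, a2, ...]."""
    p_prev, p = 1, quotients[0]
    q_prev, q = 0, 1
    yield p, q
    for a in quotients[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        yield p, q


_BASES = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)


def is_prime(n: int) -> bool:
    """Miller-Rabin with fixed bases; deterministic for n < 2**64."""
    if n < 2:
        return False
    for b in _BASES:
        if n % b == 0:
            return n == b
    d, r = n - 1, 0
    while d % 2 == 0:          # write n - 1 = d * 2**r with d odd
        d //= 2
        r += 1
    for b in _BASES:
        x = pow(b, d, n)
        if x == 1 or x == n - 1:
            continue
        for _ in range(r - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False       # b is a witness that n is composite
    return True
```

For example, the convergents of [1; 2, 2, 2, 2] (the start of the expansion of the square root of 2) are 1/1, 3/2, 7/5, 17/12, 41/29, and `is_prime` can then be applied to each numerator and denominator.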

### Erdős-Straus Conjecture (1 kernel)

`erdos-straus/` — solution counting for 4/p = 1/x + 1/y + 1/z:
- `erdos_straus.cu` — per-prime f(p) enumeration, tested to 10^9
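
Solution counting reduces to two nested bounded loops once the unknowns are ordered. The Python sketch below is a CPU reference that counts triples with x <= y <= z; whether the kernel's f(p) counts ordered or unordered solutions is not stated in this README, so that convention and the function name are assumptions.

```python
def erdos_straus_count(n: int) -> int:
    """Count solutions of 4/n = 1/x + 1/y + 1/z with x <= y <= z."""
    count = 0
    # 1/x < 4/n <= 3/x  implies  n/4 < x <= 3n/4.
    for x in range(n // 4 + 1, (3 * n) // 4 + 1):
        num = 4 * x - n            # 4/n - 1/x = num / den, with num > 0
        den = n * x
        # 1/y < num/den <= 2/y  implies  den/num < y <= 2*den/num, and y >= x.
        y_lo = max(x, den // num + 1)
        y_hi = (2 * den) // num
        for y in range(y_lo, y_hi + 1):
            # 1/z = num/den - 1/y = (num*y - den) / (den*y)
            z_num = den * y
            z_den = num * y - den
            if z_num % z_den == 0:  # z is an integer, and z >= y by the bound on y
                count += 1
    return count
```

For n = 5 the two solutions found are 1/2 + 1/4 + 1/20 and 1/2 + 1/5 + 1/10.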

### Spectral Computations (4 kernels)

`hausdorff-spectrum/` — Hausdorff dimension via transfer operator + Chebyshev collocation:
- `hausdorff_spectrum.cu` — all 2^20 - 1 subsets of {1,...,20}
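
For a single digit set A, the dimension is the value of s at which the transfer operator (L_s f)(x) = Σ_{a in A} (a + x)^(-2s) f(1/(a + x)) has leading eigenvalue 1. The Python sketch below is a minimal CPU reference under stated assumptions: 16 Chebyshev nodes on [0, 1], barycentric interpolation, power iteration for the leading eigenvalue, and bisection in s. The node count, iteration counts, and function names are illustrative choices, not the kernel's parameters.

```python
import math


def leading_eigenvalue(digits, s, n_nodes=16, iters=200):
    """Leading eigenvalue of the collocation matrix of L_s for the given digit set."""
    theta = [(2 * k + 1) * math.pi / (2 * n_nodes) for k in range(n_nodes)]
    nodes = [0.5 * (1.0 + math.cos(t)) for t in theta]          # Chebyshev nodes on [0, 1]
    weights = [(-1) ** k * math.sin(t) for k, t in enumerate(theta)]  # barycentric weights

    def basis_row(y):
        """Values of all Lagrange basis polynomials at y."""
        for k, x in enumerate(nodes):
            if y == x:
                return [1.0 if j == k else 0.0 for j in range(n_nodes)]
        terms = [w / (y - x) for w, x in zip(weights, nodes)]
        total = sum(terms)
        return [t / total for t in terms]

    matrix = []
    for x in nodes:
        row = [0.0] * n_nodes
        for a in digits:
            scale = (a + x) ** (-2.0 * s)
            for j, val in enumerate(basis_row(1.0 / (a + x))):
                row[j] += scale * val
        matrix.append(row)

    vec = [1.0] * n_nodes
    lam = 1.0
    for _ in range(iters):                                      # power iteration
        new = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        lam = max(abs(v) for v in new)
        vec = [v / lam for v in new]
    return lam


def hausdorff_dimension(digits, tol=1e-10):
    """Bisect on s in [0, 1] until the leading eigenvalue crosses 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if leading_eigenvalue(digits, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `hausdorff_dimension([1, 2])` should reproduce the well-known value of about 0.5312805 for continued fractions with digits 1 and 2, which is a useful check before scaling to many subsets.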

`lyapunov-spectrum/` — Lyapunov exponents of CF digit sets:
- `lyapunov_spectrum.cu` — full spectrum computation

`minkowski-spectrum/` — Minkowski question-mark function:
- `minkowski_spectrum.cu` — singularity spectrum

`flint-hills/` — Flint Hills series partial sums:
- `flint_hills.cu` — high-precision partial sum to 10B terms

## Results

All computation results are open:
- **Website**: [bigcompute.science](https://bigcompute.science)
- **Datasets**: [huggingface.co/cahlen](https://huggingface.co/cahlen)
- **Source code**: [github.com/cahlen/idontknow](https://github.com/cahlen/idontknow)
- **MCP server**: [mcp.bigcompute.science](https://mcp.bigcompute.science)

## License

MIT

## Citation

```bibtex
@misc{humphreys2026bigcompute,
  author = {Humphreys, Cahlen},
  title = {bigcompute.science: GPU-Accelerated Computational Mathematics},
  year = {2026},
  url = {https://bigcompute.science}
}
```

*Human-AI collaborative research (Cahlen Humphreys + Claude). All code and data open for verification.*