jigsawR / docs /performance-optimization-plan.md
pjt222's picture
Upload folder using huggingface_hub
e232e39 verified

jigsawR Performance Optimization Plan

Baseline Performance (December 2025)

Puzzle Type Size Pieces Median Time
Rectangular 5x5 25 50.3ms
Rectangular 8x8 64 104ms
Hexagonal 3 rings 19 302ms
Hexagonal 5 rings 61 1.06s
Concentric 3 rings 19 92.4ms

Key Observations

  1. Hexagonal puzzles are 6x slower than rectangular for similar piece counts
  2. Scaling: Hexagonal 61 pieces takes 1.06s vs rectangular 64 pieces at 104ms
  3. Target: Achieve 2x+ speedup for hexagonal puzzles (GitHub Issue #48)

Identified Hotspots

HIGH PRIORITY

# Issue File:Line Impact
1 String concat in loops rectangular_puzzle.R:134-146 O(n²) → O(n)
2 Vector grow-on-append unified_piece_generation.R:376 O(n²) → O(n)
3 Neighbor search O(n²) unified_renderer.R:1226 O(n²) → O(1)
4 SVG path parsing per render unified_renderer.R:1147 n×parse → 1×cache
5 Hex topology recalc unified_piece_generation.R:306 per-render → once

MEDIUM PRIORITY

# Issue File:Line Impact
6 Repeated bounds calc unified_piece_generation.R:373-389 3×calc → 1×cache
7 Environment lookups hexagonal_puzzle.R:90-120 micro-optimization

Implementation Plan

Phase 1: Quick Wins (Target: 1.5-2x speedup)

Estimated effort: 1-2 days

1.1 Pre-allocate Vectors Instead of Grow-on-Append

Location: unified_piece_generation.R:373-389, piece_positioning.R:287-299

# BEFORE (O(n²))
all_path_x <- c()
for (piece in pieces) {
  all_path_x <- c(all_path_x, x_coords)  # Growing vector
}

# AFTER (O(n))
all_path_x <- vector("numeric", estimated_size)
idx <- 1
for (piece in pieces) {
  n <- length(x_coords)
  all_path_x[idx:(idx + n - 1)] <- x_coords
  idx <- idx + n
}
# Or use list + unlist pattern
coords_list <- lapply(pieces, extract_coords)
all_path_x <- unlist(coords_list)

1.2 Build Piece ID Lookup Hash Map

Location: unified_renderer.R:1200-1324

# BEFORE (O(n) per lookup)
for (np in pieces) {
  np_id <- np$id %||% which(...)
  if (np_id == neighbor_id) { ... }
}

# AFTER (O(1) per lookup)
piece_lookup <- new.env(hash = TRUE)
for (i in seq_along(pieces)) {
  piece_lookup[[as.character(pieces[[i]]$id)]] <- i
}
# Lookup:
idx <- piece_lookup[[as.character(neighbor_id)]]
neighbor_piece <- pieces[[idx]]

1.3 Cache Environment Variables in Local Scope

Location: hexagonal_puzzle.R:90-120

# BEFORE (env lookup per call)
hex_p1 <- function() list(l = hex_l(0.2), w = hex_w(.hex_jigsaw_env$a))

# AFTER (local variable)
hex_gentab_optimized <- function(v1, v2, isnew) {
  env <- .hex_jigsaw_env  # Single lookup
  a <- env$a; b <- env$b; c <- env$c; d <- env$d; e <- env$e
  flip <- env$flip; t <- env$t
  # Use local variables throughout
}

Phase 2: Caching (Target: additional 1.5x)

Estimated effort: 3-5 days

2.1 Cache Parsed SVG Paths at Generation Time

Priority: HIGH Estimated impact: Significant - path is parsed 3-5x per piece during rendering

Problem: parse_svg_path() is called repeatedly on the same path:

  • calculate_piece_bounds() (unified_renderer.R:388) - label positioning
  • split_rect_path_into_edges() (unified_renderer.R:526) - rectangular edge extraction
  • split_concentric_path_into_edges() (unified_renderer.R:619) - concentric edge extraction
  • split_hex_path_into_edges() (unified_renderer.R:926) - hexagonal edge extraction

Implementation Steps:

  1. Modify piece generation (unified_piece_generation.R):
# In generate_rectangular_pieces_internal(), generate_hexagonal_pieces_internal(), etc.:
piece <- list(
  id = piece_id,
  path = piece_path,
  parsed_segments = parse_svg_path(piece_path),  # Cache once at generation
  center = c(center_x, center_y),
  ...
)
  1. Modify consumers (unified_renderer.R):
# In split_*_path_into_edges() and calculate_piece_bounds():
segments <- if (!is.null(piece$parsed_segments)) {
  piece$parsed_segments
} else {
  parse_svg_path(piece$path)  # Fallback for backwards compatibility
}

Files to modify:

  • R/unified_piece_generation.R - 3 locations (rect, hex, concentric piece creation)
  • R/unified_renderer.R - 4 locations (calculate_piece_bounds, 3x split_*_path_into_edges)

2.2 Pre-compute Hex Topology-to-Geometry Mappings

Priority: MEDIUM Estimated impact: Moderate - complex mapping computed per piece

Problem: topo_to_geo_map is computed at lines 303-327 during hexagonal piece generation. The mapping logic is repeated.

Implementation Steps:

  1. Extract mapping computation (unified_piece_generation.R):
# Create helper function
compute_hex_topo_geo_map <- function(piece_id, hex_pieces, n_rings) {
  # Extract mapping logic from lines 303-327
  topo_to_geo_map <- list()
  # ... mapping computation ...
  return(topo_to_geo_map)
}
  1. Store in piece object:
piece <- list(
  id = piece_id,
  path = hp$path,
  topo_to_geo_map = compute_hex_topo_geo_map(piece_id, hex_pieces, n_rings),
  ...
)

Files to modify:

  • R/unified_piece_generation.R - extract and cache mapping

2.3 Fix remaining grow-on-append in calculate_piece_bounds

Priority: LOW Location: unified_renderer.R:390-404

# BEFORE (O(n²))
xs <- c()
ys <- c()
for (seg in segments) {
  xs <- c(xs, seg$x)
  ys <- c(ys, seg$y)
}

# AFTER (O(n))
xs <- unlist(lapply(segments, function(seg) {
  if (seg$type == "C") c(seg$cp1x, seg$cp2x, seg$x)
  else if (seg$type %in% c("M", "L", "A")) seg$x
  else NULL
}), use.names = FALSE)

2.4 Single-Pass Canvas Size Calculation ✅ (Done in Phase 1.1)

Already implemented via calculate_pieces_bounds() in bezier_utils.R.


Phase 3: Parallelization (Target: 2-4x on multi-core)

Estimated effort: 1 week

3.1 Parallel Piece Generation

Location: unified_piece_generation.R:147-202 (rectangular), 278-366 (hexagonal)

# Using furrr for parallel processing
library(furrr)
plan(multisession, workers = parallel::detectCores() - 1)

pieces <- furrr::future_map(seq_len(n_pieces), function(i) {
  generate_single_piece_data(i, ...)
}, .options = furrr_options(seed = TRUE))

3.2 Parallel Batch Generation

Location: jigsawR_clean.R:391-443 (generate_puzzle_batch)

Already has infrastructure; enable parallel backend by default.


Acceptance Criteria

From GitHub Issue #48:

  • Benchmark current performance as baseline ✓
  • Implement at least one optimization technique ✓ (Phase 1.1, 1.2, 2.1, 2.3)
  • Achieve measurable speedup ✓ (15.67x on split_path operations via Phase 2.1)
  • All existing tests pass ✓ (1432 tests passing)
  • No breaking changes to public API ✓

Testing Strategy

  1. Unit tests: Verify optimized functions produce identical output
  2. Integration tests: Run existing test suite after each change
  3. Benchmark tests: Compare before/after timing for all puzzle types
  4. Regression tests: Verify SVG output byte-for-byte identical (same seed)

Rollback Plan

All optimizations will be:

  1. Implemented in separate commits with clear descriptions
  2. Feature-flagged where appropriate (e.g., options(jigsawR.parallel = TRUE))
  3. Documented with before/after benchmarks

Files to Modify

Phase 1

  • R/unified_piece_generation.R
  • R/unified_renderer.R
  • R/piece_positioning.R
  • R/hexagonal_puzzle.R

Phase 2

  • R/unified_piece_generation.R
  • R/unified_renderer.R
  • R/bezier_utils.R

Phase 3

  • R/unified_piece_generation.R
  • R/jigsawR_clean.R
  • DESCRIPTION (add furrr to Suggests)

Progress Tracking

Phase Task Status Speedup
1.1 Pre-allocate vectors ✅ Complete TBD
1.2 Piece ID hash map ✅ Complete TBD
1.3 Cache env variables ⏭️ Skipped Minimal impact
2.1 Cache parsed paths ✅ Complete 15.67x (on split_path ops)
2.2 Pre-compute hex mappings ⏭️ Skipped Already cached via fused_edges
2.3 Fix grow-on-append bounds ✅ Complete O(n²) → O(n)
3.1 Parallel piece gen ⏭️ Deferred Complex (RNG state)
3.2 Parallel batch gen ✅ Complete ~Nx (N workers)

Phase 1 Implementation Notes (December 2025)

1.1 Pre-allocate vectors - Replaced O(n²) grow-on-append patterns with O(n) list+unlist:

  • R/bezier_utils.R: Added extract_path_coords(), extract_all_piece_coords(), calculate_pieces_bounds()
  • R/unified_piece_generation.R: Hexagonal and concentric canvas calculation
  • R/piece_positioning.R: Concentric and hexagonal positioning bounds

1.2 Piece ID hash map - Replaced O(n²) neighbor lookups with O(1) hash map:

  • R/unified_renderer.R: Added piece_lookup environment and get_piece_by_id() helper in render_pieces_with_fusion_styled()

1.3 Cache env variables - Skipped: Would require significant restructuring of hexagonal_puzzle.R closures for minimal gain.

Phase 2 Implementation Notes (December 2025)

2.1 Cache parsed paths - Added parsed_segments caching at piece generation time:

  • R/unified_piece_generation.R: Added parsed_segments = parse_svg_path(path) to all 3 piece types (rect, hex, conc)
  • R/unified_renderer.R: Updated 6 functions to use cached segments:
    • split_rect_path_into_edges() (line 526)
    • split_concentric_path_into_edges() (line 623)
    • split_hex_path_into_edges() (line 935)
    • calculate_path_bounding_box_center() (line 373)
    • calculate_piece_bounds() (line 1630)
  • Measured speedup: 15.67x on split_*_path_into_edges operations (470ms → 30ms for 100 calls)

2.2 Pre-compute hex mappings - Skipped: Analysis showed topo_to_geo_map is already computed once during hexagonal piece generation and results stored in fused_edges. No additional caching needed.

2.3 Fix grow-on-append in calculate_piece_bounds - Replaced O(n²) vector concatenation with O(n) list+unlist:

# Before (O(n²))
xs <- c()
for (seg in segments) { xs <- c(xs, seg$x) }

# After (O(n))
coord_lists <- lapply(segments, function(seg) list(x = seg$x, y = seg$y))
xs <- unlist(lapply(coord_lists, `[[`, "x"), use.names = FALSE)

Phase 3 Implementation Notes (December 2025)

3.1 Parallel piece generation - Deferred: The piece generators use environment-based RNG state (.rect_jigsaw_env, .hex_jigsaw_env) which is problematic for parallelization. Would require significant restructuring to make generators functional (stateless). Given that Phase 2 caching already provides major speedup, this is deferred for future consideration.

3.2 Parallel batch generation - Added optional parallel execution to generate_puzzle_batch():

  • R/jigsawR_clean.R: Updated function with parallel and workers parameters
  • Uses furrr::future_map() with future::multisession backend
  • Gracefully falls back to sequential if:
    • furrr/future packages not available
    • jigsawR package not installed (required for workers)
  • Expected speedup: Near-linear with worker count (N workers → ~Nx speedup)
  • Requirements: Package must be installed (devtools::install()) for parallel mode

Usage example:

# Sequential (works with devtools::load_all())
results <- generate_puzzle_batch(variations)

# Parallel (requires installed package)
results <- generate_puzzle_batch(variations, parallel = TRUE, workers = 4)

References

  • GitHub Issue #48: Performance optimization
  • tests/benchmark/baseline_benchmark.R: Comprehensive benchmark suite
  • R/performance_utils.R: Existing optimization utilities