[Feat] Analysis: Thread-safety blockers preventing r.proj parallelization #7196

@HUN-sp

Is your feature request related to a problem? Please describe.

r.proj has a commented-out #pragma omp parallel for at main.c:715,
disabled since 2012 (revision r52882) due to segmentation faults.
Reprojecting large rasters (100M+ cells) with bicubic or lanczos
interpolation is extremely slow on modern multi-core systems with no
way to utilize available CPU cores.

Describe the solution you'd like

Parallelize the reprojection loop in r.proj using OpenMP by resolving
the 5 thread-safety blockers identified below.

Describe alternatives you've considered

  • Workflow-level parallelization via PyGRASS GridModule (works today
    but requires user scripting and has process overhead)
  • Using gdalwarp externally (loses GRASS metadata, requires
    export/import round-trip)
  • Waiting for issue #5738, "[Feat] make parallel raster reading thread safe" (not required: r.proj's reprojection loop
    does not call Rast_get_row(), as documented below)

Additional context

The 5 blockers

  1. Shared tile cache — get_block() in readcell.c:115–145
    c->grid[], c->refs[], and G_lrand48() (global PRNG seed) are all shared mutable state. If Thread A evicts a tile Thread B is reading, Thread B gets a dangling pointer → segfault. This is the direct cause of the 2012 failure.
  2. Static globals in GPJ_transform() — lib/proj/do_proj.c:42
    static double METERS_in = 1.0, METERS_out = 1.0;
    Every thread writes to these on every call. In r.proj's loop the values happen to be constant (same CRS pair per pixel), so the race currently produces correct results by accident — but it is undefined behavior in C and fragile to any future change.
  3. Shared PJ * transformation object — main.c:727
    A single tproj.pj object is shared across all threads. PROJ documentation explicitly states a PJ * on PJ_DEFAULT_CTX is not safe for concurrent use. Each thread needs its own PJ_CONTEXT and PJ *.
  4. Single output row buffer obuffer — main.c:719
    Safe for column-level parallelism within one row, but parallelizing the outer row loop requires per-thread row buffers.
  5. G_percent() inside the loop — main.c:709
    Not thread-safe inside a parallel region (same class of issue as #5776, "[Bug] G_percent is not safe to be called from parallel code").
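
For blocker 2, one possible direction is to move the unit factors out of static storage and into a caller-owned struct that is filled once before the loop and only read inside it. The sketch below is not the lib/proj code: `struct unit_factors`, `unit_factors_init()` and `scale_coord()` are hypothetical names, and the actual projection step is left out so that only the unit scaling is shown.

```c
#include <assert.h>

/* Hypothetical stand-in for the factors kept in the static
 * METERS_in / METERS_out variables; here the caller owns them. */
struct unit_factors {
    double meters_in;  /* input units -> metres */
    double meters_out; /* output units -> metres */
};

/* Hypothetical helper: fill the factors once, before the parallel loop. */
struct unit_factors unit_factors_init(double meters_in, double meters_out)
{
    struct unit_factors u = {meters_in, meters_out};

    assert(meters_in > 0.0 && meters_out > 0.0);
    return u;
}

/* Read-only use inside the loop: nothing shared is written, so threads
 * calling this concurrently with the same struct do not race.
 * The real projection call is omitted on purpose. */
double scale_coord(const struct unit_factors *u, double x)
{
    return x * u->meters_in / u->meters_out;
}
```

Because the struct is passed as `const`, the compiler also flags any accidental write from inside the loop body.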

Relationship to #5738
r.proj's reprojection loop does not call Rast_get_row() — the GRASS raster library's shared state (R__.fileinfo[x].cur_row etc., root cause of the r.patch failure in #2809) is not involved. The 5 blockers above are entirely within raster/r.proj/ and lib/proj/do_proj.c.
This means r.proj can be parallelized independently of #5738, without waiting for a library-wide rewrite.
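
Since the blockers are local to r.proj, blocker 1 could in principle be handled by giving each thread its own tile cache, so that an eviction can never invalidate a pointer held by another thread. The following is a minimal sketch under several assumptions, not the readcell.c code: `struct tile_cache`, `tile_cache_get()` and `sum_tiles()` are hypothetical, the backing data is an in-memory read-only array rather than a temp file, and eviction is round-robin with a private cursor instead of the shared G_lrand48() call.

```c
#include <assert.h>
#include <string.h>

#define TILE_CELLS 4  /* cells per tile (tiny, for illustration) */
#define CACHE_SLOTS 2 /* tiles held by one thread's cache */

/* Hypothetical per-thread cache: every mutable field has a single owner. */
struct tile_cache {
    const float *source;        /* read-only backing data, safe to share */
    int slot_tile[CACHE_SLOTS]; /* tile index held in each slot, -1 = empty */
    float slots[CACHE_SLOTS][TILE_CELLS];
    unsigned next_victim;       /* private round-robin eviction cursor */
    unsigned loads;             /* number of tile loads, for inspection */
};

void tile_cache_init(struct tile_cache *c, const float *source)
{
    c->source = source;
    for (int i = 0; i < CACHE_SLOTS; i++)
        c->slot_tile[i] = -1;
    c->next_victim = 0;
    c->loads = 0;
}

/* Return the requested tile, loading it on a miss. The pointer stays valid
 * until this same cache reuses the slot; no other thread can evict it. */
const float *tile_cache_get(struct tile_cache *c, int tile)
{
    assert(tile >= 0);
    for (int i = 0; i < CACHE_SLOTS; i++)
        if (c->slot_tile[i] == tile)
            return c->slots[i];

    unsigned v = c->next_victim;
    c->next_victim = (v + 1) % CACHE_SLOTS;
    memcpy(c->slots[v], c->source + (size_t)tile * TILE_CELLS,
           sizeof c->slots[v]);
    c->slot_tile[v] = tile;
    c->loads++;
    return c->slots[v];
}

/* Sum every cell of the source, reading through per-thread caches. */
double sum_tiles(const float *source, int ntiles)
{
    double total = 0.0;

#pragma omp parallel reduction(+ : total)
    {
        struct tile_cache cache; /* one cache per thread, on its own stack */

        tile_cache_init(&cache, source);
#pragma omp for
        for (int t = 0; t < ntiles; t++) {
            const float *tile = tile_cache_get(&cache, t);

            for (int k = 0; k < TILE_CELLS; k++)
                total += tile[k];
        }
    }
    return total;
}
```

The trade-off is memory: N threads hold N caches, so the per-thread cache size would probably need to shrink relative to the current single cache. A lock around a shared cache is the alternative, at the cost of contention on every tile lookup.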

Question
Rast_put_row() is called after each completed output row. Does it carry shared state analogous to Rast_get_row()? If yes, output writes would need to be serialized or use pre-allocated indexed row buffers. Happy to investigate further if useful.
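
To make the "pre-allocated indexed row buffers" option concrete, here is a rough sketch that also covers blockers 4 and 5: rows are computed in parallel in chunks, each row into its own slice of a chunk buffer, and then written serially in row order, with progress reporting kept in the serial part. Everything here is hypothetical scaffolding: `reproject_chunked()`, the `cell_fn` and `row_sink` callback types, and the demo callbacks stand in for the per-cell reprojection and for Rast_put_row(), and whether Rast_put_row() actually needs this serialization is exactly the open question above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical callbacks: per-cell computation and ordered row output. */
typedef float (*cell_fn)(int row, int col);
typedef void (*row_sink)(const float *row, int ncols, void *user);

/* Fill rows chunk by chunk: rows inside a chunk are computed in parallel,
 * each into its own indexed buffer, then handed to put_row in row order.
 * Returns 0 on success, -1 if the chunk buffer cannot be allocated. */
int reproject_chunked(int nrows, int ncols, int chunk_rows,
                      cell_fn compute, row_sink put_row, void *user)
{
    assert(nrows >= 0 && ncols > 0 && chunk_rows > 0);

    float *chunk = malloc((size_t)chunk_rows * ncols * sizeof *chunk);
    if (!chunk)
        return -1;

    for (int first = 0; first < nrows; first += chunk_rows) {
        int n = nrows - first < chunk_rows ? nrows - first : chunk_rows;

#pragma omp parallel for schedule(dynamic)
        for (int i = 0; i < n; i++) {
            float *obuf = chunk + (size_t)i * ncols; /* private to row i */

            for (int col = 0; col < ncols; col++)
                obuf[col] = compute(first + i, col);
        }

        /* Serial section: ordered writes. A progress call such as
         * G_percent() would go here, outside the parallel region. */
        for (int i = 0; i < n; i++)
            put_row(chunk + (size_t)i * ncols, ncols, user);
    }

    free(chunk);
    return 0;
}

/* Demo callbacks, for illustration only. */
float demo_cell(int row, int col)
{
    return (float)(row * 100 + col);
}

struct row_record {
    int count;            /* rows received so far */
    float first_cell[16]; /* first cell of each received row */
};

void demo_put_row(const float *row, int ncols, void *user)
{
    struct row_record *rec = user;

    (void)ncols;
    if (rec->count < 16)
        rec->first_cell[rec->count] = row[0];
    rec->count++;
}
```

The chunk size trades memory for parallel slack: a chunk of a few rows per thread keeps all cores busy while bounding the buffer to `chunk_rows * ncols` cells.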
