[Feat] Analysis: Thread-safety blockers preventing r.proj parallelization #7196
Description
Is your feature request related to a problem? Please describe.
r.proj has a commented-out #pragma omp parallel for at main.c:715,
disabled since 2012 (revision r52882) due to segmentation faults.
Reprojecting large rasters (100M+ cells) with bicubic or lanczos
interpolation is extremely slow on modern multi-core systems with no
way to utilize available CPU cores.
Describe the solution you'd like
Parallelize the reprojection loop in r.proj using OpenMP by resolving
the 5 thread-safety blockers identified below.
Describe alternatives you've considered
- Workflow-level parallelization via PyGRASS GridModule (works today but requires user scripting and has process overhead)
- Using gdalwarp externally (loses GRASS metadata, requires export/import round-trip)
- Waiting for issue #5738, "[Feat] make parallel raster reading thread safe" (not required: r.proj's reprojection loop does not call Rast_get_row(), as documented below)
Additional context
The 5 blockers
- Shared tile cache: get_block() in readcell.c:115–145
  c->grid[], c->refs[], and G_lrand48() (global PRNG seed) are all shared mutable state. If Thread A evicts a tile Thread B is reading, Thread B gets a dangling pointer and segfaults. This is the direct cause of the 2012 failure.
- Static globals in GPJ_transform(): lib/proj/do_proj.c:42
    static double METERS_in = 1.0, METERS_out = 1.0;
  Every thread writes to these on every call. In r.proj's loop the values happen to be constant (same CRS pair per pixel), so the race currently produces correct results by accident, but it is undefined behavior in C and fragile to any future change.
- Shared PJ * transformation object: main.c:727
  A single tproj.pj object is shared across all threads. PROJ documentation explicitly states a PJ * on PJ_DEFAULT_CTX is not safe for concurrent use. Each thread needs its own PJ_CONTEXT and PJ *.
- Single output row buffer obuffer: main.c:719
  Safe for column-level parallelism within one row, but parallelizing the outer row loop requires per-thread row buffers.
- G_percent() inside the loop: main.c:709
  Not thread-safe inside a parallel region (same class of issue as #5776, "[Bug] G_percent is not safe to be called from parallel code").
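One way the static-globals blocker might be addressed is to move the unit factors out of static storage and into a caller-owned structure that threads only read. The following is a minimal sketch of that pattern, not the actual GPJ_transform() code: struct unit_ctx, scale_units(), and scale_row() are hypothetical names, and the scaling arithmetic is only a placeholder for whatever the real function does with METERS_in and METERS_out.

```c
#include <stddef.h>

/* Hypothetical replacement for the file-scope statics: the unit factors
 * live in a caller-owned struct, so no call writes to shared state. */
struct unit_ctx {
    double meters_in;  /* input units expressed in meters */
    double meters_out; /* output units expressed in meters */
};

/* Reentrant: reads ctx, writes only through the caller's x and y. */
static void scale_units(const struct unit_ctx *ctx, double *x, double *y)
{
    *x = *x * ctx->meters_in / ctx->meters_out;
    *y = *y * ctx->meters_in / ctx->meters_out;
}

/* Convert n coordinate pairs in place. Each iteration touches a distinct
 * element, so the loop can be split across threads without locking.
 * Without OpenMP enabled the pragma is ignored and the loop runs serially. */
void scale_row(const struct unit_ctx *ctx, double *xs, double *ys, size_t n)
{
    long i;

#pragma omp parallel for
    for (i = 0; i < (long)n; i++)
        scale_units(ctx, &xs[i], &ys[i]);
}
```

The same idea (state passed in by the caller instead of held in statics) would presumably extend to the PJ * blocker, where each thread would own its context and transformation object.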
Relationship to #5738
r.proj's reprojection loop does not call Rast_get_row() — the GRASS raster library's shared state (R__.fileinfo[x].cur_row etc., root cause of the r.patch failure in #2809) is not involved. The 5 blockers above are entirely within raster/r.proj/ and lib/proj/do_proj.c.
This means r.proj can be parallelized independently of #5738, without waiting for a library-wide rewrite.
Question
Rast_put_row() is called after each completed output row. Does it carry shared state analogous to Rast_get_row()? If yes, output writes would need to be serialized or use pre-allocated indexed row buffers. Happy to investigate further if useful.
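If output writes do need to stay sequential, the "pre-allocated indexed row buffers" option could look roughly like the sketch below: rows are computed in parallel into a block of buffers indexed by row, then written out in order by a single thread. This is an illustration under assumptions, not r.proj code. process_in_chunks(), compute_row_fn, and write_row_fn are hypothetical names; the write callback stands in for a call such as Rast_put_row(), and the compute callback is assumed to be thread-safe.

```c
#include <stdlib.h>

/* Hypothetical callbacks: compute fills one output row and must be
 * thread-safe; write consumes rows strictly in ascending order. */
typedef void (*compute_row_fn)(int row, double *buf, int ncols, void *user);
typedef void (*write_row_fn)(int row, const double *buf, int ncols, void *user);

/* Process nrows rows in blocks of `chunk` rows. Returns 0 on success,
 * -1 on invalid arguments or allocation failure. */
int process_in_chunks(int nrows, int ncols, int chunk,
                      compute_row_fn compute, write_row_fn write, void *user)
{
    double *block;
    int start, r;

    if (nrows < 0 || ncols <= 0 || chunk <= 0)
        return -1;

    /* One buffer per row of the chunk, allocated once and reused. */
    block = malloc((size_t)chunk * (size_t)ncols * sizeof *block);
    if (!block)
        return -1;

    for (start = 0; start < nrows; start += chunk) {
        int n = nrows - start < chunk ? nrows - start : chunk;

        /* Parallel phase: each row writes only to its own slice. */
#pragma omp parallel for
        for (r = 0; r < n; r++)
            compute(start + r, block + (size_t)r * ncols, ncols, user);

        /* Serial phase: rows leave in order from a single thread. */
        for (r = 0; r < n; r++)
            write(start + r, block + (size_t)r * ncols, ncols, user);
    }

    free(block);
    return 0;
}
```

The chunk size trades memory for parallelism: a larger chunk gives threads more rows to work on between serial write phases, at the cost of chunk × ncols cells of buffer.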