- 
                Notifications
    You must be signed in to change notification settings 
- Fork 299
move POINT geometries out of a list column, into a matrix #2059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| Memory size: decreases with a factor 18: library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
n = 1e6
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
  st_geometry() -> pts
(s1 = object.size(pts))
# 24002864 bytes
(s2 = object.size(pts[]))
# 432002312 bytes
as.numeric(s2)/as.numeric(s1)
# [1] 17.99795 | 
| This one is pretty obvious: library(sf)
# Linking to GEOS 3.10.2, GDAL 3.4.3, PROJ 8.2.1; sf_use_s2() is TRUE
n = 1e6
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
  st_geometry() -> pts
pts   |> st_is_empty() |> system.time()
#    user  system elapsed 
#   1.692   0.072   1.764 
system.time(pts0 <- pts[])
#    user  system elapsed 
#   1.104   0.036   1.140 
pts0 |> st_is_empty()  |> system.time()
#    user  system elapsed 
#   1.853   0.057   1.909 
pts |> st_combine() |> system.time()
#    user  system elapsed 
#   0.002   0.000   0.002 
pts0 |> st_combine() |> system.time()
#    user  system elapsed 
#   3.119   0.155   3.275  | 
| This is a very cool idea! The use of an attribute to maintain a semblance of backward compatibility with the structure is clever. I do wonder if this change is going to result in enough performance gains to justify the maintenance cost of the additional representation. My adventures playing with this branch are below...basically all the ways I know about based on the benchmarks above.  While I certainly don't mind if this representation exists in sf, my first impression is that it's rather complicated and not that fast. If performance is the motivation, I wonder if broadening the sf API to work with other S3 classes (e.g.,  Another low-hanging performance fruit would be to drop the cached bbox and geometry classes (that's most of why the geos package is faster than sf in most cases). library(sf)
#> Linking to GEOS 3.11.1, GDAL 3.6.0, PROJ 9.1.1; sf_use_s2() is TRUE
n = 1e5
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
st_as_sf(df, coords = c("x", "y")) |>
    st_geometry() -> pts
pts0 <- pts[]
wk_xy <- wk::as_xy(df)
geos_pts <- geos::as_geos_geometry(wk_xy)
lobstr::obj_size(pts)
#> 2.40 MB
lobstr::obj_size(pts0)
#> 12.80 MB
lobstr::obj_size(wk_xy)
#> 2.40 MB
# not really accurate...only includes R allocs
lobstr::obj_size(geos_pts)
#> 7.20 MB
bench::mark(
    st_is_empty(pts),
    st_is_empty(pts0),
    wk::wk_meta(wk_xy)$size == 0,
    wk::wk_meta(geos_pts)$size == 0,
    geos::geos_is_empty(geos_pts)
)
#> # A tibble: 5 × 6
#>   expression                           min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                      <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 st_is_empty(pts)                 33.83ms  34.48ms      28.9    1.16MB     12.8
#> 2 st_is_empty(pts0)                46.59ms   46.8ms      21.4    1.15MB     49.8
#> 3 wk::wk_meta(wk_xy)$size == 0      2.06ms   2.25ms     421.      3.1MB     53.9
#> 4 wk::wk_meta(geos_pts)$size == 0   4.21ms   4.43ms     221.     3.06MB     31.1
#> 5 geos::geos_is_empty(geos_pts)   756.29µs 840.21µs    1180.   406.18KB     22.7
bench::mark(
    st_combine(pts),
    st_combine(pts0),
    wk::wk_collection(wk_xy, wk::wk_geometry_type("multipoint")),
    geos::geos_make_collection(geos_pts, "multipoint"),
    check = FALSE
)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 4 × 6
#>   expression                                                        min   median
#>   <bch:expr>                                                   <bch:tm> <bch:tm>
#> 1 st_combine(pts)                                               97.99µs 100.61µs
#> 2 st_combine(pts0)                                             139.98ms 181.44ms
#> 3 wk::wk_collection(wk_xy, wk::wk_geometry_type("multipoint"))   1.85ms   2.03ms
#> 4 geos::geos_make_collection(geos_pts, "multipoint")             6.09ms   6.51ms
#> # … with 3 more variables: `itr/sec` <dbl>, mem_alloc <bch:byt>, `gc/sec` <dbl> | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes the POINT object (XY and  XYZ) more like sp, but:
> data.frame(z=rnorm(3), x=1:3, y=3:1) |> st_as_sf(coords = c("x", "y")) -> x
> as(x$geometry[], "Spatial")
SpatialPoints:
     coords.x1 coords.x2
[1,]         1         3
[2,]         2         2
[3,]         3         1
Coordinate Reference System (CRS) arguments: NA 
> as(x$geometry, "Spatial")
Error in (function (classes, fdef, mtable)  : 
  unable to find an inherited method for function 'coordinates' for signature '"NULL"'
> as(x, "Spatial")
Error in h(simpleError(msg, call)) : 
  error in evaluating the argument 'x' in selecting a method for function 'addAttrToGeom': unable to find an inherited method for function 'coordinates' for signature '"NULL"'
> sp_x <- as(x$geometry[], "Spatial")
> st_as_sfc(sp_x)
Geometry set for 3 features 
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 1 ymin: 1 xmax: 3 ymax: 3
CRS:           NA
POINT (1 3)
POINT (2 2)
POINT (3 1)
> st_as_sfc(sp_x) |> unclass() |> str()
List of 3
 $ : 'XY' num [1:2] 1 3
 $ : 'XY' num [1:2] 2 2
 $ : 'XY' num [1:2] 3 1
 - attr(*, "precision")= num 0
 - attr(*, "bbox")= 'bbox' Named num [1:4] 1 1 3 3
  ..- attr(*, "names")= chr [1:4] "xmin" "ymin" "xmax" "ymax"
 - attr(*, "crs")=List of 2
  ..$ input: chr NA
  ..$ wkt  : chr NA
  ..- attr(*, "class")= chr "crs"
 - attr(*, "n_empty")= int 0
| @rsbivand thanks, that should now work; I finished implementation so far only to pass  | 
| I just encountered an other case where this pull request could provide a big speedup. When using  | 
| ... which is rather stupid (the 20 seconds), given that #2046 suggests  | 
| Today ALTREP support for lists appeared in R SVN. Do you think it can somehow affect  | 
| Thanks for reporting! I'm not an altrep expert, but would expect no miracles; I think it's mostly useful for (i) data structures out of memory (e.g. in a memory mapped area) or (ii) unrealised arrays (e.g. 1:1e12, where you're going to compute the value when you need it rather than realize it in memory). The first would help with massive point sets (but merely make it possible, not fast), the second doesn't help us here. The speedup of this PR (if any) is due to the flat memory model of matrices vs. the arbitrary memory layout of lists; accessing lists elements is more expensive because of that; the smaller memory footprint may be even more important. | 
| FWIW I actually think ALTREP for lists will be very useful here. It will let you use the data structure you've prototyped here without breaking backwards compatibility with users of the previous data structure (i.e., everybody). | 
| Great idea, do you know of an example of such a thing? | 
| Great example; that should run better, now. | 
| Thanks @edzer , I tested a bit more and noticed two other examples changing for unexpected behavior. Sorry for having these issues iteratively, the popup up in some of the unit test. Let me also know if your not looking for these detailed tests yet require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
require(dplyr)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
n <- 4
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
dfsf<-st_as_sf(df, coords = c("x", "y"))
# The filtered data turns empty:
dfsf |> filter(rep(T,4))
#> Simple feature collection with 4 features and 1 field (with 4 geometries empty)
#> Geometry type: GEOMETRY
#> Dimension:     XY
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> CRS:           NA
#>             z                 geometry
#> 1 -0.32671172 GEOMETRYCOLLECTION EMPTY
#> 2 -0.05574357 GEOMETRYCOLLECTION EMPTY
#> 3 -1.34424667 GEOMETRYCOLLECTION EMPTY
#> 4 -0.24700773 GEOMETRYCOLLECTION EMPTY
# there is no empty point inserted
dfsf$geometry[2]<-st_point()
dfsf
#> Simple feature collection with 4 features and 1 field (with 4 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 0.02291049 ymin: 0.2227787 xmax: 0.6861906 ymax: 0.4822213
#> CRS:           NA
#>             z                     geometry
#> 1 -0.29728105  POINT (0.3049286 0.4746749)
#> 2 -0.33397322  POINT (0.5350945 0.2988531)
#> 3 -0.93209304  POINT (0.6861906 0.4822213)
#> 4  0.08929689 POINT (0.02291049 0.2227787)
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Ubuntu 22.04.2 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Amsterdam
#>  date     2023-06-30
#>  pandoc   2.19.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  ...
#>  dplyr       * 1.1.2   2023-04-20 [1] CRAN (R 4.1.2)
#>  ...
#>  sf          * 1.0-14  2023-06-29 [1] Github (r-spatial/sf@3ec76b8)
#>  ...
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.1
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ────────────────────────────────────────────────────────────────────────────── | 
| This should now also run. | 
| That works for the examples above but filter fails if there is an empty location: require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
n <- 4
data.frame(z = rnorm(n), x = runif(n), y = runif(n)) -> df
dfsf<-st_as_sf(df, coords = c("x", "y"))
dfsf$geometry[2]<-st_point()
dfsf |> dplyr::filter(rep(T,4))
#> Error in st_as_sf.data.frame(NextMethod(), coords = attr(.data, "sf_column"), : missing values in coordinates not allowedI also see a few other changes in behavior/test error but I have not been able to make small reproducible examples out of those yet | 
| It seems dplyr filter drops the CRS: require(dplyr)
require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
data(meuse, package = "sp")
meuse_sf = st_as_sf(meuse, coords = c("x", "y"), crs = 28992, agr = "constant")
st_crs(meuse_sf)
#> Coordinate Reference System:
#>   User input: EPSG:28992 
#>   wkt:
....
#>         BBOX[50.75,3.2,53.7,7.22]],
#>     ID["EPSG",28992]]
st_crs(dplyr::filter(meuse_sf, copper>80))
#> Coordinate Reference System: NA
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
.....
#>  sf          * 1.0-14  2023-07-01 [1] Github (r-spatial/sf@203187d)
.....
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.1
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────Created on 2023-07-02 with reprex v2.0.2 | 
| Hi, I have been doing some more testing. In general I use the test suite from  I have found a few more cases that show some strange/undesired? behavior. require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
d<-st_as_sf(data.frame(x=c(1:4), y=c(1:3,NA)), coords=c("x","y"), na.fail = F)
st_is_empty(st_geometry(d))
#> [1] FALSE FALSE FALSE FALSE
st_is_empty(d) 
#> [1] FALSE FALSE FALSE FALSE
# should be c(F,F,F, T)?, now is c(F,F,F,F)
dput(st_geometry(d))
#> structure(list(NULL, NULL, NULL, NULL), points = structure(c(1, 
#> 2, 3, 4, 1, 2, 3, NA), dim = c(4L, 2L)), points_dim = "XY", n_empty = 0L, precision = 0, crs = structure(list(
#>     input = NA_character_, wkt = NA_character_), class = "crs"), bbox = structure(c(xmin = 1, 
#> ymin = 1, xmax = 4, ymax = 3), class = "bbox"), class = c("sfc_POINT", 
#> "sfc"))
dput(st_geometry(d[1:4,])) 
#> structure(list(structure(c(1, 1), class = c("XY", "POINT", "sfg"
#> )), structure(c(2, 2), class = c("XY", "POINT", "sfg")), structure(c(3, 
#> 3), class = c("XY", "POINT", "sfg")), structure(c(4, NA), class = c("XY", 
#> "POINT", "sfg"))), n_empty = 0L, precision = 0, crs = structure(list(
#>     input = NA_character_, wkt = NA_character_), class = "crs"), bbox = structure(c(xmin = 1, 
#> ymin = 1, xmax = 4, ymax = 3), class = "bbox"), class = c("sfc_POINT", 
#> "sfc"))
# sub setting directly reduces to the old structure, keeping the matrix might be nicer and makes sure the benefits are throughout scripts
e<-st_as_sf(data.frame(x=c(1:3, NA), y=c(1:3,NA)), coords=c("x","y"), na.fail = F)
st_is_empty(st_geometry(e)) 
#> [1] FALSE FALSE FALSE  TRUE
# works correctly
st_is_empty(st_geometry(dplyr::filter(e, rep(T,4)))) 
#> Error in scan(text = lst[[length(lst)]], quiet = TRUE): scan() expected 'a real', got 'ParseException:'
#> Error in (function (msg) : ParseException: Unexpected EOF parsing WKB
# fails,  one difference I see is that a Z dimension is added to points_dim.
sessioninfo::session_info(pkgs = "attached")
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Amsterdam
#>  date     2024-01-22
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package * version date (UTC) lib source
#>  sf      * 1.0-14  2024-01-22 [1] Github (r-spatial/sf@d38b3dc)
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────Created on 2024-01-22 with reprex v2.0.2 | 
| Another small change/corner case casting and empty  require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
st_cast(st_sfc(),"POINT")
#> Error in cls[2, ]: incorrect number of dimensions
sessioninfo::session_info("attached")
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Amsterdam
#>  date     2024-01-23
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package * version date (UTC) lib source
#>  sf      * 1.0-14  2024-01-22 [1] Github (r-spatial/sf@d38b3dc)
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────This used to work fine: require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
st_cast(st_sfc(),"POINT")
#> Geometry set for 0 features 
#> Bounding box:  xmin: NA ymin: NA xmax: NA ymax: NA
#> CRS:           NA
sessioninfo::session_info("attached")
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.3.2 (2023-10-31)
#>  os       Ubuntu 22.04.3 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Amsterdam
#>  date     2024-01-23
#>  pandoc   3.1.1 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package * version date (UTC) lib source
#>  sf      * 1.0-15  2023-12-18 [1] CRAN (R 4.3.2)
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.3
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────Created on 2024-01-23 with reprex v2.0.2 | 
| Great to see progress! @edzer am I right to notice that this resolves the cast error I posted today but does not address the remarks from yesterday? | 
| Additionally I found this issue with replacing values: require(sf)
d<-st_as_sf(data.frame(x=1:3,y=1:3), coords = c("x","y"))
e<-st_as_sf(data.frame(x=4:5,y=4:5), coords = c("x","y"))
# Works correctly if both are of the list type:
replace(st_geometry(d)[T,], c(F,T,T), st_geometry(e)[T,])
#> Geometry set for 3 features 
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 5 ymax: 5
#> CRS:           NA
#> POINT (1 1)
#> POINT (4 4)
#> POINT (5 5)
# However not of the replacement is to the matrix type:
replace(st_geometry(d)[T,], c(F,T,T), st_geometry(e))
#> Geometry set for 3 features  (with 2 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 1 ymax: 1
#> CRS:           NA
#> POINT (1 1)
#> POINT EMPTY
#> POINT EMPTY
# Replace is equivalent to:
dd<-st_geometry(d)[T,]
ee<-st_geometry(e)
dd[c(F,T,T)]<-ee
dd
#> Geometry set for 3 features  (with 2 geometries empty)
#> Geometry type: POINT
#> Dimension:     XY
#> Bounding box:  xmin: 1 ymin: 1 xmax: 1 ymax: 1
#> CRS:           NA
#> POINT (1 1)
#> POINT EMPTY
#> POINT EMPTY
sessioninfo::package_info("attached")
#>  package * version date (UTC) lib source
#>  sf      * 1.0-16  2024-01-23 [1] Github (r-spatial/sf@4ce501e) | 
Fix value replacement for sfc_POINT
ensure vctrs operations work
| I just check the most recent version of this branch  require(sf)
#> Loading required package: sf
#> Linking to GEOS 3.10.2, GDAL 3.4.1, PROJ 8.2.1; sf_use_s2() is TRUE
st_sfc(st_point(),st_point(c(1,2))) |> st_is_empty()
#> [1]  TRUE FALSE
data.frame(x=c(NA,1),y=c(NA,2)) |> st_as_sf(coords=c('x','y'), na.fail=F) |> st_is_empty()
#> [1] TRUE TRUE
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.4.2 (2024-10-31)
#>  os       Ubuntu 22.04.5 LTS
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_US.UTF-8
#>  ctype    en_US.UTF-8
#>  tz       Europe/Amsterdam
#>  date     2025-04-10
#>  pandoc   3.2 @ /usr/lib/rstudio/resources/app/bin/quarto/bin/tools/x86_64/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version date (UTC) lib source
#>  class         7.3-23  2025-01-01 [4] CRAN (R 4.4.2)
#>  classInt      0.4-11  2025-01-08 [1] CRAN (R 4.4.0)
#>  cli           3.6.4   2025-02-13 [1] CRAN (R 4.4.2)
#>  DBI           1.2.3   2024-06-02 [1] CRAN (R 4.4.0)
#>  digest        0.6.37  2024-08-19 [1] CRAN (R 4.4.0)
#>  e1071         1.7-16  2024-09-16 [1] CRAN (R 4.4.0)
#>  evaluate      1.0.1   2024-10-10 [1] CRAN (R 4.4.0)
#>  fastmap       1.2.0   2024-05-15 [1] CRAN (R 4.4.0)
#>  fs            1.6.5   2024-10-30 [1] CRAN (R 4.4.0)
#>  glue          1.8.0   2024-09-30 [1] CRAN (R 4.4.0)
#>  htmltools     0.5.8.1 2024-04-04 [1] CRAN (R 4.4.0)
#>  KernSmooth    2.23-26 2025-01-01 [4] CRAN (R 4.4.2)
#>  knitr         1.49    2024-11-08 [1] CRAN (R 4.4.0)
#>  lifecycle     1.0.4   2023-11-07 [1] CRAN (R 4.4.0)
#>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.4.0)
#>  proxy         0.4-27  2022-06-09 [1] CRAN (R 4.4.0)
#>  Rcpp          1.0.14  2025-01-12 [1] CRAN (R 4.4.2)
#>  reprex        2.1.1   2024-07-06 [1] CRAN (R 4.4.0)
#>  rlang         1.1.5   2025-01-17 [1] CRAN (R 4.4.2)
#>  rmarkdown     2.29    2024-11-04 [1] CRAN (R 4.4.0)
#>  rstudioapi    0.17.1  2024-10-22 [1] CRAN (R 4.4.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.4.0)
#>  sf          * 1.0-21  2025-04-10 [1] Github (r-spatial/sf@08e0e1b)
#>  units         0.8-7.1 2025-04-10 [1] Github (r-quantities/units@3fa5ae6)
#>  withr         3.0.2   2024-10-28 [1] CRAN (R 4.4.0)
#>  xfun          0.51.2  2025-02-25 [1] Github (yihui/xfun@e2197a9)
#>  yaml          2.3.10  2024-07-26 [1] CRAN (R 4.4.0)
#> 
#>  [1] /home/bart/R/x86_64-pc-linux-gnu-library/4.4
#>  [2] /usr/local/lib/R/site-library
#>  [3] /usr/lib/R/site-library
#>  [4] /usr/lib/R/library
#> 
#> ────────────────────────────────────────────────────────────────────────────── | 
This PR moves
POINTgeometries out of the list column, leaves a list column filled withNULLvalues in place, and moves the point coordinates in annx 2 (or 3 for XYM/XYZ, or 4 for XYZM) matrix attribute calledpoints. The class dim label (XY/XYZ/XYM/XYZM) is put in attributepoints_dim. This happens when:st_as_sf.data.frameis called with e.g.coordsspecifying the coordinate columssfc, i.e. when coming back fromst_read()or any of the GEOS functions.Right now,
a[]restores the original list ofsfgobjects:This may give some performance changes, both in memory footprint and runtime; still need to do benchmarks.
Comments welcome!