From 6eb13b2ee12c7290c59be53387dfed11cfa6c5a7 Mon Sep 17 00:00:00 2001 From: Aimee Barciauskas Date: Fri, 10 Oct 2025 13:30:12 -0700 Subject: [PATCH 1/3] initial updates to benchmark-tiles.ipynb --- .../titiler/titiler-cmr/benchmark-tiles.ipynb | 89 ++++++++++--------- 1 file changed, 47 insertions(+), 42 deletions(-) diff --git a/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb b/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb index 058ed5c..08a39ce 100644 --- a/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb +++ b/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb @@ -9,6 +9,8 @@ "\n", "This notebook walks you through a workflow to **benchmark performance** of a [TiTiler-CMR](https://github.com/developmentseed/titiler-cmr) deployment for a given Earthdata CMR dataset.\n", "\n", + "This notebook benchmarks tiling [GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) at GES DISC](https://data.nasa.gov/dataset/gpm-imerg-final-precipitation-l3-1-day-0-1-degree-x-0-1-degree-v07-gpm-3imergdf-at-ges-dis-13ed8) as an example using the titiler-cmr `xarray` backend and [HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m v2.0](https://www.earthdata.nasa.gov/data/catalog/lpcloud-hlss30-2.0) as an example using the titiler-cmr `rasterio` backend.\n", + "\n", "\n", "> **What is TiTiler-CMR?**\n", ">\n", @@ -50,13 +52,15 @@ "source": [ "## TiTiler-CMR Setup\n", "\n", - "`titiler-cmr` is a NASA-focused application that accepts Concept IDs and uses the Common Metadata Repository (CMR) to discover and serve associated granules as tiles. 
You can deploy your own instance of `titiler_cmr` using the [official guide](https://github.com/developmentseed/titiler-cmr), or use a public instance that is already deployed.\n", + "`titiler-cmr` is a NASA-focused application that accepts Concept IDs and uses the Common Metadata Repository (CMR) to discover and serve associated granules as tiles. You can deploy your own instance of `titiler-cmr` using the [official guide](https://github.com/developmentseed/titiler-cmr), or use an existing deployment.\n", "\n", - "For this walkthrough, we will use the public instance hosted by [Open VEDA](https://staging.openveda.cloud/api/titiler-cmr/).\n", + "For this walkthrough, we will use the [Open VEDA](https://www.earthdata.nasa.gov/data/tools/veda) deployment: [https://staging.openveda.cloud/api/titiler-cmr/](https://staging.openveda.cloud/api/titiler-cmr/).\n", "\n", "To get started with a dataset, you need to:\n", "- Choose a Titiler-CMR endpoint\n", "- Pick a CMR dataset (by concept ID)\n", + "\n", + "For the following, you can use titiler-cmr's compatibility endpoint [ADD LINK TO DOCS ONCE https://github.com/developmentseed/titiler-cmr/pull/80 IS MERGED]:\n", "- Identify the assets/variables/bands you want to visualize\n", "- Define a temporal interval (`start/end` ISO range) and, if needed, a time step (e.g., daily).\n", "- Select a backend that matches your dataset’s structure\n", @@ -73,7 +77,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": null, "id": "55ad2e88", "metadata": {}, "outputs": [ @@ -105,7 +109,8 @@ "metadata": {}, "source": [ "## Tile Generation Benchmarking\n", - "In this part, we are going to measure the tile generation performance across different zoom levels using `titiler_cmr_benchmark.benchmark_viewport` function. \n", + "\n", + "The code below demonstrates how to benchmark tile generation performance across different zoom levels using the `titiler_cmr_benchmark.benchmark_viewport` function. 
\n", "This function simulates the load of a typical viewport render in a slippy map, where multiple adjacent tiles must be fetched in parallel to draw a single view.\n" ] }, @@ -119,18 +124,15 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": null, "id": "4e7a8f91-ce75-4afc-85de-b1a6b4b0f48d", "metadata": {}, "outputs": [], "source": [ "endpoint = \"https://staging.openveda.cloud/api/titiler-cmr\"\n", "\n", - "concept_id = \"C2723754864-GES_DISC\"\n", - "datetime_range = \"2022-04-01T00:00:01Z/2022-04-02T23:59:59Z\"\n", - "variable = \"precipitation\"\n", - "\n", "ds_xarray = DatasetParams(\n", + " # GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) at GES DISC\n", " concept_id=\"C2723754864-GES_DISC\",\n", " backend=\"xarray\",\n", " datetime_range=\"2022-03-01T00:00:01Z/2022-03-01T23:59:59Z\",\n", @@ -142,7 +144,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": null, "id": "9823ed5f-5828-47eb-834f-81d52890c2ec", "metadata": {}, "outputs": [ @@ -191,16 +193,16 @@ "metadata": {}, "source": [ "### Zoom Levels\n", - "Zoom levels determine the detail and extent of the area being rendered. At lower zoom levels, a single tile covers a large spatial area and may intersect many granules. This usually translates to more I/O, more resampling/mosaic work, higher latency, and higher chance of timeouts errors.\n", + "Zoom levels determine the detail and extent of the area being rendered. At lower zoom levels, a single tile covers a large spatial area and requires more data be loaded relative to higher zoom levels. Sometimes, data must be loaded from multiple granules. Loading data from multiple granules is not required for the example dataset, as each granule has global coverage. 
In general, lower zoom levels translate to more I/O, more resampling/mosaic work, higher latency, and a higher chance of timeout errors.\n", "\n", - "As you increase zoom, each tile covers a smaller area, reducing the number of intersecting granules and the amount of work per request. \n", + "As you increase zoom, each tile covers a smaller area, reducing the amount of work per request. \n", "\n", "We'll define a range of zoom levels to test to see how performance varies." ] }, { "cell_type": "code", - "execution_count": 8, + "execution_count": null, "id": "bde8b55b", "metadata": {}, "outputs": [], @@ -227,7 +229,7 @@ }, { "cell_type": "code", - "execution_count": 9, + "execution_count": null, "id": "627eca1c", "metadata": {}, "outputs": [ @@ -267,7 +269,7 @@ }, { "cell_type": "code", - "execution_count": 11, + "execution_count": null, "id": "15ba4c0b-6241-4fe0-8e82-d9271b4c668c", "metadata": {}, "outputs": [ @@ -449,7 +451,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "id": "7b361ecc", "metadata": {}, "outputs": [ @@ -730,7 +732,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "id": "a69a9d32", "metadata": {}, "outputs": [ @@ -839,20 +841,20 @@ "metadata": {}, "source": [ "### Rasterio Backend (COG/Band-based datasets)\n", - "In this example, we will benchmark a CMR dataset that is structured as Cloud Optimized GeoTIFFs (COGs) with individual bands. We will use the `rasterio` backend for this dataset.\n", + "In this example, we will benchmark [HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m v2.0 (HLS)](https://www.earthdata.nasa.gov/data/catalog/lpcloud-hlss30-2.0). HLS granules are Cloud Optimized GeoTIFFs (COGs) with individual bands. We will use the `rasterio` backend for this dataset.\n", "\n", - "In general, the lower the zoom level, the more files need to be opened to render a tile, which can lead to increased latency. 
Additionally, datasets with larger file sizes or more complex structures may also experience higher latency.\n", + "Since HLS granules each cover a small spatial extent, the lower the zoom level (more zoomed out), the more files need to be opened to render a tile. So lower zoom levels lead to increased latency. Additionally, datasets with larger file sizes or more complex structures may also experience higher latency.\n", "\n", - "In Rasterio, each `/tile` request:\n", + "With the `rasterio` backend, each `/tile` request:\n", "- finds all granules intersecting the tile footprint and the selected datetime interval\n", - "- reads & mosaics them (across space/time), resamples, stacks bands, then encodes the image \n", + "- reads & mosaics those granules (across space/time), resamples, stacks bands, then encodes the image \n", "\n", - "In contrast to the xarray backend, the rasterio backend’s tile latency depends strongly on the width of the datetime interval." + "Tile latency depends strongly on the length of the datetime interval and the temporal resolution of the dataset." 
] }, { "cell_type": "code", - "execution_count": 15, + "execution_count": null, "id": "81ea0403", "metadata": {}, "outputs": [], @@ -863,7 +865,6 @@ " datetime_range=\"2023-10-01T00:00:01Z/2023-10-07T00:00:01Z\",\n", " bands=[\"B04\", \"B03\", \"B02\"],\n", " bands_regex=\"B[0-9][0-9]\",\n", - " step=\"P1D\",\n", " temporal_mode=\"point\",\n", ")\n", "ds_hls_week = DatasetParams(\n", @@ -872,7 +873,6 @@ " datetime_range=\"2023-10-01T00:00:01Z/2023-10-20T00:00:01Z\",\n", " bands=[\"B04\", \"B03\", \"B02\"],\n", " bands_regex=\"B[0-9][0-9]\",\n", - " step=\"P1W\",\n", " temporal_mode=\"point\",\n", ")\n", "\n", @@ -885,7 +885,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": null, "id": "19806ae2", "metadata": {}, "outputs": [ @@ -1192,7 +1192,7 @@ }, { "cell_type": "code", - "execution_count": 12, + "execution_count": null, "id": "e034280d", "metadata": {}, "outputs": [ @@ -1499,7 +1499,7 @@ }, { "cell_type": "code", - "execution_count": 13, + "execution_count": null, "id": "3b5338d5", "metadata": {}, "outputs": [ @@ -1529,7 +1529,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": null, "id": "71b008e2", "metadata": {}, "outputs": [ @@ -1564,6 +1564,8 @@ "source": [ "## Benchmarking using custom bounds\n", "\n", + "UPDATE ME\n", + "\n", "In this part, we are going to measure response latency across the tiles at different zoom levels using `benchmark_viewport` function. \n", "This function simulates the load of a typical viewport render in a slippy map, where multiple adjacent tiles must be fetched in parallel to draw a single view.\n", "\n", @@ -1774,9 +1776,11 @@ "source": [ "#### Band Combinations\n", "\n", - "In Rasterio backend, you can specify multiple bands to be rendered in a single tile request. 
This is useful for visualizing different aspects of the data, such as true color composites or vegetation indices.\n", + "With the `rasterio` backend, you can specify multiple bands to be rendered in a single tile request. This is useful for visualizing different aspects of the data, such as true color composites or vegetation indices.\n", + "\n", + "More bands typically mean larger payloads and potentially higher latency, especially if the bands are stored in separate files. \n", "\n", - "More bands typically mean larger payloads and potentially higher latency, especially if the bands are stored in separate files. " + "BUT what we see is similar latency, possibly due to concurrency in titiler-cmr [CHECK THIS]." ] }, { @@ -2063,21 +2067,16 @@ "\n", "In this notebook, we explored how to check the performance of tile rendering performance in TiTiler-CMR using different datasets and backends. We observed how factors such as zoom levels, temporal intervals, and dataset structures impact the latency of tile requests.\n", "\n", - "In general, Xarray backend:\n", - "- Performance depends strongly on the zoom levels, \n", - "- Reads a single timestep for `/tile` requests so interval width generally does not change tile latency.\n", - "\n", - "In Raterio backend:\n", - "- Covers all the granules intersecting the tile footprint and the selected datetime interval,\n", - "- Performance depends on zoom levels and the width of the datetime interval, and band selection\n", - "- Higher zoom levels **(e.g., z > 8)** tend to have more stable and lower latency due to fewer intersecting granules. 
However, performance plateaus around z≈9 for many datasets.\n", + "In general, for either backend:\n", + "- Performance depends on the dataset's spatial characteristics and the zoom level\n", + "- Performance depends on the dataset's temporal characteristics and the width of the datetime interval\n", "\n", + "For the `rasterio` backend:\n", + "- Band selection does not impact performance because of titiler-cmr concurrency CHECK THIS\n", "\n", "Takeaways: \n", - "- Prefer **single-day** (or narrow) intervals for responsive rendering\n", - "- The bigger the time range, the more data needs to be scanned and processed\n", - "- Avoid very low zooms for heavy composites; consider **minzoom ≥ 7**\n", "\n", + "Use this tool to assess datasets of interest. If performance is poor, consider restricting an application's use of the API to higher (more zoomed-in) zoom levels and narrowing time frames.\n", "\n", "### Further Reading\n", "- [TiTiler-CMR GitHub Repository](https://github.com/developmentseed/titiler-cmr)\n", @@ -2085,6 +2084,12 @@ "- [Tile Matrix Sets and Zoom Levels](https://docs.opengeospatial.org/is/17-083r2/17-083r2.html#_tile_matrix_sets_and_zoom_levels)\n", "- [Earthdata Cloud CMR Datasets](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#datasets)\n" ] + }, + { + "cell_type": "markdown", + "id": "7fb8c8e1", + "metadata": {}, + "source": [] } ], "metadata": { From d8d8578333874a9591a53487da23235bbd4789d3 Mon Sep 17 00:00:00 2001 From: Aimee Barciauskas Date: Fri, 10 Oct 2025 14:41:09 -0700 Subject: [PATCH 2/3] reviewed benchmark stats --- .../titiler/titiler-cmr/benchmark-stats.ipynb | 13 ++++++------- .../titiler/titiler-cmr/benchmark-tiles.ipynb | 8 ++++---- 2 files changed, 10 insertions(+), 11 deletions(-) diff --git a/docs/visualization/titiler/titiler-cmr/benchmark-stats.ipynb b/docs/visualization/titiler/titiler-cmr/benchmark-stats.ipynb index f39f8df..65ff7ee 100644 --- a/docs/visualization/titiler/titiler-cmr/benchmark-stats.ipynb +++ 
b/docs/visualization/titiler/titiler-cmr/benchmark-stats.ipynb @@ -5,11 +5,11 @@ "id": "b44dbc99", "metadata": {}, "source": [ - "# Benchmarking statistics\n", + "# Benchmarking TiTiler-CMR Statistics API Endpoints\n", "\n", "This notebook shows how to benchmark the `/timeseries/statistics` endpoint of a TiTiler-CMR deployment and understand how performance varies under different parameters. \n", "\n", - "In Titiler-CMR, the `/timeseries/statistics` endpoint computes statistics for all points/intervals along a timeseries and over a specified geometry. The performance of this endpoint can vary based on several factors that we will explore in this notebook.\n", + "In TiTiler-CMR, the `/timeseries/statistics` endpoint computes statistics for all points/intervals along a timeseries and over a specified geometry. The performance of this endpoint can vary based on several factors explored in this notebook.\n", "\n", "-----------------------------------\n", "\n", @@ -46,11 +46,10 @@ "source": [ "### Introduction\n", "\n", - "The `/timeseries/statistics` endpoint will produce summary statistics for an AOI for all points along a timeseries. This typically involves reading multiple granules, performing reprojection/resampling/mosaicking, and then computing statistics over the specified area of interest .\n", + "The `/timeseries/statistics` endpoint will produce summary statistics for an AOI for all points along a timeseries. 
This typically involves reading multiple granules, performing reprojection/resampling/mosaicking, and then computing statistics over the specified area of interest.\n", "\n", "This endpoint returns a GeoJSON FeatureCollection with statistics for each time point in the timeseries.\n", "\n", - "\n", "The performance of this endpoint can vary based on several factors, including:\n", "- The size and complexity of the geometry (e.g., a small polygon vs a large bounding box)\n", "- The number of granules that need to be read and processed to cover the geometry\n", @@ -62,7 +61,7 @@ "id": "034cd95c", "metadata": {}, "source": [ - "We want to define the parameters for the CMR dataset we want to benchmark. The `DatasetParams` class encapsulates all the necessary information to interact with a specific dataset via TiTiler-CMR.\n" + "The `DatasetParams` class encapsulates all the necessary information to interact with a specific dataset via TiTiler-CMR.\n" ] }, { @@ -256,7 +255,7 @@ "id": "2b91cd56-7c4a-4db0-b7d4-22ead571ec23", "metadata": {}, "source": [ - "RasterIO backend also supports similar statistics backend." + "The `rasterio` backend also supports similar statistics responses." ] }, { @@ -685,7 +684,7 @@ "source": [ "### Time Range \n", "\n", - "For statistics benchmarking, the number of timesteps matters too. Longer time series (more timesteps) will generally take longer to process. This sweep varies the time window length (number of timesteps) while keeping the geometry size constant to see how that affects performance.\n", + "For statistics benchmarking, the number of timesteps matters too. Longer time series (more timesteps) will generally take longer to process. 
This benchmark varies the time window length (number of timesteps) while keeping the geometry size constant to see how that affects performance.\n", "\n", "The time series API supports the following parameters: \n", "- **datetime (str)**: Either a date-time, an interval, or a comma-separated list of date-times or intervals. Date and time expressions adhere to rfc3339 ('2020-06-01T09:00:00Z') format.\n", diff --git a/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb b/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb index 08a39ce..7b9bbef 100644 --- a/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb +++ b/docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb @@ -5,7 +5,7 @@ "id": "b44dbc99", "metadata": {}, "source": [ - "# Benchmarking tile generation\n", + "# Benchmarking TiTiler-CMR Tiles API Endpoints\n", "\n", "This notebook walks you through a workflow to **benchmark performance** of a [TiTiler-CMR](https://github.com/developmentseed/titiler-cmr) deployment for a given Earthdata CMR dataset.\n", "\n", @@ -2067,9 +2067,9 @@ "\n", "In this notebook, we explored how to check the performance of tile rendering performance in TiTiler-CMR using different datasets and backends. We observed how factors such as zoom levels, temporal intervals, and dataset structures impact the latency of tile requests.\n", "\n", - "In general, for either backend:\n", - "- Performance depends on the dataset's spatial characteristics and the zoom level\n", - "- Performance depends on the dataset's temporal characteristics and the width of the datetime interval\n", + "In general, for either backend, performance depends on:\n", + "1. A dataset's spatial characteristics, specifically the resolution and granule-level spatial extent, impact performance at different zoom levels.\n", + "2. 
A dataset's temporal resolution impacts performance for different datetime intervals.\n", "\n", "For the `rasterio` backend:\n", "- Band selection does not impact performance because of titiler-cmr concurrency CHECK THIS\n", From c405e072f627e9510da7f880f6604c8cb3518bd7 Mon Sep 17 00:00:00 2001 From: Aimee Barciauskas Date: Fri, 10 Oct 2025 15:12:18 -0700 Subject: [PATCH 3/3] links change --- docs/visualization/titiler/titiler-cmr/compatibility.ipynb | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/visualization/titiler/titiler-cmr/compatibility.ipynb b/docs/visualization/titiler/titiler-cmr/compatibility.ipynb index 9578811..a422290 100644 --- a/docs/visualization/titiler/titiler-cmr/compatibility.ipynb +++ b/docs/visualization/titiler/titiler-cmr/compatibility.ipynb @@ -22,7 +22,7 @@ "- An Earthdata login account: https://urs.earthdata.nasa.gov/\n", "- A valid `netrc` file with your Earthdata credentials or use interactive login.\n", "\n", - "For this walkthrough, we will use the public instance hosted by [Open VEDA](https://staging.openveda.cloud/api/titiler-cmr/)." + "For this walkthrough, we will use the TiTiler-CMR instance hosted by [Open VEDA](https://www.earthdata.nasa.gov/data/tools/veda): [https://staging.openveda.cloud/api/titiler-cmr/](https://staging.openveda.cloud/api/titiler-cmr/)." ] }, { @@ -50,7 +50,7 @@ "metadata": {}, "source": [ "### Introduction to TiTiler-CMR\n", - "[`Titiler-CMR`](https://github.com/developmentseed/titiler-cmr) is a dynamic map tile server that provides on-demand access to Earth science data managed by NASA's Common Metadata Repository (CMR). It allows users to dynamically generate and serve map tiles from multidimensional data formats like NetCDF and HDF5.\n", + "[`TiTiler-CMR`](https://github.com/developmentseed/titiler-cmr) is a dynamic map tile server that provides on-demand access to Earth science data managed by NASA's Common Metadata Repository (CMR). 
It allows users to dynamically generate and serve map tiles from multidimensional data formats like NetCDF and HDF5.\n", "\n", "To get started with TiTiler-CMR, you typically need to:\n", "- Choose a Titiler-CMR endpoint\n", @@ -66,7 +66,7 @@ "Here, we first explore a dataset using `earthaccess` to collect the necessary information such as **concept_id**, **backend**, and **variable**, then run a compatibility check using the `check_titiler_cmr_compatibility` helper function. If you already know your dataset, you can skip the exploration steps step 2 directly. \n", "\n", "## Step 1: Explore data with `earthaccess`\n", - "You can use [`earthaccess`](https://github.com/nsidc/earthaccess) to search for dataset and inspect the individual granules used in your query. This helps you validate which files were accessed, their sizes, and the temporal range.\n", + "You can use [`earthaccess`](https://github.com/nsidc/earthaccess) to search for datasets and inspect the individual granules used in your query. This helps you validate which files were accessed, their sizes, and the temporal range.\n", "\n", "First you need to authenticate to Earthdata. " ]