13 changes: 6 additions & 7 deletions docs/visualization/titiler/titiler-cmr/benchmark-stats.ipynb
@@ -5,11 +5,11 @@
"id": "b44dbc99",
"metadata": {},
"source": [
"# Benchmarking statistics\n",
"# Benchmarking TiTiler-CMR Statistics API Endpoints\n",
"\n",
"This notebook shows how to benchmark the `/timeseries/statistics` endpoint of a TiTiler-CMR deployment and understand how performance varies under different parameters. \n",
"\n",
"In Titiler-CMR, the `/timeseries/statistics` endpoint computes statistics for all points/intervals along a timeseries and over a specified geometry. The performance of this endpoint can vary based on several factors that we will explore in this notebook.\n",
"In TiTiler-CMR, the `/timeseries/statistics` endpoint computes statistics for all points/intervals along a timeseries and over a specified geometry. The performance of this endpoint can vary based on several factors explored in this notebook.\n",
"\n",
"-----------------------------------\n",
"\n",
@@ -46,11 +46,10 @@
"source": [
"### Introduction\n",
"\n",
"The `/timeseries/statistics` endpoint will produce summary statistics for an AOI for all points along a timeseries. This typically involves reading multiple granules, performing reprojection/resampling/mosaicking, and then computing statistics over the specified area of interest .\n",
"The `/timeseries/statistics` endpoint will produce summary statistics for an AOI for all points along a timeseries. This typically involves reading multiple granules, performing reprojection/resampling/mosaicking, and then computing statistics over the specified area of interest.\n",
"\n",
"This endpoint returns a GeoJSON FeatureCollection with statistics for each time point in the timeseries.\n",
"\n",
"\n",
"The performance of this endpoint can vary based on several factors, including:\n",
"- The size and complexity of the geometry (e.g., a small polygon vs a large bounding box)\n",
"- The number of granules that need to be read and processed to cover the geometry\n",
@@ -62,7 +61,7 @@
"id": "034cd95c",
"metadata": {},
"source": [
"We want to define the parameters for the CMR dataset we want to benchmark. The `DatasetParams` class encapsulates all the necessary information to interact with a specific dataset via TiTiler-CMR.\n"
"The `DatasetParams` class encapsulates all the necessary information to interact with a specific dataset via TiTiler-CMR.\n"
]
},
{
@@ -256,7 +255,7 @@
"id": "2b91cd56-7c4a-4db0-b7d4-22ead571ec23",
"metadata": {},
"source": [
"RasterIO backend also supports similar statistics backend."
"The `rasterio` backend also supports similar statistics responses."
]
},
{
@@ -685,7 +684,7 @@
"source": [
"### Time Range \n",
"\n",
"For statistics benchmarking, the number of timesteps matters too. Longer time series (more timesteps) will generally take longer to process. This sweep varies the time window length (number of timesteps) while keeping the geometry size constant to see how that affects performance.\n",
"For statistics benchmarking, the number of timesteps matters too. Longer time series (more timesteps) will generally take longer to process. This benchmark varies the time window length (number of timesteps) while keeping the geometry size constant to see how that affects performance.\n",
"\n",
"The time series API supports the following parameters: \n",
"- **datetime (str)**: Either a date-time, an interval, or a comma-separated list of date-times or intervals. Date and time expressions adhere to rfc3339 ('2020-06-01T09:00:00Z') format.\n",
91 changes: 48 additions & 43 deletions docs/visualization/titiler/titiler-cmr/benchmark-tiles.ipynb
@@ -5,10 +5,12 @@
"id": "b44dbc99",
"metadata": {},
"source": [
"# Benchmarking tile generation\n",
"# Benchmarking TiTiler-CMR Tiles API Endpoints\n",
"\n",
"This notebook walks you through a workflow to **benchmark performance** of a [TiTiler-CMR](https://github.com/developmentseed/titiler-cmr) deployment for a given Earthdata CMR dataset.\n",
"\n",
"This notebook benchmarks tiling [GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) at GES DISC](https://data.nasa.gov/dataset/gpm-imerg-final-precipitation-l3-1-day-0-1-degree-x-0-1-degree-v07-gpm-3imergdf-at-ges-dis-13ed8) as an example using the titiler-cmr `xarray` backend and [HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m v2.0](https://www.earthdata.nasa.gov/data/catalog/lpcloud-hlss30-2.0) as an example using the titiler-cmr `rasterio` backend.\n",
"\n",
"\n",
"> **What is TiTiler-CMR?**\n",
">\n",
@@ -50,13 +52,15 @@
"source": [
"## TiTiler-CMR Setup\n",
"\n",
"`titiler-cmr` is a NASA-focused application that accepts Concept IDs and uses the Common Metadata Repository (CMR) to discover and serve associated granules as tiles. You can deploy your own instance of `titiler_cmr` using the [official guide](https://github.com/developmentseed/titiler-cmr), or use a public instance that is already deployed.\n",
"`titiler-cmr` is a NASA-focused application that accepts Concept IDs and uses the Common Metadata Repository (CMR) to discover and serve associated granules as tiles. You can deploy your own instance of `titiler-cmr` using the [official guide](https://github.com/developmentseed/titiler-cmr), or use an existing deployment.\n",
"\n",
"For this walkthrough, we will use the public instance hosted by [Open VEDA](https://staging.openveda.cloud/api/titiler-cmr/).\n",
"For this walkthrough, we will use the [Open VEDA](https://www.earthdata.nasa.gov/data/tools/veda) deployment: [https://staging.openveda.cloud/api/titiler-cmr/](https://staging.openveda.cloud/api/titiler-cmr/).\n",
"\n",
"To get started with a dataset, you need to:\n",
"- Choose a Titiler-CMR endpoint\n",
"- Pick a CMR dataset (by concept ID)\n",
"\n",
"For the following, you Can use titiler-cmr's compatibility endpoint [ADD LINK TO DOCS ONCE https://github.com/developmentseed/titiler-cmr/pull/80 IS MERGED]:\n",
"- Identify the assets/variables/bands you want to visualize\n",
"- Define a temporal interval (`start/end` ISO range) and, if needed, a time step (e.g., daily).\n",
"- Select a backend that matches your dataset’s structure\n",
@@ -73,7 +77,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": null,
"id": "55ad2e88",
"metadata": {},
"outputs": [
@@ -105,7 +109,8 @@
"metadata": {},
"source": [
"## Tile Generation Benchmarking\n",
"In this part, we are going to measure the tile generation performance across different zoom levels using `titiler_cmr_benchmark.benchmark_viewport` function. \n",
"\n",
"The code below demonstrates how to benchmark tile generation performance across different zoom levels using `titiler_cmr_benchmark.benchmark_viewport` function. \n",
"This function simulates the load of a typical viewport render in a slippy map, where multiple adjacent tiles must be fetched in parallel to draw a single view.\n"
]
},
@@ -119,18 +124,15 @@
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"id": "4e7a8f91-ce75-4afc-85de-b1a6b4b0f48d",
"metadata": {},
"outputs": [],
"source": [
"endpoint = \"https://staging.openveda.cloud/api/titiler-cmr\"\n",
"\n",
"concept_id = \"C2723754864-GES_DISC\"\n",
"datetime_range = \"2022-04-01T00:00:01Z/2022-04-02T23:59:59Z\"\n",
"variable = \"precipitation\"\n",
"\n",
"ds_xarray = DatasetParams(\n",
" # GPM IMERG Final Precipitation L3 1 day 0.1 degree x 0.1 degree V07 (GPM_3IMERGDF) at GES DISC\n",
" concept_id=\"C2723754864-GES_DISC\",\n",
" backend=\"xarray\",\n",
" datetime_range=\"2022-03-01T00:00:01Z/2022-03-01T23:59:59Z\",\n",
@@ -142,7 +144,7 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": null,
"id": "9823ed5f-5828-47eb-834f-81d52890c2ec",
"metadata": {},
"outputs": [
@@ -191,16 +193,16 @@
"metadata": {},
"source": [
"### Zoom Levels\n",
"Zoom levels determine the detail and extent of the area being rendered. At lower zoom levels, a single tile covers a large spatial area and may intersect many granules. This usually translates to more I/O, more resampling/mosaic work, higher latency, and higher chance of timeouts errors.\n",
"Zoom levels determine the detail and extent of the area being rendered. At lower zoom levels, a single tile covers a large spatial area and requires more data be loaded relative to higher zoom levels. Sometimes, data must be loaded from multiple granules. Loading data from multiple granules is not required for the example dataset, as each granule has global coverage. In general, lower zoom levels translates to more I/O, more resampling/mosaic work, higher latency, and higher chance of timeouts errors.\n",
"\n",
"As you increase zoom, each tile covers a smaller area, reducing the number of intersecting granules and the amount of work per request. \n",
"As you increase zoom, each tile covers a smaller area, reducing the amount of work per request. \n",
"\n",
"We'll define a range of zoom levels to test to see how performance varies."
]
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": null,
"id": "bde8b55b",
"metadata": {},
"outputs": [],
@@ -227,7 +229,7 @@
},
{
"cell_type": "code",
"execution_count": 9,
"execution_count": null,
"id": "627eca1c",
"metadata": {},
"outputs": [
@@ -267,7 +269,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": null,
"id": "15ba4c0b-6241-4fe0-8e82-d9271b4c668c",
"metadata": {},
"outputs": [
@@ -449,7 +451,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": null,
"id": "7b361ecc",
"metadata": {},
"outputs": [
@@ -730,7 +732,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": null,
"id": "a69a9d32",
"metadata": {},
"outputs": [
@@ -839,20 +841,20 @@
"metadata": {},
"source": [
"### Rasterio Backend (COG/Band-based datasets)\n",
"In this example, we will benchmark a CMR dataset that is structured as Cloud Optimized GeoTIFFs (COGs) with individual bands. We will use the `rasterio` backend for this dataset.\n",
"In this example, we will benchmark [HLS Sentinel-2 Multi-spectral Instrument Surface Reflectance Daily Global 30m v2.0 (HLS)](https://www.earthdata.nasa.gov/data/catalog/lpcloud-hlss30-2.0). HLS granules are Cloud Optimized GeoTIFFs (COGs) with individual bands. We will use the `rasterio` backend for this dataset.\n",
"\n",
"In general, the lower the zoom level, the more files need to be opened to render a tile, which can lead to increased latency. Additionally, datasets with larger file sizes or more complex structures may also experience higher latency.\n",
"Since HLS granules each cover a small spatial extent, the lower the zoom level (more zoomed out), the more files need to be opened to render a tile. So lower zoom levels lead to increased latency. Additionally, datasets with larger file sizes or more complex structures may also experience higher latency.\n",
"\n",
"In Rasterio, each `/tile` request:\n",
"With the `rasterio` backend, each `/tile` request:\n",
"- finds all granules intersecting the tile footprint and the selected datetime interval\n",
"- reads & mosaics them (across space/time), resamples, stacks bands, then encodes the image \n",
"- reads & mosaics those granules (across space/time), resamples, stacks bands, then encodes the image \n",
"\n",
"In contrast to the xarray backend, the rasterio backend’s tile latency depends strongly on the width of the datetime interval."
"Tile latency depends strongly on the length of the datetime interval and the temporal resolution of the dataset."
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": null,
"id": "81ea0403",
"metadata": {},
"outputs": [],
@@ -863,7 +865,6 @@
" datetime_range=\"2023-10-01T00:00:01Z/2023-10-07T00:00:01Z\",\n",
" bands=[\"B04\", \"B03\", \"B02\"],\n",
" bands_regex=\"B[0-9][0-9]\",\n",
" step=\"P1D\",\n",
" temporal_mode=\"point\",\n",
")\n",
"ds_hls_week = DatasetParams(\n",
@@ -872,7 +873,6 @@
" datetime_range=\"2023-10-01T00:00:01Z/2023-10-20T00:00:01Z\",\n",
" bands=[\"B04\", \"B03\", \"B02\"],\n",
" bands_regex=\"B[0-9][0-9]\",\n",
" step=\"P1W\",\n",
" temporal_mode=\"point\",\n",
")\n",
"\n",
@@ -885,7 +885,7 @@
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": null,
"id": "19806ae2",
"metadata": {},
"outputs": [
@@ -1192,7 +1192,7 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": null,
"id": "e034280d",
"metadata": {},
"outputs": [
@@ -1499,7 +1499,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": null,
"id": "3b5338d5",
"metadata": {},
"outputs": [
@@ -1529,7 +1529,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": null,
"id": "71b008e2",
"metadata": {},
"outputs": [
@@ -1564,6 +1564,8 @@
"source": [
"## Benchmarking using custom bounds\n",
"\n",
"UPDATE ME\n",
"\n",
"In this part, we are going to measure response latency across the tiles at different zoom levels using `benchmark_viewport` function. \n",
"This function simulates the load of a typical viewport render in a slippy map, where multiple adjacent tiles must be fetched in parallel to draw a single view.\n",
"\n",
@@ -1774,9 +1776,11 @@
"source": [
"#### Band Combinations\n",
"\n",
"In Rasterio backend, you can specify multiple bands to be rendered in a single tile request. This is useful for visualizing different aspects of the data, such as true color composites or vegetation indices.\n",
"With the `rasterio` backend, you can specify multiple bands to be rendered in a single tile request. This is useful for visualizing different aspects of the data, such as true color composites or vegetation indices.\n",
"\n",
"More bands typically mean larger payloads and potentially higher latency, especially if the bands are stored in separate files. \n",
"\n",
"More bands typically mean larger payloads and potentially higher latency, especially if the bands are stored in separate files. "
"BUT what we see is similar latency, possibly due to concurrency in titiler-cmr [CHECK THIS]."
]
},
{
@@ -2063,28 +2067,29 @@
"\n",
"In this notebook, we explored how to check the performance of tile rendering performance in TiTiler-CMR using different datasets and backends. We observed how factors such as zoom levels, temporal intervals, and dataset structures impact the latency of tile requests.\n",
"\n",
"In general, Xarray backend:\n",
"- Performance depends strongly on the zoom levels, \n",
"- Reads a single timestep for `/tile` requests so interval width generally does not change tile latency.\n",
"\n",
"In Raterio backend:\n",
"- Covers all the granules intersecting the tile footprint and the selected datetime interval,\n",
"- Performance depends on zoom levels and the width of the datetime interval, and band selection\n",
"- Higher zoom levels **(e.g., z > 8)** tend to have more stable and lower latency due to fewer intersecting granules. However, performance plateaus around z≈9 for many datasets.\n",
"In general, for either backend, performance depends on:\n",
"1. A dataset's spatial characteristics, specifically the resolution and granule-level spatial extent, impact performance at different zoom levels.\n",
"2. A dataset's temporal resolution impacts performance for different datetime intervals.\n",
"\n",
"For the `rasterio` backend:\n",
"- Band selection does not impact performance because of titiler-cmr concurrency CHECK THIS\n",
"\n",
"Takeaways: \n",
"- Prefer **single-day** (or narrow) intervals for responsive rendering\n",
"- The bigger the time range, the more data needs to be scanned and processed\n",
"- Avoid very low zooms for heavy composites; consider **minzoom ≥ 7**\n",
"\n",
"Use this tool to assess datasets of interest. If performance is poor, consider restricting applications use of the API to higher (more zoomed in) zoom levels and narrowing time frames.\n",
"\n",
"### Further Reading\n",
"- [TiTiler-CMR GitHub Repository](https://github.com/developmentseed/titiler-cmr)\n",
"- [Titiler-CMR API Documentation](https://staging.openveda.cloud/api/titiler-cmr/api.html#/)\n",
"- [Tile Matrix Sets and Zoom Levels](https://docs.opengeospatial.org/is/17-083r2/17-083r2.html#_tile_matrix_sets_and_zoom_levels)\n",
"- [Earthdata Cloud CMR Datasets](https://cmr.earthdata.nasa.gov/search/site/docs/search/api.html#datasets)\n"
]
},
{
"cell_type": "markdown",
"id": "7fb8c8e1",
"metadata": {},
"source": []
}
],
"metadata": {
6 changes: 3 additions & 3 deletions docs/visualization/titiler/titiler-cmr/compatibility.ipynb
@@ -22,7 +22,7 @@
"- An Earthdata login account: https://urs.earthdata.nasa.gov/\n",
"- A valid `netrc` file with your Earthdata credentials or use interactive login.\n",
"\n",
"For this walkthrough, we will use the public instance hosted by [Open VEDA](https://staging.openveda.cloud/api/titiler-cmr/)."
"For this walkthrough, we will use the TiTiler-CMR instance hosted by [Open VEDA](https://www.earthdata.nasa.gov/data/tools/veda): [https://staging.openveda.cloud/api/titiler-cmr/](https://staging.openveda.cloud/api/titiler-cmr/)."
]
},
{
@@ -50,7 +50,7 @@
"metadata": {},
"source": [
"### Introduction to TiTiler-CMR\n",
"[`Titiler-CMR`](https://github.com/developmentseed/titiler-cmr) is a dynamic map tile server that provides on-demand access to Earth science data managed by NASA's Common Metadata Repository (CMR). It allows users to dynamically generate and serve map tiles from multidimensional data formats like NetCDF and HDF5.\n",
"[`TiTiler-CMR`](https://github.com/developmentseed/titiler-cmr) is a dynamic map tile server that provides on-demand access to Earth science data managed by NASA's Common Metadata Repository (CMR). It allows users to dynamically generate and serve map tiles from multidimensional data formats like NetCDF and HDF5.\n",
"\n",
"To get started with TiTiler-CMR, you typically need to:\n",
"- Choose a Titiler-CMR endpoint\n",
@@ -66,7 +66,7 @@
"Here, we first explore a dataset using `earthaccess` to collect the necessary information such as **concept_id**, **backend**, and **variable**, then run a compatibility check using the `check_titiler_cmr_compatibility` helper function. If you already know your dataset, you can skip the exploration steps step 2 directly. \n",
"\n",
"## Step 1: Explore data with `earthaccess`\n",
"You can use [`earthaccess`](https://github.com/nsidc/earthaccess) to search for dataset and inspect the individual granules used in your query. This helps you validate which files were accessed, their sizes, and the temporal range.\n",
"You can use [`earthaccess`](https://github.com/nsidc/earthaccess) to search for datasets and inspect the individual granules used in your query. This helps you validate which files were accessed, their sizes, and the temporal range.\n",
"\n",
"First you need to authenticate to Earthdata. "
]