13 changes: 13 additions & 0 deletions css/components/_tables.scss
Expand Up @@ -39,3 +39,16 @@ table.requirements {
width: 40%;
}
}

table.requirements-extended {
td, th {
padding: .5rem;
vertical-align: top;
}
td:first-child{
min-width: 150px;
}
td:nth-child(4), td:nth-child(5){
width: 35%;
}
}
103 changes: 96 additions & 7 deletions guides/file_formats.qmd
Expand Up @@ -33,14 +33,19 @@ challenging to use.
An exception to this are the 'RGB' style products, where three bands are used to represent a single image. In this case,
creating a Cloud Optimised GeoTIFF with three bands is an option.

For associating time information, create one GeoTIFF per timestamp, and one STAC item per timestamp. The GeoTIFF format has
not built-in support for conveying time information, but STAC metadata is supporting this very well.
For associating time information, create one GeoTIFF per timestamp, and one STAC item per timestamp. The GeoTIFF format
has no built-in support for conveying time information, but STAC metadata supports this very well.

### Visualisation in APEx Geospatial Explorer

To optimise visualisation in the APEx Geospatial Explorer, additional guidelines have been established. Adhering to these
guidelines will ensure that the data is effectively optimised for visualisation on a map. Please refer to
[this page](../interoperability/geospatial_explorer.qmd#cloud-optimized-geotiff-cog) for more information.
To optimise visualisation in the APEx Geospatial Explorer, it is recommended to use the GoogleMapsCompatible tiling scheme,
typically 256x256 pixel tiles aligned to a global grid. The default Coordinate Reference System (CRS) used in the Geospatial
Explorer is the Web Mercator projection (EPSG:3857), so all datasets in this projection are supported. On-the-fly
reprojection and/or configuration of a Geospatial Explorer instance for an alternative CRS is feasible, although we
advise contacting the APEx team for specific advice when using alternative projections. The BitsPerSample field must accurately
reflect the data format. Overviews are essential for performance and should be generated by downsampling by factors of
two until the image dimensions are the size of a tile or smaller. These overviews should also be tiled and placed after
the main image data to conform with the COG specification.
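
The overview rule above (halve the resolution until the image fits in a single tile) can be worked out ahead of time.
The following is a minimal sketch in plain Python, not part of the APEx tooling; the function name `overview_factors`
is hypothetical, and the 256 pixel default simply mirrors the tile size mentioned above. The resulting factors are the
kind of levels one would typically hand to an overview-building tool.

```python
def overview_factors(width: int, height: int, tile_size: int = 256) -> list[int]:
    """Return the power-of-two decimation factors needed for a COG.

    Hypothetical helper: keeps doubling the factor until the largest
    dimension of the downsampled image is no bigger than one tile.
    """
    largest = max(width, height)
    factors = []
    factor = 1
    # Ceiling division gives the size of the image at the current factor.
    while (largest + factor - 1) // factor > tile_size:
        factor *= 2
        factors.append(factor)
    return factors


if __name__ == "__main__":
    # A 10000 x 8000 pixel image needs overviews down to factor 64.
    print(overview_factors(10000, 8000))
```

An image that already fits within one tile yields an empty list, meaning no overviews are required.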

## (Geo-)Zarr

Expand All @@ -61,6 +66,90 @@ At the time of writing, there are, however these important caveats:

## NetCDF

NetCDF is a self-describing format with some properties similar to Zarr, but less optimised for cloud access. It can be useful
for exchanging data cubes as single files through traditional methods. However, it is less recommended for convenient
NetCDF is a self-describing format with some properties similar to Zarr, but less optimised for cloud access. It can be
useful for exchanging data cubes as single files through traditional methods. However, it is less recommended for convenient
sharing of large datasets, for which either COG or Zarr provide better options.

## Statistical Datasets (FlatGeobuf, GeoJSON)

Statistical datasets can be used to store precomputed statistics for dataset variables based on spatial units, such as
administrative areas. An example is to collect land cover statistics using boundaries from the Nomenclature of Territorial
Units for Statistics (NUTS), as shown in the [APEx Geospatial Explorer](https://explorer.apex.esa.int/) (Statistics). The
guidelines in this section are focused on supporting the integration of statistical data for visualisation in the APEx
Geospatial Explorer.

The statistical datasets are expected to be vector layers that are provided in a format that can be parsed to a feature
collection following the GeoJSON [@geojson] specification. Currently tested and supported formats are GeoJSON [@geojson]
and FlatGeobuf [@flatgeobuf]. FlatGeobuf should be used where the statistical dataset is large, as this allows
streaming of the relevant features without downloading the full dataset, which improves performance.

The metadata header of the file should contain the following properties to define which fields on the features in the
dataset should be used for the following purposes.

- identifierKey: The name of the field that stores the unique identifier for each feature.
- nameKey: The name of the field that stores the human-readable name for display.
- levelKey: The name of the field that stores the administrative level number.
- childrenKey: The name of the field that has a comma-separated list of child feature IDs as declared in identifierKey.
Can be the empty string if this is the bottom level.
- attributeKeys: A comma-separated list of field names that store the statistical data.
- units: The units as displayed in the UI. This is for UI purposes only and has no effect on the data.
- visualization_hint: One of the strings histogram, categorised, or continuous, used as a hint to the UI to choose a
suitable presentation for the data.

For example, properties in the file metadata that are defined as follows:

- identifierKey: NUTS_ID
- nameKey: NUTS_NAME
- levelKey: LEVL_CODE
- childrenKey: children
- attributeKeys: Trees, Shrubland, Grassland
- visualization_hint: categorised

would use the fields NUTS_ID, NUTS_NAME, … in the data to determine the navigation and display of statistics in the
Geospatial Explorer. For further guidance, please contact the APEx team through the [APEx User Forum](http://forum.apex.esa.int/).
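
To make the role of these properties more concrete, the sketch below checks a metadata dictionary and splits
`attributeKeys` into a list of field names. It is only an illustration under stated assumptions: the helper name
`parse_statistics_metadata` is hypothetical, and treating `units` and `visualization_hint` as optional is an
assumption drawn from the example above rather than a documented rule.

```python
# Keys assumed to be mandatory for this sketch (assumption, see lead-in).
REQUIRED_KEYS = ("identifierKey", "nameKey", "levelKey", "childrenKey", "attributeKeys")
# Hint values listed in the guidelines above.
VISUALIZATION_HINTS = {"histogram", "categorised", "continuous"}


def parse_statistics_metadata(metadata: dict) -> dict:
    """Validate the metadata header and return a copy with attributeKeys as a list."""
    missing = [key for key in REQUIRED_KEYS if key not in metadata]
    if missing:
        raise ValueError(f"missing metadata properties: {', '.join(missing)}")

    hint = metadata.get("visualization_hint")
    if hint is not None and hint not in VISUALIZATION_HINTS:
        raise ValueError(f"unsupported visualization_hint: {hint!r}")

    parsed = dict(metadata)
    # Split the comma-separated field names and drop surrounding whitespace.
    parsed["attributeKeys"] = [
        name.strip() for name in metadata["attributeKeys"].split(",") if name.strip()
    ]
    return parsed


if __name__ == "__main__":
    example = {
        "identifierKey": "NUTS_ID",
        "nameKey": "NUTS_NAME",
        "levelKey": "LEVL_CODE",
        "childrenKey": "children",
        "attributeKeys": "Trees, Shrubland, Grassland",
        "visualization_hint": "categorised",
    }
    print(parse_statistics_metadata(example)["attributeKeys"])
```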

Datasets that have classifications (such as land use) should have key:value entries consisting of 'name':'value', an
entry with a key of 'classifications' whose value is a comma-separated string listing all the keys for the
classifications, and a 'total' key with the sum of all other values. This allows bar charts and pie charts to be
rendered correctly.

```{json}
{
  "Bare / sparse vegetation": 3349.349614217657,
  "Built-up": 18474.280639104116,
  "Cropland": 155067.6934300016,
  "Grassland": 140178.79417018566,
  "Herbaceous wetland": 1612.828666906516,
  "Mangroves": 479.46053523623897,
  "Moss and lichen": 499.40601429089236,
  "Permanent water bodies": 8969.837211370474,
  "Shrubland": 7342.96093361589,
  "Snow and ice": 495.7695064816955,
  "Tree cover": 301783.0035618253,
  "Unknown": 1.7258467103820294,
  "total": 638255.1101299465,
  "classifications": "Tree cover,Shrubland,Grassland,Cropland,Built-up,Bare / sparse vegetation,Snow and ice,Permanent water bodies,Herbaceous wetland,Mangroves,Moss and lichen,Unknown"
}
```

![worldcover_bar_chart_example](./images/worldcover_bar_chart_example.png){width=75%}
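
The 'total' and 'classifications' entries can be derived from the per-class values rather than written by hand. The
following is a minimal sketch, assuming the per-class values are already available as a Python dictionary; the helper
name `with_classification_summary` is hypothetical and not part of any APEx library.

```python
def with_classification_summary(values: dict[str, float]) -> dict:
    """Return feature properties with 'total' and 'classifications' added.

    `values` maps each classification name to its statistic.
    """
    properties = dict(values)
    # Sum of all per-class values, as expected under the 'total' key.
    properties["total"] = sum(values.values())
    # Comma-separated list of the classification keys, in insertion order.
    properties["classifications"] = ",".join(values)
    return properties


if __name__ == "__main__":
    print(with_classification_summary({"Tree cover": 3.0, "Shrubland": 1.5}))
```

Because the list is comma-separated, this sketch only works when the classification names themselves contain no commas.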

Datasets that do not have classifications (such as a raster showing soil organic carbon) should contain a selection of
the following entries:

- mean
- min
- max

These values will be rendered as a table.

```{json}
{
  "mean": 437.94353402030356,
  "min": 60,
  "max": 4410
}
```

![worldsoils_table_example](./images/worldsoils_table_example.png){width=75%}
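
As an illustration of how such entries might be produced, the sketch below computes them from a list of values sampled
within one spatial unit. It is an assumption-laden example: the function name `summary_statistics` is hypothetical, and
representing no-data samples as `None` is a choice made only for this sketch.

```python
def summary_statistics(values: list) -> dict:
    """Compute the mean, min and max entries, ignoring None (no-data) samples."""
    valid = [value for value in values if value is not None]
    if not valid:
        raise ValueError("no valid samples to summarise")
    return {
        "mean": sum(valid) / len(valid),
        "min": min(valid),
        "max": max(valid),
    }


if __name__ == "__main__":
    print(summary_statistics([60, None, 120, 4410]))
```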
50 changes: 26 additions & 24 deletions instantiation/app_code_server.md
Expand Up @@ -4,8 +4,8 @@ title: Code Server IDE

## Overview

The Code Server Interactive Development Environment (IDE) capacity within the APEx Project Environments primarily leverages the power
of the [Code Server software (Visual Studio Code in the Cloud)](#code-server-software-architecture).
The Code Server Interactive Development Environment (IDE) capacity within the APEx Project Environments primarily leverages
the power of the Code Server software (Visual Studio Code in the Cloud).

![Current APEx Code Server IDE](images/code_server.png)

Expand All @@ -22,16 +22,17 @@ for programming languages and productivity plugins or extensions.
## Software Architecture

The APEx Code Server solution is an Integrated Development Environment delivered as a cloud-based user workspace, tailored
to support the activities of Earth observation (EO) projects.
to support the activities of Earth observation (EO) projects.

The Code Server IDE within the APEx Instantiation Services is built on a "User Workspace" system architecture,
leveraging Kubernetes and JupyterHub for orchestration and management, and accessing a file system secured and private to the authenticated user.
leveraging Kubernetes and JupyterHub for orchestration and management, and accessing a file system secured and private
to the authenticated user.

![The Code Server IDE in the current APEx workspaces offering ](images/applicationhub_codeserver.png)

Each Code Server user workspace comes equipped with the Visual Studio Code Server, an extension of Microsoft's popular VS Code
editor, as well as with a private data products catalogue. These features empower developer users to edit and build EO
data processing algorithms and workflows, accelerating project outcomes within a dedicated, tool-rich environment.
Each Code Server user workspace comes equipped with the Visual Studio Code Server, an extension of Microsoft's popular
VS Code editor, as well as with a private data products catalogue. These features empower developer users to edit and
build EO data processing algorithms and workflows, accelerating project outcomes within a dedicated, tool-rich environment.

The Code Server setup encapsulates all the capabilities of Microsoft's popular VS Code editor and extends them to be run
and accessed on a remote server. Beyond the core functionality of its desktop counterpart, the Code Server IDE offers
Expand Down Expand Up @@ -90,11 +91,12 @@ assistant: <https://open-vsx.org/extension/Continue/continue>

It allows connecting any model and any context to build custom autocomplete and chat experiences inside Code Server:

* [Chat](https://continue.dev/docs/chat/how-to-use-it) makes it easy to ask for help from an LLM without needing to
leave the Code Server user interface
* [Autocomplete](https://continue.dev/docs/autocomplete/how-to-use-it) provides inline code suggestions as you type
* [Edit](https://continue.dev/docs/edit/how-to-use-it) is a convenient way to modify code without leaving your current file
* [Actions](https://continue.dev/docs/actions/how-to-use-it) are shortcuts for common use cases.
* [Chat](https://docs.continue.dev/ide-extensions/chat/quick-start) makes it easy to ask for help from an LLM without
needing to leave the Code Server user interface
* [Autocomplete](https://docs.continue.dev/ide-extensions/autocomplete/quick-start) provides inline code suggestions as
you type
* [Edit](https://docs.continue.dev/ide-extensions/edit/quick-start) is a convenient way to modify code without leaving
your current file

This extension asks for API keys to use the models.
This has been successfully tested and could be an option for the APEx use cases.
Expand Down Expand Up @@ -125,18 +127,18 @@ libraries like SNAP, GDAL, and Orfeo Toolbox. Developers build container images
command-line tools, along with necessary runtime environments, and publish these images on container registries for easy
access and deployment.

The Code Server IDE supports the use of the Common Workflow Language (CWL), allowing developers to delineate and disseminate application
workflows in a recognised format. CWL documents comprehensively describe the data processing application, including parameters,
software items, executables, dependencies, and metadata. This standardisation enhances collaboration, clarity, and operational
consistency, ensuring that applications are reproducible and portable across various execution scenarios, including local
computers, cloud resources, high-performance computing (HPC) environments, Kubernetes clusters, and services deployed through
an OGC API - Processes interface.

Version control and continuous integration are integral components of the Code Server IDE technical architecture. This enables access
to VCS (e.g. GitLab, GitHub) for efficient code repository management, version control, collaboration, and monitoring of
code changes. Automated continuous integration (CI) tools manage the build, test, and deployment tasks in response to code
modifications, ensuring that applications are always in a deployable state. This automation minimises manual testing overhead
and accelerates the rollout of new features or updates.
The Code Server IDE supports the use of the Common Workflow Language (CWL), allowing developers to delineate and disseminate
application workflows in a recognised format. CWL documents comprehensively describe the data processing application, including
parameters, software items, executables, dependencies, and metadata. This standardisation enhances collaboration, clarity,
and operational consistency, ensuring that applications are reproducible and portable across various execution scenarios,
including local computers, cloud resources, high-performance computing (HPC) environments, Kubernetes clusters, and services
deployed through an OGC API - Processes interface.

Version control and continuous integration are integral components of the Code Server IDE technical architecture. This
enables access to VCS (e.g. GitLab, GitHub) for efficient code repository management, version control, collaboration, and
monitoring of code changes. Automated continuous integration (CI) tools manage the build, test, and deployment tasks in
response to code modifications, ensuring that applications are always in a deployable state. This automation minimises
manual testing overhead and accelerates the rollout of new features or updates.

## Examples

13 changes: 10 additions & 3 deletions interoperability/algohosting.md
Expand Up @@ -49,6 +49,14 @@ complexity.
<td>
This ensures that the algorithm can be hosted on one of the APEx-compliant algorithm hosting platforms.
The APEx documentation will provide clear guidance and samples demonstrating these two options.
More information is available at the following pages in the APEx documentation:
<ul>
<li><a href="../propagation/service_development.qmd">APEx Algorithm Service Development Options</a></li>
<li><a href="../propagation/ondemandservices.qmd#how-to-build-an-on-demand-service">Building an On-Demand Service</a></li>
<li><a href="../propagation/platforms.md">Supported Platforms</a></li>
<li><a href="../guides/udp_writer_guide.qmd">Creating an openEO-based Service</a></li>
<li><a href="../guides/eoap_writer_guide.md">Creating an EOAP-based Service</a></li>
</ul>
</td>
</tr>
<tr>
Expand Down Expand Up @@ -92,7 +100,7 @@ complexity.
</tbody>
</table>

Table: Interoperability requirements for algorithm providers
Table: Interoperability requirements (mandatory) for algorithm providers
:::


Expand Down Expand Up @@ -181,10 +189,9 @@ Table: Interoperability requirements for algorithm providers
</tbody>
</table>

Table: Interoperability recommendations for algorithm providers
Table: Interoperability recommendations (optional) for algorithm providers
:::


## Best Practices

The following sections provide best practice guidelines for developing APEx-compliant algorithms. While these guidelines
11 changes: 3 additions & 8 deletions interoperability/algohostingenv.md
Expand Up @@ -65,26 +65,21 @@ their compatibility with the APEx standards.
</tr>
<tr>
<td>HOST-REQ-05</td>
<td>The algorithm hosting platform shall provide an SLA that guarantees support beyond the project lifetime.</td>
<td>This ensures the long-term sustainability and reliability of the algorithm hosting platform, providing assurance to users that they can rely on continued support even after the project's completion.</td>
</tr>
<tr>
<td>HOST-REQ-06</td>
<td>The operator of the algorithm hosting platform shall announce major changes to the SLA, including decommissioning of the platform, to APEx and the NoR, preferably with a lead time of 1 year.</td>
<td>Such communication is important to ensure that stakeholders, including APEx and the NoR, are given adequate notice of major changes that could impact the availability or functionality of the algorithm hosting platform. This approach allows for proper planning, adjustment, and mitigation of potential disruptions, ensuring continuity of services for users.</td>
</tr>
<tr>
<td>HOST-REQ-07</td>
<td>HOST-REQ-06</td>
<td>The algorithm hosting platform shall support standardized methods for machine-to-machine authentication.</td>
<td>For instance, the OIDC client credentials workflow allows APEx to securely authenticate.</td>
</tr>
<tr>
<td>HOST-REQ-08</td>
<td>HOST-REQ-07</td>
<td>The operator of the algorithm hosting platform shall support the APEx consortium in obtaining a single account either freely, or via <a href="https://portfolio.nor-discover.org/">ESA Network of Resources</a>, that allows testing of all published services.</td>
<td>For convenient service testing and minimal administrative overhead, a single account and NoR request should give access to multiple services.</td>
</tr>
<tr>
<td>HOST-REQ-09</td>
<td>HOST-REQ-08</td>
<td>The algorithm hosting platform shall expose process metadata publicly without requiring authentication, unless the nature of the algorithm requires its description to be hidden.</td>
<td>APEx tools request service metadata for informative purposes, also from browser-based applications that do not have platform tokens available.</td>
</tr>