Athena: configurable S3 root for Iceberg table location independent of staging bucket #3823

@jflipts

Feature description

The Athena destination derives the final Iceberg table LOCATION from staging_config.bucket_url (hardcoded in athena._get_table_update_sql). There is currently no way to configure a separate S3 root for the final table data, so the staging Parquet files and the final Iceberg table data always share the same S3 bucket/prefix root, even though nothing requires them to.

Are you a dlt user?

Yes, I run dlt in production.

Use case

Our environment enforces strict S3 prefix-based access control. We want to keep staging data internal under a prefix like s3://datalake/source_system/private/..., while the final tables should be stored under a sibling prefix like s3://datalake/source_system/output/...

Proposed solution

Add an optional table_bucket_url field to AthenaClientConfiguration. When set, _get_table_update_sql uses it as the S3 root for the Iceberg table LOCATION while continuing to use staging_config.bucket_url for the staging EXTERNAL TABLE LOCATION.
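
To make the proposal concrete, here is a minimal sketch of the intended fallback logic. It is not dlt's actual code: the helper name resolve_iceberg_location and the `<root>/<dataset>/<table>` path layout are assumptions for illustration only. The one thing taken from the proposal is the rule itself: use table_bucket_url when it is set, otherwise keep using the staging bucket URL.

```python
from typing import Optional


def resolve_iceberg_location(
    staging_bucket_url: str,
    dataset_name: str,
    table_name: str,
    table_bucket_url: Optional[str] = None,
) -> str:
    """Return the S3 LOCATION for the final Iceberg table.

    Hypothetical helper, not part of dlt. The path layout
    (<root>/<dataset>/<table>) is an assumption for illustration.
    """
    # Prefer the dedicated table root; fall back to the staging root
    # so that existing configurations resolve to the same location.
    root = table_bucket_url or staging_bucket_url
    return f"{root.rstrip('/')}/{dataset_name}/{table_name}"
```

With the prefixes from the use case, passing only the staging root s3://datalake/source_system/private would place the table under that prefix, while additionally passing s3://datalake/source_system/output as table_bucket_url would place it under the output prefix and leave the staging files where they are.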

Related issues

No response
