Feature/dbt macros #85

Open
littlemex wants to merge 4 commits into main from feature/dbt-macros

Conversation

@littlemex
Owner

No description provided.

@amazon-q-developer

Code review in progress. Analyzing for code quality issues and best practices. You can monitor the review status in the checks section at the bottom of this pull request. Detailed findings will be posted upon completion.

Using Amazon Q Developer for GitHub

Amazon Q Developer¹ is an AI-powered assistant that integrates directly into your GitHub workflow, enhancing your development process with intelligent features for code development, review, and transformation.

Slash Commands

Command        Description
/q <message>   Chat with the agent to ask questions or request revisions
/q review      Requests an Amazon Q powered code review
/q help        Displays usage information

Features

Agentic Chat
Enables interactive conversation with Amazon Q to ask questions about the pull request or request specific revisions. Use /q <message> in comment threads or the review body to engage with the agent directly.

Code Review
Analyzes pull requests for code quality, potential issues, and security concerns. Provides feedback and suggested fixes. Automatically triggered on new or reopened PRs (can be disabled for AWS registered installations), or manually with /q review slash command in a comment.

Customization

You can create project-specific rules for Amazon Q Developer to follow:

  1. Create a .amazonq/rules folder in your project root.
  2. Add Markdown files in this folder to define rules (e.g., cdk-rules.md).
  3. Write detailed prompts in these files, such as coding standards or best practices.
  4. Amazon Q Developer will automatically use these rules when generating code or providing assistance.

Example rule:

All Amazon S3 buckets must have encryption enabled, enforce SSL, and block public access.
All Amazon DynamoDB Streams tables must have encryption enabled.
All Amazon SNS topics must have encryption enabled and enforce SSL.
All Amazon SQS queues must enforce SSL.

Feedback

To provide feedback on Amazon Q Developer, create an issue in the Amazon Q Developer public repository.

For more detailed information, visit the Amazon Q for GitHub documentation.

Footnotes

  1. Amazon Q Developer uses generative AI. You should verify generated code before using it in your environment. See the AWS Responsible AI Policy.

@amazon-q-developer (bot) left a comment


🚀 Feature Review: dbt Macros Enhancement

This PR introduces significant enhancements to the multitenant analytics platform, adding comprehensive dbt macro support and local development capabilities. The changes demonstrate a well-thought-out approach to scaling multi-tenant data processing.

Strengths

  1. Excellent Local Development Support: The addition of --local flags across all ETL managers provides a valuable development environment using Docker Compose
  2. Comprehensive dbt Macro System: The new macros for dynamic tenant schema detection and batch processing show good architectural thinking for scalability
  3. Consistent Script Renaming: The migration from phase-specific script names to more descriptive names (e.g., aurora-sql-execute.sh) improves maintainability
  4. Robust Configuration Structure: The enhanced config.json with separate remote/local configurations supports multiple deployment scenarios

🔧 Areas for Improvement

  1. Security Concerns: Hardcoded credentials in config.json pose security risks and should use environment variables

  2. Code Quality Issues:

    • SQL generation logic in dbt macros has structural problems that could cause compilation errors
    • Hardcoded container names create tight coupling
    • Missing error handling for edge cases (empty tenant lists)
  3. Configuration Inconsistencies: Mixed references to Glue and PostgreSQL/Redshift adapters could cause confusion

🎯 Recommendations

  1. Priority 1 (Security): Replace hardcoded credentials with environment variable references
  2. Priority 2 (Functionality): Fix the SQL generation logic in union_zero_etl_tenant_tables macro to prevent compilation errors
  3. Priority 3 (Robustness): Add proper error handling for empty tenant schema scenarios

📊 Impact Assessment

  • Positive: Significantly improves developer experience and system scalability
  • Risk: Some implementation issues could cause runtime failures if not addressed
  • Complexity: Adds substantial functionality while maintaining reasonable code organization

The PR represents a major step forward in making the platform more developer-friendly and scalable. With the identified issues addressed, this will be a valuable addition to the codebase.


Coming soon: You'll be able to request the agent to implement changes and create commits on your pull request's source branch. This may trigger any configured CI/CD workflows automatically.

You can disable any CI/CD workflow that you have configured to run on the source branch of the pull request. For more information, see Disabling and enabling a workflow in the GitHub documentation.

echo " ./2-etl-manager.sh -p $PATTERN -c $CONFIG_FILE --local"
echo ""
echo "Or test local PostgreSQL connection:"
echo " docker exec -it multitenant-analytics-platform-postgres-1 psql -U dbt_user -d postgres"

Good addition of local execution support! The --local flag provides a valuable development environment option. However, there's a potential issue with the hardcoded container name in the local environment info.

Suggested change
echo " docker exec -it multitenant-analytics-platform-postgres-1 psql -U dbt_user -d postgres"
echo " docker exec -it \$(docker compose ps -q postgres) psql -U dbt_user -d postgres"

# Function to transfer files to Docker container based on config.json
transfer_files_to_docker_container() {
local config_file="$1"
local container_name="multitenant-analytics-platform-dbt-local-1"

The hardcoded container name creates a tight coupling and potential brittleness. Consider making this configurable or using dynamic container discovery.

Suggested change
local container_name="multitenant-analytics-platform-dbt-local-1"
local container_name=$(docker compose ps -q dbt-local 2>/dev/null | head -1)
if [[ -z "$container_name" ]]; then
    container_name="multitenant-analytics-platform-dbt-local-1"  # fallback
fi


# Verify transfer
print_info "Verifying transferred files in Docker container..."
local verify_output=$(docker exec "$container_name" bash -c "cd $container_path && echo 'VERIFY: Current directory:' && pwd && echo 'VERIFY: Directory contents:' && ls -la && echo 'VERIFY: Key files check:' && if [ -f config.json ]; then echo 'VERIFY: config.json exists'; else echo 'VERIFY: config.json MISSING'; fi && if [ -f scripts/aurora-sql-execute.sh ]; then echo 'VERIFY: scripts/aurora-sql-execute.sh exists'; else echo 'VERIFY: scripts/auora.sh MISSING'; fi && if [ -f sql/aurora/schema/create-tenant-schemas.sql ]; then echo 'VERIFY: sql/aurora/schema/create-tenant-schemas.sql exists'; else echo 'VERIFY: sql/aurora/schema/create-tenant-schemas.sql MISSING'; fi" 2>/dev/null)

There's a typo in the verification message that could cause confusion during debugging.

Suggested change
local verify_output=$(docker exec "$container_name" bash -c "cd $container_path && echo 'VERIFY: Current directory:' && pwd && echo 'VERIFY: Directory contents:' && ls -la && echo 'VERIFY: Key files check:' && if [ -f config.json ]; then echo 'VERIFY: config.json exists'; else echo 'VERIFY: config.json MISSING'; fi && if [ -f scripts/aurora-sql-execute.sh ]; then echo 'VERIFY: scripts/aurora-sql-execute.sh exists'; else echo 'VERIFY: scripts/auora.sh MISSING'; fi && if [ -f sql/aurora/schema/create-tenant-schemas.sql ]; then echo 'VERIFY: sql/aurora/schema/create-tenant-schemas.sql exists'; else echo 'VERIFY: sql/aurora/schema/create-tenant-schemas.sql MISSING'; fi" 2>/dev/null)
local verify_output=$(docker exec "$container_name" bash -c "cd $container_path && echo 'VERIFY: Current directory:' && pwd && echo 'VERIFY: Directory contents:' && ls -la && echo 'VERIFY: Key files check:' && if [ -f config.json ]; then echo 'VERIFY: config.json exists'; else echo 'VERIFY: config.json MISSING'; fi && if [ -f scripts/aurora-sql-execute.sh ]; then echo 'VERIFY: scripts/aurora-sql-execute.sh exists'; else echo 'VERIFY: scripts/aurora-sql-execute.sh MISSING'; fi && if [ -f sql/aurora/schema/create-tenant-schemas.sql ]; then echo 'VERIFY: sql/aurora/schema/create-tenant-schemas.sql exists'; else echo 'VERIFY: sql/aurora/schema/create-tenant-schemas.sql MISSING'; fi" 2>/dev/null)

Comment on lines +139 to +142
{% if sample_tenant is none %}
{% set tenant_schemas = get_zero_etl_tenant_schemas(zeroetl_database) %}
{% set sample_tenant = tenant_schemas[0] %}
{% endif %}

🛑 Potential Runtime Error: This code assumes that tenant_schemas will always have at least one element, but if get_zero_etl_tenant_schemas() returns an empty list, accessing tenant_schemas[0] will cause an index error.

Suggested change
{% if sample_tenant is none %}
{% set tenant_schemas = get_zero_etl_tenant_schemas(zeroetl_database) %}
{% set sample_tenant = tenant_schemas[0] %}
{% endif %}
{% if sample_tenant is none %}
{% set tenant_schemas = get_zero_etl_tenant_schemas(zeroetl_database) %}
{% if tenant_schemas|length > 0 %}
{% set sample_tenant = tenant_schemas[0] %}
{% else %}
{{ log("No tenant schemas found for column introspection", info=true) }}
{{ return([]) }}
{% endif %}
{% endif %}

Comment on lines +63 to +95
{% for batch in batches %}
{% if batches|length > 1 %}
{# Process each batch as a CTE #}
batch_{{ loop.index0 }} AS (
{% endif %}

{% for tenant_schema in batch %}
{% set table_ref = zeroetl_database ~ '.' ~ tenant_schema ~ '.' ~ table_name %}

SELECT
'{{ tenant_schema }}'::varchar(50) as tenant_id,
{{ select_columns }}
FROM {{ table_ref }}

{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}

{% if batches|length > 1 %}
){% if not loop.last %},{% endif %}
{% endif %}
{% endfor %}

{# If there are multiple batches, UNION them at the end #}
{% if batches|length > 1 %}
{% for batch in batches %}
SELECT * FROM batch_{{ loop.index0 }}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}
{% endif %}

The batch processing logic has a structural issue. The current implementation generates invalid SQL when there are multiple batches because it creates CTEs inside a loop without proper WITH clause structure. This will cause SQL compilation errors.

Suggested change
{% for batch in batches %}
{% if batches|length > 1 %}
{# Process each batch as a CTE #}
batch_{{ loop.index0 }} AS (
{% endif %}
{% for tenant_schema in batch %}
{% set table_ref = zeroetl_database ~ '.' ~ tenant_schema ~ '.' ~ table_name %}
SELECT
'{{ tenant_schema }}'::varchar(50) as tenant_id,
{{ select_columns }}
FROM {{ table_ref }}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}
{% if batches|length > 1 %}
){% if not loop.last %},{% endif %}
{% endif %}
{% endfor %}
{# If there are multiple batches, UNION them at the end #}
{% if batches|length > 1 %}
{% for batch in batches %}
SELECT * FROM batch_{{ loop.index0 }}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}
{% endif %}
{% if batches|length > 1 %}
{# Multiple batches: use CTE structure #}
WITH
{% for batch in batches %}
batch_{{ loop.index0 }} AS (
{% for tenant_schema in batch %}
{% set table_ref = zeroetl_database ~ '.' ~ tenant_schema ~ '.' ~ table_name %}
SELECT
'{{ tenant_schema }}'::varchar(50) as tenant_id,
{{ select_columns }}
FROM {{ table_ref }}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}
){% if not loop.last %},{% endif %}
{% endfor %}
{# Final UNION of all batches #}
{% for batch in batches %}
SELECT * FROM batch_{{ loop.index0 }}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}
{% else %}
{# Single batch: direct UNION #}
{% for tenant_schema in batches[0] %}
{% set table_ref = zeroetl_database ~ '.' ~ tenant_schema ~ '.' ~ table_name %}
SELECT
'{{ tenant_schema }}'::varchar(50) as tenant_id,
{{ select_columns }}
FROM {{ table_ref }}
{% if not loop.last %}
UNION ALL
{% endif %}
{% endfor %}
{% endif %}

Comment on lines +30 to +36
"local": {
"host": "localhost",
"port": 5432,
"database": "multitenant_analytics",
"username": "dbt_user",
"password": "dbt_password"
},

🛑 Security Vulnerability: Hardcoded credentials in configuration files pose a significant security risk. These credentials could be exposed in version control, logs, or during deployment processes.

Suggested change
"local": {
"host": "localhost",
"port": 5432,
"database": "multitenant_analytics",
"username": "dbt_user",
"password": "dbt_password"
},
"local": {
"host": "localhost",
"port": 5432,
"database": "multitenant_analytics",
"username": "${LOCAL_DB_USER:-dbt_user}",
"password": "${LOCAL_DB_PASSWORD:-dbt_password}"
},
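Note that a plain JSON parser will not expand `${VAR:-default}` placeholders; whatever script loads config.json must do the expansion itself. A minimal sketch, where the function name `expand_env` and the sample keys are illustrative:

```python
import json
import os
import re


def expand_env(value: str) -> str:
    """Expand ${VAR} and ${VAR:-default} placeholders from the environment.

    Neither json.loads nor os.path.expandvars understands the :-default
    form, so a small regex-based expander is used instead.
    """
    pattern = re.compile(r"\$\{(\w+)(?::-([^}]*))?\}")
    return pattern.sub(lambda m: os.environ.get(m.group(1), m.group(2) or ""), value)


# Hypothetical excerpt of the "local" block from config.json
raw = '{"username": "${LOCAL_DB_USER:-dbt_user}", "password": "${LOCAL_DB_PASSWORD:-dbt_password}"}'
local_cfg = {k: expand_env(v) if isinstance(v, str) else v
             for k, v in json.loads(raw).items()}
```

With neither environment variable set, the defaults (`dbt_user` / `dbt_password`) apply, preserving the current local-development behavior while keeping real credentials out of version control.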

+materialized: table
+tags: ["marts"]

# dbt-glue specific configurations

The comment mentions "dbt-glue specific configurations" but the project appears to be using PostgreSQL/Redshift adapters based on the profiles and other configurations. This could cause confusion about the intended target platform.

Suggested change
# dbt-glue specific configurations
# Multi-tenant dbt configurations

Comment on lines +34 to +36
# Glue Interactive Sessions settings
glue_session_role: "dbt-glue-interactive-session-role"
glue_region: "us-east-1"

These Glue-specific configurations seem inconsistent with the PostgreSQL/Redshift setup used elsewhere in the project. Consider removing these if not using AWS Glue, or clarify the multi-adapter strategy.

Suggested change
# Glue Interactive Sessions settings
glue_session_role: "dbt-glue-interactive-session-role"
glue_region: "us-east-1"
# Multi-tenant configuration - supports 1000+ tenants
