feat(ibis): Show full Redshift table metadata by querying system catalogs #1345

douenergy · 2025-10-08T06:40:05Z

The current Redshift metadata query uses information_schema.tables and information_schema.columns, which only show schemas that are in the user's search_path or owned by the user. This causes visibility issues such as Canner/WrenAI#1953.

Summary by CodeRabbit

Bug Fixes
- More accurate Redshift metadata: corrected nullability and data type reporting; improved table/column comments.
- Excludes system/internal schemas and dropped/system columns; respects namespace privileges.
- Consistent ordering by schema, table, then column; better handling of views alongside tables.
Performance
- Faster and more reliable retrieval of Redshift table and column metadata.

coderabbitai · 2025-10-08T06:40:27Z

Walkthrough

The Redshift metadata query in get_table_list was replaced: it now queries PostgreSQL catalog tables (pg_class/pg_namespace/pg_attribute) and uses format_type/current_database() for types and catalogs, with revised filters, nullability logic, and ordering. External interface and Table/Column construction remain unchanged.

Changes

| Cohort / File(s) | Summary of Changes |
|---|---|
| Redshift metadata query rewrite `ibis-server/app/model/metadata/redshift.py` | Replaced information_schema-based SQL with a catalog-driven query against `pg_class`/`pg_namespace`/`pg_attribute`; selected fields now use `current_database()`/`nspname`/`relname`/`attname`/`format_type`; nullability moved to a `CASE` on `attnotnull`; filters restrict `relkind` to `'r'` and `'v'`, exclude system schemas/columns, exclude dropped columns, require `USAGE` on namespace, and order by schema, table, column position; existing helper transforms and Table/Column construction preserved. |
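To make the summary above concrete, here is a rough sketch of what a catalog-driven query of this shape could look like. It is not the SQL from the PR. The function name `build_table_list_sql`, the column aliases, the excluded-schema list, the `'YES'`/`'NO'` nullability strings, and the use of `pg_description` for comments are all assumptions made for illustration.

```python
# Illustrative sketch only; the actual query in redshift.py may differ.
# Assumed: the excluded-schema list, column aliases, and pg_description joins.
SYSTEM_SCHEMAS = ("pg_catalog", "information_schema", "pg_internal")


def build_table_list_sql(relkinds=("r", "v"), excluded_schemas=SYSTEM_SCHEMAS) -> str:
    """Build a catalog-based metadata query (hypothetical helper)."""
    relkind_list = ", ".join(f"'{kind}'" for kind in relkinds)
    schema_list = ", ".join(f"'{schema}'" for schema in excluded_schemas)
    return f"""
SELECT
    current_database() AS table_catalog,
    nsp.nspname AS table_schema,
    c.relname AS table_name,
    a.attname AS column_name,
    format_type(a.atttypid, a.atttypmod) AS data_type,
    CASE WHEN a.attnotnull THEN 'NO' ELSE 'YES' END AS is_nullable,
    td.description AS table_comment,
    cd.description AS column_comment
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace nsp ON nsp.oid = c.relnamespace
JOIN pg_catalog.pg_attribute a ON a.attrelid = c.oid
LEFT JOIN pg_catalog.pg_description td ON td.objoid = c.oid AND td.objsubid = 0
LEFT JOIN pg_catalog.pg_description cd ON cd.objoid = c.oid AND cd.objsubid = a.attnum
WHERE c.relkind IN ({relkind_list})
  AND nsp.nspname NOT IN ({schema_list})
  AND a.attnum > 0            -- skip system columns
  AND NOT a.attisdropped      -- skip dropped columns
  AND has_schema_privilege(nsp.oid, 'USAGE')
ORDER BY nsp.nspname, c.relname, a.attnum
"""


if __name__ == "__main__":
    print(build_table_list_sql())
```

Keeping the `relkind` values as a parameter makes it easy to widen coverage later if other relation kinds need to be exposed.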

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Metadata as RedshiftMetadata
  participant Conn as SQLConnector
  participant Redshift as Redshift

  Client->>Metadata: get_table_list()
  Metadata->>Conn: execute(catalog-based SQL)
  Conn->>Redshift: SELECT from pg_namespace/pg_class/pg_attribute
  Redshift-->>Conn: rows (catalog, schema, table, column, type, nullable, comment)
  Conn-->>Metadata: result set
  Note right of Metadata: Map rows → Table/Column objects<br/>Use helpers: _format_redshift_compact_table_name, _transform_redshift_column_type
  Metadata-->>Client: list[Table]
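The "map rows to Table/Column objects" step in the diagram can be sketched as follows. The `Table` and `Column` dataclasses here are simplified, hypothetical stand-ins for the project's own models, and the row keys (`table_schema`, `is_nullable`, and so on) are assumed aliases rather than the names used in the PR.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Column:  # hypothetical stand-in for the project's Column model
    name: str
    type: str
    not_null: bool
    description: Optional[str] = None


@dataclass
class Table:  # hypothetical stand-in for the project's Table model
    name: str
    description: Optional[str] = None
    columns: List[Column] = field(default_factory=list)


def group_rows(rows: List[dict]) -> List[Table]:
    """Fold flat (table, column) rows into one Table per relation."""
    tables: Dict[Tuple[str, str, str], Table] = {}
    for row in rows:
        key = (row["table_catalog"], row["table_schema"], row["table_name"])
        table = tables.get(key)
        if table is None:
            table = Table(name=".".join(key), description=row.get("table_comment"))
            tables[key] = table
        table.columns.append(
            Column(
                name=row["column_name"],
                type=row["data_type"],
                # Assumes the query emits 'NO' for NOT NULL columns.
                not_null=row["is_nullable"] == "NO",
                description=row.get("column_comment"),
            )
        )
    # dict preserves insertion order, so the query's ORDER BY is kept.
    return list(tables.values())
```

Because the rows arrive ordered by schema, table, and column position, a single pass is enough and the column order within each table matches the catalog.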

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Pay attention to the new SQL correctness and edge cases:
- type extraction via format_type(...) and its parsing in _transform_redshift_column_type
- nullability logic change from information_schema to attnotnull CASE expression
- privilege check (has_schema_privilege(nsp.oid, 'USAGE')) and schema exclusions
- handling of system/dropped columns and ordering semantics

Poem

I thump the logs and sniff each name,
From pg_class I learn the game.
Columns, types, and schemas bright—
I hop through catalogs all night.
A carrot cheer for metadata’s light. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: querying Redshift system catalogs instead of information_schema to show full table metadata, which directly addresses the PR's core objective. |


coderabbitai

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 923d6a7 and 7d1b853.

📒 Files selected for processing (1)

ibis-server/app/model/metadata/redshift.py (1 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)

GitHub Check: ci

🔇 Additional comments (1)

ibis-server/app/model/metadata/redshift.py (1)

114-153: Use pg_catalog for foreign keys in get_constraints

get_constraints still relies on information_schema, which has the same schema-visibility limitations that led to pg_catalog in get_table_list. Confirm whether Redshift’s pg_catalog.pg_constraint (joined with pg_class and pg_attribute) exposes foreign key metadata, and refactor get_constraints to use it for consistency and full coverage.

ibis-server/app/model/metadata/redshift.py

goldmedal · 2025-10-29T06:21:31Z

@douenergy, has this PR been tested thoroughly?

douenergy · 2025-11-19T05:29:23Z

The Oracle test failures are not related to this PR. I have tested it manually in my local environment.

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

ibis-server/app/model/metadata/redshift.py (1)
66-66: format_type output breaks REDSHIFT_TYPE_MAPPING for length/precision-qualified types.

format_type(a.atttypid, a.atttypmod) returns strings like "character varying(100)" or "numeric(10,2)", which will not match the base-type keys in REDSHIFT_TYPE_MAPPING, so those columns will resolve to UNKNOWN and generate warnings instead of the expected mapped types. Normalize data_type by stripping modifiers before lookup.

You can fix this inside _transform_redshift_column_type:
     def _transform_redshift_column_type(
         self, data_type: str
     ) -> RustWrenEngineColumnType:
@@
-        # Convert to lowercase for comparison
-        normalized_type = data_type.lower()
-
-        # Use the module-level mapping table
-        mapped_type = REDSHIFT_TYPE_MAPPING.get(
-            normalized_type, RustWrenEngineColumnType.UNKNOWN
-        )
+        # Convert to lowercase and strip type modifiers (e.g., "varchar(100)" -> "varchar")
+        normalized_type = data_type.lower()
+        # Remove everything after the first parenthesis or bracket
+        if "(" in normalized_type:
+            normalized_type = normalized_type.split("(", 1)[0].strip()
+        elif "[" in normalized_type:
+            normalized_type = normalized_type.split("[", 1)[0].strip()
+
+        # Use the module-level mapping table
+        mapped_type = REDSHIFT_TYPE_MAPPING.get(
+            normalized_type, RustWrenEngineColumnType.UNKNOWN
+        )
This is the same underlying issue noted in the earlier review and still appears unresolved.

Also applies to: 166-188
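As a standalone illustration of the normalization idea, here is a small sketch that can be run outside the project. It uses a regular expression instead of the `split` approach in the suggested diff, so that a modifier in the middle of a type name would also be removed. `ColumnType` and `TYPE_MAPPING` are hypothetical stand-ins for `RustWrenEngineColumnType` and `REDSHIFT_TYPE_MAPPING`, with only a few assumed entries.

```python
import re
from enum import Enum


class ColumnType(Enum):  # hypothetical stand-in for RustWrenEngineColumnType
    VARCHAR = "VARCHAR"
    NUMERIC = "NUMERIC"
    INTEGER = "INTEGER"
    UNKNOWN = "UNKNOWN"


# Hypothetical subset; the real REDSHIFT_TYPE_MAPPING keys may differ.
TYPE_MAPPING = {
    "character varying": ColumnType.VARCHAR,
    "numeric": ColumnType.NUMERIC,
    "integer": ColumnType.INTEGER,
}


def normalize_type_name(data_type: str) -> str:
    """Strip length/precision modifiers and array brackets from a type string."""
    normalized = data_type.strip().lower()
    normalized = re.sub(r"\([^)]*\)", "", normalized)  # "numeric(10,2)" -> "numeric"
    normalized = normalized.replace("[]", "")          # "integer[]" -> "integer"
    return " ".join(normalized.split())                # collapse leftover whitespace


def map_type(data_type: str) -> ColumnType:
    """Look up the base type, falling back to UNKNOWN."""
    return TYPE_MAPPING.get(normalize_type_name(data_type), ColumnType.UNKNOWN)
```

For example, `map_type("character varying(100)")` resolves to `ColumnType.VARCHAR` in this sketch, while an unmapped name such as `"geometry"` still falls back to `ColumnType.UNKNOWN`.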

🧹 Nitpick comments (2)

ibis-server/app/model/metadata/redshift.py (2)

61-79: Catalog-based table/column query looks sound; verify desired object coverage.

The pg_class/pg_namespace/pg_attribute join, filters, and ordering should give a consistent view of regular tables and views with correct nullability and comments. If you also want materialized views or other relation kinds exposed in metadata, consider extending c.relkind IN ('r', 'v') accordingly once you’ve confirmed the relkind values used by Redshift in your environment.

114-153: Constraints still use information_schema; visibility may lag the new catalog-based table list.

get_constraints continues to query information_schema.*; if Redshift enforces the same search_path/ownership restrictions there, you might list tables from schemas whose foreign keys never appear. Consider moving constraints to a catalog-based query as well or at least confirming that the current views expose all constraints you expect after this change.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7d1b853 and 218ebfa.

📒 Files selected for processing (1)

ibis-server/app/model/metadata/redshift.py (1 hunks)

goldmedal · 2025-12-03T06:16:02Z

@douenergy, did you check whether a user with lower permissions could see tables they shouldn't have access to through the pg_catalog path?
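One way to approach this question is to run the metadata query while connected as a low-privilege user and compare the result against the schemas that user is expected to see. The helper below is a hypothetical sketch of that comparison; `find_leaked_tables` and its inputs are illustrative names, not part of the PR. Note that `USAGE` on a schema does not by itself imply `SELECT` on the tables inside it, so whether the query should additionally filter with something like `has_table_privilege` is a separate decision for the authors.

```python
from typing import Iterable, List, Tuple


def find_leaked_tables(
    visible: Iterable[Tuple[str, str]],
    allowed_schemas: Iterable[str],
) -> List[Tuple[str, str]]:
    """Return (schema, table) pairs that fall outside the allowed schemas.

    `visible` is assumed to be collected from the metadata query while
    connected as the low-privilege user.
    """
    allowed = set(allowed_schemas)
    return sorted(
        (schema, table) for schema, table in visible if schema not in allowed
    )


if __name__ == "__main__":
    # Hypothetical data for a user who should only see the "public" schema.
    seen = [("public", "orders"), ("finance", "salaries")]
    print(find_leaked_tables(seen, allowed_schemas=["public"]))
```

An empty result for every restricted user would be reasonable evidence that the catalog-based path does not expose more than intended.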

github-actions bot added the ibis and python (Pull requests that update Python code) labels on Oct 8, 2025

coderabbitai bot reviewed Oct 8, 2025


douenergy marked this pull request as draft October 13, 2025 01:58

use pg_catalog instead of Redshift information schema

218ebfa

douenergy force-pushed the redshift-full-metadata branch from 7d1b853 to 218ebfa Compare November 19, 2025 05:13

douenergy marked this pull request as ready for review November 19, 2025 05:32

douenergy assigned goldmedal Nov 19, 2025

coderabbitai bot reviewed Nov 19, 2025
