Handle delete CDC events with _planetscale_operation field #150

Open
lahdirakram wants to merge 1 commit into planetscale:main from lahdirakram:feature/capture_deletions

@lahdirakram lahdirakram commented Apr 4, 2026

Summary

This PR adds explicit handling for delete CDC events in the PlanetScale Airbyte source.

Today, the connector only emits row changes when the VStream row change contains an after image. In practice, that means delete events are silently dropped because Vitess encodes deletes as before != nil && after == nil.

This change makes delete handling explicit and surfaces the operation type on every emitted record through a top-level _planetscale_operation field.
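The classification implied by the before/after images can be sketched as follows. This is an illustration only: `row`, `rowChange`, and `classify` are hypothetical stand-ins for the Vitess row-change types, not the code in this PR, and the insert/update rules assume the usual Vitess convention (insert has no before image, update has both).

```go
package main

import "fmt"

// row and rowChange are simplified, hypothetical stand-ins for the
// Vitess row-change types; only the nil-ness of the images matters here.
type row struct{ values []string }

type rowChange struct {
	Before *row
	After  *row
}

// classify maps the presence of the before/after images to an operation name.
// It returns "" when both images are missing.
func classify(rc rowChange) string {
	switch {
	case rc.Before == nil && rc.After != nil:
		return "insert"
	case rc.Before != nil && rc.After != nil:
		return "update"
	case rc.Before != nil && rc.After == nil:
		return "delete"
	default:
		return ""
	}
}

func main() {
	// A delete carries only a before image, so a check on After alone would skip it.
	fmt.Println(classify(rowChange{Before: &row{}}))
}
```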

Problem

The current behavior loses information for CDC consumers:

  • inserts are emitted
  • updates are emitted
  • deletes are ignored

For downstream systems, this means the connector cannot faithfully represent table state over time. Any consumer that expects CDC semantics will drift, because rows removed in PlanetScale never produce a corresponding event.

Proposed solution

This PR introduces two related changes:

  1. Add a top-level _planetscale_operation field to emitted records with values:

    • insert
    • update
    • delete
  2. Add capture_deletes as a source config flag to control whether delete events are emitted.

Behavior:

  • inserts emit the after image with _planetscale_operation = "insert"
  • updates emit the after image with _planetscale_operation = "update"
  • deletes emit the before image with _planetscale_operation = "delete" when capture_deletes=true
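As a rough sketch of the behavior listed above, assuming rows are already decoded into plain maps: the `change` type and `toRecord` helper are hypothetical names for illustration and do not mirror the connector's internals.

```go
package main

import "fmt"

const operationField = "_planetscale_operation"

// change is a hypothetical, already-decoded row change.
type change struct {
	Before map[string]any
	After  map[string]any
}

// toRecord picks the image to emit and labels it with the operation.
// The second return value is false when nothing should be emitted.
func toRecord(c change, captureDeletes bool) (map[string]any, bool) {
	var image map[string]any
	var op string
	switch {
	case c.After != nil && c.Before == nil:
		image, op = c.After, "insert"
	case c.After != nil:
		image, op = c.After, "update"
	case c.Before != nil:
		if !captureDeletes {
			return nil, false // delete suppressed by config
		}
		image, op = c.Before, "delete"
	default:
		return nil, false // no image at all
	}
	// Copy the image so the caller's map is not mutated.
	rec := make(map[string]any, len(image)+1)
	for k, v := range image {
		rec[k] = v
	}
	rec[operationField] = op
	return rec, true
}

func main() {
	deleted := change{Before: map[string]any{"id": 1}}
	if rec, ok := toRecord(deleted, true); ok {
		fmt.Println(rec)
	}
	_, ok := toRecord(deleted, false)
	fmt.Println("emitted with capture_deletes=false:", ok)
}
```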

Why a top-level field instead of _planetscale_metadata.operation

I originally considered putting the operation inside _planetscale_metadata, but I think that creates the wrong coupling.

Reasons:

  • Airbyte does not provide a standard protocol-level CRUD operation field on AirbyteRecordMessage
  • operation type is not just auxiliary metadata; it is part of the semantic meaning of the record
  • tying delete support to include_metadata makes delete capture harder to use than necessary
  • a top-level field gives downstream consumers a simple, explicit contract without forcing them to opt into all metadata

This keeps the design cleaner:

  • _planetscale_operation communicates row semantics
  • _planetscale_metadata remains optional transport/replication metadata

Why this is worth adding

This change improves correctness more than convenience.

Without delete emission, the connector is not producing a complete CDC stream. With this PR:

  • downstream systems can reconcile deletes correctly
  • CDC behavior becomes explicit instead of implicit
  • the connector better reflects Vitess row-change semantics
  • consumers can choose whether they want delete events through capture_deletes
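To illustrate the reconciliation point, here is a minimal sketch of a downstream consumer that keeps an in-memory replica using only the top-level field. The integer primary-key column `id` and the `apply` helper are assumptions made for the example, not part of the connector.

```go
package main

import "fmt"

const operationField = "_planetscale_operation"

// apply folds one emitted record into a replica keyed by the
// hypothetical integer primary-key column "id".
func apply(replica map[int]map[string]any, rec map[string]any) error {
	id, ok := rec["id"].(int)
	if !ok {
		return fmt.Errorf("record has no integer id: %v", rec)
	}
	op, _ := rec[operationField].(string)
	switch op {
	case "insert", "update":
		replica[id] = rec
	case "delete":
		delete(replica, id)
	default:
		return fmt.Errorf("unknown operation %q", op)
	}
	return nil
}

func main() {
	replica := map[int]map[string]any{}
	stream := []map[string]any{
		{"id": 1, "name": "a", operationField: "insert"},
		{"id": 1, "name": "b", operationField: "update"},
		{"id": 1, "name": "b", operationField: "delete"},
	}
	for _, rec := range stream {
		if err := apply(replica, rec); err != nil {
			fmt.Println("skipping record:", err)
		}
	}
	// Without the delete event, the row would remain in the replica.
	fmt.Println("rows left:", len(replica))
}
```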

Backward compatibility

This is designed to minimize disruption:

  • delete emission is opt-in via capture_deletes
  • _planetscale_metadata remains optional
  • existing metadata behavior is preserved
  • existing insert/update flows are unchanged except for the addition of _planetscale_operation
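The opt-in default can be illustrated with a small config-parsing sketch. The `sourceConfig` struct below is hypothetical and contains only the new flag; the real source config has other fields that are omitted here, and the `database` key is just a placeholder for them.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sourceConfig is a hypothetical, trimmed-down config holding only the new flag.
type sourceConfig struct {
	CaptureDeletes bool `json:"capture_deletes"`
}

// parseConfig decodes a JSON config; keys it does not know are ignored.
func parseConfig(raw string) (sourceConfig, error) {
	var cfg sourceConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	// An existing config that predates the flag keeps deletes disabled.
	existing, err := parseConfig(`{"database": "example"}`)
	fmt.Println(existing.CaptureDeletes, err)

	// Delete capture has to be requested explicitly.
	optedIn, err := parseConfig(`{"database": "example", "capture_deletes": true}`)
	fmt.Println(optedIn.CaptureDeletes, err)
}
```

Because a missing boolean decodes to Go's zero value, configs written before this change behave as they did before.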

Testing

This PR adds coverage for:

  • insert/update/delete row classification
  • delete emission when capture_deletes=true
  • delete suppression when capture_deletes=false
  • operation labeling on emitted records
  • full-refresh/COPY rows being labeled as insert

Notes

I also added a targeted log message for captured delete rows to make this behavior easier to verify at runtime.

If maintainers prefer a different field name or want delete capture enabled by default, I can adjust that, but I think the important part is to stop silently dropping delete events and provide an explicit CDC contract.
