Skip to content

Conversation

@nina9753
Copy link

@nina9753 nina9753 commented Aug 11, 2025

What does this PR do?

This PR adds comprehensive support for Google Cloud Pub/Sub push subscriptions, enabling distributed tracing for messages delivered via HTTP webhooks. Unlike pull subscriptions, where the application pulls for messages using the SDK, push subscriptions have GCP Pub/Sub POST messages directly to an HTTP endpoint.

New Plugin: pubsub-push-subscription.js

  • Creates pubsub.delivery spans for incoming push subscription HTTP requests
  • Intercepts HTTP requests with GCP-specific headers:
  • Extracts trace context from message attributes (injected by producer)
  • Parses message metadata (subscription name, topic, publish time)
  • Finishes span when HTTP response completes

Modified: packages/datadog-instrumentations/src/http.js

  • So the push subscription plugin must subscribe to apm:http:server:request:start before the HTTP server plugin
    • Detect Pub/Sub push requests early
    • Create pubsub.delivery span as active parent
    • Ensure web framework spans (Express, Fastify, etc.) become children of the delivery span

Next PR in the batch is #6782

Motivation

An inferred span for the push subscription HTTP POST request to the Cloud Run service from a pub/sub topic
Example full Push Distributed Trace of a cloud run service triggering another service using a push subscription

Plugin Checklist

Additional Notes

Critical for Distributed Tracing: When creating or updating a GCP Pub/Sub push subscription, you must include the --push-no-wrapper-write-metadata flag to enable trace context propagation. By default, GCP Pub/Sub wraps push messages in a JSON envelope and does not include message attributes as HTTP headers. We do not currently read the req.body for pubsub distributed tracing

Additional information can be found in this doc

image

@github-actions
Copy link

github-actions bot commented Aug 11, 2025

Overall package size

Self size: 13.42 MB
Deduped: 113.62 MB
No deduping: 128.63 MB

Dependency sizes | name | version | self size | total size | |------|---------|-----------|------------| | @datadog/libdatadog | 0.7.0 | 35.02 MB | 35.02 MB | | @datadog/native-appsec | 10.3.0 | 20.73 MB | 20.74 MB | | @datadog/pprof | 5.12.0 | 11.19 MB | 11.57 MB | | @datadog/native-iast-taint-tracking | 4.1.0 | 9.01 MB | 9.02 MB | | @opentelemetry/resources | 1.30.1 | 557.67 kB | 7.71 MB | | @opentelemetry/core | 1.30.1 | 908.66 kB | 7.16 MB | | protobufjs | 7.5.4 | 2.95 MB | 5.83 MB | | @datadog/wasm-js-rewriter | 5.0.1 | 2.82 MB | 3.53 MB | | @datadog/native-metrics | 3.1.1 | 1.02 MB | 1.43 MB | | @opentelemetry/api-logs | 0.208.0 | 199.48 kB | 1.42 MB | | @opentelemetry/api | 1.9.0 | 1.22 MB | 1.22 MB | | jsonpath-plus | 10.3.0 | 617.18 kB | 1.08 MB | | import-in-the-middle | 1.15.0 | 127.66 kB | 856.24 kB | | lru-cache | 10.4.3 | 804.3 kB | 804.3 kB | | @datadog/openfeature-node-server | 0.2.0 | 118.51 kB | 437.19 kB | | opentracing | 0.14.7 | 194.81 kB | 194.81 kB | | source-map | 0.7.6 | 185.63 kB | 185.63 kB | | pprof-format | 2.2.1 | 163.06 kB | 163.06 kB | | @datadog/sketches-js | 2.1.1 | 109.9 kB | 109.9 kB | | @isaacs/ttlcache | 2.1.2 | 90.79 kB | 90.79 kB | | lodash.sortby | 4.7.0 | 75.76 kB | 75.76 kB | | ignore | 7.0.5 | 63.38 kB | 63.38 kB | | istanbul-lib-coverage | 3.2.2 | 34.37 kB | 34.37 kB | | rfdc | 1.4.1 | 27.15 kB | 27.15 kB | | dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB | | tlhunter-sorted-set | 0.1.0 | 24.94 kB | 24.94 kB | | shell-quote | 1.8.3 | 23.74 kB | 23.74 kB | | limiter | 1.1.5 | 23.17 kB | 23.17 kB | | retry | 0.13.1 | 18.85 kB | 18.85 kB | | semifies | 1.0.0 | 15.84 kB | 15.84 kB | | jest-docblock | 29.7.0 | 8.99 kB | 12.76 kB | | crypto-randomuuid | 1.0.0 | 11.18 kB | 11.18 kB | | ttl-set | 1.0.0 | 4.61 kB | 9.69 kB | | mutexify | 1.4.0 | 5.71 kB | 8.74 kB | | path-to-regexp | 0.1.12 | 6.6 kB | 6.6 kB | | module-details-from-path | 1.0.4 | 3.96 kB | 3.96 kB | | escape-string-regexp | 5.0.0 | 3.66 kB | 3.66 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe

@codecov
Copy link

codecov bot commented Aug 11, 2025

Codecov Report

❌ Patch coverage is 24.67532% with 58 lines in your changes missing coverage. Please review.
✅ Project coverage is 84.44%. Comparing base (0bb1f17) to head (80b7ddf).

Files with missing lines Patch % Lines
...oogle-cloud-pubsub/src/pubsub-push-subscription.js 27.27% 32 Missing ⚠️
...atadog-instrumentations/src/google-cloud-pubsub.js 16.12% 26 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6260      +/-   ##
==========================================
- Coverage   84.94%   84.44%   -0.51%     
==========================================
  Files         514      507       -7     
  Lines       21754    21768      +14     
==========================================
- Hits        18478    18381      -97     
- Misses       3276     3387     +111     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@nina9753 nina9753 marked this pull request as draft August 11, 2025 17:43
@pr-commenter
Copy link

pr-commenter bot commented Aug 11, 2025

Benchmarks

Benchmark execution time: 2025-12-01 20:32:51

Comparing candidate commit 80b7ddf in PR branch nina.rei/SVLS-7168/gcp-push-pubsub-plugin with baseline commit 0bb1f17 in branch master.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 291 metrics, 29 unstable metrics.

Copy link
Collaborator

@BridgeAR BridgeAR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite a big PR, if you like, we could set up a sync call to go through the changes to review the PR :)

@nina9753 nina9753 marked this pull request as draft August 27, 2025 17:56
- Remove unused variables (mockRes, mockEmit, mockServer)
- Fix trailing spaces
- Fix line length issues
- Fix dot notation for object properties
- Fix multiple empty lines
- Fix padded blocks
- Fix consistent spacing between blocks
- Remove docs/pubsub-transit-handler.md from plugin branch
- Add docs/pubsub-transit-handler.md to .gitignore to keep local only
- This prevents the docs file from being included in PRs
@datadog-official
Copy link

datadog-official bot commented Sep 10, 2025

⚠️ Tests

⚠️ Warnings

🧪 3 Tests failed

express-mongo-sanitize with express-mongo-sanitize >=1.0.0 (1.0.0) middleware without subscriptions it continues working with sanitization request from without subscriptions (Datadog)
Expected values to be strictly equal:
+ actual - expected

+ 'paramvalue'
- undefined

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
+ actual - expected

+ 'paramvalue'
...
express-mongo-sanitize with express-mongo-sanitize >=1.0.0 (1.0.0) middleware with subscriptions it continues working with sanitization request from with subscriptions (Datadog)
Expected values to be strictly equal:
+ actual - expected

+ 'paramvalue'
- undefined

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
+ actual - expected

+ 'paramvalue'
...
express-mongo-sanitize with express-mongo-sanitize >=1.0.0 (1.0.0) middleware with subscriptions subscription is called with expected parameters with sanitization request from with subscriptions (Datadog)
Expected values to be strictly equal:
+ actual - expected

+ 'paramvalue'
- undefined

AssertionError [ERR_ASSERTION]: Expected values to be strictly equal:
+ actual - expected

+ 'paramvalue'
...

ℹ️ Info

❄️ No new flaky tests detected

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 80b7ddf | Docs | Datadog PR Page | Was this helpful? Give us feedback!

@nina9753
Copy link
Author

@codex review

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

_createDeliverySpan (messageData, parentContext, tracer) {
const { message, subscription, topicName, attrs } = messageData
const subscriptionName = subscription.split('/').pop() || subscription
const publishStartTime = attrs['x-dd-publish-start-time']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is x-dd-publish-start-time? It kinda looks like an HTTP header.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for catching this. It was accidentally moved to the child PR when i refactored this PR

@tlhunter
Copy link
Member

We're definitely going to want a test with this.

For example we should simulate the situation where an HTTP request is received and it contains the Google pub sub user agent and headers and makes the appropriate span. Is the intent to only have the google pub sub span or also have an HTTP span with it? And which one is the parent? That's all stuff that should be a part of the test.

Comment on lines 10 to 17
try {
const PushSubscriptionPlugin = require('../../datadog-plugin-google-cloud-pubsub/src/pubsub-push-subscription')
new PushSubscriptionPlugin(null, {}).configure({})
} catch {
// Push subscription plugin is optional
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What part of this is expected to throw?

Copy link
Author

@nina9753 nina9753 Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The plugin constructor could fail, but I can update this so users can know exactly what/why it failed but the plugin is only required for push subscriptions that have the --push-no-wrapper-write-metadata flag. I wrapped the plugin load in a try catch so even if it fails we wont lose traces just the trace context propagation

catch (e) {
  log.debug(`PushSubscriptionPlugin not loaded: ${e.message}`)
}

@nina9753
Copy link
Author

We're definitely going to want a test with this.

For example we should simulate the situation where an HTTP request is received and it contains the Google pub sub user agent and headers and makes the appropriate span. Is the intent to only have the google pub sub span or also have an HTTP span with it? And which one is the parent? That's all stuff that should be a part of the test.

I've tested many different situations, but an HTTP request will only ever include the Google PubSub headers and user agent when the request originates from PubSub with the flag as stated in Google's documentation, which means we need to handle that. I have another PR in progress to test all these changes, and I’ve already deployed the code to org 2 to verify the traces and ensure distributed traces are functioning correctly across different scenarios. I’ve specifically tested multiple frameworks and direct HTTP requests, both with and without the --push-no-wrapper-write-metadata flag, as well as all deployment methods for a Cloud Run service using Pub/Sub. I can update the test PR with more examples beyond what’s currently included in the other PRs' details.

@nina9753 nina9753 force-pushed the nina.rei/SVLS-7168/gcp-push-pubsub-plugin branch from 7862105 to 505588b Compare December 1, 2025 15:19
@nina9753 nina9753 requested a review from tlhunter December 1, 2025 15:20
Comment on lines +8 to +17
const log = require('../../dd-trace/src/log')

// Auto-load push subscription plugin to enable pubsub.delivery spans for push subscriptions
try {
const PushSubscriptionPlugin = require('../../datadog-plugin-google-cloud-pubsub/src/pubsub-push-subscription')
new PushSubscriptionPlugin(null, {}).configure({})
} catch (e) {
// Push subscription plugin is optional
log.debug(`PushSubscriptionPlugin not loaded: ${e.message}`)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like it's not the requiring of the file that can fail but only the class instantiation. For that reason let's hoist the require out of the try/catch. It's better to make try/catch super specific IMO.

Suggested change
const log = require('../../dd-trace/src/log')
// Auto-load push subscription plugin to enable pubsub.delivery spans for push subscriptions
try {
const PushSubscriptionPlugin = require('../../datadog-plugin-google-cloud-pubsub/src/pubsub-push-subscription')
new PushSubscriptionPlugin(null, {}).configure({})
} catch (e) {
// Push subscription plugin is optional
log.debug(`PushSubscriptionPlugin not loaded: ${e.message}`)
}
const log = require('../../dd-trace/src/log')
const PushSubscriptionPlugin = require('../../datadog-plugin-google-cloud-pubsub/src/pubsub-push-subscription')
// Auto-load push subscription plugin to enable pubsub.delivery spans for push subscriptions
try {
new PushSubscriptionPlugin(null, {}).configure({})
} catch (e) {
// Push subscription plugin is optional
log.debug(`PushSubscriptionPlugin not loaded: ${e.message}`)
}

if (attributes['x-datadog-trace-id'] || attributes.traceparent) return

try {
const tracer = require('../../dd-trace')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should also be hoisted to the top of the file.

const traceIdUpperBits = activeSpan.context()._trace.tags['_dd.p.tid']
if (traceIdUpperBits) attributes['_dd.p.tid'] = traceIdUpperBits
} catch {
// Silently fail - trace context injection is best-effort
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a log here as well. Otherwise it would be difficult to diagnose if the feature isn't working for customers who expect it to.

I'd also like to know if you're seeing any failures here during test suite runs. For example if the activeSpan.context()._trace.tags['_dd.p.tid'] line fails with something like "cannot access _trace of undefined" then the code might better be served looking like activeSpan.context()?._trace.tags['_dd.p.tid']`, and nothing else ever fails, then maybe we can remove the try/catch entirely.

const topicName = topic.split('/').pop() || topic
const span = this.startSpan({ // TODO: rename
resource: `${api} ${topic}`,
resource: `${api} to Topic ${topicName}`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like it could be a breaking change for customers who are expecting the old resource name.

Is there an RFC or something explaining how the new approach is the correct approach?

const messageData = this._parseMessage(req)
if (!messageData) return

const tracer = this.tracer || require('../../dd-trace')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This require call should be hoisted to the top of the file

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you finding that this.tracer is sometimes available and sometimes not? If so it might be that a test just isn't providing the tracer when it should be. Ideally we wouldn't need to access the tracer from two sources like this.

Comment on lines +91 to +95
'_dd.serviceoverride.type': 'integration'
}
})

span.setTag('resource.name', `Push Subscription ${subscriptionName}`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work?

Suggested change
'_dd.serviceoverride.type': 'integration'
}
})
span.setTag('resource.name', `Push Subscription ${subscriptionName}`)
'_dd.serviceoverride.type': 'integration',
'resource.name': `Push Subscription ${subscriptionName}`
}
})


_extractProjectTopic (attrs, subscription) {
const topicName = attrs['pubsub.topic']
const projectId = subscription.match(/projects\/([^\\/]+)\/subscriptions/)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where does the projectId get used?

Notably the String#match() method either returns an array where the [1] index is the match, or a null if there is no match.

"projects/abcde/subscriptions".match(/projects\/([^\\/]+)\/subscriptions/)
[
  'projects/abcde/subscriptions',
  'abcde',
  index: 0,
  input: 'projects/abcde/subscriptions',
  groups: undefined
]

With a name like projectId I'm assuming you would want to return the string at [1].

@tlhunter
Copy link
Member

tlhunter commented Dec 3, 2025

As discused on Zoom please add an experimental env var to enable this functionality. Here's a recent example that shows the two config files that need updating and how to access the config in code:

#6733

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants