Skip to content

Conversation

@ambushwork
Copy link
Member

What does this PR do?

This PRs fixes the resource duration breakdown feature which is broken since the version 2.12.0, more precisely from this commit.

The main cause of issue:

We have DatadogInteceptor generating the RUM event startResource with the ResourceId(requestUrl, randomUUID) as the key of the resource when intercepting a request, also we have DatadogEventListener to add waitForResourceTiming event and addResourceTiming to report the resource timing for this duration breakdown.

The issue is that, both DatadogInteceptor and DatadogEventListener are generating their own UUID for the same request, so in our event processing they won’t be able to match.

The fix is to remove the generation of UUID in DatadogEventListener since its key is only used for enhancing the current resource.

Motivation

RUM-12891

Demo

Before image
After image

Review checklist (to be filled by reviewers)

  • Feature or bugfix MUST have appropriate tests (unit, integration, e2e)
  • Make sure you discussed the feature or bugfix with the maintaining team in an Issue
  • Make sure each commit and the PR mention the Issue number (cf the CONTRIBUTING doc)

@ambushwork ambushwork requested review from a team as code owners November 18, 2025 09:33
@datadog-datadog-prod-us1

This comment has been minimized.

/** @inheritdoc */
override fun create(call: Call): EventListener {
val resourceId = call.request().buildResourceId(generateUuid = true)
val resourceId = call.request().buildResourceId(generateUuid = false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this fixes the issue given that under the hood it will first try to pull existing UUID from the Request.tag, which should already be there if call to interceptor was made before.

Can you please give more details?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's because the UUID we generate from DatadogIntercptor is not put inside Request.tag, so here if we try to retrieve tag(UUID::class.java) it will always return null. The UUID is only used as a RUM resource key.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's because the UUID we generate from DatadogInterceptor is not put inside Request.tag

This is probably a real cause of the issue, it should be there and it is a miss from our side.

Basically we always should rely on the UUID, because the legacy way to identify the request is not reliable: say there are multiple GET calls in parallel with the same URL - they will be identified as the same call. Same for multiple POST calls with the same body length.

So when we generate UUID we should put it to the request as tag as well.

And we should add the tests to the reliability/single-fit-okhttp that for a call made with instrumentation setup startResource / stopResource/waitForResourceTiming/addResourceTiming all have the same key argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see the issue, then this is a bit tricky to handle with our current implementation, because create in DatadogEventListener will be called first, the call is immutable inside, meaning that we are not able to pass our resource id along with the request to DatadogInterceptor

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the same time we have an access to the Call object in all the callbacks of EventListener, meaning we can read the actual state of request to get the ID in each callback instead of passing it at the DatadogEventListener constructor, probably it will still get consistent results for the same call.

I'm not sure at which point callStart is called, but if it is after intercept, then we should be good.

Overall, we shouldn't indeed mutate anything here as per the following comment.

Copy link
Member Author

@ambushwork ambushwork Nov 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

callStart also happens before intercept, and other functions are not systematically called after intercept neither, we will need to dig another way out.

@codecov-commenter
Copy link

codecov-commenter commented Nov 18, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 71.12%. Comparing base (bc35e34) to head (2e3e915).
⚠️ Report is 7 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3010      +/-   ##
===========================================
+ Coverage    71.08%   71.12%   +0.03%     
===========================================
  Files          859      859              
  Lines        31315    31316       +1     
  Branches      5276     5276              
===========================================
+ Hits         22260    22271      +11     
+ Misses        7552     7547       -5     
+ Partials      1503     1498       -5     
Files with missing lines Coverage Δ
...com/datadog/android/okhttp/DatadogEventListener.kt 96.05% <100.00%> (-0.10%) ⬇️

... and 38 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants