
feat: Obfuscate potential PII in logs #1163

Open
alexs-mparticle wants to merge 2 commits into development from feat/SDKE-889-logger-obfuscate-data


@alexs-mparticle
Collaborator

Background

The Logger currently outputs raw batch and event payloads at various log levels (error, warning, verbose). These payloads can include Personally Identifiable Information (PII)—data that can identify or be reasonably linked to an individual, such as email addresses, phone numbers, names, user IDs, IP addresses, or other user-level identifiers.

What Has Changed

  • Adds obfuscateData method to be used when logging payloads
  • Updates logger calls to use the obfuscateData method when passing in payloads that may contain PII
  • Updates identity calls to target known_identities payload for obfuscation, while allowing the rest of the identity payload to be visible for debugging
  • Removes Logger from Vault to avoid spamming verbose logs

Screenshots/Video

  • {Include any screenshots or video demonstrating the new feature or fix, if applicable}

Checklist

  • I have performed a self-review of my own code.
  • I have made corresponding changes to the documentation.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have tested this locally.

Additional Notes

  • {Any additional information or context relevant to this PR}

Reference Issue (For employees only. Ignore if you are an outside contributor)

Copilot AI left a comment

Pull request overview

This PR reduces the risk of leaking PII by obfuscating logged payloads (events/batches/attributes/identity responses), and reduces verbose logging noise by removing Vault logging.

Changes:

  • Added obfuscateData utility (with Jest coverage) to replace primitive values with type strings while preserving object/array structure.
  • Updated verbose logger call sites (BatchUploader, RoktManager, IdentityAPIClient) to log obfuscated payloads instead of raw data.
  • Removed Logger plumbing from Vault usage to avoid verbose log spam.
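
As a rough illustration of the behavior described in the first bullet, the following is a minimal sketch of a helper that swaps primitive values for their type names while keeping the object and array shape. It is not the PR's code: the name obfuscateDataSketch is hypothetical, and passing null and undefined through unchanged is an assumption, since the actual obfuscateData in src/utils.ts is not shown in this conversation.

```typescript
// Hypothetical sketch only; the real obfuscateData in src/utils.ts may differ.
function obfuscateDataSketch(value: unknown): unknown {
    // Assumption: null and undefined carry no PII, so they pass through unchanged.
    if (value === null || value === undefined) {
        return value;
    }
    // Arrays keep their length and order; each element is obfuscated recursively.
    if (Array.isArray(value)) {
        return value.map((item) => obfuscateDataSketch(item));
    }
    // Plain objects keep their keys; each value is obfuscated recursively.
    if (typeof value === 'object') {
        const source = value as Record<string, unknown>;
        const result: Record<string, unknown> = {};
        for (const key of Object.keys(source)) {
            result[key] = obfuscateDataSketch(source[key]);
        }
        return result;
    }
    // Primitives are replaced by their type name, e.g. 'string' or 'number'.
    return typeof value;
}

// Made-up payload for illustration.
const samplePayload = { email: 'user@example.com', age: 42, tags: ['vip'], optedIn: true };
console.log(JSON.stringify(obfuscateDataSketch(samplePayload)));
// {"email":"string","age":"number","tags":["string"],"optedIn":"boolean"}
```

Note that keys are left readable in this sketch, so key names that themselves contain PII would still appear in the logs.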

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/utils.ts Adds obfuscateData helper and exports it for use by logging call sites.
test/jest/utils.spec.ts Adds comprehensive Jest tests for obfuscateData.
src/batchUploader.ts Obfuscates queued event and upload batch payloads before verbose logging; removes Vault logger injection.
src/roktManager.ts Obfuscates selectPlacements attributes in verbose logs; improves identify error formatting.
test/jest/roktManager.spec.ts Updates expectation to match obfuscated verbose logging output.
src/identityApiClient.ts Obfuscates identity response matched_identities in verbose logs.
src/vault.ts Removes Logger usage from Vault internals and changes storage write error handling.
src/mp-instance.ts Removes Vault logger injection for identity cache; adds TS type annotation for setOptOut arg.


`Cannot Save items to Storage: ${stringifiedItem}`
);
this.logger.error(error as string);
throw new Error('Cannot Save items to Storage');
Copilot AI Feb 10, 2026

BaseVault.store now throws on setItem failures. Callers (e.g., event/batch persistence) don’t catch this, so quota/security errors will bubble up and can break normal SDK flows (queueing events, uploading, etc.). Previously this was non-fatal; consider restoring a non-throwing behavior (or returning a success flag) and logging a sanitized message (avoid including the stored payload).

Suggested change
- throw new Error('Cannot Save items to Storage');
+ // Swallow storage errors to avoid breaking normal SDK flows.
+ // Log a sanitized message without including the stored payload.
+ try {
+     const errorMessage =
+         error && typeof (error as Error).message === 'string'
+             ? (error as Error).message
+             : String(error);
+     if (typeof console !== 'undefined' && typeof console.error === 'function') {
+         console.error(
+             `BaseVault.store: failed to persist data for key "${this._storageKey}".`,
+             errorMessage
+         );
+     }
+ } catch {
+     // If logging itself fails, do nothing to maintain non-throwing behavior.
+ }

  } catch (error) {
-     this.logger.error('Failed to identify user with new email: ' + JSON.stringify(error));
+     const errorMessage = error instanceof Error ? error.message : JSON.stringify(error);
+     this.logger.error('Failed to identify user with new email: ' + errorMessage);
Copilot AI Feb 10, 2026

This catch block can be reached when only the hashed email identity changes (not the email). The error message says "new email", which is misleading and makes debugging harder. Consider making the message identity-agnostic (e.g., "Failed to identify user with updated identities").

Suggested change
- this.logger.error('Failed to identify user with new email: ' + errorMessage);
+ this.logger.error('Failed to identify user with updated identities: ' + errorMessage);

Comment on lines +280 to +283
const responseText = identityResponse.responseText;
const { matched_identities, ...rest } = responseText || {};
const obfuscatedMatchedIdentities = obfuscateData(matched_identities);

Copilot AI Feb 10, 2026

PR description mentions obfuscating the known_identities payload, but this change obfuscates matched_identities in the response log. Please confirm the intended field(s) to obfuscate and update either the implementation or the PR description for consistency.
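
To make the question concrete, here is a small sketch of partial obfuscation in which the target field is a parameter, so the same helper could cover matched_identities in a response or known_identities in a request. Everything in it is an assumption for illustration: maskValue is a stand-in for the PR's obfuscateData, maskField is a hypothetical name, and the mpid key and sample values are made up.

```typescript
// Hypothetical stand-in for obfuscateData: replaces primitives with their type names.
function maskValue(value: unknown): unknown {
    if (value === null || value === undefined) return value;
    if (Array.isArray(value)) return value.map((item) => maskValue(item));
    if (typeof value === 'object') {
        const source = value as Record<string, unknown>;
        const masked: Record<string, unknown> = {};
        for (const key of Object.keys(source)) {
            masked[key] = maskValue(source[key]);
        }
        return masked;
    }
    return typeof value;
}

// Masks one named field and leaves the rest of the payload readable for debugging.
function maskField(
    payload: Record<string, unknown> | null | undefined,
    field: string
): Record<string, unknown> {
    const source: Record<string, unknown> = payload || {};
    // If the field is absent, return a shallow copy without adding the key.
    if (!(field in source)) return { ...source };
    return { ...source, [field]: maskValue(source[field]) };
}

// Made-up response shape; only the field name matched_identities comes from the diff.
const sampleResponse = { mpid: '123', matched_identities: { email: 'user@example.com' } };
console.log(JSON.stringify(maskField(sampleResponse, 'matched_identities')));
// {"mpid":"123","matched_identities":{"email":"string"}}
```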

Comment on lines 23 to 27
      this._storageKey = storageKey;
      this.storageObject = storageObject;

-     // Add a fake logger in case one is not provided or needed
-     this.logger = options?.logger || {
-         verbose: () => {},
-         warning: () => {},
-         error: () => {},
-     };
-
      this.contents = this.retrieve();
  }
Copilot AI Feb 10, 2026

options is still accepted by BaseVault/LocalStorageVault/SessionStorageVault, but it’s no longer used anywhere in the constructor. If the repo’s TS/GTS rules enforce unused-parameter checks, this may fail lint/typecheck. Consider removing the parameter/interface or renaming to _options until it’s needed.

@alexs-mparticle alexs-mparticle marked this pull request as ready for review February 10, 2026 19:46
Contributor

@jaissica12 jaissica12 left a comment

@alexs-mparticle can we pull the latest changes from the development branch? I recall exposing the processMessageQueue method in my last PR, but it still appears to be private here. We may also need to obfuscate the logged data in that section.

  const sandboxValue = attributes?.sandbox || null;
  const mappedAttributes = this.mapPlacementAttributes(attributes, this.placementAttributesMapping);
- this.logger?.verbose(`mParticle.Rokt selectPlacements called with attributes:\n${JSON.stringify(attributes, null, 2)}`);
+ this.logger?.verbose(`mParticle.Rokt selectPlacements called with attributes:\n${JSON.stringify(obfuscateData(attributes), null, 2)}`);
Member

This was added so our support staff can easily see what a customer initially passed to selectPlacements, before we enrich the attributes with other MP-related items. We may need to remove this one temporarily, or provide another logLevel, like debug, but that will require an additional audit to decide what belongs in debug vs. verbose.

@rmi22186
Member

rmi22186 commented Feb 14, 2026

Perhaps we need to internally define what verbose is used for. Do we want a separate debug level so that, in a customer implementation, support staff and the customer know what's going on? Our docs just say verbose "Communicates the internal state and processes of the SDK (includes info, warnings, and errors)."

Or perhaps verbose prevents PII from showing, and if they choose debug, it maps to verbose but doesn't call obfuscate.
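
One way to picture this proposal is the sketch below, where a debug level reuses the verbose formatting but skips obfuscation. It is only an illustration under stated assumptions: the SDK has no debug level in this PR, and LogLevelSketch, toTypeNames, and formatPayloadLine are hypothetical names rather than existing Logger APIs.

```typescript
// Hypothetical log levels; 'debug' does not exist in the SDK as of this discussion.
type LogLevelSketch = 'verbose' | 'debug';

// Stand-in for obfuscateData: replaces primitives with their type names.
function toTypeNames(value: unknown): unknown {
    if (value === null || value === undefined) return value;
    if (Array.isArray(value)) return value.map((item) => toTypeNames(item));
    if (typeof value === 'object') {
        const source = value as Record<string, unknown>;
        const result: Record<string, unknown> = {};
        for (const key of Object.keys(source)) {
            result[key] = toTypeNames(source[key]);
        }
        return result;
    }
    return typeof value;
}

// Builds the log line: verbose hides values, debug shows the raw payload.
function formatPayloadLine(level: LogLevelSketch, message: string, payload: unknown): string {
    const shown = level === 'debug' ? payload : toTypeNames(payload);
    return `${message}\n${JSON.stringify(shown, null, 2)}`;
}

// Made-up attributes for illustration.
const sampleAttributes = { email: 'user@example.com' };
console.log(formatPayloadLine('verbose', 'selectPlacements called with attributes:', sampleAttributes));
console.log(formatPayloadLine('debug', 'selectPlacements called with attributes:', sampleAttributes));
```

With this split, the default verbose output stays free of raw values, while support staff could opt in to debug when they need to see exactly what a customer passed in.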

@sonarqubecloud

Quality Gate failed

Failed conditions
B Maintainability Rating on New Code (required ≥ A)



3 participants