@adiati98
Owner

Problem

Contributions that belong to multiple categories were not persisting correctly across runs:

  • PRs both reviewed and co-authored would appear in only one category on subsequent runs
  • Co-authored status was not detected when commits were pushed after the initial fetch
  • Unintended duplicates appeared across categories that should be mutually exclusive

Root Causes & Solutions

1. Stale Commit Cache (Turn 2)

Issue: getFirstCommitDetails() cached "no commits found" (null) keyed only by PR URL. On re-fetch, the stale null entry prevented re-checking when new commits were pushed later.

Solution (scripts/github-api-fetchers.js):

  • Accept prUpdatedAt parameter in getFirstCommitDetails()
  • Use composite cache key: PR URL + PR updated_at timestamp
  • Only reuse cached results if prUpdatedAt matches current PR updated_at
  • Return explicit objects { firstCommitDate: null, commitCount: 0, prUpdatedAt } instead of null to distinguish "checked and found nothing" from "not checked"
  • Pass pr.updated_at / item.updated_at to all call sites
  • Update conditions from if (commitDetails) to if (commitDetails && commitDetails.firstCommitDate) so that cached "checked and found nothing" objects are not treated as found commits
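
The caching steps above can be sketched as follows. This is a minimal illustration, not the code in scripts/github-api-fetchers.js: the in-memory Map, the cacheKey helper, and the injected fetchCommits function are assumptions made so the example runs on its own.

```javascript
// Hypothetical in-memory cache; the real script may store its cache differently.
const commitCache = new Map();

// Composite key: the same PR with a newer updated_at produces a different key.
function cacheKey(prUrl, prUpdatedAt) {
  return `${prUrl}|${prUpdatedAt}`;
}

// fetchCommits is an assumed injected async function that returns an array of
// { date } objects for the PR, oldest first.
async function getFirstCommitDetails(prUrl, prUpdatedAt, fetchCommits) {
  const cached = commitCache.get(cacheKey(prUrl, prUpdatedAt));

  // Reuse the cached entry only if it was recorded for this exact updated_at.
  if (cached && cached.prUpdatedAt === prUpdatedAt) {
    return cached;
  }

  const commits = await fetchCommits(prUrl);

  // Always return an object, so "checked and found nothing" is distinguishable
  // from "not checked yet" (a cache miss).
  const details = {
    firstCommitDate: commits.length > 0 ? commits[0].date : null,
    commitCount: commits.length,
    prUpdatedAt,
  };
  commitCache.set(cacheKey(prUrl, prUpdatedAt), details);
  return details;
}
```

Callers then guard with if (commitDetails && commitDetails.firstCommitDate), since an empty result is now a truthy object.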

2. Multi-Category Persistence & Deduplication (Turn 2-3)

Issue: Merge logic either failed to preserve multi-category membership or applied incorrect deduplication rules.

Solution (scripts/main.js):

  • Enforce category hierarchy: pullRequests > reviewedPrs/coAuthoredPrs > collaborations
  • Use globalLoadedBy Map to track which categories each URL belongs to
  • During load: Allow URL in multiple categories only if all are in the higher tier (reviewedPrs + coAuthoredPrs)
  • During merge:
    • When adding to higher tier and URL exists in lower tier, remove from lower tier (promotion)
    • When adding to lower tier and URL exists in higher tier, skip (no demotion)
  • Result: PRs correctly appear in all applicable higher-tier categories; no unwanted duplicates in lower tiers
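
A rough sketch of the promotion and no-demotion rules is shown below. It is not the code in scripts/main.js: the numeric TIER ranks are an assumption read off the order of the hierarchy, and the addToCategory helper and the data shape (one Map of URL to item per category) are hypothetical. The loadedBy argument plays the role of the globalLoadedBy Map.

```javascript
// Assumed ranks (higher number = higher tier). reviewedPrs and coAuthoredPrs
// share a rank, so they are the only pair that may hold the same URL.
const TIER = { pullRequests: 3, reviewedPrs: 2, coAuthoredPrs: 2, collaborations: 1 };

// data:     { [category]: Map<url, item> }   (hypothetical shape)
// loadedBy: Map<url, Set<category>>          (stands in for globalLoadedBy)
// Returns true if the item was stored, false if it was skipped.
function addToCategory(data, loadedBy, category, item) {
  const existing = loadedBy.get(item.url) ?? new Set();
  const rank = TIER[category];

  // No demotion: skip if the URL already lives in a higher tier.
  for (const other of existing) {
    if (TIER[other] > rank) return false;
  }

  // Promotion: remove the URL from every lower tier before adding it here.
  for (const other of [...existing]) {
    if (TIER[other] < rank) {
      data[other].delete(item.url);
      existing.delete(other);
    }
  }

  data[category].set(item.url, item);
  existing.add(category);
  loadedBy.set(item.url, existing);
  return true;
}
```

With these ranks, a PR first stored in collaborations moves out of it as soon as it is added to reviewedPrs, can then also be added to coAuthoredPrs, and a later attempt to add it back to collaborations is skipped.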

Validation

Test Case 1: PR #433 (mautic/user-documentation#433)

  • Initially: Reviewed only (reviewedPrs)
  • Later: New commits pushed, now also co-authored
  • After fix: Appears in both reviewedPrs ✅ and coAuthoredPrs
  • Across runs: Persists in both categories ✅

Test Case 2: Lost Collaborations Recovered
Three collaborations (#115, #292, #111) that disappeared during earlier iterations:

  • PR #115 → coAuthoredPrs
  • PR #292 → coAuthoredPrs
  • PR #111 → coAuthoredPrs

Test Case 3: No Unwanted Duplicates
Verified no cross-category duplicates exist between:

  • pullRequests ↔ reviewedPrs
  • pullRequests ↔ collaborations
  • issues ↔ any other category ✅
  • collaborations ↔ reviewedPrs/coAuthoredPrs
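
A check along these lines could be scripted as below. This is only a sketch: the findCrossCategoryDuplicates helper and the data shape (one array of { url } items per category) are hypothetical and not taken from the repository.

```javascript
// The only pair of categories allowed to share a URL.
const ALLOWED_TOGETHER = new Set(["reviewedPrs", "coAuthoredPrs"]);

// data: { [category]: Array<{ url: string }> }   (hypothetical shape)
// Returns every URL that appears in a disallowed combination of categories.
function findCrossCategoryDuplicates(data) {
  const seen = new Map(); // url -> Set of categories containing it

  for (const [category, items] of Object.entries(data)) {
    for (const { url } of items) {
      if (!seen.has(url)) seen.set(url, new Set());
      seen.get(url).add(category);
    }
  }

  const duplicates = [];
  for (const [url, categories] of seen) {
    if (categories.size < 2) continue;
    const allowed = [...categories].every((c) => ALLOWED_TOGETHER.has(c));
    if (!allowed) duplicates.push({ url, categories: [...categories] });
  }
  return duplicates;
}
```

An empty result means no URL is shared between categories, other than the permitted reviewedPrs and coAuthoredPrs overlap.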

Files Changed

  • scripts/github-api-fetchers.js: Updated getFirstCommitDetails() to use prUpdatedAt in cache key
  • scripts/main.js: Replaced deduplication logic with category hierarchy enforcement

* fix: prs in collaborations move to co-authored and/or reviewed prs when user review or co-authored it
@adiati98 adiati98 merged commit cc789dd into main Nov 13, 2025
@adiati98 adiati98 deleted the fix/merging-data-logic branch November 22, 2025 08:18