feat(Vercel) Build cache improvements #15317

sergical · 2025-10-25T03:06:58Z

DESCRIBE YOUR PR

Caches release registry
looking into md exports

IS YOUR CHANGE URGENT?

Help us prioritize incoming PRs by letting us know when the change needs to go live.

Urgent deadline (GA date, etc.):
Other deadline:
None: Not urgent, can wait up to 1 week+

SLA

Teamwork makes the dream work, so please add a reviewer to your PRs.
Please give the docs team up to 1 week to review your PR unless you've added an urgent due date to it.
Thanks in advance for your help!

PRE-MERGE CHECKLIST

Make sure you've checked the following before merging your changes:

Checked Vercel preview for correctness, including links
PR was reviewed and approved by any necessary SMEs (subject matter experts)
PR was reviewed and approved by a member of the Sentry docs team

LEGAL BOILERPLATE

Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.

EXTRA RESOURCES

Sentry Docs contributor guide

vercel · 2025-10-25T03:07:03Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
develop-docs	Ready	Preview	Comment	Oct 29, 2025 4:02pm
sentry-docs	Ready	Preview	Comment	Oct 29, 2025 4:02pm

codecov · 2025-10-25T03:18:07Z

Bundle Report

Changes will decrease total bundle size by 460.9kB (-1.96%) ⬇️. This is within the configured threshold ✅

Detailed changes

Bundle name	Size	Change
sentry-docs-client-array-push	10.16MB	-6 bytes (-0.0%) ⬇️
sentry-docs-server-cjs	12.52MB	-460.9kB (-3.55%) ⬇️

Affected Assets, Files, and Routes:

view changes for bundle: sentry-docs-client-array-push

Assets Changed:

Asset Name	Size Change	Total Size	Change (%)
`static/chunks/pages/_app-*.js`	-3 bytes	882.71kB	-0.0%
`static/chunks/8321-*.js`	-3 bytes	425.87kB	-0.0%
`server/middleware-*.js`	6.46kB	7.46kB	645.5% ⚠️
`server/middleware-*.js`	-6.46kB	1.0kB	-86.59%
`static/fqMK9BHK1nHyXvt1MK9Ok/_buildManifest.js` (New)	684 bytes	684 bytes	100.0% 🚀
`static/fqMK9BHK1nHyXvt1MK9Ok/_ssgManifest.js` (New)	77 bytes	77 bytes	100.0% 🚀
~~`static/c-.js`~~ (Deleted)	-77 bytes	0 bytes	-100.0% 🗑️
~~`static/c-.js`~~ (Deleted)	-684 bytes	0 bytes	-100.0% 🗑️

view changes for bundle: sentry-docs-server-cjs

Assets Changed:

Asset Name	Size Change	Total Size	Change (%)
`1729.js`	-33.51kB	1.74MB	-1.89%
`../instrumentation.js`	-33.8kB	1.07MB	-3.07%
`9523.js`	-33.51kB	1.04MB	-3.11%
`../app/[[...path]]/page.js.nft.json`	-119.93kB	739.55kB	-13.95%
`../app/platform-redirect/page.js.nft.json`	-119.93kB	739.46kB	-13.96%
`../app/sitemap.xml/route.js.nft.json`	-119.93kB	736.69kB	-14.0%
`7153.js` (New)	30.3kB	30.3kB	100.0% 🚀
`9567.js`	924 bytes	23.11kB	4.17%
`../app/api/ip-ranges/route.js`	-300 bytes	5.79kB	-4.92%
`../app/robots.txt/route.js`	-300 bytes	5.02kB	-5.64%
~~`2311.js`~~ (Deleted)	-30.9kB	0 bytes	-100.0% 🗑️

Files in 9567.js:

./src/mdx.ts → Total Size: 27.86kB

App Routes Affected:

App Route	Size Change	Total Size	Change (%)
/	-600 bytes	2.81MB	-0.02%

- Use VERCEL_GIT_COMMIT_REF (branch name) in cache keys for cross-commit persistence
- Include registry data hash in cache key to detect registry updates
- Enable caching for 200+ platform-include files (previously skipped)
- Add build timing instrumentation
- Expected: 18 min → 2-3 min on first build, ~2 min on subsequent commits
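
The first two items above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `buildCacheKey`, `md5`, and the key layout are hypothetical, and only `VERCEL_GIT_COMMIT_REF` (the branch name Vercel exposes at build time) and the pairing of a source hash with a registry hash come from the discussion.

```typescript
import {createHash} from 'node:crypto';

// Hypothetical helper: hex MD5 digest of a string.
function md5(input: string): string {
  return createHash('md5').update(input).digest('hex');
}

interface CacheKeyInput {
  source: string; // raw MDX source of the page
  branch: string; // e.g. the value of VERCEL_GIT_COMMIT_REF
  registryHash?: string; // only set for files that depend on the registry
}

// Assumed layout: <branch>/<sourceHash>[-<registryHash>].
function buildCacheKey({source, branch, registryHash}: CacheKeyInput): string {
  // Branch names may contain slashes, so make them safe for use in a path.
  const safeBranch = branch.replace(/[^a-zA-Z0-9._-]/g, '_');
  const sourceHash = md5(source);
  return registryHash
    ? `${safeBranch}/${sourceHash}-${registryHash}`
    : `${safeBranch}/${sourceHash}`;
}

const currentBranch = process.env.VERCEL_GIT_COMMIT_REF ?? 'local';
console.log(
  buildCacheKey({
    source: '# Install\n',
    branch: currentBranch,
    registryHash: md5('{"apps":{}}'),
  })
);
```

Because the commit SHA is not part of the key, a later commit on the same branch maps unchanged files to the same key, while a registry update changes the key only for files that depend on the registry.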

BYK

Great time savings. That said, we should address the issues I raised either before merging or in a quick follow-up.

Also, the extra comments are mostly stating the obvious (vibe code artifacts?) and are better removed.

src/docTree.ts

BYK · 2025-10-28T19:10:54Z

src/mdx.ts

+let lastSummaryLog = Date.now();
+function logCacheSummary(force = false) {
+  const now = Date.now();
+  // Log every 30 seconds or when forced


This logic seems unnecessary? Why not just emit at the end?
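
One way to read this suggestion, as a rough sketch: keep plain counters while compiling and print them once when the process exits. The `cacheStats` object and `formatCacheSummary` helper are hypothetical names, and whether an exit hook actually fires in the build workers is an assumption that would need checking.

```typescript
// Hypothetical counters, incremented wherever the cache is consulted.
const cacheStats = {hits: 0, misses: 0};

function formatCacheSummary(stats: {hits: number; misses: number}): string {
  const total = stats.hits + stats.misses;
  const hitRate = total === 0 ? 0 : Math.round((stats.hits / total) * 100);
  return `MDX cache: ${stats.hits} hits, ${stats.misses} misses (${hitRate}% hit rate)`;
}

// Emit a single summary at the end instead of on a 30-second timer.
process.once('exit', () => {
  console.log(formatCacheSummary(cacheStats));
});
```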

BYK · 2025-10-28T19:12:02Z

src/mdx.ts

-  const skipCache =
+  // Check if file depends on Release Registry
+  const dependsOnRegistry =
    source.includes('@inject') ||


Not sure if this @inject thing was related to the registry

BYK · 2025-10-28T19:17:26Z

src/mdx.ts

+  if (cachedRegistryHash) {
+    return cachedRegistryHash;
+  }
+  const [apps, packages] = await Promise.all([getAppRegistry(), getPackageRegistry()]);


There's a race condition here: if you call this function 3 times back to back, it would make 3 separate calls.

What you need for proper caching is to change the type of cachedRegistryHash to Promise<string>, and do:

cachedRegistryHash = Promise.all(...).then(([apps, packages]) => md5(...)); return cachedRegistryHash;
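
A minimal sketch of the promise-caching pattern described above. The two registry fetchers are stubbed here so the block runs on its own; their return shapes and the JSON-based MD5 input are assumptions, not the PR's actual code.

```typescript
import {createHash} from 'node:crypto';

// Stubs standing in for the real registry fetchers (assumed shapes).
let appFetches = 0;
async function getAppRegistry(): Promise<Record<string, unknown>> {
  appFetches += 1; // counts how often the "network" is hit
  return {'sentry-cli': {version: '0.0.0'}};
}
async function getPackageRegistry(): Promise<Record<string, unknown>> {
  return {'npm:@sentry/node': {version: '0.0.0'}};
}

// Cache the in-flight promise, not the resolved string.
let cachedRegistryHash: Promise<string> | null = null;

function getRegistryHash(): Promise<string> {
  if (!cachedRegistryHash) {
    cachedRegistryHash = Promise.all([getAppRegistry(), getPackageRegistry()]).then(
      ([apps, packages]) =>
        createHash('md5').update(JSON.stringify({apps, packages})).digest('hex')
    );
  }
  return cachedRegistryHash;
}

// Three back-to-back calls share one promise and trigger one fetch.
const pending = [getRegistryHash(), getRegistryHash(), getRegistryHash()];
Promise.all(pending).then(hashes => console.log(hashes[0], appFetches));
```

The assignment happens synchronously on the first call, so later callers find the promise already in place even though the fetch has not finished. One caveat of this pattern: a rejected promise is cached as well, so every later caller sees the same failure unless the cache is reset.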

BYK · 2025-10-28T19:24:57Z

src/mdx.ts

+        // Get registry hash (cached per worker to avoid redundant fetches)
+        const registryHash = await getRegistryHash();
+        cacheKey = `${sourceHash}-${registryHash}`;
+      } catch (err) {


This logic can and probably should be improved: the only way this can throw an exception is a network-related issue. In that case, pages depending on the registry will also have a problem, so the try-catch is redundant. It's also wasteful: if it raises an exception once, it will raise one for every single page.

I'd rather add a retry mechanism to the cache key function and not handle the exception if the retries fail, halting the build, since we need the registry connection for the build.
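
A retry wrapper along these lines might look like the sketch below. `withRetry`, its attempt count, and the backoff delays are hypothetical choices; the point is only that the final failure is rethrown rather than swallowed, so the build stops.

```typescript
// Hypothetical helper: run `fn` up to `attempts` times with exponential backoff.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < attempts) {
        // Wait 500 ms, 1 s, 2 s, ... (with the default base) before retrying.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>(resolve => setTimeout(resolve, delayMs));
      }
    }
  }
  // No try-catch at the call site: let the error propagate and halt the build.
  throw lastError;
}

// Usage, assuming a getRegistryHash() like the one in src/mdx.ts:
//   const registryHash = await withRetry(() => getRegistryHash());
```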

vercel.json

scripts/generate-md-exports.mjs

+  // Remove elements that change between builds but don't affect markdown output
+  const leanHTML = rawHTML
+    // Remove all script tags (build IDs, chunk hashes, Vercel injections)
+    .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '')


The best way to fix this problem is to use a proper HTML parser to remove unwanted tags (such as <script>, <link>, and <meta>), rather than relying on regular expressions. This provides more robust handling of HTML's intricacies, such as extra whitespace, unusual attribute formatting, and invalid but tolerated browser syntax. Since the script already imports rehype-parse (for parsing HTML to a syntax tree) and other tools from the unified/rehype ecosystem, the fix can use these existing libraries.

Specifically, instead of using .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '') (and similar regex for <link> and <meta>), we should parse the HTML into an AST, programmatically remove the unwanted nodes, and then serialize the AST back to HTML for further processing. This fix should be applied within the genMDFromHTML function, replacing the leanHTML construction (lines 233–242) with parser-based routines.

No new dependencies are needed since rehype-parse, unist-util-remove, and related packages are already imported. We'll need to use unified().use(rehypeParse, {fragment: true}) to parse the HTML, use remove(tree, test) from unist-util-remove to strip undesired nodes, and a rehype serializer (e.g., rehype-stringify) to convert the AST back to HTML. If not already available, we should add a rehype-stringify import.

feat(Vercel) dont generate md exports on preview builds

91c4f27

vercel bot deployed to Preview – develop-docs October 25, 2025 03:11 View deployment

vercel bot deployed to Preview – sentry-docs October 25, 2025 03:33 View deployment

build timer

ab18bad

vercel bot deployed to Preview – develop-docs October 27, 2025 00:05 View deployment

vercel bot deployed to Preview – sentry-docs October 27, 2025 00:22 View deployment

vercel bot had a problem deploying to Preview – develop-docs October 27, 2025 15:50 Failure

vercel bot had a problem deploying to Preview – sentry-docs October 27, 2025 16:08 Failure

test release registry cache approach

45cf20c

sergical changed the title ~~feat(Vercel) dont generate md exports on preview builds~~ feat(Vercel) Build cache improvements Oct 27, 2025

vercel bot deployed to Preview – develop-docs October 27, 2025 16:55 View deployment

vercel bot deployed to Preview – sentry-docs October 27, 2025 17:12 View deployment

testing

ab63b42

vercel bot deployed to Preview – develop-docs October 27, 2025 17:40 View deployment

vercel bot deployed to Preview – sentry-docs October 27, 2025 17:57 View deployment

testing

662fd31

vercel bot deployed to Preview – develop-docs October 27, 2025 19:00 View deployment

vercel bot deployed to Preview – sentry-docs October 27, 2025 19:18 View deployment

logs

0186db0

vercel bot deployed to Preview – develop-docs October 27, 2025 21:07 View deployment

vercel bot deployed to Preview – sentry-docs October 27, 2025 21:23 View deployment

build timer

0ec2adb

vercel bot deployed to Preview – develop-docs October 28, 2025 14:23 View deployment

vercel bot deployed to Preview – sentry-docs October 28, 2025 14:38 View deployment

remove vercel build logs

9fe553c

vercel bot deployed to Preview – develop-docs October 28, 2025 15:16 View deployment

vercel bot deployed to Preview – sentry-docs October 28, 2025 15:27 View deployment

BYK approved these changes Oct 28, 2025

sergical added 2 commits October 28, 2025 16:02

address comments

a824227

remove schema declaration

a1fcd87

sergical marked this pull request as ready for review October 28, 2025 20:02

race condition

a643daa

vercel bot deployed to Preview – develop-docs October 28, 2025 20:19 View deployment

ci check

1f3b0aa

vercel bot deployed to Preview – sentry-docs October 28, 2025 20:37 View deployment

vercel bot deployed to Preview – develop-docs October 28, 2025 20:42 View deployment

vercel bot deployed to Preview – sentry-docs October 28, 2025 21:02 View deployment

sergical and others added 2 commits October 28, 2025 18:15

more workers

c2fc2f9

[getsentry/action-github-commit] Auto commit

8510963

vercel bot deployed to Preview – develop-docs October 28, 2025 22:25 View deployment

vercel bot deployed to Preview – sentry-docs October 28, 2025 22:45 View deployment

debugging cache

c0eae4b

[getsentry/action-github-commit] Auto commit

9f19285

vercel bot deployed to Preview – develop-docs October 28, 2025 23:04 View deployment

vercel bot deployed to Preview – sentry-docs October 28, 2025 23:27 View deployment

new content normalization

3357b00

github-advanced-security bot found potential problems Oct 28, 2025

vercel bot deployed to Preview – develop-docs October 28, 2025 23:34 View deployment

vercel bot temporarily deployed to Preview – sentry-docs October 29, 2025 00:00 Inactive

maybe cache?

485c80c

vercel bot deployed to Preview – develop-docs October 29, 2025 00:43 View deployment

vercel bot deployed to Preview – sentry-docs October 29, 2025 01:09 View deployment

@@ -26,6 +26,7 @@
             import remarkStringify from 'remark-stringify';
             import {unified} from 'unified';
             import {remove} from 'unist-util-remove';
+            import rehypeStringify from 'rehype-stringify';
             const DOCS_ORIGIN = 'https://docs.sentry.io';
             const CACHE_VERSION = 3;
@@ -230,17 +231,44 @@
               // Normalize HTML to make cache keys deterministic across builds
               // Remove elements that change between builds but don't affect markdown output
-              const leanHTML = rawHTML
-                // Remove all script tags (build IDs, chunk hashes, Vercel injections)
-                .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, '')
-                // Remove link tags for stylesheets and preloads (chunk hashes change)
-                .replace(/<link[^>]*>/gi, '')
-                // Remove meta tags that might have build-specific content
-                .replace(/<meta name="next-size-adjust"[^>]*>/gi, '')
-                // Remove data attributes that Next.js/Vercel add (build IDs, etc.)
-                .replace(/\s+data-next-[a-z-]+="[^"]*"/gi, '')
-                .replace(/\s+data-nextjs-[a-z-]+="[^"]*"/gi, '');
+              // Remove all <script>, <link>, and next-size-adjust <meta> tags, as well as data-* attributes, using an HTML parser.
+              const parsedHtmlTree = unified()
+                .use(rehypeParse, {fragment: true})
+                .parse(rawHTML);
+              // Remove unwanted elements using unist-util-remove
+              // Remove <script> tags
+              remove(parsedHtmlTree, (node) => node.type === 'element' && node.tagName === 'script');
+              // Remove <link> tags
+              remove(parsedHtmlTree, (node) => node.type === 'element' && node.tagName === 'link');
+              // Remove <meta name="next-size-adjust" ...>
+              remove(parsedHtmlTree, (node) =>
+                node.type === 'element' &&
+                node.tagName === 'meta' &&
+                node.properties &&
+                node.properties.name === 'next-size-adjust'
+              );
+              // Remove data-next-* and data-nextjs-* attributes from all elements.
+              // hast stores data-* attributes under camelCased property names
+              // (data-next-foo becomes dataNextFoo), so match on that form.
+              function cleanseDataAttrs(node) {
+                if (node && node.type === 'element' && node.properties) {
+                  Object.keys(node.properties).forEach((key) => {
+                    if (/^dataNext(js)?[A-Z]/.test(key)) {
+                      delete node.properties[key];
+                    }
+                  });
+                }
+                if (node.children) {
+                  node.children.forEach(cleanseDataAttrs);
+                }
+              }
+              cleanseDataAttrs(parsedHtmlTree);
+              // Convert AST back to HTML
+              const leanHTML = unified()
+                .use(() => (tree) => tree) // identity plugin since tree already processed
+                .use(rehypeStringify)
+                .stringify(parsedHtmlTree);
               if (shouldDebug) {
                 console.log(
                   `✂️  Lean HTML length: ${leanHTML.length} chars (removed ${rawHTML.length - leanHTML.length} chars)`
