Optimizing build cache step #1395
Conversation
Vercel Previews Deployed
@williamdalessandro you are not going to see any changes in the workflow because of special security rules around the
For example, notice in your latest commit, even though you removed the
You will need to duplicate the GHA workflow file with a trigger of just
I delved deeper into our build process and everything else we're doing, but the biggest time sink comes from our interaction with Vercel when we hand data off to them. At this point we're already handling that as well as we can without doing large structural reorganizations. Answers for why certain parts of the build process are slow, or appear to be slow:
  asyncMapFn,
- { batchSize = 16, loggingEnabled = true } = {
-   batchSize: 16,
+ { batchSize = 64, loggingEnabled = true } = {
Return to testing whether this is dependent on the local machine's thread count? Could this cause CPU thread-switching thrash?
That's the only unknown here, but I really don't know. 🤷🏻
Since it's Node and it's batching promises, there's inherent protection against that, because it's limited to 4 threads based on the UV_THREADPOOL_SIZE parameter. So it's always handling 4 things at a time; we'd just be handing it a bigger batch, which is a bit more efficient, but we're also not going so crazy with the batch size that we hit other issues like memory pressure. So it should be fine to stay.
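For context, a batched async map in this style awaits each batch of promises before queuing the next one. The sketch below is illustrative only and assumes a shape like the asyncMapFn/batchSize signature in the diff above, not the repo's actual implementation:

```js
// Illustrative sketch of batched promise mapping (not the repo's actual code).
// Only `batchSize` calls are in flight at once; libuv's thread pool
// (UV_THREADPOOL_SIZE, default 4) still caps how many native I/O operations
// run concurrently underneath those promises.
async function batchedAsyncMap(items, asyncMapFn, { batchSize = 64 } = {}) {
  const results = []
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize)
    // Wait for the whole batch to settle before starting the next batch.
    results.push(...(await Promise.all(batch.map((item) => asyncMapFn(item)))))
  }
  return results
}
```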
RubenSandwich left a comment
🚢
While looking for ways to optimize and shave time off our builds, I was brought back to this ticket. This caching step didn't make any sense to me, so I delved into it.
Essentially, this step was creating a cache key whose name was a mix of a hash of our package-lock file and a hash of all of our (70k+) mdx files (which would take 2-3 minutes to compute). This key was used to restore a cached version of the content in ~/.npm and /.next/cache during the GitHub runner process. The content in those folders is just downloaded packages, compiled js/ts files, and other Next.js optimization artifacts.
If a single mdx file changed, it would invalidate the cache (even though the cached content was still perfectly good to use) and cause us to run npm install again, so we never really got the proper benefit of this cache in the first place.
This PR simplifies and narrows the scope of the files whose changes we actually care about for cache invalidation. It includes changes in both the build-preview process and the deploy-udr step.
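For reference, a narrowed cache step along these lines could look like the sketch below, keying only off the lockfile via actions/cache. The step name, paths, key format, and action version here are assumptions for illustration, not the exact workflow changes in this PR:

```yaml
# Illustrative sketch only -- the real workflow's step names, paths, and key may differ.
- name: Restore npm and Next.js build caches
  uses: actions/cache@v4
  with:
    path: |
      ~/.npm
      .next/cache
    # Key off package-lock.json only, instead of also hashing 70k+ mdx files,
    # so content-only edits no longer invalidate the cache.
    key: ${{ runner.os }}-build-cache-${{ hashFiles('package-lock.json') }}
    restore-keys: |
      ${{ runner.os }}-build-cache-
```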