
feat: experimental Cross-Origin Storage cache backend#1549

Open
nico-martin wants to merge 5 commits into main from feat/cross-origin-storage-experiment

Conversation

@nico-martin
Collaborator

Adds opt-in support for the Cross-Origin Storage API as a cache backend, allowing model weights to be shared across origins so users only download a given model once regardless of which site requests it.

Changes

  • CrossOriginStorageCache.js: new CrossOriginStorage class implementing CacheInterface. Resolves the SHA-256 hash of a Hugging Face resource via its raw Git LFS pointer file and uses that hash as the key for navigator.crossOriginStorage. Hash lookups use a network-first strategy with a Cache API fallback (experimental_transformers-hash-cache) so lookups continue to work when the user is offline.
  • env.js: adds experimental_useCrossOriginStorage (default false) to TransformersEnvironment. The experimental_ prefix is documented to signal that the underlying browser API is not yet standardised and may change without a major version bump.
  • cache.js: wires CrossOriginStorage into getCache(), checked after useCustomCache and before useBrowserCache.

Usage

import { env } from '@huggingface/transformers';
env.experimental_useCrossOriginStorage = true;

Known limitation: only LFS-tracked files are cached via COS

Cross-origin storage keys files by SHA-256 hash. The implementation resolves that hash by fetching the raw Git LFS pointer file (e.g. /raw/main/onnx/model.onnx) and extracting the oid sha256: field. This only works for files stored in Git LFS (typically the ONNX model weights). Smaller files such as config.json, tokenizer.json, and tokenizer_config.json are stored directly in git and have no LFS pointer, so no hash can be resolved and those files are not cached at all. They will be re-downloaded on every load.
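For illustration, extracting the hash from a raw LFS pointer can be sketched as follows (parseLfsPointerHash is a hypothetical standalone helper, not the PR's actual _getFileHash):

```javascript
// A raw Git LFS pointer file looks like:
//   version https://git-lfs.github.com/spec/v1
//   oid sha256:<64 hex chars>
//   size <bytes>
// This helper extracts the SHA-256 hash, returning null for
// non-LFS files (which have no pointer and thus no resolvable hash).
function parseLfsPointerHash(pointerText) {
  const match = /^oid sha256:([0-9a-f]{64})$/m.exec(pointerText);
  return match ? match[1] : null;
}
```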

Open question: should CrossOriginStorage fall back to the browser Cache API for files where no hash can be resolved (non-LFS files, or any other case where _getFileHash returns null)? This would give non-LFS files the same caching semantics they have today, while LFS files get the cross-origin sharing benefit. The trade-off is that CrossOriginStorage would then silently mix two storage backends, which may be surprising. @tomayac Feedback welcome.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@tomayac

tomayac commented Mar 2, 2026

I'm very excited about this PR! Thanks a lot 🤗! Here's some feedback and general thoughts:

  • While we currently use SHA-256 in practice, I'd not hardcode (at least not too hard) this assumption into the code. What you have now is fine, it's mostly just a "heads up" for the future.
  • Cross-Origin Storage (COS) should be a progressive enhancement. Ideally, it's not something the developer needs to enable, but something the library automatically uses if support is detected. This is to maximize the likelihood of cache matches. End users with COS enabled (i.e., at the current state, the extension installed) would transparently use it, end users without would not.
  • The instructions (and error message) for now should maybe point developers at the extension.
  • For files where no hash is known, I'd maybe recommend you run them through the _getBlobHash() function and calculate the hash, and then store the mapping, so you have all model resources in one place if COS is used. (In the long-term, would it be possible to support SHA-256 for Hugging Face's git? It currently still defaults to SHA-1 [run git rev-parse --show-object-format in any HF repo to verify]. This way, it might be easier to make all hashes available programmatically.)

@nico-martin
Collaborator Author

Thank you very much for your answers!

Cross-Origin Storage (COS) should be a progressive enhancement. Ideally, it's not something the developer needs to enable, but something the library automatically uses if support is detected. This is to maximize the likelihood of cache matches. End users with COS enabled (i.e., at the current state, the extension installed) would transparently use it, end users without would not.

I think you're mainly concerned with the env.experimental_useCrossOriginStorage variable, right?
I agree with you that it could easily be introduced under the hood as a progressive enhancement. However, I would only do that at a later stage. At present, neither the standard nor our implementation is final. Therefore, I would prefer that developers consciously opt in to this and only do so in controlled test environments.
At a later date, I can imagine an env.experimental_useCrossOriginStorageIfAvailable flag. In other words, a progressive enhancement as you suggest. Once it has been battle-tested, we can remove the flag. Does that make sense to you?

For files where no hash is known, I'd maybe recommend you run them through the _getBlobHash() function and calculate the hash, and then store the mapping, so you have all model resources in one place if COS is used.

Not sure if I get that right. So if I want to calculate the hash from the blob, I still need to download the whole file. So I would still have to download the whole file before I can evaluate if it's already stored in the COS. And then when I store the hash and the URL, then I would still store that in let's say the local storage, which means the URL->Hash Map is stored per origin so each origin first has to download the file before it knows if it can use the COS. Or have I misunderstood something?

@tomayac

tomayac commented Mar 2, 2026

I think you're mainly concerned with the env.experimental_useCrossOriginStorage variable, right? I agree with you that it could easily be introduced under the hood as a progressive enhancement. However, I would only do that at a later stage. At present, neither the standard nor our implementation is final. Therefore, I would prefer that developers consciously opt in to this and only do so in controlled test environments. At a later date, I can imagine an env.experimental_useCrossOriginStorageIfAvailable flag. In other words, a progressive enhancement as you suggest. Once it has been battle-tested, we can remove the flag. Does that make sense to you?

Just to be sure I understand how you envision this at the current stage: developers who would opt in would not be expected to release apps in the wild with this flag set, as people without the extension would effectively not have a cached experience?

If so, I'd hope to convince you to rethink this as progressive enhancement from the start. This shouldn't be thought of as its own Transformers.js caching backend, but as a thing to try first before falling back to Cache API before falling back to network. This way developers can see the immediate benefit more clearly. I fully reckon that it's early stages, but I'd argue installing the extension is effectively the current way to opt in. What do you think?

Not sure if I get that right. So if I want to calculate the hash from the blob, I still need to download the whole file. So I would still have to download the whole file before I can evaluate if it's already stored in the COS. And then when I store the hash and the URL, then I would still store that in let's say the local storage, which means the URL->Hash Map is stored per origin so each origin first has to download the file before it knows if it can use the COS. Or have I misunderstood something?

This is what I meant, yes, mostly to make it possible to keep all resources together in the same storage and to be future proof. As I wrote, the eventual solution would be for all resources to have their hash known.

@nico-martin
Collaborator Author

Just to be sure I understand how you envision this at the current stage: developers who would opt in would not be expected to release apps in the wild with this flag set, as people without the extension would effectively not have a cached experience?

As it was integrated until now, yes. But I have adapted it. CrossOriginStorage now has a per-request fallback if the hash is not found. In addition, the opt-in now works such that a dev can activate it via env.experimental_useCrossOriginStorage and it is then only used if COS is available. So as a progressive enhancement.
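As a rough sketch of that per-request fallback (the names and the dependency-injected shape are illustrative, not the PR's actual code; availability detection for navigator.crossOriginStorage is assumed to happen before this backend is selected):

```javascript
// Try Cross-Origin Storage when a hash can be resolved for the URL,
// otherwise fall back to the regular Cache API backend. Dependencies
// are passed in so the logic is easy to test in isolation.
async function matchWithFallback(url, { getFileHash, cosMatch, cacheMatch }) {
  const hash = await getFileHash(url); // null for non-LFS files
  if (hash) {
    const hit = await cosMatch(hash);
    if (hit) return hit; // shared cross-origin cache hit
  }
  // Same semantics non-LFS files have today via the Cache API.
  return cacheMatch(url);
}
```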

This is what I meant, yes, mostly to make it possible to keep all resources together in the same storage and to be future proof. As I wrote, the eventual solution would be for all resources to have their hash known.

But then I always have to load it twice initially. Once for the origin so I can calculate the hash, then again from the COS.
Further loads could then look up the hash by URL and only use the COS. But the initial double-load is pretty bad.

@tomayac

tomayac commented Mar 2, 2026

As it was integrated until now, yes. But I have adapted it. CrossOriginStorage now has a per-request fallback if the hash is not found. In addition, the opt-in now works such that a dev can activate it via env.experimental_useCrossOriginStorage and it is then only used if COS is available. So as a progressive enhancement.

❤️ Really appreciate your receptiveness to my feedback! I think this is the way to go, as it allows developers to roll this out with no side effects to regular users, and power users with the extension installed will get the cache benefit!

But then I always have to load it twice initially. Once for the origin so I can calculate the hash, then again from the COS. Further loads could then look up the hash by URL and only use the COS. But the initial double-load is pretty bad.

Can you maybe tee it?

async function fetchAndProcess(url) {
  const response = await fetch(url);

  // stream1 goes to the caller, stream2 goes to background tasks
  const [stream1, stream2] = response.body.tee();

  // Intentionally not awaited: hashing and storing happen in the background.
  processStreamInBackground(stream2);

  // Convert the first stream into a Blob for the caller
  return new Response(stream1, {
    headers: response.headers
  }).blob();
}

async function processStreamInBackground(stream) {
  const blob = await new Response(stream).blob();

  // calculateSHA256 / storeInCrossOriginStorage are placeholders for the
  // corresponding COS helpers.
  const hash = await calculateSHA256(blob);
  await storeInCrossOriginStorage(blob, hash);
}

@xenova
Collaborator

xenova commented Mar 2, 2026

const [stream1, stream2] = response.body.tee();

TIL 🤯

@nico-martin
Collaborator Author

nico-martin commented Mar 3, 2026

Can you maybe tee it?

Ok, that's really cool. I did not know that. So it still needs to download files where we have no hash per origin, but

  1. the cache is shared (not 10s of GBs of storage per origin)
  2. it loads once per origin so not worse than the Caches API

I will try that :)

@tomayac

tomayac commented Mar 3, 2026

  1. it loads once per origin so not worse than the Caches API

Which is a super important point, and what I love about making this progressively enhanced. We're never making it worse than it is at the moment, only better.

I would like to add one (arguably tiny, but still) point:

  3. It populates the global COS cache, so if at some point Hugging Face makes the hashes of all resources known, future users will profit from the early adopters' COS caches.

I will try that :)

Awesome, thanks! It's one of those APIs I rarely use (and only got to know because I wrote an article about it), but when it comes in handy, it's fantastic.

