
feat: experimental Cross-Origin Storage cache backend#1549

Open
nico-martin wants to merge 5 commits into main from feat/cross-origin-storage-experiment

Conversation

@nico-martin
Collaborator

Adds opt-in support for the Cross-Origin Storage API as a cache backend, allowing model weights to be shared across origins so users only download a given model once regardless of which site requests it.

Changes

  • CrossOriginStorageCache.js: new CrossOriginStorage class implementing CacheInterface. Resolves the SHA-256 hash of a Hugging Face resource via its raw Git LFS pointer file and uses that hash as the key for navigator.crossOriginStorage. Hash lookups use a network-first strategy with a Cache API fallback (experimental_transformers-hash-cache) so lookups continue to work when the user is offline.
  • env.js: adds experimental_useCrossOriginStorage (default false) to TransformersEnvironment. The experimental_ prefix is documented to signal that the underlying browser API is not yet standardised and may change without a major version bump.
  • cache.js: wires CrossOriginStorage into getCache(), checked after useCustomCache and before useBrowserCache.

Usage

import { env } from '@huggingface/transformers';
env.experimental_useCrossOriginStorage = true;

Known limitation: only LFS-tracked files are cached via COS

Cross-origin storage keys files by SHA-256 hash. The implementation resolves that hash by fetching the raw Git LFS pointer file (e.g. /raw/main/onnx/model.onnx) and extracting the oid sha256: field. This only works for files stored in Git LFS (typically the ONNX model weights). Smaller files such as config.json, tokenizer.json, and tokenizer_config.json are stored directly in git and have no LFS pointer, so no hash can be resolved and those files are not cached at all. They will be re-downloaded on every load.
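For illustration, extracting the hash from a raw LFS pointer can be sketched as follows (parseLfsPointerHash is a hypothetical standalone helper, not the PR's actual _getFileHash):

```javascript
// A raw Git LFS pointer file looks like:
//   version https://git-lfs.github.com/spec/v1
//   oid sha256:<64 hex chars>
//   size <bytes>
// This helper extracts the SHA-256 hash, returning null for
// non-LFS files (which have no pointer and thus no resolvable hash).
function parseLfsPointerHash(pointerText) {
  const match = /^oid sha256:([0-9a-f]{64})$/m.exec(pointerText);
  return match ? match[1] : null;
}
```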

Open question: should CrossOriginStorage fall back to the browser Cache API for files where no hash can be resolved (non-LFS files, or any other case where _getFileHash returns null)? This would give non-LFS files the same caching semantics they have today, while LFS files get the cross-origin sharing benefit. The trade-off is that CrossOriginStorage would then silently mix two storage backends, which may be surprising. @tomayac Feedback welcome.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@tomayac

tomayac commented Mar 2, 2026

I'm very excited about this PR! Thanks a lot 🤗! Here's some feedback and general thoughts:

  • While we currently use SHA-256 in practice, I'd not hardcode (at least not too hard) this assumption into the code. What you have now is fine, it's mostly just a "heads up" for the future.
  • Cross-Origin Storage (COS) should be a progressive enhancement. Ideally, it's not something the developer needs to enable, but something the library automatically uses if support is detected. This is to maximize the likelihood of cache matches. End users with COS enabled (i.e., at the current state, the extension installed) would transparently use it, end users without would not.
  • The instructions (and error message) for now should maybe point developers at the extension.
  • For files where no hash is known, I'd maybe recommend you run them through the _getBlobHash() function and calculate the hash, and then store the mapping, so you have all model resources in one place if COS is used. (In the long-term, would it be possible to support SHA-256 for Hugging Face's git? It currently still defaults to SHA-1 [run git rev-parse --show-object-format in any HF repo to verify]. This way, it might be easier to make all hashes available programmatically.)

@nico-martin
Collaborator Author

Thank you very much for your answers!

Cross-Origin Storage (COS) should be a progressive enhancement. Ideally, it's not something the developer needs to enable, but something the library automatically uses if support is detected. This is to maximize the likelihood of cache matches. End users with COS enabled (i.e., at the current state, the extension installed) would transparently use it, end users without would not.

I think you're mainly concerned with the env.experimental_useCrossOriginStorage variable, right?
I agree with you that it could easily be introduced under the hood as a progressive enhancement. However, I would only do that at a later stage. At present, neither the standard nor our implementation is final. Therefore, I would prefer that developers consciously opt in to this and only do so in controlled test environments.
At a later date, I can imagine an env.experimental_useCrossOriginStorageIfAvailable flag. In other words, a progressive enhancement as you suggest. Once it has been battle-tested, we can remove the flag. Does that make sense to you?

For files where no hash is known, I'd maybe recommend you run them through the _getBlobHash() function and calculate the hash, and then store the mapping, so you have all model resources in one place if COS is used.

Not sure if I get that right. So if I want to calculate the hash from the blob, I still need to download the whole file. So I would still have to download the whole file before I can evaluate if it's already stored in the COS. And then when I store the hash and the URL, then I would still store that in let's say the local storage, which means the URL->Hash Map is stored per origin so each origin first has to download the file before it knows if it can use the COS. Or have I misunderstood something?

@tomayac

tomayac commented Mar 2, 2026

I think you're mainly concerned with the env.experimental_useCrossOriginStorage variable, right? I agree with you that it could easily be introduced under the hood as a progressive enhancement. However, I would only do that at a later stage. At present, neither the standard nor our implementation is final. Therefore, I would prefer that developers consciously opt in to this and only do so in controlled test environments. At a later date, I can imagine an env.experimental_useCrossOriginStorageIfAvailable flag. In other words, a progressive enhancement as you suggest. Once it has been battle-tested, we can remove the flag. Does that make sense to you?

Just to be sure I understand how you envision this at the current stage: developers who would opt in would not be expected to release apps in the wild with this flag set, as people without the extension would effectively not have a cached experience?

If so, I'd hope to convince you to rethink this as progressive enhancement from the start. This shouldn't be thought of as its own Transformers.js caching backend, but as a thing to try first before falling back to Cache API before falling back to network. This way developers can see the immediate benefit more clearly. I fully reckon that it's early stages, but I'd argue installing the extension is effectively the current way to opt in. What do you think?

Not sure if I get that right. So if I want to calculate the hash from the blob, I still need to download the whole file. So I would still have to download the whole file before I can evaluate if it's already stored in the COS. And then when I store the hash and the URL, then I would still store that in let's say the local storage, which means the URL->Hash Map is stored per origin so each origin first has to download the file before it knows if it can use the COS. Or have I misunderstood something?

This is what I meant, yes, mostly to make it possible to keep all resources together in the same storage and to be future proof. As I wrote, the eventual solution would be for all resources to have their hash known.

@nico-martin
Collaborator Author

Just to be sure I understand how you envision this at the current stage: developers who would opt in would not be expected to release apps in the wild with this flag set, as people without the extension would effectively not have a cached experience?

As it was integrated until now, yes. But I have adapted it. CrossOriginStorage now has a per-request fallback if the hash is not found. In addition, the opt-in now works such that a dev can activate it via env.experimental_useCrossOriginStorage and it is then only used if COS is available. So as a progressive enhancement.
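As a rough sketch of that per-request fallback (the names and the dependency-injected shape are illustrative, not the PR's actual code; availability detection for navigator.crossOriginStorage is assumed to happen before this backend is selected):

```javascript
// Try Cross-Origin Storage when a hash can be resolved for the URL,
// otherwise fall back to the regular Cache API backend. Dependencies
// are passed in so the logic is easy to test in isolation.
async function matchWithFallback(url, { getFileHash, cosMatch, cacheMatch }) {
  const hash = await getFileHash(url); // null for non-LFS files
  if (hash) {
    const hit = await cosMatch(hash);
    if (hit) return hit; // shared cross-origin cache hit
  }
  // Same semantics non-LFS files have today via the Cache API.
  return cacheMatch(url);
}
```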

This is what I meant, yes, mostly to make it possible to keep all resources together in the same storage and to be future proof. As I wrote, the eventual solution would be for all resources to have their hash known.

But then I always have to load it twice initially. Once for the origin so I can calculate the hash, then again from the COS.
Further loads could then look up the hash by URL and only use the COS. But the initial double-load is pretty bad.

@tomayac

tomayac commented Mar 2, 2026

As it was integrated until now, yes. But I have adapted it. CrossOriginStorage now has a per-request fallback if the hash is not found. In addition, the opt-in now works such that a dev can activate it via env.experimental_useCrossOriginStorage and it is then only used if COS is available. So as a progressive enhancement.

❤️ Really appreciate your receptiveness to my feedback! I think this is the way to go, as it allows developers to roll this out with no side effects to regular users, and power users with the extension installed will get the cache benefit!

But then I always have to load it twice initially. Once for the origin so I can calculate the hash, then again from the COS. Further loads could then look up the hash by URL and only use the COS. But the initial double-load is pretty bad.

Can you maybe tee it?

async function fetchAndProcess(url) {
  const response = await fetch(url);

  // stream1 goes to the caller, stream2 goes to background tasks
  const [stream1, stream2] = response.body.tee();

  // Intentionally not awaited: hashing and storing happen in the background.
  processStreamInBackground(stream2);

  // Convert the first stream into a Blob for the caller
  return new Response(stream1, {
    headers: response.headers
  }).blob();
}

async function processStreamInBackground(stream) {
  const blob = await new Response(stream).blob();

  // calculateSHA256 / storeInCrossOriginStorage are placeholders for the
  // corresponding COS helpers.
  const hash = await calculateSHA256(blob);
  await storeInCrossOriginStorage(blob, hash);
}

@xenova
Collaborator

xenova commented Mar 2, 2026

const [stream1, stream2] = response.body.tee();

TIL 🤯

@nico-martin
Collaborator Author

nico-martin commented Mar 3, 2026

Can you maybe tee it?

Ok, that's really cool. I did not know that. So it still needs to download files where we have no hash per origin, but

  1. the cache is shared (not 10s of GBs of storage per origin)
  2. it loads once per origin so not worse than the Caches API

I will try that :)

@tomayac

tomayac commented Mar 3, 2026

  1. it loads once per origin so not worse than the Caches API

Which is a super important point, and what I love about making this progressively enhanced. We're never making it worse than it is at the moment, only better.

I would like to add one (arguably tiny, but still) point:

  3. It populates the global COS cache, so if at some point Hugging Face makes the hashes of all resources known, future users will profit from the early adopters' COS caches.

I will try that :)

Awesome, thanks! It's one of those APIs I rarely use (and only got to know because I wrote an article about it), but when it comes in handy, it's fantastic.

