[BUG] flux.1-schnell v1.0.0: HuggingFace token not used for safetensor downloads #178

@nskpro-cmd

Description

What We're Trying to Do
Deploy the NVIDIA NIM container for the Flux.1-schnell image generation model on Kubernetes.

The Problem
The container fails to start because it cannot download 6 model files from HuggingFace, even though:
- We have a valid HuggingFace token - tested manually, and it works
- We have access to the model - confirmed on the HuggingFace website
- The files are already downloaded - 54GB of model files exist on disk
- We set the token in multiple ways - environment variables, token files, etc.

What Happens
The container starts and finds all cached files, but then tries to verify 6 large files with HuggingFace and fails with:

"Permission error: The requested operation requires permissions that the user does not have."

This causes the container to crash in a loop.
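
The pattern described above can be tallied directly from the pod logs. The following is a minimal sketch, assuming the log format shown in the pod logs further down; the function name `tally` and the "huggingface"/"ngc" split by path are illustrative choices, not part of NIM.

```python
import re
from collections import Counter

# Matches lines such as:
#   INFO ... lib.rs:156] File: tokenizer/merges.txt found in cache: "/model-store/..."
CACHE_HIT = re.compile(r'\] File: (?P<name>\S+) found in cache: "(?P<path>[^"]+)"')
# Matches only the ERROR lines emitted by the downloader, not the later
# INFO line that repeats the same message.
PERMISSION_ERROR = re.compile(r"^ERROR .*\] Permission error:")


def tally(log_lines):
    """Count cache hits per store and downloader permission errors."""
    counts = Counter()
    for line in log_lines:
        hit = CACHE_HIT.search(line)
        if hit:
            path = hit.group("path")
            if "/huggingface/" in path:
                counts["cached_huggingface"] += 1
            elif "/ngc/" in path:
                counts["cached_ngc"] += 1
            else:
                counts["cached_other"] += 1
        elif PERMISSION_ERROR.search(line):
            counts["permission_errors"] += 1
    return counts


# Usage (the file name is hypothetical):
#   with open("pod.log") as fh:
#       print(tally(fh))
```

Comparing the number of permission errors with the number of files that never produce a "found in cache" line shows whether the failures map one-to-one onto missing cache entries.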

Root Cause
The NIM container uses a custom Rust-based file downloader that:
- Does not properly read the HuggingFace token from environment variables
- Does not respect offline mode settings (HF_HUB_OFFLINE=1)
- Does not read the token from standard file locations
- Still tries to contact HuggingFace even when files are already cached
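
To make the points above reproducible, here is a minimal sketch that prints what the process inside the container can actually see. The variable names are the ones the `huggingface_hub` Python library recognises; whether the NIM downloader reads any of them is exactly what is in question, so this is an assumption, not a description of NIM. The helper names are hypothetical, and the default `HF_HOME` is simplified (it ignores `XDG_CACHE_HOME`).

```python
import os

# Names recognised by the huggingface_hub Python library (assumption: the NIM
# downloader may or may not honour them).
HF_VARS = ("HF_TOKEN", "HUGGING_FACE_HUB_TOKEN", "HF_HUB_OFFLINE", "HF_HOME", "HF_HUB_CACHE")


def mask(value):
    """Hide the middle of a secret so it can be pasted into a bug report."""
    if len(value) <= 8:
        return "*" * len(value)
    return value[:4] + "*" * (len(value) - 8) + value[-4:]


def summarize_hf_env(env=None):
    """Return each variable as '<unset>', a masked token, or its plain value."""
    env = os.environ if env is None else env
    summary = {}
    for name in HF_VARS:
        value = env.get(name)
        if value is None:
            summary[name] = "<unset>"
        elif "TOKEN" in name:
            summary[name] = mask(value)
        else:
            summary[name] = value
    return summary


def token_file_candidates(env=None):
    """Path where huggingface_hub stores the token by default: $HF_HOME/token."""
    env = os.environ if env is None else env
    hf_home = env.get("HF_HOME") or os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    return [os.path.join(hf_home, "token")]


if __name__ == "__main__":
    for name, value in summarize_hf_env().items():
        print(f"{name}={value}")
    for path in token_file_candidates():
        print(f"{path}: {'present' if os.path.isfile(path) else 'missing'}")
```

Running this inside the pod rules out the case where the token is set in the manifest but never reaches the container's environment.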

Evidence

| Test | Result |
|---|---|
| Token validity (curl test) | ✅ Works - returns user info |
| Token access to model files (curl test) | ✅ Works - returns download redirect |
| Token inside container (env var) | ✅ Present |
| All model files on disk | ✅ 54GB downloaded |
| Container starts successfully | ❌ Fails |
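
For anyone who wants to repeat the two token checks without curl, the following is a rough Python equivalent. It is a sketch under assumptions: it uses the public `whoami-v2` and `resolve` endpoints on huggingface.co, the helper names are hypothetical, and the file passed to `resolve_url` must be supplied by the caller, since the logs do not name the 6 failing files. Redirects are deliberately not followed, so a working token on a gated file should surface as a 3xx status rather than a download.

```python
import urllib.error
import urllib.request

WHOAMI_URL = "https://huggingface.co/api/whoami-v2"
REPO = "black-forest-labs/FLUX.1-schnell"


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Surface 3xx responses as HTTPError instead of following them."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def resolve_url(filename, revision="main"):
    """URL that serves (or redirects to) one file of the model repo."""
    return f"https://huggingface.co/{REPO}/resolve/{revision}/{filename}"


def build_request(url, token, method="GET"):
    """Attach the token as a Bearer header."""
    req = urllib.request.Request(url, method=method)
    req.add_header("Authorization", f"Bearer {token}")
    return req


def status_of(req, timeout=30):
    """Return the HTTP status code without following redirects."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


# Usage (needs network access and a real token):
#   import os
#   token = os.environ["HF_TOKEN"]
#   print(status_of(build_request(WHOAMI_URL, token)))                             # token validity
#   print(status_of(build_request(resolve_url("vae/config.json"), token, "HEAD")))  # file access
```

A 401 or 403 from the second call would point at the token or the gated-model agreement; a 2xx or 3xx would support the claim that the token itself is fine.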

Affected
Container: nvcr.io/nim/black-forest-labs/flux.1-schnell
Version: 1.0.0 (all available tags: latest, 1.0, 1.0.0, 1)
Platform: Kubernetes with H100 GPU

Workaround Status
No working workaround found. We tried:
- Multiple token environment variables
- Token files in cache directories
- Offline mode settings
- Pre-downloading all model files
- Fixing file permissions

All attempts failed due to the container's internal downloader ignoring authentication.
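
One thing worth checking when pre-downloaded files are not picked up is whether every entry in the snapshot directory actually resolves. This sketch assumes the standard `huggingface_hub` cache layout, where files under `snapshots/<revision>/` are symlinks into `blobs/`; that the NIM downloader uses the same layout is an inference from the paths in the logs. The function name is hypothetical, and a zero-byte file is flagged even though it could be legitimate.

```python
import os


def inspect_snapshot(snapshot_dir):
    """Return (ok, broken) lists of file paths relative to snapshot_dir.

    A file counts as broken when it is a dangling symlink or has size zero.
    """
    ok, broken = [], []
    for root, _dirs, files in os.walk(snapshot_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, snapshot_dir)
            # os.path.exists follows symlinks, so a dangling link reports False.
            if os.path.exists(full) and os.path.getsize(full) > 0:
                ok.append(rel)
            else:
                broken.append(rel)
    return sorted(ok), sorted(broken)


# Usage, with the snapshot path taken from the pod logs:
#   ok, broken = inspect_snapshot(
#       "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell"
#       "/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9"
#   )
#   print(len(ok), "usable;", "broken:", broken)
```

If the large weight files show up as broken here, the downloader would be re-fetching them because the cache is incomplete, which is a different problem from it ignoring cached files.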

Key Details
Container: nvcr.io/nim/black-forest-labs/flux.1-schnell:1.0.0
Platform: Kubernetes, H100 GPU
Token works externally (curl test proves it)
All files cached but container still fails

Pod Logs

2025/12/10 16:15:12 WARN Client tools updates are disabled as they are licensed under AGPL. To use Community Edition builds or custom binaries, set the 'TELEPORT_CDN_BASE_URL' environment variable.

=========================================
== NVIDIA NIM for Visual Generative AI ==
=========================================

NVIDIA Release 1.0.0
Model: black-forest-labs/flux.1-schnell

Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
LicenseRef-NvidiaProprietary

NVIDIA CORPORATION, its affiliates and licensors retain all intellectual property and proprietary rights in and to this material, related documentation and any modifications thereto. Any use, reproduction, disclosure or distribution of this material and related documentation without an express license agreement from NVIDIA CORPORATION or its affiliates are strictly prohibited.
The NIM container is governed by the NVIDIA Software License Agreement(https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and Product-Specific Terms for AI Products(https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/).

The Flux.1 Schnell model is available at https://huggingface.co/black-forest-labs/FLUX.1-schnell

Use of the NVIDIA Cosmos-1.0 Guardrail is governed by the NVIDIA Open Model License Agreement(https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/).
NOTE: CUDA Forward Compatibility mode ENABLED.
  Using CUDA 12.8 driver version 570.86.10 with kernel driver version 550.127.08.
  See https://docs.nvidia.com/deploy/cuda-compatibility/ for details.

INFO 2025-12-10 10:45:08.265 start_server.py:44] Starting nimlib 0.9.0 nim_sdk 0.8.0
INFO 2025-12-10 10:45:08.265 standard_files.py:95] NIM VERSION:
1.0.0
INFO 2025-12-10 10:45:18.155 profiles.py:93] Registered custom profile selectors: [<class 'inference.CustomProfileSelector'>]
INFO 2025-12-10 10:45:18.197 inference.py:141] Matched profile_id in manifest from CustomProfileSelector 0376eb85528b177c914b3a435c6d34456f1ce16bd9287c7e9f22392d87de0441 with tags: {'gpu': 'h100', 'gpu_device': '2330:10de', 'model_type': 'tensorrt', 'number_of_gpus': '1', 'precision': 'fp8', 'resolution': '768-1344x768-1344', 'use_t5_fp8': 'true', 'variant': 'base', 'weightless_engines': 'true'}
INFO 2025-12-10 10:45:18.220 lib.rs:203] File: trt_engines_dir/flux.1-schnell/t5-fp8.trt10.8.0.43.plan found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/t5-fp8.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.220 public.rs:52] Skipping download, using cached copy of file: trt_engines_dir/flux.1-schnell/t5-fp8.trt10.8.0.43.plan at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/t5-fp8.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.232 lib.rs:156] File: tokenizer/merges.txt found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/merges.txt"
INFO 2025-12-10 10:45:18.232 tokio.rs:564] Skipping download, using cached copy of file: tokenizer/merges.txt at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/merges.txt"
INFO 2025-12-10 10:45:18.240 lib.rs:156] File: text_encoder_2/config.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/text_encoder_2/config.json"
INFO 2025-12-10 10:45:18.240 tokio.rs:564] Skipping download, using cached copy of file: text_encoder_2/config.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/text_encoder_2/config.json"
INFO 2025-12-10 10:45:18.248 lib.rs:156] File: tokenizer/tokenizer_config.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/tokenizer_config.json"
INFO 2025-12-10 10:45:18.248 tokio.rs:564] Skipping download, using cached copy of file: tokenizer/tokenizer_config.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/tokenizer_config.json"
INFO 2025-12-10 10:45:18.256 lib.rs:156] File: tokenizer_2/special_tokens_map.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/special_tokens_map.json"
INFO 2025-12-10 10:45:18.256 tokio.rs:564] Skipping download, using cached copy of file: tokenizer_2/special_tokens_map.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/special_tokens_map.json"
INFO 2025-12-10 10:45:18.264 lib.rs:156] File: transformer/config.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/transformer/config.json"
INFO 2025-12-10 10:45:18.264 tokio.rs:564] Skipping download, using cached copy of file: transformer/config.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/transformer/config.json"
INFO 2025-12-10 10:45:18.271 lib.rs:156] File: tokenizer_2/tokenizer.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/tokenizer.json"
INFO 2025-12-10 10:45:18.271 tokio.rs:564] Skipping download, using cached copy of file: tokenizer_2/tokenizer.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/tokenizer.json"
INFO 2025-12-10 10:45:18.288 lib.rs:203] File: h100-metadata/tensorrt/base/metadata.json found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-metadata/h100-metadata/tensorrt/base/metadata.json"
INFO 2025-12-10 10:45:18.288 public.rs:52] Skipping download, using cached copy of file: h100-metadata/tensorrt/base/metadata.json at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-metadata/h100-metadata/tensorrt/base/metadata.json"
INFO 2025-12-10 10:45:18.299 lib.rs:203] File: frame_content_safety_filter/siglip-so400m-patch14-384/preprocessor_config.json found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-cosmos-guardrail/frame_content_safety_filter/siglip-so400m-patch14-384/preprocessor_config.json"
INFO 2025-12-10 10:45:18.299 public.rs:52] Skipping download, using cached copy of file: frame_content_safety_filter/siglip-so400m-patch14-384/preprocessor_config.json at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-cosmos-guardrail/frame_content_safety_filter/siglip-so400m-patch14-384/preprocessor_config.json"
INFO 2025-12-10 10:45:18.306 lib.rs:156] File: tokenizer/vocab.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/vocab.json"
INFO 2025-12-10 10:45:18.306 tokio.rs:564] Skipping download, using cached copy of file: tokenizer/vocab.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/vocab.json"
INFO 2025-12-10 10:45:18.314 lib.rs:156] File: tokenizer_2/spiece.model found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/spiece.model"
INFO 2025-12-10 10:45:18.314 tokio.rs:564] Skipping download, using cached copy of file: tokenizer_2/spiece.model at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/spiece.model"
INFO 2025-12-10 10:45:18.322 lib.rs:203] File: blocklist/blocklist.tar found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-cosmos-guardrail/blocklist/blocklist.tar"
INFO 2025-12-10 10:45:18.322 public.rs:52] Skipping download, using cached copy of file: blocklist/blocklist.tar at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-cosmos-guardrail/blocklist/blocklist.tar"
INFO 2025-12-10 10:45:18.330 lib.rs:156] File: vae/config.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/vae/config.json"
INFO 2025-12-10 10:45:18.330 tokio.rs:564] Skipping download, using cached copy of file: vae/config.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/vae/config.json"
INFO 2025-12-10 10:45:18.344 lib.rs:203] File: trt_engines_dir/flux.1-schnell/clip.trt10.8.0.43.plan found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/clip.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.344 public.rs:52] Skipping download, using cached copy of file: trt_engines_dir/flux.1-schnell/clip.trt10.8.0.43.plan at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/clip.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.352 lib.rs:203] File: trt_engines_dir/flux.1-schnell/transformer-fp8.l4.0.trt10.8.0.43.plan found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/transformer-fp8.l4.0.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.352 public.rs:52] Skipping download, using cached copy of file: trt_engines_dir/flux.1-schnell/transformer-fp8.l4.0.trt10.8.0.43.plan at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/transformer-fp8.l4.0.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.359 lib.rs:203] File: trt_engines_dir/flux.1-schnell/vae.trt10.8.0.43.plan found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/vae.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.359 public.rs:52] Skipping download, using cached copy of file: trt_engines_dir/flux.1-schnell/vae.trt10.8.0.43.plan at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/flux.1-schnell/vae.trt10.8.0.43.plan"
INFO 2025-12-10 10:45:18.368 lib.rs:203] File: trt_engines_dir/guardrails/cosmos-frame-content-safety-filter/model.plan found in cache: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/guardrails/cosmos-frame-content-safety-filter/model.plan"
INFO 2025-12-10 10:45:18.368 public.rs:52] Skipping download, using cached copy of file: trt_engines_dir/guardrails/cosmos-frame-content-safety-filter/model.plan at path: "/model-store/ngc/hub/models--nim--black-forest-labs--flux.1-schnell/snapshots/1.0.0-h100x1-fp8-fp8-768-1344x768-1344/trt_engines_dir/guardrails/cosmos-frame-content-safety-filter/model.plan"
INFO 2025-12-10 10:45:18.388 lib.rs:156] File: tokenizer/special_tokens_map.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/special_tokens_map.json"
INFO 2025-12-10 10:45:18.388 tokio.rs:564] Skipping download, using cached copy of file: tokenizer/special_tokens_map.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer/special_tokens_map.json"
INFO 2025-12-10 10:45:18.403 lib.rs:156] File: scheduler/scheduler_config.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/scheduler/scheduler_config.json"
INFO 2025-12-10 10:45:18.403 tokio.rs:564] Skipping download, using cached copy of file: scheduler/scheduler_config.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/scheduler/scheduler_config.json"
INFO 2025-12-10 10:45:18.410 lib.rs:156] File: tokenizer_2/tokenizer_config.json found in cache: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/tokenizer_config.json"
INFO 2025-12-10 10:45:18.410 tokio.rs:564] Skipping download, using cached copy of file: tokenizer_2/tokenizer_config.json at path: "/model-store/huggingface/hub/models--black-forest-labs--FLUX.1-schnell/snapshots/741f7c3ce8b383c54771c7003378a50191e9efe9/tokenizer_2/tokenizer_config.json"
ERROR 2025-12-10 10:45:18.478 repo.rs:144] One or more errors fetching files:
ERROR 2025-12-10 10:45:18.478 repo.rs:146] Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo.
ERROR 2025-12-10 10:45:18.478 repo.rs:146] Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo.
ERROR 2025-12-10 10:45:18.478 repo.rs:146] Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo.
ERROR 2025-12-10 10:45:18.478 repo.rs:146] Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo.
ERROR 2025-12-10 10:45:18.478 repo.rs:146] Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo.
ERROR 2025-12-10 10:45:18.478 repo.rs:146] Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo.
not found /opt/nim/.cache/trt_engines_cache/flux.1-schnell/clip.trt10.8.0.43.plan
INFO 2025-12-10 10:45:18.478 inference.py:363] CustomProfileSelector not able to find the profile: Permission error: The requested operation requires permissions that the user does not have. This may be due to the user not being a member of the organization that owns the repo..
Traceback (most recent call last):
  File "/usr/local/bin/start_server", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nimlib/start_server.py", line 61, in main
    nimutils.get_model_manifest()
  File "/usr/local/lib/python3.12/dist-packages/nimlib/nimutils.py", line 114, in get_model_manifest
    model_manifest = SdkModelManifest()
                     ^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/dist-packages/nimlib/nim_sdk.py", line 124, in __init__
    raise NIMProfileIDNotFound(
nimlib.exceptions.NIMProfileIDNotFound: Could not match a profile in manifest at /opt/nim/etc/default/model_manifest.yaml
