Bug: CACHE_FILELOCK attribute check always fails due to attribute name mismatch #963

@yezhengmao1

Description

The lazy initialization check for _cache_filelock in StreamingDataset always fails because of a mismatch between the attribute name being checked and the attribute name being set.

Root Cause

In streaming/base/constant.py:

```python
CACHE_FILELOCK = 'cache_filelock'  # without underscore prefix
```

The hasattr check uses CACHE_FILELOCK, which evaluates to 'cache_filelock', but the lock is assigned to self._cache_filelock (with an underscore prefix). Because the two names differ, hasattr(self, CACHE_FILELOCK) returns False on every call, so the initialization branch runs every time.
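
The mismatch can be reproduced in isolation. The sketch below is not the StreamingDataset code: `Demo`, `get_lock`, and `locks_created` are hypothetical names, and `threading.Lock` stands in for whichever lock type the library creates. Only the constant's value and the `_cache_filelock` attribute name are taken from the description above.

```python
from threading import Lock

CACHE_FILELOCK = 'cache_filelock'  # value quoted in the issue: no leading underscore


class Demo:
    """Hypothetical stand-in for the lazy-init pattern, not the real StreamingDataset."""

    def __init__(self):
        self.locks_created = 0  # illustrative counter, not part of the library

    def get_lock(self):
        # The check looks up 'cache_filelock', but the assignment below sets
        # '_cache_filelock', so the check never sees the stored lock.
        if not hasattr(self, CACHE_FILELOCK):
            self._cache_filelock = Lock()
            self.locks_created += 1
        return self._cache_filelock


demo = Demo()
first = demo.get_lock()
second = demo.get_lock()
print(first is second)     # False: a new lock was created on the second call
print(demo.locks_created)  # 2
```

Each call replaces the stored lock, so callers never share the same lock object through this path.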

Impact

Every call to evict_shard(), evict_coldest_shard(), or prepare_shard() creates a new FileLock or Lock object instead of reusing the previously created one, which could add unnecessary object creation overhead.

Suggested Fix

Change the hasattr check to use the actual attribute name:

```python
if not hasattr(self, '_cache_filelock'):
```

Or update the constant to match:

```python
CACHE_FILELOCK = '_cache_filelock'
```
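
As a quick check of the second option, here is a minimal sketch with the constant updated to match the attribute. As before, `FixedDemo`, `get_lock`, and `locks_created` are hypothetical names used for illustration, and `threading.Lock` is a stand-in for the real lock type.

```python
from threading import Lock

CACHE_FILELOCK = '_cache_filelock'  # constant now matches the attribute that is set


class FixedDemo:
    """Hypothetical stand-in showing the lazy-init check once the names agree."""

    def __init__(self):
        self.locks_created = 0  # illustrative counter, not part of the library

    def get_lock(self):
        # The check and the assignment now refer to the same attribute,
        # so the lock is created only on the first call.
        if not hasattr(self, CACHE_FILELOCK):
            self._cache_filelock = Lock()
            self.locks_created += 1
        return self._cache_filelock


fixed = FixedDemo()
lock_a = fixed.get_lock()
lock_b = fixed.get_lock()
print(lock_a is lock_b)     # True: the same lock is reused
print(fixed.locks_created)  # 1
```

Either fix makes the check and the assignment agree. Updating the constant has the advantage that any other code using CACHE_FILELOCK picks up the corrected name as well, assuming nothing depends on the old value.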

Labels: bug
