Description
The lazy initialization check for _cache_filelock in StreamingDataset always fails because of a mismatch between the attribute name being checked and the attribute name being set.
Root Cause
In streaming/base/constant.py:
CACHE_FILELOCK = 'cache_filelock' # without underscore prefix
The check uses CACHE_FILELOCK ('cache_filelock'), but the assignment is to self._cache_filelock (with underscore prefix). These names don't match, so hasattr(self, CACHE_FILELOCK) always returns False.
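The mismatch can be reproduced in isolation. The snippet below is a minimal sketch, not the actual StreamingDataset code: the `Demo` class, its `_get_lock` method, and the `locks_created` counter are hypothetical names used only for illustration, and a plain `object()` stands in for the real lock.

```python
# Stand-in for the constant in streaming/base/constant.py.
CACHE_FILELOCK = 'cache_filelock'  # without underscore prefix


class Demo:
    """Hypothetical class for illustration; not the real StreamingDataset."""

    def __init__(self):
        self.locks_created = 0

    def _get_lock(self):
        # The check looks for an attribute named 'cache_filelock' ...
        if not hasattr(self, CACHE_FILELOCK):
            # ... but the assignment sets '_cache_filelock', so the check
            # above never sees it and this branch runs on every call.
            self._cache_filelock = object()  # placeholder for a lock object
            self.locks_created += 1
        return self._cache_filelock


demo = Demo()
first = demo._get_lock()
second = demo._get_lock()
print(first is second)     # False: a new object was created each time
print(demo.locks_created)  # 2
```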
Impact
Every call to evict_shard(), evict_coldest_shard(), or prepare_shard() creates a new FileLock or Lock object instead of reusing the previously created one. This could lead to:
- Unnecessary object creation overhead
Suggested Fix
Change the hasattr check to use the actual attribute name:
if not hasattr(self, '_cache_filelock'):
Or update the constant to match:
CACHE_FILELOCK = '_cache_filelock'
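Either option makes the check and the assignment agree. As a rough sketch of the second option, assuming the constant is changed to include the underscore, the lookup and the assignment can both go through the constant so the two names cannot drift apart again. The `FixedDemo` class and `_get_lock` method are hypothetical, and `threading.Lock` is used here only as a placeholder for whichever lock type the real code creates.

```python
import threading

# Constant updated to match the attribute name that is actually set.
CACHE_FILELOCK = '_cache_filelock'


class FixedDemo:
    """Hypothetical class for illustration; not the real StreamingDataset."""

    def _get_lock(self):
        # Check and assignment now use the same attribute name.
        if not hasattr(self, CACHE_FILELOCK):
            setattr(self, CACHE_FILELOCK, threading.Lock())
        return getattr(self, CACHE_FILELOCK)


fixed = FixedDemo()
same = fixed._get_lock() is fixed._get_lock()
print(same)  # True: the lock created on the first call is reused
```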