File name too long #539

@youknow16

It looks like SHA-256 hashes are being used as cache filenames. This won't work on some Linux systems using ext4 (limit: 255 bytes per filename).

```
linker = EntityLinker(resolve_abbreviations=True, name="umls")
nlp.add_pipe(linker)
https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/data/linkers/2023-04-23/umls/tfidf_vectors_sparse.npz not found in cache, downloading to /tmp/tmpbm_3w8e8
100%|██████████| 492M/492M [02:37<00:00, 3.27MiB/s]
Finished download, copying /tmp/tmpbm_3w8e8 to cache at /home/tom/.scispacy/datasets/2b79923846fb52e62d686f2db846392575c8eb5b732d9d26cd3ca9378c622d40.87bd52d0f0ee055c1e455ef54ba45149d188552f07991b765da256a1b512ca0b.tfidf_vectors_sparse.npz
Traceback (most recent call last):

  Cell In[61], line 1
    linker = EntityLinker(resolve_abbreviations=True, name="umls")

  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/linking.py:85 in __init__
    self.candidate_generator = candidate_generator or CandidateGenerator(

  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/candidate_generation.py:222 in __init__
    self.ann_index = ann_index or load_approximate_nearest_neighbours_index(

  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/candidate_generation.py:134 in load_approximate_nearest_neighbours_index
    cached_path(linker_paths.tfidf_vectors)

  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/file_cache.py:39 in cached_path
    return get_from_cache(url_or_filename, cache_dir)

  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/file_cache.py:150 in get_from_cache
    with open(cache_path, "wb") as cache_file:

OSError: [Errno 36] File name too long: '/home/tom/.scispacy/datasets/2b79923846fb52e62d686f2db846392575c8eb5b732d9d26cd3ca9378c622d40.87bd52d0f0ee055c1e455ef54ba45149d188552f07991b765da256a1b512ca0b.tfidf_vectors_sparse.npz'
```
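
For anyone trying to reproduce this, here is a rough sketch of how the limit could be checked and worked around. The cache filename in the traceback is 154 bytes long, so the limit in effect on the affected machine is probably lower than ext4's 255 (an encrypted home directory, for example, may enforce a smaller one; that is a guess, not something the log confirms). `max_filename_bytes` and `shorten_cache_name` are hypothetical helpers, not part of scispacy, and the shortening scheme (one SHA-256 digest plus the original extension) is only one possible fix.

```python
import hashlib
import os

# Cache filename copied from the traceback above.
CACHE_NAME = (
    "2b79923846fb52e62d686f2db846392575c8eb5b732d9d26cd3ca9378c622d40"
    ".87bd52d0f0ee055c1e455ef54ba45149d188552f07991b765da256a1b512ca0b"
    ".tfidf_vectors_sparse.npz"
)


def max_filename_bytes(directory: str) -> int:
    """Ask the OS for the longest filename (in bytes) allowed in `directory`.

    Uses os.pathconf, which is only available on Unix-like systems.
    """
    return os.pathconf(directory, "PC_NAME_MAX")


def shorten_cache_name(filename: str, limit: int) -> str:
    """Hypothetical helper: return a name that fits within `limit` bytes.

    Names that already fit are returned unchanged. Longer names are replaced
    by a SHA-256 digest of the full name (truncated if needed) plus the
    original extension, so the result stays deterministic per input name.
    """
    encoded = filename.encode("utf-8")
    if len(encoded) <= limit:
        return filename
    _, ext = os.path.splitext(filename)
    ext_len = len(ext.encode("utf-8"))
    if limit <= ext_len:
        raise ValueError("limit is too small to keep the file extension")
    digest = hashlib.sha256(encoded).hexdigest()
    return digest[: limit - ext_len] + ext


if __name__ == "__main__":
    # Assumption: the cache lives somewhere under the home directory.
    home = os.path.expanduser("~")
    limit = max_filename_bytes(home)
    print(f"filename length: {len(CACHE_NAME.encode('utf-8'))} bytes")
    print(f"limit reported for {home}: {limit} bytes")
    print(f"name that would fit: {shorten_cache_name(CACHE_NAME, limit)}")
```

Running this on the affected machine should show whether the reported limit is below 154 bytes, which would explain the `OSError` even though the name fits within ext4's own limit.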
