Description
It seems that SHA-256 hashes are being used as cache filenames. The resulting name is too long on some Linux systems using ext4, which limits each filename to 255 bytes.
```
linker = EntityLinker(resolve_abbreviations=True, name="umls")
nlp.add_pipe(linker)
https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/data/linkers/2023-04-23/umls/tfidf_vectors_sparse.npz not found in cache, downloading to /tmp/tmpbm_3w8e8
100%|██████████| 492M/492M [02:37<00:00, 3.27MiB/s]
Finished download, copying /tmp/tmpbm_3w8e8 to cache at /home/tom/.scispacy/datasets/2b79923846fb52e62d686f2db846392575c8eb5b732d9d26cd3ca9378c622d40.87bd52d0f0ee055c1e455ef54ba45149d188552f07991b765da256a1b512ca0b.tfidf_vectors_sparse.npz
Traceback (most recent call last):
  Cell In[61], line 1
    linker = EntityLinker(resolve_abbreviations=True, name="umls")
  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/linking.py:85 in __init__
    self.candidate_generator = candidate_generator or CandidateGenerator(
  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/candidate_generation.py:222 in __init__
    self.ann_index = ann_index or load_approximate_nearest_neighbours_index(
  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/candidate_generation.py:134 in load_approximate_nearest_neighbours_index
    cached_path(linker_paths.tfidf_vectors)
  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/file_cache.py:39 in cached_path
    return get_from_cache(url_or_filename, cache_dir)
  File /media/tom/StorageA/anaconda/lib/python3.11/site-packages/scispacy/file_cache.py:150 in get_from_cache
    with open(cache_path, "wb") as cache_file:
OSError: [Errno 36] File name too long: '/home/tom/.scispacy/datasets/2b79923846fb52e62d686f2db846392575c8eb5b732d9d26cd3ca9378c622d40.87bd52d0f0ee055c1e455ef54ba45149d188552f07991b765da256a1b512ca0b.tfidf_vectors_sparse.npz'
```
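To help narrow this down, here is a minimal sketch that measures the cache filename from the traceback and compares it with the limit the filesystem actually reports. The name consists of two 64-character hex digests plus the original filename, 154 bytes in total. That is below the 255 bytes ext4 allows, so the effective limit on the affected machine may be lower (some stacked or encrypted filesystems, such as eCryptfs, accept noticeably fewer bytes). The helpers `name_limit`, `fits`, and `short_cache_name` are hypothetical and not part of scispacy; `short_cache_name` only illustrates one possible shorter naming scheme, not how the library builds its names. `os.pathconf` is available on Unix only.

```python
import hashlib
import os

# Cache filename copied verbatim from the traceback above.
CACHE_NAME = (
    "2b79923846fb52e62d686f2db846392575c8eb5b732d9d26cd3ca9378c622d40"
    ".87bd52d0f0ee055c1e455ef54ba45149d188552f07991b765da256a1b512ca0b"
    ".tfidf_vectors_sparse.npz"
)

# Download URL copied verbatim from the log above.
URL = (
    "https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/data/linkers/"
    "2023-04-23/umls/tfidf_vectors_sparse.npz"
)


def name_limit(directory: str) -> int:
    """Return the longest filename (in bytes) the filesystem holding `directory` accepts."""
    return os.pathconf(directory, "PC_NAME_MAX")


def fits(filename: str, directory: str) -> bool:
    """Check whether `filename` is short enough to be created inside `directory`."""
    return len(os.fsencode(filename)) <= name_limit(directory)


def short_cache_name(url: str, digest_chars: int = 16) -> str:
    """Hypothetical shorter scheme: a truncated SHA-256 of the URL plus the basename."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:digest_chars]
    basename = url.rsplit("/", 1)[-1]
    return f"{digest}.{basename}"


if __name__ == "__main__":
    # Assumes the default cache location seen in the traceback; fall back to $HOME.
    cache_dir = os.path.expanduser("~/.scispacy/datasets")
    if not os.path.isdir(cache_dir):
        cache_dir = os.path.expanduser("~")

    print("cache filename length:", len(os.fsencode(CACHE_NAME)))
    print("filesystem limit:", name_limit(cache_dir))
    print("current name fits:", fits(CACHE_NAME, cache_dir))
    print("shorter name:", short_cache_name(URL))
```

If the reported limit is below 154, the filesystem layer under the home directory is the likely cause. Truncating the digest trades a small increase in collision risk for a much shorter name, which is why the sketch keeps the original basename for readability.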