Description
Hi,
I ran python corpus.py corpus.ttl and then python download_corpus_resources.py to download the corpus, and got the output below. Is this the expected output? It looks like some publications and datasets could not be downloaded.
Number of records in the corpus: 586
Number of research publications: 480
Successfully downloaded 474 pdf files.
Missing publication resources: {'012df4a72af52b038483', 'dca54974ff51a5f7f8ab', '5f48a343cb75195cd646', 'c8f9b19b39e34d98a557','988428e18884e28e037c', '42c2755ec0f983870e62'}
Number of datasets: 106
Successfully downloaded 101 resource files.
Missing dataset resources: {'875ffb2b04b1392cd1f2', 'fe338b5b2f3f6b0d11a4', '53ca68ba0ded95220662', '33b1ce039c67a6658644', '379ff5f518e664ba2353'}
I checked the publication with id "012df4a72af52b038483", and its link does not appear to be broken. Here is the link I got from corpus.jsonld:
https://aasldpubs.onlinelibrary.wiley.com/doi/pdf/10.1002/hep.23220
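In case it helps with debugging, here is a minimal sketch of how one of these links could be probed from a script instead of a browser. A link that opens fine in a browser may still return an error status or an HTML page to a script, so the sketch reports the HTTP status, the Content-Type header, and whether the body starts with the PDF signature. This is not taken from download_corpus_resources.py: the function names `probe` and `looks_like_pdf` and the User-Agent value are my own assumptions, and connection-level errors are not handled.

```python
import urllib.error
import urllib.request


def looks_like_pdf(status, content_type, first_bytes):
    """Return True only if the response plausibly carries a PDF."""
    if status != 200:
        return False
    is_pdf_type = "application/pdf" in (content_type or "").lower()
    has_pdf_magic = first_bytes.startswith(b"%PDF")  # PDF files begin with "%PDF"
    return is_pdf_type or has_pdf_magic


def probe(url, timeout=30):
    """Fetch url and return (status, content_type, looks_like_pdf)."""
    # Assumption: a browser-like User-Agent, since some servers may treat
    # the default Python client differently from a browser.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            status = response.status
            content_type = response.headers.get("Content-Type", "")
            first_bytes = response.read(5)
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404.
        status = err.code
        content_type = err.headers.get("Content-Type", "") if err.headers else ""
        first_bytes = b""
    return status, content_type, looks_like_pdf(status, content_type, first_bytes)
```

Calling `probe("https://aasldpubs.onlinelibrary.wiley.com/doi/pdf/10.1002/hep.23220")` would show what a script actually receives for the publication above; I have not included a result here because it depends on the server and the network.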
@ceteri @philipskokoh Do you know why this happens?
Thanks