Missing Publication and Dataset Resources #6

@HaritzPuerto

Hi,

I ran python corpus.py corpus.ttl and then python download_corpus_resources.py to download the corpus, and got the output below. Is this expected? It looks like some publications and datasets could not be downloaded.

Number of records in the corpus: 586
Number of research publications: 480
Successfully downloaded 474 pdf files.
Missing publication resources: {'012df4a72af52b038483', 'dca54974ff51a5f7f8ab', '5f48a343cb75195cd646', 'c8f9b19b39e34d98a557', '988428e18884e28e037c', '42c2755ec0f983870e62'}
Number of datasets: 106
Successfully downloaded 101 resource files.
Missing dataset resources: {'875ffb2b04b1392cd1f2', 'fe338b5b2f3f6b0d11a4', '53ca68ba0ded95220662', '33b1ce039c67a6658644', '379ff5f518e664ba2353'}

I checked the publication with ID "012df4a72af52b038483", and its link does not appear to be broken when opened in a browser. Here is the link I got from corpus.jsonld:
https://aasldpubs.onlinelibrary.wiley.com/doi/pdf/10.1002/hep.23220
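To narrow this down, a check like the one below could show what the server actually returns to a script, as opposed to a browser. This is only a minimal sketch: I have not confirmed how download_corpus_resources.py fetches files, and the helper names classify_response and probe are hypothetical, not part of the repository. One unconfirmed guess is that some publishers answer scripted requests with an error status or an HTML page instead of the PDF, which this would make visible.

```python
import urllib.error
import urllib.request


def classify_response(status: int, content_type: str, body: bytes) -> str:
    """Label a response from its status, Content-Type, and first bytes.

    Hypothetical helper for diagnosis only; not part of the corpus scripts.
    """
    if status >= 400:
        return "http_error"
    # PDF files begin with the magic bytes "%PDF-".
    if body.lstrip().startswith(b"%PDF-"):
        return "pdf"
    # A landing page or access wall is usually served as HTML.
    if "html" in content_type.lower():
        return "html_page"
    return "other"


def probe(url: str, user_agent: str = "", timeout: float = 30.0) -> str:
    """Fetch the first bytes of `url` and classify the result.

    `user_agent` is optional; passing a browser-like value lets you compare
    the result against the default Python client (an assumption to test,
    not a known fix).
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read(1024)
            content_type = response.headers.get("Content-Type", "")
            return classify_response(response.status, content_type, body)
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; keep the status code.
        return classify_response(err.code, "", b"")
    except urllib.error.URLError:
        return "network_error"


# Example call (requires network access):
# probe("https://aasldpubs.onlinelibrary.wiley.com/doi/pdf/10.1002/hep.23220")
```

If probe returns "html_page" or "http_error" for the missing IDs while the same links open in a browser, the links themselves are probably fine and the failures come from how the request is made.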

@ceteri @philipskokoh Do you know why this happens?

Thanks
