
What types of entities can each scispaCy model recognize? #550

@rferrazd

Hello,

First, thank you for developing and maintaining the scispaCy package — it’s an impressive tool and a valuable contribution to the field of biomedical NLP.

I’m currently experimenting with the en_core_sci_md model, and I would like to better understand what types of entities it is designed to recognize. For example, when testing the following text:

"""
The patient is a 58-year-old male with a history of type 2 diabetes and hypertension.
He presents with chest pain and shortness of breath for the past two hours.
In the emergency room, a troponin test was ordered, which came back elevated.
An urgent coronary angiography was performed, and the patient was started on aspirin and atorvastatin.
He has a known penicillin allergy.
His smoking history is considered a major risk factor.
"""

The words in bold are the ones I wanted to extract as entities, but the model extracted only the following:

  • patient (ENTITY)
  • male (ENTITY)
  • history of type 2 diabetes (ENTITY)
  • hypertension (ENTITY)
  • chest pain (ENTITY)
  • shortness of breath (ENTITY)
  • hours (ENTITY)
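
For reference, a list like the one above can be produced with a few lines of spaCy. This is a minimal sketch, assuming `spacy`, `scispacy`, and the `en_core_sci_md` model package are installed. The helper names `list_entities` and `load_model` are hypothetical, introduced only for illustration.

```python
def list_entities(nlp, text):
    """Return (entity text, label) pairs for every span the pipeline marks."""
    doc = nlp(text)
    return [(ent.text, ent.label_) for ent in doc.ents]


def load_model(name="en_core_sci_md"):
    """Load a spaCy pipeline by name.

    Assumes spaCy and the named model package are installed; the import is
    lazy so that list_entities can be used without spaCy present.
    """
    import spacy
    return spacy.load(name)
```

With the model installed, `for text, label in list_entities(load_model(), note): print(f"{text} ({label})")` prints one line per extracted span, where `note` holds the clinical text quoted above.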

Could you please point me to documentation or resources that describe the entity types covered by this model, so that I can better anticipate what it can and cannot extract?
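
One thing that can be checked locally, independent of any documentation, is the label set a loaded pipeline's NER component was trained with. The sketch below uses spaCy's `pipe_names`, `get_pipe`, and the component's `labels` attribute. The helper name `ner_labels` is hypothetical, and it assumes the component is registered under the name `"ner"`.

```python
def ner_labels(nlp, pipe_name="ner"):
    """Return the labels of the named pipeline component, or () if it is absent."""
    if pipe_name not in nlp.pipe_names:
        return ()
    # spaCy's entity recognizer exposes its label set via `.labels`
    return tuple(nlp.get_pipe(pipe_name).labels)
```

Calling `ner_labels(spacy.load("en_core_sci_md"))` would show whether the model distinguishes entity types at all, or only emits the generic `ENTITY` label seen in the output above.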

Thank you very much for your time and for your excellent work on scispaCy!
