Error when adding textrank component for language model in Python 3.12 Docker setup #282

@matteosdocsity

I encountered an issue when using PyTextRank with spaCy in a Docker container running Python 3.12.7. Adding the textrank component fails for every language model I tried (the Italian model, it, is used in the example below).

Environment:

Python version: 3.12.7 (using the Docker image python:3.12.7-slim-bullseye)
spaCy versions: 3.0.5 and 3.7.4 (tested both)
PyTextRank version: 3.3.0

Steps to reproduce: These are the commands I use to set up the environment in Docker:

RUN /root/.cargo/bin/uv pip install --no-cache --system spacy==3.0.5 pytextrank==3.3.0
RUN python -m spacy download it_core_news_sm
RUN python -m spacy download en_core_web_sm
RUN python -m spacy download es_core_news_sm
RUN python -m spacy download pt_core_news_sm
RUN python -m spacy download ru_core_news_sm
RUN python -m spacy download fr_core_news_sm
RUN python -m spacy download de_core_news_sm
RUN python -m spacy download pl_core_news_sm
RUN python -m spacy download xx_ent_wiki_sm
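
To rule out a failed install before looking at the pipeline itself, a small check can confirm that spaCy, PyTextRank, and each downloaded model are visible to the same interpreter that runs the application. This is only a sketch: the helper name missing_packages is hypothetical, and it assumes the models are installed as regular Python packages under the names used in the download commands above.

```python
import importlib.util

# Model package names taken from the download commands above.
MODELS = [
    "it_core_news_sm",
    "en_core_web_sm",
    "es_core_news_sm",
    "pt_core_news_sm",
    "ru_core_news_sm",
    "fr_core_news_sm",
    "de_core_news_sm",
    "pl_core_news_sm",
    "xx_ent_wiki_sm",
]


def missing_packages(names):
    """Return the names that cannot be located as importable top-level packages."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical usage: run inside the container after the RUN steps.
    missing = missing_packages(["spacy", "pytextrank", *MODELS])
    print("missing:", missing or "none")
```

If anything is reported as missing, the install step is the first thing to look at, rather than the add_pipe call.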

Error Message: The following error is thrown when attempting to add the textrank component to the Italian language model:

  File "/app/extractive_summary/text_rank.py", line 139, in process_chunk_summary
    nlp.add_pipe("textrank", config={"stopwords": {"word": list(self.stop_words)}}, last=True)
  File "/usr/local/lib/python3.12/site-packages/spacy/language.py", line 824, in add_pipe
    pipe_component = self.create_pipe(
                     ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/spacy/language.py", line 693, in create_pipe
    raise ValueError(err)
ValueError: [E002] Can't find factory for 'textrank' for language Italian (it). 
This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. 
If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).

Expected Behavior: The textrank component should be successfully added to the Italian language model without throwing any errors.

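Additional Information: The E002 error suggests that the textrank factory is not registered on spaCy's Language class at the point where nlp.add_pipe is called, so the issue seems to lie in the component registration process rather than in the Italian model itself.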
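
One way to narrow this down is to ask spaCy directly whether it knows a factory called textrank. The sketch below assumes the factory is registered as a side effect of importing pytextrank, which is what PyTextRank's documented usage (import pytextrank, then nlp.add_pipe("textrank")) suggests. If text_rank.py never imports pytextrank before line 139, that alone could explain the error. The function name textrank_factory_status is hypothetical. Language.has_factory is the spaCy v3 class method used for the lookup.

```python
import importlib
import importlib.util


def textrank_factory_status():
    """Report whether spaCy can see a 'textrank' factory in this interpreter."""
    status = {
        "spacy_installed": importlib.util.find_spec("spacy") is not None,
        "pytextrank_installed": importlib.util.find_spec("pytextrank") is not None,
        "factory_registered": False,
        "error": None,
    }
    if status["spacy_installed"] and status["pytextrank_installed"]:
        try:
            # Assumption: importing pytextrank registers the factory with spaCy.
            importlib.import_module("pytextrank")
            from spacy.language import Language

            status["factory_registered"] = bool(Language.has_factory("textrank"))
        except Exception as exc:  # e.g. an incompatible spaCy/Python combination
            status["error"] = repr(exc)
    return status


if __name__ == "__main__":
    print(textrank_factory_status())
```

If factory_registered comes back True here but add_pipe still fails in the application, the difference is most likely that the application process never imports pytextrank before building the pipeline.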
