Added ability to limit the languages to check for #62

Open
whnr wants to merge 1 commit into Mimino666:master from whnr:master

Conversation

whnr commented Feb 20, 2019

Added an optional list of languages to load_profiles that limits which profiles are loaded. The same parameter is also available in detect(text, languages=[]) and detect_langs(text, languages=[]). The _factory is reloaded automatically when the language selection changes.
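
The "reload when the selection changes" idea can be sketched roughly as below. This is not the code from this PR and it does not import langdetect: ReloadingFactory, fake_load_profiles, and ALL_LANGUAGES are hypothetical names, and the profiles are placeholder strings. The sketch also uses None rather than a mutable [] as the default, which is the usual Python precaution for default arguments.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Optional

# Hypothetical stand-in for the full set of language codes a detector ships with.
ALL_LANGUAGES = ("de", "en", "es", "fr", "it")


def fake_load_profiles(languages: Optional[FrozenSet[str]]) -> Dict[str, str]:
    """Hypothetical loader: returns placeholder profiles for the selection.

    None means "no limit", i.e. load every known language.
    """
    if languages is None:
        selected = set(ALL_LANGUAGES)
    else:
        unknown = languages - set(ALL_LANGUAGES)
        if unknown:
            raise ValueError(f"unknown language codes: {sorted(unknown)}")
        selected = languages
    return {lang: f"profile:{lang}" for lang in sorted(selected)}


class ReloadingFactory:
    """Caches loaded profiles and reloads only when the selection changes."""

    def __init__(self, loader: Callable[[Optional[FrozenSet[str]]], Dict[str, str]]):
        self._loader = loader
        self._key: Optional[FrozenSet[str]] = None
        self._profiles: Optional[Dict[str, str]] = None
        self.reload_count = 0  # exposed only to make the behavior observable

    def get(self, languages: Optional[Iterable[str]] = None) -> Dict[str, str]:
        # An empty or missing selection is treated as "all languages".
        # frozenset makes the comparison independent of order and duplicates.
        key = frozenset(languages) if languages else None
        if self._profiles is None or key != self._key:
            self._profiles = self._loader(key)
            self._key = key
            self.reload_count += 1
        return self._profiles


factory = ReloadingFactory(fake_load_profiles)
limited = factory.get(["en", "de"])   # first call loads two profiles
same = factory.get(["de", "en"])      # same selection, no reload
everything = factory.get()            # selection changed, reloads all
```

Keying the cache on a frozenset means that passing the same languages in a different order does not trigger a reload, while any real change to the selection does.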

Added a list as language limitation for load_profiles

@Vangelys

Hello, I would also like to limit the languages used in detection for a project. Is there any news on this functionality and on whether this branch will be accepted?

ManuelMartinG commented Jul 3, 2020

I'm also interested in this feature. Is it expected to be merged into a new release?

My concern is that when langdetect is used within a PySpark UDF, loading every available language is inefficient: it adds considerable overhead to the serialized size of the UDF passed to Spark. An application usually expects only a limited number of languages, so there is no need for such a large list by default.

3 participants