Change body hashing method from SHA512 to Simhash #32

@cstrouse

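The possibilities for duplicate content checking using SHA512 are limited: two bodies only match when they are byte-for-byte identical. What do you think of swapping it out for Simhash, so that more nuanced comparisons of content (such as spotting near-duplicates) would be possible?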

The stopwords package already implements Simhash in Go and has a compatible license.
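
To make the idea concrete, here is a rough, stdlib-only sketch of the technique. It is not the stopwords package's API, and the names `Simhash` and `Distance` are made up for illustration. Hashing tokens with FNV-1a and tokenizing on whitespace are assumptions as well; a real implementation would probably strip HTML and weight tokens.

```go
package main

import (
	"hash/fnv"
	"math/bits"
	"strings"
)

// Simhash returns a 64-bit fingerprint of text (hypothetical helper).
// Each lowercased, whitespace-separated token is hashed with FNV-1a
// (an assumption; any 64-bit hash would do). Every bit position then
// gets a vote: +1 if the token's hash has that bit set, -1 otherwise.
func Simhash(text string) uint64 {
	var votes [64]int
	for _, token := range strings.Fields(strings.ToLower(text)) {
		h := fnv.New64a()
		h.Write([]byte(token))
		sum := h.Sum64()
		for i := 0; i < 64; i++ {
			if sum&(1<<uint(i)) != 0 {
				votes[i]++
			} else {
				votes[i]--
			}
		}
	}
	// A fingerprint bit is set when the votes for that position are positive.
	var fp uint64
	for i, v := range votes {
		if v > 0 {
			fp |= 1 << uint(i)
		}
	}
	return fp
}

// Distance returns the Hamming distance between two fingerprints,
// i.e. the number of bit positions in which they differ (0 to 64).
func Distance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}
```

With something like this, two bodies could be compared via `Distance(Simhash(a), Simhash(b))` and treated as near-duplicates when the distance falls under a chosen threshold. The threshold would need tuning against real pages. A SHA512 digest, by contrast, changes completely after a one-character edit, so it can only answer "identical or not".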

Labels

enhancement (New feature or request)
