Group Similar

Group similar items together.

Runtime complexity is O(N^2 * (M + α(N))), where N is the number of elements in items, M is the runtime complexity of similarityFunction, and α(N) is the inverse Ackermann function, which is effectively constant for all practical input sizes.

Space complexity is O(N).
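
The α(N) term is characteristic of a disjoint-set (union-find) structure, so one way to picture these bounds is the sketch below. It is an illustration under that assumption, not the library's source code: groupSimilarSketch and SketchOptions are hypothetical names, and the real implementation may differ. Every pair of mapped items is compared once (N^2 similarity calls costing M each), matching pairs are merged with near-constant-time union-find operations, and the bookkeeping arrays take O(N) space.

```typescript
// Hypothetical sketch, not the library's source code.
type SketchOptions<T, K> = {
  items: T[];
  mapper: (t: T) => K;
  similarityFunction: (a: K, b: K) => number;
  similarityThreshold: number;
};

function groupSimilarSketch<T, K>(options: SketchOptions<T, K>): T[][] {
  const { items, mapper, similarityFunction, similarityThreshold } = options;
  const mapped = items.map(mapper); // O(N) extra space
  const parent = items.map((_, i) => i); // each item starts as its own root
  const size = items.map(() => 1);

  // Find the root of i's set, shortening the path along the way.
  const find = (i: number): number => {
    while (parent[i] !== i) {
      parent[i] = parent[parent[i]];
      i = parent[i];
    }
    return i;
  };

  // Merge the sets containing a and b, attaching the smaller to the larger.
  const union = (a: number, b: number): void => {
    let rootA = find(a);
    let rootB = find(b);
    if (rootA === rootB) return;
    if (size[rootA] < size[rootB]) {
      const swap = rootA;
      rootA = rootB;
      rootB = swap;
    }
    parent[rootB] = rootA;
    size[rootA] += size[rootB];
  };

  // N * (N - 1) / 2 pairs: one similarity call plus at most one union each.
  for (let i = 0; i < items.length; i++) {
    for (let j = i + 1; j < items.length; j++) {
      if (similarityFunction(mapped[i], mapped[j]) >= similarityThreshold) {
        union(i, j);
      }
    }
  }

  // Collect items by root, keeping groups in order of first appearance.
  const groupIndexByRoot = items.map(() => -1);
  const groups: T[][] = [];
  items.forEach((item, i) => {
    const root = find(i);
    if (groupIndexByRoot[root] === -1) {
      groupIndexByRoot[root] = groups.length;
      groups.push([]);
    }
    groups[groupIndexByRoot[root]].push(item);
  });
  return groups;
}

const parityGroups = groupSimilarSketch({
  items: [1, 5, 10, 0, 2, 123],
  mapper: (i: number) => i,
  similarityFunction: (a: number, b: number) => Number(a % 2 === b % 2),
  similarityThreshold: 1,
});
console.log(parityGroups); // [ [ 1, 5, 123 ], [ 10, 0, 2 ] ]
```

Note that merging through union-find makes grouping transitive in this sketch: if a matches b and b matches c, all three land in one group even when a and c fall below the threshold.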

Getting started

npm i group-similar

Examples

Group similar strings

import { groupSimilar } from "group-similar";
import { distance } from "fastest-levenshtein";

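// Normalized Levenshtein similarity: 1 for identical strings, lower as more
// edits are needed. Two empty strings count as identical, which also avoids
// dividing by zero.
function levenshteinSimilarityFunction(a: string, b: string): number {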
  return a.length === 0 && b.length === 0
    ? 1
    : 1 - distance(a, b) / Math.max(a.length, b.length);
}

groupSimilar({
  items: ["cat", "bat", "kitten", "dog", "sitting"],
  mapper: (i) => i,
  similarityFunction: levenshteinSimilarityFunction,
  similarityThreshold: 0.5,
});

// [ [ 'cat', 'bat' ], [ 'kitten', 'sitting' ], [ 'dog' ] ]
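
To see why these groups form at a threshold of 0.5, the similarity scores can be worked out by hand. The sketch below is dependency-free: editDistance is a hypothetical stand-in for the distance function of fastest-levenshtein, written as a textbook dynamic-programming edit distance, and similarity repeats the formula from the example above.

```typescript
// Hypothetical stand-in for fastest-levenshtein's `distance`:
// classic two-row dynamic-programming Levenshtein distance.
function editDistance(a: string, b: string): number {
  let previous: number[] = [];
  for (let j = 0; j <= b.length; j++) previous.push(j);
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const substitution = previous[j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1);
      const deletion = previous[j] + 1;
      const insertion = current[j - 1] + 1;
      current.push(Math.min(substitution, deletion, insertion));
    }
    previous = current;
  }
  return previous[b.length];
}

// Same normalization as levenshteinSimilarityFunction in the example.
function similarity(a: string, b: string): number {
  return a.length === 0 && b.length === 0
    ? 1
    : 1 - editDistance(a, b) / Math.max(a.length, b.length);
}

const pairs: [string, string][] = [
  ["cat", "bat"], // 1 edit over 3 characters
  ["kitten", "sitting"], // 3 edits over 7 characters
  ["cat", "dog"], // 3 edits over 3 characters
];
for (const [a, b] of pairs) {
  console.log(a, b, similarity(a, b).toFixed(3));
}
// cat bat 0.667
// kitten sitting 0.571
// cat dog 0.000
```

The first two pairs score at or above 0.5 and are grouped; "dog" shares no characters in position with the other three-letter words, so it stays alone.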

Group similar numbers

import { groupSimilar } from "group-similar";

// Returns 1 when both numbers have the same parity, 0 otherwise.
function evenOddSimilarityFunction(a: number, b: number): number {
  return Number(a % 2 === b % 2);
}

groupSimilar({
  items: [1, 5, 10, 0, 2, 123],
  mapper: (i) => i,
  similarityFunction: evenOddSimilarityFunction,
  similarityThreshold: 1,
});

// [ [ 1, 5, 123 ], [ 10, 0, 2 ] ]

Group similar objects

import { groupSimilar } from "group-similar";
import { distance } from "fastest-levenshtein";

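// Extracts the string to compare from each nested object; the original
// objects, not the extracted strings, appear in the result.
function nestedMapper(object: { a: { b: { value: string } } }): string {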
  return object.a.b.value;
}

function levenshteinSimilarityFunction(a: string, b: string): number {
  return a.length === 0 && b.length === 0
    ? 1
    : 1 - distance(a, b) / Math.max(a.length, b.length);
}

groupSimilar({
  items: [
    { a: { b: { value: "sitting" } } },
    { a: { b: { value: "dog" } } },
    { a: { b: { value: "kitten" } } },
    { a: { b: { value: "bat" } } },
    { a: { b: { value: "cat" } } },
  ],
  mapper: nestedMapper,
  similarityFunction: levenshteinSimilarityFunction,
  similarityThreshold: 0.5,
});

// [
//   [{ a: { b: { value: "sitting" } } }, { a: { b: { value: "kitten" } } }],
//   [{ a: { b: { value: "dog" } } }],
//   [{ a: { b: { value: "bat" } } }, { a: { b: { value: "cat" } } }],
// ]

Syntax

groupSimilar(options);
groupSimilar({ items, mapper, similarityFunction, similarityThreshold });

Parameters

Parameter	Type	Required	Default	Description
options	`Object`	Yes	none	Arguments to pass to the function

Options

Property	Type	Required	Default	Description
items	`T[]`	Yes	none	Array of items to group
mapper	`(t: T) => K`	Yes	none	Function to apply to each element in items prior to measuring similarity
similarityFunction	`(a: K, b: K) => number`	Yes	none	Function to measure similarity between mapped items
similarityThreshold	`number`	Yes	none	Minimum similarity value for grouping; two items are grouped together when their similarity is greater than or equal to this threshold
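
The relationship between the generic types T and K in the table can be sketched as a TypeScript interface. The name GroupSimilarOptionsSketch and the Product type are hypothetical and only for illustration; the property names and types are taken from the table. The point is that mapper turns each item of type T into a comparison key of type K, and similarityFunction only ever sees those keys.

```typescript
// Hypothetical interface name; properties mirror the Options table.
interface GroupSimilarOptionsSketch<T, K> {
  items: T[];
  mapper: (t: T) => K;
  similarityFunction: (a: K, b: K) => number;
  similarityThreshold: number;
}

// Hypothetical item type for illustration.
interface Product {
  name: string;
  price: number;
}

// T is Product, K is number: items are compared by price only.
const priceOptions: GroupSimilarOptionsSketch<Product, number> = {
  items: [
    { name: "pen", price: 2 },
    { name: "ink", price: 3 },
    { name: "desk", price: 150 },
  ],
  mapper: (product) => product.price,
  // Ratio of the smaller price to the larger, in (0, 1] for positive prices.
  similarityFunction: (a, b) => Math.min(a, b) / Math.max(a, b),
  similarityThreshold: 0.6,
};

// The keys that similarityFunction would receive.
const priceKeys = priceOptions.items.map(priceOptions.mapper);
console.log(priceKeys); // [ 2, 3, 150 ]

// pen vs. ink clears the threshold (2 / 3 >= 0.6); pen vs. desk does not.
console.log(priceOptions.similarityFunction(2, 3) >= priceOptions.similarityThreshold); // true
console.log(priceOptions.similarityFunction(2, 150) >= priceOptions.similarityThreshold); // false
```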

Return value

The return value is a new nested array of type T[][] containing the elements of items grouped by similarity. If items is empty, an empty array is returned.

Benchmark

Benchmark test results in operations per second (ops/sec), where N is the number of items being grouped. Higher is better.

Library	N=16	N=32	N=64	N=128	N=256	N=512	N=1024	N=2048
group-similar	86867	17538	6067	1594	444	171	75	27
set-clustering	28506	6258	1831	455	121	30	6	1

Benchmark configuration details can be found here.
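
To read the table as relative speedups, the ratios can be computed directly from the figures above. This is only arithmetic on the published numbers, not a rerun of the benchmark; the variable names are illustrative. The set-clustering figures at large N are small whole numbers, so the ratios there are coarse.

```typescript
// Figures copied from the benchmark table (ops/sec).
const sizes = [16, 32, 64, 128, 256, 512, 1024, 2048];
const groupSimilarOps = [86867, 17538, 6067, 1594, 444, 171, 75, 27];
const setClusteringOps = [28506, 6258, 1831, 455, 121, 30, 6, 1];

// Speedup of group-similar over set-clustering at each N.
const speedups = sizes.map((n, i) => ({
  n,
  speedup: groupSimilarOps[i] / setClusteringOps[i],
}));

speedups.forEach(({ n, speedup }) => {
  console.log("N=" + n + ": " + speedup.toFixed(1) + "x");
});
```

On these numbers the gap widens as N grows, from roughly 3x at N=16 to 27x at N=2048.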
