Cluster size decreases as the dataset grows, although the number of similar records stays about the same

Hello,

In a dataset of 2.6M records that contains a set of 28 similar records, Zingg clustered these 28 records into 6 different z_cluster IDs.
Using the same model on a larger dataset of 14.5M records that contains the exact same set of 28 similar records plus 4 more (32 in total), Zingg clustered them into 24 different z_cluster IDs.
Why would Zingg make fewer matches when the total number of records increased (from 2.6M to 14.5M) but the set of similar records stayed pretty much the same (28 vs. 32)?
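
In terms of average group size, that is roughly 28 / 6 ≈ 4.7 records per cluster in the smaller run versus 32 / 24 ≈ 1.3 in the larger one. To make the comparison reproducible, here is a minimal sketch of how such a count could be taken from the output. It assumes the output has been reduced to `(record_id, z_cluster)` pairs; `record_id`, `cluster_summary`, and the sample rows are hypothetical names and toy data, not part of Zingg or of the real datasets.

```python
from collections import Counter


def cluster_summary(rows, known_ids):
    """Summarise how a known set of similar records was clustered.

    rows      -- iterable of (record_id, z_cluster) pairs (assumed layout)
    known_ids -- set of record ids known to be similar (hypothetical key)
    Returns (records found, distinct clusters, average records per cluster).
    """
    # Count how many of the known records landed in each z_cluster.
    sizes = Counter(cluster for rec_id, cluster in rows if rec_id in known_ids)
    n_records = sum(sizes.values())
    n_clusters = len(sizes)
    avg_size = n_records / n_clusters if n_clusters else 0.0
    return n_records, n_clusters, avg_size


# Toy data only: four known records spread over three clusters,
# plus one unrelated record that is ignored.
rows = [(1, "a"), (2, "a"), (3, "b"), (4, "c"), (99, "z")]
known_ids = {1, 2, 3, 4}
n_records, n_clusters, avg_size = cluster_summary(rows, known_ids)
print(n_records, n_clusters, round(avg_size, 2))
```

Running the same summary on both outputs with the same list of known IDs would show whether the set was split into more, smaller clusters in the larger run.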

Thank you,
Rodrigo Escamilla


Group/cluster size decreases as data increases although same similar records number remains about the same #1239