Releases: RTiedrez/ms_blocking
v1.2.1 (2026-02-05)
Bug Fixes
- _score was named score (056bad8)
- Add_blocks_to_dataset does not use index (8fed724)
- Crash on sort (3e4f0ab)
- Flatten does not raise anything (40affb0)
- Must_not_be_different was ignored by MixedBlocker (5807571)
- Normalize crash on checking if array is NaN (3d3f06e)
- Normalize would skip normalization on lists (fb8092f)
- Parse_list and normalize should be better at handling null values (4e7e435)
Refactoring
- Replace must_not_be_different logic with dropna and duplicated combination; fix: preprocessed does not always run on right columns; perf: add second dropna for overlap since empty lists were not considered as NaNs (this might be a huge improvement) (f18d8cd)
- Typo (5e44994)
- Use .empty instead of len()==0 (3fa86f8)
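For readers unfamiliar with the pandas idioms named in the refactoring entries above (`dropna` combined with `duplicated`, and `.empty` in place of `len()==0`), here is a minimal sketch. It is not the library's code: the helper `candidate_rows`, the frame `people`, and the column `name` are hypothetical names chosen for illustration.

```python
import pandas as pd


def candidate_rows(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Keep rows whose value in `column` is non-null and shared with another row.

    Hypothetical helper, not part of ms_blocking.
    """
    # `.empty` reads more directly than `len(df) == 0`.
    if df.empty:
        return df
    # Drop missing values first so they can never be grouped together.
    non_null = df.dropna(subset=[column])
    # keep=False marks every member of a duplicate group, not only the repeats.
    return non_null[non_null.duplicated(subset=[column], keep=False)]


# Illustrative data (assumption): two rows share "Ann", two rows are missing.
people = pd.DataFrame({"name": ["Ann", "Bob", "Ann", None, "Cid", None]})
blocked = candidate_rows(people, "name")
```

In this sketch only the two "Ann" rows survive; "Bob" and "Cid" appear once, and the missing values are dropped before the duplicate check.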
Testing
Detailed Changes: v1.2.0...v1.2.1
v1.2.0 (2026-02-04)
Bug Fixes
- Parse_list crashes on strings that do not represent lists (83b1932)
- Parse_list use startswith instead of endswith (b37e017)
- Switch must_not_be_different and normalize_strings (b4441c4)
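The two Parse_list fixes above concern strings that do not represent lists. The sketch below shows one way such tolerant parsing could look, using only the standard library. It is an assumption-laden illustration, not the ms_blocking implementation: the name `parse_list_sketch` and its fallback behaviour (wrapping non-list strings in a one-element list, returning an empty list for non-strings) are hypothetical.

```python
import ast


def parse_list_sketch(value):
    """Turn a list-like string into a list without crashing on other input.

    Hypothetical helper; the fallback rules are assumptions, not ms_blocking's.
    """
    if isinstance(value, list):
        return value
    if not isinstance(value, str):
        # None, NaN and other non-strings become an empty list (assumption).
        return []
    text = value.strip()
    # Check both ends: a list literal starts with "[" and ends with "]".
    if text.startswith("[") and text.endswith("]"):
        try:
            return ast.literal_eval(text)
        except (ValueError, SyntaxError):
            # Looks like a list but is not a valid literal: keep the raw string.
            return [value]
    # Plain strings are wrapped rather than parsed.
    return [value]
```

For example, `parse_list_sketch("['a', 'b']")` yields a two-element list, while `parse_list_sketch("hello")` is wrapped as a single-element list instead of raising.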
Code Style
Documentation
- Add typehints (86a443c)
- Fix discarded reference (41e8def)
- Fix motives (a128219)
- Fix obsolete type in docstring (3643a00)
- Fix rendering issue in notebook (add6ae5)
- Rebuild docs (49ce632)
- Update example notebook with new motive system (60a3cc9)
Features
- BREAKING CHANGES various improvements and bugfixes (d08173d)
Performance Improvements
- Move dropna higher in the block logic (8cdd3bd)
Refactoring
- Make preprocessing in .block more compact and pandas-esque (9521338)
- Motive as list instead of string (c0e1891)
- Motives are now instances of own classes (9da6796)
- Remove remove_value_if_appears_only_once since it was redundant with df.duplicated (9fce21d)
Testing
- Fix checks depending on (random) ordering of motives (4822c87)
- Update tests to new Motives (41e154e)
Detailed Changes: v1.1.0...v1.2.0
v1.1.0 (2026-02-02)
Bug Fixes
Build System
- Add ruff to cicd (afb3fed)
- Add ruff to cicd (212b248)
- Add versioning to workflow (ea64a3d)
- Remove obsolete comments (022b55a)
Code Style
Documentation
- Add must_not_be_different to example notebook (987978e)
- Add scoring to example notebook (5a9211f)
- Run notebook (3793a13)
- Update readme (c51220e)
Features
- Add eq for AndNode and OrNode (c526ca9)
- Add eq for blockers (ad91e93)
- Add scoring function (eab75ad)
- Revamp motives structure (a381fc3)
Refactoring
- Add underscore in front of new column names; perf: do not add temp column to df in must_not_be_different_apply (043ac2d)
- Move adding motive logic to own function (80b3561)
- Move helpers to utils; docs: add docstrings and typehints to utils (1877168)
- Move must_not_be_different logic to own function (ec7b2f2)
- Move overlap coords generation logic to own function (eb7d1ea)
- Rename block and motive to new names (c6f6fd6)
- Rename Node to BlockerNode; fix: motives handling in AndNode (4e14e6d)
- Ruff (c2799e4)
Testing
- Add tests for MixedBlocker (5dec1a9)
Detailed Changes: v1.0.0...v1.1.0