Summary
With pandas 3.0, Enum variable encoding fails with KeyError: 0. encode() reads the first element with array[0], which pandas treats as a label-based lookup rather than positional access when the Series has a string (StringDtype) index.
Error
policyengine_core/enums/enum.py:69: in encode
if len(array) > 0 and isinstance(array[0], Enum):
^^^^^^^^
pandas/core/series.py:959: in __getitem__
return self._get_value(key)
...
E KeyError: 0
Reproduction
This fails in policyengine-us CI when running the county.yaml tests with pandas 3.0:
policyengine_us/tests/policy/baseline/household/demographic/geographic/county/county.yaml
Root Cause
In policyengine_core/enums/enum.py:69:
if len(array) > 0 and isinstance(array[0], Enum):
With pandas 3.0, string columns use StringDtype by default. When a Series has a string index, array[0] does a label-based lookup (looking for the label 0, which is not in the index) instead of positional access, causing KeyError: 0.
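The behaviour described above can be checked with a small standalone script. This is only a sketch: the `County` Enum and the index labels are hypothetical stand-ins, not the actual policyengine-us data, and the exact outcome of `series[0]` depends on the installed pandas version.

```python
import warnings
from enum import Enum

import pandas as pd


class County(Enum):  # hypothetical stand-in for a policyengine Enum
    LOS_ANGELES_COUNTY_CA = "Los Angeles County, CA"
    COOK_COUNTY_IL = "Cook County, IL"


# A Series of Enum members whose index holds string labels, not integers.
series = pd.Series(
    [County.COOK_COUNTY_IL, County.LOS_ANGELES_COUNTY_CA],
    index=["a", "b"],
)

# Positional access is unambiguous regardless of the index type.
first_by_position = series.iloc[0]

# Plain integer indexing is version dependent: pandas 2.x falls back to
# positional access (with a deprecation warning), while pandas 3.0 treats
# 0 as a label and raises KeyError because the index has no such label.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    try:
        first_by_getitem = series[0]
        getitem_outcome = "positional fallback"
    except KeyError:
        first_by_getitem = None
        getitem_outcome = "KeyError"

print(pd.__version__, getitem_outcome)
```

On pandas 3.0 this should report `KeyError`, matching the traceback in this issue, while `series.iloc[0]` returns the first member either way.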
Proposed Fix
Use .iloc[0] for positional access when dealing with pandas Series:
# Check the length first so an empty input never reaches the index lookup.
first_elem = (array.iloc[0] if hasattr(array, 'iloc') else array[0]) if len(array) > 0 else None
if isinstance(first_elem, Enum):
Or convert to numpy array first:
if hasattr(array, 'values'):
    array = np.asarray(array)
if len(array) > 0 and isinstance(array[0], Enum):
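As a rough illustration of the second option, the check could be wrapped in a small helper. The name `starts_with_enum` and the `Color` Enum are hypothetical and do not exist in policyengine_core; the sketch only shows that converting to a NumPy array first makes `[0]` positional for Series, lists, and arrays alike.

```python
from enum import Enum

import numpy as np
import pandas as pd


class Color(Enum):  # hypothetical Enum used only for this example
    RED = "red"
    BLUE = "blue"


def starts_with_enum(array) -> bool:
    """Return True if the first element, by position, is an Enum member."""
    if len(array) == 0:
        return False
    # np.asarray drops the pandas index, so [0] is always positional here.
    return isinstance(np.asarray(array)[0], Enum)


# Series with a string index: the case that raises KeyError with array[0].
enum_series = pd.Series([Color.RED, Color.BLUE], index=["x", "y"])
print(starts_with_enum(enum_series))

# Plain strings and empty inputs are handled without special cases.
print(starts_with_enum(pd.Series(["red", "blue"], index=["x", "y"])))
print(starts_with_enum(np.array([])))
```

Compared with the `.iloc` variant, this avoids branching on the input type, at the cost of an array conversion for Series inputs.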
Related
filled_array, StringArray handling in VectorialParameterNodeAtInstant