https://autogis-site.readthedocs.io/en/latest/lessons/lesson-2/geopandas-an-introduction.html
Here it says:
Again, we start by grouping the input data by terrain classes, and then compute the sum of each classes’ area. This can be condensed into one line of code:
area_information = data.groupby("CLASS").area.sum()
area_information
And here's the corresponding expected output:
CLASS
32111 1.833747e+03
32112 2.148168e+03
32200 1.057368e+05
32417 1.026678e+02
32421 6.792797e+05
32500 1.097467e+05
32611 1.314807e+07
32612 1.073431e+05
32800 1.407231e+06
32900 6.158391e+05
33000 6.594647e+05
33100 3.769076e+06
34100 1.236289e+07
34300 1.627079e+03
34700 2.785751e+03
35300 1.382940e+06
35411 3.928004e+05
35412 4.708321e+06
35421 6.786374e+04
36200 9.986966e+06
36313 4.346029e+04
Name: area, dtype: float64
At least with my current versions of pandas (2.2.3) and geopandas (1.0.1), I get the following error:
AttributeError: 'DataFrameGroupBy' object has no attribute 'area'
However, I was first able to recreate the desired output bit by bit. I started like so:
data.groupby('CLASS').apply(lambda group: group.area)
But I got a warning:
DeprecationWarning: DataFrameGroupBy.apply operated on the grouping columns. This behavior is deprecated, and in a future version of pandas the grouping columns will be excluded from the operation. Either pass `include_groups=False` to exclude the groupings or explicitly select the grouping columns after groupby to silence this warning.
Passing include_groups=False made the warning go away without changing the output (see this Stack Overflow thread for further details):
data.groupby('CLASS').apply(lambda group: group.area, include_groups=False)
However, because the area attribute in geopandas returns one value per row, the output has an undesired MultiIndex:
CLASS
32111  3116      1833.746786
32112  3115      2148.168209
32200  103     103982.028273
       104       1754.793619
32417  3112       102.667779
                   ...
36313  4299      2651.800270
       4300       376.503380
       4301       413.942555
       4302      3487.927677
       4303      1278.963199
Length: 4304, dtype: float64
After further searching, I was finally able to match the desired output like so:
data.groupby('CLASS').apply(lambda group: group.area, include_groups=False).groupby(level=0).sum()
Here's the output:
CLASS
32111 1.833747e+03
32112 2.148168e+03
32200 1.057368e+05
32417 1.026678e+02
32421 6.792797e+05
...
35411 3.928004e+05
35412 4.708321e+06
35421 6.786374e+04
36200 9.986966e+06
36313 4.346029e+04
Length: 21, dtype: float64
I don't know whether what I did, with two .groupby() operations, is the ideal way to do things, but it worked.