Add `PointCloud` subclass of `Vector` #492
base: main
@adehecq @atedstone @erikmannerfelt This PR is now advanced enough to get your feedback! 😊 All tests passing locally with a LAS file I have. Will need to add one to [...]
@rhugonnet, this looks exceptionally well thought through. I read your descriptions and the changed files, but have not tried out the functionality myself, as I don't immediately have files lying around that would make a good test case. From my side, this looks good to progress to finishing the outstanding tasks you listed. Concerning the bigger possible change about arithmetic ops between Raster and PointCloud: my instinct would be to require the user to decide, so that the comparison is explicit and clear.
This PR adds the `PointCloud` class to facilitate interfacing with this special sub-type of vector data very common in geospatial analysis.

In short, a point cloud is a subclass of `Vector` containing only 2D point geometries and associated to a main data column (+ optionally other auxiliary data columns). This main data column represents the values `.data` of the point cloud, that can be compared to the `.data` of a raster format (which is impossible for a typical vector), or to that of another point cloud if some of their coordinates are the same, or within close tolerance.

Additionally, all functionalities of `Raster` using `.data` that just need coordinates can then be ported to `PointCloud` (array interface with NumPy, subsampling, binning, zonal stats, variography, etc).

Summary
The idea behind the implementation of the `PointCloud` class is two-fold:

- [...] `load` mechanism as `Raster` and support chunked reading.

For this, we add the new optional dependency `laspy` for reading and writing LAS-type format. And we override some of the behaviour of `Vector` in `PointCloud` to allow for a `load()` functionality.

Details
This PR adds the `PointCloud` class, including:

- `data_column` attribute containing the data column name of the point cloud,
- `data` attribute corresponding to the data of the point cloud,
- `all_columns` attribute containing all data columns of the point cloud (`Vector.columns` also contains `geometry`),
- `is_loaded` attribute and `load()` method for when data is read from a LAS file,
- `nb_points` attribute returning the number of points in the point cloud,
- `from_array()`, `from_tuples()` and `from_xyz()` methods to easily create a point cloud from different inputs, and their equivalent `to_array()`, `to_tuples()` and `to_xyz()`,
- `pointcloud_equal()` method simply checking `vector_equal()` and that the `data_column` is equal as well,
- `grid()` functionality as a class method.

The new `PointCloud` class overrides the attributes `ds`, `crs`, `bounds` and `columns` to allow loading only metadata, and implicitly loading the point cloud.

This PR updates existing input/output of some functions:

- `Raster.to_pointcloud()` now returns a `PointCloud` instead of a `Vector`.

Left to-do for this PR
Small changes:

- [...] `geoutils-data` to run tests,
- [...] `Raster.from_regular_pointcloud()` to accept a `PointCloud` input,
- [...] `Raster.interp_points()` and `Raster.reduce_points()` to accept a `PointCloud` as input for coordinates,
- [...] `Vector.create_mask()` to accept a `PointCloud`,
- [...] the manipulation of masking and booleans for point clouds,
- [...] `.data` (same whether 1D or 2D) to point cloud: `subsample()`, `get_stats()` (and functions that work on 1D sets of coordinates like `variogram()`, once moved to GeoUtils),
- [...] `__array_interface__` support for point clouds with the same coordinates, simply pointing to that of `self.ds[data_column]`?

Potential bigger change?
This last change could be to:

- [...] `Raster` and `PointCloud`? (e.g., defaulting to using `interp_points` at the `PointCloud` coordinates, the comparison method could be tweaked in `geoutils.config`),

It could work as:
The only problem I see is that interpolation is not ideal for a point cloud much denser than a raster (e.g., a lidar point cloud). In that case, it's better to grid the point cloud into a raster, and compare the two rasters. But that is computationally intensive, so having it happen under the hood is not ideal.
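To make the two comparison strategies concrete, here is a minimal dependency-free sketch. It is not the GeoUtils implementation: all function names (`interp_bilinear`, `diff_by_interpolation`, `grid_points_mean`, `diff_by_gridding`) are hypothetical, coordinates are assumed to be pixel-centre indices rather than georeferenced, and points are assumed to fall inside the raster extent.

```python
from math import floor


def interp_bilinear(grid, x, y):
    """Bilinearly interpolate `grid` at column x, row y (pixel-centre indices)."""
    nrows, ncols = len(grid), len(grid[0])
    # Clamp the upper-left corner so the 2x2 neighbourhood stays inside the grid.
    j0 = min(max(floor(x), 0), ncols - 2)
    i0 = min(max(floor(y), 0), nrows - 2)
    tx, ty = x - j0, y - i0
    upper = grid[i0][j0] * (1 - tx) + grid[i0][j0 + 1] * tx
    lower = grid[i0 + 1][j0] * (1 - tx) + grid[i0 + 1][j0 + 1] * tx
    return upper * (1 - ty) + lower * ty


def diff_by_interpolation(grid, points):
    """Strategy 1: raster interpolated at each point, minus the point value."""
    return [interp_bilinear(grid, x, y) - z for x, y, z in points]


def grid_points_mean(points, nrows, ncols):
    """Bin points into their nearest pixel and average the values per pixel."""
    sums, counts = {}, {}
    for x, y, z in points:
        i, j = floor(y + 0.5), floor(x + 0.5)
        if 0 <= i < nrows and 0 <= j < ncols:
            sums[(i, j)] = sums.get((i, j), 0.0) + z
            counts[(i, j)] = counts.get((i, j), 0) + 1
    # Pixels that received no point are marked as None (no data).
    return [
        [sums[(i, j)] / counts[(i, j)] if (i, j) in counts else None
         for j in range(ncols)]
        for i in range(nrows)
    ]


def diff_by_gridding(grid, points):
    """Strategy 2: grid the points first, then difference the two rasters."""
    gridded = grid_points_mean(points, len(grid), len(grid[0]))
    return [
        [None if g is None else r - g for r, g in zip(raster_row, gridded_row)]
        for raster_row, gridded_row in zip(grid, gridded)
    ]


raster = [[0.0, 1.0], [2.0, 3.0]]
cloud = [(0.5, 0.5, 1.0), (0.25, 0.0, 0.5)]  # (x, y, value) tuples
print(diff_by_interpolation(raster, cloud))
print(diff_by_gridding(raster, cloud))
```

Interpolation returns one difference per point, while gridding returns one difference per pixel and averages away the detail of a dense cloud, which is why the second route suits dense lidar data better despite its cost.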
We could have a criterion based on point cloud density relative to the raster size:
Or we do not add such a functionality and always leave this to the user, which remains fairly short as long as we support `Raster`/`Raster` arithmetic and `PointCloud`/`PointCloud` arithmetic:

And add a public function to calculate relative point density to help users decide when need be?
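As a rough illustration of what such a density helper might compute, the sketch below estimates the mean number of points per raster pixel over the point cloud's bounding box. The names `relative_point_density` and `suggest_comparison`, and the threshold of one point per pixel, are assumptions made for this example and are not part of GeoUtils.

```python
def relative_point_density(n_points, bounds, pixel_size):
    """Mean number of points per raster pixel over the bounding box.

    bounds is (xmin, ymin, xmax, ymax) in the same units as pixel_size.
    """
    xmin, ymin, xmax, ymax = bounds
    area = (xmax - xmin) * (ymax - ymin)
    if area <= 0 or pixel_size <= 0:
        raise ValueError("bounds must enclose a positive area and pixel_size must be > 0")
    n_pixels = area / pixel_size**2
    return n_points / n_pixels


def suggest_comparison(n_points, bounds, pixel_size, threshold=1.0):
    """Suggest a strategy; the default threshold is an arbitrary assumption."""
    density = relative_point_density(n_points, bounds, pixel_size)
    # Dense clouds (many points per pixel) are better gridded than interpolated.
    return "grid" if density > threshold else "interpolate"


print(suggest_comparison(10_000, (0.0, 0.0, 100.0, 100.0), pixel_size=10.0))
print(suggest_comparison(50, (0.0, 0.0, 100.0, 100.0), pixel_size=10.0))
```

Using the bounding box underestimates density for clouds with an irregular footprint, so a real helper would probably want a better area estimate.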
Future functionalities

- [...] `PointCloud` class in a GeoPandas accessor (technically Pandas accessor, see here: ENH: Add `register_geo(series|dataframe)_accessor` API (#1947) geopandas/geopandas#1952 (comment)).

Resolves #463