Point Clouds

Point cloud data is one of my favorite, and most frustrating, forms of data to work with. It marks a significant milestone in most workflows, since observations collected from the natural world are most commonly represented as points, and very few large-scale applications have inputs that don’t touch on or hinge on point cloud data and the phenomena it aims to represent.

Among the most accessible and immediately relevant data an average user will encounter are the point clouds in the USGS Entwine dataset.

“The Point Data Abstraction Library (PDAL) is a C++ library for translating and manipulating point cloud data.” Much like the other giants of geospatial processing, and perhaps because of its orientation toward “big data”, PDAL has really lit the path toward cloud-native and cloud-optimized data manipulation. By that, I mean the ability to perform partial read operations over HTTP, as opposed to more traditional computing pathways, which require the entire dataset to be read into memory or otherwise serially streamed. This requires the target point cloud to have been saved as an Entwine point cloud, something the USGS and Hobu have been working on diligently; to date there are 2,125 point clouds staged and available for you to hit (see the links under Data).

It’s worth noting, however, that just because there are cloud-native paths to stream that data into the compute environment does not mean that streaming is the fastest or even the cheapest way to accomplish your end goal. If you plan on hitting the data more than a few times, or if the dataset is static, you may be able to read more data in faster by performing a one-time download of the massive dataset than by pulling tiny subsets over and over each time you’d like to run your analysis. Because this is so specific to hardware, geography, and dataset, timings have been hard to track down, but using a Dockerized image on an Intel Core i5-9600K CPU @ 3.7 GHz with 32 GB of RAM available and light background use, the following timings were found:
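The download-once versus stream-every-time trade-off described above can be framed as a simple break-even calculation. The sketch below is only an illustration: the function name `break_even_runs` is hypothetical, and the numbers in the usage line are placeholders, not measurements from the machine described above. It assumes each analysis run costs a fixed time whether it streams a subset or reads a local copy.

```python
import math


def break_even_runs(download_s, local_read_s, stream_s):
    """Smallest number of runs at which downloading once is strictly faster.

    download_s   -- one-time cost of downloading the whole dataset (seconds)
    local_read_s -- per-run cost of reading the subset from the local copy
    stream_s     -- per-run cost of streaming the subset over HTTP

    Returns None when streaming is never slower per run than a local read,
    because the download then never pays for itself.
    """
    per_run_saving = stream_s - local_read_s
    if per_run_saving <= 0:
        return None
    # Solve download_s + n * local_read_s < n * stream_s for the smallest integer n.
    return math.floor(download_s / per_run_saving) + 1


# Placeholder numbers: a 600 s download, 5 s local reads, 65 s streamed reads.
print(break_even_runs(600, 5, 65))  # 11
```

With those placeholder values, the download only pays for itself on the eleventh run; plug in your own measured timings to see where your workflow lands.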

https://pdal.io/en/2.7.1/project/docs.html
https://pdal.io/en/2.6.0/workshop/manipulation/ground/ground.html#workshop-ground
https://pdal.io/en/2.4.3/workshop/exercises/analysis/dtm/dtm.html
https://pdal.io/en/2.4.3/workshop/exercises/analysis/rasterize/rasterize.html

There are two sorts of workflows

Common pipeline examples:

Pull class to tif:

{
  "pipeline": [
    {
      "type": "readers.ept",
      "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/TX_Coastal_B1_2018/ept.json",
      "bounds": "([-10605693, -10601522], [3505846, 3508523])"
    },
    {
      "type": "filters.range",
      "limits": "Classification[17:17]"
    },
    {
      "type": "writers.gdal",
      "dimension": "Z",
      "resolution": 1,
      "filename": "bridge_1.tif"
    }
  ]
}

Pull to csv:

{
  "type": "writers.text",
  "format": "csv",
  "order": "X,Y,Z",
  "keep_unspecified": "false",
  "filename": "outputfile.csv"
}
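The CSV example above is only a writer stage, so it needs a reader (and optionally a filter) in front of it to form a complete pipeline. As a minimal sketch, the Python below assembles such a pipeline with the standard library and serializes it to JSON. The helper name `build_csv_pipeline` and its `classification` parameter are hypothetical conveniences, not part of PDAL; the stage options themselves are copied from the examples above.

```python
import json

EPT_URL = "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/TX_Coastal_B1_2018/ept.json"
BOUNDS = "([-10605693, -10601522], [3505846, 3508523])"


def build_csv_pipeline(ept_url, bounds, out_csv, classification=None):
    """Return a PDAL pipeline dict: EPT reader -> optional class filter -> CSV writer."""
    stages = [{"type": "readers.ept", "filename": ept_url, "bounds": bounds}]
    if classification is not None:
        # Keep only points whose Classification equals the requested code.
        stages.append({
            "type": "filters.range",
            "limits": f"Classification[{classification}:{classification}]",
        })
    stages.append({
        "type": "writers.text",
        "format": "csv",
        "order": "X,Y,Z",
        "keep_unspecified": "false",
        "filename": out_csv,
    })
    return {"pipeline": stages}


pipeline = build_csv_pipeline(EPT_URL, BOUNDS, "outputfile.csv", classification=17)
pipeline_json = json.dumps(pipeline, indent=2)
print(pipeline_json)
```

Saving the printed JSON to a file (for example `class_to_csv.json`, a name chosen here for illustration) and running `pdal pipeline class_to_csv.json` should then execute it, assuming PDAL is installed and the EPT endpoint is reachable.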

Pull first and only returns to tif:

{
  "pipeline": [
    {
      "type": "readers.ept",
      "filename": "https://s3-us-west-2.amazonaws.com/usgs-lidar-public/TX_Coastal_B1_2018/ept.json",
      "bounds": "([-10605693, -10601522], [3505846, 3508523])"
    },
    {
      "type": "filters.returns",
      "groups": "first,only"
    },
    {
      "type": "writers.gdal",
      "dimension": "Z",
      "resolution": 1,
      "filename": "FR_1.tif"
    }
  ]
}
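The `bounds` strings in these pipelines take the form `([xmin, xmax], [ymin, ymax])` in the coordinate system of the EPT resource. The sketch below assumes, and this is an assumption rather than something stated above, that the USGS EPT data are stored in Web Mercator (EPSG:3857), and it uses the standard spherical Web Mercator formula to turn a longitude/latitude box into such a string. The function names are hypothetical; for production work a projection library such as pyproj would be the more robust choice.

```python
import math

EARTH_RADIUS_M = 6378137.0  # sphere radius used by Web Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS84 longitude/latitude (degrees) to spherical Web Mercator metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def ept_bounds(west, south, east, north):
    """Format a lon/lat box as a PDAL bounds string, rounded to whole metres."""
    xmin, ymin = lonlat_to_web_mercator(west, south)
    xmax, ymax = lonlat_to_web_mercator(east, north)
    return f"([{xmin:.0f}, {xmax:.0f}], [{ymin:.0f}, {ymax:.0f}])"


# Hypothetical box near the Texas coast, chosen only for illustration.
print(ept_bounds(-95.3, 30.0, -95.2, 30.1))
```

The resulting string can be dropped into the `bounds` option of `readers.ept` so that only the points inside the box are requested.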

Data

https://registry.opendata.aws/usgs-lidar/
https://usgs.entwine.io/
https://portal.opentopography.org/usgsDataset?dsid=USGS_LPC_TX_Central_B1_2017_LAS_2019
https://www.sciencebase.gov/catalog/item/4f70ab64e4b058caae3f8def

Other tools

https://github.com/oilshit/las_converter
https://maps.equatorstudios.com/
https://www.danielgm.net/cc/
