Reading with PDAL

Author: Bradley Chambers
Contact: brad.chambers@gmail.com
Date: 01/21/2015

This tutorial is presented in two parts: the first is an introduction to the command-line utilities that can be used to perform processing operations with PDAL, and the second is an introductory C++ tutorial showing how to use the PDAL API to accomplish similar tasks.

Introduction

PDAL is both a C++ library and a collection of command-line utilities for data processing operations. While it is similar to LAStools in a few aspects, and borrows some of its lineage in others, PDAL is intended first as a data translation library and second as an exploitation and filtering library. PDAL exists to provide an abstract API for software developers who must navigate the multitude of point cloud formats that are out there. Its value and niche are explicitly modeled after the hugely successful GDAL library, which provides an abstract API for data formats in the GIS raster data space.

A basic inquiry example

Our first example to demonstrate PDAL’s utility will simply query an ASPRS LAS file to see what data are stored in its very first point.

Note

The interesting.las file in these examples can be found on GitHub.

pdal info outputs JSON.

$ pdal info interesting.las -p 0
{
  "filename": "interesting.las",
  "pdal_version": "1.0.1 (git-version: 80644d)",
  "points":
  {
    "point":
    {
      "Blue": 88,
      "Classification": 1,
      "EdgeOfFlightLine": 0,
      "GpsTime": 245381,
      "Green": 77,
      "Intensity": 143,
      "NumberOfReturns": 1,
      "PointId": 0,
      "PointSourceId": 7326,
      "Red": 68,
      "ReturnNumber": 1,
      "ScanAngleRank": -9,
      "ScanDirectionFlag": 1,
      "UserData": 132,
      "X": 637012,
      "Y": 849028,
      "Z": 431.66
    }
  }
}
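The info utility is not limited to single points. The invocations below are sketches based on the current PDAL documentation; the exact flags may differ in your version, so consult pdal info --help:

```shell
# Print a range of points rather than just one
# (the -p option is documented to accept ranges like 0-2)
pdal info interesting.las -p 0-2

# Print per-dimension summary statistics (min/max/mean) instead
pdal info --stats interesting.las
```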

A conversion example

Conversion of one file format to another can be a hairy topic. You should expect details of the data to leak as they are converted from the source format to the destination format. Metadata, file organization, and the data themselves may not be representable as you move from one format to another. Conversion is by definition lossy: if not in terms of the actual data themselves, then in terms of the auxiliary data the format also carries.

It is also important to recognize that both fixed and flexible point cloud formats exist, and conversion of flexible formats to fixed formats will often leak. The dimensions might even match in terms of type or name, but not in terms of width or interpretation.

See also

See pdal::Dimension for details on PDAL dimensions.

$ pdal translate interesting.las output.txt
"X","Y","Z","Intensity","ReturnNumber","NumberOfReturns","ScanDirectionFlag","EdgeOfFlightLine","Classification","ScanAngleRank","UserData","PointSourceId","Time","Red","Green","Blue"
637012.24,849028.31,431.66,143,1,1,1,0,1,-9,132,7326,245381,68,77,88
636896.33,849087.70,446.39,18,1,2,1,0,1,-11,128,7326,245381,54,66,68
636784.74,849106.66,426.71,118,1,1,0,0,1,-10,122,7326,245382,112,97,114
636699.38,848991.01,425.39,100,1,1,0,0,1,-6,124,7326,245383,178,138,162
636601.87,849018.60,425.10,124,1,1,1,0,1,-4,126,7326,245383,134,104,134
636451.97,849250.59,435.17,48,1,1,0,0,1,-9,122,7326,245384,99,85,95
...

The text format, of course, is the ultimate flexible-definition format – at least for the point data themselves. For the other header information, like the spatial reference system, or the ASPRS LAS UUID, the conversion leaks. In short, you may need to preserve some more information as part of your conversion to make it useful down the road.
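When you do not want every dimension in the text output, the text writer's options can be set on the command line to trim and format it. The option names used here (order, keep_unspecified, precision) are assumptions drawn from the text writer's documentation; verify them for your PDAL version before relying on them:

```shell
# Write only X, Y, Z, and Intensity, two decimal places each.
# Option names assumed from the writers.text documentation;
# confirm with your version's documentation for writers.text.
pdal translate interesting.las output.csv \
    --writers.text.order="X,Y,Z,Intensity" \
    --writers.text.keep_unspecified=false \
    --writers.text.precision=2
```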

Metadata

PDAL transmits this other information in the form of Metadata that is carried per-stage throughout the PDAL processing pipeline. We can capture this metadata using the info utility.

$ pdal info --metadata interesting.las

This produces the file's metadata as JSON. You can use your JSON manipulation tools to extract this information. For formats that do not have the ability to preserve this metadata internally, you can keep a .json file alongside the .txt file as auxiliary information.
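Producing such a sidecar file can be as simple as redirecting the utility's output (the output filename here is just illustrative):

```shell
# Capture the LAS file's metadata as a JSON sidecar
# to keep next to the converted output.txt
pdal info --metadata interesting.las > output.json
```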

See also

The Metadata document contains much more detail on the metadata workflow in PDAL.

A pipeline example

The full power of PDAL comes in the form of pipeline invocations. While translate is useful for simple conversion of one format to another, it gives the user little ability to filter or alter data as they are converted. Pipelines are the way to take advantage of PDAL’s ability to manipulate data as they are converted. This section provides a basic example and demonstration of Pipeline; the Pipeline document contains a more detailed exposition of the topic.

Note

The pipeline document contains detailed examples and background information.

The pipeline PDAL utility takes a .json file containing a pipeline description that defines a PDAL processing pipeline. Options can be given for each pdal::Stage of the pipeline to affect different aspects of processing, and stages may be chained together in many combinations to varying effect.

Simple conversion

The following JSON document defines a Pipeline that takes the file.las ASPRS LAS file and converts it to a new file called output.las.

{
  "pipeline":[
    "file.las",
    "output.las"
  ]
}
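Filtering stages are added as additional entries in the array, each with its own options. As a sketch, a pipeline that keeps only points classified as ground (class 2) by inserting a filters.range stage between the reader and writer might look like the following; the filter name and its limits option are taken from the PDAL filter documentation, so check that your version provides them:

```json
{
  "pipeline":[
    "file.las",
    {
      "type":"filters.range",
      "limits":"Classification[2:2]"
    },
    "output.las"
  ]
}
```

Saved as, say, ground.json, this would be run with pdal pipeline ground.json.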

Loop a directory and filter it through a pipeline

This little shell one-liner loops through a directory and pushes the LAS files through a pipeline, substituting the input and output filenames as it goes; xargs -P20 runs up to twenty conversions in parallel.

ls *.las | cut -d. -f1 | xargs -P20 -I{} pdal pipeline -i /path/to/proj.json --readers.las.filename={}.las --writers.las.filename=output/{}.laz
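For readability, the same operation can be expanded into an explicit loop, trading the parallelism of xargs -P20 for clarity. The /path/to/proj.json and output/ paths mirror the placeholders above:

```shell
#!/bin/sh
# Run every .las file in the current directory through the pipeline,
# writing compressed .laz output into output/.
PIPELINE=/path/to/proj.json   # pipeline template, as above
OUTDIR=output
mkdir -p "$OUTDIR"
for f in *.las; do
    [ -e "$f" ] || continue   # skip if the glob matched nothing
    base=${f%.las}            # strip the .las extension
    pdal pipeline -i "$PIPELINE" \
        --readers.las.filename="$f" \
        --writers.las.filename="$OUTDIR/$base.laz"
done
```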