filters.python

The filters.python filter allows Python code to be embedded in a Pipeline. The code interacts with a NumPy array of the point data and can modify those points. Additionally, some global Metadata is also available for Python functions to interact with.

The function must have two NumPy arrays as arguments, ins and outs. The ins array represents the points before the filters.python filter and the outs array represents the points after filtering.

Warning

filters.python was called filters.programmable and filters.predicate before PDAL 1.6. The functionality of filters.programmable and filters.predicate was rolled into filters.python, and the filter was renamed for clarity.

Warning

Each array contains all the Dimensions of the incoming ins point schema. Each array in outs must match the corresponding ins array in both shape and type.

Dynamic Plugin

This stage requires a dynamic plugin to operate

import numpy as np

def multiply_z(ins,outs):
    Z = ins['Z']
    Z = Z * 10.0
    outs['Z'] = Z
    return True
  1. The function must always return True upon success. If the function returns False, an error is thrown and the Pipeline exits.
  2. If you want to write a dimension that might not be available, you can use one or more add_dimension options.
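For example, a hypothetical stage that declares a new dimension (the name NewDimension is illustrative, not from the original) so the function can safely write to it might look like:

```json
{
  "type":"filters.python",
  "script":"multiply_z.py",
  "function":"multiply_z",
  "module":"anything",
  "add_dimension":"NewDimension"
}
```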

Note

To remove points based on a Python function, set values in the special “Mask” dimension, as described in Predicates below.

Modification Example

{
  "pipeline":[
    "file-input.las",
    {
      "type":"filters.ground"
    },
    {
      "type":"filters.python",
      "script":"multiply_z.py",
      "function":"multiply_z",
      "module":"anything"
    },
    {
      "type":"writers.las",
      "filename":"file-filtered.las"
    }
  ]
}

The JSON pipeline file references the external multiply_z.py Python script, which scales the Z coordinate by a factor of 10.

import numpy as np

def multiply_z(ins,outs):
    Z = ins['Z']
    Z = Z * 10.0
    outs['Z'] = Z
    return True

Predicates

Points can be retained/removed from the stream by setting true/false values into a special “Mask” dimension in the output point array.

The example below sets the “Mask” to true for points that are in classifications 1 or 2 and to false otherwise, causing points that are not classified 1 or 2 to be dropped from the point stream.

import numpy as np

def filter(ins,outs):
   cls = ins['Classification']

   keep_classes = [1,2]

   # Use the first test for our base array.
   keep = np.equal(cls, keep_classes[0])

   # For 1:n, test each predicate and join back
   # to our existing predicate array. Adding
   # boolean arrays is a logical OR.
   for k in range(1,len(keep_classes)):
       t = np.equal(cls, keep_classes[k])
       keep = keep + t

   outs['Mask'] = keep
   return True
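The membership test above can be written more compactly with NumPy’s isin; a minimal sketch of the same predicate, with simulated input arrays standing in for what PDAL would pass:

```python
import numpy as np

def filter(ins, outs):
    # Keep only points whose Classification is 1 or 2.
    keep_classes = [1, 2]
    outs['Mask'] = np.isin(ins['Classification'], keep_classes)
    return True

# Simulated input; PDAL normally supplies these arrays.
ins = {'Classification': np.array([0, 1, 2, 3, 1], dtype=np.uint8)}
outs = {}
filter(ins, outs)
# outs['Mask'] is [False, True, True, False, True]
```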

Note

filters.range is a specialized filter that implements the exact functionality described in this Python operation. It is likely to be much faster than Python, but not as flexible. filters.python is the tool to use for prototyping point stream processing operations.
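For reference, the equivalent filters.range stage would use its limits option; this sketch follows the range syntax described in the filters.range documentation and keeps classifications 1 through 2:

```json
{
  "type":"filters.range",
  "limits":"Classification[1:2]"
}
```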

See also

If you just want to read a Pipeline of operations into a NumPy array, the PDAL Python extension might be what you want. See it at https://pypi.python.org/pypi/PDAL

Example Pipeline

{
  "pipeline":[
    "file-input.las",
    {
      "type":"filters.ground"
    },
    {
      "type":"filters.python",
      "script":"filter_pdal.py",
      "function":"filter",
      "module":"anything"
    },
    {
      "type":"writers.las",
      "filename":"file-filtered.las"
    }
  ]
}

Module Globals

Three global variables are added to the Python module as it is run to allow you to get Dimensions, Metadata, and coordinate system information. Additionally, the metadata object can be set by the function to modify metadata for the in-scope filters.python pdal::Stage.

def myfunc(ins,outs):
    print ('schema: ', schema)
    print ('srs: ', spatialreference)
    print ('metadata: ', metadata)
    return True

Updating metadata

The filter can update the global metadata dictionary as needed. Declare metadata as a global variable in the function’s scope, and the updates will be reflected back into the pipeline from that stage forward.

def myfunc(ins,outs):
  global metadata
  metadata = {
      'name': 'root',
      'value': 'a string',
      'type': 'string',
      'description': 'a description',
      'children': [
          {'name': 'filters.python', 'value': 52, 'type': 'integer',
           'description': 'a filter description', 'children': []},
          {'name': 'readers.faux', 'value': 'another string', 'type': 'string',
           'description': 'a reader description', 'children': []}
      ]
  }
  return True

Passing Python objects

As of PDAL 1.5, it is possible to pass filters.python an option of JSON representing a Python dictionary containing objects you want to use in your function. This feature is useful in situations where you wish to call pipeline with substitutions.

If we needed to provide the Z scaling factor of the Modification Example as a Python argument, we could place it in a dictionary and pass that to the filter as a separate option. This feature allows us to easily reuse the same basic Python function while substituting values as necessary.

{
  "pipeline":[
    "input.las",
    {
      "type":"filters.python",
      "module":"anything",
      "function":"filter",
      "script":"arguments.py",
      "pdalargs":"{\"factor\":0.3048,\"an_argument\":42, \"another\": \"a string\"}"
    },
    "output.las"
  ]
}

With that option set, you can now fetch the pdalargs dictionary in your Python script and use it:

import numpy as np

def multiply_z(ins,outs):
    Z = ins['Z']
    Z = Z * float(pdalargs['factor'])
    outs['Z'] = Z
    return True
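Outside of PDAL, the injected pdalargs global can be simulated for quick local testing by defining it yourself; a sketch, with stand-in arrays for what PDAL would pass:

```python
import numpy as np

# Simulate the global that filters.python injects from the pdalargs option.
pdalargs = {'factor': 0.3048, 'an_argument': 42, 'another': 'a string'}

def multiply_z(ins, outs):
    Z = ins['Z']
    Z = Z * float(pdalargs['factor'])
    outs['Z'] = Z
    return True

# Feet-to-meters scaling of two sample Z values.
ins = {'Z': np.array([10.0, 20.0])}
outs = {}
multiply_z(ins, outs)
```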

Standard output and error

A redirector module is available for scripts to output to PDAL’s log stream explicitly. The module handles redirecting sys.stderr and sys.stdout for you transparently, but it can be used directly by scripts. See the PDAL source code for more details.

Options

script
When reading a function from a separate Python file, the file name to read from. [Example: functions.py]
module
The Python module that is holding the function to run. [Required]
function
The function to call.
source
The literal Python code to execute, when the script option is not being used.
add_dimension
The name of a dimension to add to the pipeline that does not already exist.
pdalargs
A JSON dictionary of items you wish to pass into the module’s globals as the pdalargs object.
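As a sketch of the source option, the function body can be embedded directly in the pipeline instead of referencing a script file (the embedded code mirrors the earlier multiply_z example):

```json
{
  "type":"filters.python",
  "module":"anything",
  "function":"multiply_z",
  "source":"import numpy as np\ndef multiply_z(ins,outs):\n    outs['Z'] = ins['Z'] * 10.0\n    return True"
}
```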