Parallel Execution of Models

Introduction

QInfer provides tools to expedite simulation by distributing computation across multiple nodes using standard parallelization tools.

Distributed Computation with IPython

The ipyparallel package (previously IPython.parallel) provides facilities for parallelizing computation across multiple cores and/or nodes. ipyparallel separates computation into a controller that is responsible for one or more engines, and a client that sends commands to these engines via the controller. QInfer can use a client to send likelihood evaluation calls to engines, via the DirectViewParallelizedModel class.

This class takes a DirectView onto one or more engines, typically obtained with an expression similar to client[:], and splits calls to likelihood() across the engines accessible from the DirectView.

>>> from ipyparallel import Client 
>>> from qinfer import SimplePrecessionModel 
>>> from qinfer import DirectViewParallelizedModel 
>>> c = Client() 
>>> serial_model = SimplePrecessionModel() 
>>> parallel_model = DirectViewParallelizedModel(serial_model, c[:]) 

The newly decorated model will now distribute likelihood calls, such that each engine computes the likelihood for an equal number of particles. As a consequence, information shared per-experiment or per-outcome is local to each engine, and is not distributed. Therefore, this approach works best at quickly parallelizing where the per-model cost is significantly larger than the per-experiment or per-outcome cost.

Note

The DirectViewParallelizedModel assumes that it has ownership over engines, such that the behavior is unpredictable if any further commands are sent to the engines from outside the class.

Distributed Performance Testing

As an alternative to distributing a single likelihood call across multiple engines, QInfer also supports distributed performance testing. Under this model, each engine performs an independent trial of an estimation procedure, which is then collected by the client process. Distributed performance testing is implemented using the perf_test_multiple() function, with the keyword argument apply provided. For instance, the ipyparallel package offers a LoadBalancedView class whose apply() method sends tasks to engines according to their respective loads.

>>> lbview = client.load_balanced_view() 
>>> performance = qi.perf_test_multiple(
...     100, serial_model, 6000, prior, 200, heuristic_class,
...     apply=lbview.apply
... ) 

Examples of both approaches to parallelization are provided as a Jupyter Notebook.

GPGPU-based Likelihood Computation with PyOpenCL

Though QInfer does not yet have built-in support for GPU-based parallelization, PyOpenCL can be used to effectively distribute models as well. Here, the Cartesian product over outcomes, models and experiments matches closely the OpenCL concept of a global ID, as this example demonstrates. Once a kernel is developed in this way, PyOpenCL will allow for it to be used with any available OpenCL-compliant device.

Note that for sufficiently fast models, the overhead of copying data between the CPU and GPU may overwhelm any speed benefits obtained by this parallelization.