iris: Ultrafast electron diffraction data exploration

iris is both a library for interacting with ultrafast electron diffraction data and a GUI frontend for interactively exploring this data.

The code presented herein has been in use at some point by the Siwick research group.

_images/iris_screen.png

General Documentation

Installation

Standalone Installation

Starting with iris 5.1.0, standalone Windows installers and executables are available. You can find them on the GitHub release page.

The standalone installers and executables make the installation of iris completely separate from any other Python installation. This method should be preferred, unless Python scripting using the iris library is required.

Installing the Python Package

If you want to script using iris data structures and algorithms, you need to install the iris-ued package.

Note

Users are strongly encouraged to manage iris’s dependencies with the excellent Intel Distribution for Python, which provides easy access to these dependencies and more.

iris is available on PyPI as iris-ued:

python -m pip install iris-ued

iris is also available on the conda-forge channel:

conda config --add channels conda-forge
conda install iris-ued

You can install the latest developer version of iris by cloning the git repository:

git clone https://github.com/LaurentRDC/iris-ued.git

…then installing the package with:

cd iris-ued
python setup.py install

In Python code, iris can be imported as follows:

import iris

Test data

Reduced test datasets are made available by the Siwick research group. The data can be accessed on the public data repository.

Testing

If you want to check that all the tests are running correctly with your Python configuration, type:

python setup.py test

Using iris: typical workflow

Before you start

You might want to download test datasets before you start to play around. Reduced test datasets are made available by the Siwick research group and can be accessed on the public data repository.

Startup

To start the GUI from the command line:

> python -m iris

Note that the command-line interface has some useful options:

> python -m iris --help
usage: iris [-h] [-v] {open,docs} ...

Iris is both a library for interacting with ultrafast electron diffraction
data, as well as a GUI frontend for interactively exploring this data. Below
are some helpful commands.

optional arguments:
-h, --help        show this help message and exit
-v, --version     show program's version number and exit

Subcommands:
{open,docs}  Available sub-commands
    open            Dataset to open with iris start-up.
    docs            Open online documentation in your default web browser.

Running this command without any parameters will launch the graphical user
interface. Documentation is available here: https://iris-ued.readthedocs.io/

Most importantly, you can programmatically start the GUI while opening a dataset:

> python -m iris open --reduced ~/dataset.hdf5

The path can point to a reduced HDF5 file (flag --reduced) or a raw dataset (flag --raw). In the case of a raw dataset, the format will be guessed with the same rules as iris.open_raw().
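
For example, to open a raw dataset directly (the folder path below is hypothetical):

> python -m iris open --raw ~/raw_dataset_folder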

The first blank screen is shown below.

_images/startup.png

Loading raw data

The file menu can be used to load raw data. The available raw data formats depend on the installed plug-ins. To install a new plug-in, use the following option:

_images/load_plugin_option.png

You’ll be able to select a plug-in file, which will be copied to the plug-in directory. The plug-in can be used immediately: once it is installed, a new raw data format will appear.

_images/load_raw.png

Here is an example of loaded raw data; raw data controls are available on the right.

_images/raw_data.png

Data reduction

Once raw data is loaded, the following option becomes available:

_images/reduction_dialog.png

This opens the data reduction dialog.

_images/reduction_window.png

Parts of the data can be masked. To add a mask, use the controls on the top of the dialog. Masks can be moved and resized. Note that all images will be masked, so this is best for beam blocks, known hot pixels, etc.

_images/reduction_mask.png

A preview of the mask can be generated:

_images/mask_preview.png

Once you are satisfied with the processing parameters, the ‘Launch processing’ button will open a file dialog so that you can choose where to save the reduced HDF5 file. Processing might take a few minutes.

Data exploration

Once processing is complete, the resulting diffraction dataset will be loaded. New controls will be available.

_images/processed_view.png

The ‘Show/hide peak dynamics’ button can be toggled. Doing so allows for the exploration of the time-evolution of the data.

_images/peak_dynamics_single.png

When a diffraction dataset is loaded, new options become available.

_images/dataset_options.png

One of these options, ‘Compute angular averages’, is best suited for polycrystalline diffraction. It opens the following dialog:

_images/azimuthal_dialog_1.png

Drag and resize the red circle so it coincides with a diffraction ring. This will allow for the determination of the diffraction center. The averaging will happen after clicking ‘Promote’. This might take a few minutes.

Polycrystalline data exploration

After the azimuthal averages have been computed, a new section of the GUI will be made available, with additional controls.

_images/poly_view.png

The top screen shows the superposition of all radial profiles. Dragging the yellow lines allows for exploration of time-evolution on the bottom screen. Note that the trace colors on the top are associated with the time-points and colors of the bottom image.

_images/poly_view_2.png

The baseline can be removed using the controls on the right. You can play with the baseline parameters and compute a baseline many times without any problems.

_images/poly_view_3.png

Polycrystalline scattering vector calibration

On the above images, the scattering vector range might not be right. To calibrate the scattering vector range based on a known structure, select the ‘Calibrate scattering vector’ option from the ‘Dataset’ menu.

_images/calibrate_option.png

This opens the calibration dialog.

_images/calibration_dialog.png

You must either select a structure file (CIF) or one of the built-in structures. Once a structure is selected, its description will be printed on the screen. Make sure this is the crystal structure you expect.

Then, drag the left and right yellow bars on two diffraction peaks with known Miller indices. Click ‘Calibrate’ to calibrate the scattering vector range.

_images/calibration_dialog_2.png

Datasets in Iris

The DiffractionDataset object

The DiffractionDataset object is the basis for iris’s interaction with ultrafast electron diffraction data. DiffractionDataset objects are simply HDF5 files with a specific layout, and associated methods:

from iris import DiffractionDataset
import h5py

assert issubclass(DiffractionDataset, h5py.File)    # yep

You can take a look at h5py’s documentation to familiarize yourself with h5py.File.

You can also use other HDF5 bindings to inspect DiffractionDataset instances.
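
For example, the HDF5 layout of a reduced dataset can be inspected with plain h5py (the file name below is hypothetical):

import h5py

# Print the name of every group and dataset in the file
with h5py.File('reduced.hdf5', mode='r') as f:
    f.visit(print)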

Creating a DiffractionDataset

An easy way to create a DiffractionDataset is through the DiffractionDataset.from_collection() method, which saves diffraction patterns and metadata:

classmethod DiffractionDataset.from_collection(patterns, filename, time_points, metadata, valid_mask=None, dtype=None, ckwargs=None, callback=None, **kwargs)

Create a DiffractionDataset from a collection of diffraction patterns and metadata.

Parameters:
  • patterns (iterable of ndarray or ndarray) – Diffraction patterns. These should be in the same order as time_points. Note that the iterable can be a generator, in which case it will be consumed.
  • filename (str or path-like) – Path to the assembled DiffractionDataset.
  • time_points (array_like, shape (N,)) – Time-points of the diffraction patterns, in picoseconds.
  • metadata (dict) – Valid keys are contained in DiffractionDataset.valid_metadata.
  • valid_mask (ndarray or None, optional) – Boolean array that evaluates to True on valid pixels. This information is useful in cases where a beamblock is used.
  • dtype (dtype or None, optional) – Patterns will be cast to dtype. If None (default), dtype will be set to the same data-type as the first pattern in patterns.
  • ckwargs (dict, optional) – HDF5 compression keyword arguments. Refer to h5py’s documentation for details. Default is to use the lzf compression pipeline.
  • callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress update when patterns is a generator and involves large computations.
  • kwargs – Keywords are passed to the h5py.File constructor. Default is file-mode ‘x’, which raises an error if the file already exists. Default libver is ‘latest’.
Returns: dataset
Return type: DiffractionDataset

The required metadata that must be passed to DiffractionDataset.from_collection() is also listed in DiffractionDataset.valid_metadata. Metadata not listed in DiffractionDataset.valid_metadata will be ignored.
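
As a minimal sketch, a dataset could be assembled from synthetic patterns as follows (the file name and metadata values here are hypothetical):

import numpy as np
from iris import DiffractionDataset

time_points = [-1, 0, 1]  # picoseconds
# Generator of synthetic 256x256 diffraction patterns, one per time-point
patterns = (np.random.random(size=(256, 256)) for _ in time_points)

dset = DiffractionDataset.from_collection(
    patterns,
    filename='test.hdf5',
    time_points=time_points,
    metadata={'energy': 90, 'fluence': 10},  # keys from DiffractionDataset.valid_metadata
)
dset.close()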

Another possibility is to create a DiffractionDataset from an AbstractRawDataset subclass using the DiffractionDataset.from_raw() method:

classmethod DiffractionDataset.from_raw(raw, filename, exclude_scans=None, valid_mask=None, processes=1, callback=None, align=True, normalize=True, ckwargs=None, dtype=None, **kwargs)

Create a DiffractionDataset from a subclass of AbstractRawDataset.

Parameters:
  • raw (AbstractRawDataset instance) – Raw dataset instance.
  • filename (str or path-like) – Path to the assembled DiffractionDataset.
  • exclude_scans (iterable of ints or None, optional) – Scans to exclude from the processing. Default is to include all scans.
  • valid_mask (ndarray or None, optional) – Boolean array that evaluates to True on valid pixels. This information is useful in cases where a beamblock is used.
  • processes (int or None, optional) – Number of processes to spawn for processing. Default is 1.
  • callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress update.
  • align (bool, optional) – If True (default), raw images will be aligned on a per-scan basis.
  • normalize (bool, optional) – If True, images within a scan are normalized to the same integrated diffracted intensity.
  • ckwargs (dict or None, optional) – HDF5 compression keyword arguments. Refer to h5py’s documentation for details.
  • dtype (dtype or None, optional) – Patterns will be cast to dtype. If None (default), dtype will be set to the same data-type as the first pattern in patterns.
  • kwargs – Keywords are passed to the h5py.File constructor. Default is file-mode ‘x’, which raises an error if the file already exists.
Returns: dataset
Return type: DiffractionDataset

See also

open_raw()
open raw datasets by guessing the appropriate format based on available plug-ins.
Raises: IOError – If the filename is already associated with a file.
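
For example, from_raw() can be combined with open_raw() (described below) to reduce a raw dataset (the paths here are hypothetical):

from iris import DiffractionDataset, open_raw

with open_raw('path/to/raw/dataset') as raw:
    dset = DiffractionDataset.from_raw(raw, filename='reduced.hdf5', align=True)
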
Important Methods for the DiffractionDataset

The following three methods are the bread-and-butter of interacting with data. See the API section for a complete description.

DiffractionDataset.diff_data(timedelay, relative=False, out=None)

Returns diffraction data at a specific time-delay.

Parameters:
  • timedelay (float or None) – Time-delay [ps]. If None, the entire block is returned.
  • relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
  • out (ndarray or None, optional) – If an out ndarray is provided, h5py can avoid making intermediate copies.
Returns: arr – Time-delay data. If out is provided, arr is a view into out.
Return type: ndarray
Raises: ValueError – If timedelay does not exist.
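
A short usage sketch (the reduced dataset file name is hypothetical):

from iris import DiffractionDataset

with DiffractionDataset('reduced.hdf5', mode='r') as dset:
    pattern = dset.diff_data(timedelay=1.5)                  # pattern at +1.5 ps
    relative = dset.diff_data(timedelay=1.5, relative=True)  # relative to pre-excitation average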

DiffractionDataset.diff_eq()

Returns the averaged diffraction pattern for all times before photoexcitation. In case no data is available before photoexcitation, an array of zeros is returned. The result of this function is cached to minimize overhead.

Time-zero can be adjusted using the shift_time_zero method.

Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,)

DiffractionDataset.time_series(rect, relative=False, out=None)

Integrated intensity over time inside bounds.

Parameters:
  • rect (4-tuple of ints) – Bounds of the region in px. Bounds are specified as [row1, row2, col1, col2]
  • relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
  • out (ndarray or None, optional) – 1-D ndarray in which to store the results. The shape should be compatible with (len(time_points),)
Returns: out
Return type: ndarray, ndim 1

See also

time_series_selection()
intensity integration using arbitrary selections.

DiffractionDataset.time_series_selection(selection, relative=False, out=None)

Integrated intensity over time according to some arbitrary selection. This is a generalization of the DiffractionDataset.time_series method, which is much faster, but limited to rectangular selections.

New in version 5.2.1.

Parameters:
  • selection (skued.Selection or ndarray, dtype bool, shape (N,M)) – A selection mask that evaluates to True in the regions to integrate in each scattering pattern. The selection must have the same shape as a single scattering pattern (i.e. it must be two-dimensional). If selection is an array, an ArbitrarySelection will be used, and performance may be degraded.
  • relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
  • out (ndarray or None, optional) – 1-D ndarray in which to store the results. The shape should be compatible with (len(time_points),)
Returns: out
Return type: ndarray, ndim 1
Raises: ValueError – If the shape of mask does not match the scattering patterns.

See also

time_series()
integrated intensity in a rectangle.
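
The two time-series methods can be compared in a short sketch (the file name, region bounds, ring center, and the dataset's pixel resolution attribute are assumptions here):

import numpy as np
from iris import DiffractionDataset

with DiffractionDataset('reduced.hdf5', mode='r') as dset:
    # Rectangular region: rows 100-150, columns 200-250
    trace = dset.time_series([100, 150, 200, 250], relative=True)

    # Arbitrary selection as a boolean mask: a ring between radii of 150 px and 170 px
    rows, cols = np.indices(dset.resolution)
    radii = np.hypot(rows - 512, cols - 512)
    mask = (radii >= 150) & (radii <= 170)
    ring_trace = dset.time_series_selection(mask, relative=True)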

The PowderDiffractionDataset object

For polycrystalline data, we can define more data structures and methods. A PowderDiffractionDataset is a strict subclass of a DiffractionDataset, and hence all methods previously described are also available.

Specializing a DiffractionDataset object into a PowderDiffractionDataset is done as follows:

from iris import PowderDiffractionDataset

dataset_path = 'C:\\path_to_dataset.hdf5'  # DiffractionDataset already exists
center = (1024, 1024)                      # diffraction center, in px (example value)

with PowderDiffractionDataset.from_dataset(dataset_path, center) as dset:
    ...  # Do computation

Important Methods for the PowderDiffractionDataset

The following methods are specific to polycrystalline diffraction data. See the API section for a complete description.

PowderDiffractionDataset.powder_eq(bgr=False)

Returns the average powder diffraction pattern for all times before photoexcitation. In case no data is available before photoexcitation, an array of zeros is returned.

Parameters: bgr (bool) – If True, background is removed.
Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,)

PowderDiffractionDataset.powder_data(timedelay, bgr=False, relative=False, out=None)

Returns the angular average data from scan-averaged diffraction patterns.

Parameters:
  • timedelay (float or None) – Time-delay [ps]. If None, the entire block is returned.
  • bgr (bool, optional) – If True, background is removed.
  • relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
  • out (ndarray or None, optional) – If an out ndarray is provided, h5py can avoid making intermediate copies.
Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,) or (N,M)

PowderDiffractionDataset.powder_calq(crystal, peak_indices, miller_indices)

Determine the scattering vector q corresponding to a polycrystalline diffraction pattern and a known crystal structure.

For best results, multiple peaks (and corresponding Miller indices) should be provided; the absolute minimum is two.

Parameters:
  • crystal (skued.Crystal instance) – Crystal that gave rise to the diffraction data.
  • peak_indices (n-tuple of ints) – Array index location of diffraction peaks. For best results, peaks should be well-separated. More than two peaks can be used.
  • miller_indices (iterable of 3-tuples) – Indices associated with the peaks of peak_indices. More than two peaks can be used. E.g. indices = [(2,2,0), (-3,0,2)]
Raises:
  • ValueError : if the number of peak indices does not match the number of Miller indices.
  • ValueError : if the number of peaks given is lower than two.
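
A sketch of a calibration call (the file name, structure name, and peak locations are hypothetical):

import skued
from iris import PowderDiffractionDataset

with PowderDiffractionDataset('reduced.hdf5', mode='r+') as dset:
    crystal = skued.Crystal.from_database('vo2-m1')  # hypothetical database entry
    dset.powder_calq(
        crystal=crystal,
        peak_indices=(1024, 2893),            # array indices of two well-separated peaks
        miller_indices=[(0, 1, 1), (2, 2, 0)],
    )
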
PowderDiffractionDataset.compute_baseline(first_stage, wavelet, max_iter=50, level=None, **kwargs)

Compute and save the baseline, calculated using the dual-tree complex wavelet transform. All keyword arguments are passed to scikit-ued’s baseline_dt function.

Parameters:
  • first_stage (str) – Wavelet to use for the first stage. See skued.available_first_stage_filters() for a list of suitable arguments.
  • wavelet (str) – Wavelet to use in stages > 1. Must be appropriate for the dual-tree complex wavelet transform. See skued.available_dt_filters() for possible values.
  • max_iter (int, optional) – Maximum number of iterations.
  • level (int or None, optional) – If None (default), the maximum level is used.
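
A possible baseline computation, followed by background-subtracted data access (the file name is hypothetical, and the wavelet names assume those filters are available in your version of scikit-ued):

from iris import PowderDiffractionDataset

with PowderDiffractionDataset('reduced.hdf5', mode='r+') as dset:
    dset.compute_baseline(first_stage='sym6', wavelet='qshift3', max_iter=100)
    profile = dset.powder_data(timedelay=2.0, bgr=True)  # background-removed profile at +2 ps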

HDF5 layout

DiffractionDataset instances (and by extension, PowderDiffractionDataset instances) are a specialization of HDF5 files. Therefore, it is possible to inspect and manipulate instances with any other tool that has bindings to the HDF5 libraries. The HDF5 layout is presented below.

_images/datastructure.png

Dataset Plug-ins

To use your own raw data with iris, plug-in functionality is available.

Plug-ins are Python modules that implement a subclass of AbstractRawDataset, and should be placed in ~/iris_plugins (C:\Users\UserName\iris_plugins on Windows). Subclasses of AbstractRawDataset are automatically detected by iris and can be used via the GUI.

Installed plug-ins can be imported from iris.plugins:

from iris.plugins import DatasetSubclass

which would work if DatasetSubclass is defined in a file ~/iris_plugins/<anything>.py. Example plug-ins are available here. Plug-ins used by members of the Siwick research group are visible here.

Installing a plug-in

To install a plug-in that you have written in a file named ~/myplugin.py:

import iris
iris.install_plugin('~/myplugin.py')

Installing a plug-in as above makes it immediately available.

install_plugin(path) Install and load an iris plug-in.

Subclassing AbstractRawDataset

To take advantage of iris’s DiffractionDataset and PowderDiffractionDataset, an appropriate subclass of AbstractRawDataset must be implemented. This subclass can then be fed to DiffractionDataset.from_raw() to produce a DiffractionDataset.

How to assemble an AbstractRawDataset subclass

Ultrafast electron diffraction experiments typically comprise multiple scans, each of which is a sweep over time-delays. You can think of one scan as a single experiment, so that each dataset is composed of multiple, equivalent experiments.

To subclass AbstractRawDataset, the method AbstractRawDataset.raw_data() must minimally be implemented. It must follow this specification:

AbstractRawDataset.raw_data(timedelay, scan=1, **kwargs)

Returns the image at a given time-delay and scan, as an array.

Parameters:
  • timedelay (float) – Acquisition time-delay.
  • scan (int, optional) – Scan number. Default is 1.
  • kwargs – Keyword-arguments are ignored.
Returns: arr
Return type: ~numpy.ndarray, ndim 2

Raises:
  • ValueError : if timedelay or scan are invalid / out of bounds.
  • IOError : Filename is not associated with an image/does not exist.
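
Here is a minimal plug-in sketch, assuming a hypothetical on-disk convention where each image is stored as a NumPy file named scan_{scan}_time_{timedelay}.npy:

# ~/iris_plugins/my_format.py
from pathlib import Path

import numpy as np
from iris import AbstractRawDataset

class MyRawDataset(AbstractRawDataset):
    """Hypothetical raw dataset stored as one NumPy file per (scan, time-delay) pair."""

    display_name = 'My raw format'

    def __init__(self, source, metadata=None):
        super().__init__(source, metadata)
        # Hypothetical: these would normally be determined from the files on disk
        self.time_points = (-1.0, 0.0, 1.0)
        self.scans = (1, 2, 3)

    def raw_data(self, timedelay, scan=1, **kwargs):
        path = Path(self.source) / f'scan_{scan}_time_{timedelay}.npy'
        if not path.exists():
            raise IOError(f'No image for time-delay {timedelay} and scan {scan}')
        return np.load(path)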

For better performance, or to tailor data reduction to your data acquisition scheme, the following method can also be overridden:

AbstractRawDataset.reduced(exclude_scans=None, align=True, normalize=True, mask=None, processes=1, dtype=<class 'float'>)

Generator of reduced dataset. The reduced diffraction patterns are generated in order of time-delay.

This particular implementation normalizes diffracted intensity of pictures acquired at the same time-delay while rejecting masked pixels.

Parameters:
  • exclude_scans (iterable or None, optional) – These scans will be skipped when reducing the dataset.
  • align (bool, optional) – If True (default), raw diffraction patterns will be aligned using the masked normalized cross-correlation approach. See skued.align for more information.
  • normalize (bool, optional) – If True (default), equivalent diffraction pictures (e.g. same time-delay, different scans) are normalized to the same diffracted intensity.
  • mask (array-like of bool or None, optional) – If not None, pixels where mask = True are ignored for certain operations (e.g. alignment).
  • processes (int or None, optional) – Number of processes to spawn for processing.
  • dtype (numpy.dtype or None, optional) – Reduced patterns will be cast to dtype.
Yields: pattern (~numpy.ndarray, ndim 2)

AbstractRawDataset metadata

AbstractRawDataset subclasses automatically include the following metadata:

  • date (str): Acquisition date. Date format is up to you.
  • energy (float): Electron energy in keV.
  • pump_wavelength (int): photoexcitation wavelength in nanometers.
  • fluence (float): photoexcitation fluence in mJ/cm².
  • time_zero_shift (float): Time-zero shift in picoseconds.
  • temperature (float): sample temperature in Kelvins.
  • exposure (float): picture exposure in seconds.
  • resolution (2-tuple): pixel resolution of pictures.
  • time_points (tuple): time-points in picoseconds.
  • scans (tuple): experimental scans.
  • camera_length (float): sample-to-camera distance in meters.
  • pixel_width (float): pixel width in meters.
  • notes (str): notes.

Subclasses can add more metadata or override the current metadata with new defaults.
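
For example, a subclass could override a default at initialization time (a sketch, following the hypothetical MyRawDataset class above):

class CryoRawDataset(MyRawDataset):
    def __init__(self, source, metadata=None):
        super().__init__(source, metadata)
        self.temperature = 4.2  # kelvins; overrides the default sample temperature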

All proper subclasses of AbstractRawDataset are automatically added to the possible raw dataset formats that can be loaded from the GUI.

Reference/API

Opening raw datasets

To open any raw dataset, take a look at the open_raw() function.

iris.open_raw(path)

Open a raw data item, guessing the AbstractRawDataset instance that should be used based on available plug-ins.

This function can also be used as a context manager:

with open_raw('.') as dset:
    ...

Parameters: path (path-like) – Path to the file/folder containing the raw data.
Returns: raw – The raw dataset.
Return type: AbstractRawDataset instance
Raises: RuntimeError – If the data format could not be guessed.

Raw Dataset Classes

AbstractRawDataset([source, metadata]) Abstract base class for ultrafast electron diffraction data sets.

Diffraction Dataset Classes

DiffractionDataset(name[, mode, driver, …]) Abstraction of an HDF5 file to represent diffraction datasets.
PowderDiffractionDataset(*args, **kwargs) Abstraction of HDF5 files for powder diffraction datasets.

What’s new

5.2.1

  • Added the DiffractionDataset.time_series_selection method, which creates time-series integrated over an arbitrary momentum-space selection mask. This makes it possible to compute time-series from non-rectangular regions, at the expense of performance.
  • Added a few methods to create selection masks: DiffractionDataset.selection_rect, DiffractionDataset.selection_disk, and DiffractionDataset.selection_ring.
  • Added the ability to show/hide the dataset control bar.
  • Added the ability to export time-series data in CSV format.
  • Fixed an issue where calculations of time-series relative to pre-time-zero would raise an error.
  • Symmetrization dialog is no longer in “beta”.

5.2.0

  • Official support for Linux.
  • Plug-ins installed via the GUI can now be used right away. No restarts required.
  • Added the iris.plugins.load_plugin function to load plug-ins without installing them. Useful for testing.
  • Plug-ins can now have the display_name property which will be displayed in the GUI. This is optional and backwards-compatible.
  • Siwick Research Group-specific plugins were removed. They can be found here: https://github.com/Siwick-Research-Group/iris-ued-plugins
  • Switched to Azure Pipelines for continuous integration builds;
  • Added cursor information (position and image value) for processed data view;
  • Fixed an issue where very large relative differences in datasets would crash the GUI displays;
  • Fixed an issue where time-series fit would not display properly in fractional change mode;

5.1.3

  • Added logging support for the GUI component. Logs can be reached via the help menu.
  • Added an update check. You can see whether an update is available via the help menu, as well as via the status bar.
  • Added the ability to view time-series dynamics in absolute units AND relative change.
  • Pinned dependency to scikit-ued, to prevent upgrade to scikit-ued 2.0 unless appropriate.
  • Pinned dependency to npstreams, to prevent upgrade to npstreams 2.0 unless appropriate.

5.1.2

  • Fixed an issue where the QDarkStyle internal imports were absolute.

5.1.1

  • Fixed an issue where data reduction would freeze when using more than one CPU;
  • Removed the auto-update mechanism. Update checks will run in the background only;
  • Fixed an issue where the in-progress indicator would freeze;
  • Moved tests outside of source repository;
  • Updated GUI stylesheet to QDarkStyle 2.6.6;

5.1.0

  • Added explicit support for Python 3.7;
  • Usability tweaks, for example more visible mask controls;
  • Added the ability to create standalone executables via PyInstaller;
  • Added the ability to create Windows installers;

5.0.5.1

  • Fixed an issue where, due to the newly-enforced image orientation, on-screen objects (e.g. the diffraction center finder) were not properly registered.

5.0.5

  • Added the ability to fit exponentials to time-series;
  • Added region-of-interest text bounds for easier time-series exploration;
  • Enforced PyQtGraph to use row-major image orientation;
  • Datasets are now opened in read-only mode unless absolutely necessary. This should make it safer to handle multiple instances of iris at the same time.

5.0.4

  • Better plug-in handling and command-line interface.

5.0.3

The major change in this version is the ability to guess raw dataset formats using the iris.open_raw function. This makes it possible to start the GUI and open a dataset at the same time.

5.0.2

The package now only has dependencies that can be installed through conda.

5.0.1

This is a minor bug-fix release that also includes user interface niceties (e.g. link to online documentation) and user experience niceties (e.g. confirmation message if you forget pixel masks).

5.0.0

This new version includes a completely rewritten library and GUI front-end. Earlier datasets will need to be re-processed. New features:

  • Faster performance thanks to better data layout in HDF5;
  • Plug-in architecture for various raw data formats;
  • Faster performance thanks to npstreams package;
  • Easier to extend GUI skeleton;
  • Online documentation accessible from the GUI;
  • Continuous integration.

Authors

  • Laurent P. René de Cotret (McGill)