iris: Ultrafast electron diffraction data exploration¶
iris is both a library for interacting with ultrafast electron diffraction data and a GUI frontend for interactively exploring this data.
The code presented herein has been in use at some point by the Siwick research group.

General Documentation¶
Installation¶
Standalone Installation¶
Starting with iris 5.1.0, standalone Windows installers and executables are available. You can find them on the GitHub release page.
The standalone installers and executables make the installation of iris completely separate from any other Python installation. This method should be preferred, unless Python scripting using the iris library is required.
Installing the Python Package¶
If you want to script using iris data structures and algorithms, you need to install the iris-ued package.
Note
Users are strongly recommended to manage dependencies with the Intel Distribution for Python, which provides easy access to the scientific Python stack.
iris is available on PyPI as iris-ued:
python -m pip install iris-ued
iris is also available on the conda-forge channel:
conda config --add channels conda-forge
conda install iris-ued
You can install the latest developer version of iris by cloning the git repository:
git clone https://github.com/LaurentRDC/iris-ued.git
…then installing the package with:
cd iris-ued
python setup.py install
In Python code, iris can be imported as follows:
import iris
Test data¶
Reduced test datasets are made available by the Siwick research group. The data can be accessed on the public data repository.
Testing¶
If you want to check that all the tests are running correctly with your Python configuration, type:
python setup.py test
Using iris: typical workflow¶
Before you start¶
You might want to download test datasets before you start. Reduced test datasets are made available by the Siwick research group; the data can be accessed on the public data repository.
Startup¶
To start the GUI from the command line:
> python -m iris
Note that the command-line interface has some useful options:
> python -m iris --help
usage: iris [-h] [-v] {open,docs} ...
Iris is both a library for interacting with ultrafast electron diffraction
data, as well as a GUI frontend for interactively exploring this data. Below
are some helpful commands.
optional arguments:
-h, --help show this help message and exit
-v, --version show program's version number and exit
Subcommands:
{open,docs} Available sub-commands
open Dataset to open with iris start-up.
docs Open online documentation in your default web browser.
Running this command without any parameters will launch the graphical user
interface. Documentation is available here: https://iris-ued.readthedocs.io/
Most importantly, you can programmatically start the GUI and open a dataset at the same time:
> python -m iris open --reduced ~/dataset.hdf5
The path can point to a reduced HDF5 file (flag --reduced) or a raw dataset (flag --raw). In the case of a raw dataset, the dataset format will be guessed with the same rules as iris.open_raw().
The first blank screen is shown below.

Loading raw data¶
The file menu can be used to load raw data. The available raw formats depend on the installed plug-ins. To install a new plug-in, use the following option:

You’ll be able to select a plug-in file, which will be copied to the plug-in directory. The plug-in can be used immediately. Once a plug-in is installed, a new raw data format will appear.

Here is an example of loaded raw data; raw data controls are available on the right.

Data reduction¶
Once raw data is loaded, the following option becomes available:

This opens the data reduction dialog.

Parts of the data can be masked. To add a mask, use the controls on the top of the dialog. Masks can be moved and resized. Note that all images will be masked, so this is best for beam blocks, known hot pixels, etc.

A preview of the mask can be generated:

Once you are satisfied with the processing parameters, the ‘Launch processing’ button will open a file dialog so that you can choose where to save the reduced HDF5 file. Processing might take a few minutes.
Data exploration¶
Once processing is complete, the resulting diffraction dataset will be loaded. New controls will be available.

The ‘Show/hide peak dynamics’ button can be toggled. Doing so allows for the exploration of the time-evolution of the data.

When a diffraction dataset is loaded, new options become available.

One of these options, ‘Compute angular averages’, is best suited for polycrystalline diffraction. It opens the following dialog:

Drag and resize the red circle so that it coincides with a diffraction ring; this allows the diffraction center to be determined. The averaging starts after clicking ‘Promote’ and might take a few minutes.
Polycrystalline data exploration¶
After the azimuthal averages have been computed, a new section of the GUI will be made available, with additional controls.

The top screen shows the superposition of all radial profiles. Dragging the yellow lines allows for exploration of time-evolution on the bottom screen. Note that the trace colors on the top are associated with the time-points and colors of the bottom image.

The baseline can be removed using the controls on the right. You can experiment with the baseline parameters and recompute the baseline as many times as needed.

Polycrystalline scattering vector calibration¶
On the above images, the scattering vector range may be incorrect. To calibrate the scattering vector range based on a known structure, select the ‘Calibrate scattering vector’ option from the ‘Dataset’ menu.

This opens the calibration dialog.

You must either select a structure file (CIF) or one of the built-in structures. Once a structure is selected, its description will be printed on the screen. Make sure this is the crystal structure you expect.
Then, drag the left and right yellow bars onto two diffraction peaks with known Miller indices. Click ‘Calibrate’ to calibrate the scattering vector range.

Datasets in Iris¶
The DiffractionDataset object¶
The DiffractionDataset object is the basis for iris’s interaction with ultrafast electron diffraction data. DiffractionDataset objects are simply HDF5 files with a specific layout, and associated methods:
from iris import DiffractionDataset
import h5py
assert issubclass(DiffractionDataset, h5py.File) # yep
You can take a look at h5py’s documentation to familiarize yourself with h5py.File. You can also use other HDF5 bindings to inspect DiffractionDataset instances.
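For instance, here is a minimal sketch of such an inspection with plain h5py; the file name is a placeholder:
import h5py

# Sketch: walk a reduced dataset with plain h5py ('reduced.hdf5' is a placeholder path)
with h5py.File('reduced.hdf5', mode='r') as f:
    f.visit(print)  # print the name of every group and dataset in the file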
Creating a DiffractionDataset¶
An easy way to create a DiffractionDataset is through the DiffractionDataset.from_collection() method, which saves diffraction patterns and metadata:
classmethod DiffractionDataset.from_collection(patterns, filename, time_points, metadata, valid_mask=None, dtype=None, ckwargs=None, callback=None, **kwargs)
Create a DiffractionDataset from a collection of diffraction patterns and metadata.
Parameters:
- patterns (iterable of ndarray or ndarray) – Diffraction patterns. These should be in the same order as time_points. Note that the iterable can be a generator, in which case it will be consumed.
- filename (str or path-like) – Path to the assembled DiffractionDataset.
- time_points (array_like, shape (N,)) – Time-points of the diffraction patterns, in picoseconds.
- metadata (dict) – Valid keys are contained in DiffractionDataset.valid_metadata.
- valid_mask (ndarray or None, optional) – Boolean array that evaluates to True on valid pixels. This information is useful in cases where a beamblock is used.
- dtype (dtype or None, optional) – Patterns will be cast to dtype. If None (default), dtype will be set to the same data-type as the first pattern in patterns.
- ckwargs (dict, optional) – HDF5 compression keyword arguments. Refer to h5py’s documentation for details. Default is to use the lzf compression pipeline.
- callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress update when patterns is a generator and involves large computations.
- kwargs – Keywords are passed to the h5py.File constructor. Default is file-mode ‘x’, which raises an error if the file already exists. Default libver is ‘latest’.
Returns: dataset
Return type: DiffractionDataset
The required metadata that must be passed to DiffractionDataset.from_collection() is also listed in DiffractionDataset.valid_metadata. Metadata not listed in DiffractionDataset.valid_metadata will be ignored.
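As an illustration, here is a minimal sketch of assembling a dataset from in-memory patterns; the file name, time-points, and metadata values are placeholders:
import numpy as np
from iris import DiffractionDataset

time_points = [-1.0, 0.0, 1.0]  # picoseconds
# A generator of 2-D patterns is acceptable; it will be consumed once
patterns = (np.random.random((256, 256)) for _ in time_points)

dset = DiffractionDataset.from_collection(
    patterns,
    filename='assembled.hdf5',
    time_points=time_points,
    metadata={'energy': 90, 'fluence': 10.0},  # keys must appear in valid_metadata
)
dset.close()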
Another possibility is to create a DiffractionDataset from an AbstractRawDataset subclass using the DiffractionDataset.from_raw() method:
classmethod DiffractionDataset.from_raw(raw, filename, exclude_scans=None, valid_mask=None, processes=1, callback=None, align=True, normalize=True, ckwargs=None, dtype=None, **kwargs)
Create a DiffractionDataset from a subclass of AbstractRawDataset.
Parameters:
- raw (AbstractRawDataset instance) – Raw dataset instance.
- filename (str or path-like) – Path to the assembled DiffractionDataset.
- exclude_scans (iterable of ints or None, optional) – Scans to exclude from the processing. Default is to include all scans.
- valid_mask (ndarray or None, optional) – Boolean array that evaluates to True on valid pixels. This information is useful in cases where a beamblock is used.
- processes (int or None, optional) – Number of processes to spawn for processing. Default is the number of available CPU cores.
- callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress updates.
- align (bool, optional) – If True (default), raw images will be aligned on a per-scan basis.
- normalize (bool, optional) – If True, images within a scan are normalized to the same integrated diffracted intensity.
- ckwargs (dict or None, optional) – HDF5 compression keyword arguments. Refer to h5py’s documentation for details.
- dtype (dtype or None, optional) – Patterns will be cast to dtype. If None (default), dtype will be set to the same data-type as the first pattern in patterns.
- kwargs – Keywords are passed to the h5py.File constructor. Default is file-mode ‘x’, which raises an error if the file already exists.
Returns: dataset
Return type: DiffractionDataset
Raises: IOError – if the filename is already associated with a file.
See also: open_raw() – open raw datasets by guessing the appropriate format based on available plug-ins.
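For example, here is a sketch of reducing a raw dataset whose format is guessed by open_raw(); both paths are placeholders:
from iris import DiffractionDataset, open_raw

with open_raw('raw_folder') as raw:
    dset = DiffractionDataset.from_raw(raw, filename='reduced.hdf5', align=True, normalize=True)
    dset.close()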
Important Methods for the DiffractionDataset¶
The following methods are the bread-and-butter of interacting with data. See the API section for a complete description.
DiffractionDataset.diff_data(timedelay, relative=False, out=None)
Returns diffraction data at a specific time-delay.
Parameters:
- timedelay (float or None) – Time-delay [ps]. If None, the entire block is returned.
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – If an out ndarray is provided, h5py can avoid making intermediate copies.
Returns: arr – Time-delay data. If out is provided, arr is a view into out.
Return type: ndarray
Raises: ValueError – if timedelay does not exist.
DiffractionDataset.diff_eq()
Returns the averaged diffraction pattern for all times before photoexcitation. In case no data is available before photoexcitation, an array of zeros is returned. The result of this function is cached to minimize overhead.
Time-zero can be adjusted using the shift_time_zero method.
Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,)
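To make the usage concrete, here is a minimal sketch that retrieves a single pattern and the pre-photoexcitation average; the file name and time-delay are placeholders:
from iris import DiffractionDataset

with DiffractionDataset('reduced.hdf5', mode='r') as dset:
    before = dset.diff_eq()                  # average pattern before photoexcitation
    pattern = dset.diff_data(timedelay=1.0)  # pattern at +1 ps
    change = dset.diff_data(timedelay=1.0, relative=True)  # relative to diff_eq()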
DiffractionDataset.time_series(rect, relative=False, out=None)
Integrated intensity over time inside bounds.
Parameters:
- rect (4-tuple of ints) – Bounds of the region in px. Bounds are specified as [row1, row2, col1, col2].
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – 1-D ndarray in which to store the results. The shape should be compatible with (len(time_points),).
Returns: out
Return type: ndarray, ndim 1
See also: time_series_selection() – intensity integration using arbitrary selections.
DiffractionDataset.time_series_selection(selection, relative=False, out=None)
Integrated intensity over time according to some arbitrary selection. This is a generalization of the DiffractionDataset.time_series method, which is much faster, but limited to rectangular selections.
New in version 5.2.1.
Parameters:
- selection (skued.Selection or ndarray, dtype bool, shape (N,M)) – A selection mask that dictates the regions to integrate in each scattering pattern. If selection is an array, an ArbitrarySelection will be used, and performance may be degraded. The mask evaluates to True in the regions to integrate, and must have the same shape as one scattering pattern (i.e. two-dimensional).
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – 1-D ndarray in which to store the results. The shape should be compatible with (len(time_points),).
Returns: out
Return type: ndarray, ndim 1
Raises: ValueError – if the shape of mask does not match the scattering patterns.
See also: time_series() – integrated intensity in a rectangle.
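The sketch below contrasts the two methods; the file name and pixel bounds are placeholders:
import numpy as np
from iris import DiffractionDataset

with DiffractionDataset('reduced.hdf5', mode='r') as dset:
    # Integrated intensity in the rectangle rows 100-200, columns 150-250
    trace = dset.time_series((100, 200, 150, 250), relative=True)

    # The same region expressed as a boolean mask, via the slower generalization
    mask = np.zeros_like(dset.diff_eq(), dtype=bool)
    mask[100:200, 150:250] = True
    trace2 = dset.time_series_selection(mask, relative=True)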
The PowderDiffractionDataset object¶
For polycrystalline data, we can define more data structures and methods. A PowderDiffractionDataset is a strict subclass of a DiffractionDataset, and hence all methods previously described are also available. Specializing a DiffractionDataset object into a PowderDiffractionDataset is done as follows:
from iris import PowderDiffractionDataset

dataset_path = 'C:\\path_to_dataset.hdf5'  # this DiffractionDataset already exists
center = (1024, 1024)  # diffraction center (row, col); placeholder value

with PowderDiffractionDataset.from_dataset(dataset_path, center) as dset:
    pass  # do computation here
Important Methods for the PowderDiffractionDataset¶
The following methods are specific to polycrystalline diffraction data. See the API section for a complete description.
PowderDiffractionDataset.powder_eq(bgr=False)
Returns the average powder diffraction pattern for all times before photoexcitation. In case no data is available before photoexcitation, an array of zeros is returned.
Parameters: bgr (bool) – If True, background is removed.
Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,)
PowderDiffractionDataset.powder_data(timedelay, bgr=False, relative=False, out=None)
Returns the angular average data from scan-averaged diffraction patterns.
Parameters:
- timedelay (float or None) – Time-delay [ps]. If None, the entire block is returned.
- bgr (bool, optional) – If True, background is removed.
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – If an out ndarray is provided, h5py can avoid making intermediate copies.
Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,) or (N,M)
PowderDiffractionDataset.powder_calq(crystal, peak_indices, miller_indices)
Determine the scattering vector q corresponding to a polycrystalline diffraction pattern and a known crystal structure.
For best results, multiple peaks (and corresponding Miller indices) should be provided; the absolute minimum is two.
Parameters:
- crystal (skued.Crystal instance) – Crystal that gave rise to the diffraction data.
- peak_indices (n-tuple of ints) – Array index locations of diffraction peaks. For best results, peaks should be well-separated. More than two peaks can be used.
- miller_indices (iterable of 3-tuples) – Indices associated with the peaks of peak_indices. More than two peaks can be used. E.g. indices = [(2,2,0), (-3,0,2)]
Raises:
- ValueError – if the number of peak indices does not match the number of Miller indices.
- ValueError – if the number of peaks given is lower than two.
PowderDiffractionDataset.compute_baseline(first_stage, wavelet, max_iter=50, level=None, **kwargs)
Compute and save the baseline based on the dual-tree complex wavelet transform. All keyword arguments are passed to scikit-ued’s baseline_dt function.
Parameters:
- first_stage (str, optional) – Wavelet to use for the first stage. See skued.available_first_stage_filters() for a list of suitable arguments.
- wavelet (str, optional) – Wavelet to use in stages > 1. Must be appropriate for the dual-tree complex wavelet transform. See skued.available_dt_filters() for possible values.
- max_iter (int, optional) – Maximum number of iterations.
- level (int or None, optional) – If None (default), the maximum level is used.
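Here is a sketch that ties these methods together; the file name and wavelet names are placeholder choices:
from iris import PowderDiffractionDataset

with PowderDiffractionDataset('powder.hdf5', mode='r+') as dset:
    # Compute and store a baseline; 'sym6' and 'qshift3' are example filter names,
    # see skued.available_first_stage_filters() and skued.available_dt_filters()
    dset.compute_baseline(first_stage='sym6', wavelet='qshift3', max_iter=100)

    profile = dset.powder_data(timedelay=0.0, bgr=True)  # background-removed profile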
HDF5 layout¶
DiffractionDataset instances (and by extension, PowderDiffractionDataset instances) are a specialization of HDF5 files. Therefore, it is possible to inspect and manipulate instances with any other tool that has bindings to the HDF5 libraries. The HDF5 layout is presented below.

Dataset Plug-ins¶
To use your own raw data with iris, a plug-in functionality is made available. Plug-ins are Python modules that implement a subclass of AbstractRawDataset, and should be placed in ~/iris_plugins (C:\Users\UserName\iris_plugins on Windows). Subclasses of AbstractRawDataset are automatically detected by iris and can be used via the GUI.
Installed plug-ins can be imported from iris.plugins:
from iris.plugins import DatasetSubclass
which works if DatasetSubclass is defined in a file ~/iris_plugins/<anything>.py. Example plug-ins are available here. Plug-ins used by members of the Siwick research group are visible here.
Installing a plug-in¶
To install a plug-in that you have written in a file named ~/myplugin.py:
import iris
iris.install_plugin('~/myplugin.py')
Installing a plug-in as above makes it immediately available.
install_plugin(path) – Install and load an iris plug-in.
Subclassing AbstractRawDataset¶
To take advantage of iris’s DiffractionDataset and PowderDiffractionDataset, an appropriate subclass of AbstractRawDataset must be implemented. This subclass can then be fed to DiffractionDataset.from_raw() to produce a DiffractionDataset.
How to assemble an AbstractRawDataset subclass¶
Ultrafast electron diffraction experiments typically have multiple scans. Each scan consists of a time-delay sweep. You can think of it as one scan being an experiment, and so each dataset is composed of multiple, equivalent experiments.
To subclass AbstractRawDataset, the method AbstractRawDataset.raw_data() must minimally be implemented. It must follow this specification:
AbstractRawDataset.raw_data(timedelay, scan=1, **kwargs)
Returns an array of the image at a time-delay and scan.
Parameters:
- timedelay (float) – Acquisition time-delay.
- scan (int, optional) – Scan number. Default is 1.
Returns: arr
Return type: ~numpy.ndarray, ndim 2
Raises:
- ValueError – if timedelay or scan are invalid / out of bounds.
- IOError – if the filename is not associated with an image / does not exist.
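Below is a minimal sketch of a subclass, assuming AbstractRawDataset can be imported from the top-level iris package; the on-disk layout (one NumPy file per time-delay and scan) is entirely hypothetical:
from pathlib import Path

import numpy as np
from iris import AbstractRawDataset

class MyRawDataset(AbstractRawDataset):
    """Sketch of a plug-in for a hypothetical on-disk layout."""

    def __init__(self, source, metadata=None):
        super().__init__(source, metadata)
        # Override metadata defaults to describe this particular dataset
        self.time_points = (-1.0, 0.0, 1.0)  # picoseconds
        self.scans = (1, 2, 3)

    def raw_data(self, timedelay, scan=1, **kwargs):
        # Hypothetical naming scheme: scan_<scan>/pattern_<timedelay>.npy
        fname = Path(self.source) / f'scan_{scan}' / f'pattern_{timedelay}.npy'
        if not fname.exists():
            raise IOError(f'{fname} does not exist')
        return np.load(fname)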
For better performance, or to tailor data reduction to your data acquisition scheme, the following method can also be overridden:
AbstractRawDataset.reduced(exclude_scans=None, align=True, normalize=True, mask=None, processes=1, dtype=<class 'float'>)
Generator of the reduced dataset. The reduced diffraction patterns are generated in order of time-delay.
This particular implementation normalizes the diffracted intensity of pictures acquired at the same time-delay while rejecting masked pixels.
Parameters:
- exclude_scans (iterable or None, optional) – These scans will be skipped when reducing the dataset.
- align (bool, optional) – If True (default), raw diffraction patterns will be aligned using the masked normalized cross-correlation approach. See skued.align for more information.
- normalize (bool, optional) – If True (default), equivalent diffraction pictures (e.g. same time-delay, different scans) are normalized to the same diffracted intensity.
- mask (array-like of bool or None, optional) – If not None, pixels where mask = True are ignored for certain operations (e.g. alignment).
- processes (int or None, optional) – Number of processes to spawn for processing.
- dtype (numpy.dtype or None, optional) – Reduced patterns will be cast to dtype.
Yields: pattern (~numpy.ndarray, ndim 2)
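For instance, the reduction generator can be consumed directly, without writing an HDF5 file; MyRawDataset is the hypothetical subclass sketched earlier:
raw = MyRawDataset('raw_folder')  # hypothetical subclass and placeholder path
for pattern in raw.reduced(align=True, normalize=True, processes=1):
    print(pattern.shape)  # one 2-D pattern per time-delay, in time order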
AbstractRawDataset metadata¶
AbstractRawDataset subclasses automatically include the following metadata:
- date (str): acquisition date. The date format is up to you.
- energy (float): electron energy in keV.
- pump_wavelength (int): photoexcitation wavelength in nanometers.
- fluence (float): photoexcitation fluence in \(\text{mJ}/\text{cm}^2\).
- time_zero_shift (float): time-zero shift in picoseconds.
- temperature (float): sample temperature in Kelvins.
- exposure (float): picture exposure in seconds.
- resolution (2-tuple): pixel resolution of pictures.
- time_points (tuple): time-points in picoseconds.
- scans (tuple): experimental scans.
- camera_length (float): sample-to-camera distance in meters.
- pixel_width (float): pixel width in meters.
- notes (str): notes.
Subclasses can add more metadata or override the current metadata with new defaults.
All proper subclasses of AbstractRawDataset are automatically added to the possible raw dataset formats that can be loaded from the GUI.
Reference/API¶
Opening raw datasets¶
To open any raw dataset, take a look at the open_raw() function.
iris.open_raw(path)
Open a raw data item, guessing the AbstractRawDataset subclass that should be used based on available plug-ins.
This function can also be used as a context manager:
with open_raw('.') as dset:
    ...
Parameters: path (path-like) – Path to the file/folder containing the raw data.
Returns: raw – The raw dataset.
Return type: AbstractRawDataset instance
Raises: RuntimeError – if the data format could not be guessed.
Raw Dataset Classes¶
AbstractRawDataset([source, metadata]) – Abstract base class for ultrafast electron diffraction data sets.
Diffraction Dataset Classes¶
DiffractionDataset(name[, mode, driver, …]) – Abstraction of an HDF5 file to represent diffraction datasets.
PowderDiffractionDataset(*args, **kwargs) – Abstraction of HDF5 files for powder diffraction datasets.
What’s new¶
5.2.1¶
- Added the DiffractionDataset.time_series_selection method, which creates time-series integrated over an arbitrary momentum-space selection mask. This makes it possible to compute time-series from non-rectangular regions, at the expense of performance.
- Added a few methods to create selection masks: DiffractionDataset.selection_rect, DiffractionDataset.selection_disk, and DiffractionDataset.selection_ring.
- Added the ability to show/hide dataset control bar;
- Added the ability to export time-series data in CSV format;
- Fixed an issue where calculations of time-series, relative to pre-time-zero, would raise an error.
- Symmetrization dialog is no longer in “beta”.
5.2.0¶
- Official support for Linux.
- Plug-ins installed via the GUI can now be used right away. No restarts required.
- Added the iris.plugins.load_plugin function to load plug-ins without installing them. Useful for testing.
- Plug-ins can now have the display_name property, which will be displayed in the GUI. This is optional and backwards-compatible.
- Siwick Research Group-specific plug-ins were removed. They can be found here: https://github.com/Siwick-Research-Group/iris-ued-plugins
- Switched to Azure Pipelines for continuous integration builds;
- Added cursor information (position and image value) for processed data view;
- Fixed an issue where very large relative differences in datasets would crash the GUI displays;
- Fixed an issue where time-series fit would not display properly in fractional change mode;
5.1.3¶
- Added logging support for the GUI component. Logs can be reached via the help menu.
- Added an update check. You can see whether an update is available via the help menu, as well as via the status bar.
- Added the ability to view time-series dynamics in absolute units AND relative change.
- Pinned dependency to scikit-ued, to prevent upgrade to scikit-ued 2.0 unless appropriate.
- Pinned dependency to npstreams, to prevent upgrade to npstreams 2.0 unless appropriate.
5.1.2¶
- Fixed an issue where the QDarkStyle internal imports were absolute.
5.1.1¶
- Fixed an issue where data reduction would freeze when using more than one CPU;
- Removed the auto-update mechanism. Update checks will run in the background only;
- Fixed an issue where the in-progress indicator would freeze;
- Moved tests outside of source repository;
- Updated GUI stylesheet to QDarkStyle 2.6.6;
5.1.0¶
- Added explicit support for Python 3.7;
- Usability tweaks, for example more visible mask controls;
- Added the ability to create standalone executables via PyInstaller;
- Added the ability to create Windows installers;
5.0.5.1¶
- Fixed an issue where, due to the new forced image orientation, on-screen objects (e.g. the diffraction center finder) were not properly registered.
5.0.5¶
- Added the ability to fit exponentials to time-series;
- Added region-of-interest text bounds for easier time-series exploration
- Enforced PyQtGraph to use row-major image orientation
- Datasets are now opened in read-only mode unless absolutely necessary. This should make it safer to handle multiple instances of iris at the same time.
5.0.4¶
- Better plug-in handling and command-line interface.
5.0.3¶
The major change in this version is the ability to guess raw dataset formats using the iris.open_raw function. This makes it possible to start the GUI and open a dataset at the same time.
5.0.2¶
The package now only has dependencies that can be installed through conda.
5.0.1¶
This is a minor bug-fix release that also includes user interface niceties (e.g. link to online documentation) and user experience niceties (e.g. confirmation message if you forget pixel masks).
5.0.0¶
This new version includes a completely rewritten library and GUI front-end. Earlier datasets will need to be re-processed. New features:
- Faster performance thanks to better data layout in HDF5;
- Plug-in architecture for various raw data formats;
- Faster performance thanks to npstreams package;
- Easier to extend GUI skeleton;
- Online documentation accessible from the GUI;
- Continuous integration.
Authors¶
- Laurent P. René de Cotret (McGill)