iris.DiffractionDataset¶
-
class iris.DiffractionDataset(name, mode=None, driver=None, libver=None, userblock_size=None, swmr=False, rdcc_nslots=None, rdcc_nbytes=None, rdcc_w0=None, track_order=None, **kwds)¶
Abstraction of an HDF5 file to represent diffraction datasets.
Create a new file object. See the h5py user guide for a detailed explanation of the options.
- name
- Name of the file on disk, or file-like object. Note: for files created with the ‘core’ driver, HDF5 still requires this be non-empty.
- mode
- ‘r’: Read-only, file must exist; ‘r+’: Read/write, file must exist; ‘w’: Create file, truncate if exists; ‘w-’ or ‘x’: Create file, fail if exists; ‘a’: Read/write if exists, create otherwise (default).
- driver
- Name of the driver to use. Legal values are None (default, recommended), ‘core’, ‘sec2’, ‘stdio’, ‘mpio’.
- libver
- Library version bounds. Supported values: ‘earliest’, ‘v108’, ‘v110’, and ‘latest’. The ‘v108’ and ‘v110’ options can only be specified with the HDF5 1.10.2 library or later.
- userblock_size
- Desired size of user block. Only allowed when creating a new file (mode w, w- or x).
- swmr
- Open the file in SWMR read mode. Only used when mode = ‘r’.
- rdcc_nbytes
- Total size of the raw data chunk cache in bytes. The default size is 1024**2 (1 MB) per dataset.
- rdcc_w0
- The chunk preemption policy for all datasets. This must be between 0 and 1 inclusive and indicates the weighting according to which chunks which have been fully read or written are penalized when determining which chunks to flush from cache. A value of 0 means fully read or written chunks are treated no differently than other chunks (the preemption is strictly LRU) while a value of 1 means fully read or written chunks are always preempted before other chunks. If your application only reads or writes data once, this can be safely set to 1. Otherwise, this should be set lower depending on how often you re-read or re-write the same data. The default value is 0.75.
- rdcc_nslots
- The number of chunk slots in the raw data chunk cache for this file. Increasing this value reduces the number of cache collisions, but slightly increases the memory used. Due to the hashing strategy, this value should ideally be a prime number. As a rule of thumb, this value should be at least 10 times the number of chunks that can fit in rdcc_nbytes bytes. For maximum performance, this value should be set approximately 100 times that number of chunks. The default value is 521.
- track_order
- Track dataset/group/attribute creation order under root group if True. If None use global default h5.get_config().track_order.
- Additional keywords
- Passed on to the selected file driver.
-
__init__(name, mode=None, driver=None, libver=None, userblock_size=None, swmr=False, rdcc_nslots=None, rdcc_nbytes=None, rdcc_w0=None, track_order=None, **kwds)¶
Create a new file object. See the h5py user guide for a detailed explanation of the options; the parameters are identical to those of the class constructor above.
-
__repr__()¶ Return repr(self).
-
compression_params
¶ Compression options in the form of a dictionary
-
diff_apply
(func, callback=None, processes=1)¶ Apply a function to each diffraction pattern, possibly in parallel. The diffraction patterns will be modified in-place.
Warning
This is an irreversible in-place operation.
New in version 5.0.3.
Parameters: - func (callable) – Function that takes in an array (diffraction pattern) and returns an array of the exact same shape, with the same data-type.
- callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress updates.
- processes (int or None, optional) – Number of parallel processes to use. If None, all available processes will be used. If Single Writer Multiple Reader mode is not available, processes is ignored. New in version 5.0.6.
Raises: TypeError – if func is not a proper callable.
-
diff_data
(timedelay, relative=False, out=None)¶ Returns diffraction data at a specific time-delay.
Parameters: - timedelay (float or None) – Timedelay [ps]. If None, the entire block is returned.
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – If an out ndarray is provided, h5py can avoid making intermediate copies.
Returns: arr – Time-delay data. If out is provided, arr is a view into out.
Return type: ndarray
Raises: ValueError – if timedelay does not exist.
-
diff_eq
¶ Returns the averaged diffraction pattern for all times before photoexcitation. In case no data is available before photoexcitation, an array of zeros is returned. The result of this function is cached to minimize overhead.
Time-zero can be adjusted using the shift_time_zero method.
Returns: I – Diffracted intensity [counts]
Return type: ndarray, shape (N,)
-
classmethod
from_collection
(patterns, filename, time_points, metadata, valid_mask=None, dtype=None, ckwargs=None, callback=None, **kwargs)¶ Create a DiffractionDataset from a collection of diffraction patterns and metadata.
Parameters: - patterns (iterable of ndarray or ndarray) – Diffraction patterns. These should be in the same order as time_points. Note that the iterable can be a generator, in which case it will be consumed.
- filename (str or path-like) – Path to the assembled DiffractionDataset.
- time_points (array_like, shape (N,)) – Time-points of the diffraction patterns, in picoseconds.
- metadata (dict) – Valid keys are contained in DiffractionDataset.valid_metadata.
- valid_mask (ndarray or None, optional) – Boolean array that evaluates to True on valid pixels. This information is useful in cases where a beamblock is used.
- dtype (dtype or None, optional) – Patterns will be cast to dtype. If None (default), dtype will be set to the same data-type as the first pattern in patterns.
- ckwargs (dict, optional) – HDF5 compression keyword arguments. Refer to h5py’s documentation for details. Default is to use the lzf compression pipeline.
- callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress updates when patterns is a generator and involves large computations.
- kwargs – Keywords are passed to the h5py.File constructor. Default is file-mode ‘x’, which raises an error if the file already exists. Default libver is ‘latest’.
Returns: dataset
Return type: DiffractionDataset
-
classmethod
from_raw
(raw, filename, exclude_scans=None, valid_mask=None, processes=1, callback=None, align=True, normalize=True, ckwargs=None, dtype=None, **kwargs)¶ Create a DiffractionDataset from a subclass of AbstractRawDataset.
Parameters: - raw (AbstractRawDataset instance) – Raw dataset instance.
- filename (str or path-like) – Path to the assembled DiffractionDataset.
- exclude_scans (iterable of ints or None, optional) – Scans to exclude from the processing. Default is to include all scans.
- valid_mask (ndarray or None, optional) – Boolean array that evaluates to True on valid pixels. This information is useful in cases where a beamblock is used.
- processes (int or None, optional) – Number of processes to spawn for processing. Default is the number of available CPU cores.
- callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress update.
- align (bool, optional) – If True (default), raw images will be aligned on a per-scan basis.
- normalize (bool, optional) – If True, images within a scan are normalized to the same integrated diffracted intensity.
- ckwargs (dict or None, optional) – HDF5 compression keyword arguments. Refer to h5py’s documentation for details.
- dtype (dtype or None, optional) – Patterns will be cast to dtype. If None (default), dtype will be set to the same data-type as the first pattern in the raw dataset.
- kwargs – Keywords are passed to the h5py.File constructor. Default is file-mode ‘x’, which raises an error if the file already exists.
Returns: dataset
Return type: DiffractionDataset
See also
open_raw() - open raw datasets by guessing the appropriate format based on available plug-ins.
Raises: IOError – if the filename is already associated with a file.
-
invalid_mask
¶ Array that evaluates to True on invalid pixels (i.e. on beam-block, hot pixels, etc.)
-
metadata
¶ Dictionary of the dataset’s metadata. Dictionary is sorted alphabetically by keys.
-
resolution
¶ Resolution of diffraction patterns (px, px)
-
shift_time_zero
(shift)¶ Insert a shift in time points. Reset the shift by setting it to zero. Shifts are not cumulative: calling shift_time_zero(20) twice will not result in a shift of 40 ps.
Parameters: shift (float) – Shift [ps]. A positive value of shift will move all time-points forward in time, whereas a negative value of shift will move all time-points backwards in time.
-
symmetrize
(mod, center, kernel_size=None, callback=None, processes=1)¶ Symmetrize diffraction images based on n-fold rotational symmetry.
Warning
This is an irreversible in-place operation.
Parameters: - mod (int) – Fold symmetry number.
- center (array-like, shape (2,) or None) – Coordinates of the center (in pixels). If None, the data is symmetrized around the center of the images.
- kernel_size (float or None, optional) – If not None, every diffraction pattern will be smoothed with a gaussian kernel. kernel_size is the standard deviation of the gaussian kernel in units of pixels.
- callback (callable or None, optional) – Callable that takes an int between 0 and 99. This can be used for progress update.
- processes (int or None, optional) – Number of parallel processes to use. If None, all available processes will be used. If Single Writer Multiple Reader mode is not available, processes is ignored. New in version 5.0.6.
Raises: ValueError – if mod is not a divisor of 360.
See also
diff_apply() - apply an operation to each diffraction pattern one-by-one.
-
time_series
(rect, relative=False, out=None)¶ Integrated intensity over time inside bounds.
Parameters: - rect (4-tuple of ints) – Bounds of the region in px. Bounds are specified as [row1, row2, col1, col2].
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – 1-D ndarray in which to store the results. The shape should be compatible with (len(time_points),).
Returns: out
Return type: ndarray, ndim 1
See also
time_series_selection()
- intensity integration using arbitrary selections.
-
time_series_selection
(selection, relative=False, out=None)¶ Integrated intensity over time according to some arbitrary selection. This is a generalization of the DiffractionDataset.time_series method, which is much faster but limited to rectangular selections.
New in version 5.2.1.
Parameters: - selection (skued.Selection or ndarray, dtype bool, shape (N,M)) – A selection mask that dictates the regions to integrate in each scattering pattern. If selection is an array, an ArbitrarySelection will be used, and performance may be degraded. The selection mask evaluates to True in the regions to integrate, and must have the same shape as one scattering pattern (i.e. two-dimensional).
- relative (bool, optional) – If True, data is returned relative to the average of all diffraction patterns before photoexcitation.
- out (ndarray or None, optional) – 1-D ndarray in which to store the results. The shape should be compatible with (len(time_points),).
Returns: out
Return type: ndarray, ndim 1
Raises: ValueError – if the shape of selection does not match the scattering patterns.
See also
time_series()
- integrated intensity in a rectangle.
-
valid_mask
¶ Array that evaluates to True on valid pixels (i.e. not on beam-block, not hot pixels, etc.)