WrightTools.data package

Data class and associated.

class WrightTools.data.Data(*args, **kwargs)[source]

Bases: WrightTools._group.Group

Multidimensional dataset.

__init__(*args, **kwargs)[source]

Create a new Group object by binding to a low-level GroupID.

attrs

Attributes attached to this object

axes
axis_expressions

Axis expressions.

axis_names

Axis names.

bring_to_front(channel)[source]

Bring a specific channel to the front (index zero) of the channels list.

All other channels get pushed back but remain in order.

Parameters:channel (int or str) – Channel index or name.
channel_names

Channel names.

channels

Channels.

chop(*args, at={}, parent=None, verbose=True) → WrightTools.collection._collection.Collection[source]

Divide the dataset into its lower-dimensionality components.

Parameters:
  • axis (str or int (args)) – Axes of the returned data objects. Strings refer to the names of axes in this object, integers refer to their index. Provide multiple axes to return multidimensional data objects.
  • at (dict (optional)) – Choice of position along an axis. Keys are axis names, values are lists [position, input units]. If exact position does not exist, the closest valid position is used.
  • parent (WrightTools Collection instance (optional)) – Collection to place the new “chop” collection within. Default is None (new parent).
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

Collection of chopped data objects.

Return type:

WrightTools Collection

Examples

>>> data.axis_names
['d2', 'w1', 'w2']

Get all w1 wigners.

>>> datas = data.chop('d2', 'w1')
>>> len(datas)
51

Get 2D frequency at d2=0 fs.

>>> datas = data.chop('w1', 'w2', at={'d2': [0, 'fs']})
>>> len(datas)
1
>>> datas[0].axis_names
['w1', 'w2']
>>> datas[0].d2[:]
0.

See also

collapse()
Collapse the dataset along one axis.
split()
Split the dataset while maintaining its dimensionality.
class_name = 'Data'
clear() → None. Remove all items from D.
close()

Close the file that contains the Group.

All groups which are in the file will be closed and removed from the _instances dictionaries. Tempfiles, if they exist, will be removed.

collapse(axis, method='integrate')[source]

Collapse the dataset along one axis, adding lower rank channels.

New channels have names <channel name>_<axis name>_<method>.

Parameters:
  • axis (int or str) – The axis to collapse along. If given as an integer, the axis in the underlying array is used. If given as a string, the axis must exist and be a one-dimensional, array-aligned axis (i.e. have a shape with exactly one value which is not 1). The axis to collapse along is then inferred from the shape of that axis.
  • method ({'integrate', 'average', 'sum', 'max', 'min'} (optional)) – The method of collapsing the given axis. Default is integrate. All methods but integrate disregard NaNs. May also be a list of methods, one per channel, allowing different treatment for each channel; in that case, None indicates that the corresponding channel is left unchanged.

See also

chop()
Divide the dataset into its lower-dimensionality components.
split()
Split the dataset while maintaining its dimensionality.
constant_expressions

Constant expressions.

constant_names

Constant names.

constant_units

All constant units.

constants
convert(destination_units, *, convert_variables=False, verbose=True)[source]

Convert all compatible axes and constants to the given units.

Parameters:
  • destination_units (str) – Destination units.
  • convert_variables (boolean (optional)) – Toggle conversion of stored arrays. Default is False
  • verbose (bool (optional)) – Toggle talkback. Default is True.

See also

Axis.convert()
Convert a single axis object to compatible units. Call on an axis object in data.axes.
copy(parent=None, name=None, verbose=True)

Create a copy under parent.

All children are copied as well.

Parameters:
  • parent (WrightTools Collection (optional)) – Parent to copy within. If None, copy is created in root of new tempfile. Default is None.
  • name (string (optional)) – Name of new copy at destination. If None, the current natural name is used. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

Created copy.

Return type:

Group

create_channel(name, values=None, *, shape=None, units=None, dtype=None, **kwargs) → WrightTools.data._channel.Channel[source]

Append a new channel.

Parameters:
  • name (string) – Unique name for this channel.
  • values (array (optional)) – Array of values. If None, an empty array matching the data shape is created. Default is None.
  • shape (tuple of int) – Shape to use. Must broadcast with the full shape. Only used if values is None. Default is the full shape of self.
  • units (string (optional)) – Channel units. Default is None.
  • dtype (numpy.dtype (optional)) – dtype to use for dataset, default is np.float64. Only used if values is None.
  • kwargs (dict) – Additional keyword arguments passed to Channel instantiation.
Returns:

Created channel.

Return type:

Channel

create_constant(expression, *, verbose=True)[source]

Append a constant to the stored list.

Parameters:
  • expression (str) – Expression for the new constant.
  • verbose (boolean (optional)) – Toggle talkback. Default is True

See also

set_constants()
Remove and replace all constants.
remove_constant()
Remove an individual constant.
create_dataset(name, shape=None, dtype=None, data=None, **kwds)

Create a new HDF5 dataset

name
Name of the dataset (absolute or relative). Provide None to make an anonymous dataset.
shape
Dataset shape. Use “()” for scalar datasets. Required if “data” isn’t provided.
dtype
Numpy dtype or string. If omitted, dtype(‘f’) will be used. Required if “data” isn’t provided; otherwise, overrides data array’s dtype.
data
Provide data to initialize the dataset. If used, you can omit shape and dtype arguments.

Keyword-only arguments:

chunks
(Tuple) Chunk shape, or True to enable auto-chunking.
maxshape
(Tuple) Make the dataset resizable up to this shape. Use None for axes you want to be unlimited.
compression
(String or int) Compression strategy. Legal values are ‘gzip’, ‘szip’, ‘lzf’. If an integer in range(10), this indicates gzip compression level. Otherwise, an integer indicates the number of a dynamically loaded compression filter.
compression_opts
Compression settings. This is an integer for gzip, 2-tuple for szip, etc. If specifying a dynamically loaded compression filter number, this must be a tuple of values.
scaleoffset
(Integer) Enable scale/offset filter for (usually) lossy compression of integer or floating-point data. For integer data, the value of scaleoffset is the number of bits to retain (pass 0 to let HDF5 determine the minimum number of bits necessary for lossless compression). For floating point data, scaleoffset is the number of digits after the decimal place to retain; stored values thus have absolute error less than 0.5*10**(-scaleoffset).
shuffle
(T/F) Enable shuffle filter.
fletcher32
(T/F) Enable fletcher32 error detection. Not permitted in conjunction with the scale/offset filter.
fillvalue
(Scalar) Use this value for uninitialized parts of the dataset.
track_times
(T/F) Enable dataset creation timestamps.
create_group(name, track_order=False)

Create and return a new subgroup.

Name may be absolute or relative. Fails if the target name already exists.

track_order
Track dataset/group creation order under this group if True.
create_variable(name, values=None, *, shape=None, units=None, dtype=None, **kwargs) → WrightTools.data._variable.Variable[source]

Add new child variable.

Parameters:
  • name (string) – Unique identifier.
  • values (array-like (optional)) – Array to populate variable with. If None, the variable will be filled with NaN. Default is None.
  • shape (tuple of int) – Shape to use. Must broadcast with the full shape. Only used if values is None. Default is the full shape of self.
  • units (string (optional)) – Variable units. Default is None.
  • dtype (numpy.dtype (optional)) – dtype to use for dataset, default is np.float64. Only used if values is None.
  • kwargs – Additional kwargs to variable instantiation.
Returns:

New child variable.

Return type:

WrightTools Variable

created
datasets

Datasets.

downscale(tup, name=None, parent=None) → WrightTools.data._data.Data[source]

Down sample the data array using local averaging.

See skimage.transform.downscale_local_mean for more info.

Parameters:
  • tup (tuple of ints) – The collection of step sizes by which each axis is binned. Each axis is sliced with step size determined by the tuple. To keep an axis sampling unchanged, use 1 or None
  • name (string (optional)) – The name of the new data object. Default is None.
  • parent (WrightTools Collection instance (optional)) – Collection to place the downscaled data object. Default is None (new parent).
Returns:

New data object with the downscaled channels and axes

Return type:

WrightTools Data instance

See also

zoom()
Zoom the data array using spline interpolation of the requested order.
file

Return a File instance associated with this object

flush()

Ensure contents are written to file.

fullpath

Full path: file and internal structure.
get(name, default=None, getclass=False, getlink=False)

Retrieve an item or other information.

“name” given only:
Return the item, or “default” if it doesn’t exist
“getclass” is True:
Return the class of object (Group, Dataset, etc.), or “default” if nothing with that name exists
“getlink” is True:
Return HardLink, SoftLink or ExternalLink instances. Return “default” if nothing with that name exists.
“getlink” and “getclass” are True:
Return HardLink, SoftLink and ExternalLink classes. Return “default” if nothing with that name exists.

Example:

>>> cls = group.get('foo', getclass=True)
>>> if cls == SoftLink:
...     print('"foo" is a soft link!')
get_nadir(channel=0) → tuple[source]

Get the coordinates, in units, of the minimum in a channel.

Parameters:channel (int or str (optional)) – Channel. Default is 0.
Returns:Coordinates in units for each axis.
Return type:generator of numbers
get_zenith(channel=0) → tuple[source]

Get the coordinates, in units, of the maximum in a channel.

Parameters:channel (int or str (optional)) – Channel. Default is 0.
Returns:Coordinates in units for each axis.
Return type:generator of numbers
heal(channel=0, method='linear', fill_value=nan, verbose=True)[source]

Remove nans from channel using interpolation.

Parameters:
  • channel (int or str (optional)) – Channel to heal. Default is 0.
  • method ({'linear', 'nearest', 'cubic'} (optional)) – The interpolation method. Note that cubic interpolation is only possible for 1D and 2D data. See griddata for more information. Default is linear.
  • fill_value (number-like (optional)) – The value written to pixels that cannot be filled by interpolation. Default is nan.
  • verbose (bool (optional)) – Toggle talkback. Default is True.

Note

Healing may take several minutes for large datasets. Interpolation time goes as nearest, linear, then cubic.

id

Low-level identifier appropriate for this object

item_names

Item names.

items()

Get a view object on member items

keys()

Get a view object on member names

kind

Kind.

level(channel, axis, npts, *, verbose=True)[source]

Subtract the average value of npts at the edge of a given axis.

Parameters:
  • channel (int or str) – Channel to level.
  • axis (int) – Axis to level along.
  • npts (int) – Number of points to average for each slice. Positive numbers take points at leading indices and negative numbers take points at trailing indices.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
map_variable(variable, points, input_units='same', *, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Map points of an axis to new points using linear interpolation.

Out-of-bounds points are written nan.

Parameters:
  • variable (string) – The variable to map onto.
  • points (array-like or int) – If array, the new points. If int, new points will have the same limits, with int defining the number of evenly spaced points between.
  • input_units (str (optional)) – The units of the new points. Default is same, which assumes the new points have the same units as the axis.
  • name (string (optional)) – The name of the new data object. If None, generated from natural_name. Default is None.
  • parent (WrightTools.Collection (optional)) – Parent of new data object. If None, data is made at root of a new temporary file.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

New data object.

Return type:

WrightTools.Data

move(source, dest)

Move a link to a new location in the file.

If “source” is a hard link, this effectively renames the object. If “source” is a soft or external link, the link itself is moved, with its value unmodified.

name

Return the full name of this object. None if anonymous.

natural_name

Natural name.

ndim

Get number of dimensions.

offset(points, offsets, along, offset_axis, units='same', offset_units='same', mode='valid', method='linear', verbose=True)[source]

Offset one axis based on another axis’ values.

Useful for correcting instrumental artifacts such as zerotune.

Parameters:
  • points (1D array-like) – Points.
  • offsets (1D array-like) – Offsets.
  • along (str or int) – Axis that points array lies along.
  • offset_axis (str or int) – Axis to offset using offsets.
  • units (str (optional)) – Units of points array.
  • offset_units (str (optional)) – Units of offsets array.
  • mode ({'valid', 'full', 'old'} (optional)) – Define how far the new axis will extend. Points outside of valid interpolation range will be written nan.
  • method ({'linear', 'nearest', 'cubic'} (optional)) – The interpolation method. Note that cubic interpolation is only possible for 1D and 2D data. See griddata for more information. Default is linear.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
>>> points  # an array of w1 points
>>> offsets  # an array of d1 corrections
>>> data.offset(points, offsets, 'w1', 'd1')
parent

Parent.

pop(k[, d]) → v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() → (k, v), remove and return some (key, value) pair

as a 2-tuple; but raise KeyError if D is empty.

print_tree(*, verbose=True)[source]

Print an ASCII-formatted tree representation of the data contents.

ref

An (opaque) HDF5 reference to this object

regionref

Create a region reference (Datasets only).

The syntax is regionref[<slices>]. For example, dset.regionref[…] creates a region reference in which the whole dataset is selected.

Can also be used to determine the shape of the referenced dataset (via .shape property), or the shape of the selection (via the .selection property).

remove_channel(channel, *, verbose=True)[source]

Remove channel from data.

Parameters:
  • channel (int or str) – Channel index or name to remove.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
remove_constant(constant, *, verbose=True)[source]

Remove a constant from the stored list.

Parameters:
  • constant (str or Constant or int) – The constant to remove, given as an expression, Constant instance, or index.
  • verbose (boolean (optional)) – Toggle talkback. Default is True

See also

set_constants()
Remove and replace all constants.
create_constant()
Add an individual constant.
remove_variable(variable, *, implied=True, verbose=True)[source]

Remove variable from data.

Parameters:
  • variable (int or str) – Variable index or name to remove.
  • implied (boolean (optional)) – Toggle deletion of other variables that start with the same name. Default is True.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
rename_channels(*, verbose=True, **kwargs)[source]

Rename a set of channels.

Parameters:
  • kwargs – Keyword arguments of the form current='new'.
  • verbose (boolean (optional)) – Toggle talkback. Default is True
rename_variables(*, implied=True, verbose=True, **kwargs)[source]

Rename a set of variables.

Parameters:
  • kwargs – Keyword arguments of the form current='new'.
  • implied (boolean (optional)) – Toggle inclusion of other variables that start with the same name. Default is True.
  • verbose (boolean (optional)) – Toggle talkback. Default is True
require_dataset(name, shape, dtype, exact=False, **kwds)

Open a dataset, creating it if it doesn’t exist.

If keyword “exact” is False (default), an existing dataset must have the same shape and a conversion-compatible dtype to be returned. If True, the shape and dtype must match exactly.

Other dataset keywords (see create_dataset) may be provided, but are only used if a new dataset is to be created.

Raises TypeError if an incompatible object already exists, or if the shape or dtype don’t match according to the above rules.

require_group(name)

Return a group, creating it if it doesn’t exist.

TypeError is raised if something with that name already exists that isn’t a group.

save(filepath=None, overwrite=False, verbose=True)

Save as root of a new file.

Parameters:
  • filepath (Path-like object (optional)) – Filepath to write. If None, file is created using natural_name.
  • overwrite (boolean (optional)) – Toggle overwrite behavior. Default is False.
  • verbose (boolean (optional)) – Toggle talkback. Default is True
Returns:

Written filepath.

Return type:

str

set_constants(*constants, verbose=True)[source]

Set the constants associated with the data.

Parameters:
  • constants (str) – Expressions for the new set of constants.
  • verbose (boolean (optional)) – Toggle talkback. Default is True

See also

transform()
Similar method except for axes.
create_constant()
Add an individual constant.
remove_constant()
Remove an individual constant.
setdefault(k[, d]) → D.get(k,d), also set D[k]=d if k not in D
shape

Shape.

share_nans()[source]

Share not-a-numbers between all channels.

If any channel is nan at a given index, all channels will be nan at that index after this operation.

Uses the share_nans method found in wt.kit.

size

Size.

smooth(factors, channel=None, verbose=True) → WrightTools.data._data.Data[source]

Smooth a channel using an n-dimensional Kaiser window.

Note, all arrays are loaded into memory.

Parameters:
  • factors (int or list of int) – The smoothing factor. You may provide a list of smoothing factors for each axis.
  • channel (int or str or None (optional)) – The channel to smooth. If None, all channels will be smoothed. Default is None.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
source

Source.

split(expression, positions, *, units=None, parent=None, verbose=True) → WrightTools.collection._collection.Collection[source]

Split the data object along a given expression, in units.

Parameters:
  • expression (int or str) – The expression to split along. If given as an integer, the axis at that index is used.
  • positions (number-type or 1D array-type) – The position(s) to split at, in units.
  • units (str (optional)) – The units of the given positions. Default is same, which assumes input units are identical to first variable units.
  • parent (WrightTools.Collection (optional)) – The parent collection in which to place the ‘split’ collection. Default is a new Collection.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

A Collection of data objects. The order of the objects is such that the axis points retain their original order.

Return type:

WrightTools.collection.Collection

See also

chop()
Divide the dataset into its lower-dimensionality components.
collapse()
Collapse the dataset along one axis.
transform(*axes, verbose=True)[source]

Transform the data.

Parameters:
  • axes (strings) – Expressions for the new set of axes.
  • verbose (boolean (optional)) – Toggle talkback. Default is True

See also

set_constants()
Similar method except for constants
units

All axis units.

update([E, ]**F) → None. Update D from mapping/iterable E and F.

If E present and has a .keys() method, does: for k in E: D[k] = E[k] If E present and lacks .keys() method, does: for (k, v) in E: D[k] = v In either case, this is followed by: for k, v in F.items(): D[k] = v

values()

Get a view object on member objects

variable_names

Variable names.

variables

Variables.

visit(func)

Recursively visit all names in this group and subgroups (HDF5 1.8).

You supply a callable (function, method or callable object); it will be called exactly once for each link in this group and every group below it. Your callable must conform to the signature:

func(<member name>) => <None or return value>

Returning None continues iteration, returning anything else stops and immediately returns that value from the visit method. No particular order of iteration within groups is guaranteed.

Example:

>>> # List the entire contents of the file
>>> f = File("foo.hdf5")
>>> list_of_names = []
>>> f.visit(list_of_names.append)
visititems(func)

Recursively visit names and objects in this group (HDF5 1.8).

You supply a callable (function, method or callable object); it will be called exactly once for each link in this group and every group below it. Your callable must conform to the signature:

func(<member name>, <object>) => <None or return value>

Returning None continues iteration, returning anything else stops and immediately returns that value from the visit method. No particular order of iteration within groups is guaranteed.

Example:

>>> # Get a list of all datasets in the file
>>> mylist = []
>>> def func(name, obj):
...     if isinstance(obj, Dataset):
...         mylist.append(name)
...
>>> f = File('foo.hdf5')
>>> f.visititems(func)

zoom(factor, order=1, verbose=True)[source]

Zoom the data array using spline interpolation of the requested order.

The number of points along each axis is increased by factor. See scipy ndimage for more info.

Parameters:
  • factor (float) – The number of points along each axis will increase by this factor.
  • order (int (optional)) – The order of the spline used to interpolate onto new points.
  • verbose (bool (optional)) – Toggle talkback. Default is True.

See also

downscale()
Down-sample the data array using local averaging.
WrightTools.data.join(datas, *, atol=None, rtol=None, name='join', parent=None, method='first', verbose=True) → WrightTools.data._data.Data[source]

Join a list of data objects together.

Joined datas must have the same transformation applied. This transformation will define the array order for the joined dataset. Each axis in the applied transformation must consist of a single variable; the points along each axis of the result will be sorted.

Join does not perform any interpolation. For that, see Data.map_variable or Data.heal.

Parameters:
  • datas (list of data or WrightTools.Collection) – The list or collection of data objects to join together.
  • atol (numeric or list of numeric) – The absolute tolerance to use (in np.isclose) to consider points overlapped. If given as a single number, applies to all axes. If given as a list, must have same length as the data transformation. None in the list invokes default behavior. Default is 10% of the minimum spacing between consecutive points in any input data file.
  • rtol (numeric or list of numeric) – The relative tolerance to use (in np.isclose) to consider points overlapped. If given as a single number, applies to all axes. If given as a list, must have same length as the data transformation. None in the list invokes default behavior. Default is 4 * np.finfo(dtype).resolution for floating point types, 0 for integer types.
  • name (str (optional)) – The name for the data object which is created. Default is ‘join’.
  • parent (WrightTools.Collection (optional)) – The location to place the joined data object. Default is new temp file at root.
  • method ({'first', 'last', 'min', 'max', 'sum', 'mean'}) – Mode to use for merged points in the joined space. Default is ‘first’.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

A new Data instance.

Return type:

WrightTools.data.Data

class WrightTools.data.Axis(parent, expression, units=None)[source]

Bases: object

Axis class.

__init__(parent, expression, units=None)[source]

Data axis.

Parameters:
  • parent (WrightTools.Data) – Parent data object.
  • expression (string) – Axis expression.
  • units (string (optional)) – Axis units. Default is None.
convert(destination_units, *, convert_variables=False)[source]

Convert axis to destination_units.

Parameters:
  • destination_units (string) – Destination units.
  • convert_variables (boolean (optional)) – Toggle conversion of stored arrays. Default is False.
full

Axis expression evaluated and repeated to match the shape of the parent data object.

identity

Complete identifier written to disk in data.attrs[‘axes’].

label

A latex formatted label representing axis expression.

masked

Axis expression evaluated, and masked with NaN shared from data channels.

max()[source]

Axis max.

min()[source]

Axis min.

natural_name

Valid python identifier representation of the expression.

ndim

Get number of dimensions.

points

Squeezed array.

shape

Shape.

size

Size.

units_kind

Units kind.

variables

Variables.

class WrightTools.data.Channel(parent, id, *, units=None, null=None, signed=None, label=None, label_seed=None, **kwargs)[source]

Bases: WrightTools._dataset.Dataset

Channel.

__init__(parent, id, *, units=None, null=None, signed=None, label=None, label_seed=None, **kwargs)[source]

Construct a channel object.

Parameters:
  • values (array-like) – Values.
  • name (string) – Channel name.
  • units (string (optional)) – Channel units. Default is None.
  • null (number (optional)) – Channel null. Default is None (0).
  • signed (boolean (optional)) – Channel signed flag. Default is None (guess).
  • label (string.) – Label. Default is None.
  • label_seed (list of strings) – Label seed. Default is None.
  • **kwargs – Additional keyword arguments are added to the attrs dictionary and to the natural namespace of the object (if possible).
argmax()

Index of the maximum, ignoring nans.

argmin()

Index of the minimum, ignoring nans.

astype(dtype)

Get a context manager allowing you to perform reads to a different destination type, e.g.:

>>> with dataset.astype('f8'):
...     double_precision = dataset[0:100:2]
attrs

Attributes attached to this object

chunks

Dataset chunks (or None)

chunkwise(func, *args, **kwargs)

Execute a function for each chunk in the dataset.

Order of execution is not guaranteed.

Parameters:
  • func (function) – Function to execute. First two arguments must be dataset, slices.
  • args (optional) – Additional (unchanging) arguments passed to func.
  • kwargs (optional) – Additional (unchanging) keyword arguments passed to func.
Returns:

Dictionary of index: function output. Each index is the lowest corner of its chunk.

Return type:

collections OrderedDict

class_name = 'Channel'
clip(min=None, max=None, replace=nan)

Clip values outside of a defined range.

Parameters:
  • min (number (optional)) – New channel minimum. Default is None.
  • max (number (optional)) – New channel maximum. Default is None.
  • replace (number or 'value' (optional)) – Replace behavior. Default is nan.
compression

Compression strategy (or None)

compression_opts

Compression setting. Int(0-9) for gzip, 2-tuple for szip.

convert(destination_units)

Convert units.

Parameters:destination_units (string) – Units to convert into.
dims

Access dimension scales attached to this dataset.

dtype

Numpy dtype representing the datatype

file

Return a File instance associated with this object

fillvalue

Fill value for this dataset (0 by default)

fletcher32

Fletcher32 filter is present (T/F)

flush

Flush the dataset data and metadata to the file. If the dataset is chunked, raw data chunks are written to the file.

This is part of the SWMR features and only exists when the HDF5 library version is >= 1.9.178.

full
fullpath

Full path: file and internal structure.
id

Low-level identifier appropriate for this object

len()

The size of the first axis. TypeError if scalar.

Use of this method is preferred to len(dset), as Python’s built-in len() cannot handle values greater than 2**32 on 32-bit systems.

log(base=2.718281828459045, floor=None)

Take the log of the entire dataset.

Parameters:
  • base (number (optional)) – Base of log. Default is e.
  • floor (number (optional)) – Clip values below floor after log. Default is None.
log10(floor=None)

Take the log base 10 of the entire dataset.

Parameters:floor (number (optional)) – Clip values below floor after log. Default is None.
log2(floor=None)

Take the log base 2 of the entire dataset.

Parameters:floor (number (optional)) – Clip values below floor after log. Default is None.
mag() → complex[source]

Channel magnitude (maximum deviation from null).

major_extent

Maximum deviation from null.

max()

Maximum, ignoring nans.

maxshape

Shape up to which this dataset can be resized. Axes with value None have no resize limit.

min()

Minimum, ignoring nans.

minor_extent

Minimum deviation from null.

name

Return the full name of this object. None if anonymous.

natural_name

Natural name of the dataset. May be different from name.

ndim

Numpy-style attribute giving the number of dimensions

normalize(mag=1.0)[source]

Normalize the channel: set null to 0 and mag to the given value.

Parameters:mag (float (optional)) – New value of mag. Default is 1.
null
parent

Parent.

points

Squeezed array.

read_direct(dest, source_sel=None, dest_sel=None)

Read data directly from HDF5 into an existing NumPy array.

The destination array must be C-contiguous and writable. Selections must be the output of numpy.s_[<args>].

Broadcasting is supported for simple indexing.

ref

An (opaque) HDF5 reference to this object

refresh

Refresh the dataset metadata by reloading from the file.

This is part of the SWMR features and only exists when the HDF5 library version is >= 1.9.178.

regionref

Create a region reference (Datasets only).

The syntax is regionref[<slices>]. For example, dset.regionref[…] creates a region reference in which the whole dataset is selected.

Can also be used to determine the shape of the referenced dataset (via .shape property), or the shape of the selection (via the .selection property).

resize(size, axis=None)

Resize the dataset, or the specified axis.

The dataset must be stored in chunked format; it can be resized up to the “maximum shape” (keyword maxshape) specified at creation time. The rank of the dataset cannot be changed.

“Size” should be a shape tuple, or if an axis is specified, an integer.

BEWARE: This functions differently from the NumPy resize() method! The data is not “reshuffled” to fit in the new shape; each axis is grown or shrunk independently. The coordinates of existing data are fixed.

scaleoffset

Scale/offset filter settings. For integer data types, this is the number of bits stored, or 0 for auto-detected. For floating point data types, this is the number of decimal places retained. If the scale/offset filter is not in use, this is None.

shape

Numpy-style shape tuple giving dataset dimensions

shuffle

Shuffle filter present (T/F)

signed
size

Numpy-style attribute giving the total dataset size

slices()

Return a generator yielding tuples of slice objects.

Order is not guaranteed.

symmetric_root(root=2)
trim(neighborhood, method='ztest', factor=3, replace='nan', verbose=True)[source]

Remove outliers from the dataset.

Identifies outliers by comparing each point to its neighbors using a statistical test.

Parameters:
  • neighborhood (list of integers) – Size of the neighborhood in each dimension. Length of the list must be equal to the dimensionality of the channel.
  • method ({'ztest'} (optional)) –

    Statistical test used to detect outliers. Default is ztest.

    ztest
    Compare point deviation from neighborhood mean to neighborhood standard deviation.
  • factor (number (optional)) – Tolerance factor. Default is 3.
  • replace ({'nan', 'mean', 'mask', number} (optional)) –

    Behavior of outlier replacement. Default is nan.

    nan
    Outliers are replaced by numpy nans.
    mean
Each outlier is replaced by the mean of its neighborhood.
    mask
    Array is masked at outliers.
    number
Outliers are replaced by the given number.
Returns:

Indices of trimmed outliers.

Return type:

list of tuples

See also

clip()
Remove pixels outside of a certain range.
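The ztest criterion above can be sketched in plain NumPy (this is an illustrative sketch of the documented test, not the WrightTools implementation): a point is flagged when its deviation from the neighborhood mean exceeds factor neighborhood standard deviations.

```python
import numpy as np

def is_outlier(value, neighborhood, factor=3):
    """Hypothetical ztest sketch: flag `value` when it deviates from the
    neighborhood mean by more than `factor` standard deviations."""
    mean = neighborhood.mean()
    std = neighborhood.std()
    return abs(value - mean) > factor * std

neigh = np.array([1.0, 1.1, 0.9, 1.0])  # neighbors of the suspect point
print(is_outlier(50.0, neigh))   # True  (clear outlier)
print(is_outlier(1.05, neigh))   # False (within 3 sigma of neighborhood)
```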
units

Units.

value

Alias for dataset[()]

write_direct(source, source_sel=None, dest_sel=None)

Write data directly to HDF5 from a NumPy array.

The source array must be C-contiguous. Selections must be the output of numpy.s_[<args>].

Broadcasting is supported for simple indexing.

class WrightTools.data.Constant(parent, expression, units=None, format_spec='0.3g', round_spec=None)[source]

Bases: WrightTools.data._axis.Axis

Constant class.

__init__(parent, expression, units=None, format_spec='0.3g', round_spec=None)[source]

Data constant.

Parameters:
  • parent (WrightTools.Data) – Parent data object.
  • expression (string) – Constant expression.
  • units (string (optional)) – Constant units. Default is None.
  • format_spec (string (optional)) – Format string specification, as passed to format() Default is “0.3g”
  • round_spec (int or None (optional)) – Decimal digits to round to before formatting, as passed to round(). Default is None (no rounding).
convert(destination_units, *, convert_variables=False)

Convert axis to destination_units.

Parameters:
  • destination_units (string) – Destination units.
  • convert_variables (boolean (optional)) – Toggle conversion of stored arrays. Default is False.
full

Axis expression evaluated and repeated to match the shape of the parent data object.

identity

Complete identifier written to disk in data.attrs[‘axes’].

label

A latex formatted label representing constant expression and united value.

masked

Axis expression evaluated, and masked with NaN shared from data channels.

max()

Axis max.

min()

Axis min.

natural_name

Valid python identifier representation of the expression.

ndim

Get number of dimensions.

points

Squeezed array.

shape

Shape.

size

Size.

std

The standard deviation of the constant.

units_kind

Units kind.

value

The value of the constant.

variables

Variables.

class WrightTools.data.Variable(parent, id, units=None, **kwargs)[source]

Bases: WrightTools._dataset.Dataset

Variable.

__init__(parent, id, units=None, **kwargs)[source]

Variable.

Parameters:
  • parent (WrightTools.Data) – Parent data object.
  • id (h5py DatasetID) – Dataset ID.
  • units (string (optional)) – Variable units. Default is None.
  • kwargs – Additional keys and values to be written into dataset attrs.
argmax()

Index of the maximum, ignoring nans.

argmin()

Index of the minimum, ignoring nans.

astype(dtype)

Get a context manager allowing you to perform reads to a different destination type, e.g.:

>>> with dataset.astype('f8'):
...     double_precision = dataset[0:100:2]
attrs

Attributes attached to this object

chunks

Dataset chunks (or None)

chunkwise(func, *args, **kwargs)

Execute a function for each chunk in the dataset.

Order of execution is not guaranteed.

Parameters:
  • func (function) – Function to execute. First two arguments must be dataset, slices.
  • args (optional) – Additional (unchanging) arguments passed to func.
  • kwargs (optional) – Additional (unchanging) keyword arguments passed to func.
Returns:

Dictionary of index: function output. Index is to lowest corner of each chunk.

Return type:

collections.OrderedDict
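The chunkwise pattern can be sketched in pure NumPy (a hypothetical sketch mirroring the documented return shape, not the WrightTools internals): apply func to each chunk and collect the results keyed by the chunk's lowest-corner index.

```python
import numpy as np

def chunkwise(arr, chunk_shape, func):
    """Illustrative sketch: run `func` on each chunk of a 2D array,
    returning {lowest-corner index: func output}."""
    out = {}
    for i in range(0, arr.shape[0], chunk_shape[0]):
        for j in range(0, arr.shape[1], chunk_shape[1]):
            sl = (slice(i, i + chunk_shape[0]), slice(j, j + chunk_shape[1]))
            out[(i, j)] = func(arr[sl])
    return out

a = np.arange(16).reshape(4, 4)
sums = chunkwise(a, (2, 2), np.sum)
print(sums[(0, 0)])  # 10, the sum of the top-left 2x2 chunk
```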

class_name = 'Variable'
clip(min=None, max=None, replace=nan)

Clip values outside of a defined range.

Parameters:
  • min (number (optional)) – New channel minimum. Default is None.
  • max (number (optional)) – New channel maximum. Default is None.
  • replace (number or 'value' (optional)) – Replace behavior. Default is nan.
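A minimal NumPy sketch of the default replace='nan' behavior described above (illustrative only, not the WrightTools implementation): values outside [min, max] become NaN.

```python
import numpy as np

def clip_nan(arr, vmin=None, vmax=None):
    """Sketch of clip with replace='nan': out-of-range values -> NaN."""
    out = arr.astype(float).copy()
    if vmin is not None:
        out[out < vmin] = np.nan
    if vmax is not None:
        out[out > vmax] = np.nan
    return out

x = np.array([0.5, 1.5, 2.5])
print(clip_nan(x, vmin=1.0, vmax=2.0))  # [nan 1.5 nan]
```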
compression

Compression strategy (or None)

compression_opts

Compression setting. Int(0-9) for gzip, 2-tuple for szip.

convert(destination_units)

Convert units.

Parameters:destination_units (string (optional)) – Units to convert into.
dims

Access dimension scales attached to this dataset.

dtype

Numpy dtype representing the datatype

file

Return a File instance associated with this object

fillvalue

Fill value for this dataset (0 by default)

fletcher32

Fletcher32 filter is present (T/F)

flush

Flush the dataset data and metadata to the file. If the dataset is chunked, raw data chunks are written to the file.

This is part of the SWMR features and only exists when the HDF5 library version is >= 1.9.178.

full
fullpath

Full path: file and internal structure.
id

Low-level identifier appropriate for this object

label
len()

The size of the first axis. TypeError if scalar.

Use of this method is preferred to len(dset), as Python’s built-in len() cannot handle values greater than 2**32 on 32-bit systems.

log(base=2.718281828459045, floor=None)

Take the log of the entire dataset.

Parameters:
  • base (number (optional)) – Base of log. Default is e.
  • floor (number (optional)) – Clip values below floor after log. Default is None.
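The log-with-floor behavior of log/log10/log2 can be sketched in NumPy (an illustrative sketch of the documented semantics, not the library internals): take the log in the given base, then clip values below floor up to floor.

```python
import numpy as np

def log_floor(arr, base=np.e, floor=None):
    """Sketch of log(base) with an optional lower clip at `floor`."""
    out = np.log(arr) / np.log(base)  # change of base
    if floor is not None:
        out = np.clip(out, floor, None)
    return out

x = np.array([1e-6, 1.0, 100.0])
print(log_floor(x, base=10, floor=-3))  # [-3.  0.  2.]
```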
log10(floor=None)

Take the log base 10 of the entire dataset.

Parameters:floor (number (optional)) – Clip values below floor after log. Default is None.
log2(floor=None)

Take the log base 2 of the entire dataset.

Parameters:floor (number (optional)) – Clip values below floor after log. Default is None.
max()

Maximum, ignoring nans.

maxshape

Shape up to which this dataset can be resized. Axes with value None have no resize limit.

min()

Minimum, ignoring nans.

name

Return the full name of this object. None if anonymous.

natural_name

Natural name of the dataset. May be different from name.

ndim

Numpy-style attribute giving the number of dimensions

parent

Parent.

points

Squeezed array.

read_direct(dest, source_sel=None, dest_sel=None)

Read data directly from HDF5 into an existing NumPy array.

The destination array must be C-contiguous and writable. Selections must be the output of numpy.s_[<args>].

Broadcasting is supported for simple indexing.
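The selections mentioned above are plain slice objects built with numpy.s_, for example:

```python
import numpy as np

# numpy.s_ is just slice-literal syntax: it returns slice objects
# (or tuples of them for multidimensional selections).
source_sel = np.s_[0:100:2]
print(source_sel)  # slice(0, 100, 2)

dest_sel = np.s_[10:20, ::2]
print(dest_sel)    # (slice(10, 20, None), slice(None, None, 2))
```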

ref

An (opaque) HDF5 reference to this object

refresh

Refresh the dataset metadata by reloading from the file.

This is part of the SWMR features and only exists when the HDF5 library version is >= 1.9.178.

regionref

Create a region reference (Datasets only).

The syntax is regionref[<slices>]. For example, dset.regionref[…] creates a region reference in which the whole dataset is selected.

Can also be used to determine the shape of the referenced dataset (via .shape property), or the shape of the selection (via the .selection property).

resize(size, axis=None)

Resize the dataset, or the specified axis.

The dataset must be stored in chunked format; it can be resized up to the “maximum shape” (keyword maxshape) specified at creation time. The rank of the dataset cannot be changed.

“Size” should be a shape tuple, or if an axis is specified, an integer.

BEWARE: This functions differently than the NumPy resize() method! The data is not “reshuffled” to fit in the new shape; each axis is grown or shrunk independently. The coordinates of existing data are fixed.

scaleoffset

Scale/offset filter settings. For integer data types, this is the number of bits stored, or 0 for auto-detected. For floating point data types, this is the number of decimal places retained. If the scale/offset filter is not in use, this is None.

shape

Numpy-style shape tuple giving dataset dimensions

shuffle

Shuffle filter present (T/F)

size

Numpy-style attribute giving the total dataset size

slices()

Returns a generator yielding tuples of slice objects.

Order is not guaranteed.

symmetric_root(root=2)
units

Units.

value

Alias for dataset[()]

write_direct(source, source_sel=None, dest_sel=None)

Write data directly to HDF5 from a NumPy array.

The source array must be C-contiguous. Selections must be the output of numpy.s_[<args>].

Broadcasting is supported for simple indexing.

WrightTools.data.from_BrunoldrRaman(filepath, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from the Brunold rRaman instrument.

Expects one energy column (in wavenumbers) and one counts column.

Parameters:
  • filepath (string, list of strings, or array of strings) – Path to .txt file.
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

New data object(s).

Return type:

data

WrightTools.data.from_COLORS(filepaths, name=None, cols=None, invert_d1=True, ignore=['w3', 'wa', 'dref', 'm0', 'm1', 'm2', 'm3', 'm4', 'm5', 'm6'], parent=None, verbose=True)[source]

Create data object from COLORS file(s).

Parameters:
  • filepaths (string or list of strings) – Filepath(s).
  • name (string (optional)) – Unique dataset identifier. If None (default), autogenerated.
  • cols ({'v0', 'v1', 'v2'} (optional)) – Format of COLORS dat file. If None, autorecognized. Default is None.
  • invert_d1 (boolean (optional)) – Toggle inversion of D1 at import time. Default is True.
  • ignore (list of strings (optional)) – Columns to ignore.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

Data from COLORS.

Return type:

WrightTools.Data

WrightTools.data.from_JASCO(filepath, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from JASCO UV-Vis spectrometers.

Parameters:
  • filepath (string, list of strings, or array of strings) – Path to .txt file.
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

New data object(s).

Return type:

data

WrightTools.data.from_KENT(filepaths, name=None, ignore=['wm'], delay_tolerance=0.1, frequency_tolerance=0.5, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create data object from KENT file(s).

Parameters:
  • filepaths (string or list of strings) – Filepath(s).
  • name (string (optional)) – Unique dataset identifier. If None (default), autogenerated.
  • ignore (list of strings (optional)) – Columns to ignore. Default is [‘wm’].
  • delay_tolerance (float (optional)) – Tolerance below which to ignore delay changes (in picoseconds). Default is 0.1.
  • frequency_tolerance (float (optional)) – Tolerance below which to ignore frequency changes (in wavenumbers). Default is 0.5.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

Data from KENT.

Return type:

WrightTools.Data

WrightTools.data.from_PyCMDS(filepath, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from a single PyCMDS output file.

Parameters:
  • filepath (str) – The file to load. Can accept .data, .fit, or .shots files.
  • name (str or None (optional)) – The name to be applied to the new data object. If None, name is read from file.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (bool (optional)) – Toggle talkback. Default is True.
Returns:

A Data instance.

Return type:

data

WrightTools.data.from_ocean_optics(filepath, name=None, *, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from an Ocean Optics brand spectrometer.

Parameters:
  • filepath (string, list of strings, or array of strings) – Path to an ocean optics output file.
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

New data object.

Return type:

data

WrightTools.data.from_shimadzu(filepath, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from Shimadzu .txt file.

Parameters:
  • filepath (string) – Path to .txt file.
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

New data object.

Return type:

data

WrightTools.data.from_Solis(filepath, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from Andor Solis software (ascii exports).

Parameters:
  • filepath (string, list of strings, or array of strings) – Path to .txt file.
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

New data object.

Return type:

data

WrightTools.data.from_spcm(filepath, name=None, *, delimiter=', ', parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a Data object from a Becker & Hickl spcm file (ASCII-exported, .asc).

If provided, setup parameters are stored in the attrs dictionary of the Data object.

Parameters:
  • filepath (string) – Path to SPC-xxx .asc file.
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • delimiter (string (optional)) – The string used to separate values. Default is ‘, ’.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

Return type:

WrightTools.data.Data object

WrightTools.data.from_Tensor27(filepath, name=None, parent=None, verbose=True) → WrightTools.data._data.Data[source]

Create a data object from a Tensor27 FTIR file.

>>> import WrightTools as wt
>>> import matplotlib.pyplot as plt
>>> from WrightTools import datasets
>>> p = datasets.Tensor27.CuPCtS_powder_ATR
>>> data = wt.data.from_Tensor27(p)
>>> artist = wt.artists.quick1D(data)
>>> plt.xlim(1300,1700)
>>> plt.ylim(-0.005,.02)

(Figure: quick1D plot of the CuPCtS_powder_ATR example spectrum.)
Parameters:
  • filepath (string) – Path to Tensor27 output file (.dpt).
  • name (string (optional)) – Name to give to the created data object. If None, filename is used. Default is None.
  • parent (WrightTools.Collection (optional)) – Collection to place new data object within. Default is None.
  • verbose (boolean (optional)) – Toggle talkback. Default is True.
Returns:

New data object.

Return type:

data