The wt5 File Format
WrightTools stores data in binary wt5 files.
wt5 is a sub-format of HDF5.
wt5
wt5 files are HDF5 files with a particular structure and set of attributes defined.
wt5 objects may appear embedded within a larger HDF5 file or vice versa; however, this is untested.
At the root of a wt5 file, a Collection or Data object is found.
Collection and Data are HDF5 groups.
A Collection may have children consisting of Collection and/or Data.
A Data may have children consisting of Variable and/or Channel.
Variable and Channel are HDF5 datasets.
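The containment rules above can be summarized as a small lookup. The following is a minimal sketch in plain Python, not WrightTools code; the names ALLOWED_CHILDREN, ALLOWED_ROOTS, and is_valid_child are hypothetical and exist only for illustration.

```python
# Hypothetical illustration of the wt5 containment rules described above.
# Keys are parent kinds; values are the kinds each parent may contain.
ALLOWED_CHILDREN = {
    "Collection": {"Collection", "Data"},  # HDF5 group
    "Data": {"Variable", "Channel"},       # HDF5 group holding datasets
    "Variable": set(),                     # HDF5 dataset, no children
    "Channel": set(),                      # HDF5 dataset, no children
}

# Kinds that may sit at the root of a wt5 file.
ALLOWED_ROOTS = {"Collection", "Data"}


def is_valid_child(parent: str, child: str) -> bool:
    """Return True if `child` may appear directly under `parent`."""
    return child in ALLOWED_CHILDREN.get(parent, set())


print(is_valid_child("Collection", "Data"))  # True
print(is_valid_child("Data", "Collection"))  # False
```

A lookup like this keeps the group/dataset distinction explicit: only the two group kinds have a non-empty set of allowed children.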
Metadata
The following metadata is handled within WrightTools and defines the attributes a file needs in order to be a wt5 file.
We recommend not overwriting these attributes manually except at import time (e.g. in a from_<x> function).
| name | Collection | Data | Variable | Channel | description/notes |
|---|---|---|---|---|---|
| name | yes | yes | yes | yes | Usually matches the last component of the path, except for root, |
| class | yes | yes | yes | yes | Identifies which kind of WrightTools object it is. |
| created | yes | yes | | | Timestamp of when the object was made, can be overwritten with source file creation time by from_<x> functions |
| __version__ | yes | yes | | | |
| item_names | yes | yes | | | Ordered list of the children |
| variable_names | | yes | | | Ordered list of all Variables |
| channel_names | | yes | | | Ordered list of all Channels |
| axes | | yes | | | Ordered list of axes expressions which define how a Data object is represented |
| constants | | yes | | | Ordered list of expressions for values which are constant |
| kind | | yes | | | Short description of what type of file it originated from, usually the instrument |
| source | | yes | | | File path/url to the original file as read in |
| label | | | yes | yes | Identifier used to create more complex labels in Axes or Constants, which are used to plot |
| units | | | yes | yes | Units assigned to the dataset |
| min | | | yes | yes | Cached minimum value |
| max | | | yes | yes | Cached maximum value |
| argmin | | | yes | yes | Cached index of minimum value |
| argmax | | | yes | yes | Cached index of maximum value |
| signed | | | | yes | Boolean for treating channel as signed/unsigned |
HDF5
The HDF5 data model contains two primary objects: the group and the dataset.
Groups are used to hierarchically organize content within the file.
Each group is a container for datasets and other groups.
Think of groups like folders in your computer's file system.
Every HDF5 file contains a top-level root group, signified by /.
Datasets are specialty containers for raw data values. Think of datasets like multidimensional arrays, similar to the numpy ndarray. Each dataset has a specific data type, such as integer, float, or character.
Groups and datasets can contain additional metadata.
This metadata is stored in a key: value pair system called attrs, similar to a Python dictionary.
Much more information can be found in the HDF5 tutorial.
WrightTools relies upon the h5py package, a Pythonic interface to HDF5.
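To make the group, dataset, and attrs concepts concrete, here is a minimal sketch using h5py and numpy. It builds a small file in memory and reports what it contains. The file name, the group name "measurements", the dataset name "signal", and the attribute values are hypothetical and are not part of the wt5 format.

```python
import h5py
import numpy as np


def build_example():
    """Build a tiny in-memory HDF5 file and return a summary of its contents."""
    # driver="core" with backing_store=False keeps the file in memory only,
    # so nothing is written to disk.
    with h5py.File("example.h5", "w", driver="core", backing_store=False) as f:
        # Groups organize content hierarchically, like folders.
        group = f.create_group("measurements")
        # Datasets hold array data with a specific data type.
        dataset = group.create_dataset("signal", data=np.linspace(0.0, 1.0, 5))
        # Groups and datasets both carry metadata in a dict-like attrs.
        group.attrs["description"] = "hypothetical example group"
        dataset.attrs["units"] = "V"
        return {
            "root": f.name,  # the root group is always "/"
            "dataset_path": dataset.name,
            "shape": dataset.shape,
            "dtype": str(dataset.dtype),
            "units": dataset.attrs["units"],
        }


print(build_example())
```

The summary is collected before the file is closed, because h5py objects are no longer usable once their file has been closed.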
Access
wt5 is a binary format, so it cannot be interpreted with traditional text editors. Since wt5 is a sub-format of HDF5, WrightTools benefits from the ecosystem of HDF5 tools that already exists. This means that it is possible to import and interact with wt5 files without WrightTools, or even without Python.
ASCII
Export an HDF5 file to a human-readable ASCII file using h5dump.
See also HDF to Excel.
Fortran
Use the official HDF5 Fortran Library.
Graphical
HDF COMPASS, a simple tool for navigating and viewing data within HDF5 files (no editing functionality).
HDF VIEW, a visual tool for browsing and editing HDF5 files.
MATLAB
MATLAB offers built-in high-level HDF5 functions including h5disp, h5read, and h5readatt.
Python (without WrightTools)
We recommend the h5py package.
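As a rough sketch of what inspecting a file with h5py alone can look like, the example below walks every object under the root and lists its path, whether it is a group or a dataset, and the names of its attributes. The helper describe and the stand-in file contents are hypothetical; to inspect a real wt5 file, open it with h5py.File(path, "r") instead of building one in memory.

```python
import h5py
import numpy as np


def describe(root):
    """Return (path, kind, attribute names) for `root` and every object below it."""
    rows = [(root.name, "group", sorted(root.attrs.keys()))]

    def visit(name, obj):
        # visititems calls this once per object; returning None continues the walk.
        kind = "dataset" if isinstance(obj, h5py.Dataset) else "group"
        rows.append((obj.name, kind, sorted(obj.attrs.keys())))

    root.visititems(visit)
    return rows


# Stand-in file built in memory so the sketch runs on its own.
with h5py.File("stand_in.h5", "w", driver="core", backing_store=False) as f:
    f.attrs["example_attr"] = "hypothetical"
    child = f.create_group("child")
    child.create_dataset("values", data=np.arange(3))
    rows = describe(f)

for path, kind, attr_names in rows:
    print(path, kind, attr_names)
```

Because the walk relies only on generic HDF5 concepts, the same function works on any HDF5 file, not just wt5 files.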
Shell
h5cli: a bash-like interface for interacting with HDF5 files.
h5diff: compare two HDF5 files, reporting the differences.
h5ls: print information about one or more HDF5 files.
Changes
Version 1.0.0
Initial release of the format.
Version 1.0.1
Changes internal handling of strings. Bare strings no longer need to be passed through encode() before storing.
Version 1.0.2
Adds “constants” as a stored attribute in the attrs dictionary, a list of strings just like axes.
Version 1.0.3
Changed identity as stored in the attrs dictionary (axis and constant) to use the expression including operators.
Previous versions exhibited a bug where decimal points would be ignored when the expression was generated from the attrs (thus “2.0” would be stored as “2_0” and read in as “20”).
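The symptom described above can be reproduced with a toy round trip. This is a sketch of the failure mode only, not the WrightTools implementation; the functions lossy_store, lossy_read, and exact_store are hypothetical and simply mimic the behavior stated in the text.

```python
def lossy_store(expression: str) -> str:
    """Mimic the pre-1.0.3 symptom: the decimal point is replaced by '_'."""
    return expression.replace(".", "_")


def lossy_read(stored: str) -> str:
    """Mimic reading back: the placeholder is dropped, so the point is lost."""
    return stored.replace("_", "")


def exact_store(expression: str) -> str:
    """Mimic 1.0.3: the expression is kept verbatim, operators and points included."""
    return expression


original = "2.0"
print(lossy_store(original))              # 2_0
print(lossy_read(lossy_store(original)))  # 20
print(exact_store(original))              # 2.0
```

Storing the expression verbatim avoids the problem because no information is discarded before the value is written.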