Data Model

TOAST works with data organized into observations. Each observation is independent of any other observation. An observation consists of co-sampled detectors for some span of time. The intrinsic detector noise is assumed to be stationary within an observation. Typically there are other quantities which are constant for an observation (e.g. elevation, weather conditions, satellite precession axis, etc.).

An observation is just a dictionary with at least one member (“tod”) which is an instance of a class that derives from the toast.TOD base class. Every experiment will have its own TOD derived classes, but TOAST includes some built-in ones as well.

The inputs to a TOD class constructor are at least:

  1. The detector names for the observation.
  2. The number of samples in the observation.
  3. The geometric offset of the detectors from the boresight.
  4. Information about how detectors and samples are distributed among processes.
class toast.tod.TOD(mpicomm, detectors, samples, detindx=None, detranks=1, detbreaks=None, sampsizes=None, sampbreaks=None, meta=None)[source]

Base class for an object that provides detector pointing and timestreams for a single observation.

This class provides high-level functions that are common to all derived classes. It also defines the internal methods that should be overridden by all derived classes. These internal methods throw an exception if they are called. The TOD base class should never be instantiated directly.

Parameters:
  • mpicomm (mpi4py.MPI.Comm) – the MPI communicator over which the data is distributed, or None.
  • detectors (list) – The list of detector names.
  • samples (int) – The total number of samples.
  • detindx (dict) – the detector indices for use in simulations. Default is { x[0] : x[1] for x in zip(detectors, range(len(detectors))) }.
  • detranks (int) – The dimension of the process grid in the detector direction. If not None, the MPI communicator size must be evenly divisible by this number.
  • detbreaks (list) – Optional list of hard breaks in the detector distribution.
  • sampsizes (list) – Optional list of sample chunk sizes which cannot be split.
  • sampbreaks (list) – Optional list of hard breaks in the sample distribution.
  • meta (dict) – Optional dictionary of metadata properties.
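The default quoted for detindx is an ordinary dictionary comprehension. The short sketch below evaluates that expression for a few made-up detector names and compares it with an equivalent spelling based on enumerate. The detector names are assumptions chosen only for illustration.

```python
# Hypothetical detector names, used only for illustration.
detectors = ["d00", "d01", "d02"]

# The default expression quoted in the detindx parameter description.
detindx = {x[0]: x[1] for x in zip(detectors, range(len(detectors)))}

# An equivalent, more idiomatic spelling of the same mapping.
detindx_alt = {name: i for i, name in enumerate(detectors)}

print(detindx)
```

Each detector is simply mapped to its position in the detectors list.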
COMMON_FLAG_NAME = 'common_flags'

Default cache name for common flags.

FLAG_NAME = 'flags'

Default cache name for flags.

HWP_ANGLE_NAME = 'hwp_angle'

Default cache name for HWP angle.

POINTING_NAME = 'quat'

Default cache name for pointing quaternions.

POSITION_NAME = 'position'

Default cache name for position.

SIGNAL_NAME = 'signal'

Default cache name for signal.

TIMESTAMP_NAME = 'timestamps'

Default cache name for timestamps.

VELOCITY_NAME = 'velocity'

Default cache name for velocity.

detectors

The total list of detectors.

Type:(list)
detindx

The detector indices.

Type:(dict)
detoffset()[source]

Return dictionary of detector quaternions.

This returns a dictionary with the detector names as the keys and the values are 4-element numpy arrays containing the quaternion offset from the boresight.

Parameters:None
Returns (dict):
the dictionary of quaternions.
dist_chunks

this is a list of 2-tuples, one for each column of the process grid. Each element of the list is the same as the information returned by the “local_chunks” member for a given process column.

Type:(list)
dist_samples

This is a list of 2-tuples, with one element per column of the process grid. Each tuple is the same information returned by the “local_samples” member for the corresponding process grid column rank.

Type:(list)
grid_comm_col

a communicator across all detectors in the same column of the process grid (or None).

Type:(mpi4py.MPI.Comm)
grid_comm_row

a communicator across all detectors in the same row of the process grid (or None).

Type:(mpi4py.MPI.Comm)
grid_ranks

the ranks of this process in the (detector, sample) directions.

Type:(tuple)
grid_size

the dimensions of the process grid in (detector, sample) directions.

Type:(tuple)
local_chunks

the first element of the tuple is the index of the first chunk assigned to this process (i.e. the index in the list given by the “total_chunks” member). The second element of the tuple is the number of chunks assigned to this process.

Type:(2-tuple)
local_common_flags(name=None, **kwargs)[source]

Locally stored common flags.

Parameters:name (str) – Optional cache key to use.
Returns:A cache reference to a common flag vector. If ‘name’ is None a default name ‘common_flags’ is used and the vector may be constructed and cached using the ‘read_common_flags’ method. If ‘name’ is given, then the flags must already be cached.
local_dets

The detectors assigned to this process.

Type:(list)
local_flags(det, name=None, **kwargs)[source]

Locally stored flags.

Parameters:
  • det (str) – Name of the detector.
  • name (str) – Optional cache key to use.
Returns:

A cache reference to a flag vector. If ‘name’ is None a default name ‘flags’ is used and the vector may be constructed and cached using the ‘read_flags’ method. If ‘name’ is given, then the flags must already be cached.

local_hwp_angle(name=None, **kwargs)[source]

Locally stored half-wave plate angle.

Parameters:name (str) – Optional cache key to use.
Returns:A cache reference to a hwp angle vector. If ‘name’ is None a default name ‘hwp_angle’ is used and the vector may be constructed and cached using the ‘read_hwp_angle’ method. If ‘name’ is given, then the angles must already be cached.
local_intervals(intervals)[source]

Translate observation-wide intervals into local sample indices.
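As a rough illustration of what such a translation involves, the sketch below clips observation-wide sample ranges to the samples owned by one process and shifts them to local indices. It is not the TOAST implementation: the function name to_local, the use of plain (first, last) tuples, and the inclusive last index are all assumptions made for this example.

```python
def to_local(intervals, local_first, n_local):
    """Clip inclusive (first, last) global sample ranges to local indices.

    local_first and n_local play the role of the two elements of the
    local_samples tuple.
    """
    local_last = local_first + n_local - 1
    result = []
    for first, last in intervals:
        if last < local_first or first > local_last:
            # This interval does not overlap the local samples at all.
            continue
        lo = max(first, local_first) - local_first
        hi = min(last, local_last) - local_first
        result.append((lo, hi))
    return result


# A process that owns global samples 100..199.
local = to_local([(0, 49), (90, 120), (300, 400)], local_first=100, n_local=100)
print(local)
```

Only the second interval overlaps the local range, and it is truncated at the first local sample.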

local_pointing(det, name=None, **kwargs)[source]

Locally stored pointing.

Parameters:
  • det (str) – Name of the detector.
  • name (str) – Optional cache key to use.
Returns:

A cache reference to a pointing array. If ‘name’ is None a default name ‘quat’ is used and the array may be constructed and cached using the ‘read_pntg’ method. If ‘name’ is given, then the pointing must already be cached.

local_position(name=None, **kwargs)[source]

Locally stored position.

Parameters:name (str) – Optional cache key to use.
Returns:A cache reference to a position array. If ‘name’ is None a default name ‘position’ is used and the array may be constructed and cached using the ‘read_position’ method. If ‘name’ is given, then the position must already be cached.
local_samples

The first element of the tuple is the first global sample assigned to this process. The second element of the tuple is the number of samples assigned to this process.

Type:(2-tuple)
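The (first, count) tuples used by local_chunks, local_samples, dist_chunks and dist_samples can be illustrated with a small standalone sketch. The function below spreads unbreakable chunks over the columns of a process grid as evenly as possible by chunk count. This even split is an assumption made for illustration, and TOAST's actual load balancing may differ. The name distribute_chunks and the chunk sizes are hypothetical.

```python
def distribute_chunks(chunk_sizes, n_columns):
    """Split a list of unbreakable chunks into contiguous groups.

    Returns (dist_chunks, dist_samples). Each is a list of (first, count)
    tuples with one entry per process-grid column.
    """
    base, extra = divmod(len(chunk_sizes), n_columns)
    dist_chunks = []
    dist_samples = []
    chunk_off = 0
    samp_off = 0
    for col in range(n_columns):
        # The first `extra` columns receive one additional chunk.
        count = base + (1 if col < extra else 0)
        n_samp = sum(chunk_sizes[chunk_off:chunk_off + count])
        dist_chunks.append((chunk_off, count))
        dist_samples.append((samp_off, n_samp))
        chunk_off += count
        samp_off += n_samp
    return dist_chunks, dist_samples


total_chunks = [100, 50, 75, 25, 50]
dist_chunks, dist_samples = distribute_chunks(total_chunks, 2)
print(dist_chunks, dist_samples)
```

In this picture, a process in column 1 would see the second entry of each list as its local_chunks and local_samples values.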
local_signal(det, name=None, **kwargs)[source]

Locally stored signal.

Parameters:
  • det (str) – Name of the detector.
  • name (str) – Optional cache key to use.
Returns:

A cache reference to a signal vector. If ‘name’ is None a default name ‘signal’ is used and the vector may be constructed and cached using the ‘read’ method. If ‘name’ is given, then the signal must already be cached.
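The rule shared by the local_*() accessors (read and cache under the default name, but require an explicit name to be cached already) can be sketched with a toy class. ToyTOD below is a hypothetical stand-in, not a TOAST class: it uses a plain dictionary instead of the Cache, and the "name_detector" key format is an assumption.

```python
class ToyTOD:
    """Hypothetical stand-in for a TOD-derived class (not the TOAST API)."""

    SIGNAL_NAME = "signal"

    def __init__(self, data):
        self._data = data   # pretend on-disk data, keyed by detector name
        self.cache = {}     # plain dict standing in for the Cache member
        self.n_reads = 0    # counts how often read() is really called

    def read(self, detector):
        self.n_reads += 1
        return list(self._data[detector])

    def local_signal(self, det, name=None):
        if name is None:
            # Default name: construct and cache the vector on first use.
            key = "{}_{}".format(self.SIGNAL_NAME, det)
            if key not in self.cache:
                self.cache[key] = self.read(det)
        else:
            # Explicit name: the vector must already be cached.
            key = "{}_{}".format(name, det)
            if key not in self.cache:
                raise KeyError(key)
        return self.cache[key]


tod = ToyTOD({"d0": [1.0, 2.0, 3.0]})
first = tod.local_signal("d0")
second = tod.local_signal("d0")
print(tod.n_reads, first is second)
```

The second call returns the cached object without reading again, while asking for an uncached flavor such as "filtered" raises an error in this toy model.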

local_times(name=None, **kwargs)[source]

Timestamps covering locally stored data.

Parameters:name (str) – Optional cache key to use.
Returns:A cache reference to a timestamp vector. If ‘name’ is None a default name ‘timestamps’ is used and the vector may be constructed and cached using the ‘read_times’ method. If ‘name’ is given, then the times must already be cached.
local_velocity(name=None, **kwargs)[source]

Locally stored velocity.

Parameters:name (str) – Optional cache key to use.
Returns:A cache reference to a velocity array. If ‘name’ is None a default name ‘velocity’ is used and the array may be constructed and cached using the ‘read_velocity’ method. If ‘name’ is given, then the velocity must already be cached.
mpicomm

the communicator assigned to this TOD.

Type:(mpi4py.MPI.Comm)
read(detector=None, local_start=0, n=0, **kwargs)[source]

Read detector data.

This returns the timestream data for a single detector.

Parameters:
  • detector (str) – the name of the detector.
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

An array containing the data.
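The local_start and n arguments used by all of the read_*() methods can be thought of as slice bounds relative to the locally assigned samples. The helper below is a minimal sketch of that convention. The name resolve_range and the bounds check are assumptions, since the documentation does not say how out-of-range requests are handled.

```python
def resolve_range(n_local, local_start=0, n=0):
    """Return the (start, stop) slice bounds for a local read.

    n_local is the number of samples assigned to this process. A value of
    n == 0 means "read to the end of the local samples".
    """
    if n == 0:
        n = n_local - local_start
    if local_start < 0 or local_start + n > n_local:
        raise ValueError("requested range is outside the local samples")
    return local_start, local_start + n


print(resolve_range(100))
print(resolve_range(100, local_start=10))
print(resolve_range(100, local_start=10, n=20))
```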

read_boresight(local_start=0, n=0, **kwargs)[source]

Read boresight quaternion pointing.

This returns the pointing of the boresight in quaternions.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

A 2D array of shape (n, 4)

read_boresight_azel(local_start=0, n=0, **kwargs)[source]

Read boresight Azimuth / Elevation quaternion pointing.

This returns the pointing of the boresight in the horizontal coordinate system, if it exists.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

A 2D array of shape (n, 4)

Raises:

NotImplementedError – if the telescope is not on the Earth.

read_common_flags(local_start=0, n=0, **kwargs)[source]

Read common flags.

This reads the common set of flags that should be applied to all detectors.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

a numpy array containing the flags.

Return type:

(array)

read_flags(detector=None, local_start=0, n=0, **kwargs)[source]

Read detector flags.

This returns the detector-specific flags.

Parameters:
  • detector (str) – the name of the detector.
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

An array containing the detector flags.

read_hwp_angle(local_start=0, n=0, **kwargs)[source]

Read half-wave plate angle

This reads the common HWP angle that should be applied to all detectors.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

a numpy array containing the angles, or None if the angle is not defined.

Return type:

(array)

read_pntg(detector=None, local_start=0, n=0, **kwargs)[source]

Read detector quaternion pointing.

This returns the pointing for a single detector in quaternions.

Parameters:
  • detector (str) – the name of the detector.
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

A 2D array of shape (n, 4)

read_position(local_start=0, n=0, **kwargs)[source]

Read telescope position.

This reads the telescope position in solar system barycenter coordinates (in Kilometers).

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

a 2D numpy array containing the x,y,z coordinates at each sample.

Return type:

(array)

read_times(local_start=0, n=0, **kwargs)[source]

Read timestamps.

This reads the common set of timestamps that apply to all detectors in the TOD.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

a numpy array containing the timestamps.

Return type:

(array)

read_velocity(local_start=0, n=0, **kwargs)[source]

Read telescope velocity.

This reads the telescope velocity in solar system barycenter coordinates (in Kilometers/s).

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • n (int) – the number of samples to read. If zero, read to end.
Returns:

a 2D numpy array containing the x,y,z velocity components at each sample.

Return type:

(array)

total_chunks

the full list of sample chunk sizes that were used in the data distribution.

Type:(list)
total_samples

the total number of samples in this TOD.

Type:(int)
write(detector=None, local_start=0, data=None, **kwargs)[source]

Write detector data.

This writes the detector data.

Parameters:
  • detector (str) – the name of the detector.
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • data (array) – the data array.
write_boresight(local_start=0, data=None, **kwargs)[source]

Write boresight quaternion pointing.

This writes the quaternion pointing for the boresight.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • data (array) – 2D array of quaternions with shape[1] == 4.
write_boresight_azel(local_start=0, data=None, **kwargs)[source]

Write boresight Azimuth / Elevation quaternion pointing.

This writes the quaternion pointing for the boresight in the horizontal coordinate system, if it exists.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • data (array) – 2D array of quaternions with shape[1] == 4.
Raises:

RuntimeError or AttributeError – if the telescope is not on the Earth.

write_common_flags(local_start=0, flags=None, **kwargs)[source]

Write common flags.

This writes the common set of flags that should be applied to all detectors.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • flags (array) – array containing the flags to write.
write_flags(detector=None, local_start=0, flags=None, **kwargs)[source]

Write detector flags.

This writes the detector-specific flags.

Parameters:
  • detector (str) – the name of the detector.
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • flags (array) – the detector flags.
write_hwp_angle(local_start=0, hwpangle=None, **kwargs)[source]

Write half-wave plate angle

This writes the common HWP angle that should be applied to all detectors.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • hwpangle (array) – array containing the HWP angles to write.
write_pntg(detector=None, local_start=0, data=None, **kwargs)[source]

Write detector quaternion pointing.

This writes the quaternion pointing for a single detector.

Parameters:
  • detector (str) – the name of the detector.
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • data (array) – 2D array of quaternions with shape[1] == 4.
write_position(local_start=0, pos=None, **kwargs)[source]

Write telescope position.

This writes the telescope position in solar system barycenter coordinates (in Kilometers).

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • pos (array) – the 2D array of x,y,z coordinates at each sample.
write_times(local_start=0, stamps=None, **kwargs)[source]

Write timestamps.

This writes the common set of timestamps that apply to all detectors in the TOD.

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • stamps (array) – the array of timestamps to write.
write_velocity(local_start=0, vel=None, **kwargs)[source]

Write telescope velocity.

This writes the telescope velocity in solar system barycenter coordinates (in Kilometers/s).

Parameters:
  • local_start (int) – the sample offset relative to the first locally assigned sample.
  • vel (array) – the 2D array of x,y,z velocity components at each sample.

The TOD class can act as a storage container for different “flavors” of timestreams as well as a source and sink for the observation data (with the read_*() and write_*() methods). The TOD base class has one member which is a Cache class.

class toast.cache.Cache(pymem=False)[source]

Data cache with explicit memory management.

This class acts as a dictionary of named arrays. Each array may be multi-dimensional.

Parameters:pymem (bool) – if True, use python memory rather than external allocations in C. Only used for testing.
add_alias(alias, name)[source]

Add an alias to a name that already exists in the cache.

Parameters:
  • alias (str) – alias to create
  • name (str) – an existing key in the cache
Returns:

None

aliases()[source]

Return a dictionary of all the aliases to keys in the cache.

Returns:Dictionary of aliases.
Return type:(dict)
clear(pattern=None)[source]

Clear one or more buffers.

Parameters:pattern (str) – a regular expression to match against the buffer names when determining what should be cleared. If None, then all buffers are cleared.
Returns:None
create(name, type, shape)[source]

Create a named data buffer of the given type and shape.

Parameters:
  • name (str) – the name to assign to the buffer.
  • type (numpy.dtype) – one of the supported numpy types.
  • shape (tuple) – a tuple containing the shape of the buffer.
Returns:

a reference to the allocated array.

Return type:

(array)

destroy(name)[source]

Deallocate the specified buffer.

Only call this if all numpy arrays that reference the memory are out of use. If the specified name is an alias, then the alias is simply deleted. If the specified name is an actual buffer, then all aliases pointing to that buffer are also deleted.

Parameters:name (str) – the name of the buffer or alias to destroy.
Returns:None
exists(name)[source]

Check whether a buffer exists.

Parameters:name (str) – the name of the buffer to search for.
Returns:True if a buffer or alias exists with the given name.
Return type:(bool)
keys()[source]

Return a list of all the keys in the cache.

Returns:List of key strings.
Return type:(list)
put(name, data, replace=False)[source]

Create a named data buffer to hold the provided data.

If replace is True, existing buffer of the same name is first destroyed. If replace is True and the name is an alias, it is promoted to a new data buffer.

Parameters:
  • name (str) – the name to assign to the buffer.
  • data (numpy.ndarray) – Numpy array
  • replace (bool) – Overwrite any existing keys
Returns:

a numpy array wrapping the raw data buffer.

Return type:

(array)

reference(name)[source]

Return a numpy array pointing to the buffer.

The returned array will wrap a pointer to the raw buffer, but will not claim ownership. When the numpy array is garbage collected, it will NOT attempt to free the memory (you must manually use the destroy method).

Parameters:name (str) – the name of the buffer to return.
Returns:a numpy array wrapping the raw data buffer.
Return type:(array)
report(silent=False)[source]

Report memory usage.

Parameters:silent (bool) – Count and return the memory without printing.
Returns:Amount of allocated memory in bytes
Return type:(int)

This class looks like a dictionary of numpy arrays, but the memory is allocated outside of Python, which means it can be explicitly managed / freed. This cache member is where alternate flavors of the timestream data are stored.
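The alias rules described for add_alias() and destroy() can be imitated with two plain dictionaries. ToyCache below is a hypothetical model written only to show those rules. It does not allocate memory outside of Python and is not a substitute for toast.cache.Cache.

```python
class ToyCache:
    """Dictionary-based imitation of the alias rules (not toast.cache.Cache)."""

    def __init__(self):
        self._buffers = {}   # buffer name -> data
        self._aliases = {}   # alias -> buffer name

    def put(self, name, data):
        self._buffers[name] = data
        return data

    def add_alias(self, alias, name):
        if name not in self._buffers:
            raise KeyError(name)
        self._aliases[alias] = name

    def exists(self, name):
        return name in self._buffers or name in self._aliases

    def reference(self, name):
        return self._buffers[self._aliases.get(name, name)]

    def destroy(self, name):
        if name in self._aliases:
            # Destroying an alias removes only the alias.
            del self._aliases[name]
            return
        # Destroying a buffer also removes every alias pointing at it.
        del self._buffers[name]
        for alias in [a for a, n in self._aliases.items() if n == name]:
            del self._aliases[alias]


cache = ToyCache()
cache.put("signal_d0", [1.0, 2.0, 3.0])
cache.add_alias("raw_d0", "signal_d0")
cache.add_alias("best_d0", "signal_d0")
same = cache.reference("raw_d0") is cache.reference("signal_d0")
cache.destroy("raw_d0")
buffer_survives = cache.exists("signal_d0")
cache.destroy("signal_d0")
alias_survives = cache.exists("best_d0")
print(same, buffer_survives, alias_survives)
```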

Each observation can also have a noise model associated with it. An instance of a Noise class (or derived class) describes the noise properties for all detectors in the observation.

class toast.tod.Noise(*, detectors, freqs, psds, mixmatrix=None, indices=None)[source]

Noise objects act as containers for noise PSDs.

Noise is a base class for an object that describes the noise properties of all detectors for a single observation.

Parameters:
  • detectors (list) – Names of detectors.
  • freqs (dict) – Dictionary of arrays of frequencies for psds.
  • psds (dict) – Dictionary of arrays which contain the PSD values for each detector or mixmatrix key.
  • mixmatrix (dict) – Mixing matrix describing how the PSDs should be combined for detector noise. If provided, must contain entries for every detector, and every key specified for a detector must be defined in freqs and psds.
  • indices (dict) – Integer index for every PSD, useful for generating independent and repeatable noise realizations. If absent, running indices will be assigned and provided.
detectors

List of detector names

Type:list
keys

List of PSD names

Type:list
Raises:
  • KeyError – If freqs, psds, mixmatrix or indices do not include all relevant entries.
  • ValueError – If vector lengths in freqs and psds do not match.
detectors

list of strings containing the detector names.

Type:(list)
freq(key)[source]

Get the frequencies corresponding to key.

Parameters:key (str) – Detector name or mixing matrix key.
Returns:Frequency bins that are used for the PSD.
Return type:(array)
index(key)[source]

Return the PSD index for key

Parameters:key (str) – Detector name or mixing matrix key.
Returns:PSD index.
Return type:index (int)
keys

list of strings containing the PSD names.

Type:(list)
multiply_invntt(key, data)[source]

Filter the data with inverse noise covariance.

multiply_ntt(key, data)[source]

Filter the data with noise covariance.

psd(key)[source]

Get the PSD corresponding to key.

Parameters:key (str) – Detector name or mixing matrix key.
Returns:PSD matching the key.
Return type:(array)
rate(key)[source]

Get the sample rate for key.

Parameters:key (str) – the detector name or mixing matrix key.
Returns:the sample rate in Hz.
Return type:(float)
weight(det, key)[source]

Return the mixing weight for noise key in det.

Parameters:
  • det (str) – Detector name
  • key (str) – Mixing matrix key.
Returns:

Mixing matrix weight

Return type:

weight (float)
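The consistency requirements listed for the Noise constructor (every detector present in mixmatrix, every mixing key present in freqs and psds, matching vector lengths) can be expressed as a short check. The function below is a sketch of those rules under stated assumptions, not the TOAST source: the name check_noise_inputs is hypothetical, and treating a missing mixmatrix as "one PSD per detector" is an assumption.

```python
def check_noise_inputs(detectors, freqs, psds, mixmatrix=None):
    """Return the list of PSD keys after checking the inputs for consistency."""
    if mixmatrix is None:
        # Assumption: without a mixing matrix there is one PSD per detector.
        keys = list(detectors)
    else:
        keys = []
        for det in detectors:
            if det not in mixmatrix:
                raise KeyError("no mixing matrix entry for " + det)
            for key in mixmatrix[det]:
                if key not in keys:
                    keys.append(key)
    for key in keys:
        if key not in freqs or key not in psds:
            raise KeyError("no freqs/psds entry for " + key)
        if len(freqs[key]) != len(psds[key]):
            raise ValueError("freqs and psds differ in length for " + key)
    return keys


# Two hypothetical detectors that share a common noise mode.
detectors = ["d0", "d1"]
mixmatrix = {
    "d0": {"d0": 1.0, "common": 0.5},
    "d1": {"d1": 1.0, "common": 0.5},
}
freqs = {k: [0.0, 1.0, 2.0] for k in ("d0", "d1", "common")}
psds = {k: [1.0, 0.5, 0.1] for k in ("d0", "d1", "common")}
keys = check_noise_inputs(detectors, freqs, psds, mixmatrix)
print(keys)
```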

The data used by a TOAST workflow consists of a list of observations, and is encapsulated by the toast.Data class.

class toast.dist.Data(comm=<toast.Comm World MPI communicator = None World MPI size = 1 World MPI rank = 0 Group MPI communicator = None Group MPI size = 1 Group MPI rank = 0 Rank MPI communicator = None >)[source]

Class which represents distributed data

A Data object contains a list of observations assigned to each process group in the Comm.

Parameters:comm (toast.Comm) – the toast Comm class for distributing the data.
clear()[source]

Clear the list of observations.

comm

The toast.Comm over which the data is distributed.

info(handle=None, flag_mask=255, common_flag_mask=255, intervals=None)[source]

Print information about the distributed data.

Information is written to the specified file handle. Only the rank 0 process writes. Optional flag masks are used when computing the number of good samples.

Parameters:
  • handle (descriptor) – file descriptor supporting the write() method. If None, use print().
  • flag_mask (int) – bit mask to use when computing the number of good detector samples.
  • common_flag_mask (int) – bit mask to use when computing the number of good telescope pointings.
  • intervals (str) – optional name of an intervals object to print from each observation.
Returns:

None
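The flag masks select which flag bits count against a sample. A minimal sketch of that idea follows, assuming that a sample is good when none of the masked bits are set. The function name count_good and the flag values are hypothetical.

```python
def count_good(flags, mask=255):
    """Count samples whose flag byte has none of the masked bits set."""
    return sum(1 for f in flags if (f & mask) == 0)


# Hypothetical per-sample flag bytes.
flags = [0, 1, 2, 0, 3]

n_all_bits = count_good(flags)           # any set bit marks a bad sample
n_bit0_only = count_good(flags, mask=1)  # only bit 0 marks a bad sample
print(n_all_bits, n_bit0_only)
```

With the default mask of 255 every bit of an 8-bit flag is considered, so a narrower mask can only increase the number of good samples.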

obs = None

The list of observations.

split(key)[source]

Split the Data object.

Split the Data object based on the value of key in the observation dictionary.

Parameters:key (str) – Observation key to use.
Returns:List of 2-tuples of the form (value, data)
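The shape of the return value can be illustrated with plain dictionaries. In the sketch below each group is a simple list of observations, whereas the real method returns Data objects. The function name split_obs, the "season" key and the observation contents are assumptions made for this example.

```python
def split_obs(observations, key):
    """Group observation dictionaries by the value stored under key.

    Returns a list of (value, observations) tuples in order of first
    appearance. An observation without the key raises KeyError here.
    """
    groups = {}
    for ob in observations:
        groups.setdefault(ob[key], []).append(ob)
    return list(groups.items())


observations = [
    {"name": "obs_a", "season": "2019"},
    {"name": "obs_b", "season": "2020"},
    {"name": "obs_c", "season": "2019"},
]
parts = split_obs(observations, "season")
for value, obs in parts:
    print(value, [ob["name"] for ob in obs])
```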

If you are running with a single process, that process has all observations and all data within each observation locally available. If you are running with more than one process, the data will be distributed across processes.

Data Distribution

Although you can use TOAST without MPI, the package is designed for data that is distributed across many processes. When passing the data through a toast workflow, the data is divided up among processes based on the details of the toast.Comm class that is used and also the shape of the process grid in each observation.

A toast.Comm instance takes the global number of processes available (MPI.COMM_WORLD) and divides them into groups. Each process group is assigned one or more observations. Since observations are independent, this means that different groups can be independently working on separate observations in parallel. It also means that inter-process communication needed when working on a single observation can occur with a smaller set of processes.

class toast.mpi.Comm(world=None, groupsize=0)[source]

Class which represents a two-level hierarchy of MPI communicators.

A Comm object splits the full set of processes into groups of size groupsize. If groupsize does not divide evenly into the size of the given communicator, then the leftover processes remain idle.

A Comm object stores three MPI communicators: The “world” communicator given here, which contains all processes to consider, a “group” communicator (one per group), and a “rank” communicator which contains the processes with the same group-rank across all groups.

If MPI is not enabled, then all communicators are set to None.

Parameters:
  • world (mpi4py.MPI.Comm) – the MPI communicator containing all processes.
  • groupsize (int) – the size of each process group.
comm_group

The communicator shared by processes within this group.

comm_rank

The communicator shared by processes with the same group_rank.

comm_world

The world communicator.

group

The group containing this process.

group_rank

The rank of this process in the group communicator.

group_size

The size of the group containing this process.

ngroups

The number of process groups.

world_rank

The rank of this process in the world communicator.

world_size

The size of the world communicator.
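The relationship between the attributes above can be sketched without MPI. The function below assumes that groups are contiguous blocks of world ranks and that a groupsize of zero means one group containing every process. The block layout is an assumption about the implementation, and the name comm_layout is hypothetical.

```python
def comm_layout(world_size, groupsize=0):
    """Return (ngroups, layout) for a two-level communicator hierarchy.

    layout has one dictionary per active world rank. Ranks left over when
    groupsize does not divide world_size are omitted (they would be idle).
    """
    if groupsize <= 0:
        groupsize = world_size
    ngroups = world_size // groupsize
    layout = []
    for world_rank in range(ngroups * groupsize):
        layout.append({
            "world_rank": world_rank,
            "group": world_rank // groupsize,      # which group this rank is in
            "group_rank": world_rank % groupsize,  # rank within that group
        })
    return ngroups, layout


ngroups, layout = comm_layout(24, groupsize=6)
print(ngroups, layout[13])

# Processes sharing a "rank" communicator have the same group_rank.
same_rank = [p["world_rank"] for p in layout if p["group_rank"] == 1]
print(same_rank)
```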

Just to reiterate, if your toast.Comm has multiple process groups, then each group will have an independent list of observations in toast.Data.obs.

What about the data within an observation? A single observation is owned by exactly one of the process groups. The MPI communicator passed to the TOD constructor is the group communicator. Every process in the group will store some piece of the observation data. The division of data within an observation is controlled by the detranks option to the TOD constructor. This option defines the dimension of the rectangular “process grid” along the detector (as opposed to time) direction. Common values of detranks are:

  • “1” (processes in the group have all detectors for some slice of time)
  • Size of the group communicator (processes in the group have some of the detectors for the whole time range of the observation)

The detranks parameter must divide evenly into the number of processes in the group communicator.
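The effect of detranks on the process grid can be worked out with integer arithmetic. The sketch below assumes that group ranks are laid out row by row, with the detector direction as the slow index. That ordering is an assumption, and the name process_grid is hypothetical.

```python
def process_grid(group_size, detranks=1):
    """Return (grid_size, grid_ranks) for one observation.

    grid_size is (detector ranks, sample ranks). grid_ranks lists the
    (detector, sample) grid position of every rank in the group.
    """
    if group_size % detranks != 0:
        raise ValueError("detranks must divide the group size evenly")
    sampranks = group_size // detranks
    grid_size = (detranks, sampranks)
    grid_ranks = [(rank // sampranks, rank % sampranks)
                  for rank in range(group_size)]
    return grid_size, grid_ranks


# All detectors for a slice of time on each of 6 processes.
by_time = process_grid(6, detranks=1)

# Some detectors for the whole time range on each of 6 processes.
by_detector = process_grid(6, detranks=6)

# An intermediate 2 x 3 grid.
mixed = process_grid(6, detranks=2)
print(by_time[0], by_detector[0], mixed[0])
```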

As a concrete example, imagine that MPI.COMM_WORLD has 24 processes. We split this into 4 groups of 6 processes. There are 6 observations of varying lengths and every group has one or two observations. Here is a picture of what data each process would have. The global process number is shown as well as the rank within the group:

_images/toast_data_dist.png

In either case the full dataset is divided into one or more observations, and each observation has one TOD object (and optionally other objects that describe the noise, valid data intervals, etc). The toast “Comm” class has two levels of MPI communicators that can be used to divide many observations between whole groups of processes. In practice this is not always needed, and the default construction of the Comm object just results in one group with all processes.