Funsor is a tensor-like library for functions and distributions¶
Domains¶
-
class
Domain
[source]¶ Bases:
funsor.domains.Domain
An object representing the type and shape of a
Funsor
input or output.-
size
¶
-
Operations¶
Built-in operations¶
-
add
= ops.add¶
-
and_
= ops.and_¶
-
eq
= ops.eq¶
-
ge
= ops.ge¶
-
getitem
= ops.GetitemOp(0)¶ Op encoding an index into one dimension, e.g.
x[:,:,y]
for offset of 2.
-
gt
= ops.gt¶
-
invert
= ops.invert¶
-
le
= ops.le¶
-
lt
= ops.lt¶
-
matmul
= ops.matmul¶
-
mul
= ops.mul¶
-
ne
= ops.ne¶
-
neg
= ops.neg¶
-
or_
= ops.or_¶
-
sub
= ops.sub¶
-
truediv
= ops.truediv¶
-
xor
= ops.xor¶
Operation classes¶
-
class
AssociativeOp
(fn)[source]¶ Bases:
funsor.ops.Op
-
class
AddOp
(fn)[source]¶ Bases:
funsor.ops.AssociativeOp
-
class
LogAddExpOp
(fn)[source]¶ Bases:
funsor.ops.AssociativeOp
-
class
SubOp
(fn)[source]¶ Bases:
funsor.ops.Op
-
class
NegOp
(fn)[source]¶ Bases:
funsor.ops.Op
-
class
ReshapeOp
(shape)[source]¶ Bases:
funsor.ops.Op
-
class
GetitemOp
(offset)[source]¶ Bases:
funsor.ops.Op
Op encoding an index into one dimension, e.g.
x[:,:,y]
for offset of 2.
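The offset convention can be illustrated without any tensor library. The following is a minimal sketch, assuming nested Python lists stand in for a tensor; the helper name `getitem_with_offset` is hypothetical and is not part of `funsor.ops`.

```python
def getitem_with_offset(x, y, offset):
    """Index nested lists ``x`` with ``y`` along dimension ``offset``.

    With offset == 2 this mimics ``x[:, :, y]`` on a 3-dimensional array.
    """
    if offset == 0:
        return x[y]
    # Skip one leading dimension and recurse into each row.
    return [getitem_with_offset(row, y, offset - 1) for row in x]


# A nested list of shape (2, 2, 3).
x = [[[0, 1, 2], [3, 4, 5]], [[6, 7, 8], [9, 10, 11]]]

picked = getitem_with_offset(x, 1, 2)  # analogous to x[:, :, 1]
first = getitem_with_offset(x, 0, 0)   # analogous to x[0]
```

Here `picked` keeps the two leading dimensions and selects position 1 of the last one, which is the behavior the `x[:,:,y]` example describes for an offset of 2.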
-
class
ReciprocalOp
(fn)[source]¶ Bases:
funsor.ops.Op
Interpretations¶
Interpreter¶
-
reinterpret
(x)[source]¶ Overloaded reinterpretation of a deferred expression.
This handles a limited class of expressions, raising ValueError in unhandled cases.
Parameters: x (A funsor or data structure holding funsors.) – An input, typically involving deferred Funsors.
Returns: A reinterpreted version of the input.
Raises: ValueError
-
exception
PatternMissingError
[source]¶ Bases:
NotImplementedError
Funsors¶
Basic Funsors¶
-
reflect
(cls, *args, **kwargs)[source]¶ Construct a funsor, populate ._ast_values, and cons hash. This is the only interpretation allowed to construct funsors.
-
eager_or_die
(cls, *args)[source]¶ Eagerly execute ops with known implementations. Disallows lazy Subs, Unary, Binary, and Reduce.
Raises: NotImplementedError if no pattern is found.
-
sequential
(cls, *args)[source]¶ Eagerly execute ops with known implementations; additionally execute vectorized ops sequentially if no known vectorized implementation exists.
-
moment_matching
(cls, *args)[source]¶ A moment matching interpretation of Reduce expressions. This falls back to eager in other cases.
-
class
Funsor
(inputs, output, fresh=None, bound=None)[source]¶ Bases:
object
Abstract base class for immutable functional tensors.
Concrete derived classes must implement __init__() methods taking hashable *args and no optional **kwargs so as to support cons hashing.
Derived classes with .fresh variables must implement an eager_subs() method. Derived classes with .bound variables must implement an _alpha_convert() method.
Parameters:
- inputs (OrderedDict) – A mapping from input name to domain. This can be viewed as a typed context or a mapping from free variables to domains.
- output (Domain) – An output domain.
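The requirement of hashable positional arguments follows from how cons hashing works: equal constructor arguments must map to the same object. The sketch below shows one generic way to do this with a metaclass and a weak-value cache. It is an illustration of the technique only; the names `ConsHashedMeta` and `Term` are hypothetical and this is not Funsor's actual implementation.

```python
import weakref


class ConsHashedMeta(type):
    """Hypothetical metaclass: equal positional args yield the same instance."""

    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        # One cache per class; entries vanish when no one holds the instance.
        cls._cons_cache = weakref.WeakValueDictionary()

    def __call__(cls, *args):
        # Args must be hashable so that they can serve as the cache key.
        try:
            return cls._cons_cache[args]
        except KeyError:
            instance = super().__call__(*args)
            cls._cons_cache[args] = instance
            return instance


class Term(metaclass=ConsHashedMeta):
    """Toy immutable term taking only hashable positional arguments."""

    def __init__(self, name, size):
        self.name = name
        self.size = size


a = Term("x", 3)
b = Term("x", 3)
c = Term("y", 3)
```

Because `a` and `b` are the same object, structural equality reduces to an identity check, which is why optional keyword arguments (whose order and defaults complicate the key) are disallowed in this sketch.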
-
dtype
¶
-
shape
¶
-
requires_grad
¶
-
sample
(sampled_vars, sample_inputs=None)[source]¶ Create a Monte Carlo approximation to this funsor by replacing functions of sampled_vars with Deltas.
The result is a Funsor with the same .inputs and .output as the original funsor (plus sample_inputs if provided), so that self can be replaced by the sample in expectation computations:
    y = x.sample(sampled_vars)
    assert y.inputs == x.inputs
    assert y.output == x.output
    exact = (x.exp() * integrand).reduce(ops.add)
    approx = (y.exp() * integrand).reduce(ops.add)
If sample_inputs is provided, this creates a batch of scaled samples.
Parameters:
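The relationship between `exact` and `approx` can be sketched in plain Python. The example below assumes a small discrete distribution stored as a dict of log-probabilities and replaces it by a batch of equally weighted point masses; the names `log_prob`, `integrand`, and `sample_point_masses` are illustrative and are not Funsor APIs.

```python
import math
import random

# Log-probabilities of a small discrete distribution (illustrative values).
log_prob = {0: math.log(0.2), 1: math.log(0.5), 2: math.log(0.3)}


def integrand(value):
    return float(value ** 2)


# Exact expectation: sum over the whole support.
exact = sum(math.exp(lp) * integrand(v) for v, lp in log_prob.items())


def sample_point_masses(log_prob, num_samples, rng):
    """Replace the distribution by point masses, each scaled by 1/num_samples."""
    values = list(log_prob)
    weights = [math.exp(log_prob[v]) for v in values]
    draws = rng.choices(values, weights=weights, k=num_samples)
    return [(v, 1.0 / num_samples) for v in draws]


rng = random.Random(0)
samples = sample_point_masses(log_prob, 20000, rng)

# Approximate expectation: the same reduction, applied to the point masses.
approx = sum(weight * integrand(v) for v, weight in samples)
```

The scaling by `1 / num_samples` plays the role of the "scaled samples" mentioned above: the batch as a whole still has total mass one, so it can stand in for the original distribution in the expectation.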
-
unscaled_sample
(sampled_vars, sample_inputs)[source]¶ Internal method to draw an unscaled sample. This should be overridden by subclasses.
-
align
(names)[source]¶ Align this funsor to match given names. This is mainly useful in preparation for extracting .data of a funsor.torch.Tensor.
Parameters: names (tuple) – A tuple of strings representing all names but in a new order.
Returns: A permuted funsor equivalent to self.
Return type: Funsor
-
eager_subs
(subs)[source]¶ Internal substitution function. This relies on the user-facing __call__() method to coerce non-Funsors to Funsors. Once all inputs are Funsors, eager_subs() implementations can recurse to call Subs.
-
to_data
(x)[source]¶ Extract a Python object from a Funsor.
Raises a ValueError if free variables remain or if the funsor is lazy.
Parameters: x – An object, possibly a Funsor.
Returns: A non-funsor equivalent to x.
Raises: ValueError if any free variables remain.
Raises: PatternMissingError if funsor is not fully evaluated.
-
class
Variable
(name, output)[source]¶ Bases:
funsor.terms.Funsor
Funsor representing a single free variable.
Parameters: - name (str) – A variable name.
- output (funsor.domains.Domain) – A domain.
-
class
Subs
(arg, subs)[source]¶ Bases:
funsor.terms.Funsor
Lazy substitution of the form x(u=y, v=z).
Parameters:
- arg (Funsor) – A funsor being substituted into.
- subs (tuple) – A tuple of (name, value) pairs, where name is a string and value can be coerced to a Funsor via to_funsor().
-
class
Unary
(op, arg)[source]¶ Bases:
funsor.terms.Funsor
Lazy unary operation.
Parameters:
-
class
Binary
(op, lhs, rhs)[source]¶ Bases:
funsor.terms.Funsor
Lazy binary operation.
Parameters:
-
class
Reduce
(op, arg, reduced_vars)[source]¶ Bases:
funsor.terms.Funsor
Lazy reduction over multiple variables.
Parameters:
-
class
Number
(data, dtype=None)[source]¶ Bases:
funsor.terms.Funsor
Funsor backed by a Python number.
Parameters: - data (numbers.Number) – A python number.
- dtype – A nonnegative integer or the string “real”.
-
class
Slice
(name, start, stop, step, dtype)[source]¶ Bases:
funsor.terms.Funsor
Symbolic representation of a Python
slice
object.Parameters:
-
class
Stack
(name, parts)[source]¶ Bases:
funsor.terms.Funsor
Stack of funsors along a new input dimension.
Parameters:
-
class
Cat
(name, parts, part_name=None)[source]¶ Bases:
funsor.terms.Funsor
Concatenate funsors along an existing input dimension.
Parameters:
-
class
Lambda
(var, expr)[source]¶ Bases:
funsor.terms.Funsor
Lazy inverse to ops.getitem.
This is useful to simulate higher-order functions of integers by representing those functions as arrays.
Parameters:
- var (Variable) – A variable to bind.
- expr (Funsor) – A funsor.
-
class
Independent
(fn, reals_var, bint_var, diag_var)[source]¶ Bases:
funsor.terms.Funsor
Creates an independent diagonal distribution.
This is equivalent to substitution followed by reduction:
    f = ...  # a batched distribution
    assert f.inputs['x_i'] == reals(4, 5)
    assert f.inputs['i'] == bint(3)

    g = Independent(f, 'x', 'i', 'x_i')
    assert g.inputs['x'] == reals(3, 4, 5)
    assert 'x_i' not in g.inputs
    assert 'i' not in g.inputs

    x = Variable('x', reals(3, 4, 5))
    g == f(x_i=x['i']).reduce(ops.logaddexp, 'i')
Parameters:
-
to_funsor
(*args, **kwargs)¶ Multiply dispatched method: to_funsor
Convert to a Funsor. Only Funsors and scalars are accepted.
Parameters:
- x – An object.
- output (funsor.domains.Domain) – An optional output hint.
Returns: A Funsor equivalent to x.
Return type: Funsor
Raises: ValueError
Other signatures:
- Funsor
- object, object
- object, Domain
- Funsor, Domain
- str, Domain
- Number
- Number, Domain
- Tensor
- Tensor, Domain
- ndarray
- ndarray, Domain
Delta¶
PyTorch¶
-
align_tensor
(new_inputs, x, expand=False)[source]¶ Permute and add dims to a tensor to match desired new_inputs.
Parameters:
- new_inputs (OrderedDict) – A target set of inputs.
- x (funsor.terms.Funsor) – A Tensor or Number.
- expand (bool) – If False (default), set result size to 1 for any input of x not in new_inputs; if True expand to new_inputs size.
Returns: a number or torch.Tensor that can be broadcast to other tensors with inputs new_inputs.
Return type: int or float or torch.Tensor
-
align_tensors
(*args, **kwargs)[source]¶ Permute multiple tensors before applying a broadcasted op.
This is mainly useful for implementing eager funsor operations.
Parameters:
- *args (funsor.terms.Funsor) – Multiple Tensors and Numbers.
- expand (bool) – Whether to expand input tensors. Defaults to False.
Returns: a pair (inputs, tensors) where tensors are all torch.Tensors that can be broadcast together to a single data with given inputs.
Return type: tuple
-
class
Tensor
(data, inputs=None, dtype='real')[source]¶ Bases:
funsor.terms.Funsor
Funsor backed by a PyTorch Tensor.
This follows the torch.distributions convention of arranging named “batch” dimensions on the left and remaining “event” dimensions on the right. The output shape is determined by all remaining dims. For example:
    data = torch.zeros(5,4,3,2)
    x = Tensor(data, OrderedDict([("i", bint(5)), ("j", bint(4))]))
    assert x.output == reals(3, 2)
Operators like matmul and .sum() operate only on the output shape, and will not change the named inputs.
Parameters:
- data (torch.Tensor) – A PyTorch tensor.
- inputs (OrderedDict) – An optional mapping from input name (str) to datatype (Domain). Defaults to empty.
- dtype (int or the string "real".) – optional output datatype. Defaults to “real”.
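The batch-left, event-right convention amounts to splitting the raw data shape at the number of named inputs. The following is a minimal sketch of that bookkeeping using plain tuples; `split_shape` is a hypothetical helper, and input sizes are given as integers rather than `bint` domains.

```python
from collections import OrderedDict


def split_shape(data_shape, input_sizes):
    """Split a raw data shape into (batch_shape, output_shape).

    ``input_sizes`` maps each named input to its size; the named inputs
    occupy the leftmost dimensions, in order.
    """
    num_inputs = len(input_sizes)
    batch_shape = tuple(data_shape[:num_inputs])
    output_shape = tuple(data_shape[num_inputs:])
    if batch_shape != tuple(input_sizes.values()):
        raise ValueError("leftmost dims {} do not match inputs {}".format(
            batch_shape, dict(input_sizes)))
    return batch_shape, output_shape


inputs = OrderedDict([("i", 5), ("j", 4)])
batch_shape, output_shape = split_shape((5, 4, 3, 2), inputs)
```

With two named inputs, the remaining dims `(3, 2)` form the output shape, matching the `reals(3, 2)` assertion in the example above.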
-
requires_grad
¶
-
arange
(name, *args, **kwargs)[source]¶ Helper to create a named torch.arange() funsor. In some cases this can be replaced by a symbolic Slice.
Parameters:
Return type:
-
materialize
(x)[source]¶ Attempt to convert a Funsor to a Number or Tensor by substituting arange()s into its free variables.
Parameters: x (Funsor) – A funsor.
Return type: Funsor
-
class
Function
(fn, output, args)[source]¶ Bases:
funsor.terms.Funsor
Funsor wrapped by a PyTorch function.
Functions are assumed to support broadcasting and can be eagerly evaluated on funsors with free variables of int type (i.e. batch dimensions).
Functions are usually created via the function() decorator.
Parameters:
- fn (callable) – A PyTorch function to wrap.
- output (funsor.domains.Domain) – An output domain.
- args (Funsor) – Funsor arguments.
-
function
(*signature)[source]¶ Decorator to wrap a PyTorch function.
Example:
    @funsor.torch.function(reals(3,4), reals(4,5), reals(3,5))
    def matmul(x, y):
        return torch.matmul(x, y)

    # Three inputs (loc, scale_tril, x) followed by the output domain.
    @funsor.torch.function(reals(10), reals(10, 10), reals(10), reals())
    def mvn_log_prob(loc, scale_tril, x):
        d = torch.distributions.MultivariateNormal(loc, scale_tril=scale_tril)
        return d.log_prob(x)
To support functions that output nested tuples of tensors, specify a nested tuple of output types, for example:
    @funsor.torch.function(reals(8), (reals(), bint(8)))
    def max_and_argmax(x):
        return torch.max(x, dim=-1)
Parameters: *signature – A sequence of input domains followed by a final output domain or nested tuple of output domains.
-
class
Einsum
(equation, operands)[source]¶ Bases:
funsor.terms.Funsor
Wrapper around torch.einsum() to operate on real-valued Funsors.
Note this operates only on the output tensor. To perform sum-product contractions on named dimensions, instead use + and Reduce.
Parameters:
- equation (str) – A torch.einsum() equation.
- operands (tuple) – A tuple of input funsors.
-
torch_tensordot
(x, y, dims)[source]¶ Wrapper around torch.tensordot() to operate on real-valued Funsors.
Note this operates only on the output tensor. To perform sum-product contractions on named dimensions, instead use + and Reduce.
Arguments should satisfy:
    len(x.shape) >= dims
    len(y.shape) >= dims
    dims == 0 or x.shape[-dims:] == y.shape[:dims]
Parameters:
Return type:
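The three shape conditions can be checked, and the resulting output shape computed, without touching any tensor data. The sketch below assumes an integer `dims` that contracts the last `dims` dimensions of `x` with the first `dims` dimensions of `y`; `tensordot_output_shape` is a hypothetical helper, not a Funsor or PyTorch function.

```python
def tensordot_output_shape(x_shape, y_shape, dims):
    """Validate the documented constraints and return the result shape."""
    x_shape, y_shape = tuple(x_shape), tuple(y_shape)
    if len(x_shape) < dims or len(y_shape) < dims:
        raise ValueError("both operands need at least {} dims".format(dims))
    # Note x_shape[-0:] would be the whole shape, hence the explicit dims != 0.
    if dims != 0 and x_shape[-dims:] != y_shape[:dims]:
        raise ValueError("contracted dims differ: {} vs {}".format(
            x_shape[-dims:], y_shape[:dims]))
    # Keep the uncontracted left part of x and right part of y.
    return x_shape[:len(x_shape) - dims] + y_shape[dims:]


contracted = tensordot_output_shape((3, 4, 5), (4, 5, 6), 2)
outer = tensordot_output_shape((2,), (3,), 0)
```

The `dims == 0` case is an outer product, which is why the third condition is phrased as a disjunction: there is nothing to compare when no dimensions are contracted.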
NumPy¶
-
align_array
(new_inputs, x)[source]¶ Permute and expand an array to match desired new_inputs.
Parameters:
- new_inputs (OrderedDict) – A target set of inputs.
- x (funsor.terms.Funsor) – An Array or Number.
Returns: a number or numpy.ndarray that can be broadcast to other arrays with inputs new_inputs.
Return type:
-
align_arrays
(*args)[source]¶ Permute multiple arrays before applying a broadcasted op.
This is mainly useful for implementing eager funsor operations.
Parameters: *args (funsor.terms.Funsor) – Multiple Arrays and Numbers.
Returns: a pair (inputs, arrays) where arrays are all numpy.ndarrays that can be broadcast together to a single data with given inputs.
Return type: tuple
-
class
ArrayMeta
(name, bases, dct)[source]¶ Bases:
funsor.terms.FunsorMeta
Wrapper to fill in default args and convert between OrderedDict and tuple.
-
class
Array
(data, inputs=None, dtype='real')[source]¶ Bases:
funsor.terms.Funsor
Funsor backed by a NumPy Array.
This follows the torch.distributions convention of arranging named “batch” dimensions on the left and remaining “event” dimensions on the right. The output shape is determined by all remaining dims. For example:
    data = np.zeros((5,4,3,2))
    x = Array(data, OrderedDict([("i", bint(5)), ("j", bint(4))]))
    assert x.output == reals(3, 2)
Operators like matmul and .sum() operate only on the output shape, and will not change the named inputs.
Parameters:
-
arange
(name, size)[source]¶ Helper to create a named
numpy.arange()
funsor.Parameters: Return type:
-
materialize
(x)[source]¶ Attempt to convert a Funsor to a Number or numpy.ndarray by substituting arange()s into its free variables.
Gaussian¶
-
class
BlockVector
(shape)[source]¶ Bases:
object
Jit-compatible helper to build blockwise vectors. Syntax is similar to torch.zeros():
    x = BlockVector((100, 20))
    x[..., 0:4] = x1
    x[..., 6:10] = x2
    x = x.as_tensor()
    assert x.shape == (100, 20)
-
class
BlockMatrix
(shape)[source]¶ Bases:
object
Jit-compatible helper to build blockwise matrices. Syntax is similar to torch.zeros():
    x = BlockMatrix((100, 20, 20))
    x[..., 0:4, 0:4] = x11
    x[..., 0:4, 6:10] = x12
    x[..., 6:10, 0:4] = x12.transpose(-1, -2)
    x[..., 6:10, 6:10] = x22
    x = x.as_tensor()
    assert x.shape == (100, 20, 20)
-
align_gaussian
(new_inputs, old)[source]¶ Align data of a Gaussian distribution to a new
inputs
shape.
-
class
Gaussian
(info_vec, precision, inputs)[source]¶ Bases:
funsor.terms.Funsor
Funsor representing a batched joint Gaussian distribution as a log-density function.
Mathematically, a Gaussian represents the log-density function:
    f(x) = < x | info_vec > - 0.5 * < x | precision | x >
         = < x | info_vec - 0.5 * precision @ x >
Note that Gaussians are not normalized, rather they are canonicalized to evaluate to zero log density at the origin: f(0) = 0. This canonical form is useful in combination with the information filter representation because it allows Gaussians with incomplete information, i.e. zero eigenvalues in the precision matrix. These incomplete distributions arise when making low-dimensional observations on higher dimensional hidden state.
Parameters:
- info_vec (torch.Tensor) – An optional batched information vector, where info_vec = precision @ mean.
- precision (torch.Tensor) – A batched positive semidefinite precision matrix.
- inputs (OrderedDict) – Mapping from name to Domain.
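The canonical form can be evaluated directly from its definition. The sketch below is an unbatched, pure-Python illustration using nested lists; the function name `gaussian_log_density` and the example values are assumptions for demonstration, not part of the library.

```python
def gaussian_log_density(x, info_vec, precision):
    """Evaluate f(x) = <x|info_vec> - 0.5 * <x|precision|x> for one point."""
    n = len(x)
    linear = sum(x[i] * info_vec[i] for i in range(n))
    quadratic = sum(x[i] * precision[i][j] * x[j]
                    for i in range(n) for j in range(n))
    return linear - 0.5 * quadratic


# Rank-deficient precision: information about the first coordinate only,
# as would arise from observing one coordinate of a two-dimensional state.
precision = [[2.0, 0.0],
             [0.0, 0.0]]
info_vec = [2.0, 0.0]  # precision @ mean, with mean[0] == 1.0

at_origin = gaussian_log_density([0.0, 0.0], info_vec, precision)
at_mode = gaussian_log_density([1.0, 0.0], info_vec, precision)
shifted = gaussian_log_density([1.0, 5.0], info_vec, precision)
```

The zero eigenvalue means the second coordinate does not affect the value at all, so `at_mode` and `shifted` agree. A covariance-based parameterization could not represent this case, since the precision matrix here has no inverse.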
Joint¶
Contraction¶
-
class
Contraction
(red_op, bin_op, reduced_vars, terms)[source]¶ Bases:
funsor.terms.Funsor
Declarative representation of a finitary sum-product operation.
After normalization via the normalize() interpretation contractions will canonically order their terms by type:
    Delta, Number, Tensor, Gaussian
-
GaussianMixture
¶ alias of
funsor.cnf.Contraction
Integrate¶
-
class
Integrate
(log_measure, integrand, reduced_vars)[source]¶ Bases:
funsor.terms.Funsor
Funsor representing an integral wrt a log density funsor.
Parameters:
Optimizer¶
Adjoint Algorithms¶
Sum-Product Algorithms¶
-
partial_sum_product
(sum_op, prod_op, factors, eliminate=frozenset(), plates=frozenset())[source]¶ Performs partial sum-product contraction of a collection of factors.
Returns: a list of partially contracted Funsors. Return type: list
-
sum_product
(sum_op, prod_op, factors, eliminate=frozenset(), plates=frozenset())[source]¶ Performs sum-product contraction of a collection of factors.
Returns: a single contracted Funsor. Return type: Funsor
-
sequential_sum_product
(sum_op, prod_op, trans, time, step)[source]¶ For a funsor trans with dimensions time, prev and curr, computes a recursion equivalent to:
    tail_time = 1 + arange("time", trans.inputs["time"].size - 1)
    tail = sequential_sum_product(sum_op, prod_op,
                                  trans(time=tail_time),
                                  time, {"prev": "curr"})
    return prod_op(trans(time=0)(curr="drop"), tail(prev="drop")) \
        .reduce(sum_op, "drop")
but does so efficiently in parallel in O(log(time)).
Parameters: - sum_op (AssociativeOp) – A semiring sum operation.
- prod_op (AssociativeOp) – A semiring product operation.
- trans (Funsor) – A transition funsor.
- time (Variable) – The time input dimension.
- step (dict) – A dict mapping previous variables to current variables. This can contain multiple pairs of prev->curr variable names.
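The logarithmic depth comes from associativity: adjacent transition factors can be combined pairwise, and all pairs within a round are independent of one another. The sketch below illustrates the idea on small matrices stored as nested lists, where rows index the previous state and columns the current state. It is a generic pairwise reduction written for illustration, with hypothetical helper names, and is not Funsor's implementation.

```python
import operator
from functools import reduce


def semiring_matmul(sum_op, prod_op, a, b):
    """Contract the shared state: out[i][j] = sum_k prod(a[i][k], b[k][j])."""
    return [[reduce(sum_op, (prod_op(a[i][k], b[k][j]) for k in range(len(b))))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def naive_product(sum_op, prod_op, mats):
    """Left-to-right recursion: O(time) sequential steps."""
    return reduce(lambda a, b: semiring_matmul(sum_op, prod_op, a, b), mats)


def log_depth_product(sum_op, prod_op, mats):
    """Pairwise reduction: O(log(time)) rounds of independent contractions."""
    mats = list(mats)
    while len(mats) > 1:
        paired = [semiring_matmul(sum_op, prod_op, mats[i], mats[i + 1])
                  for i in range(0, len(mats) - 1, 2)]
        if len(mats) % 2:
            paired.append(mats[-1])  # odd one out is carried to the next round
        mats = paired
    return mats[0]


trans = [
    [[1, 1], [0, 1]],
    [[2, 0], [1, 1]],
    [[0, 1], [1, 0]],
    [[1, 2], [3, 4]],
    [[1, 0], [2, 1]],
]

sum_prod_naive = naive_product(operator.add, operator.mul, trans)
sum_prod_fast = log_depth_product(operator.add, operator.mul, trans)
max_plus_naive = naive_product(max, operator.add, trans)
max_plus_fast = log_depth_product(max, operator.add, trans)
```

Both the (add, mul) and the (max, add) semirings give identical results under either evaluation order, which is the property that lets `sum_op` and `prod_op` be arbitrary associative operations forming a semiring.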
-
class
MarkovProductMeta
(name, bases, dct)[source]¶ Bases:
funsor.terms.FunsorMeta
Wrapper to convert
step
to a tuple and fill in defaultstep_names
.
-
class
MarkovProduct
(sum_op, prod_op, trans, time, step, step_names)[source]¶ Bases:
funsor.terms.Funsor
Lazy representation of sequential_sum_product().
Parameters:
- sum_op (AssociativeOp) – A marginalization op.
- prod_op (AssociativeOp) – A Bayesian fusion op.
- trans (Funsor) – A sequence of transition factors, usually varying along the time input.
- time (str or Variable) – A time dimension.
- step (dict) – A str-to-str mapping of “previous” inputs of trans to “current” inputs of trans.
- step_names (dict) – Optional, for internal use by alpha conversion.
Affine Pattern Matching¶
-
is_affine
(fn)[source]¶ A sound but incomplete test to determine whether a funsor is affine with respect to all of its real inputs.
Parameters: fn (Funsor) – A funsor. Return type: bool
-
affine_inputs
(fn)[source]¶ Returns a [sound sub]set of real inputs of fn wrt which fn is known to be affine.
Parameters: fn (Funsor) – A funsor.
Returns: A set of input names wrt which fn is affine.
Return type: frozenset
-
extract_affine
(fn)[source]¶ Extracts an affine representation of a funsor, satisfying:
    x = ...
    const, coeffs = extract_affine(x)
    y = const + sum(Einsum(eqn, (coeff, Variable(var, coeff.output)))
                    for var, (coeff, eqn) in coeffs.items())
    assert_close(y, x)
    assert frozenset(coeffs) == affine_inputs(x)
The coeffs will have one key per input wrt which fn is known to be affine (via affine_inputs()), and const and coeffs.values will all be constant wrt these inputs.
The affine approximation is computed by evaluating fn at zero and each basis vector. To improve performance, users may want to run under the memoize() interpretation.
Parameters: fn (Funsor) – A funsor that is affine wrt the (add,mul) semiring in some subset of its inputs.
Returns: A pair (const, coeffs) where const is a funsor with no real inputs and coeffs is an OrderedDict mapping input name to a (coefficient, eqn) pair in einsum form.
Return type: tuple
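The "evaluate at zero and each basis vector" strategy can be shown on an ordinary Python function of a single vector input. This is a minimal sketch under that simplifying assumption; `extract_affine_1d` is a hypothetical name and ignores the einsum bookkeeping of the real function.

```python
def extract_affine_1d(fn, dim):
    """Recover (const, coeffs) of an affine fn: R^dim -> R by probing it."""
    zero = [0.0] * dim
    const = fn(zero)  # value at the origin
    coeffs = []
    for i in range(dim):
        basis = list(zero)
        basis[i] = 1.0
        # Subtracting the constant leaves the i-th linear coefficient.
        coeffs.append(fn(basis) - const)
    return const, coeffs


def affine_fn(x):
    return 2.0 * x[0] - 3.0 * x[1] + 5.0


const, coeffs = extract_affine_1d(affine_fn, 2)

# Reconstruct the function from the extracted representation.
point = [1.5, -2.0]
reconstructed = const + sum(c * xi for c, xi in zip(coeffs, point))
```

The procedure needs `dim + 1` evaluations of `fn`, which is why caching repeated subcomputations (as the memoize() interpretation mentioned above is meant to do) can help. If `fn` is not actually affine, the reconstruction will silently disagree with `fn` away from the probe points.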
Testing Utilities¶
-
class
ActualExpected
[source]¶ Bases:
funsor.testing.LazyComparison
Lazy string formatter for test assertions.
-
random_tensor
(inputs, output=reals())[source]¶ Creates a random
funsor.torch.Tensor
with given inputs and output.
-
random_array
(inputs, output)[source]¶ Creates a random
funsor.numpy.Array
with given inputs and output.
-
random_gaussian
(inputs)[source]¶ Creates a random
funsor.gaussian.Gaussian
with given inputs.
Pyro-Compatible Distributions¶
This interface provides a number of PyTorch-style distributions that use
funsors internally to perform inference. These high-level objects are based on
a wrapping class: FunsorDistribution
which
wraps a funsor in a PyTorch-distributions-compatible interface.
FunsorDistribution
objects can be used
directly in Pyro models (using the standard Pyro backend).
FunsorDistribution Base Class¶
-
class
FunsorDistribution
(funsor_dist, batch_shape=torch.Size([]), event_shape=torch.Size([]), dtype='real', validate_args=None)[source]¶ Bases:
pyro.distributions.torch_distribution.TorchDistribution
Distribution wrapper around a Funsor for use in Pyro code. This is typically used as a base class for specific funsor inference algorithms wrapped in a distribution interface.
Parameters:
- funsor_dist (funsor.terms.Funsor) – A funsor with an input named “value” that is treated as a random variable. The distribution should be normalized over “value”.
- batch_shape (torch.Size) – The distribution’s batch shape. This must be in the same order as the input of the funsor_dist, but may contain extra dims of size 1.
- event_shape – The distribution’s event shape.
-
arg_constraints
= {}¶
-
support
¶
Hidden Markov Models¶
-
class
DiscreteHMM
(initial_logits, transition_logits, observation_dist, validate_args=None)[source]¶ Bases:
funsor.pyro.distribution.FunsorDistribution
Hidden Markov Model with discrete latent state and arbitrary observation distribution. This uses [1] to parallelize over time, achieving O(log(time)) parallel complexity.
The event_shape of this distribution includes time on the left:
    event_shape = (num_steps,) + observation_dist.event_shape
This distribution supports any combination of homogeneous/heterogeneous time dependency of transition_logits and observation_dist. However, because time is included in this distribution’s event_shape, the homogeneous+homogeneous case will have a broadcastable event_shape with num_steps = 1, allowing log_prob() to work with arbitrary length data:
    # homogeneous + homogeneous case:
    event_shape = (1,) + observation_dist.event_shape
This class should be interchangeable with pyro.distributions.hmm.DiscreteHMM.
References:
[1] Simo Sarkka, Angel F. Garcia-Fernandez (2019) “Temporal Parallelization of Bayesian Filters and Smoothers” https://arxiv.org/pdf/1905.13002.pdf
Parameters:
- initial_logits (Tensor) – A logits tensor for an initial categorical distribution over latent states. Should have rightmost size state_dim and be broadcastable to batch_shape + (state_dim,).
- transition_logits (Tensor) – A logits tensor for transition conditional distributions between latent states. Should have rightmost shape (state_dim, state_dim) (old, new), and be broadcastable to batch_shape + (num_steps, state_dim, state_dim).
- observation_dist (Distribution) – A conditional distribution of observed data conditioned on latent state. The .batch_shape should have rightmost size state_dim and be broadcastable to batch_shape + (num_steps, state_dim). The .event_shape may be arbitrary.
-
has_rsample
¶
-
class
GaussianHMM
(initial_dist, transition_matrix, transition_dist, observation_matrix, observation_dist, validate_args=None)[source]¶ Bases:
funsor.pyro.distribution.FunsorDistribution
Hidden Markov Model with Gaussians for initial, transition, and observation distributions. This adapts [1] to parallelize over time to achieve O(log(time)) parallel complexity; however, it differs in that it tracks the log normalizer to ensure log_prob() is differentiable.
This corresponds to the generative model:
    z = initial_distribution.sample()
    x = []
    for t in range(num_steps):
        z = z @ transition_matrix + transition_dist.sample()
        x.append(z @ observation_matrix + observation_dist.sample())
The event_shape of this distribution includes time on the left:
    event_shape = (num_steps,) + observation_dist.event_shape
This distribution supports any combination of homogeneous/heterogeneous time dependency of transition_dist and observation_dist. However, because time is included in this distribution’s event_shape, the homogeneous+homogeneous case will have a broadcastable event_shape with num_steps = 1, allowing log_prob() to work with arbitrary length data:
    event_shape = (1, obs_dim)  # homogeneous + homogeneous case
This class should be compatible with pyro.distributions.hmm.GaussianHMM, but additionally supports funsor adjoint algorithms.
References:
[1] Simo Sarkka, Angel F. Garcia-Fernandez (2019) “Temporal Parallelization of Bayesian Filters and Smoothers” https://arxiv.org/pdf/1905.13002.pdf
Variables:
Parameters:
- initial_dist (MultivariateNormal) – A distribution over initial states. This should have batch_shape broadcastable to self.batch_shape. This should have event_shape (hidden_dim,).
- transition_matrix (Tensor) – A linear transformation of hidden state. This should have shape broadcastable to self.batch_shape + (num_steps, hidden_dim, hidden_dim) where the rightmost dims are ordered (old, new).
- transition_dist (MultivariateNormal) – A process noise distribution. This should have batch_shape broadcastable to self.batch_shape + (num_steps,). This should have event_shape (hidden_dim,).
- observation_matrix (Tensor) – A linear transformation from hidden to observed state. This should have shape broadcastable to self.batch_shape + (num_steps, hidden_dim, obs_dim).
- observation_dist (MultivariateNormal or Normal) – An observation noise distribution. This should have batch_shape broadcastable to self.batch_shape + (num_steps,). This should have event_shape (obs_dim,).
-
has_rsample
= True¶
-
arg_constraints
= {}¶
-
class
GaussianMRF
(initial_dist, transition_dist, observation_dist, validate_args=None)[source]¶ Bases:
funsor.pyro.distribution.FunsorDistribution
Temporal Markov Random Field with Gaussian factors for initial, transition, and observation distributions. This adapts [1] to parallelize over time to achieve O(log(time)) parallel complexity; however, it differs in that it tracks the log normalizer to ensure log_prob() is differentiable.
The event_shape of this distribution includes time on the left:
    event_shape = (num_steps,) + observation_dist.event_shape
This distribution supports any combination of homogeneous/heterogeneous time dependency of transition_dist and observation_dist. However, because time is included in this distribution’s event_shape, the homogeneous+homogeneous case will have a broadcastable event_shape with num_steps = 1, allowing log_prob() to work with arbitrary length data:
    event_shape = (1, obs_dim)  # homogeneous + homogeneous case
This class should be compatible with pyro.distributions.hmm.GaussianMRF, but additionally supports funsor adjoint algorithms.
References:
[1] Simo Sarkka, Angel F. Garcia-Fernandez (2019) “Temporal Parallelization of Bayesian Filters and Smoothers” https://arxiv.org/pdf/1905.13002.pdf
Variables:
Parameters:
- initial_dist (MultivariateNormal) – A distribution over initial states. This should have batch_shape broadcastable to self.batch_shape. This should have event_shape (hidden_dim,).
- transition_dist (MultivariateNormal) – A joint distribution factor over a pair of successive time steps. This should have batch_shape broadcastable to self.batch_shape + (num_steps,). This should have event_shape (hidden_dim + hidden_dim,) (old+new).
- observation_dist (MultivariateNormal) – A joint distribution factor over a hidden and an observed state. This should have batch_shape broadcastable to self.batch_shape + (num_steps,). This should have event_shape (hidden_dim + obs_dim,).
-
has_rsample
= True¶
-
class
SwitchingLinearHMM
(initial_logits, initial_mvn, transition_logits, transition_matrix, transition_mvn, observation_matrix, observation_mvn, exact=False, validate_args=None)[source]¶ Bases:
funsor.pyro.distribution.FunsorDistribution
Switching Linear Dynamical System represented as a Hidden Markov Model.
This corresponds to the generative model:
    z = Categorical(logits=initial_logits).sample()
    y = initial_mvn[z].sample()
    x = []
    for t in range(num_steps):
        z = Categorical(logits=transition_logits[t, z]).sample()
        y = y @ transition_matrix[t, z] + transition_mvn[t, z].sample()
        x.append(y @ observation_matrix[t, z] + observation_mvn[t, z].sample())
Viewed as a dynamic Bayesian network:
    z[t-1] ----> z[t] ---> z[t+1]         Discrete latent class
       |  \       |  \       |   \
       | y[t-1] ----> y[t] ----> y[t+1]   Gaussian latent state
       |   /      |   /      |   /
       V  /       V  /       V  /
    x[t-1]       x[t]      x[t+1]         Gaussian observation
Let class be the latent class, state be the latent multivariate normal state, and value be the observed multivariate normal value.
Parameters:
- initial_logits (Tensor) – Represents p(class[0]).
- initial_mvn (MultivariateNormal) – Represents p(state[0] | class[0]).
- transition_logits (Tensor) – Represents p(class[t+1] | class[t]).
- transition_matrix (Tensor) –
- transition_mvn (MultivariateNormal) – Together with transition_matrix, this represents p(state[t], state[t+1] | class[t]).
- observation_matrix (Tensor) –
- observation_mvn (MultivariateNormal) – Together with observation_matrix, this represents p(value[t+1], state[t+1] | class[t+1]).
- exact (bool) – If True, perform exact inference at cost exponential in num_steps. If False, use a moment_matching() approximation and use parallel scan algorithm to reduce parallel complexity to logarithmic in num_steps. Defaults to False.
-
has_rsample
= True¶
-
arg_constraints
= {}¶
-
filter
(value)[source]¶ Compute posterior over final state given a sequence of observations.
Parameters: value (Tensor) – A sequence of observations.
Returns: A posterior distribution over latent states at the final time step, represented as a pair (cat, mvn), where cat is a Categorical distribution over mixture components and mvn is a MultivariateNormal with rightmost batch dimension ranging over mixture components. This can then be used to initialize a sequential Pyro model for prediction.
Return type: tuple
Conversion Utilities¶
This module follows a convention for converting between funsors and PyTorch distribution objects. This convention is compatible with NumPy/PyTorch-style broadcasting. Following PyTorch distributions (and TensorFlow distributions), we consider “event shapes” to be on the right and broadcast-compatible “batch shapes” to be on the left.
This module also aims to be forgiving in inputs and pedantic in outputs: methods accept either the superclass torch.distributions.Distribution objects or the subclass pyro.distributions.TorchDistribution objects. Methods return only the narrower subclass pyro.distributions.TorchDistribution objects.
-
tensor_to_funsor
(tensor, event_inputs=(), event_output=0, dtype='real')[source]¶ Convert a torch.Tensor to a funsor.torch.Tensor.
Note this should not touch data, but may trigger a torch.Tensor.reshape() op.
Parameters:
- tensor (torch.Tensor) – A PyTorch tensor.
- event_inputs (tuple) – A tuple of names for rightmost tensor dimensions. If tensor has these names, they will be converted to result.inputs.
- event_output (int) – The number of tensor dimensions assigned to result.output. These must be on the right of any event_input dimensions.
Returns: A funsor.
Return type:
-
funsor_to_tensor
(funsor_, ndims, event_inputs=())[source]¶ Convert a funsor.torch.Tensor to a torch.Tensor.
Note this should not touch data, but may trigger a torch.Tensor.reshape() op.
Parameters:
- funsor_ (funsor.torch.Tensor) – A funsor.
- ndims (int) – The number of result dims, == result.dim().
- event_inputs (tuple) – Names assigned to rightmost dimensions.
Returns: A PyTorch tensor.
Return type:
-
mvn_to_funsor
(pyro_dist, event_dims=(), real_inputs={})[source]¶ Convert a joint torch.distributions.MultivariateNormal distribution into a Funsor with multiple real inputs.
This should satisfy:
    sum(d.num_elements for d in real_inputs.values())
      == pyro_dist.event_shape[0]
Parameters:
- pyro_dist (torch.distributions.MultivariateNormal) – A multivariate normal distribution over one or more variables of real or vector or tensor type.
- event_dims (tuple) – A tuple of names for rightmost dimensions. These will be assigned to result.inputs of type bint.
- real_inputs (OrderedDict) – A dict mapping real variable name to appropriately sized reals(). The sum of all .numel() of all real inputs should be equal to the pyro_dist dimension.
Returns: A funsor with given real_inputs and possibly additional bint inputs.
Return type:
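The size constraint says that the flat event vector of the multivariate normal must be exactly partitioned among the named real inputs. The sketch below illustrates that accounting with plain shapes, assuming the inputs are laid out contiguously in dictionary order (an assumption made for illustration); `num_elements` and `split_event` are hypothetical helpers, and shapes are given as tuples rather than `reals()` domains.

```python
import math
from collections import OrderedDict


def num_elements(shape):
    """Number of scalars in a tensor of the given shape (1 for a scalar)."""
    return math.prod(shape)


def split_event(flat, shapes):
    """Partition a flat event vector into one contiguous slice per input."""
    total = sum(num_elements(s) for s in shapes.values())
    if len(flat) != total:
        raise ValueError("event size {} != total input size {}".format(
            len(flat), total))
    parts = OrderedDict()
    start = 0
    for name, shape in shapes.items():
        stop = start + num_elements(shape)
        parts[name] = flat[start:stop]
        start = stop
    return parts


# A vector, a matrix, and a scalar input: 2 + 12 + 1 == 15 elements.
real_input_shapes = OrderedDict([("x", (2,)), ("y", (3, 4)), ("z", ())])
flat_event = list(range(15))
parts = split_event(flat_event, real_input_shapes)
```

A mismatch between the event size and the summed input sizes raises an error here, which mirrors the "should satisfy" condition stated above.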
-
funsor_to_mvn
(gaussian, ndims, event_inputs=())[source]¶ Convert a Funsor to a pyro.distributions.MultivariateNormal, dropping the normalization constant.
Parameters:
- gaussian (funsor.gaussian.Gaussian or funsor.joint.Joint) – A Gaussian funsor.
- ndims (int) – The number of batch dimensions in the result.
- event_inputs (tuple) – A tuple of names to assign to rightmost dimensions.
Returns: a multivariate normal distribution.
Return type:
-
funsor_to_cat_and_mvn
(funsor_, ndims, event_inputs)[source]¶ Converts a labeled Gaussian mixture model to a pair of distributions.
Parameters:
- funsor_ (funsor.joint.Joint) – A Gaussian mixture funsor.
- ndims (int) – The number of batch dimensions in the result.
Returns: A pair (cat, mvn), where cat is a Categorical distribution over mixture components and mvn is a MultivariateNormal with rightmost batch dimension ranging over mixture components.
-
class
AffineNormal
(matrix, loc, scale, value_x, value_y)[source]¶ Bases:
funsor.terms.Funsor
Represents a conditional diagonal normal distribution over a random variable Y whose mean is an affine function of a random variable X. The likelihood of X is thus:
AffineNormal(matrix, loc, scale).condition(y).log_density(x)
which is equivalent to:
Normal(x @ matrix + loc, scale).to_event(1).log_prob(y)
Parameters:
- matrix (Funsor) – A transformation from X to Y. Should have rightmost shape (x_dim, y_dim).
- loc (Funsor) – A constant offset for Y’s mean. Should have output shape (y_dim,).
- scale (Funsor) – Standard deviation for Y. Should have output shape (y_dim,).
- value_x (Funsor) – A value for X.
- value_y (Funsor) – A value for Y.
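The equivalent expression Normal(x @ matrix + loc, scale).to_event(1).log_prob(y) can be written out by hand for unbatched inputs. The following is a minimal sketch using plain Python lists in place of tensors; affine_normal_log_prob is a hypothetical name and does not exist in funsor.

```python
import math


def affine_normal_log_prob(x, matrix, loc, scale, y):
    """Log density of y under a diagonal normal with mean x @ matrix + loc.

    Hypothetical helper for illustration only; not part of funsor.
    `matrix` is a nested list of shape (x_dim, y_dim).
    """
    x_dim, y_dim = len(matrix), len(matrix[0])
    log_prob = 0.0
    for j in range(y_dim):
        # j-th component of the affine mean: (x @ matrix)[j] + loc[j].
        mean_j = sum(x[i] * matrix[i][j] for i in range(x_dim)) + loc[j]
        z = (y[j] - mean_j) / scale[j]
        # Univariate normal log density, summed over the event dimension.
        log_prob += -0.5 * z * z - math.log(scale[j]) - 0.5 * math.log(2 * math.pi)
    return log_prob


matrix = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # shape (x_dim=2, y_dim=3)
value = affine_normal_log_prob(
    x=[1.0, 2.0], matrix=matrix, loc=[0.0, 0.0, 0.0], scale=[1.0, 1.0, 1.0], y=[1.0, 2.0, 3.0]
)
print(value)
```

In this example y equals the mean exactly, so the result reduces to the normalizing term -1.5 * log(2 * pi).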
-
matrix_and_mvn_to_funsor
(matrix, mvn, event_dims=(), x_name='value_x', y_name='value_y')[source]¶ Convert a noisy affine function to a Gaussian. The noisy affine function is defined as:
y = x @ matrix + mvn.sample()
The result is a non-normalized Gaussian funsor with two real inputs, x_name and y_name, corresponding to a conditional distribution of real vector y given real vector x.
Parameters:
- matrix (torch.Tensor) – A matrix with rightmost shape (x_size, y_size).
- mvn (torch.distributions.MultivariateNormal or torch.distributions.Independent of torch.distributions.Normal) – A multivariate normal distribution with event_shape == (y_size,).
- event_dims (tuple) – A tuple of names for rightmost dimensions. These will be assigned to result.inputs of type bint.
- x_name (str) – The name of the x random variable.
- y_name (str) – The name of the y random variable.
Returns: A funsor with real inputs x_name and y_name and possibly additional bint inputs.
Return type:
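The generative process y = x @ matrix + mvn.sample() can be simulated directly. The sketch below assumes diagonal (independent) noise and uses Python's random module instead of a torch distribution; sample_noisy_affine, noise_loc and noise_scale are hypothetical names chosen for the example.

```python
import random


def sample_noisy_affine(x, matrix, noise_loc, noise_scale, rng):
    """Draw one sample of y = x @ matrix + noise with independent normal noise.

    Hypothetical helper for illustration only; not part of funsor.
    `matrix` is a nested list of shape (x_size, y_size).
    """
    y_size = len(matrix[0])
    return [
        # Affine part (x @ matrix)[j] plus one normal noise draw per output.
        sum(x[i] * matrix[i][j] for i in range(len(x)))
        + rng.gauss(noise_loc[j], noise_scale[j])
        for j in range(y_size)
    ]


rng = random.Random(0)  # seeded for reproducibility
matrix = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]  # shape (x_size=2, y_size=3)
y = sample_noisy_affine([1.0, 2.0], matrix, [0.0] * 3, [0.1] * 3, rng)
print(y)
```

Each draw scatters around x @ matrix, which is the conditional distribution of y given x that the resulting Gaussian funsor is described as representing.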
-
dist_to_funsor
(pyro_dist, event_inputs=())[source]¶ Convert a PyTorch distribution to a Funsor.
This is currently implemented for only a subset of distribution types.
Parameters: pyro_dist (torch.distributions.Distribution) – A PyTorch distribution.
Returns: A funsor.
Return type: funsor.terms.Funsor
Distribution Funsors¶
This interface provides a number of standard normalized probability distributions implemented as funsors.
-
class
Distribution
(*args)[source]¶ Bases:
funsor.terms.Funsor
Funsor backed by a PyTorch distribution object.
Parameters: *args – Distribution-dependent parameters. These can be either funsors or objects that can be coerced to funsors via to_funsor(). See derived classes for details.
-
dist_class
= 'defined by derived classes'¶
-
-
class
BernoulliLogits
(logits, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Bernoulli.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Bernoulli
-
-
Bernoulli
(probs=None, logits=None, value='value')[source]¶ Wraps pyro.distributions.Bernoulli.
This dispatches to either BernoulliProbs or BernoulliLogits to accept either probs or logits args.
Parameters:
-
class
Beta
(concentration1, concentration0, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Beta.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Beta
-
-
class
Binomial
(total_count, probs, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Binomial.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Binomial
-
-
class
Categorical
(probs, value='value')[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Categorical.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Categorical
-
-
class
Delta
(v, log_density=0, value='value')[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Delta.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.delta.Delta
-
-
class
Dirichlet
(concentration, value='value')[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Dirichlet.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Dirichlet
-
-
class
DirichletMultinomial
(concentration, total_count, value='value')[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.DirichletMultinomial.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.conjugate.DirichletMultinomial
-
-
LogNormal
(loc, scale, value='value')[source]¶ Wraps pyro.distributions.LogNormal.
Parameters:
-
class
Multinomial
(total_count, probs, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Multinomial.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Multinomial
-
-
class
Normal
(loc, scale, value='value')[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Normal.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Normal
-
-
class
MultivariateNormal
(loc, scale_tril, value='value')[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.MultivariateNormal.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.MultivariateNormal
-
-
class
Poisson
(rate, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Poisson.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Poisson
-
-
class
Gamma
(concentration, rate, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.Gamma.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.Gamma
-
-
class
VonMises
(loc, concentration, value=None)[source]¶ Bases:
funsor.distributions.Distribution
Wraps pyro.distributions.VonMises.
Parameters:
-
dist_class
¶ alias of
pyro.distributions.torch.VonMises
-
Mini-Pyro Interface¶
This interface provides a backend for the Pyro probabilistic programming language. It is intended to be used indirectly, by writing standard Pyro code and setting pyro_backend("funsor"). See examples/minipyro.py for example usage.
Mini Pyro¶
This file contains a minimal implementation of the Pyro Probabilistic Programming Language. The API (method signatures, etc.) matches that of the full implementation as closely as possible. This file is independent of the rest of Pyro, with the exception of the pyro.distributions module.
An accompanying example that makes use of this implementation can be found at examples/minipyro.py.
-
class
trace
(fn=None)[source]¶ Bases:
funsor.minipyro.Messenger
-
class
replay
(fn, guide_trace)[source]¶ Bases:
funsor.minipyro.Messenger
-
class
block
(fn=None, hide_fn=<function block.<lambda>>)[source]¶ Bases:
funsor.minipyro.Messenger
-
class
seed
(fn=None, rng_seed=None)[source]¶ Bases:
funsor.minipyro.Messenger
-
class
CondIndepStackFrame
(name, size, dim)¶ Bases:
tuple
-
dim
¶ Alias for field number 2
-
name
¶ Alias for field number 0
-
size
¶ Alias for field number 1
-
-
class
PlateMessenger
(fn, name, size, dim)[source]¶ Bases:
funsor.minipyro.Messenger
-
class
log_joint
(fn=None)[source]¶ Bases:
funsor.minipyro.Messenger
-
class
Adam
(optim_args)[source]¶ Bases:
funsor.minipyro.PyroOptim
-
TorchOptimizer
¶ alias of
torch.optim.adam.Adam
-
-
class
ClippedAdam
(optim_args)[source]¶ Bases:
funsor.minipyro.PyroOptim
-
TorchOptimizer
¶ alias of
pyro.optim.clipped_adam.ClippedAdam
-
-
class
Trace_ELBO
(**kwargs)[source]¶ Bases:
funsor.minipyro.ELBO
-
class
TraceMeanField_ELBO
(**kwargs)[source]¶ Bases:
funsor.minipyro.ELBO
-
class
TraceEnum_ELBO
(**kwargs)[source]¶ Bases:
funsor.minipyro.ELBO
-
class
Jit_ELBO
(elbo, **kwargs)[source]¶ Bases:
funsor.minipyro.ELBO
Einsum Interface¶
This interface implements tensor variable elimination among tensors. In particular it does not implement continuous variable elimination.
-
naive_plated_einsum
(eqn, *terms, **kwargs)[source]¶ Implements Tensor Variable Elimination (Algorithm 1 in [Obermeyer et al 2019]).
[Obermeyer et al 2019] Obermeyer, F., Bingham, E., Jankowiak, M., Chiu, J., Pradhan, N., Rush, A., and Goodman, N. Tensor Variable Elimination for Plated Factor Graphs, 2019.
-
einsum
(eqn, *terms, **kwargs)[source]¶ Top-level interface for optimized tensor variable elimination.
Parameters:
- eqn (str) – An einsum equation.
- *terms (funsor.terms.Funsor) – One or more operands.
- plates (set) – Optional keyword argument denoting which funsor dimensions are plate dimensions. Among all input dimensions (from terms): dimensions in plates but not in outputs are product-reduced; dimensions in neither plates nor outputs are sum-reduced.
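The reduction rule for plates can be illustrated by classifying the dimensions of an equation string. This sketch assumes single-character dimension names and an explicit "->" in the equation; classify_reductions is a hypothetical helper that only inspects the string and performs no tensor computation.

```python
def classify_reductions(eqn, plates=()):
    """Return (sum_dims, product_dims) implied by an einsum equation.

    Hypothetical helper for illustration only; not part of funsor.
    Assumes single-character dimension names and an explicit "->".
    """
    inputs, output = eqn.split("->")
    input_dims = set(inputs.replace(",", ""))
    output_dims = set(output)
    # Only dimensions absent from the output are reduced at all.
    reduced = input_dims - output_dims
    # Plate dimensions are product-reduced; all others are sum-reduced.
    product_dims = reduced & set(plates)
    sum_dims = reduced - set(plates)
    return sum_dims, product_dims


sum_dims, product_dims = classify_reductions("a,abi->", plates={"i"})
print(sorted(sum_dims), sorted(product_dims))
```

With plates={"i"}, the dimensions a and b are summed out while i is multiplied out.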