Probabilistic numerics casts numerical tasks, such the numerical solution of differential equations, as inference problems to be solved. One approach is to model the unknown quantity of interest as a random variable, and to constrain this variable using data generated during the course of a traditional numerical method. However, data may be nonlinearly related to the quantity of interest, rendering the proper conditioning of random variables difficult and limiting the range of numerical tasks that can be addressed. Instead, this paper proposes to construct probabilistic numerical methods based only on the final output from a traditional method. A convergent sequence of approximations to the quantity of interest constitute a dataset, from which the limiting quantity of interest can be extrapolated, in a probabilistic analogue of Richardson’s deferred approach to the limit. This black box approach (1) massively expands the range of tasks to which probabilistic numerics can be applied, (2) inherits the features and performance of state-of-the-art numerical methods, and (3) enables provably higher orders of convergence to be achieved. Applications are presented for nonlinear ordinary and partial differential equations, as well as for eigenvalue problems-a setting for which no probabilistic numerical methods have yet been developed.

This article attempts to cast the emergence of probabilistic numerics as a mathematical-statistical research field within its historical context and to explore how its gradual development can be related to modern formal treatments and applications. We highlight in particular the parallel contributions of Sul’din and Larkin in the 1960s and how their pioneering early ideas have reached a degree of maturity in the intervening period, mediated by paradigms such as average-case analysis and information-based complexity. We provide a subjective assessment of the state of research in probabilistic numerics and highlight some difficulties to be addressed by future works.

Optimality Criteria for Probabilistic Numerical Methods

It is well understood that Bayesian decision theory and average case analysis are essentially identical. However, if one is interested in performing uncertainty quantification for a numerical task, it can be argued that the decision-theoretic framework is neither appropriate nor sufficient. To this end, we consider an alternative optimality criterion from Bayesian experimental design and study its implied optimal information in the numerical context. This information is demonstrated to differ, in general, from the information that would be used in an average-case-optimal numerical method. The explicit connection to Bayesian experimental design suggests several distinct regimes in which optimal probabilistic numerical methods can be developed.

2017

Universal Scalable Robust Solvers from Computational Information Games and fast eigenspace adapted Multiresolution Analysis

We show how the discovery of robust scalable numerical solvers for arbitrary bounded linear operators can be automated as a Game Theory problem by reformulating the process of computing with partial information and limited resources as that of playing underlying hierarchies of adversarial information games. When the solution space is a Banach space \B endowed with a quadratic norm {}\textbar}cdot}\textbar the optimal measure (mixed strategy) for such games (e.g. the adversarial recovery of \u}in B given partial measurements \[}phi_i, u] with {}phi_i}in B^* using relative error in {}\textbar}cdot}\textbar\-norm as a loss) is a centered Gaussian field {}xi solely determined by the norm {}\textbar}cdot}\textbar whose conditioning (on measurements) produces optimal bets. When measurements are hierarchical, the process of conditioning this Gaussian field produces a hierarchy of elementary bets (gamblets). These gamblets generalize the notion of Wavelets and Wannier functions in the sense that they are adapted to the norm {}\textbar}cdot}\textbar and induce a multi-resolution decomposition of \B that is adapted to the eigensubspaces of the operator defining the norm {}\textbar}cdot}\textbar\. When the operator is localized, we show that the resulting gamblets are localized both in space and frequency and introduce the Fast Gamblet Transform (FGT) with rigorous accuracy and (near-linear) complexity estimates. As the FFT can be used to solve and diagonalize arbitrary PDEs with constant coefficients, the FGT can be used to decompose a wide range of continuous linear operators (including arbitrary continuous linear bijections from \H^s_0 to \H^{-s} or to Ł^2\) into a sequence of independent linear systems with uniformly bounded condition numbers and leads to {}mathcal{O}(N }operatorname{polylog} N) solvers and eigenspace adapted Multiresolution Analysis (resulting in near linear complexity approximation of all eigensubspaces).

The emergent field of probabilistic numerics has thus far lacked rigorous statistical principals. This paper establishes Bayesian probabilistic numerical methods as those which can be cast as solutions to certain Bayesian inverse problems, albeit problems that are non-standard. This allows us to establish general conditions under which Bayesian probabilistic numerical methods are well-defined, encompassing both non-linear and non-Gaussian models. For general computation, a numerical approximation scheme is developed and its asymptotic convergence is established. The theoretical development is then extended to pipelines of computation, wherein probabilistic numerical methods are composed to solve more challenging numerical tasks. The contribution highlights an important research frontier at the interface of numerical analysis and uncertainty quantification, with some illustrative applications presented.

The past century has seen a steady increase in the need of estimating and predicting complex systems and making (possibly critical) decisions with limited information. Although computers have made possible the numerical evaluation of sophisticated statistical models, these models are still designed by humans because there is currently no known recipe or algorithm for dividing the design of a statistical model into a sequence of arithmetic operations. Indeed enabling computers to think as humans, especially when faced with uncertainty, is challenging in several major ways: (1) Finding optimal statistical models remains to be formulated as a well-posed problem when information on the system of interest is incomplete and comes in the form of a complex combination of sample data, partial knowledge of constitutive relations and a limited description of the distribution of input random variables. (2) The space of admissible scenarios along with the space of relevant information, assumptions, and/or beliefs, tends to be infinite dimensional, whereas calculus on a computer is necessarily discrete and finite. With this purpose, this paper explores the foundations of a rigorous framework for the scientific computation of optimal statistical estimators/models and reviews their connections with decision theory, machine learning, Bayesian inference, stochastic optimization, robust optimization, optimal uncertainty quantification, and information-based complexity.

For a Gaussian measure on a separable Hilbert space with covariance operator \C we show that the family of conditional measures associated with conditioning on a closed subspace \S^{}perp} are Gaussian with covariance operator the short {}mathcal{S}(C) of the operator \C to \S\. We provide two proofs. The first uses the theory of Gaussian Hilbert spaces and a characterization of the shorted operator by Andersen and Trapp. The second uses recent developments by Corach, Maestripieri and Stojanoff on the relationship between the shorted operator and \C\-symmetric oblique projections onto \S^{}perp}\. To obtain the assertion when such projections do not exist, we develop an approximation result for the shorted operator by showing, for any positive operator \A how to construct a sequence of approximating operators \A^{n} which possess \A^{n}\-symmetric oblique projections onto \S^{}perp} such that the sequence of shorted operators {}mathcal{S}(A^{n}) converges to {}mathcal{S}(A) in the weak operator topology. This result combined with the martingale convergence of random variables associated with the corresponding approximations \C^{n} establishes the main assertion in general. Moreover, it in turn strengthens the approximation theorem for shorted operator when the operator is trace class; then the sequence of shorted operators {}mathcal{S}(A^{n}) converges to {}mathcal{S}(A) in trace norm.

Probabilistic numerics and uncertainty in computations

We deliver a call to arms for probabilistic numerical methods: algorithms for
numerical tasks, including linear algebra, integration, optimization and
solving differential equations, that return uncertainties in their
calculations. Such uncertainties, arising from the loss of precision induced by
numerical calculation with limited time or hardware, are important for much
contemporary science and industry. Within applications such as climate science
and astrophysics, the need to make decisions on the basis of computations with
large and complex data has led to a renewed focus on the management of
numerical uncertainty. We describe how several seminal classic numerical
methods can be interpreted naturally as probabilistic inference. We then show
that the probabilistic view suggests new algorithms that can flexibly be
adapted to suit application specifics, while delivering improved empirical
performance. We provide concrete illustrations of the benefits of probabilistic
numeric algorithms on real scientific problems from astrometry and astronomical
imaging, while highlighting open problems with these new algorithms. Finally,
we describe how probabilistic numerical methods provide a coherent framework
for identifying the uncertainty in calculations performed with a combination of
numerical algorithms (e.g. both numerical optimisers and differential equation
solvers), potentially allowing the diagnosis (and control) of error sources in
computations.

The numerical analyst is often called upon to estimate a function from a very limited knowledge of its properties (e.g. a finite number of ordinate values). This problem may be made well posed in a variety of ways, but an attractive approach is to regard the required function as a member of a linear space on which a probability measure is constructed, and then use established techniques of probability theory and statistics in order to infer properties of the function from the given information. This formulation agrees with established theory, for the problem of optimal linear approximation (using a Gaussian probability distribution), and also permits the estimation of nonlinear functionals, as well as extension to the case of "noisy" data.