Probabilistic Numerics | Ordinary Differential Equations

To avoid a frequent initial confusion for new readers, it may be helpful to point out that there are two common ways in which probabilistic methods are used in combination with ordinary differential equations: The “classic” problem of numerics is to infer the solution to an initial value problem given access to the differential equation. Below, we term this problem “solving ODEs”. The reverse problem, in some sense, has also found interest in machine learning: inferring a differential equation from (noisy) observations of trajectories that are assumed to be governed by this ODE. Below, this is listed under “inferring ODEs”.

Solving ODEs

2020

A role for symmetry in the Bayesian solution of differential equations

Wang, Junyang, Cockayne, Jon, and Oates, Chris J

Bayesian Analysis 2020

2018

Convergence Rates of Gaussian ODE Filters

Kersting, H., Sullivan, T. J., and Hennig, P.

ArXiv e-prints 2018

Abs link

A recently-introduced class of probabilistic (uncertainty-aware) solvers for ordinary differential equations (ODEs) applies Gaussian (Kalman) filtering to initial value problems. These methods model the true solution x and its first q derivatives a priori as a Gauss–Markov process X, which is then iteratively conditioned on information about x’. We prove worst-case local convergence rates of order h^q+1 for a wide range of versions of this Gaussian ODE filter, as well as global convergence rates of order h^q in the case of q=1 and an integrated Brownian motion prior, and analyze how inaccurate information on x’ coming from approximate evaluations of f affects these rates. Moreover, we present explicit formulas for the steady states and show that the posterior confidence intervals are well calibrated in all considered cases that exhibit global convergence—in the sense that they globally contract at the same rate as the truncation error.
Random time step probabilistic methods for uncertainty quantification in chaotic and geometric numerical integration

Abdulle, A., and Garegnani, G.

ArXiv e-prints 2018

Abs link

A novel probabilistic numerical method for quantifying the uncertainty induced by the time integration of ordinary differential equations (ODEs) is introduced. Departing from the classical strategy to randomize ODE solvers by adding a random forcing term, we show that a probability measure over the numerical solution of ODEs can be obtained by introducing suitable random time-steps in a classical time integrator. This intrinsic randomization allows for the conservation of geometric properties of the underlying deterministic integrator such as mass conservation, symplecticity or conservation of first integrals. Weak and mean-square convergence analysis are derived. We also analyze the convergence of the Monte Carlo estimator for the proposed random time step method and show that the measure obtained with repeated sampling converges in mean-square sense independently of the number of samples. Numerical examples including chaotic Hamiltonian systems, chemical reactions and Bayesian inferential problems illustrate the accuracy, robustness and versatility of our probabilistic numerical method.
Implicit Probabilistic Integrators for ODEs

Teymur, Onur, Lie, Han Cheng, Sullivan, Tim, and Calderhead, Ben

2018

Abs link

We introduce a family of implicit probabilistic integrators for initial value problems (IVPs) taking as a starting point the multistep Adams–Moulton method. The implicit construction allows for dynamic feedback from the forthcoming time-step, by contrast with previous probabilistic integrators, all of which are based on explicit methods. We begin with a concise survey of the rapidly-expanding field of probabilistic ODE solvers. We then introduce our method, which builds on and adapts the work of Conrad et al. (2016) and Teymur et al. (2016), and provide a rigorous proof of its well-definedness and convergence. We discuss the problem of the calibration of such integrators and suggest one approach. We give an illustrative example highlighting the effect of the use of probabilistic integrators – including our new method – in the setting of parameter inference within an inverse problem.

2016

A probabilistic model for the numerical solution of initial value problems

Schober, M., Särkkä, S., and Hennig, P.

ArXiv e-prints 2016

Abs link

Like many numerical methods, solvers for initial value problems (IVPs) on ordinary differential equations estimate an analytically intractable quantity, using the results of tractable computations as inputs. This structure is closely connected to the notion of inference on latent variables in statistics. We describe a class of algorithms that formulate the solution to an IVP as inference on a latent path that is a draw from a Gaussian process probability measure (or equivalently, the solution of a linear stochastic differential equation). We then show that certain members of this class are connected precisely to generalized linear methods for ODEs, a number of Runge–Kutta methods, and Nordsieck methods. This probabilistic formulation of classic methods is valuable in two ways: analytically, it highlights implicit prior assumptions favoring certain approximate solutions to the IVP over others, and gives a precise meaning to the old observation that these methods act like filters. Practically, it endows the classic solvers with ‘docking points’ for notions of uncertainty and prior information about the initial value, the value of the ODE itself, and the solution of the problem.
Active Uncertainty Calibration in Bayesian ODE Solvers

Kersting, Hans P., and Hennig, Philipp

In Uncertainty in Artificial Intelligence (UAI) 2016

Abs link

There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al. introduced a sampling-based class of methods that are ’well-calibrated’ in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al. pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers, that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.
Probabilistic Linear Multistep Methods

Teymur, Onur, Zygalakis, Kostas, and Calderhead, Ben

2016

Abs link

We present a derivation and theoretical investigation of the Adams-Bashforth and Adams-Moulton family of linear multistep methods for solving ordinary differential equations, starting from a Gaussian process (GP) framework. In the limit, this formulation coincides with the classical deterministic methods, which have been used as higher-order initial value problem solvers for over a century. Furthermore, the natural probabilistic framework provided by the GP formulation allows us to derive probabilistic versions of these methods, in the spirit of a number of other probabilistic ODE solvers presented in the recent literature. In contrast to higher-order Runge-Kutta methods, which require multiple intermediate function evaluations per step, Adams family methods make use of previous function evaluations, so that increased accuracy arising from a higher-order multistep approach comes at very little additional computational cost. We show that through a careful choice of covariance function for the GP, the posterior mean and standard deviation over the numerical solution can be made to exactly coincide with the value given by the deterministic method and its local truncation error respectively. We provide a rigorous proof of the convergence of these new methods, as well as an empirical investigation (up to fifth order) demonstrating their convergence rates in practice.

2015

A Random Riemannian Metric for Probabilistic Shortest-Path Tractography

Hauberg, Søren, Schober, Michael, Liptrot, Matthew, Hennig, Philipp, and Feragen, Aasa

In Medical Image Computing and Computer-Assisted Intervention (MICCAI) 2015

Abs

Shortest-path tractography (SPT) algorithms solve global optimization problems defined from local distance functions. As diffusion MRI data is inherently noisy, so are the voxelwise tensors from which local distances are derived. We extend Riemannian SPT by modeling the stochasticity of the diffusion tensor as a “random Riemannian metric”, where a geodesic is a distribution over tracts. We approximate this distribution with a Gaussian process and present a probabilistic numerics algorithm for computing the geodesic distribution. We demonstrate SPT improvements on data from the Human Connectome Project.

2014

On solving Ordinary Differential Equations using Gaussian Processes

Barber, D.

ArXiv pre-print 1408.3807 2014

Abs link

We describe a set of Gaussian Process based approaches that can be used to solve non-linear Ordinary Differential Equations. We suggest an explicit probabilistic solver and two implicit methods, one analogous to Picard iteration and the other to gradient matching. All methods have greater accuracy than previously suggested Gaussian Process approaches. We also suggest a general approach that can yield error estimates from any standard ODE solver.
Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics

Hennig, Philipp, and Hauberg, Søren

In Proc. of the 17th int. Conf. on Artificial Intelligence and Statistics (AISTATS) 2014

Abs Code

We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.
Probabilistic shortest path tractography in DTI using Gaussian Process ODE solvers

Schober, Michael, Kasenburg, Niklas, Feragen, Aasa, Hennig, Philipp, and Hauberg, Søren

In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2014 2014

Abs

Tractography in diffusion tensor imaging estimates connectivity in the brain through observations of local diffusivity. These observations are noisy and of low resolution and, as a consequence, connections cannot be found with high precision. We use probabilistic numerics to estimate connectivity between regions of interest and contribute a Gaussian Process tractography algorithm which allows for both quantification and visualization of its posterior uncertainty. We use the uncertainty both in visualization of individual tracts as well as in heat maps of tract locations. Finally, we provide a quantitative evaluation of different metrics and algorithms showing that the adjoint metric combined with our algorithm produces paths which agree most often with experts.
Probabilistic ODE Solvers with Runge-Kutta Means

Schober, Michael, Duvenaud, David K, and Hennig, Philipp

2014

Abs

Runge-Kutta methods are the classic family of solvers for ordinary differential equations (ODEs), and the basis for the state of the art. Like most numerical methods, they return point estimates. We construct a family of probabilistic numerical methods that instead return a Gauss-Markov process defining a probability distribution over the ODE solution. In contrast to prior work, we construct this family such that posterior means match the outputs of the Runge-Kutta family exactly, thus inheriting their proven good properties. Remaining degrees of freedom not identified by the match to Runge-Kutta are chosen such that the posterior probability measure fits the observed structure of the ODE. Our results shed light on the structure of Runge-Kutta solvers from a new direction, provide a richer, probabilistic output, have low computational cost, and raise new research questions.

2013

Simulation of stochastic network dynamics via entropic matching

Ramalho, Tiago, Selig, Marco, Gerland, Ulrich, and Enßlin, Torsten A.

Phys. Rev. E 2013

Abs link

The simulation of complex stochastic network dynamics arising, for instance, from models of coupled biomolecular processes remains computationally challenging. Often, the necessity to scan a model’s dynamics over a large parameter space renders full-fledged stochastic simulations impractical, motivating approximation schemes. Here we propose an approximation scheme which improves upon the standard linear noise approximation while retaining similar computational complexity. The underlying idea is to minimize, at each time step, the Kullback-Leibler divergence between the true time evolved probability distribution and a Gaussian approximation (entropic matching). This condition leads to ordinary differential equations for the mean and the covariance matrix of the Gaussian. For cases of weak nonlinearity, the method is more accurate than the linear method when both are compared to stochastic simulations.
Bayesian Uncertainty Quantification for Differential Equations

Chkrebtii, O., Campbell, D.A., Girolami, M.A., and Calderhead, B.

Bayesin Analysis (discussion paper) 2013

Abs link

This paper advocates expansion of the role of Bayesian statistical inference when formally quantifying uncertainty in computer models defined by systems of ordinary or partial differential equations. We adopt the perspective that implicitly defined infinite dimensional functions representing model states are objects to be inferred probabilistically. We develop a general methodology for the probabilistic integration of differential equations via model based updating of a joint prior measure on the space of functions and their temporal and spatial derivatives. This results in a posterior measure over functions reflecting how well they satisfy the system of differential equations and corresponding initial and boundary values. We show how this posterior measure can be naturally incorporated within the Kennedy and O’Hagan framework for uncertainty quantification and provides a fully Bayesian approach to model calibration. By taking this probabilistic viewpoint, the full force of Bayesian inference can be exploited when seeking to coherently quantify and propagate epistemic uncertainty in computer models of complex natural and physical systems. A broad variety of examples are provided to illustrate the potential of this framework for characterising discretization uncertainty, including initial value, delay, and boundary value differential equations, as well as partial differential equations. We also demonstrate our methodology on a large scale system, by modeling discretization uncertainty in the solution of the Navier-Stokes equations of fluid flow, reduced to over 16,000 coupled and stiff ordinary differential equations. Finally, we discuss the wide range of open research themes that follow from the work presented.

2009

A quantitative probabilistic investigation into the accumulation of rounding errors in numerical ODE solution

Mosbach, Sebastian, and Turner, Amanda G.

Computers & Mathematics with Applications 2009

Abs link

We examine numerical rounding errors of some deterministic solvers for systems of ordinary differential equations (ODEs). We show that the accumulation of rounding errors results in a solution that is inherently random and we obtain the theoretical distribution of the trajectory as a function of time, the step size and the numerical precision of the computer. We consider, in particular, systems which amplify the effect of the rounding errors so that over long time periods the solutions exhibit divergent behaviour. By performing multiple repetitions with different values of the time step size, we observe numerically the random distributions predicted theoretically. We mainly focus on the explicit Euler and RK4 methods but also briefly consider more complex algorithms such as the implicit solvers VODE and RADAU5.

2003

Solving noisy linear operator equations by Gaussian processes: Application to ordinary and partial differential equations

Graepel, Thore

In ICML 2003

1991

Bayesian solution of ordinary differential equations

Skilling, J.

Maximum Entropy and Bayesian Methods, Seattle 1991

Abs

In the numerical solution of ordinary differential equations, a function y(x) is to be reconstructed from knowledge of the functional form of its derivative: dy/dx=f(x,y), together with an appropriate boundary condition. The derivative f is evaluated at a sequence of suitably chosen points (x_k,y_k), from which the form of y(.) is estimated. This is an inference problem, which can and perhaps should be treated by Bayesian techniques. As always, the inference appears as a probability distribution prob(y(.)), from which random samples show the probabilistic reliability of the results. Examples are given.

1973

A Statistical Study Of The Accuracy Of Floating Point Number Systems

Kuki, H., and Cody, W. J.

Communications of the ACM 1973

Abs link

This paper presents the statistical results of tests of the accuracy of certain arithmetic systems in evaluating sums, products and inner products, and analytic error estimates for some of the computations. The arithmetic systems studied are 6-digit hexadecimal and 22-digit binary floating point number representations combined with the usual chop and round modes of arithmetic with various numbers of guard digits, and with a modified round mode with guard digits. In a certain sense, arithmetic systems differing only in their use of binary or hexadecimal number representations are shown to be approximately statistically equivalent in accuracy. Further, the usual round mode with guard digits is shown to be statistically superior in accuracy to the usual chop mode in all cases save one. The modified round mode is found to be superior to the chop mode in all cases.

1966

Test of Probabilistic Models for the Propagation of Roundoff Errors

Hull, T. E., and Swenson, J. R.

Communications of the ACM 1966

Abs link

In any prolonged computation it is generally assumed that the accumulated effect of roundoff errors is in some sense statistical. The purpose of this paper is to give precise descriptions of certain probabilistic models for roundoff error, and then to described a series of experiments for testing the validity of these models. It is concluded that the models are in general very good. Discrepancies are both rare and mild. The test techniques can also be used to experiment with various types of special arithmetic.

Inferring ODEs

2017

Scalable Variational Inference for Dynamical Systems

Gorbach, Nico S, Bauer, Stefan, and Buhmann, Joachim M

2017

Abs link

Gradient matching is a promising tool for learning parameters and state dynamics of ordinary differential equations. It is a grid free inference approach, which, for fully observable systems is at times competitive with numerical integration. However, for many real-world applications, only sparse observations are available or even unobserved variables are included in the model description. In these cases most gradient matching methods are difficult to apply or simply do not provide satisfactory results. That is why, despite the high computational cost, numerical integration is still the gold standard in many applications. Using an existing gradient matching approach, we propose a scalable variational inference framework which can infer states and parameters simultaneously, offers computational speedups, improved accuracy and works well even under model misspecifications in a partially observable system.

2014

Gaussian Processes for Bayesian Estimation in Ordinary Differential Equations

Wang, Yali, and Barber, David

In International Conference on Machine Learning – ICML 2014

Abs

Bayesian parameter estimation in coupled ordinary differential equations (ODEs) is challenging due to the high computational cost of numerical integration. In gradient matching a separate data model is introduced with the property that its gradient may be calculated easily. Parameter estimation is then achieved by requiring consistency between the gradients computed from the data model and those specified by the ODE. We propose a Gaussian process model that directly links state derivative information with system observations, simplifying previous approaches and improving estimation accuracy.

2013

Adaptive Markov chain Monte Carlo forward projection for statistical analysis in epidemic modelling of human papillomavirus

Korostil, Igor A, Peters, Gareth W, Cornebise, Julien, and Regan, David G

Statistics in medicine 2013

Abs link

A Bayesian statistical model and estimation methodology based on forward projection adaptive Markov chain Monte Carlo is developed in order to perform the calibration of a high-dimensional nonlinear system of ordinary differential equations representing an epidemic model for human papillomavirus types 6 and 11 (HPV-6, HPV-11). The model is compartmental and involves stratification by age, gender and sexual-activity group. Developing this model and a means to calibrate it efficiently is relevant because HPV is a very multi-typed and common sexually transmitted infection with more than 100 types currently known. The two types studied in this paper, types 6 and 11, are causing about 90% of anogenital warts. We extend the development of a sexual mixing matrix on the basis of a formulation first suggested by Garnett and Anderson, frequently used to model sexually transmitted infections. In particular, we consider a stochastic mixing matrix framework that allows us to jointly estimate unknown attributes and parameters of the mixing matrix along with the parameters involved in the calibration of the HPV epidemic model. This matrix describes the sexual interactions between members of the population under study and relies on several quantities that are a priori unknown. The Bayesian model developed allows one to estimate jointly the HPV-6 and HPV-11 epidemic model parameters as well as unknown sexual mixing matrix parameters related to assortativity. Finally, we explore the ability of an extension to the class of adaptive Markov chain Monte Carlo algorithms to incorporate a forward projection strategy for the ordinary differential equation state trajectories. Efficient exploration of the Bayesian posterior distribution developed for the ordinary differential equation parameters provides a challenge for any Markov chain sampling methodology, hence the interest in adaptive Markov chain methods. We conclude with simulation studies on synthetic and recent actual data.

2009

Accelerating Bayesian Inference over Nonlinear Differential Equations with Gaussian Processes

Calderhead, Ben, Girolami, Mark, and Lawrence, Neil D.

2009

link