GoGP

GoGP is a library for probabilistic programming around Gaussian processes. It uses Infergo for automatic differentiation and inference.

GoGP is built around a dual view on the Gaussian process

  • as a stochastic process,
  • as a probabilistic model over kernel hyperparameters.

Gaussian process instance

GP, the Gaussian process type, encapsulates similarity and noise kernels, their parameters, and observation inputs and outputs:

// Type GP is the barebone implementation of GP.
type GP struct {
	NDim                   int       // number of dimensions
	Simil, Noise           Kernel    // kernels
	ThetaSimil, ThetaNoise []float64 // kernel parameters

	X [][]float64 // inputs
	Y []float64   // outputs
}

Public methods defined on GP fall into two groups: Gaussian process fitting and prediction on one hand, and the probabilistic model interface on the other.

Gaussian process methods

Absorb updates the GP with observations; Produce returns predicted outputs for unseen inputs.

func (*GP) Absorb(x [][]float64, y []float64)
func (*GP) Produce(x [][]float64) (mu, sigma []float64)

When kernel parameters are fixed (known or learned), Absorb and Produce are all that is needed for posterior GP inference.

Probabilistic model methods

Observe and Gradient turn a GP instance into an elemental Infergo model (that is, a model with supplied gradient). The model can be passed to inference algorithms or used within another model.

func (*GP) Observe(x []float64) float64
func (*GP) Gradient() []float64 // `elemental' model

If the length of x equals the number of kernel hyperparameters (Simil.NTheta() + Noise.NTheta()), the gradient of the marginal likelihood is computed with respect to the kernel hyperparameters only; in this case, the observations must be provided in the fields of the GP instance. Otherwise, if the length of x is greater than the number of hyperparameters, the rest of x is interpreted as the observations, and the gradient is computed with respect to both the kernel hyperparameters and the observations.
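For illustration, a hypothetical helper (not part of GoGP) can build the argument vector under the layout just described: hyperparameters first, optionally followed by the observations. The concrete flattening of inputs and outputs below is an assumption for illustration and should be checked against the GoGP documentation:

```go
package main

import "fmt"

// pack builds an argument vector: kernel hyperparameters first,
// optionally followed by flattened inputs and then outputs.
// The concrete layout is an illustrative assumption, not GoGP API.
func pack(theta []float64, x [][]float64, y []float64) []float64 {
	args := append([]float64{}, theta...)
	for _, xi := range x {
		args = append(args, xi...)
	}
	return append(args, y...)
}

func main() {
	// hyperparameters only: gradient w.r.t. hyperparameters
	fmt.Println(pack([]float64{0.5, 0.1}, nil, nil))
	// hyperparameters and observations: gradient w.r.t. both
	fmt.Println(pack([]float64{0.5, 0.1},
		[][]float64{{0}, {1}}, []float64{0, 1}))
}
```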

Kernels

There are two kernel kinds:

  • similarity kernel;
  • noise kernel.

Both kinds must satisfy the Kernel interface:

type Kernel interface {
	Observe([]float64) float64
	NTheta() int
}

The Observe method computes the variance or covariance, and the NTheta method returns the number of kernel parameters.

A similarity (or covariance) kernel receives the concatenation of kernel parameters and the coordinates of two points. A noise kernel receives the concatenation of kernel parameters and the coordinates of a single point. Here is an example implementation of the RBF (or normal) kernel:

type RBF struct{}

func (RBF) Observe(x []float64) float64 {
	l, xa, xb := x[0], x[1], x[2]
	d := (xa - xb) / l
	return math.Exp(-d * d / 2)
}

func (RBF) NTheta() int { return 1 }

Case studies

GoGP includes case studies that illustrate, on simple examples, common patterns of GoGP use. We briefly summarize some of them here.

Basic usage

In the basic case, similar to that supported by many Gaussian process libraries, a GP directly serves as the model for inference on hyperparameters (or the hyperparameters can simply be fixed).

The library user specifies the kernel:

type Basic struct{}

func (Basic) Observe(x []float64) float64 {
	return x[0] * kernel.Normal.Observe(x[1:])
}

func (Basic) NTheta() int { return 2 }

and initializes GP with a kernel instance:

gp := &gp.GP{
	NDim:  1,
	Simil: Basic{},
}

MLE inference on hyperparameters and prediction can then be performed through library functions.
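What MLE over hyperparameters amounts to can be sketched without the library: maximizing the log marginal likelihood of the observations. The standalone toy below selects the RBF length scale on a grid; GoGP and Infergo instead use gradient-based optimization, and the data here are made up for illustration:

```go
package main

import (
	"fmt"
	"math"
)

// logML computes the Gaussian process log marginal likelihood of
// observations (X, Y) under a 1D RBF kernel with length scale l
// and noise variance s2.
func logML(X, Y []float64, l, s2 float64) float64 {
	n := len(X)
	// build K = k(X, X) + s2*I
	K := make([][]float64, n)
	for i := range K {
		K[i] = make([]float64, n)
		for j := range K {
			d := (X[i] - X[j]) / l
			K[i][j] = math.Exp(-d * d / 2)
		}
		K[i][i] += s2
	}
	// Gaussian elimination on K and y together; the log-determinant
	// is the sum of the logs of the pivots
	y := append([]float64{}, Y...)
	logDet := 0.0
	for i := 0; i < n; i++ {
		logDet += math.Log(K[i][i])
		for j := i + 1; j < n; j++ {
			f := K[j][i] / K[i][i]
			for k := i; k < n; k++ {
				K[j][k] -= f * K[i][k]
			}
			y[j] -= f * y[i]
		}
	}
	// back substitution for alpha = K^-1 Y
	alpha := make([]float64, n)
	for i := n - 1; i >= 0; i-- {
		s := y[i]
		for k := i + 1; k < n; k++ {
			s -= K[i][k] * alpha[k]
		}
		alpha[i] = s / K[i][i]
	}
	q := 0.0
	for i := range Y {
		q += Y[i] * alpha[i]
	}
	return -0.5*q - 0.5*logDet - 0.5*float64(n)*math.Log(2*math.Pi)
}

func main() {
	// smooth toy data, roughly sin(x)
	X := []float64{0, 0.5, 1, 1.5, 2}
	Y := []float64{0, 0.48, 0.84, 1, 0.91}
	bestL, bestLL := 0.0, math.Inf(-1)
	for l := 0.1; l <= 3.0; l += 0.1 {
		if ll := logML(X, Y, l, 0.01); ll > bestLL {
			bestL, bestLL = l, ll
		}
	}
	fmt.Printf("best length scale: %.1f\n", bestL)
}
```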

Priors on hyperparameters

If priors on hyperparameters are to be specified, the library user provides both the kernel and the model. GP is initialized with the kernel, and then GP and the model are combined for inference in a derived model:

type Model struct {
	gp           *gp.GP
	priors       *Priors
	gGrad, pGrad []float64
}

func (m *Model) Observe(x []float64) float64 {
	var gll, pll float64
	gll, m.gGrad = m.gp.Observe(x), model.Gradient(m.gp)
	pll, m.pGrad = m.priors.Observe(x), model.Gradient(m.priors)
	return gll + pll
}

func (m *Model) Gradient() []float64 {
	for i := range m.pGrad {
		m.gGrad[i] += m.pGrad[i]
	}
	return m.gGrad
}
In Model, gp holds a GP instance, and priors holds an instance of the model expressing beliefs about hyperparameters. A Model instance is used for inference on hyperparameters, and a GP instance for prediction.

Uncertain observation inputs

When observation inputs are uncertain, beliefs about inputs can be specified, and the log-likelihood gradient can be computed with respect to both hyperparameters and observation inputs. For example, in a time series, one can assume that observation inputs come from a renewal process and let the inputs move relative to each other. Then, forecasting can be performed relative to posterior observation inputs.

Non-Gaussian noise

In basic usage, the observation noise is assumed to be Gaussian. This usage is supported by initializing GP with a noise kernel, along with a similarity kernel. When the noise is not Gaussian, an analytical solution for posterior Gaussian process inference does not always exist. However, non-Gaussian noise is straightforwardly supported by GoGP through supplying a model for beliefs on observation outputs:

type Noise struct {
	Y []float64 // noisy outputs
}
func (m *Noise) Observe(x []float64) (ll float64) {
	// Laplacian noise
	for i := range m.Y {
		ll += Expon.Logp(1/math.Exp(x[0]),
			math.Abs(m.Y[i]-x[1+i]))
	}
	return ll
}