This content refers to an unreleased development version of PyMVPA
To provide the most recent news and documentation www.pymvpa.org reflects the development 0.6 series of PyMVPA. If you are interested in the documentation of the previous stable 0.4 series of PyMVPA, please visit v04.pymvpa.org.

The effect of different hyperparameters in GPR

The following example runs Gaussian Process Regression (GPR) on a simple 1D dataset using squared exponential (i.e., Gaussian or RBF) kernel and different hyperparameters. The resulting classifier solutions are finally visualized in a single figure.

As usual we start by importing all of PyMVPA:

# Lets use LaTeX for proper rendering of greek
from matplotlib import rc
rc('text', usetex=True)

from mvpa2.suite import *

The next lines build two datasets using one of PyMVPA’s data generators.

# Generate dataset for training:
train_size = 40
F = 1
dataset = data_generators.sin_modulated(train_size, F)

# Generate dataset for testing:
test_size = 100
dataset_test = data_generators.sin_modulated(test_size, F, flat=True)

The last configuration step is the definition of four sets of hyperparameters to be used for GPR.

# Hyperparameters. Each row is [sigma_f, length_scale, sigma_noise]
hyperparameters = np.array([[1.0, 0.2, 0.4],
                           [1.0, 0.1, 0.1],
                           [1.0, 1.0, 0.1],
                           [1.0, 0.1, 1.0]])

The plotting of the final figure and the actually GPR runs are performed in a single loop.

rows = 2
columns = 2
pl.figure(figsize=(12, 12))
for i in range(rows*columns):
    pl.subplot(rows, columns, i+1)
    regression = True
    logml = True

    data_train = dataset.samples
    label_train = dataset.sa.targets
    data_test = dataset_test.samples
    label_test = dataset_test.sa.targets

The next lines configure a squared exponential kernel with the set of hyperparameters for the current subplot and assign the kernel to the GPR instance.

»    sigma_f, length_scale, sigma_noise = hyperparameters[i, :]
     kse = SquaredExponentialKernel(length_scale=length_scale,
                                    sigma_f=sigma_f)
     g = GPR(kse, sigma_noise=sigma_noise)
     if not regression:
         g = RegressionAsClassifier(g)
     print g

     if regression:
         g.ca.enable("predicted_variances")

     if logml:
         g.ca.enable("log_marginal_likelihood")

After training GPR the predictions are queried by passing the test dataset samples and accuracy measures are computed.

»    g.train(dataset)
     prediction = g.predict(data_test)

     # print label_test
     # print prediction
     accuracy = None
     if regression:
         accuracy = np.sqrt(((prediction-label_test)**2).sum()/prediction.size)
         print "RMSE:", accuracy
     else:
         accuracy = (prediction.astype('l')==label_test.astype('l')).sum() \
                    / float(prediction.size)
         print "accuracy:", accuracy

The remaining code simply plots both training and test datasets, as well as the GPR solutions.

»    if F == 1:
         pl.title(r"$\sigma_f=%0.2f$, $length_s=%0.2f$, $\sigma_n=%0.2f$" \
                 % (sigma_f,length_scale,sigma_noise))
         pl.plot(data_train, label_train, "ro", label="train")
         pl.plot(data_test, prediction, "b-", label="prediction")
         pl.plot(data_test, label_test, "g+", label="test")
         if regression:
             pl.plot(data_test, prediction - np.sqrt(g.ca.predicted_variances),
                        "b--", label=None)
             pl.plot(data_test, prediction+np.sqrt(g.ca.predicted_variances),
                        "b--", label=None)
             pl.text(0.5, -0.8, "$RMSE=%.3f$" %(accuracy))
             pl.text(0.5, -0.95, "$LML=%.3f$" %(g.ca.log_marginal_likelihood))
         else:
             pl.text(0.5, -0.8, "$accuracy=%s" % accuracy)

         pl.legend(loc='lower right')

     print "LML:", g.ca.log_marginal_likelihood

See also

The full source code of this example is included in the PyMVPA source distribution (doc/examples/gpr.py).

Previous topic

Classifier Sweep

Next topic

Simple model selection: grid search for GPR

NeuroDebian

NITRC-listed