mvpa2.datasets.sources.skl_data.skl_friedman1

mvpa2.datasets.sources.skl_data.skl_friedman1(n_samples=100, n_features=10, noise=0.0, random_state=None)

Generate the “Friedman #1” regression problem

This dataset is described in Friedman [1] and Breiman [2].

Inputs X are independent features uniformly distributed on the interval [0, 1]. The output y is created according to the formula:

y(X) = 10 * sin(pi * X[:, 0] * X[:, 1]) + 20 * (X[:, 2] - 0.5) ** 2 + 10 * X[:, 3] + 5 * X[:, 4] + noise * N(0, 1).

Out of the n_features features, only 5 are actually used to compute y. The remaining features are independent of y.

The number of features has to be >= 5.

Parameters:

n_samples : int, optional (default=100)

The number of samples.

n_features : int, optional (default=10)

The number of features. Should be at least 5.

noise : float, optional (default=0.0)

The standard deviation of the gaussian noise applied to the output.

random_state : int, RandomState instance or None, optional (default=None)

If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

Returns:

X : array of shape [n_samples, n_features]

The input samples.

y : array of shape [n_samples]

The output values.

Notes

This function has been auto-generated by wrapping make_friedman1() from the sklearn package. The documentation of this function has been kept verbatim. Consequently, the actual return value is not as described in the documentation, but the data is returned as a PyMVPA dataset.

References

[R25]J. Friedman, “Multivariate adaptive regression splines”, The Annals of Statistics 19 (1), pages 1-67, 1991.
[R26]L. Breiman, “Bagging predictors”, Machine Learning 24, pages 123-140, 1996.