normal_feature_dataset(perlabel=50, nlabels=2, nfeatures=4, nchunks=5, means=None, nonbogus_features=None, snr=3.0, normalize=True)
Generate a univariate dataset with normal noise and specified means.
Could be considered to be a generalization of
pure_multivariate_signal where means=[ [0,1], [1,0] ].
Specify either means or nonbogus_features so that means get assigned accordingly.
If neither means nor nonbogus_features is provided, the data will be pure noise with no per-label information.
perlabel : int, optional
Number of samples per label
nlabels : int, optional
Number of labels in the dataset
nfeatures : int, optional
Total number of features (including bogus features which carry no label-related signal)
nchunks : int, optional
Number of chunks (perlabel should be a multiple of nchunks)
means : None or ndarray of (nlabels, nfeatures) shape
Means of each feature (columns) for each label (rows).
nonbogus_features : None or list of int
Indexes of non-bogus features (1 per label).
snr : float, optional
Signal-to-noise ratio. The signal is assumed to have a standard deviation of 1.0, so the random normal noise is simply divided by snr.
normalize : bool, optional
Divide by the max(abs()) value to bring the data into the [-1, 1] range.
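
The behaviour described above can be illustrated with a minimal NumPy-only sketch. It is not the library's implementation: the function name sketch_normal_features and the seed argument are hypothetical, and both the mean of 1.0 given to each non-bogus feature and the contiguous chunk layout are assumptions made for illustration.

```python
import numpy as np


def sketch_normal_features(perlabel=50, nlabels=2, nfeatures=4, nchunks=5,
                           means=None, nonbogus_features=None, snr=3.0,
                           normalize=True, seed=0):
    """Hypothetical imitation of the documented parameters (not the real function)."""
    if perlabel % nchunks:
        raise ValueError("perlabel should be a multiple of nchunks")
    rng = np.random.default_rng(seed)

    # Unit-variance normal noise, divided by snr as the snr entry describes.
    data = rng.standard_normal((perlabel * nlabels, nfeatures))
    if snr:
        data /= snr

    # Assumption: each non-bogus feature gets a mean of 1.0 for its own label.
    if means is None and nonbogus_features is not None:
        means = np.zeros((nlabels, nfeatures))
        for label, feature in enumerate(nonbogus_features):
            means[label, feature] = 1.0

    # Add the per-label means; with neither argument given, data stays pure noise.
    if means is not None:
        data += np.repeat(np.asarray(means, dtype=float), perlabel, axis=0)

    # Scale by the largest absolute value so all entries fall in [-1, 1].
    if normalize:
        data /= np.max(np.abs(data))

    labels = np.repeat(np.arange(nlabels), perlabel)
    # Assumption: chunks are contiguous, equally sized blocks within each label.
    chunks = np.tile(np.repeat(np.arange(nchunks), perlabel // nchunks), nlabels)
    return data, labels, chunks


data, labels, chunks = sketch_normal_features(nonbogus_features=[0, 1])
print(data.shape)  # (100, 4)
```

With nonbogus_features=[0, 1], feature 0 carries the signal for the first label and feature 1 for the second, while features 2 and 3 remain bogus, which corresponds to means=[[1, 0, 0, 0], [0, 1, 0, 0]] in this sketch.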