VectorSizeNormaliser

class frlearn.feature_preprocessors.VectorSizeNormaliser(measure: str = 'boscovich')[source]

Rescales each instance (seen as a vector) to size 1. Typically used on datasets of frequency counts, when only the relative frequencies are considered important, e.g. token counts of texts in NLP.

Parameters
measure: str or float or (np.array -> float) = 'boscovich'

The vector size measure to use. A float is interpreted as the Minkowski size with that value for p. A callable should map a vector to its size. For convenience, a number of popular measures can be referred to by name.
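To illustrate what rescaling by a Minkowski size amounts to, here is a minimal NumPy sketch. It is not the library's implementation: the helper names `minkowski_size` and `normalise_rows` are hypothetical, and it assumes that 'boscovich' denotes the sum of absolute values (Minkowski size with p = 1).

```python
import numpy as np


def minkowski_size(x, p):
    """Minkowski size of a vector: (sum |x_i|^p)^(1/p), or max |x_i| for p = inf."""
    x = np.abs(np.asarray(x, dtype=float))
    if np.isinf(p):
        return x.max()
    return (x ** p).sum() ** (1.0 / p)


def normalise_rows(X, p=1.0):
    """Divide each row of X by its Minkowski size (hypothetical helper).

    Assumes every row has a finite, non-zero size.
    """
    X = np.asarray(X, dtype=float)
    sizes = np.array([minkowski_size(row, p) for row in X])
    return X / sizes[:, None]


# Token counts for two texts; with p = 1 each row becomes relative frequencies.
counts = np.array([[2.0, 1.0, 1.0],
                   [10.0, 0.0, 30.0]])
relative = normalise_rows(counts, p=1.0)
print(relative)
```

With p = 1 and non-negative counts, each rescaled row sums to 1, so the second text is no longer weighted more heavily just because it is longer.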

Notes

If the size of an instance is 0, it will be left unscaled. If the size of an instance is ∞, it will be scaled to 0.
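The two edge cases described above could be handled along the following lines. This is a sketch under assumptions, not the library's code: `scale_to_unit_size` is a hypothetical helper, and the size measure is passed in as a plain callable.

```python
import numpy as np


def scale_to_unit_size(X, size):
    """Rescale each row of X to size 1 (hypothetical helper).

    Rows of size 0 are left unscaled; rows of infinite size are set to 0.
    """
    X = np.asarray(X, dtype=float)
    sizes = np.array([size(row) for row in X], dtype=float)
    out = X.copy()

    # Ordinary rows: divide by the (finite, positive) size.
    regular = np.isfinite(sizes) & (sizes > 0)
    out[regular] = X[regular] / sizes[regular][:, None]

    # Rows of infinite size are scaled to 0.
    out[np.isinf(sizes)] = 0.0

    # Rows of size 0 are untouched, which avoids dividing by zero.
    return out


X = np.array([[3.0, 1.0],
              [0.0, 0.0],
              [np.inf, 2.0]])
scaled = scale_to_unit_size(X, lambda row: np.abs(row).sum())
print(scaled)
```

Handling these rows separately avoids the NaN values that a plain division would produce for 0 / 0 and ∞ / ∞.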

class Model[source]