VectorSizeNormaliser

class frlearn.feature_preprocessors.VectorSizeNormaliser(measure: str = 'boscovich')

Rescales each instance (seen as a vector) to size 1. Typically used on datasets of frequency counts, when only the relative frequencies are considered important, e.g. token counts of texts in NLP.
Parameters
measure: str or float or (np.array -> float) = 'boscovich'
The vector size measure to use. A float is interpreted as Minkowski size with the corresponding value for p. For convenience, a number of popular measures can be referred to by name.
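As a rough illustration of how a float value for `measure` can be read, the sketch below computes the Minkowski size of a vector for a given p with plain NumPy. It follows the standard textbook definition and is not taken from the library's implementation; the helper name `minkowski_size` is hypothetical.

```python
import numpy as np

def minkowski_size(x, p):
    """Minkowski size (p-norm) of a 1-D vector: (sum |x_i|^p)^(1/p).

    Hypothetical helper for illustration only, not part of the library.
    """
    x = np.abs(np.asarray(x, dtype=float))
    if np.isinf(p):
        # Limit p -> infinity: the largest absolute value.
        return float(x.max())
    return float(np.sum(x ** p) ** (1.0 / p))

x = [3.0, -4.0]
size_1 = minkowski_size(x, 1)         # sum of absolute values
size_2 = minkowski_size(x, 2)         # Euclidean length
size_inf = minkowski_size(x, np.inf)  # maximum absolute value
```

Smaller values of p give more weight to the many small components of a vector, while larger values let the largest component dominate.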
Notes
If the size of an instance is 0, it will be left unscaled. If the size of an instance is ∞, it will be scaled to 0.
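The edge cases described in these notes can be sketched with plain NumPy. This is an illustrative sketch rather than the library's implementation: it assumes the size of an instance is the sum of its absolute values (Minkowski size with p = 1), and the function name `normalise_rows` is hypothetical.

```python
import numpy as np

def normalise_rows(X):
    """Rescale each row of X to size 1 under the 1-norm (an assumed measure).

    Hypothetical helper for illustration only, not part of the library.
    """
    X = np.asarray(X, dtype=float)
    sizes = np.linalg.norm(X, ord=1, axis=1)
    # Rows of size 0 are left unscaled: divide by 1 instead of 0.
    divisors = np.where(sizes == 0, 1.0, sizes)
    with np.errstate(invalid='ignore'):
        scaled = X / divisors[:, None]
    # Rows of infinite size are scaled to 0 (inf / inf would otherwise give nan).
    scaled[np.isinf(sizes)] = 0.0
    return scaled

X = np.array([
    [2.0, 6.0, 0.0],     # ordinary row of counts
    [0.0, 0.0, 0.0],     # size 0: left unscaled
    [1.0, np.inf, 3.0],  # size infinity: scaled to 0
])
X_norm = normalise_rows(X)
```

In this sketch the first row becomes the relative frequencies of its counts, and the other two rows both end up as all zeros, for different reasons: the zero row is returned unchanged, and the infinite row is collapsed to 0.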