The Outlier trait specifies an outlier detection operation to be defined by the objects implementing it, i.e.,

    DistanceOutlier - outlier = beyond 'STDEV_CUTOFF' units from mean
    QuantileOutlier - outlier = in the 'PERCENTILE' tails of the distribution
    QuartileOutlier - outlier = 'X_MULTIPLIER' times beyond the middle two quartiles

Leaving extreme values in datasets that are highly unlikely to represent legitimate values will reduce the quality of models. However, removing legitimate extreme values will only make the model appear to be good, and it may fail in the real world.
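The three rules listed above can be illustrated with a minimal, self-contained sketch. It does not reproduce the library's actual API: the names OutlierRule, Stats, DistanceRule, QuantileRule and QuartileRule, the keep method, the use of Vector[Double], and the default parameter values are all hypothetical. The quartile rule is read here as "more than the multiplier times the interquartile range beyond the first or third quartile", which is one common interpretation and may differ from what QuartileOutlier really does. Inputs are assumed to hold at least two values.

```scala
// Hypothetical sketch: names, signatures, and defaults are assumptions,
// not the library's actual API.
trait OutlierRule {
  /** Return the values of x that are NOT flagged as outliers. */
  def keep(x: Vector[Double]): Vector[Double]
}

object Stats {
  def mean(x: Vector[Double]): Double = x.sum / x.size

  // Sample standard deviation (divides by n - 1); assumes x.size >= 2.
  def stdev(x: Vector[Double]): Double = {
    val m = mean(x)
    math.sqrt(x.map(v => (v - m) * (v - m)).sum / (x.size - 1))
  }

  // Quantile by linear interpolation between order statistics, p in [0, 1].
  def quantile(x: Vector[Double], p: Double): Double = {
    val s   = x.sorted
    val pos = p * (s.size - 1)
    val lo  = math.floor(pos).toInt
    val hi  = math.ceil(pos).toInt
    s(lo) + (pos - lo) * (s(hi) - s(lo))
  }
}

// Outlier = more than `cutoff` standard deviations away from the mean.
final case class DistanceRule(cutoff: Double = 3.0) extends OutlierRule {
  def keep(x: Vector[Double]): Vector[Double] = {
    val m = Stats.mean(x)
    val s = Stats.stdev(x)
    x.filter(v => math.abs(v - m) <= cutoff * s)
  }
}

// Outlier = inside the lower or upper `tail` fraction of the distribution.
final case class QuantileRule(tail: Double = 0.05) extends OutlierRule {
  def keep(x: Vector[Double]): Vector[Double] = {
    val lo = Stats.quantile(x, tail)
    val hi = Stats.quantile(x, 1.0 - tail)
    x.filter(v => v >= lo && v <= hi)
  }
}

// Outlier = more than `multiplier` * IQR below Q1 or above Q3 (assumed reading).
final case class QuartileRule(multiplier: Double = 1.5) extends OutlierRule {
  def keep(x: Vector[Double]): Vector[Double] = {
    val q1  = Stats.quantile(x, 0.25)
    val q3  = Stats.quantile(x, 0.75)
    val iqr = q3 - q1
    x.filter(v => v >= q1 - multiplier * iqr && v <= q3 + multiplier * iqr)
  }
}

object OutlierDemo {
  def main(args: Array[String]): Unit = {
    val data = Vector(1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 100.0)
    println(DistanceRule(2.0).keep(data))   // drops 100.0
    println(QuantileRule(0.1).keep(data))   // drops 1.0 and 100.0
    println(QuartileRule(1.5).keep(data))   // drops 100.0
  }
}
```

On this sample the distance rule with a cutoff of 3.0 would keep the value 100.0, because that single extreme value inflates the standard deviation enough to hide itself. The quantile rule, by contrast, always trims both tails, so it also discards the legitimate value 1.0. Both behaviors illustrate the trade-off described above between keeping implausible values and removing legitimate ones.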
Attributes
See also
Imputation as an alternative to removal of outliers