This function marks outliers in the input vector.
detect_outliers(
x,
apriori,
...,
plot = FALSE,
verbose = FALSE,
title = NULL,
timestamps = NULL
)
# S4 method for numeric,missing
detect_outliers(x, plot, verbose, title, timestamps)
# S4 method for numeric,Apriori
detect_outliers(x, apriori, plot, verbose, title, timestamps)
x
: Numeric vector of values.
apriori
: Object of class Apriori.
...
: Optional parameters, depending on signature:
plot
: If TRUE, prints comprehensive plots.
verbose
: If TRUE, prints comprehensive information.
title
: Adds a title to the plot.
timestamps
: Timestamp vector. For air pressure, timestamps are of no importance except for the appearance of the scatter plot when plot = TRUE. If they contain duplicates or NA values, only warnings are raised, which might suggest that something is wrong with x. For hydrostatic pressure, timestamps are important: an error is raised if timestamps are not supplied, or if any of the timestamps are NA or duplicated.
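The warn-versus-error rule for timestamps can be illustrated with a small base R sketch. It is not the package's internal code: the helper check_timestamps() and its strict argument are hypothetical names introduced here, with strict = FALSE standing in for the air pressure case and strict = TRUE for the hydrostatic pressure case.

```r
# Hypothetical helper for illustration only; not part of the package.
# strict = FALSE: problems raise a warning (air pressure case).
# strict = TRUE:  problems raise an error (hydrostatic pressure case).
check_timestamps <- function(timestamps, strict = FALSE) {
  if (is.null(timestamps)) {
    if (strict) stop("timestamps must be supplied")
    return(invisible(TRUE))
  }
  problems <- character(0)
  if (anyNA(timestamps)) problems <- c(problems, "NA values")
  if (anyDuplicated(timestamps) > 0) problems <- c(problems, "duplicates")
  if (length(problems) > 0) {
    msg <- paste("timestamps contain", paste(problems, collapse = " and "))
    if (strict) stop(msg) else warning(msg)
  }
  invisible(length(problems) == 0)
}

# A timestamp vector with one duplicate and one NA.
ts <- as.POSIXct(c("2020-01-01 00:00:00", "2020-01-01 00:00:00", NA), tz = "UTC")

# Capture which condition each mode signals.
lenient_result <- tryCatch(check_timestamps(ts), warning = function(w) "warning")
strict_result <- tryCatch(check_timestamps(ts, strict = TRUE), error = function(e) "error")
missing_result <- tryCatch(check_timestamps(NULL, strict = TRUE), error = function(e) "error")
```

Wrapping the call in tryCatch() as above is one way to inspect the condition without interrupting a script.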
Logical vector of the same length as x, with TRUE marking an outlier.
x = numeric,apriori = missing
: Only considers x, without any a-priori information.
A normal distribution is assumed with mean and variance estimated using
median and MAD as described in Leys, 2013.
x = numeric,apriori = Apriori
: Takes a-priori information about x into consideration.
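The median and MAD approach mentioned for the numeric,missing method can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the helper mad_outliers() is a hypothetical name, and the cutoff of 2.5 is chosen here only for demonstration; the cutoff actually used by detect_outliers is not stated on this page.

```r
# Hypothetical helper for illustration only; not part of the package.
# Flags values whose robust z-score exceeds a cutoff. The cutoff of 2.5
# is an assumption made for this sketch.
mad_outliers <- function(x, cutoff = 2.5) {
  # Robust estimate of the center.
  center <- stats::median(x, na.rm = TRUE)
  # stats::mad() multiplies by 1.4826 by default, so that it estimates
  # the standard deviation when the data are normally distributed.
  spread <- stats::mad(x, na.rm = TRUE)
  abs(x - center) / spread > cutoff
}

x <- c(1:9, 100)
flags <- mad_outliers(x)
which(flags)
#> [1] 10
```

Because the median and MAD are barely affected by the value 100, it stands out clearly, whereas the ordinary mean and standard deviation would be inflated by that same value. Note that this sketch does not handle the case where the MAD is zero.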
# In case of a vector:
x <- c(1:9, 100)
detect_outliers(x)
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
# In case of a dataframe, select the column:
df <- data.frame('x' = x)
detect_outliers(df$x)
#>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE
# Or use the tidyverse approach:
library(magrittr)
df %>% dplyr::mutate("outlier" = detect_outliers(x))
#>      x outlier
#> 1    1   FALSE
#> 2    2   FALSE
#> 3    3   FALSE
#> 4    4   FALSE
#> 5    5   FALSE
#> 6    6   FALSE
#> 7    7   FALSE
#> 8    8   FALSE
#> 9    9   FALSE
#> 10 100    TRUE