woodwork.column_accessor.WoodworkColumnAccessor.box_plot_dict
- WoodworkColumnAccessor.box_plot_dict(quantiles=None, include_indices_and_values=True)
Gets the information necessary to create a box and whisker plot with outliers for a numeric column using the IQR method.
- Parameters
quantiles (dict[float -> float], optional) – A dictionary of quantiles for the data, where each key is a quantile and each value is that quantile’s value in the data. If no quantiles are provided, they will be computed from the data.
include_indices_and_values (bool, optional) – Whether or not the lists containing individual outlier values and their indices will be included in the returned dictionary. Defaults to True.
Note
The minimum quantiles necessary for outlier detection using the IQR method are the first quartile (0.25) and third quartile (0.75). If these keys are missing from the quantiles dictionary, the following quantiles will be calculated: {0.0, 0.25, 0.5, 0.75, 1.0}, which correspond to {min, first quartile, median, third quartile, max}.
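As a rough illustration of the note above, the sketch below derives IQR-based bounds from a quantiles dictionary that contains the 0.25 and 0.75 keys. The helper name `iqr_bounds` is hypothetical, and the 1.5 multiplier is the conventional choice for the IQR method, assumed here because this page does not state the multiplier Woodwork uses.

```python
def iqr_bounds(quantiles, multiplier=1.5):
    """Hypothetical helper: compute IQR fences from a {quantile: value} dict."""
    # The first and third quartiles are the minimum needed for the IQR method.
    if 0.25 not in quantiles or 0.75 not in quantiles:
        raise KeyError("quantiles must contain the keys 0.25 and 0.75")
    q1, q3 = quantiles[0.25], quantiles[0.75]
    iqr = q3 - q1
    # multiplier=1.5 is the conventional value, assumed for this sketch.
    return q1 - multiplier * iqr, q3 + multiplier * iqr


# Same shape as the `quantiles` parameter: quantile -> value in the data.
quantiles = {0.0: 1.0, 0.25: 2.0, 0.5: 3.0, 0.75: 4.0, 1.0: 100.0}
low_bound, high_bound = iqr_bounds(quantiles)
print(low_bound, high_bound)
```

Under these assumptions, any value below `low_bound` or above `high_bound` would be treated as an outlier.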
- Returns
- Returns a dictionary containing box plot information for the Series.
The following elements will be found in the dictionary:
- low_bound (float): the lower bound below which outliers lie, to be used as a whisker
- high_bound (float): the upper bound above which outliers lie, to be used as a whisker
- quantiles (list[float]): the quantiles used to determine the bounds.
If quantiles were passed in, will contain all quantiles passed in. Otherwise, contains the five quantiles {0.0, 0.25, 0.5, 0.75, 1.0}.
- low_values (list[float, int], optional): the values of the lower outliers.
Will not be included if include_indices_and_values is False.
- high_values (list[float, int], optional): the values of the upper outliers.
Will not be included if include_indices_and_values is False.
- low_indices (list[int], optional): the corresponding index values for each of the lower outliers.
Will not be included if include_indices_and_values is False.
- high_indices (list[int], optional): the corresponding index values for each of the upper outliers.
Will not be included if include_indices_and_values is False.
- Return type
(dict[str -> float,list[number]])
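To make the shape of the returned dictionary concrete, here is a minimal pure-Python sketch that builds a dictionary with the keys listed above. It is not Woodwork's implementation: the function name `box_plot_sketch`, the linear-interpolation quantile, the 1.5 × IQR fences, positional indices, and storing `quantiles` as a quantile-to-value mapping are all assumptions made for illustration.

```python
def box_plot_sketch(values, include_indices_and_values=True):
    """Hypothetical stand-in that mimics the documented return keys."""
    ordered = sorted(values)

    def quantile(q):
        # Linear interpolation between the two nearest ranks (an assumption).
        pos = q * (len(ordered) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(ordered) - 1)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

    # Default quantiles named in the note: min, Q1, median, Q3, max.
    quantiles = {q: quantile(q) for q in (0.0, 0.25, 0.5, 0.75, 1.0)}
    iqr = quantiles[0.75] - quantiles[0.25]
    # 1.5 * IQR is the conventional fence width, assumed here.
    low_bound = quantiles[0.25] - 1.5 * iqr
    high_bound = quantiles[0.75] + 1.5 * iqr

    result = {"low_bound": low_bound, "high_bound": high_bound, "quantiles": quantiles}
    if include_indices_and_values:
        # Positional indices stand in for the Series index in this sketch.
        low = [(i, v) for i, v in enumerate(values) if v < low_bound]
        high = [(i, v) for i, v in enumerate(values) if v > high_bound]
        result["low_values"] = [v for _, v in low]
        result["high_values"] = [v for _, v in high]
        result["low_indices"] = [i for i, _ in low]
        result["high_indices"] = [i for i, _ in high]
    return result


data = [10, 12, 11, 13, 12, 95, 11, -40, 12]
info = box_plot_sketch(data)
print(info["low_bound"], info["high_bound"])
print(info["low_values"], info["low_indices"])
print(info["high_values"], info["high_indices"])

# With the flag off, only the bounds and quantiles are present.
summary = box_plot_sketch(data, include_indices_and_values=False)
print(sorted(summary))
```

Passing `include_indices_and_values=False` mirrors the documented behavior of omitting the four outlier lists, which keeps the result small when only the whisker positions are needed.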