woodwork.column_accessor.WoodworkColumnAccessor.box_plot_dict

WoodworkColumnAccessor.box_plot_dict(quantiles=None, include_indices_and_values=True)

Gets the information necessary to create a box and whisker plot with outliers for a numeric column using the IQR method.

Parameters
  • quantiles (dict[float -> float], optional) – A dictionary containing the quantiles for the data, where each key is a quantile and each value is that quantile’s value for the data. If no quantiles are provided, they will be computed from the data.

  • include_indices_and_values (bool, optional) – Whether or not the lists containing individual outlier values and their indices will be included in the returned dictionary. Defaults to True.

Note

The minimum quantiles necessary for outlier detection using the IQR method are the first quartile (0.25) and third quartile (0.75). If these keys are missing from the quantiles dictionary, the following quantiles will be calculated: {0.0, 0.25, 0.5, 0.75, 1.0}, which correspond to {min, first quartile, median, third quartile, max}.
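As a rough illustration of why only the 0.25 and 0.75 quantiles are strictly required, the sketch below derives whisker bounds from a quantiles dictionary. It assumes the conventional 1.5 × IQR rule, which this page does not state explicitly, and `iqr_bounds` is a hypothetical helper, not part of Woodwork.

```python
def iqr_bounds(quantiles, multiplier=1.5):
    """Return (low_bound, high_bound) from a {quantile: value} dict.

    Hypothetical helper for illustration only. The 1.5 multiplier is the
    conventional IQR rule and is an assumption, not taken from this page.
    """
    q1 = quantiles[0.25]  # first quartile
    q3 = quantiles[0.75]  # third quartile
    iqr = q3 - q1         # interquartile range
    return q1 - multiplier * iqr, q3 + multiplier * iqr


# Made-up quantiles in the same shape as the `quantiles` parameter.
quantiles = {0.0: 1.0, 0.25: 4.0, 0.5: 6.0, 0.75: 8.0, 1.0: 40.0}
low_bound, high_bound = iqr_bounds(quantiles)
print(low_bound, high_bound)
```

Only the 0.25 and 0.75 entries are read. The other three quantiles are what a box plot needs to draw the median line and the data extremes.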

Returns

Returns a dictionary containing box plot information for the Series.

The following elements will be found in the dictionary:

  • low_bound (float): the lower bound below which outliers lie - to be used as a whisker

  • high_bound (float): the upper bound above which outliers lie - to be used as a whisker

  • quantiles (list[float]): the quantiles used to determine the bounds.

    If quantiles were passed in, will contain all quantiles passed in. Otherwise, contains the five quantiles {0.0, 0.25, 0.5, 0.75, 1.0}.

  • low_values (list[float, int], optional): the values of the lower outliers.

    Will not be included if include_indices_and_values is False.

  • high_values (list[float, int], optional): the values of the upper outliers

    Will not be included if include_indices_and_values is False.

  • low_indices (list[int], optional): the corresponding index values for each of the lower outliers

    Will not be included if include_indices_and_values is False.

  • high_indices (list[int], optional): the corresponding index values for each of the upper outliers

    Will not be included if include_indices_and_values is False.

Return type

(dict[str -> float,list[number]])
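
Because the four outlier lists are present only when include_indices_and_values is True, code that consumes the returned dictionary may want to treat them as optional. The following is a minimal sketch under that assumption. The dictionaries are hand-written stand-ins with illustrative values rather than real output, the quantiles entry is left out for brevity, and `outlier_points` is a hypothetical helper, not part of Woodwork.

```python
def outlier_points(box_plot_info):
    """Pair each outlier index with its value.

    Returns None when the optional lists were not included, so that
    "no outliers" and "outliers not reported" stay distinguishable.
    """
    keys = ("low_indices", "low_values", "high_indices", "high_values")
    if not all(key in box_plot_info for key in keys):
        return None
    low = zip(box_plot_info["low_indices"], box_plot_info["low_values"])
    high = zip(box_plot_info["high_indices"], box_plot_info["high_values"])
    return list(low) + list(high)


# Stand-in for a result with include_indices_and_values=True.
# With Woodwork, this would come from a call such as series.ww.box_plot_dict().
with_values = {
    "low_bound": -2.0,
    "high_bound": 14.0,
    "low_values": [],
    "high_values": [40.0],
    "low_indices": [],
    "high_indices": [9],
}

# Stand-in for a result with include_indices_and_values=False.
without_values = {"low_bound": -2.0, "high_bound": 14.0}

print(outlier_points(with_values))
print(outlier_points(without_values))
```

Returning None rather than an empty list is a design choice in this sketch: an empty list would suggest the column has no outliers, when in fact the outliers were simply not requested.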