woodwork.datatable.DataTable.mutual_information_dict

DataTable.mutual_information_dict(num_bins=10, nrows=None, include_index=False)

Calculates mutual information between all pairs of columns in the DataTable that support mutual information. The Logical Types that support mutual information are: Boolean, Categorical, CountryCode, Datetime, Double, Integer, Ordinal, SubRegionCode, and ZIPCode.

Parameters
  • num_bins (int) – The number of bins to use when converting numeric features into categorical features. Defaults to 10.

  • nrows (int) – The number of rows to sample when determining mutual information. If specified, that many rows are sampled from the data. Defaults to None, which uses all rows.

  • include_index (bool) – If True, the column specified as the index will be included as long as its LogicalType is valid for mutual information calculations. If False, the index column will not have mutual information calculated for it. Defaults to False.
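
To make the role of num_bins concrete, the following is a minimal, standard-library-only sketch of the general idea: numeric values are binned into categories, and a mutual information score normalized to the range 0 to 1 is then computed for a pair of columns. It is an illustration, not woodwork's implementation. The helper names (bin_numeric, entropy, normalized_mutual_info), the equal-width binning strategy, and the normalization by the mean of the two entropies are all assumptions made for this example; this page does not state which binning or normalization the library uses.

```python
import math
from collections import Counter


def bin_numeric(values, num_bins=10):
    """Map each numeric value to an equal-width bin label in [0, num_bins - 1].

    Equal-width binning is an assumption of this sketch.
    """
    lo, hi = min(values), max(values)
    if lo == hi:
        # A constant column falls into a single bin.
        return [0] * len(values)
    width = (hi - lo) / num_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), num_bins - 1) for v in values]


def entropy(labels):
    """Shannon entropy (natural log) of a sequence of category labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_info(x, y):
    """Mutual information of two label sequences, scaled to [0, 1].

    Normalizing by the mean of the two entropies is an assumption of this
    sketch. Two constant columns are treated as having a score of 0.0 here.
    """
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    mi = sum(
        (c / n) * math.log((c * n) / (px[a] * py[b]))
        for (a, b), c in pxy.items()
    )
    denom = (entropy(x) + entropy(y)) / 2
    return mi / denom if denom > 0 else 0.0


# Hypothetical numeric columns, binned before scoring.
heights = [150, 160, 170, 180, 190, 200]
weights = [50, 58, 66, 79, 85, 97]
score = normalized_mutual_info(
    bin_numeric(heights, num_bins=3),
    bin_numeric(weights, num_bins=3),
)
print(score)
```

In this sketch, a smaller num_bins groups more values into each category and a larger one separates them more finely, so the score for a numeric pair can change with the bin count.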

Returns

A list of dictionaries, each with the keys column_1, column_2, and mutual_info, sorted in descending order by mutual_info. Mutual information values range from 0 (no mutual information) to 1 (perfect dependency).

Return type

list(dict)
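
As a rough sketch of how the returned list might be consumed, the example below works on a hand-written list in the shape described above. The column names and scores are invented for illustration and are not output from the library; pairs_above is a hypothetical helper, not part of woodwork.

```python
# Hypothetical return value: keys match the documented shape, but the
# column names and scores are made up for this example.
result = [
    {"column_1": "age", "column_2": "income", "mutual_info": 0.42},
    {"column_1": "age", "column_2": "zip_code", "mutual_info": 0.17},
    {"column_1": "income", "column_2": "zip_code", "mutual_info": 0.05},
]


def pairs_above(entries, threshold):
    """Return (column_1, column_2) tuples whose score meets the threshold."""
    return [
        (entry["column_1"], entry["column_2"])
        for entry in entries
        if entry["mutual_info"] >= threshold
    ]


# Because the list is sorted in descending order, the first entry is the
# most strongly related pair.
strongest = result[0]
print(strongest["column_1"], strongest["column_2"], strongest["mutual_info"])
print(pairs_above(result, 0.1))
```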