woodwork.data_table.DataTable
__init__
Create a DataTable.
dataframe (pd.DataFrame) – Dataframe providing the data for the datatable.
name (str, optional) – Name used to identify the datatable.
index (str, optional) – Name of the index column in the dataframe.
time_index (str, optional) – Name of the time index column in the dataframe.
semantic_tags (dict, optional) – Dictionary mapping column names in the dataframe to the semantic tags for each column. The keys should be strings that correspond to columns in the underlying dataframe. Each value can take one of two forms:
    (str) – a single string, if only one semantic tag is being set.
    (list[str] or set[str]) – a list or set of strings, if multiple tags are being set.
Semantic tags will be set to an empty set for any column not included in the dictionary.
logical_types (dict[str -> LogicalType], optional) – Dictionary mapping column names in the dataframe to the LogicalType for the column. LogicalTypes will be inferred for any columns not present in the dictionary.
copy_dataframe (bool, optional) – If True, a copy of the input dataframe will be made prior to creating the DataTable. Defaults to False, which results in using a reference to the input dataframe.
use_standard_tags (bool, optional) – If True, will add standard semantic tags to columns based on the inferred or specified logical type for the column. Defaults to True.
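The accepted value forms for semantic_tags can be illustrated with a small pure-Python sketch. The helper name normalize_semantic_tags and the sample column names are hypothetical and not part of woodwork; the function only mirrors the rules described above (a single string becomes a one-element set, a list or set becomes a set, and columns missing from the dictionary get an empty set). It ignores any standard tags that use_standard_tags might add, and rejecting unknown keys is an assumption of this sketch.

```python
def normalize_semantic_tags(columns, semantic_tags=None):
    """Hypothetical helper (not woodwork API) mirroring the documented rules."""
    semantic_tags = semantic_tags or {}

    # Assumption: the source only says keys "should" match dataframe columns;
    # this sketch chooses to reject keys that do not.
    unknown = set(semantic_tags) - set(columns)
    if unknown:
        raise KeyError(f"semantic_tags refers to unknown columns: {sorted(unknown)}")

    normalized = {}
    for column in columns:
        tags = semantic_tags.get(column, set())  # missing column -> empty set
        if isinstance(tags, str):
            tags = {tags}  # a single tag given as a plain string
        normalized[column] = set(tags)  # list or set of tags
    return normalized


columns = ["id", "signup_date", "age"]
tags = normalize_semantic_tags(
    columns,
    {"id": "customer_key", "age": ["demographic", "sensitive"]},
)
print(tags["id"])
print(tags["signup_date"])
```

Storing every value as a set keeps later operations such as adding or removing tags uniform, regardless of how the caller originally wrote the value.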
Methods
__init__(dataframe[, name, index, …])
add_semantic_tags(semantic_tags)
Adds specified semantic tags to columns.
describe()
Calculates statistics for data contained in DataTable.
get_mutual_information([num_bins, nrows])
Calculates mutual information between all pairs of columns in the DataTable that support mutual information.
remove_semantic_tags(semantic_tags)
Remove the semantic tags for any column names in the provided semantic_tags dictionary.
reset_semantic_tags([columns, retain_index_tags])
Reset the semantic tags for the specified columns to the default values and return a new DataTable.
select(include)
Create a DataTable including only columns whose logical type and semantic tags are specified in the list of types and tags to include.
select_ltypes(include)
Create a DataTable that includes only columns whose logical types are specified here.
select_semantic_tags(include)
Create a DataTable that includes only columns that have at least one of the semantic tags specified here.
set_index(index)
Set the index column and return a new DataTable.
set_logical_types(logical_types[, …])
Update the logical type for any column names in the provided logical_types dictionary.
set_semantic_tags(semantic_tags[, …])
Update the semantic tags for any column names in the provided semantic_tags dictionary.
set_time_index(time_index)
Set the time index column.
to_pandas([copy])
Retrieves the DataTable’s underlying dataframe.
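The selection rule described for select_semantic_tags (keep a column if it has at least one of the requested tags) can be sketched in plain Python. The function select_by_semantic_tags and the sample tag data are hypothetical and only illustrate the rule; they are not woodwork's implementation, and the sketch returns column names rather than a new DataTable.

```python
def select_by_semantic_tags(column_tags, include):
    """Hypothetical helper: keep columns sharing at least one tag with `include`."""
    # Convenience in this sketch: allow a single tag passed as a string.
    if isinstance(include, str):
        include = [include]
    wanted = set(include)
    # A column qualifies when its tag set intersects the requested tags.
    return [name for name, tags in column_tags.items() if wanted & set(tags)]


# Hypothetical column -> semantic tags mapping, shaped like the semantic_tags attribute.
column_tags = {
    "id": {"index"},
    "age": {"numeric", "demographic"},
    "city": {"category"},
}

selected = select_by_semantic_tags(column_tags, ["numeric", "category"])
print(selected)
```

Using set intersection expresses the "at least one" condition directly; requiring all requested tags would instead check that the requested set is a subset of the column's tags.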
Attributes
index
The index column for the table
logical_types
A dictionary containing logical types for each column
ltypes
A series listing the logical types for each column in the table
physical_types
A dictionary containing physical types for each column
semantic_tags
A dictionary containing semantic tags for each column
time_index
The time index column for the table
types
Dataframe containing the physical dtypes, logical types and semantic tags for the table