API Reference

DataTable

DataTable(dataframe[, name, index, …])

DataTable.add_semantic_tags(semantic_tags)

Adds specified semantic tags to columns.

DataTable.describe([include])

Calculates statistics for data contained in the DataTable.

DataTable.describe_dict([include])

Calculates statistics for data contained in the DataTable.

DataTable.drop(columns)

Drop specified columns from a DataTable.

DataTable.head([n])

Shows the first n rows of the DataTable along with typing information.

DataTable.iloc

Purely integer-location based indexing for selection by position.

DataTable.mutual_information([num_bins, …])

Calculates mutual information between all pairs of columns in the DataTable that support mutual information.

DataTable.mutual_information_dict([…])

Calculates mutual information between all pairs of columns in the DataTable that support mutual information.

DataTable.pop(column_name)

Return a DataColumn and drop it from the DataTable.

DataTable.remove_semantic_tags(semantic_tags)

Remove the semantic tags for any column names in the provided semantic_tags dictionary.

DataTable.rename(columns)

Renames columns in a DataTable.

DataTable.reset_semantic_tags([columns, …])

Reset the semantic tags for the specified columns to the default values and return a new DataTable.

DataTable.select(include)

Create a DataTable including only columns whose logical type and semantic tags are specified in the list of types and tags to include.

DataTable.set_index(index)

Set the index column and return a new DataTable.

DataTable.set_time_index(time_index)

Set the time index column.

DataTable.set_types([logical_types, …])

Update the logical type and semantic tags for any column names in the provided types dictionary.

DataTable.shape

Returns a tuple representing the dimensionality of the DataTable.

DataTable.to_csv(path[, sep, encoding, …])

Write DataTable to disk in the CSV format, location specified by path.

DataTable.to_dataframe()

Retrieves the DataTable’s underlying dataframe.

DataTable.to_parquet(path[, compression, …])

Write DataTable to disk in the parquet format, location specified by path.

DataTable.to_pickle(path[, compression, …])

Write DataTable to disk in the pickle format, location specified by path.

DataTable.update_dataframe(new_df[, …])

Replace the DataTable’s dataframe with a new dataframe, making sure the new dataframe dtypes are updated.

DataTable.value_counts([ascending, top_n, …])

Returns a list of dictionaries with counts for the most frequent values in each column (only …
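
The mutual_information entries above take a num_bins argument, which suggests that numeric columns are discretized before the calculation. The following is a minimal pure-Python sketch of that idea, not Woodwork's implementation: the helper names bin_values and mutual_information, the equal-width binning strategy, and the use of natural logarithms are all assumptions made for illustration.

```python
import math
from collections import Counter


def bin_values(values, num_bins):
    """Map numeric values to equal-width bin indices in [0, num_bins - 1].

    Equal-width binning is an assumption; the library may bin differently.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / num_bins
    # The maximum value would land in bin num_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), num_bins - 1) for v in values]


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


# Illustrative data only: a numeric column and a categorical column.
ages = [20, 22, 40, 42, 60, 62, 80, 82]
tiers = ["a", "a", "b", "b", "c", "c", "d", "d"]

binned_ages = bin_values(ages, num_bins=4)
mi_dependent = mutual_information(binned_ages, tiers)
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

In this toy data each age bin corresponds to exactly one tier, so the score equals the entropy of the tier column (log 4 nats), while the two independent sequences score zero.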

DataColumn

DataColumn(series[, logical_type, …])

DataColumn.add_semantic_tags(semantic_tags)

Add the specified semantic tags to the column and return a new DataColumn object.

DataColumn.iloc

Purely integer-location based indexing for selection by position.

DataColumn.remove_semantic_tags(semantic_tags)

Removes specified semantic tags from column and returns a new column.

DataColumn.reset_semantic_tags([…])

Reset the semantic tags to the default values.

DataColumn.set_logical_type(logical_type[, …])

Update the logical type for the column and return a new DataColumn object.

DataColumn.set_semantic_tags(semantic_tags)

Replace current semantic tags with new values and return a new DataColumn object.

DataColumn.shape

Returns a tuple representing the dimensionality of the DataColumn.

DataColumn.to_series()

Retrieves the DataColumn’s underlying series.
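
Several DataColumn methods above are described as returning a new DataColumn object rather than modifying the column in place. The sketch below illustrates that pattern with a hypothetical TaggedColumn class; it is not the Woodwork class, and its fields, the tag names, and the choice to have reset_semantic_tags also return a new object are assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TaggedColumn:
    """Hypothetical stand-in for a column that carries semantic tags."""

    name: str
    semantic_tags: frozenset = frozenset()
    default_tags: frozenset = frozenset()

    def add_semantic_tags(self, tags):
        # Return a copy with the extra tags; the original is left unchanged.
        return replace(self, semantic_tags=self.semantic_tags | set(tags))

    def remove_semantic_tags(self, tags):
        return replace(self, semantic_tags=self.semantic_tags - set(tags))

    def set_semantic_tags(self, tags):
        # Replace the current tags entirely.
        return replace(self, semantic_tags=frozenset(tags))

    def reset_semantic_tags(self):
        # Fall back to the defaults recorded when the column was created.
        return replace(self, semantic_tags=self.default_tags)


# Tag names here are illustrative only.
price = TaggedColumn(
    "price",
    semantic_tags=frozenset({"numeric"}),
    default_tags=frozenset({"numeric"}),
)
tagged = price.add_semantic_tags({"currency"})
```

Because every method returns a new object, price keeps its original tags after the call, and calls can be chained without side effects on earlier references.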

Logical Types

Boolean()

Represents Logical Types that contain binary values indicating true/false.

Categorical([encoding])

Represents Logical Types that contain unordered discrete values that fall into one of a set of possible values.

CountryCode()

Represents Logical Types that contain categorical information specifically used to represent countries.

Datetime([datetime_format])

Represents Logical Types that contain date and time information.

Double()

Represents Logical Types that contain positive and negative numbers, some of which include a fractional component.

EmailAddress()

Represents Logical Types that contain email address values.

Filepath()

Represents Logical Types that specify locations of directories and files in a file system.

FullName()

Represents Logical Types that may contain first, middle and last names, including honorifics and suffixes.

Integer()

Represents Logical Types that contain positive and negative numbers without a fractional component, including zero (0).

IPAddress()

Represents Logical Types that contain IP addresses, including both IPv4 and IPv6 addresses.

LatLong()

Represents Logical Types that contain latitude and longitude values in decimal degrees.

NaturalLanguage()

Represents Logical Types that contain text or characters representing natural human language.

Ordinal(order)

Represents Logical Types that contain ordered discrete values.

PhoneNumber()

Represents Logical Types that contain numeric digits and characters representing a phone number.

SubRegionCode()

Represents Logical Types that contain codes representing a portion of a larger geographic region.

Timedelta()

Represents Logical Types that contain values specifying a duration of time.

URL()

Represents Logical Types that contain URLs, which may include protocol, hostname and file name.

ZIPCode()

Represents Logical Types that contain a series of postal codes used by the US Postal Service for representing a group of addresses.
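
Of the types listed above, Ordinal is the only one that requires an argument (order), which distinguishes it from the unordered Categorical type. The sketch below shows, in plain Python, what a declared order makes possible; the helper sort_by_order and the sample values are hypothetical and are not part of the library.

```python
def sort_by_order(values, order):
    """Sort values by their position in a declared order, lowest first."""
    rank = {level: i for i, level in enumerate(order)}
    return sorted(values, key=rank.__getitem__)


# Illustrative data: alphabetical sorting would put "large" first,
# while the declared order puts "small" first.
sizes = ["large", "small", "medium", "small"]
order = ["small", "medium", "large"]

ordered = sort_by_order(sizes, order)
# ordered == ["small", "small", "medium", "large"]
```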

TypeSystem

TypeSystem([inference_functions, …])

TypeSystem.add_type(logical_type[, …])

Add a new LogicalType to the TypeSystem, optionally specifying the corresponding inference function and a parent type.

TypeSystem.infer_logical_type(series)

Infer the logical type for the given series.

TypeSystem.remove_type(logical_type)

Remove a logical type from the TypeSystem.

TypeSystem.reset_defaults()

Reset type system to the default settings that were specified at initialization.

TypeSystem.update_inference_function(…)

Update the inference function for the specified LogicalType.

TypeSystem.update_relationship(logical_type, …)

Add or update a relationship.
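
The TypeSystem entries above describe a registry of logical types, each with an optional inference function and an optional parent type. The following is a minimal sketch of how such a registry could work, under stated assumptions: MiniTypeSystem is a hypothetical class, types are identified by plain strings, and the rule of picking the deepest matching type is a guess at the intent rather than the library's documented behavior. The parent/child pairing in the usage lines is chosen for illustration only.

```python
class MiniTypeSystem:
    """Hypothetical registry of type names, inference functions, and parents."""

    def __init__(self):
        self.inference_functions = {}  # type name -> callable(values) -> bool
        self.parents = {}              # type name -> parent type name or None

    def add_type(self, name, inference_function=None, parent=None):
        self.inference_functions[name] = inference_function
        self.parents[name] = parent

    def update_inference_function(self, name, inference_function):
        self.inference_functions[name] = inference_function

    def _depth(self, name):
        # Number of ancestors between this type and the root of its tree.
        depth = 0
        while self.parents[name] is not None:
            name = self.parents[name]
            depth += 1
        return depth

    def infer_logical_type(self, values):
        # Collect every type whose inference function accepts the values,
        # then prefer the most specific one (assumed: the deepest in the tree).
        matches = [
            name
            for name, fn in self.inference_functions.items()
            if fn is not None and fn(values)
        ]
        if not matches:
            return None
        return max(matches, key=self._depth)


ts = MiniTypeSystem()
ts.add_type("Double", lambda vs: all(isinstance(v, (int, float)) for v in vs))
ts.add_type("Integer", lambda vs: all(isinstance(v, int) for v in vs), parent="Double")

inferred_ints = ts.infer_logical_type([1, 2, 3])     # "Integer"
inferred_floats = ts.infer_logical_type([1.5, 2.0])  # "Double"
```

Both functions accept a list of integers, so the parent relationship is what breaks the tie in favor of the more specific type.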

Utils

Type Utils

list_logical_types

Returns a dataframe describing all of the available Logical Types.

list_semantic_tags

Returns a dataframe describing all of the common semantic tags.

General Utils

get_valid_mi_types

Generate a list of LogicalTypes that are valid for calculating mutual information.

read_csv

Read data from the specified CSV file and return a Woodwork DataTable.

Demo Data

load_retail([id, nrows, return_dataframe])

Load a demo retail dataset into either a DataTable or a DataFrame.