API Reference

WoodworkTableAccessor

WoodworkTableAccessor(dataframe)

WoodworkTableAccessor.add_semantic_tags(…)

Adds specified semantic tags to columns, updating the Woodwork typing information.

WoodworkTableAccessor.describe([include])

Calculates statistics for data contained in the DataFrame.

WoodworkTableAccessor.describe_dict([include])

Calculates statistics for data contained in the DataFrame.

WoodworkTableAccessor.drop(columns)

Drop specified columns from a DataFrame.

WoodworkTableAccessor.iloc

Integer-location based indexing for selection by position.

WoodworkTableAccessor.index

The index column for the table.

WoodworkTableAccessor.init([index, …])

Initializes Woodwork typing information for a DataFrame.

WoodworkTableAccessor.loc

Access a group of rows by label(s) or a boolean array.

WoodworkTableAccessor.logical_types

A dictionary containing logical types for each column.

WoodworkTableAccessor.mutual_information([…])

Calculates mutual information between all pairs of columns in the DataFrame that support mutual information.

WoodworkTableAccessor.mutual_information_dict([…])

Calculates mutual information between all pairs of columns in the DataFrame that support mutual information.
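As a rough illustration of the quantity behind the two mutual information entries above, the sketch below computes I(X;Y) = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) * p(y))) for every pair of discrete columns. It is a plain-Python sketch and not Woodwork's implementation: the function names, the dictionary-of-lists input, and the result keys are assumptions made for this example, and any binning, sampling, or normalization the library may apply is not reproduced.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    count_x = Counter(xs)         # marginal counts for x
    count_y = Counter(ys)         # marginal counts for y
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p_xy / (p_x * p_y), written in terms of raw counts
        ratio = p_xy * n * n / (count_x[x] * count_y[y])
        total += p_xy * log(ratio)
    return total


def pairwise_mutual_information(columns):
    """Score every unordered pair of columns, highest score first.

    `columns` is a hypothetical mapping of column name to a list of values.
    """
    names = list(columns)
    results = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            results.append({
                "column_1": first,
                "column_2": second,
                "mutual_info": mutual_information(columns[first], columns[second]),
            })
    results.sort(key=lambda row: row["mutual_info"], reverse=True)
    return results


example = {
    "a": [0, 0, 1, 1],
    "b": [0, 0, 1, 1],  # identical to "a", so the pair shares log(2) nats
    "c": [0, 1, 0, 1],  # independent of "a" and "b", so those pairs score 0
}
scores = pairwise_mutual_information(example)
```

Returning a list of dictionaries mirrors the split between a DataFrame-returning method and its `_dict` counterpart only loosely; the exact output layout of the real methods should be checked against the library.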

WoodworkTableAccessor.physical_types

A dictionary containing physical types for each column.

WoodworkTableAccessor.pop(column_name)

Return a Series with Woodwork typing information and remove it from the DataFrame.

WoodworkTableAccessor.remove_semantic_tags(…)

Remove the semantic tags for any column names in the provided semantic_tags dictionary, updating the Woodwork typing information.

WoodworkTableAccessor.rename(columns)

Renames columns in a DataFrame, maintaining Woodwork typing information.

WoodworkTableAccessor.reset_semantic_tags([…])

Reset the semantic tags for the specified columns to the default values.

WoodworkTableAccessor.schema

A copy of the Woodwork typing information for the DataFrame.

WoodworkTableAccessor.select(include)

Create a DataFrame with Woodwork typing information initialized that includes only columns whose Logical Type and semantic tags are specified in the list of types and tags to include.
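The selection rule described above can be pictured with a small plain-Python sketch. It assumes that a column is kept when either its logical type or any one of its semantic tags appears in the include list; that reading, the `select_columns` name, and the dictionary-based schema are assumptions made for illustration and are not taken from the library.

```python
def select_columns(schema, include):
    """Return the names of columns whose logical type or tags match `include`.

    `schema` is a hypothetical mapping of column name to a dict with a
    "logical_type" string and a "semantic_tags" set.
    """
    if isinstance(include, str):
        include = [include]  # allow a single type or tag name
    wanted = set(include)
    selected = []
    for name, info in schema.items():
        type_matches = info["logical_type"] in wanted
        tag_matches = bool(wanted & set(info["semantic_tags"]))
        if type_matches or tag_matches:
            selected.append(name)
    return selected


schema = {
    "order_id": {"logical_type": "Integer", "semantic_tags": {"index"}},
    "price": {"logical_type": "Double", "semantic_tags": {"numeric"}},
    "quantity": {"logical_type": "Integer", "semantic_tags": {"numeric"}},
    "comment": {"logical_type": "NaturalLanguage", "semantic_tags": set()},
}

numeric_columns = select_columns(schema, "numeric")
integer_columns = select_columns(schema, ["Integer"])
```

Here `numeric_columns` holds the two columns tagged `numeric`, while `integer_columns` is chosen by logical type alone.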

WoodworkTableAccessor.semantic_tags

A dictionary containing semantic tags for each column.

WoodworkTableAccessor.set_index(new_index)

Sets the index column of the DataFrame.

WoodworkTableAccessor.set_time_index(…)

Set the time index.

WoodworkTableAccessor.set_types([…])

Update the logical type and semantic tags for any column names in the provided types dictionaries, updating the Woodwork typing information for the DataFrame.

WoodworkTableAccessor.time_index

The time index column for the table.

WoodworkTableAccessor.to_csv(path[, sep, …])

Write Woodwork table to disk in the CSV format, location specified by path.

WoodworkTableAccessor.to_dictionary()

Get a dictionary representation of the Woodwork typing information.

WoodworkTableAccessor.to_parquet(path[, …])

Write Woodwork table to disk in the parquet format, location specified by path.

WoodworkTableAccessor.to_pickle(path[, …])

Write Woodwork table to disk in the pickle format, location specified by path.

WoodworkTableAccessor.types

DataFrame containing the physical dtypes, logical types and semantic tags for the Schema.

WoodworkTableAccessor.value_counts([…])

Returns a list of dictionaries with counts for the most frequent values in each column (only

WoodworkColumnAccessor

WoodworkColumnAccessor(series)

WoodworkColumnAccessor.add_semantic_tags(…)

Add the specified semantic tags to the set of tags.

WoodworkColumnAccessor.description

The description of the series.

WoodworkColumnAccessor.iloc

Integer-location based indexing for selection by position.

WoodworkColumnAccessor.init([logical_type, …])

Initializes Woodwork typing information for a Series.

WoodworkColumnAccessor.loc

Access a group of rows by label(s) or a boolean array.

WoodworkColumnAccessor.logical_type

The logical type of the series.

WoodworkColumnAccessor.metadata

The metadata of the series.

WoodworkColumnAccessor.remove_semantic_tags(…)

Removes specified semantic tags from the current tags.

WoodworkColumnAccessor.reset_semantic_tags()

Reset the semantic tags to the default values.

WoodworkColumnAccessor.semantic_tags

The semantic tags assigned to the series.

WoodworkColumnAccessor.set_logical_type(…)

Update the logical type for the series, clearing any previously set semantic tags, and return a new series with Woodwork initialized.

WoodworkColumnAccessor.set_semantic_tags(…)

Replace current semantic tags with new values.
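The four tag operations listed for a column (add, remove, reset, set) differ only in how they change one set of tags. The sketch below models that with a small plain-Python class. It is a conceptual illustration rather than the accessor itself: the `ColumnTags` class and its method names are hypothetical, and it assumes that "default values" means a fixed set supplied at construction and that replacing tags discards everything previously present.

```python
def _as_set(tags):
    """Accept a single tag string or any iterable of tag strings."""
    if isinstance(tags, str):
        return {tags}
    return set(tags)


class ColumnTags:
    """Hypothetical container for the semantic tags of one column."""

    def __init__(self, default_tags=()):
        self._defaults = frozenset(_as_set(default_tags))
        self.tags = set(self._defaults)

    def add(self, tags):
        # Add: union the new tags into the current set.
        self.tags |= _as_set(tags)

    def remove(self, tags):
        # Remove: fail loudly if a tag is not currently present.
        tags = _as_set(tags)
        missing = tags - self.tags
        if missing:
            raise LookupError(f"tags not present: {sorted(missing)}")
        self.tags -= tags

    def reset(self):
        # Reset: go back to the defaults given at construction.
        self.tags = set(self._defaults)

    def set(self, tags):
        # Set: replace the current tags entirely (an assumption of this sketch).
        self.tags = _as_set(tags)


age = ColumnTags(default_tags={"numeric"})
age.add("demographic")        # {"numeric", "demographic"}
age.set(["sensitive"])        # {"sensitive"}
age.reset()                   # {"numeric"}
```

Raising on removal of an absent tag is a design choice of this sketch; whether the library raises or warns in that case should be confirmed against its documentation.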

Schema

Schema(column_names, logical_types[, name, …])

Schema.add_semantic_tags(semantic_tags)

Adds specified semantic tags to columns, updating the Woodwork typing information.

Schema.index

The index column for the table.

Schema.logical_types

A dictionary containing logical types for each column.

Schema.rename(columns)

Renames columns in a Schema.

Schema.remove_semantic_tags(semantic_tags)

Remove the semantic tags for any column names in the provided semantic_tags dictionary, updating the Woodwork typing information.

Schema.reset_semantic_tags([columns, …])

Reset the semantic tags for the specified columns to the default values.

Schema.semantic_tags

A dictionary containing semantic tags for each column.

Schema.set_index(new_index)

Sets the index.

Schema.set_time_index(new_time_index)

Set the time index.

Schema.set_types([logical_types, …])

Update the logical type and semantic tags for any column names in the provided types dictionaries, updating the Schema at those columns.

Schema.time_index

The time index column for the table.

Schema.types

DataFrame containing the physical dtypes, logical types and semantic tags for the Schema.

Serialization

typing_info_to_dict(dataframe)

Creates the description for a Woodwork table, including typing information for each column and loading information.

write_dataframe(dataframe, path[, format])

Write underlying DataFrame data to disk or S3 path.

write_typing_info(typing_info, path)

Writes Woodwork typing information to woodwork_typing_info.json at the specified path.

write_woodwork_table(dataframe, path[, …])

Serialize Woodwork table and write to disk or S3 path.

Deserialization

read_table_typing_information(path)

Read Woodwork typing information from disk, S3 path, or URL.

read_woodwork_table(path[, profile_name])

Read Woodwork table from disk, S3 path, or URL.
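The write and read steps listed above amount to a JSON round trip through a file named woodwork_typing_info.json inside the target directory. The sketch below shows that round trip for a local directory only. The function names carry a `_sketch` suffix to make clear they are hypothetical stand-ins, the contents of the example dictionary are invented for illustration, and S3 paths, URLs, and the data files themselves are not handled.

```python
import json
import os
import tempfile

TYPING_INFO_FILENAME = "woodwork_typing_info.json"  # file name given in the reference above


def write_typing_info_sketch(typing_info, path):
    """Write a typing-information dictionary as JSON inside the directory `path`."""
    os.makedirs(path, exist_ok=True)
    file_path = os.path.join(path, TYPING_INFO_FILENAME)
    with open(file_path, "w", encoding="utf-8") as handle:
        json.dump(typing_info, handle, indent=2)
    return file_path


def read_typing_info_sketch(path):
    """Read the typing-information dictionary back from the directory `path`."""
    file_path = os.path.join(path, TYPING_INFO_FILENAME)
    with open(file_path, encoding="utf-8") as handle:
        return json.load(handle)


# Hypothetical typing information; the real structure is defined by the library.
example_info = {
    "name": "orders",
    "index": "order_id",
    "column_typing_info": [
        {"name": "order_id", "logical_type": "Integer", "semantic_tags": ["index"]},
    ],
}

with tempfile.TemporaryDirectory() as directory:
    written_path = write_typing_info_sketch(example_info, directory)
    restored_info = read_typing_info_sketch(directory)
```

Because JSON has no set type, tags are stored as lists in this sketch; a loader would need to convert them back if sets are expected.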

Logical Types

Boolean()

Represents Logical Types that contain binary values indicating true/false.

Categorical([encoding])

Represents Logical Types that contain unordered discrete values that fall into one of a set of possible values.

CountryCode()

Represents Logical Types that contain categorical information specifically used to represent countries.

Datetime([datetime_format])

Represents Logical Types that contain date and time information.

Double()

Represents Logical Types that contain positive and negative numbers, some of which include a fractional component.

EmailAddress()

Represents Logical Types that contain email address values.

Filepath()

Represents Logical Types that specify locations of directories and files in a file system.

FullName()

Represents Logical Types that may contain first, middle and last names, including honorifics and suffixes.

Integer()

Represents Logical Types that contain positive and negative numbers without a fractional component, including zero (0).

IPAddress()

Represents Logical Types that contain IP addresses, including both IPv4 and IPv6 addresses.

LatLong()

Represents Logical Types that contain latitude and longitude values in decimal degrees.

NaturalLanguage()

Represents Logical Types that contain text or characters representing natural human language.

Ordinal(order)

Represents Logical Types that contain ordered discrete values.

PhoneNumber()

Represents Logical Types that contain numeric digits and characters representing a phone number.

SubRegionCode()

Represents Logical Types that contain codes representing a portion of a larger geographic region.

Timedelta()

Represents Logical Types that contain values specifying a duration of time.

URL()

Represents Logical Types that contain URLs, which may include protocol, hostname and file name.

ZIPCode()

Represents Logical Types that contain a series of postal codes used by the US Postal Service for representing a group of addresses.

TypeSystem

TypeSystem([inference_functions, …])

TypeSystem.add_type(logical_type[, …])

Add a new LogicalType to the TypeSystem, optionally specifying the corresponding inference function and a parent type.

TypeSystem.infer_logical_type(series)

Infer the logical type for the given series.

TypeSystem.remove_type(logical_type)

Remove a logical type from the TypeSystem.

TypeSystem.reset_defaults()

Reset type system to the default settings that were specified at initialization.

TypeSystem.update_inference_function(…)

Update the inference function for the specified LogicalType.

TypeSystem.update_relationship(logical_type, …)

Add or update a relationship.
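To make the TypeSystem operations above more concrete, here is a minimal plain-Python sketch of a registry that pairs each type with an inference function and an optional parent, then infers the most specific matching type. It is not the library's algorithm: the `MiniTypeSystem` class, the use of string type names, the "deepest match wins" rule, and the re-parenting of children when a type is removed are all assumptions made for this example.

```python
class MiniTypeSystem:
    """Hypothetical registry of types, inference functions, and parent links."""

    def __init__(self):
        self._inference = {}  # type name -> inference function (or None)
        self._parent = {}     # type name -> parent type name (or None)

    def add_type(self, name, inference_function=None, parent=None):
        if parent is not None and parent not in self._parent:
            raise ValueError(f"unknown parent type: {parent}")
        self._inference[name] = inference_function
        self._parent[name] = parent

    def remove_type(self, name):
        # Children of the removed type are attached to its parent (assumption).
        grandparent = self._parent.pop(name)
        del self._inference[name]
        for child in list(self._parent):
            if self._parent[child] == name:
                self._parent[child] = grandparent

    def update_inference_function(self, name, inference_function):
        if name not in self._inference:
            raise KeyError(name)
        self._inference[name] = inference_function

    def update_relationship(self, name, parent):
        if name not in self._parent or parent not in self._parent:
            raise KeyError("both types must be registered")
        node = parent
        while node is not None:  # refuse links that would form a cycle
            if node == name:
                raise ValueError("relationship would create a cycle")
            node = self._parent[node]
        self._parent[name] = parent

    def _depth(self, name):
        depth = 0
        while self._parent[name] is not None:
            name = self._parent[name]
            depth += 1
        return depth

    def infer(self, values, default=None):
        # Collect every type whose function accepts the values, keep the deepest.
        matches = [name for name, function in self._inference.items()
                   if function is not None and function(values)]
        if not matches:
            return default
        return max(matches, key=self._depth)


def is_number(values):
    return all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)


def is_integer(values):
    return all(isinstance(v, int) and not isinstance(v, bool) for v in values)


type_system = MiniTypeSystem()
type_system.add_type("Double", inference_function=is_number)
type_system.add_type("Integer", inference_function=is_integer, parent="Double")

inferred_for_ints = type_system.infer([1, 2, 3])       # both match; the child is deeper
inferred_for_floats = type_system.infer([1.5, 2])      # only "Double" matches
inferred_for_text = type_system.infer(["a"], default="NaturalLanguage")
```

Picking the deepest match is what lets a child type such as `Integer` win over its parent `Double` when both inference functions accept the data.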

Utils

Type Utils

list_logical_types

Returns a dataframe describing all of the available Logical Types.

list_semantic_tags

Returns a dataframe describing all of the common semantic tags.

General Utils

get_valid_mi_types

Generate a list of LogicalTypes that are valid for calculating mutual information.

read_csv

Read data from the specified CSV file and return a DataFrame with initialized Woodwork typing information.

init_series

Initializes Woodwork typing information for a Series, returning a new Series.

Demo Data

load_retail([id, nrows, init_woodwork])

Load a demo retail dataset into a DataFrame, optionally initializing Woodwork’s typing information.