API Reference
WoodworkTableAccessor
Adds specified semantic tags to columns, updating the Woodwork typing information.
Calculates dependence measures between all pairs of columns in the DataFrame that support measuring dependence.
Calculates dependence measures between all pairs of columns in the DataFrame that support measuring dependence.
Calculates statistics for data contained in the DataFrame.
Calculates statistics for data contained in the DataFrame.
Drop specified columns from a DataFrame.
Integer-location based indexing for selection by position.
The index column for the table
Infers the observation frequency (daily, biweekly, yearly, etc.) of each temporal column
Initializes Woodwork typing information for a DataFrame with a partial schema.
Initializes Woodwork typing information for a DataFrame with a complete schema.
Initializes Woodwork typing information for a DataFrame with a partial schema.
Access a group of rows by label(s) or a boolean array.
A dictionary containing logical types for each column
Metadata of the DataFrame
Calculates mutual information between all pairs of columns in the DataFrame that support mutual information.
Calculates mutual information between all pairs of columns in the DataFrame that support mutual information.
Name of the DataFrame
Calculates Pearson correlation coefficient between all pairs of columns in the DataFrame that support correlation.
Calculates Pearson correlation coefficient between all pairs of columns in the DataFrame that support correlation.
A dictionary containing physical types for each column
Return a Series with Woodwork typing information and remove it from the DataFrame.
Remove the semantic tags for any column names in the provided semantic_tags dictionary, updating the Woodwork typing information.
Renames columns in a DataFrame, maintaining Woodwork typing information.
Reset the semantic tags for the specified columns to the default values.
A copy of the Woodwork typing information for the DataFrame.
Create a DataFrame with Woodwork typing information initialized that includes only columns whose Logical Type and semantic tags match conditions specified in the list of types and tags to include or exclude.
A dictionary containing semantic tags for each column
Sets the index column of the DataFrame.
Set the time index.
Update the logical type and semantic tags for any column names in the provided types dictionaries, updating the Woodwork typing information for the DataFrame.
Calculates Spearman correlation coefficient between all pairs of columns in the DataFrame that support correlation.
Calculates Spearman correlation coefficient between all pairs of columns in the DataFrame that support correlation.
The time index column for the table
Write Woodwork table to disk in the format specified by format, location specified by path.
Get a dictionary representation of the Woodwork typing information.
DataFrame containing the physical dtypes, logical types and semantic tags for the schema.
A dictionary containing the use_standard_tags setting for each column in the table
Validates the dataframe based on the logical types.
Returns a list of dictionaries with counts for the most frequent values in each column (only
WoodworkColumnAccessor
Add the specified semantic tags to the set of tags.
Gets the information necessary to create a box and whisker plot with outliers for a numeric column using the IQR method.
The description of the series
The origin of the series
Integer-location based indexing for selection by position.
Initializes Woodwork typing information for a Series.
Access a group of rows by label(s) or a boolean array.
The logical type of the series
The metadata of the series
Whether the column can contain null values.
Removes specified semantic tags from the current tags.
Reset the semantic tags to the default values.
The semantic tags assigned to the series
Update the logical type for the series, clearing any previously set semantic tags, and returning a new series with Woodwork initialized.
Replace current semantic tags with new values.
Validates series data based on the logical type.
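The box and whisker entry above refers to the IQR method. The following is a minimal standard-library sketch of that idea, not Woodwork's own implementation: the function name `iqr_box_plot_info`, the dictionary keys, the 1.5 multiplier, and the choice of the "inclusive" quantile method are all assumptions made for illustration.

```python
from statistics import quantiles


def iqr_box_plot_info(values):
    """Summarize a numeric sequence for a box plot using 1.5 * IQR fences."""
    data = sorted(values)
    # Quartiles via linear interpolation over the observed range (an assumption)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_bound = q1 - 1.5 * iqr
    high_bound = q3 + 1.5 * iqr
    return {
        "low_bound": low_bound,
        "high_bound": high_bound,
        "quantiles": {0.25: q1, 0.5: median, 0.75: q3},
        # Values outside the fences are treated as outliers
        "low_values": [v for v in data if v < low_bound],
        "high_values": [v for v in data if v > high_bound],
    }


info = iqr_box_plot_info([1, 2, 3, 4, 5, 6, 7, 8, 100])
print(info["quantiles"], info["high_values"])
```

Different quantile conventions shift the fences slightly, so results near a boundary may differ from those of another library.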
TableSchema
Adds specified semantic tags to columns, updating the Woodwork typing information.
The index column for the table
Creates a new TableSchema with specified columns, retaining typing information.
A dictionary containing logical types for each column
Metadata of the table
Renames columns in a TableSchema
Remove the semantic tags for any column names in the provided semantic_tags dictionary, updating the Woodwork typing information.
Reset the semantic tags for the specified columns to the default values.
Name of schema
A dictionary containing semantic tags for each column
Sets the index.
Set the time index.
Update the logical type and semantic tags for any column names in the provided types dictionaries, updating the TableSchema at those columns.
The time index column for the table
DataFrame containing the physical dtypes, logical types and semantic tags for the TableSchema.
ColumnSchema
The custom semantic tag(s) specified for the column.
Description of the column
Origin of the column
Whether the ColumnSchema is a Boolean column
Whether the ColumnSchema is categorical in nature
Whether the ColumnSchema is a Datetime column
Whether the ColumnSchema is numeric in nature
Metadata of the column
Serialization
Creates the description for a Woodwork table, including typing information for each column and loading information.
Deserialization
Convenience function to call read_woodwork_table.
Read Woodwork table from disk, S3 path, or URL.
Logical Types
Represents Logical Types that contain address values.
Represents Logical Types that contain whole numbers indicating a person's age.
Represents Logical Types that contain non-negative floating point numbers indicating a person's age.
Represents Logical Types that contain whole numbers indicating a person's age.
Represents Logical Types that contain binary values indicating true/false.
Represents Logical Types that contain binary values indicating true/false.
Represents Logical Types that contain unordered discrete values that fall into one of a set of possible values.
Represents Logical Types that use the ISO-3166 standard country code to represent countries.
Represents Logical Types that use the ISO-4217 international standard currency code to represent currencies.
Represents Logical Types that contain date and time information.
Represents Logical Types that contain positive and negative numbers, some of which include a fractional component.
Represents Logical Types that contain email address values.
Represents Logical Types that specify locations of directories and files in a file system.
Represents Logical Types that contain positive and negative numbers without a fractional component, including zero (0).
Represents Logical Types that contain positive and negative numbers without a fractional component, including zero (0).
Represents Logical Types that contain IP addresses, including both IPv4 and IPv6 addresses.
Represents Logical Types that contain latitude and longitude values in decimal degrees.
Represents Logical Types that contain text or characters representing natural human language
Represents Logical Types that contain ordered discrete values.
Represents Logical Types that may contain first, middle and last names, including honorifics and suffixes.
Represents Logical Types that contain numeric digits and characters representing a phone number.
Represents Logical Types that contain a series of postal codes for representing a group of addresses.
Represents Logical Types that use the ISO-3166 standard sub-region code to represent a portion of a larger geographic region.
Represents Logical Types that contain values specifying a duration of time
Represents Logical Types that cannot be inferred as a specific Logical Type.
Represents Logical Types that contain URLs, which may include protocol, hostname and file name
TypeSystem
Add a new LogicalType to the TypeSystem, optionally specifying the corresponding inference function and a parent type.
Infer the logical type for the given series
Remove a logical type from the TypeSystem.
Reset type system to the default settings that were specified at initialization.
Update the inference function for the specified LogicalType.
Add or update a relationship.
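The TypeSystem entries above describe a registry of types, each with an optional inference function and an optional parent. The toy class below sketches one way such a registry could behave, under stated assumptions: `MiniTypeSystem`, its method bodies, and the two-level `Double`/`Integer` hierarchy are hypothetical and are not Woodwork's implementation. Inference here walks from root types to children and returns the most specific match.

```python
class MiniTypeSystem:
    """Toy registry of type names, inference functions, and parent links."""

    def __init__(self, default_type="Unknown"):
        self.default_type = default_type
        self.inference_functions = {}  # type name -> callable or None
        self.parents = {}  # type name -> parent name (None means root)

    def add_type(self, name, inference_function=None, parent=None):
        if parent is not None and parent not in self.parents:
            raise ValueError(f"unknown parent type: {parent}")
        self.inference_functions[name] = inference_function
        self.parents[name] = parent

    def update_inference_function(self, name, inference_function):
        if name not in self.parents:
            raise ValueError(f"unknown type: {name}")
        self.inference_functions[name] = inference_function

    def remove_type(self, name):
        # Children of the removed type are re-attached to its parent
        grandparent = self.parents.pop(name)
        del self.inference_functions[name]
        for child, parent in list(self.parents.items()):
            if parent == name:
                self.parents[child] = grandparent

    def _children(self, parent):
        return [name for name, p in self.parents.items() if p == parent]

    def infer(self, values):
        def most_specific(candidates):
            for name in candidates:
                matches = self.inference_functions[name]
                if matches is not None and matches(values):
                    # Prefer a matching child over the parent itself
                    return most_specific(self._children(name)) or name
            return None

        return most_specific(self._children(None)) or self.default_type


def is_number(values):
    return all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )


def is_whole(values):
    return is_number(values) and all(float(v).is_integer() for v in values)


type_system = MiniTypeSystem()
type_system.add_type("Double", inference_function=is_number)
type_system.add_type("Integer", inference_function=is_whole, parent="Double")

whole = type_system.infer([1, 2, 3])
fractional = type_system.infer([1.5, 2.0])
text = type_system.infer(["a", "b"])
print(whole, fractional, text)
```

Checking parents before children keeps the more expensive or more specific checks from running on data that already failed a broader test.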
Utils
Type Utils
Returns a dataframe describing all of the available Logical Types.
Returns a dataframe describing all of the common semantic tags.
General Utils
Concatenate Woodwork objects along the columns axis.
Generate a list of LogicalTypes that are valid for calculating mutual information.
Generate a list of LogicalTypes that are valid for calculating Pearson correlation.
Generate a list of LogicalTypes that are valid for calculating Spearman correlation.
Read data from the specified file and return a DataFrame with initialized Woodwork typing information.
Return a message indicating the reason that the provided schema cannot be used to initialize Woodwork on the dataframe.
Initializes Woodwork typing information for a series, numpy.ndarray or pd.api.extensions.
Check if a schema is valid for initializing Woodwork on a dataframe
Statistics Utils
Infer the frequency of a given Pandas Datetime Series.
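To make the idea of frequency inference concrete, here is a small standard-library sketch that only detects a constant gap between timestamps. It is a simplification under stated assumptions: the helper name `infer_fixed_step` and the three-observation minimum are choices made for this example, and the real utility operates on a pandas Series rather than on a plain list.

```python
from datetime import datetime, timedelta


def infer_fixed_step(timestamps):
    """Return the constant gap between sorted timestamps, or None if irregular."""
    if len(timestamps) < 3:
        return None  # too few observations to call anything a frequency
    ordered = sorted(timestamps)
    steps = {later - earlier for earlier, later in zip(ordered, ordered[1:])}
    return steps.pop() if len(steps) == 1 else None


start = datetime(2023, 1, 1)
daily = [start + timedelta(days=i) for i in range(5)]
with_gap = daily[:2] + daily[3:]  # drop one observation to break the pattern

daily_step = infer_fixed_step(daily)
gap_step = infer_fixed_step(with_gap)
print(daily_step, gap_step)
```

A single missing observation is enough to make this strict check return `None`, which is why practical frequency inference often needs some tolerance for gaps and duplicates.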
Demo Data
Load a demo retail dataset into a DataFrame, optionally initializing Woodwork's typing information.