woodwork.table_schema.TableSchema

class woodwork.table_schema.TableSchema(column_names, logical_types, name=None, index=None, time_index=None, semantic_tags=None, table_metadata=None, column_metadata=None, use_standard_tags=False, column_descriptions=None, column_origins=None, validate=True)
__init__(column_names, logical_types, name=None, index=None, time_index=None, semantic_tags=None, table_metadata=None, column_metadata=None, use_standard_tags=False, column_descriptions=None, column_origins=None, validate=True)

Create a TableSchema.

Parameters
  • column_names (list, set) – The columns present in the TableSchema.

  • logical_types (dict[str -> LogicalType]) – Dictionary mapping column names in the TableSchema to the LogicalType for the column. All columns present in the TableSchema must be present in the logical_types dictionary.

  • name (str, optional) – Name used to identify the TableSchema.

  • index (str, optional) – Name of the index column.

  • time_index (str, optional) – Name of the time index column.

  • semantic_tags (dict, optional) – Dictionary mapping column names in the TableSchema to the semantic tags for the column. The keys in the dictionary should be strings that correspond to columns in the TableSchema. There are two options for specifying the dictionary values:

      • (str): If only one semantic tag is being set, a single string can be used as a value.

      • (list[str] or set[str]): If multiple tags are being set, a list or set of strings can be used as the value.

    Semantic tags will be set to an empty set for any column not included in the dictionary.

  • table_metadata (dict[str -> json serializable], optional) – Dictionary containing extra metadata for the TableSchema. The dictionary must contain data types that are JSON serializable, such as strings, integers, and floats. DataFrame and Series types are not supported.

  • column_metadata (dict[str -> dict[str -> json serializable]], optional) – Dictionary mapping column names to that column’s metadata dictionary.

  • use_standard_tags (bool, dict[str -> bool], optional) – Determines whether standard semantic tags will be added to columns based on the specified logical type for the column. If a single boolean is supplied, the same use_standard_tags value is applied to all columns. A dictionary can be used to specify use_standard_tags values for individual columns. Unspecified columns will use the default value. Defaults to False.

  • column_descriptions (dict[str -> str], optional) – Dictionary mapping column names to column descriptions.

  • column_origins (str, dict[str -> str], optional) – Origin of each column. If a string is supplied, it is used as the origin for all columns. A dictionary can be used to set origins for individual columns.

  • validate (bool, optional) – Whether parameter validation should occur. Defaults to True. Warning: Should be set to False only when parameters and data are known to be valid. Any errors resulting from skipping validation with invalid inputs may not be easily understood.
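The accepted shapes for semantic_tags and use_standard_tags described above can be pictured with a small stand-alone sketch. The two helper functions below are hypothetical and are not part of Woodwork; they only mirror the documented input rules, and they assume that "the default value" for columns missing from a use_standard_tags dictionary is False.

```python
# Stand-alone illustration only: these helpers are hypothetical and are not
# Woodwork functions. They mirror the input rules documented above.


def normalize_semantic_tags(column_names, semantic_tags=None):
    """Return {column: set_of_tags}, giving unlisted columns an empty set."""
    semantic_tags = semantic_tags or {}
    normalized = {}
    for name in column_names:
        tags = semantic_tags.get(name, set())
        # A bare string means one tag, not an iterable of characters.
        normalized[name] = {tags} if isinstance(tags, str) else set(tags)
    return normalized


def normalize_use_standard_tags(column_names, use_standard_tags=False):
    """Expand a single bool or a partial dict into one bool per column."""
    if isinstance(use_standard_tags, bool):
        return {name: use_standard_tags for name in column_names}
    # Assumption: columns missing from the dict fall back to False.
    return {name: use_standard_tags.get(name, False) for name in column_names}


columns = ["id", "signup_date", "plan"]
tags = normalize_semantic_tags(columns, {"plan": "tier", "id": ["key", "customer"]})
standard = normalize_use_standard_tags(columns, {"plan": True})
```

Here tags maps "signup_date" to an empty set because it was not listed, and standard enables standard tags for "plan" only.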

Methods

__init__(column_names, logical_types[, ...])

Create a TableSchema.

add_semantic_tags(semantic_tags)

Adds specified semantic tags to columns, updating the Woodwork typing information.

get_subset_schema(subset_cols)

Creates a new TableSchema with specified columns, retaining typing information.

remove_semantic_tags(semantic_tags)

Remove the semantic tags for any column names in the provided semantic_tags dictionary, updating the Woodwork typing information.

rename(columns)

Renames columns in a TableSchema.

reset_semantic_tags([columns, retain_index_tags])

Reset the semantic tags for the specified columns to the default values.

set_index(new_index[, validate])

Sets the index.

set_time_index(new_time_index[, validate])

Set the time index.

set_types([logical_types, semantic_tags, ...])

Update the logical type and semantic tags for any column names in the provided types dictionaries, updating the TableSchema at those columns.

Attributes

index

The index column for the table.

logical_types

A dictionary containing logical types for each column.

metadata

Metadata of the table.

name

Name of the schema.

semantic_tags

A dictionary containing semantic tags for each column.

time_index

The time index column for the table.

types

DataFrame containing the physical dtypes, logical types and semantic tags for the TableSchema.

use_standard_tags