woodwork.table_accessor.WoodworkTableAccessor.init_with_partial_schema
- WoodworkTableAccessor.init_with_partial_schema(schema: Optional[TableSchema] = None, index: Optional[str] = None, time_index: Optional[str] = None, logical_types: Optional[Dict[Hashable, Optional[Union[str, LogicalType]]]] = None, ignore_columns: Optional[List[str]] = None, already_sorted: Optional[bool] = False, name: Optional[str] = None, semantic_tags: Optional[Dict[Hashable, Union[str, List[str], Set[str]]]] = None, table_metadata: Optional[dict] = None, column_metadata: Optional[Dict[Hashable, dict]] = None, use_standard_tags: Optional[Union[bool, Dict[Hashable, bool]]] = None, column_descriptions: Optional[Dict[Hashable, str]] = None, column_origins: Optional[Union[str, Dict[Hashable, str]]] = None, null_invalid_values: Optional[bool] = False, validate: Optional[bool] = True, **kwargs) → None
Initializes Woodwork typing information for a DataFrame with a partial schema.
- Logical type priority:
  1. Types specified in logical_types
  2. Types specified in the partial schema (the schema argument)
  3. Types inferred by ww.type_system.infer_logical_type
- Other Info priority:
  1. Parameter passed in
  2. Value specified in the partial schema (the schema argument)
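The two priority orders above can be modeled in a few lines of plain Python. This is only an illustrative sketch, not Woodwork's implementation: the helper names `resolve_logical_type`, `resolve_other_info`, and `infer_stub` are hypothetical, logical types are represented as plain strings, and treating `None` as "parameter not passed" is an assumption made for the example.

```python
from typing import Callable, Dict, Hashable, Optional


def resolve_logical_type(
    column: Hashable,
    logical_types: Dict[Hashable, Optional[str]],
    schema_types: Dict[Hashable, str],
    infer: Callable[[Hashable], str],
) -> str:
    """Pick a column's logical type following the documented priority."""
    # 1. An entry in logical_types wins; a value of None forces inference.
    if column in logical_types:
        explicit = logical_types[column]
        return explicit if explicit is not None else infer(column)
    # 2. Otherwise use the type recorded in the partial schema, if any.
    if column in schema_types:
        return schema_types[column]
    # 3. Otherwise fall back to inference.
    return infer(column)


def resolve_other_info(passed, schema_value):
    """A passed parameter overrides the partial schema's value.

    Assumption for this sketch: None means the parameter was not passed.
    """
    return passed if passed is not None else schema_value


def infer_stub(column: Hashable) -> str:
    # Hypothetical stand-in for real type inference.
    return "Unknown"


logical_types = {"age": "Integer", "zip": None}
schema_types = {"age": "Double", "zip": "PostalCode", "name": "PersonFullName"}

resolved = {
    col: resolve_logical_type(col, logical_types, schema_types, infer_stub)
    for col in ["age", "zip", "name", "notes"]
}
table_name = resolve_other_info(None, "customers")
index_name = resolve_other_info("id", "old_id")
```

Here `age` takes the explicitly passed type, `zip` is re-inferred because its entry is None, `name` keeps the partial schema's type, and `notes` is inferred because nothing else describes it.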
- Parameters:
schema (Woodwork.TableSchema, optional) – Typing information to use for the DataFrame instead of performing inference. Specified arguments will override the schema’s typing information.
index (str, optional) – Name of the index column.
time_index (str, optional) – Name of the time index column.
logical_types (Dict[str -> LogicalType], optional) – Dictionary mapping column names in the DataFrame to the LogicalType for the column. Setting a column’s logical type to None in this dict will force a logical type to be inferred.
ignore_columns (list[str] or set[str], optional) – List of columns to ignore for inferring logical types. If a column name is included in this list, then it cannot be part of the logical_types dictionary argument, and it must be part of an existing schema for the dataframe. This argument can be used when a column has a logical type that has already been inferred and its physical dtype is not expected to have changed since its last inference.
already_sorted (bool, optional) – Indicates whether the input DataFrame is already sorted on the time index. If False, will sort the dataframe first on the time_index and then on the index (pandas DataFrame only). Defaults to False.
name (str, optional) – Name used to identify the DataFrame.
semantic_tags (dict, optional) – Dictionary mapping column names in Woodwork to the semantic tags for the column. The keys in the dictionary should be strings that correspond to column names. There are two options for specifying the dictionary values:
  - (str): If only one semantic tag is being set, a single string can be used as a value.
  - (list[str] or set[str]): If multiple tags are being set, a list or set of strings can be used as the value.
  Semantic tags will be set to an empty set for any column not included in the dictionary.
table_metadata (Dict[str -> json serializable], optional) – Dictionary containing extra metadata for Woodwork.
column_metadata (Dict[str -> Dict[str -> json serializable]], optional) – Dictionary mapping column names to that column’s metadata dictionary.
use_standard_tags (bool, Dict[str -> bool], optional) – Determines whether standard semantic tags will be added to columns based on the specified logical type for the column. If a single boolean is supplied, will apply the same use_standard_tags value to all columns. A dictionary can be used to specify use_standard_tags values for individual columns. Unspecified columns will use the default value of True.
column_descriptions (Dict[str -> str], optional) – Dictionary mapping column names to column descriptions.
column_origins (str, Dict[str -> str], optional) – Origin of each column. If a string is supplied, it is used as the origin for all columns. A dictionary can be used to set origins for individual columns.
validate (bool, optional) – Whether parameter and data validation should occur. Defaults to True. Warning: Should be set to False only when parameters and data are known to be valid. Any errors resulting from skipping validation with invalid inputs may not be easily understood.
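The value shapes accepted by semantic_tags and use_standard_tags can be illustrated with a small stdlib-only sketch. It is a simplified model of the behavior described in the parameter list, not Woodwork's own code: the helpers `normalize_semantic_tags` and `expand_use_standard_tags` are hypothetical names, and the sketch ignores any values that might come from a partial schema.

```python
from typing import Dict, Hashable, Iterable, List, Set, Union

TagValue = Union[str, List[str], Set[str]]


def normalize_semantic_tags(
    columns: Iterable[Hashable],
    semantic_tags: Dict[Hashable, TagValue],
) -> Dict[Hashable, Set[str]]:
    """Turn str / list / set tag values into one set of tags per column."""
    normalized = {}
    for col in columns:
        # Columns missing from the dictionary get an empty set of tags.
        value = semantic_tags.get(col, set())
        # A single string is one tag, not an iterable of characters.
        normalized[col] = {value} if isinstance(value, str) else set(value)
    return normalized


def expand_use_standard_tags(
    columns: Iterable[Hashable],
    use_standard_tags: Union[bool, Dict[Hashable, bool]],
) -> Dict[Hashable, bool]:
    """Expand a bool or a per-column dict into a flag for every column."""
    if isinstance(use_standard_tags, dict):
        # Unspecified columns use the documented default of True.
        return {col: use_standard_tags.get(col, True) for col in columns}
    # A single boolean applies to all columns.
    return {col: use_standard_tags for col in columns}


columns = ["id", "amount", "note"]
tags = normalize_semantic_tags(
    columns, {"id": "key", "amount": ["currency", "usd"]}
)
per_column_flags = expand_use_standard_tags(columns, {"note": False})
all_off = expand_use_standard_tags(columns, False)
```

In this model `note` ends up with an empty tag set because it is absent from the semantic_tags dictionary, and `id` and `amount` keep standard tags enabled because only `note` was overridden.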