woodwork.datacolumn.DataColumn

class woodwork.datacolumn.DataColumn(series, logical_type=None, semantic_tags=None, use_standard_tags=True, name=None, description=None, metadata=None)
__init__(series, logical_type=None, semantic_tags=None, use_standard_tags=True, name=None, description=None, metadata=None)

Create a DataColumn.

Parameters
  • series (pd.Series or dd.Series or numpy.ndarray or pd.api.extensions.ExtensionArray) – Series containing the data associated with the column.

  • logical_type (LogicalType, optional) – The logical type that should be assigned to the column. If no value is provided, the LogicalType for the series will be inferred.

  • semantic_tags (str or list or set, optional) – Semantic tags to assign to the column. Defaults to an empty set if not specified. To set a single tag, pass a string; to set multiple tags, pass a list or set of strings.

  • use_standard_tags (bool, optional) – If True, adds standard semantic tags to the column based on its inferred or specified logical type. Defaults to True.

  • name (str, optional) – Name of the DataColumn. Overwrites the Series name, if one exists.

  • description (str, optional) – Optional text describing the contents of the column.

  • metadata (dict[str -> json serializable], optional) – Metadata associated with the column.
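The semantic_tags parameter accepts a string, a list, or a set, and falls back to an empty set. As a rough illustration of that contract only, the sketch below normalizes the three input forms into a set of strings. The helper name normalize_semantic_tags and its error messages are hypothetical; this is not Woodwork's own implementation, and the library may validate its input differently.

```python
def normalize_semantic_tags(semantic_tags=None):
    """Return semantic tags as a set of strings.

    Hypothetical helper for illustration; not part of the Woodwork API.
    """
    # No tags given: mirror the documented default of an empty set.
    if semantic_tags is None:
        return set()

    # A single tag may be passed as a bare string.
    if isinstance(semantic_tags, str):
        return {semantic_tags}

    # Multiple tags may be passed as a list or a set of strings.
    if isinstance(semantic_tags, (list, set)):
        for tag in semantic_tags:
            if not isinstance(tag, str):
                raise TypeError("semantic_tags must contain only strings")
        return set(semantic_tags)

    raise TypeError("semantic_tags must be a string, list, or set")


if __name__ == "__main__":
    print(normalize_semantic_tags("numeric"))
    print(sorted(normalize_semantic_tags(["category", "index", "category"])))
```

Converting to a set up front means duplicate tags in a list collapse to one entry, which matches the description of semantic_tags as a set of tags on the column.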

Methods

__init__(series[, logical_type, …])

Create a DataColumn.

add_semantic_tags(semantic_tags)

Add the specified semantic tags to the column and return a new DataColumn object.

remove_semantic_tags(semantic_tags)

Remove the specified semantic tags from the column and return a new DataColumn object.

reset_semantic_tags([retain_index_tags])

Reset the semantic tags to the default values.

set_logical_type(logical_type[, …])

Update the logical type for the column and return a new DataColumn object.

set_semantic_tags(semantic_tags[, …])

Replace current semantic tags with new values and return a new DataColumn object.

to_series()

Retrieve the DataColumn’s underlying series.

Attributes

dtype

The dtype of the underlying series.

iloc

Purely integer-location based indexing for selection by position.

logical_type

The logical type for the column.

name

The name of the column.

semantic_tags

The set of semantic tags currently assigned to the column.

shape

Returns a tuple representing the dimensionality of the DataColumn.