What is <TBD>?

<TBD> is a library for typing two-dimensional tabular data. It provides a DataTable object that records, for each column, its physical type, logical type, and semantic tags. It can be used with Featuretools, EvalML, and general machine learning applications.

Quick Start

The example below creates a DataTable from a sample retail dataset and lets it infer the Logical Type of each column automatically.

In [1]: import woodwork as ww

In [2]: data = ww.demo.load_retail(nrows=100)

In [3]: dt = ww.DataTable(data, name="retail")

In [4]: dt.types
                Physical Type     Logical Type Semantic Tag(s)
Data Column                                                   
order_id                Int64      WholeNumber       {numeric}
product_id           category      Categorical      {category}
description            string  NaturalLanguage              {}
quantity                Int64      WholeNumber       {numeric}
order_date     datetime64[ns]         Datetime              {}
unit_price            float64           Double       {numeric}
customer_name          string  NaturalLanguage              {}
country                string  NaturalLanguage              {}
total                 float64           Double       {numeric}
cancelled             boolean          Boolean              {}
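
To make the three columns of this output concrete, here is a minimal sketch in plain Python that models a few of the rows and selects columns by semantic tag. It does not use the library itself: the ColumnTyping class and the columns_with_tag helper are hypothetical names introduced only for illustration, and the rows are transcribed by hand from the output above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ColumnTyping:
    """Hypothetical stand-in for one row of the typing table (not the library's class)."""
    physical_type: str   # how the values are stored, e.g. "Int64"
    logical_type: str    # what the values mean, e.g. "WholeNumber"
    semantic_tags: frozenset = field(default_factory=frozenset)  # extra labels, e.g. {"numeric"}


# A subset of the rows shown in the dt.types output, transcribed by hand.
typing_info = {
    "order_id": ColumnTyping("Int64", "WholeNumber", frozenset({"numeric"})),
    "product_id": ColumnTyping("category", "Categorical", frozenset({"category"})),
    "description": ColumnTyping("string", "NaturalLanguage"),
    "order_date": ColumnTyping("datetime64[ns]", "Datetime"),
    "unit_price": ColumnTyping("float64", "Double", frozenset({"numeric"})),
    "cancelled": ColumnTyping("boolean", "Boolean"),
}


def columns_with_tag(info, tag):
    """Return the names of columns carrying the given semantic tag, in table order."""
    return [name for name, col in info.items() if tag in col.semantic_tags]


numeric_columns = columns_with_tag(typing_info, "numeric")
print(numeric_columns)  # ['order_id', 'unit_price']
```

The sketch shows why the layers are kept separate: two columns can share a semantic tag such as numeric while differing in physical type (Int64 versus float64) and logical type (WholeNumber versus Double).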

API Reference