Implement Schema and Accessor API (#497)
Add Schema class that holds typing info (#499)
Add WoodworkTableAccessor class that performs type inference and stores Schema (#514)
Allow initializing Accessor schema with a valid Schema object (#522)
Add ability to read in a csv and create a DataFrame with an initialized Woodwork Schema (#534)
Add ability to call pandas methods from Accessor (#538, #589)
Add helpers for checking if a column is one of Boolean, Datetime, numeric, or categorical (#553)
Add ability to load demo retail dataset with a Woodwork Accessor (#556)
Add select to WoodworkTableAccessor (#548)
Add mutual_information to WoodworkTableAccessor (#571)
Add WoodworkColumnAccessor class (#562)
Add semantic tag update methods to column accessor (#573)
Add describe and describe_dict to WoodworkTableAccessor (#579)
Add init_series util function for initializing a series with dtype change (#581)
Add set_logical_type method to WoodworkColumnAccessor (#590)
Add semantic tag update methods to table schema (#591)
Add warning if additional parameters are passed along with schema (#593)
Better warning when accessing column properties before init (#596)
Update column accessor to work with LatLong columns (#598)
Add set_index to WoodworkTableAccessor (#603)
Implement loc and iloc for WoodworkColumnAccessor (#613)
Add set_time_index to WoodworkTableAccessor (#612)
Implement loc and iloc for WoodworkTableAccessor (#618)
Allow updating logical types with set_types and make relevant DataFrame changes (#619)
Allow serialization of WoodworkColumnAccessor to csv, pickle, and parquet (#624)
Add DaskColumnAccessor (#625)
Allow deserialization from csv, pickle, and parquet to Woodwork table (#626)
Add value_counts to WoodworkTableAccessor (#632)
Add KoalasColumnAccessor (#634)
Add pop to WoodworkTableAccessor (#636)
Add drop to WoodworkTableAccessor (#640)
Add rename to WoodworkTableAccessor (#646)
Add DaskTableAccessor (#648)
Add Schema properties to WoodworkTableAccessor (#651)
Add KoalasTableAccessor (#652)
Add __getitem__ to WoodworkTableAccessor (#633)
Update Koalas min version and add support for more new pandas dtypes with Koalas (#678)
Add __setitem__ to WoodworkTableAccessor (#669)
Create new Schema object when performing pandas operation on Accessors (#595)
Fix bug in _reset_semantic_tags causing columns to share same semantic tags set (#666)
Maintain column order in DataFrame and Woodwork repr (#677)
Move mutual information logic to statistics utils file (#584)
Bump min Koalas version to 1.4.0 (#638)
Preserve pandas underlying index when not creating a Woodwork index (#664)
Restrict Koalas version to <1.7.0 due to breaking changes (#674)
Clean up dtype usage across Woodwork (#682)
Improve error when calling accessor properties or methods before init (#683)
Remove dtype from Schema dictionary (#685)
Add include_index param and allow unique columns in Accessor mutual information (#699)
Include DataFrame equality and use_standard_tags in WoodworkTableAccessor equality check (#700)
Remove DataTable and DataColumn classes to migrate towards the accessor approach (#713)
Change sample_series dtype to not need conversion and remove convert_series util (#720)
Rename Accessor methods since DataTable has been removed (#723)
Update README.md and Get Started guide to use accessor (#655, #717)
Update Understanding Types and Tags guide to use accessor (#657)
Update docstrings and API Reference page (#660)
Update statistical insights guide to use accessor (#693)
Update Customizing Type Inference guide to use accessor (#696)
Update Dask and Koalas guide to use accessor (#701)
Update index notebook and install guide to use accessor (#715)
Add section to documentation about schema validity (#729)
Update README.md and Get Started guide to use pd.read_csv (#730)
Make small fixes to documentation formatting (#731)
Add tests to Accessor/Schema that weren’t previously covered (#712, #716)
Update release branch name in notes update check (#719)
Thanks to the following people for contributing to this release: @gsheni, @jeff-hernandez, @johnbridstrup, @tamargrey, @thehomebrewnerd
The DataTable and DataColumn classes have been removed and replaced by new WoodworkTableAccessor and WoodworkColumnAccessor classes which are used through the ww namespace available on DataFrames after importing Woodwork.
Include unique columns in mutual information calculations (#687)
Add parameter to include index column in mutual information calculations (#692)
Update to remove warning message from statistical insights guide (#690)
Update branch reference in tests to run on main (#641)
Make release notes updated check separate from unit tests (#642)
Update release branch naming instructions (#644)
Thanks to the following people for contributing to this release: @gsheni, @tamargrey, @thehomebrewnerd
Avoid calculating mutual information for non-unique columns (#563)
Preserve underlying DataFrame index if index column is not specified (#588)
Add blank issue template for creating issues (#630)
Update branch reference in tests workflow (#552, #601)
Fixed text on back arrow on install page (#564)
Refactor test_datatable.py (#574)
Thanks to the following people for contributing to this release: @gsheni, @jeff-hernandez, @johnbridstrup, @tamargrey
Add Python 3.9 support without Koalas testing (#511)
Add get_valid_mi_types function to list LogicalTypes valid for mutual information calculation (#517)
Handle missing values in Datetime columns when calculating mutual information (#516)
Support numpy 1.20.0 by restricting version for koalas and changing serialization error message (#532)
Move Koalas option setting to DataTable init instead of import (#543)
Add Alteryx OSS Twitter link (#519)
Update logo and add new favicon (#521)
Multiple improvements to Getting Started page and guides (#527)
Clean up API Reference and docstrings (#536)
Added Open Graph for Twitter and Facebook (#544)
Add DataTable.df property for accessing the underlying DataFrame (#470)
Set index of underlying DataFrame to match DataTable index (#464)
Sort underlying series when sorting dataframe (#468)
Allow setting indices to current index without side effects (#474)
Fix release document with Github Actions link for CI (#462)
Don’t allow registered LogicalTypes with the same name (#477)
Move str_to_logical_type to TypeSystem class (#482)
Remove pyarrow from core dependencies (#508)
Allow for user-defined logical types and inference functions in TypeSystem object (#424)
Add __repr__ to DataTable (#425)
Allow initializing DataColumn with numpy array (#430)
Add drop to DataTable (#434)
Migrate CI tests to Github Actions (#417, #441, #451)
Add metadata to DataColumn for user-defined metadata (#447)
Update DataColumn name when using setitem on column with no name (#426)
Don’t allow pickle serialization for Koalas DataFrames (#432)
Check DataTable metadata in equality check (#449)
Propagate all attributes of DataTable in _new_dt_including (#454)
Update links to use alteryx org Github URL (#423)
Support column names of any type allowed by the underlying DataFrame (#442)
Use object dtype for LatLong columns for easy access to latitude and longitude values (#414)
Restrict dask version to prevent 2020.12.0 release from being installed (#453)
Lower minimum requirement for numpy to 1.15.4, and set pandas minimum requirement 1.1.1 (#459)
Fix missing test coverage (#436)
Thanks to the following people for contributing to this release: @gsheni, @jeff-hernandez, @tamargrey, @thehomebrewnerd
Add support for creating DataTable from Koalas DataFrame (#327)
Add ability to initialize DataTable with numpy array (#367)
Add describe_dict method to DataTable (#405)
Add mutual_information_dict method to DataTable (#404)
Add metadata to DataTable for user-defined metadata (#392)
Add update_dataframe method to DataTable to update underlying DataFrame (#407)
Sort DataFrame if time_index is specified; bypass sorting with the already_sorted parameter (#410)
Add description attribute to DataColumn (#416)
Implement DataColumn.__len__ and DataTable.__len__ (#415)
Rename data_column.py to datacolumn.py (#386)
Rename data_table.py to datatable.py (#387)
Rename get_mutual_information to mutual_information (#390)
Lower moto test requirement for serialization/deserialization (#376)
Make Koalas an optional dependency installable with woodwork[koalas] (#378)
Remove WholeNumber LogicalType from Woodwork (#380)
Updates to LogicalTypes to support Koalas 1.4.0 (#393)
Replace set_logical_types and set_semantic_tags with just set_types (#379)
Remove copy_dataframe parameter from DataTable initialization (#398)
Implement DataTable.__sizeof__ to return size of the underlying dataframe (#401)
Include Datetime columns in mutual info calculation (#399)
Maintain column order on DataTable operations (#406)
Add pyarrow, dask, and koalas to automated dependency checks (#388)
Use new version of pull request Github Action (#394)
Improve parameterization for test_datatable_equality (#409)
Thanks to the following people for contributing to this release: @ctduffy, @gsheni, @tamargrey, @thehomebrewnerd
The DataTable.set_semantic_tags method was removed. DataTable.set_types can be used instead.
The DataTable.set_logical_types method was removed. DataTable.set_types can be used instead.
WholeNumber was removed from LogicalTypes. Columns that were previously inferred as WholeNumber will now be inferred as Integer.
The DataTable.get_mutual_information method was renamed to DataTable.mutual_information.
The copy_dataframe parameter was removed from DataTable initialization.
Add __eq__ to DataTable and DataColumn and update LogicalType equality (#318)
Add value_counts() method to DataTable (#342)
Support serialization and deserialization of DataTables via csv, pickle, or parquet (#293)
Add shape property to DataTable and DataColumn (#358)
Add iloc method to DataTable and DataColumn (#365)
Add numeric_categorical_threshold config value to allow inferring numeric columns as Categorical (#363)
Add rename method to DataTable (#367)
Catch non numeric time index at validation (#332)
Support logical type inference from a Dask DataFrame (#248)
Fix validation checks and make_index to work with Dask DataFrames (#260)
Skip validation of Ordinal order values for Dask DataFrames (#270)
Improve support for datetimes with Dask input (#286)
Update DataTable.describe to work with Dask input (#296)
Update DataTable.get_mutual_information to work with Dask input (#300)
Modify to_pandas function to return DataFrame with correct index (#281)
Rename DataColumn.to_pandas method to DataColumn.to_series (#311)
Rename DataTable.to_pandas method to DataTable.to_dataframe (#319)
Remove UserWarning when no matching columns found (#325)
Remove copy parameter from DataTable.to_dataframe and DataColumn.to_series (#338)
Allow pandas ExtensionArrays as inputs to DataColumn (#343)
Move warnings to a separate exceptions file and call via UserWarning subclasses (#348)
Make Dask an optional dependency installable with woodwork[dask] (#357)
Create a guide for using Woodwork with Dask (#304)
Add conda install instructions (#305, #309)
Fix README.md badge with correct link (#314)
Simplify issue templates to make them easier to use (#339)
Remove extra output cell in Start notebook (#341)
Parameterize numeric time index tests (#288)
Add DockerHub credentials to CI testing environment (#326)
Fix removing files for serialization test (#350)
The DataColumn.to_pandas method was renamed to DataColumn.to_series.
The DataTable.to_pandas method was renamed to DataTable.to_dataframe.
copy is no longer a parameter of DataTable.to_dataframe or DataColumn.to_series.
Add optional include parameter for DataTable.describe() to filter results (#228)
Add make_index parameter to DataTable.__init__ to enable optional creation of a new index column (#238)
Add support for setting ranking order on columns with Ordinal logical type (#240)
Add list_semantic_tags function and CLI to get dataframe of woodwork semantic_tags (#244)
Add support for numeric time index on DataTable (#267)
Add pop method to DataTable (#289)
Add entry point to setup.py to run CLI commands (#285)
Allow numeric datetime time indices (#282)
Remove redundant methods DataTable.select_ltypes and DataTable.select_semantic_tags (#239)
Make results of get_mutual_information more clear by sorting and removing self calculation (#247)
Lower minimum scikit-learn version to 0.21.3 (#297)
Add guide for dt.describe and dt.get_mutual_information (#245)
Update README.md with documentation link (#261)
Add footer to doc pages with Alteryx Open Source (#258)
Add types and tags one-sentence definitions to Understanding Types and Tags guide (#271)
Add issue and pull request templates (#280, #284)
Add automated process to check latest dependencies (#268)
Add test for setting a time index with specified string logical type (#279)
Implement setitem on DataTable to create/overwrite an existing DataColumn (#165)
Add to_pandas method to DataColumn to access the underlying series (#169)
Add list_logical_types function and CLI to get dataframe of woodwork LogicalTypes (#172)
Add describe method to DataTable to generate statistics for the underlying data (#181)
Add optional return_dataframe parameter to load_retail to return either DataFrame or DataTable (#189)
Add get_mutual_information method to DataTable to generate mutual information between columns (#203)
Add read_csv function to create DataTable directly from CSV file (#222)
Fix bug causing incorrect values for quartiles in DataTable.describe method (#187)
Fix bug in DataTable.describe that could cause an error if certain semantic tags were applied improperly (#190)
Fix bug with instantiated LogicalTypes breaking when used with issubclass (#231)
Remove unnecessary add_standard_tags attribute from DataTable (#171)
Remove standard tags from index column and do not return stats for index column from DataTable.describe (#196)
Update DataColumn.set_semantic_tags and DataColumn.add_semantic_tags to return new objects (#205)
Update various DataTable methods to return new objects rather than modifying in place (#210)
Move datetime_format to Datetime LogicalType (#216)
Do not calculate mutual info with index column in DataTable.get_mutual_information (#221)
Move setting of underlying physical types from DataTable to DataColumn (#233)
Remove unused code from sphinx conf.py, update with Github URL (#160, #163)
Update README and docs with new Woodwork logo, with better code snippets (#161, #159)
Add DataTable and DataColumn to API Reference (#162)
Add docstrings to LogicalType classes (#168)
Add Woodwork image to index, clear outputs of Jupyter notebook in docs (#173)
Update contributing.md, release.md with all instructions (#176)
Add section for setting index and time index to start notebook (#179)
Rename changelog to Release Notes (#193)
Add section for standard tags to start notebook (#188)
Add Understanding Types and Tags user guide (#201)
Add missing docstring to list_logical_types (#202)
Add Woodwork Global Configuration Options guide (#215)
Add tests that confirm dtypes are as expected after DataTable init (#152)
Remove unused none_df test fixture (#224)
Add test for LogicalType.__str__ method (#225)
Fix formatting issue when printing global config variables (#138)
Change add_standard_tags to use_standard_tags to better describe behavior (#149)
Change access of underlying dataframe to be through to_pandas with ._dataframe field on class (#146)
Remove replace_none parameter from DataTable (#146)
Add working code example to README and create Using Woodwork page (#103)
Add natural_language_threshold global config option used for Categorical/NaturalLanguage type inference (#135)
Add global config options and add datetime_format option for type inference (#134)
Fix bug with Integer and WholeNumber inference in column with pd.NA values (#133)
Add DataTable.ltypes property to return series of logical types (#131)
Add ability to create new datatable from specified columns with dt[[columns]] (#127)
Handle setting and tagging of index and time index columns (#125)
Add combined tag and ltype selection (#124)
Add changelog, and update changelog check to CI (#123)
Implement reset_semantic_tags (#118)
Implement DataTable getitem (#119)
Add remove_semantic_tags method (#117)
Add semantic tag selection (#106)
Add github action, rename to woodwork (#113)
Add license to setup.py (#112)
Reset semantic tags on logical type change (#107)
Add standard numeric and category tags (#100)
Change semantic_types to semantic_tags, a set of strings (#100)
Update dataframe dtypes based on logical types (#94)
Add select_logical_types to DataTable (#96)
Add pygments to dev-requirements.txt (#97)
Add replacing None with np.nan in DataTable init (#87)
Refactor DataColumn to make semantic_types and logical_type private (#86)
Add pandas_dtype to each Logical Type, and remove dtype attribute on DataColumn (#85)
Add set_semantic_types methods on both DataTable and DataColumn (#75)
Support passing camel case or snake case strings for setting logical types (#74)
Improve flexibility when setting semantic types (#72)
Add Whole Number Inference of Logical Types (#66)
Add dtypes property to DataTables and repr for DataColumn (#61)
Allow specification of semantic types during DataTable creation (#69)
Implement set_logical_types on DataTable (#65)
Add init files to tests to fix code coverage (#60)
Add AutoAssign bot (#59)
Add logical types validation in DataTables (#49)
Fix working_directory in CI (#57)
Add infer_logical_types for DataColumn (#45)
Fix README library name and code coverage badge (#56)
Add code coverage (#51)
Improve and refactor the validation checks during initialization of a DataTable (#40)
Add dataframe attribute to DataTable (#39)
Update README with minor usage details (#37)
Add License (#34)
Rename from data_tables to datatables (#4)
Add Logical Types, DataTable, DataColumn (#3)
Add Makefile, setup.py, requirements.txt (#2)
Initial Release (#1)