v0.0.5 November 11, 2020
Enhancements
Add __eq__ to DataTable and DataColumn and update LogicalType equality (#318)
Add value_counts() method to DataTable (#342)
Support serialization and deserialization of DataTables via csv, pickle, or parquet (#293)
Add shape property to DataTable and DataColumn (#358)
Add iloc method to DataTable and DataColumn (#365)
Add numeric_categorical_threshold config value to allow inferring numeric columns as Categorical (#363)
Fixes
Catch non numeric time index at validation (#332)
Changes
Support logical type inference from a Dask DataFrame (#248)
Fix validation checks and make_index to work with Dask DataFrames (#260)
Skip validation of Ordinal order values for Dask DataFrames (#270)
Improve support for datetimes with Dask input (#286)
Update DataTable.describe to work with Dask input (#296)
Update DataTable.get_mutual_information to work with Dask input (#300)
Modify to_pandas function to return DataFrame with correct index (#281)
Rename DataColumn.to_pandas method to DataColumn.to_series (#311)
Rename DataTable.to_pandas method to DataTable.to_dataframe (#319)
Remove UserWarning when no matching columns found (#325)
Remove copy parameter from DataTable.to_dataframe and DataColumn.to_series (#338)
Allow pandas ExtensionArrays as inputs to DataColumn (#343)
Move warnings to a separate exceptions file and call via UserWarning subclasses (#348)
Make Dask an optional dependency installable with woodwork[dask] (#357)
Documentation Changes
Create a guide for using Woodwork with Dask (#304)
Add conda install instructions (#305, #309)
Fix README.md badge with correct link (#314)
Simplify issue templates to make them easier to use (#339)
Remove extra output cell in Start notebook (#341)
Testing Changes
Parameterize numeric time index tests (#288)
Add DockerHub credentials to CI testing environment (#326)
Fix removing files for serialization test (#350)
Thanks to the following people for contributing to this release: @ctduffy, @gsheni, @tamargrey, @thehomebrewnerd
Breaking Changes
The DataColumn.to_pandas method was renamed to DataColumn.to_series.
The DataTable.to_pandas method was renamed to DataTable.to_dataframe.
copy is no longer a parameter of DataTable.to_dataframe or DataColumn.to_series.
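Code that must run against both pre-0.0.5 and 0.0.5 objects could bridge the DataTable rename with a small helper. The following is only a sketch: to_dataframe_compat, NewStyleTable, and OldStyleTable are hypothetical names that are not part of Woodwork, and the two stand-in classes exist only so the example runs without Woodwork installed. The same pattern would apply to DataColumn.to_series.

```python
import warnings


def to_dataframe_compat(table):
    """Return the underlying data from a DataTable-like object.

    Hypothetical helper, not part of Woodwork. It prefers the v0.0.5
    method name (to_dataframe) and falls back to the older name
    (to_pandas). No copy argument is passed, because v0.0.5 removed it.
    """
    if hasattr(table, "to_dataframe"):
        return table.to_dataframe()
    if hasattr(table, "to_pandas"):
        warnings.warn(
            "to_pandas was renamed to to_dataframe in v0.0.5",
            DeprecationWarning,
            stacklevel=2,
        )
        return table.to_pandas()
    raise AttributeError("object has neither to_dataframe nor to_pandas")


# Stand-in classes (assumptions for illustration only). A plain dict
# takes the place of the pandas DataFrame a real DataTable would return.
class NewStyleTable:
    def to_dataframe(self):
        return {"id": [1, 2]}


class OldStyleTable:
    def to_pandas(self):
        return {"id": [1, 2]}


new_result = to_dataframe_compat(NewStyleTable())
old_result = to_dataframe_compat(OldStyleTable())
```

Both calls return the same data; the second also emits a DeprecationWarning so that callers still on the old method name can be found.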
Add optional include parameter for DataTable.describe() to filter results (#228)
Add make_index parameter to DataTable.__init__ to enable optional creation of a new index column (#238)
Add support for setting ranking order on columns with Ordinal logical type (#240)
Add list_semantic_tags function and CLI to get dataframe of woodwork semantic_tags (#244)
Add support for numeric time index on DataTable (#267)
Add pop method to DataTable (#289)
Add entry point to setup.py to run CLI commands (#285)
Allow numeric datetime time indices (#282)
Remove redundant methods DataTable.select_ltypes and DataTable.select_semantic_tags (#239)
Make results of get_mutual_information more clear by sorting and removing self calculation (#247)
Lower minimum scikit-learn version to 0.21.3 (#297)
Add guide for dt.describe and dt.get_mutual_information (#245)
Update README.md with documentation link (#261)
Add footer to doc pages with Alteryx Open Source (#258)
Add types and tags one-sentence definitions to Understanding Types and Tags guide (#271)
Add issue and pull request templates (#280, #284)
Add automated process to check latest dependencies (#268)
Add test for setting a time index with specified string logical type (#279)
Thanks to the following people for contributing to this release: @ctduffy, @gsheni, @tamargrey, @thehomebrewnerd
Implement setitem on DataTable to create/overwrite an existing DataColumn (#165)
Add to_pandas method to DataColumn to access the underlying series (#169)
Add list_logical_types function and CLI to get dataframe of woodwork LogicalTypes (#172)
Add describe method to DataTable to generate statistics for the underlying data (#181)
Add optional return_dataframe parameter to load_retail to return either DataFrame or DataTable (#189)
Add get_mutual_information method to DataTable to generate mutual information between columns (#203)
Add read_csv function to create DataTable directly from CSV file (#222)
Fix bug causing incorrect values for quartiles in DataTable.describe method (#187)
Fix bug in DataTable.describe that could cause an error if certain semantic tags were applied improperly (#190)
Fix bug with instantiated LogicalTypes breaking when used with issubclass (#231)
Remove unnecessary add_standard_tags attribute from DataTable (#171)
Remove standard tags from index column and do not return stats for index column from DataTable.describe (#196)
Update DataColumn.set_semantic_tags and DataColumn.add_semantic_tags to return new objects (#205)
Update various DataTable methods to return new objects rather than modifying in place (#210)
Move datetime_format to Datetime LogicalType (#216)
Do not calculate mutual info with index column in DataTable.get_mutual_information (#221)
Move setting of underlying physical types from DataTable to DataColumn (#233)
Remove unused code from sphinx conf.py, update with GitHub URL (#160, #163)
Update README and docs with new Woodwork logo, with better code snippets (#161, #159)
Add DataTable and DataColumn to API Reference (#162)
Add docstrings to LogicalType classes (#168)
Add Woodwork image to index, clear outputs of Jupyter notebook in docs (#173)
Update contributing.md, release.md with all instructions (#176)
Add section for setting index and time index to start notebook (#179)
Rename changelog to Release Notes (#193)
Add section for standard tags to start notebook (#188)
Add Understanding Types and Tags user guide (#201)
Add missing docstring to list_logical_types (#202)
Add Woodwork Global Configuration Options guide (#215)
Add tests that confirm dtypes are as expected after DataTable init (#152)
Remove unused none_df test fixture (#224)
Add test for LogicalType.__str__ method (#225)
Thanks to the following people for contributing to this release: @gsheni, @tamargrey, @thehomebrewnerd
Fix formatting issue when printing global config variables (#138)
Change add_standard_tags to use_standard_tags to better describe behavior (#149)
Change access of underlying dataframe to be through to_pandas with ._dataframe field on class (#146)
Remove replace_none parameter from DataTable (#146)
Add working code example to README and create Using Woodwork page (#103)
Add natural_language_threshold global config option used for Categorical/NaturalLanguage type inference (#135)
Add global config options and add datetime_format option for type inference (#134)
Fix bug with Integer and WholeNumber inference in column with pd.NA values (#133)
Add DataTable.ltypes property to return series of logical types (#131)
Add ability to create new datatable from specified columns with dt[[columns]] (#127)
Handle setting and tagging of index and time index columns (#125)
Add combined tag and ltype selection (#124)
Add changelog, and update changelog check to CI (#123)
Implement reset_semantic_tags (#118)
Implement DataTable getitem (#119)
Add remove_semantic_tags method (#117)
Add semantic tag selection (#106)
Add GitHub Action, rename to woodwork (#113)
Add license to setup.py (#112)
Reset semantic tags on logical type change (#107)
Add standard numeric and category tags (#100)
Change semantic_types to semantic_tags, a set of strings (#100)
Update dataframe dtypes based on logical types (#94)
Add select_logical_types to DataTable (#96)
Add pygments to dev-requirements.txt (#97)
Add replacing None with np.nan in DataTable init (#87)
Refactor DataColumn to make semantic_types and logical_type private (#86)
Add pandas_dtype to each Logical Type, and remove dtype attribute on DataColumn (#85)
Add set_semantic_types methods on both DataTable and DataColumn (#75)
Support passing camel case or snake case strings for setting logical types (#74)
Improve flexibility when setting semantic types (#72)
Add Whole Number Inference of Logical Types (#66)
Add dtypes property to DataTables and repr for DataColumn (#61)
Allow specification of semantic types during DataTable creation (#69)
Implement set_logical_types on DataTable (#65)
Add init files to tests to fix code coverage (#60)
Add AutoAssign bot (#59)
Add logical types validation in DataTables (#49)
Fix working_directory in CI (#57)
Add infer_logical_types for DataColumn (#45)
Fix README library name and code coverage badge (#56, #56)
Add code coverage (#51)
Improve and refactor the validation checks during initialization of a DataTable (#40)
Add dataframe attribute to DataTable (#39)
Update README with minor usage details (#37)
Add License (#34)
Rename from datatables to data_tables (#4)
Add Logical Types, DataTable, DataColumn (#3)
Add Makefile, setup.py, requirements.txt (#2)
Initial Release (#1)