Release Notes
v0.0.5 November 11, 2020
- Enhancements
Add __eq__ to DataTable and DataColumn and update LogicalType equality (#318)
Add value_counts() method to DataTable (#342)
Support serialization and deserialization of DataTables via csv, pickle, or parquet (#293)
Add shape property to DataTable and DataColumn (#358)
Add iloc method to DataTable and DataColumn (#365)
Add numeric_categorical_threshold config value to allow inferring numeric columns as Categorical (#363)
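To illustrate the idea behind numeric_categorical_threshold, here is a minimal sketch of threshold-based inference. It does not use Woodwork itself: the function name infer_numeric_or_categorical, the "fewer distinct values than the threshold" rule, and the behavior when no threshold is set are all assumptions made for illustration, since these notes do not state how the config value is applied.

```python
def infer_numeric_or_categorical(values, numeric_categorical_threshold=None):
    """Hypothetical rule: a numeric column with few distinct values is categorical."""
    # Assumption: with no threshold configured, numeric data stays numeric.
    if numeric_categorical_threshold is None:
        return "numeric"
    # Count distinct non-missing values in the column.
    distinct = {v for v in values if v is not None}
    # Assumption: strictly fewer distinct values than the threshold means categorical.
    if len(distinct) < numeric_categorical_threshold:
        return "categorical"
    return "numeric"


ratings = [1, 2, 3, 2, 1, 3, 3]                 # 3 distinct values
prices = [9.99, 12.5, 3.25, 40.0, 7.1, 18.75]   # 6 distinct values

default_guess = infer_numeric_or_categorical(ratings)
ratings_guess = infer_numeric_or_categorical(ratings, numeric_categorical_threshold=5)
prices_guess = infer_numeric_or_categorical(prices, numeric_categorical_threshold=5)
```

Under these assumptions, only the low-cardinality ratings column is treated as categorical once a threshold is set; the exact comparison and default used by Woodwork may differ.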
- Fixes
Catch non numeric time index at validation (#332)
- Changes
Support logical type inference from a Dask DataFrame (#248)
Fix validation checks and make_index to work with Dask DataFrames (#260)
Skip validation of Ordinal order values for Dask DataFrames (#270)
Improve support for datetimes with Dask input (#286)
Update DataTable.describe to work with Dask input (#296)
Update DataTable.get_mutual_information to work with Dask input (#300)
Modify to_pandas function to return DataFrame with correct index (#281)
Rename DataColumn.to_pandas method to DataColumn.to_series (#311)
Rename DataTable.to_pandas method to DataTable.to_dataframe (#319)
Remove UserWarning when no matching columns found (#325)
Remove copy parameter from DataTable.to_dataframe and DataColumn.to_series (#338)
Allow pandas ExtensionArrays as inputs to DataColumn (#343)
Move warnings to a separate exceptions file and call via UserWarning subclasses (#348)
Make Dask an optional dependency installable with woodwork[dask] (#357)
Thanks to the following people for contributing to this release: @ctduffy, @gsheni, @tamargrey, @thehomebrewnerd
Breaking Changes
The DataColumn.to_pandas method was renamed to DataColumn.to_series.
The DataTable.to_pandas method was renamed to DataTable.to_dataframe.
copy is no longer a parameter of DataTable.to_dataframe or DataColumn.to_series.
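Code that must run against both older releases and v0.0.5 could bridge the rename with a small helper. The sketch below is one possible approach, not part of Woodwork: the helper to_dataframe_compat and the two stand-in classes are hypothetical and exist only so the example runs without Woodwork installed. Only the method names to_pandas and to_dataframe come from these notes.

```python
def to_dataframe_compat(table):
    """Return the underlying data from a table object, whichever method name it has."""
    # Prefer the new name (v0.0.5 and later); fall back to the pre-rename name.
    method = getattr(table, "to_dataframe", None) or getattr(table, "to_pandas")
    # No copy argument is passed, since that parameter was removed in v0.0.5.
    return method()


# Hypothetical stand-ins for illustration only; these are not Woodwork classes.
class OldStyleTable:
    def to_pandas(self):
        return "frame-from-to_pandas"


class NewStyleTable:
    def to_dataframe(self):
        return "frame-from-to_dataframe"


old_result = to_dataframe_compat(OldStyleTable())
new_result = to_dataframe_compat(NewStyleTable())
```

The same pattern would apply to DataColumn, checking for to_series first and falling back to to_pandas.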
- v0.0.4 October 21, 2020
- Enhancements
Add optional include parameter for DataTable.describe() to filter results (#228)
Add make_index parameter to DataTable.__init__ to enable optional creation of a new index column (#238)
Add support for setting ranking order on columns with Ordinal logical type (#240)
Add list_semantic_tags function and CLI to get dataframe of woodwork semantic_tags (#244)
Add support for numeric time index on DataTable (#267)
Add pop method to DataTable (#289)
Add entry point to setup.py to run CLI commands (#285)
- Fixes
Allow numeric datetime time indices (#282)
- Thanks to the following people for contributing to this release:
- v0.0.3 October 9, 2020
- Enhancements
Implement setitem on DataTable to create/overwrite an existing DataColumn (#165)
Add to_pandas method to DataColumn to access the underlying series (#169)
Add list_logical_types function and CLI to get dataframe of woodwork LogicalTypes (#172)
Add describe method to DataTable to generate statistics for the underlying data (#181)
Add optional return_dataframe parameter to load_retail to return either DataFrame or DataTable (#189)
Add get_mutual_information method to DataTable to generate mutual information between columns (#203)
Add read_csv function to create DataTable directly from CSV file (#222)
- Changes
Remove unnecessary add_standard_tags attribute from DataTable (#171)
Remove standard tags from index column and do not return stats for index column from DataTable.describe (#196)
Update DataColumn.set_semantic_tags and DataColumn.add_semantic_tags to return new objects (#205)
Update various DataTable methods to return new objects rather than modifying in place (#210)
Move datetime_format to Datetime LogicalType (#216)
Do not calculate mutual info with index column in DataTable.get_mutual_information (#221)
Move setting of underlying physical types from DataTable to DataColumn (#233)
- Documentation Changes
Remove unused code from sphinx conf.py, update with GitHub URL (#160, #163)
Update README and docs with new Woodwork logo, with better code snippets (#161, #159)
Add DataTable and DataColumn to API Reference (#162)
Add docstrings to LogicalType classes (#168)
Add Woodwork image to index, clear outputs of Jupyter notebook in docs (#173)
Update contributing.md, release.md with all instructions (#176)
Add section for setting index and time index to start notebook (#179)
Rename changelog to Release Notes (#193)
Add section for standard tags to start notebook (#188)
Add Understanding Types and Tags user guide (#201)
Add missing docstring to list_logical_types (#202)
Add Woodwork Global Configuration Options guide (#215)
Thanks to the following people for contributing to this release: @gsheni, @tamargrey, @thehomebrewnerd
- v0.0.2 September 28, 2020
- Fixes
Fix formatting issue when printing global config variables (#138)
- Documentation Changes
Add working code example to README and create Using Woodwork page (#103)
Thanks to the following people for contributing to this release: @gsheni, @tamargrey, @thehomebrewnerd
- v0.1.0 September 24, 2020
Add natural_language_threshold global config option used for Categorical/NaturalLanguage type inference (#135)
Add global config options and add datetime_format option for type inference (#134)
Fix bug with Integer and WholeNumber inference in column with pd.NA values (#133)
Add DataTable.ltypes property to return series of logical types (#131)
Add ability to create new datatable from specified columns with dt[[columns]] (#127)
Handle setting and tagging of index and time index columns (#125)
Add combined tag and ltype selection (#124)
Add changelog, and update changelog check to CI (#123)
Implement reset_semantic_tags (#118)
Implement DataTable getitem (#119)
Add remove_semantic_tags method (#117)
Add semantic tag selection (#106)
Add github action, rename to woodwork (#113)
Add license to setup.py (#112)
Reset semantic tags on logical type change (#107)
Add standard numeric and category tags (#100)
Change semantic_types to semantic_tags, a set of strings (#100)
Update dataframe dtypes based on logical types (#94)
Add select_logical_types to DataTable (#96)
Add pygments to dev-requirements.txt (#97)
Add replacing None with np.nan in DataTable init (#87)
Refactor DataColumn to make semantic_types and logical_type private (#86)
Add pandas_dtype to each Logical Type, and remove dtype attribute on DataColumn (#85)
Add set_semantic_types methods on both DataTable and DataColumn (#75)
Support passing camel case or snake case strings for setting logical types (#74)
Improve flexibility when setting semantic types (#72)
Add Whole Number Inference of Logical Types (#66)
Add dtypes property to DataTables and repr for DataColumn (#61)
Allow specification of semantic types during DataTable creation (#69)
Implements set_logical_types on DataTable (#65)
Add init files to tests to fix code coverage (#60)
Add AutoAssign bot (#59)
Add logical types validation in DataTables (#49)
Fix working_directory in CI (#57)
Add infer_logical_types for DataColumn (#45)
Add code coverage (#51)
Improve and refactor the validation checks during initialization of a DataTable (#40)
Add dataframe attribute to DataTable (#39)
Update README with minor usage details (#37)
Add License (#34)
Rename from datatables to data_tables (#4)
Add Logical Types, DataTable, DataColumn (#3)
Add Makefile, setup.py, requirements.txt (#2)
Initial Release (#1)
Thanks to the following people for contributing to this release: @gsheni, @tamargrey, @thehomebrewnerd